Genetic correlation estimation

Step 0: Prepare data

Phased genotypes and inferred local ancestry, prepared by following the “preparing dataset” guide, so that you have ${prefix}.chr${chrom}.[pgen|psam|pvar|lanc] files.
A phenotype and covariates file for each trait, ${trait}.txt.
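
As a rough illustration of the phenotype file layout, the sketch below writes one row per individual with the ID first, the phenotype second, and the covariates after that, which is the column order given in the `pheno` parameter description. The tab delimiter, the header row, the column names (`ID`, `PHENO`, `AGE`, `SEX`), and the helper `write_pheno` are all assumptions made for this example and are not confirmed by this page; check the dataset preparation guide for the exact format expected.

```python
import csv
import io


def write_pheno(handle, rows, covariate_names):
    """Write ID, phenotype, then covariates, one individual per row.

    Assumed layout: tab-delimited with a header row (not confirmed by the docs).
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", "PHENO", *covariate_names])
    for indiv_id, pheno, covariates in rows:
        writer.writerow([indiv_id, pheno, *covariates])


# Toy data: (individual ID, phenotype value, covariate values).
rows = [("indiv1", 1.52, (34, 0)), ("indiv2", -0.31, (51, 1))]

# An in-memory buffer stands in for open(f"{trait}.txt", "w", newline="").
buffer = io.StringIO()
write_pheno(buffer, rows, ["AGE", "SEX"])
print(buffer.getvalue())
```

The individual IDs would presumably need to match those in the genotype files.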

With these files, you can run the following commands to estimate $r_\text{admix}$.

Step 1: compute the GRMs $\mathbf{K}_1$ and $\mathbf{K}_2$ for each chromosome

mkdir -p ${out_dir}/admix-grm
admix admix-grm \
    --pfile ${prefix}.chr${chrom} \
    --out-prefix ${out_dir}/admix-grm/chr${chrom}

This step will generate ${out_dir}/admix-grm/chr${chrom}.[grm.bin|grm.id|grm.n|weight.tsv] files.
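
Because the command above handles a single chromosome, it has to be repeated once per chromosome. The sketch below builds the argument list for each run using only the two flags shown above. The helper name `build_admix_grm_commands`, the example paths, and the choice of autosomes 1 to 22 are assumptions for illustration.

```python
import shlex


def build_admix_grm_commands(prefix, out_dir, chroms=range(1, 23)):
    """Return one `admix admix-grm` argument list per chromosome."""
    commands = []
    for chrom in chroms:
        commands.append([
            "admix", "admix-grm",
            "--pfile", f"{prefix}.chr{chrom}",
            "--out-prefix", f"{out_dir}/admix-grm/chr{chrom}",
        ])
    return commands


# Hypothetical prefix and output directory, used only for this example.
cmds = build_admix_grm_commands("data/cohort", "out")
for argv in cmds:
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` or submitted as a separate cluster job, since the chromosomes are processed independently.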

Step 2: merge GRMs across chromosomes

admix admix-grm-merge \
    --prefix ${out_dir}/admix-grm/chr \
    --out-prefix ${out_dir}/admix-grm/merged

This step will generate ${out_dir}/admix-grm/merged.[grm.bin|grm.id|grm.n|weight.tsv] files.

Step 3: compute the GRM $(\mathbf{K}_1 + r_\text{admix} \mathbf{K}_2)$ over a grid of $r_\text{admix}$ values and estimate the log-likelihood at each value

admix genet-cor \
    --pheno ${trait}.txt \
    --grm-prefix ${out_dir}/admix-grm/merged \
    --out-dir ${out_dir}/estimate/${trait}
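
The grid search in this step can be pictured with the toy sketch below: for each candidate $r_\text{admix}$, the two GRMs are combined as $\mathbf{K}_1 + r_\text{admix} \mathbf{K}_2$, a log-likelihood is obtained for that value, and the value with the highest log-likelihood is taken as the point estimate. The 2×2 matrices and the log-likelihood curve here are made up, the model fitting itself is not shown, and the helper names are hypothetical; this is not how `admix genet-cor` is implemented internally.

```python
def combine_grm(K1, K2, r):
    """Element-wise K1 + r * K2 for square matrices given as nested lists."""
    return [[a + r * b for a, b in zip(row1, row2)] for row1, row2 in zip(K1, K2)]


def best_on_grid(grid, loglik):
    """Return the grid value with the highest log-likelihood."""
    return max(zip(grid, loglik), key=lambda pair: pair[1])[0]


# Same grid as the documented default: 21 evenly spaced values from 0 to 1.
rg_grid = [i / 20 for i in range(21)]

# Toy GRMs for two individuals (illustrative values only).
K1 = [[1.0, 0.2], [0.2, 1.0]]
K2 = [[0.5, 0.1], [0.1, 0.5]]

# Toy log-likelihood curve peaking at 0.9; real values come from fitting
# the model with the combined GRM at each grid point.
toy_loglik = [-(r - 0.9) ** 2 for r in rg_grid]

r_hat = best_on_grid(rg_grid, toy_loglik)
print(r_hat, combine_grm(K1, K2, r_hat))
```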

Parameter options

admix.cli.admix_grm(pfile: str, out_prefix: str, maf_cutoff: float = 0.005, her_model='mafukb', freq_cols=['LANC_FREQ1', 'LANC_FREQ2'], snp_chunk_size: int = 256, snp_list: str | None = None, write_raw: bool = False) → None

Calculate the admix GRM for a given pfile

Parameters:

pfile (str) – Path to the pfile
out_prefix (str) – Prefix of the output files
maf_cutoff (float, optional) – MAF cutoff for the admixed individuals, by default 0.005
her_model (str, optional) – Heritability model, one of “uniform”, “gcta”, “ldak”, “mafukb”; by default “mafukb”
freq_cols (List[str], optional) – Columns of the pfile to use as frequencies when applying the ancestry-specific MAF cutoffs, by default [“LANC_FREQ1”, “LANC_FREQ2”]
snp_chunk_size (int, optional) – Number of SNPs to read at a time, by default 256. This can be tuned to reduce memory usage.
snp_list (str, optional) – Path to a file containing a list of SNPs to use. Each line should be a SNP ID. Only SNPs in the list will be used for the analysis. By default None
write_raw (bool, optional) – Whether to write the raw GRM, G1, G2, G12, by default False

Returns:

GRM files ({out_prefix}.[K1, K2].[grm.bin | grm.id | grm.n] will be generated)
Weight file ({out_prefix}.weight.tsv will be generated)

admix.cli.admix_grm_merge(prefix: str, out_prefix: str, n_part: int = 22) → None

Merge multiple GRM matrices

Parameters:

prefix (str) – Prefix of the GRM files; any files matching the pattern <prefix>.* will be merged
out_prefix (str) – Prefix of the output file
n_part (int, optional) – Number of partitions, by default 22

Returns:

GRM files ({out_prefix}.[K1, K2].[grm.bin | grm.id | grm.n] will be generated)
Weight file ({out_prefix}.weight.tsv will be generated)

admix.cli.genet_cor(pheno: str, grm_prefix: str, out_dir: str, rg_grid=array([0., 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.]), quantile_normalize: bool = True, n_thread: int = 2, clean: bool = True)

Estimate genetic correlation

Parameters:

pheno (str) – phenotype file, the 1st column contains ID, 2nd column contains phenotype, and the rest of columns are covariates.
grm_prefix (str) – prefix of the K1, K2 GRM files (in Step 3 above, the --out-prefix given to admix-grm-merge)
out_dir (str) – folder to store the output files
rg_grid (list, optional) – List of rg values to grid search, by default np.linspace(0, 1.0, 21)
quantile_normalize (bool, optional) – whether to quantile-normalize the phenotype and each covariate column, by default True
n_thread (int, optional) – number of threads, by default 2
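
To give a sense of what quantile normalization does to a phenotype or covariate column, the sketch below applies a rank-based inverse normal transform: values are ranked and each rank is mapped to the matching standard-normal quantile. This is one common form of quantile normalization. The rank offset `rank / (n + 1)`, the tie handling, and the helper name are assumptions for this example, and the procedure used by `genet_cor` may differ in its details.

```python
from statistics import NormalDist


def quantile_normalize(values):
    """Map values to standard-normal quantiles by rank.

    Ties are broken by input order (an assumption of this sketch).
    """
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        out[idx] = NormalDist().inv_cdf(rank / (n + 1))
    return out


# Toy phenotype values with one large outlier.
pheno = [2.3, 10.0, 0.4, 5.1, 3.3]
z = quantile_normalize(pheno)
print(z)
```

The transform keeps the ordering of the values while pulling in outliers such as the `10.0` above, which is why it is often applied before fitting variance-component models.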

Additional notes

For more background, we recommend reading “Causal effects on complex traits are similar across segments of different continental ancestries within admixed individuals”, Nature Genetics (2023).
