Genetic correlation across local ancestry segments¶

[1]:

import pandas as pd
import numpy as np
import admix
import os

[2]:

dset = admix.io.read_dataset("example_data/CEU-YRI")

[3]:

dset

[3]:

admix.Dataset object with n_snp x n_indiv = 15357 x 10000, n_anc=2
        snp: 'CHROM', 'POS', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'LANC_FREQ1', 'LANC_FREQ2', 'LANC_NHAPLO1', 'LANC_NHAPLO2', 'FREQ'

[4]:

np.random.seed(1234)
cor = 0.9
sim = admix.simulate.quant_pheno(dset, hsq=0.25, cor = 0.9, n_sim=10)

admix.simulate.quant_pheno: 100%|██████████████████████████████████████████████████████████████████████████| 15/15 [00:13<00:00,  1.08it/s]

[5]:

pheno_df = pd.DataFrame({"PHENO": sim['pheno'][:, 0], "COVAR": np.random.normal(size=dset.n_indiv)}, index=dset.indiv.index)
os.makedirs("example_data/genet_cor/", exist_ok=True)
pheno_df.to_csv("example_data/genet_cor/pheno.tsv", sep='\t')

out_dir=example_data/genet_cor
pfile=example_data/CEU-YRI

Step 1: Compute GRM¶

mkdir -p ${out_dir}/admix-grm
admix append-snp-info \
    --pfile ${pfile} \
    --out ${pfile}.snp_info

admix admix-grm \
    --pfile ${pfile} \
    --out-prefix ${out_dir}/admix-grm/grm

Step 2: Estimate genetic correlation¶

mkdir -p ${out_dir}/estimate
# this step will take a while
admix genet-cor \
    --pheno ${out_dir}/pheno.tsv \
    --grm-prefix ${out_dir}/admix-grm/grm \
    --out-dir ${out_dir}/estimate/

admix summarize-genet-cor \
    --est-dir ${out_dir}/estimate/ \
    --out-prefix ${out_dir}/estimate

cat ${out_dir}/estimate.summary.json

{
    "n": 10000,
    "rg_mode": 0.875,
    "rg_hpdi(50%)": [
        0.838,
        0.909
    ],
    "rg_hpdi(95%)": [
        0.754,
        0.961
    ],
    "rg=1_pval": 0.001345845749376501
}

Step 3: meta-analysis¶

To obtain results for simulations from all correlation parameters and simulation replicates (or for all traits in real data analysis), we recommend using computing clusters to parallelize this process. After these results are obtained, one can use admix meta-analyze-genet-cor to meta-analyze these results. For example,

admix meta-analyze-genet-cor --loglkl-files "${out_dir}/*.loglkl.txt"