admix.simulate.admix_geno
- admix.simulate.admix_geno(geno_list: List[Array], df_snp: DataFrame, anc_props: List[float], mosaic_size: float, n_indiv: int, return_sparse_lanc=False) → Dataset
Simulate admixed genotype
The generative model is:
- for each ancestry, the allele frequencies are drawn
- for each individual, breakpoints are drawn from a Poisson process, and the ancestry
of each resulting segment is filled in from a multinomial distribution with n_anc components
- for each SNP, given the ancestry and the allele frequencies, the haplotype is drawn.
Haplotypes are simulated under these allele frequencies.
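The three steps of the generative model can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used by admix.simulate.admix_geno: the helper name simulate_admixed_haplotype is hypothetical, the allele frequencies are drawn uniformly for the sake of the example, and segment lengths are taken as rounded exponential gaps with mean mosaic_size (the inter-arrival times of a Poisson process).

```python
import random


def simulate_admixed_haplotype(freqs, anc_props, mosaic_size, rng):
    """Hypothetical helper: simulate one haplotype as (local_ancestry, alleles).

    freqs       -- one list of allele frequencies per ancestry, each of length n_snp
    anc_props   -- ancestry proportions, used as weights for the ancestry draw
    mosaic_size -- expected segment length, in number of SNPs
    """
    n_snp = len(freqs[0])
    n_anc = len(anc_props)

    # Step 2: breakpoints from a Poisson process. The gaps between breakpoints
    # are exponential with mean mosaic_size; each segment gets one ancestry
    # drawn with probabilities anc_props.
    lanc = []
    while len(lanc) < n_snp:
        seg_len = max(1, round(rng.expovariate(1.0 / mosaic_size)))
        anc = rng.choices(range(n_anc), weights=anc_props)[0]
        lanc.extend([anc] * seg_len)
    lanc = lanc[:n_snp]

    # Step 3: at each SNP, draw the allele from the frequency of the
    # ancestry assigned to that position.
    alleles = [int(rng.random() < freqs[a][j]) for j, a in enumerate(lanc)]
    return lanc, alleles


rng = random.Random(0)
n_snp, n_anc = 200, 2

# Step 1: allele frequencies per ancestry (uniform draw is an assumption
# made for this example only).
freqs = [[rng.uniform(0.05, 0.95) for _ in range(n_snp)] for _ in range(n_anc)]

lanc, hap = simulate_admixed_haplotype(freqs, [0.5, 0.5], 20.0, rng)
```

Here lanc holds the ancestry index at each SNP and hap holds the 0/1 alleles. A diploid individual would correspond to two such haplotypes drawn independently.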
- Parameters:
geno_list (List[da.Array]) – List of ancestral data sets, each of shape (n_snp, n_indiv)
df_snp (pd.DataFrame) – Dataframe of SNPs shared across ancestral data sets
anc_props (list of float) – Proportions of the ancestral populations
mosaic_size (float) – Expected mosaic size in number of SNPs. Use admix.lanc.calculate_mosaic_size() to calculate it
n_indiv (int) – Number of individuals to simulate
- Returns:
admix.Dataset – Simulated admixed dataset