admix.simulate.admix_geno

admix.simulate.admix_geno(geno_list: List[Array], df_snp: DataFrame, anc_props: List[float], mosaic_size: float, n_indiv: int, return_sparse_lanc=False) → Dataset

Simulate admixed genotypes.

The generative model is:

  • for each ancestry, the allele frequencies are drawn

  • for each individual, breakpoints are drawn from a Poisson process, and the ancestry of each segment is filled in based on a multinomial distribution with n_anc components

  • for each SNP, given the ancestry and the allele frequencies, the haplotype is drawn. Haplotypes are simulated under these allele frequencies
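
The generative model above can be sketched in plain NumPy. This is an illustrative sketch only, not the library's implementation. The helper names simulate_local_ancestry and simulate_haplotypes are hypothetical. Allele frequencies are drawn uniformly at random here as a stand-in for however the function obtains them, segment lengths are rounded-up exponential draws (a discretised Poisson process), and one ancestry is drawn per segment with probabilities anc_props.

```python
import numpy as np


def simulate_local_ancestry(n_snp, n_haplo, anc_props, mosaic_size, rng):
    """Return an (n_haplo, n_snp) matrix of ancestry labels (hypothetical helper)."""
    anc_props = np.asarray(anc_props, dtype=float)
    lanc = np.empty((n_haplo, n_snp), dtype=np.int8)
    for h in range(n_haplo):
        pos = 0
        while pos < n_snp:
            # Gaps between breakpoints of a Poisson process are exponential;
            # round up so every segment covers at least one SNP.
            seg_len = max(1, int(np.ceil(rng.exponential(mosaic_size))))
            end = min(pos + seg_len, n_snp)
            # One categorical (single-trial multinomial) draw per segment.
            lanc[h, pos:end] = rng.choice(len(anc_props), p=anc_props)
            pos = end
    return lanc


def simulate_haplotypes(lanc, freqs, rng):
    """Draw 0/1 alleles given ancestry labels and per-ancestry frequencies.

    freqs has shape (n_anc, n_snp) (hypothetical helper).
    """
    n_snp = lanc.shape[1]
    # p[h, s] is the frequency at SNP s in the ancestry carried by haplotype h.
    p = freqs[lanc, np.arange(n_snp)]
    return (rng.random(p.shape) < p).astype(np.int8)


rng = np.random.default_rng(0)
n_snp, n_indiv = 200, 10
anc_props = [0.2, 0.8]
mosaic_size = 25.0  # expected segment length, in SNPs

# Assumption: frequencies drawn uniformly, purely for illustration.
freqs = rng.uniform(0.05, 0.95, size=(len(anc_props), n_snp))

lanc = simulate_local_ancestry(n_snp, 2 * n_indiv, anc_props, mosaic_size, rng)
hap = simulate_haplotypes(lanc, freqs, rng)

# Pair consecutive haplotypes into diploid individuals: (n_indiv, n_snp).
geno = hap.reshape(n_indiv, 2, n_snp).sum(axis=1)
```

Note that this sketch stores arrays as (individual, SNP), whereas geno_list is documented as (n_snp, n_indiv); transpose as needed when comparing against real inputs.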

Parameters:
  • geno_list (List[da.Array]) – List of ancestral data sets, each with shape (n_snp, n_indiv)

  • df_snp (pd.DataFrame) – Dataframe of SNPs shared across ancestral data sets

  • anc_props (list of float) – Proportions of the ancestral populations

  • mosaic_size (float) – Expected mosaic size in number of SNPs. Use admix.lanc.calculate_mosaic_size() to calculate the mosaic size

  • n_indiv (int) – Number of individuals to simulate

Returns:

admix.Dataset – Simulated admixed dataset