qtl_power package
Module contents
Initialization of qtl-power module.
Submodules
qtl_power.extreme_pheno module
Power calculations for extreme phenotype sampling designs.
- class qtl_power.extreme_pheno.ExtremePhenotype
- Bases: object
- Class defining extreme phenotype designs.
 - est_power_extreme_pheno(n=100, maf=0.01, beta=0.1, niter=100, alpha=0.05, q0=0.1, q1=0.1)
- Estimate the power from an extreme-phenotype sampling design. - Parameters:
- n (int) – total sample size. 
- maf (float) – minor allele frequency of tested variant. 
- beta (float) – effect-size in standard deviations. 
- niter (int) – number of simulation iterations. 
- alpha (float) – significance threshold for Fisher's exact test. 
- q0 (float) – bottom quantile to establish as controls (or low-extremes). 
- q1 (float) – upper quantile to establish as cases (or upper extremes). 
 
- Returns:
- power of extreme sampling design 
- Return type:
- power (float) 
 
 - sim_extreme_pheno(n=100, maf=0.01, beta=0.1, seed=42)
- Simulate an extreme phenotype under an HWE assumption. - Parameters:
- n (int) – total sample size. 
- maf (float) – minor allele frequency of tested variant. 
- beta (float) – effect-size in standard deviations. 
- seed (int) – random seed for simulations. 
 
- Returns:
- allele_count (np.array): vector of allele counts. 
- phenotypes (np.array): quantitative phenotypes. 
- Return type:
- tuple (np.array, np.array) 
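The two methods above describe a simulate-then-test loop. The following is a minimal sketch of that idea, not the package's implementation: the function names `simulate_extreme_pheno` and `estimate_power` are hypothetical, and the choices of a unit-variance normal residual and a carrier versus non-carrier 2x2 table are assumptions of the sketch.

```python
import numpy as np
from scipy.stats import fisher_exact


def simulate_extreme_pheno(n=100, maf=0.01, beta=0.1, seed=42):
    """Draw genotypes under HWE and a quantitative phenotype (illustrative)."""
    rng = np.random.default_rng(seed)
    # Under HWE the minor-allele count per individual is Binomial(2, maf).
    allele_count = rng.binomial(2, maf, size=n)
    # Assumed model: beta standard deviations per allele plus N(0, 1) noise.
    phenotypes = beta * allele_count + rng.normal(size=n)
    return allele_count, phenotypes


def estimate_power(n=100, maf=0.01, beta=0.1, niter=100, alpha=0.05, q0=0.1, q1=0.1):
    """Fraction of simulations in which Fisher's exact test rejects at alpha."""
    hits = 0
    for i in range(niter):
        g, y = simulate_extreme_pheno(n=n, maf=maf, beta=beta, seed=i)
        lo, hi = np.quantile(y, [q0, 1.0 - q1])
        controls, cases = g[y <= lo], g[y >= hi]
        # Rows: cases / controls. Columns: carriers / non-carriers.
        table = [
            [int(np.sum(cases > 0)), int(np.sum(cases == 0))],
            [int(np.sum(controls > 0)), int(np.sum(controls == 0))],
        ]
        _, pval = fisher_exact(table)
        hits += int(pval < alpha)
    return hits / niter
```

Calling `estimate_power(n=2000, maf=0.2, beta=1.0, niter=20)` exercises the whole loop; because the estimate is a Monte Carlo proportion, its resolution is limited to steps of `1 / niter`.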
 
 
qtl_power.gwas module
Functions to calculate power in GWAS designs.
- class qtl_power.gwas.Gwas
- Bases: object
- Parent class for GWAS Power calculation.
 - llr_power(alpha=5e-08, df=1, ncp=1)
- Power under a non-central chi-squared distribution. - Parameters:
- alpha (float) – p-value threshold for GWAS 
- df (int) – degrees of freedom 
- ncp (float) – non-centrality parameter 
 
- Returns:
- power for association 
- Return type:
- power (float) 
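As an illustration of what "power under a non-central chi-squared distribution" means, here is a small sketch using SciPy. The function name `llr_power_sketch` is hypothetical, and the sketch may not match the package's internals: it takes the critical value from the central chi-squared distribution at level `alpha` and then evaluates the tail of the non-central distribution beyond it.

```python
from scipy.stats import chi2, ncx2


def llr_power_sketch(alpha=5e-8, df=1, ncp=1.0):
    """Probability that a non-central chi-squared statistic exceeds the
    level-alpha critical value of the central chi-squared distribution."""
    # Critical value under the null (central chi-squared with df degrees of freedom).
    crit = chi2.isf(alpha, df)
    # Tail mass of the alternative (non-central chi-squared) beyond that value.
    return float(ncx2.sf(crit, df, ncp))
```

Power computed this way increases with `ncp` and approaches `alpha` as `ncp` goes to zero.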
 
 
- class qtl_power.gwas.GwasBinary
- Bases: Gwas
- GWAS Power calculator for Case/Control study design.
 - binary_trait_beta_power(n=100, power=0.9, p=0.1, r2=1.0, alpha=5e-08, prop_cases=0.5)
- Optimal detectable effect-size under a case-control GWAS study design. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- power (float) – threshold power level. 
- p (float) – minor allele frequency of variant. 
- r2 (float) – correlation r2 between causal variant and tag variant. 
- alpha (float) – p-value threshold for detection. 
- prop_cases (float) – proportion of samples that are cases. 
 
- Returns:
- detectable effect-size at this power level. 
- Return type:
- opt_beta (float) 
 
 - binary_trait_opt_n(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-08, prop_cases=0.5)
- Determine the sample-size required to detect this effect. - Parameters:
- beta (float) – effect-size of the variant. 
- power (float) – threshold power level. 
- p (float) – minor allele frequency of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- alpha (float) – p-value threshold for GWAS 
- prop_cases (float) – proportion of cases in the dataset 
 
- Returns:
- optimal sample size for detection at this power-level. 
- Return type:
- opt_n (float) 
 
 - binary_trait_power(n=100, p=0.1, beta=0.1, r2=1.0, alpha=5e-08, prop_cases=0.1)
- Power under a case-control GWAS study design. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- beta (float) – effect-size of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- alpha (float) – p-value threshold for detection. 
- prop_cases (float) – proportion of samples that are cases. 
 
- Returns:
- power for association. 
- Return type:
- power (float) 
 
 - ncp_binary(n=100, p=0.1, beta=0.1, r2=1.0, prop_cases=0.1)
- Compute the non-centrality parameter for a case-control GWAS. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- beta (float) – effect-size of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- prop_cases (float) – proportion of samples that are cases. 
 
- Returns:
- non-centrality parameter. 
- Return type:
- ncp (float) 
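For intuition about how the arguments of `ncp_binary` interact, the sketch below uses a widely used small-effect approximation for case-control designs. This is an assumption of the sketch and may differ from the formula the package implements: `beta` is treated as a log-odds ratio and `ncp_binary_sketch` is a hypothetical name.

```python
def ncp_binary_sketch(n=100, p=0.1, beta=0.1, r2=1.0, prop_cases=0.1):
    """Approximate non-centrality parameter for a case-control association test.

    Assumed approximation: n * r2 * 2p(1-p) * phi(1-phi) * beta**2, where phi
    is the fraction of cases and beta is a (small) log-odds ratio.
    """
    genotype_var = 2.0 * p * (1.0 - p)  # variance of allele count under HWE
    design_var = prop_cases * (1.0 - prop_cases)  # largest at prop_cases = 0.5
    return n * r2 * genotype_var * design_var * beta**2
```

Under this approximation a balanced design (`prop_cases=0.5`) yields the largest non-centrality parameter for a fixed total sample size, and imperfect tagging (`r2 < 1`) shrinks it proportionally.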
 
 
- class qtl_power.gwas.GwasBinaryModel
- Bases: Gwas
- GWAS Power calculations under different encodings of genotypic risk.
 - binary_trait_beta_power_model(n=100, p=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5, power=0.9)
- Threshold effects under a specific power threshold and genetic model. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- model (string) – genetic model for effects (additive, recessive, or dominant). 
- prev (float) – prevalence of the trait in question. 
- alpha (float) – p-value threshold for detection. 
- prop_cases (float) – proportion of samples that are cases. 
- power (float) – power under the model. 
 
- Returns:
- detectable effect-size at the power threshold and model. 
- Return type:
- opt_beta (float) 
 
 - binary_trait_power_model(n=100, p=0.1, beta=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5)
- Power under a case-control GWAS study design. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- beta (float) – effect-size of variant (in terms of relative-risk). 
- model (string) – genetic model for effects (additive, recessive, or dominant). 
- prev (float) – prevalence of the trait in question. 
- alpha (float) – p-value threshold for detection. 
- prop_cases (float) – proportion of samples that are cases. 
 
- Returns:
- power under the model. 
- Return type:
- power (float) 
 
 - ncp_binary_model(n=100, p=0.1, beta=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5)
- Explore how multiple models affect power in case-control traits. 
 
- class qtl_power.gwas.GwasQuant
- Bases: Gwas
- Class for power calculations of a GWAS for a quantitative trait.
 - ncp_quant(n=100, p=0.1, beta=0.1, r2=1.0)
- Compute the non-centrality parameter for a quantitative trait GWAS. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- beta (float) – effect-size of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
 
- Returns:
- non-centrality parameter. 
- Return type:
- ncp (float) 
 
 - quant_trait_beta_power(n=100, power=0.9, p=0.1, r2=1.0, alpha=5e-08)
- Determine the effect-size required to detect an association at this MAF. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- power (float) – threshold power level. 
- p (float) – minor allele frequency of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- alpha (float) – p-value threshold for GWAS 
 
- Returns:
- optimal beta for detection at a specific power level 
- Return type:
- opt_beta (float) 
 
 - quant_trait_opt_n(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-08)
- Determine the sample-size required to detect this effect. - Parameters:
- beta (float) – effect-size of the variant. 
- power (float) – threshold power level. 
- p (float) – minor allele frequency of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- alpha (float) – p-value threshold for GWAS 
 
- Returns:
- optimal sample size for detection at this power-level. 
- Return type:
- opt_n (float) 
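Finding the sample size for a target power amounts to inverting a monotone power curve. The sketch below shows one way to do that with only the standard library; it is not the package's routine. The names `quant_power_sketch` and `opt_n_sketch` are hypothetical, and the non-centrality formula `n * r2 * 2p(1-p) * beta**2` assumes a phenotype standardized to unit variance.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def quant_power_sketch(n, p=0.1, beta=0.1, r2=1.0, alpha=5e-8):
    """Approximate power of a 1-df association test for a quantitative trait."""
    # Assumed NCP for a unit-variance phenotype.
    ncp = n * r2 * 2.0 * p * (1.0 - p) * beta**2
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    mu = ncp**0.5
    # With df=1 the statistic is (Z + mu)**2, so its upper tail is the sum
    # of two normal tails.
    return _STD_NORMAL.cdf(mu - z_crit) + _STD_NORMAL.cdf(-mu - z_crit)


def opt_n_sketch(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-8, n_cap=10**12):
    """Smallest integer n whose approximate power reaches the target."""
    lo, hi = 0, 1
    # Double the upper bound until it reaches the target power.
    while quant_power_sketch(hi, p, beta, r2, alpha) < power:
        lo, hi = hi, hi * 2
        if hi > n_cap:
            raise ValueError("target power not reachable below n_cap")
    # Integer bisection: power is monotone increasing in n.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if quant_power_sketch(mid, p, beta, r2, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi
```

The same bracketing-and-bisection pattern applies to solving for the detectable effect size at a fixed `n`, since power is also monotone in the absolute value of `beta`.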
 
 - quant_trait_power(n=100, p=0.1, beta=0.1, r2=1.0, alpha=5e-08)
- Power for a quantitative trait association study. - Parameters:
- n (int) – sample-size of unrelated individuals. 
- p (float) – minor allele frequency of variant. 
- beta (float) – effect-size of variant. 
- r2 (float) – correlation r2 between causal variant and tagging variant. 
- alpha (float) – p-value threshold for GWAS 
 
- Returns:
- power for association. 
- Return type:
- power (float) 
 
 
qtl_power.rare_variants module
Estimating power for rare-variant association methods from PAGEANT.
- class qtl_power.rare_variants.RareVariantBurdenPower
- Bases: RareVariantPower
- Approximation of power for rare-variant burden tests based on results from Derkach et al. (2018).
 - ncp_burden_test_model1(n=100, j=30, jd=10, jp=0, tev=0.1)
- Approximation of the non-centrality parameter under model S1 from Derkach et al. - The key assumption in this case is independence between an allele's effect size and its explained variance (EV). - Parameters:
- n (int) – total sample size. 
- j (int) – total number of variants in the gene. 
- jd (int) – number of disease variants in the gene. 
- jp (int) – number of protective variants in the gene. 
- tev (float) – proportion of variance explained by gene. 
 
- Returns:
- non-centrality parameter. 
- Return type:
- ncp (float) 
 
 - ncp_burden_test_model2(ws, ps, n=100, jd=10, tev=0.1)
- Approximation of the non-centrality parameter under model S2 from Derkach et al. - The key assumption in this case is independence between an allele's effect size and its MAF. - Parameters:
- ws (np.array) – numpy array of variant weights 
- ps (np.array) – numpy array of variant frequencies 
- n (int) – total sample size. 
- jd (int) – number of disease variants in the gene. 
- tev (float) – proportion of variance explained by gene. 
 
- Returns:
- non-centrality parameter. 
- Return type:
- ncp (float) 
 
 - ncp_burden_test_model3(ws, ps, n=100, jd=10, tev=0.1, eta=0.1)
- Approximation of the non-centrality parameter under model S3 from Derkach et al. - The key assumption in this case is that an allele's effect size is strongly coupled to its MAF. - Parameters:
- ws (np.array) – numpy array of variant weights 
- ps (np.array) – numpy array of variant frequencies 
- n (int) – total sample size. 
- jd (int) – number of disease variants in the gene. 
- tev (float) – proportion of variance explained by gene. 
 
- Returns:
- non-centrality parameter. 
- Return type:
- ncp (float) 
 
 - opt_n_burden_model1(j=30, tev=0.01, prop_causal=0.8, prop_risk=0.5, alpha=1e-06, power=0.8)
- Estimate the sample-size required for detection of supplied TEV in a region. - Parameters:
- j (int) – total number of variants in the gene. 
- tev (float) – proportion of variance explained by gene. 
- prop_causal (float) – proportion of causal variants. 
- prop_risk (float) – number of protective variants. 
- alpha (float) – p-value threshold for power. 
- power (float) – power for detection under the burden model. 
 
- Returns:
- sample size required for detection at this power level. 
- Return type:
- opt_n (float) 
 
 - power_burden_model1(n=100, j=30, prop_causal=0.8, prop_risk=0.1, tev=0.1, alpha=1e-06)
- Estimate the power under burden model 1 from PAGEANT. - Parameters:
- n (int) – total sample size. 
- j (int) – total number of variants in the gene. 
- prop_causal (float) – proportion of causal variants. 
- prop_risk (float) – number of protective variants. 
- tev (float) – proportion of variance explained by gene. 
- alpha (float) – p-value threshold for power. 
 
- Returns:
- power for detection under the burden model. 
- Return type:
- power (float) 
 
 - power_burden_model1_real(n=100, nreps=10, **kwargs)
- Estimate power under model 1 from PAGEANT with realistic variants per gene. - Parameters:
- n (int) – number of samples 
- nreps (int) – number of replicates 
 
- Returns:
- array of power estimates based on realistic number of variants. 
- Return type:
- est_power (np.array) 
 
 - power_burden_model2(ws, ps, n=100, j=30, prop_causal=0.8, prop_risk=0.1, tev=0.1, alpha=1e-06)
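- Estimate the power under burden model 2 from PAGEANT. - Parameters:
- ws (np.array) – numpy array of variant weights 
- ps (np.array) – numpy array of variant frequencies 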
- n (int) – total sample size. 
- j (int) – total number of variants in the gene. 
- prop_causal (float) – proportion of causal variants. 
- prop_risk (float) – number of protective variants. 
- tev (float) – proportion of variance explained by gene. 
- alpha (float) – p-value threshold for power. 
 
- Returns:
- power for detection under the burden model. 
- Return type:
- power (float) 
 
 - tev_power_burden_model1(n=100, j=30, prop_causal=0.8, prop_risk=0.5, alpha=1e-06, power=0.8)
- Estimate the total explained variance by a region for adequate detection at a power threshold. - Parameters:
- n (int) – total sample size. 
- j (int) – total number of variants in the gene. 
- prop_causal (float) – proportion of causal variants. 
- prop_risk (float) – number of protective variants. 
- alpha (float) – p-value threshold for power. 
- power (float) – power for detection under the burden model. 
 
- Returns:
- TEV required for detection at this rate. 
- Return type:
- opt_tev (float) 
 
 
- class qtl_power.rare_variants.RareVariantPower
- Bases: object
- Power calculator for rare-variant association tests. - Methods based on derivations from [PAGEANT](https://doi.org/10.1093/bioinformatics/btx770).
 - llr_power(alpha=1e-06, df=1, ncp=1, ncp0=0)
- Power under a non-central chi-squared distribution. - Parameters:
- alpha (float) – p-value threshold for GWAS 
- df (int) – degrees of freedom 
- ncp (float) – non-centrality parameter 
- ncp0 (float) – null non-centrality parameter 
 
- Returns:
- power for association 
- Return type:
- power (float) 
 
 - sim_af_weights(j=100, a1=0.1846, b1=11.1248, n=100, clip=True, seed=42, test='SKAT')
- Simulate allele frequencies from a beta distribution. - Ideally the beta distribution is fit to observed allele frequencies. The default parameters are based on 15k African ancestry individuals. To mimic a much larger set (112k) of Non-Finnish European individuals, use a1=0.14311324240262455, b1=26.97369198989023. - Parameters:
- j (int) – number of variants 
- a1 (float) – first shape parameter of the beta distribution 
- b1 (float) – second shape parameter of the beta distribution 
- n (float) – number of samples 
- clip (boolean) – perform clipping based on the current sample-size. 
- seed (int) – random seed. 
- test (string) – type of test to be performed (SKAT, Calpha, Hotelling) 
 
- Returns:
- ws (np.array): array of weights per-variant. 
- ps (np.array): array of allele frequencies. 
- Return type:
- tuple (np.array, np.array) 
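A rough sketch of the simulation described above, using the documented default shape parameters. The clipping rule (at least one expected copy among `2n` chromosomes, at most 0.5) and the Beta(1, 25) density weights, a common default for SKAT-style tests, are assumptions of this sketch rather than confirmed behaviour of `sim_af_weights`; the name `sim_af_weights_sketch` is hypothetical.

```python
import numpy as np


def sim_af_weights_sketch(j=100, a1=0.1846, b1=11.1248, n=100, clip=True, seed=42):
    """Simulate per-variant allele frequencies and frequency-based weights."""
    rng = np.random.default_rng(seed)
    # Allele frequencies from Beta(a1, b1); with a1 < 1 most mass sits near zero.
    ps = rng.beta(a1, b1, size=j)
    if clip:
        # Assumed rule: keep frequencies within [1 / (2n), 0.5].
        ps = np.clip(ps, 1.0 / (2 * n), 0.5)
    # Assumed weights: Beta(1, 25) density, i.e. 25 * (1 - p)**24,
    # which up-weights rarer variants.
    ws = 25.0 * (1.0 - ps) ** 24
    return ws, ps
```

The return order `(ws, ps)` follows the documented return description, weights first and allele frequencies second.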
 
 - sim_var_per_gene(a=1.47, b=0.0108, seed=42)
- Simulate the number of variants per gene. - Parameter values are derived from gnomAD exonic variants on chromosome 4 from ~15730 AFR ancestry subjects. - For a Non-Finnish European ancestry setting with a larger sample size (~112350), use a=1.44306, b=0.00372. - Parameters:
- a (float) – shape parameter for a gamma distribution 
- b (float) – scale parameter for a gamma distribution 
- seed (int) – random seed. 
 
- Returns:
- number of variants per-gene. 
- Return type:
- nvar (int) 
 
 
- class qtl_power.rare_variants.RareVariantVCPower
- Bases: RareVariantPower
- Approximation of power for rare-variant variance component tests based on results from Derkach et al. (2018).
 - match_cumulants_ncp(c1, c2, c3, c4)
- Obtain the degrees of freedom and non-centrality parameter from cumulants. - Parameters:
- c1 (float) – first cumulant of non-central chi-squared dist. 
- c2 (float) – second cumulant of non-central chi-squared dist. 
- c3 (float) – third cumulant of non-central chi-squared dist. 
- c4 (float) – fourth cumulant of non-central chi-squared dist. 
 
- Returns:
- df (int): degrees of freedom for test. 
- ncp (float): non-centrality parameter. 
- Return type:
- tuple (int, float) 
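To illustrate how degrees of freedom and a non-centrality parameter can be recovered from cumulants, the sketch below matches skewness- and kurtosis-type ratios. It relies on the fact that the k-th cumulant of a non-central chi-squared distribution is `2**(k-1) * (k-1)! * (df + k * ncp)`. Whether `match_cumulants_ncp` expects raw or rescaled cumulants is not stated in this reference, so the raw-cumulant convention and the name `match_cumulants_sketch` are assumptions.

```python
from math import sqrt


def match_cumulants_sketch(k1, k2, k3, k4):
    """Recover (df, ncp) of a non-central chi-squared from raw cumulants.

    k1 is not needed to solve for df and ncp here; it is kept only to
    mirror the four-argument signature.
    """
    # Strip the factor 2**(k-1) * (k-1)! so that c_k = df + k * ncp.
    c2, c3, c4 = k2 / 2.0, k3 / 8.0, k4 / 48.0
    s1 = c3 / c2**1.5  # skewness-type ratio
    s2 = c4 / c2**2  # kurtosis-type ratio
    if s1**2 > s2:
        a = 1.0 / (s1 - sqrt(s1**2 - s2))
        ncp = s1 * a**3 - a**2
        df = a**2 - 2.0 * ncp
    else:
        # No non-centrality is needed: fall back to a central chi-squared.
        ncp = 0.0
        df = 1.0 / s1**2
    return df, ncp
```

For example, a non-central chi-squared with `df=5` and `ncp=3` has raw cumulants 8, 22, 112 and 816, and passing those values recovers the original pair up to floating-point error.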
 
 - ncp_vc_first_order_model1(ws, ps, n=100, tev=0.1)
- Approximation of the non-centrality parameter under model S1 from Derkach et al. - The key assumption is independence between an allele's effect size and its MAF, from Table S1 in Derkach et al. - Parameters:
- ws (np.array) – numpy array of weights per-variant 
- ps (np.array) – numpy array of allele frequencies 
- n (int) – sample size 
- tev (float) – total explained variance by a locus 
 
- Returns:
- df (float): degrees of freedom for variance component test. 
- ncp (float): non-centrality parameter. 
- Return type:
- tuple (float, float) 
 
 - power_vc_first_order_model1(ws, ps, n=100, tev=0.1, alpha=1e-06, df=1)
- Compute the power for detection under model 1 for a variance component test. - Parameters:
- ws (np.array) – numpy array of weights per-variant 
- ps (np.array) – numpy array of allele frequencies 
- n (int) – sample size 
- tev (float) – total explained variance by a locus 
- alpha (float) – total significance level for estimation of power 
- df (float) – degrees of freedom for test 
 
- Returns:
- estimated power under this variance component model. 
- Return type:
- power (float)