qtl_power package
Module contents
Initialization of qtl-power module.
qtl_power.extreme_pheno module
Power calculations for extreme phenotype sampling designs.
- class qtl_power.extreme_pheno.ExtremePhenotype
Class defining extreme phenotype designs.
- est_power_extreme_pheno(n=100, maf=0.01, beta=0.1, niter=100, alpha=0.05, q0=0.1, q1=0.1)
Estimate the power from an extreme-phenotype sampling design.
- Parameters:
n (int) – total sample size.
maf (float) – minor allele frequency of tested variant.
beta (float) – effect-size in standard deviations.
niter (int) – number of simulation iterations.
alpha (float) – significance threshold for Fisher's exact test.
q0 (float) – bottom quantile to establish as controls (or low-extremes).
q1 (float) – upper quantile to establish as cases (or upper extremes).
- Returns:
power of extreme sampling design
- Return type:
power (float)
- sim_extreme_pheno(n=100, maf=0.01, beta=0.1, seed=42)
Simulate an extreme phenotype under an HWE assumption.
- Parameters:
n (int) – total sample size.
maf (float) – minor allele frequency of tested variant.
beta (float) – effect-size in standard deviations.
seed (int) – random seed for simulations.
- Returns:
allele_count (np.array): vector of allele counts.
phenotypes (np.array): quantitative phenotypes.
- Return type:
tuple (np.array, np.array)
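The two methods above can be pictured with a small standard-library simulation. This is only a sketch under stated assumptions, not the package's implementation: the helper names (`simulate_pheno`, `fisher_exact_two_sided`, `est_power`) are hypothetical, phenotypes are assumed to be `beta * allele_count` plus standard normal noise, and the test is assumed to compare minor-allele counts between the two extreme groups.

```python
import random
from math import comb


def simulate_pheno(n=100, maf=0.01, beta=0.1, rng=None):
    """Draw allele counts under HWE and phenotypes with unit-variance noise."""
    if rng is None:
        rng = random.Random(42)
    # Under HWE the two alleles of an individual are independent Bernoulli(maf) draws.
    allele_count = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    phenotypes = [beta * g + rng.gauss(0.0, 1.0) for g in allele_count]
    return allele_count, phenotypes


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(total - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (total - row1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    pval = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, pval)


def est_power(n=100, maf=0.01, beta=0.1, niter=100, alpha=0.05,
              q0=0.1, q1=0.1, seed=42):
    """Fraction of simulated datasets in which the extremes differ at level alpha."""
    rng = random.Random(seed)
    n_low, n_high = int(n * q0), int(n * q1)
    hits = 0
    for _ in range(niter):
        geno, pheno = simulate_pheno(n, maf, beta, rng)
        order = sorted(range(n), key=pheno.__getitem__)
        low = sum(geno[i] for i in order[:n_low])         # minor alleles, low extreme
        high = sum(geno[i] for i in order[n - n_high:])   # minor alleles, high extreme
        # Rows: high/low extreme; columns: minor/major allele counts.
        pval = fisher_exact_two_sided(high, 2 * n_high - high, low, 2 * n_low - low)
        hits += pval < alpha
    return hits / niter


print(est_power(n=500, maf=0.2, beta=0.5, niter=50))
```

Because the estimate is a Monte Carlo average over `niter` datasets, its resolution is limited to steps of `1 / niter`; more iterations give a smoother estimate at a proportional cost in run time.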
qtl_power.gwas module
Functions to calculate power in GWAS designs.
- class qtl_power.gwas.Gwas
Parent class for GWAS Power calculation.
- llr_power(alpha=5e-08, df=1, ncp=1)
Power under a non-central chi-squared distribution.
- Parameters:
alpha (float) – p-value threshold for GWAS
df (int) – degrees of freedom
ncp (float) – non-centrality parameter
- Returns:
power for association
- Return type:
power (float)
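For the common one-degree-of-freedom case, the tail probability of a non-central chi-squared distribution can be written with the standard normal distribution alone. The sketch below is an illustration of that identity, not the code behind `llr_power`; `chi2_power_df1` is a hypothetical helper restricted to df=1 and built only on the Python standard library.

```python
from statistics import NormalDist


def chi2_power_df1(alpha=5e-8, ncp=1.0):
    """Power of a 1-df chi-squared test with non-centrality parameter ncp.

    A 1-df non-central chi-squared variable equals (Z + sqrt(ncp))**2
    for a standard normal Z, so the tail reduces to two normal tails.
    """
    std_norm = NormalDist()
    # Critical value on the Z scale (square root of the chi-squared threshold).
    z_crit = -std_norm.inv_cdf(alpha / 2.0)
    shift = ncp ** 0.5
    # P(|Z + shift| > z_crit), split into its upper and lower tails.
    upper = 1.0 - std_norm.cdf(z_crit - shift)
    lower = std_norm.cdf(-z_crit - shift)
    return upper + lower


print(chi2_power_df1(alpha=0.05, ncp=0.0))   # reduces to alpha when ncp is zero
print(chi2_power_df1(alpha=5e-8, ncp=40.0))
```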
- class qtl_power.gwas.GwasBinary
GWAS Power calculator for Case/Control study design.
- binary_trait_beta_power(n=100, power=0.9, p=0.1, r2=1.0, alpha=5e-08, prop_cases=0.5)
Optimal detectable effect-size under a case-control GWAS study design.
- Parameters:
n (int) – sample-size of unrelated individuals.
power (float) – threshold power level.
p (float) – minor allele frequency of variant.
r2 (float) – correlation r2 between causal variant and tag variant.
alpha (float) – p-value threshold for detection.
prop_cases (float) – proportion of samples that are cases.
- Returns:
detectable effect-size at this power level.
- Return type:
opt_beta (float)
- binary_trait_opt_n(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-08, prop_cases=0.5)
Determine the sample-size required to detect this effect.
- Parameters:
beta (float) – effect-size of the variant.
power (float) – threshold power level.
p (float) – minor allele frequency of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
alpha (float) – p-value threshold for GWAS
prop_cases (float) – proportion of cases in the dataset
- Returns:
optimal sample size for detection at this power-level.
- Return type:
opt_n (float)
- binary_trait_power(n=100, p=0.1, beta=0.1, r2=1.0, alpha=5e-08, prop_cases=0.1)
Power under a case-control GWAS study design.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
beta (float) – effect-size of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
alpha (float) – p-value threshold for detection.
prop_cases (float) – proportion of samples that are cases.
- Returns:
power for association.
- Return type:
power (float)
- ncp_binary(n=100, p=0.1, beta=0.1, r2=1.0, prop_cases=0.1)
Compute the non-centrality parameter for a case-control GWAS study design.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
beta (float) – effect-size of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
prop_cases (float) – proportion of samples that are cases.
- Returns:
non-centrality parameter.
- Return type:
ncp (float)
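One common small-effect approximation for the case-control non-centrality parameter scales the sample size by the case/control balance and the genotype variance. The sketch below assumes that form, with `beta` read as a log odds ratio; whether `ncp_binary` uses this exact expression is not stated here, and `approx_ncp_binary` is a hypothetical name.

```python
def approx_ncp_binary(n=100, p=0.1, beta=0.1, r2=1.0, prop_cases=0.1):
    """Approximate NCP for a 1-df case-control test (assumed formula)."""
    # Genotype variance under Hardy-Weinberg equilibrium.
    geno_var = 2.0 * p * (1.0 - p)
    # Case/control imbalance shrinks the effective sample size.
    balance = prop_cases * (1.0 - prop_cases)
    return n * balance * r2 * geno_var * beta ** 2


# A balanced design yields a larger NCP than an unbalanced one of the same size.
print(approx_ncp_binary(n=10_000, prop_cases=0.5))
print(approx_ncp_binary(n=10_000, prop_cases=0.1))
```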
- class qtl_power.gwas.GwasBinaryModel
GWAS Power calculations under different encodings of genotypic risk.
- binary_trait_beta_power_model(n=100, p=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5, power=0.9)
Threshold effects under a specific power threshold and genetic model.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
model (string) – genetic model for effects (additive, recessive, or dominant).
prev (float) – prevalence of the trait in question.
alpha (float) – p-value threshold for detection.
prop_cases (float) – proportion of samples that are cases.
power (float) – power under the model.
- Returns:
detectable effect-size at the power threshold and model.
- Return type:
opt_beta (float)
- binary_trait_power_model(n=100, p=0.1, beta=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5)
Power under a case-control GWAS study design.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
beta (float) – effect-size of variant (in terms of relative-risk).
model (string) – genetic model for effects (additive, recessive, or dominant).
prev (float) – prevalence of the trait in question.
alpha (float) – p-value threshold for detection.
prop_cases (float) – proportion of samples that are cases.
- Returns:
power under the model.
- Return type:
power (float)
- ncp_binary_model(n=100, p=0.1, beta=0.1, model='additive', prev=0.01, alpha=5e-08, prop_cases=0.5)
Explore how multiple models affect power in case-control traits.
- class qtl_power.gwas.GwasQuant
Class for power calculations of a GWAS for a quantitative trait.
- ncp_quant(n=100, p=0.1, beta=0.1, r2=1.0)
Compute the non-centrality parameter for a quantitative trait GWAS.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
beta (float) – effect-size of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
- Returns:
non-centrality parameter.
- Return type:
ncp (float)
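A widely used approximation for a standardized quantitative trait is n · r2 · 2p(1 − p) · beta². The sketch below assumes that form (additive model, Hardy-Weinberg genotype variance, unit phenotypic variance); it is not taken from the package source, and `approx_ncp_quant` is a hypothetical name.

```python
def approx_ncp_quant(n=100, p=0.1, beta=0.1, r2=1.0):
    """Approximate NCP for a quantitative-trait association test (assumed formula)."""
    # Genotype variance under Hardy-Weinberg equilibrium.
    geno_var = 2.0 * p * (1.0 - p)
    # Variance explained at the tagging variant, scaled by the sample size.
    return n * r2 * geno_var * beta ** 2


print(approx_ncp_quant(n=10_000, p=0.1, beta=0.1))
```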
- quant_trait_beta_power(n=100, power=0.9, p=0.1, r2=1.0, alpha=5e-08)
Determine the effect-size required to detect an association at this MAF.
- Parameters:
n (int) – sample-size of unrelated individuals.
power (float) – threshold power level.
p (float) – minor allele frequency of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
alpha (float) – p-value threshold for GWAS
- Returns:
optimal beta for detection at a specific power level
- Return type:
opt_beta (float)
- quant_trait_opt_n(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-08)
Determine the sample-size required to detect this effect.
- Parameters:
beta (float) – effect-size of the variant.
power (float) – threshold power level.
p (float) – minor allele frequency of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
alpha (float) – p-value threshold for GWAS
- Returns:
optimal sample size for detection at this power-level.
- Return type:
opt_n (float)
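Solving for the sample size amounts to inverting the power function, which increases monotonically with n. The sketch below does this by bisection under two assumptions that are not confirmed by the text above: a 1-df test, and the approximation ncp = n · r2 · 2p(1 − p) · beta². The names `power_df1` and `approx_opt_n` are hypothetical.

```python
from statistics import NormalDist


def power_df1(ncp, alpha):
    """Power of a 1-df chi-squared test via the normal-tail identity."""
    nd = NormalDist()
    z_crit = -nd.inv_cdf(alpha / 2.0)
    shift = ncp ** 0.5
    return (1.0 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


def approx_opt_n(beta=0.1, power=0.9, p=0.1, r2=1.0, alpha=5e-8):
    """Smallest (real-valued) n reaching the target power, found by bisection."""
    per_sample_ncp = r2 * 2.0 * p * (1.0 - p) * beta ** 2
    if per_sample_ncp <= 0.0:
        raise ValueError("beta, r2 and p must give a positive per-sample NCP")
    lo, hi = 1.0, 2.0
    # Grow the upper bracket until it reaches the target power.
    while power_df1(hi * per_sample_ncp, alpha) < power:
        hi *= 2.0
    # Bisection: power is increasing in n, so the crossing point is unique.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if power_df1(mid * per_sample_ncp, alpha) < power:
            lo = mid
        else:
            hi = mid
    return hi


print(approx_opt_n(beta=0.1, power=0.9, p=0.1))
```

The same bracketing-and-bisection pattern applies when solving for the detectable effect size at a fixed n, since power is also monotone in the absolute effect size.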
- quant_trait_power(n=100, p=0.1, beta=0.1, r2=1.0, alpha=5e-08)
Power for a quantitative trait association study.
- Parameters:
n (int) – sample-size of unrelated individuals.
p (float) – minor allele frequency of variant.
beta (float) – effect-size of variant.
r2 (float) – correlation r2 between causal variant and tagging variant.
alpha (float) – p-value threshold for GWAS
- Returns:
power for association.
- Return type:
power (float)
qtl_power.rare_variants module
Estimating power for rare-variant association methods from PAGEANT.
- class qtl_power.rare_variants.RareVariantBurdenPower
Approximation of power for rare-variant burden tests based on results from Derkach et al. (2018).
- ncp_burden_test_model1(n=100, j=30, jd=10, jp=0, tev=0.1)
Approximation of the non-centrality parameter under model S1 from Derkach et al.
The key assumption in this case is independence between an allele's effect-size and its MAF.
- Parameters:
n (int) – total sample size.
j (int) – total number of variants in the gene.
jd (int) – number of disease variants in the gene.
jp (int) – number of protective variants in the gene.
tev (float) – proportion of variance explained by gene.
- Returns:
non-centrality parameter.
- Return type:
ncp (float)
- opt_n_burden_model1(j=30, tev=0.01, prop_causal=0.8, prop_risk=0.5, alpha=1e-06, power=0.8)
Estimate the sample-size required for detection of supplied TEV in a region.
- Parameters:
j (int) – total number of variants in the gene.
tev (float) – proportion of variance explained by gene.
prop_causal (float) – proportion of causal variants.
prop_risk (float) – number of protective variants.
alpha (float) – p-value threshold for power.
power (float) – power for detection under the burden model.
- Returns:
sample size required for detection at this power level.
- Return type:
opt_n (float)
- power_burden_model1(n=100, j=30, prop_causal=0.8, prop_risk=0.1, tev=0.1, alpha=1e-06)
Estimate the power under a burden model 1 from PAGEANT.
- Parameters:
n (int) – total sample size.
j (int) – total number of variants in the gene.
prop_causal (float) – proportion of causal variants.
prop_risk (float) – number of protective variants.
tev (float) – proportion of variance explained by gene.
alpha (float) – p-value threshold for power.
- Returns:
power for detection under the burden model.
- Return type:
power (float)
- power_burden_model1_real(n=100, nreps=10, **kwargs)
Estimate power under model 1 from PAGEANT with realistic variants per gene.
- Parameters:
n (int) – number of samples
nreps (int) – number of replicates
- Returns:
array of power estimates based on realistic number of variants.
- Return type:
est_power (np.array)
- tev_power_burden_model1(n=100, j=30, prop_causal=0.8, prop_risk=0.5, alpha=1e-06, power=0.8)
Estimate the total explained variance by a region for adequate detection at a power threshold.
- Parameters:
n (int) – total sample size.
j (int) – total number of variants in the gene.
prop_causal (float) – proportion of causal variants.
prop_risk (float) – number of protective variants.
alpha (float) – p-value threshold for power.
power (float) – power for detection under the burden model.
- Returns:
TEV required for detection at this rate.
- Return type:
opt_tev (float)
- class qtl_power.rare_variants.RareVariantPower
Power calculator for rare-variant association tests.
Methods based on derivations from [PAGEANT](https://doi.org/10.1093/bioinformatics/btx770).
- llr_power(alpha=1e-06, df=1, ncp=1, ncp0=0)
Power under a non-central chi-squared distribution.
- Parameters:
alpha (float) – p-value threshold for GWAS
df (int) – degrees of freedom
ncp (float) – non-centrality parameter
ncp0 (float) – null non-centrality parameter
- Returns:
power for association
- Return type:
power (float)
- sim_af_weights(j=100, a1=0.1846, b1=11.1248, n=100, clip=True, seed=42, test='SKAT')
Simulate allele frequencies from a beta distribution.
Ideally the beta distribution is derived from realized allele frequencies. The current parameters are based on 15k African ancestry individuals. To mimic a much larger set (112k) of Non-Finnish European individuals, use the parameters a1=0.14311324240262455, b1=26.97369198989023.
- Parameters:
j (int) – number of variants
a1 (float) – first shape parameter of the beta distribution
b1 (float) – second shape parameter of the beta distribution
n (float) – number of samples
clip (boolean) – perform clipping based on the current sample-size.
seed (int) – random seed.
test (string) – type of test to be performed (SKAT, Calpha, Hotelling)
- Returns:
ws (np.array): array of weights per-variant.
ps (np.array): array of allele frequencies.
- Return type:
tuple (np.array, np.array)
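The sampling step can be sketched with the standard library. Two choices below are assumptions made for illustration and are not confirmed by the description above: frequencies are clipped to the range [1/(2n), 0.5], and weights follow the Beta(1, 25) density that is often used as a default in SKAT-style tests. The function name `sim_af_and_weights` is hypothetical.

```python
import random


def sim_af_and_weights(j=100, a1=0.1846, b1=11.1248, n=100, clip=True, seed=42):
    """Sample j allele frequencies from Beta(a1, b1) and derive per-variant weights."""
    rng = random.Random(seed)
    # random.betavariate takes the two shape parameters of the beta distribution.
    ps = [rng.betavariate(a1, b1) for _ in range(j)]
    if clip:
        # Assumed clipping rule: at least one copy among 2n chromosomes,
        # and capped at 0.5 so each value stays a minor allele frequency.
        floor = 1.0 / (2.0 * n)
        ps = [min(max(p, floor), 0.5) for p in ps]
    # Assumed weights: Beta(1, 25) density, which equals 25 * (1 - p)**24.
    ws = [25.0 * (1.0 - p) ** 24 for p in ps]
    return ws, ps


ws, ps = sim_af_and_weights(j=5, n=1000)
print(ps)
print(ws)
```

With a first shape parameter well below 1, most draws fall near zero, so the clipping floor determines the frequency assigned to a large share of variants at small sample sizes.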
- sim_var_per_gene(a=1.47, b=0.0108, seed=42)
Simulate the number of variants per-gene.
Parameter values are derived from gnomAD exonic variants on chromosome 4 from ~15,730 AFR-ancestry subjects.
For a Non-Finnish European ancestry setting with larger sample size (~112350), use a=1.44306, b=0.00372.
- Parameters:
a (float) – shape parameter for a gamma distribution
b (float) – scale parameter for a gamma distribution
seed (int) – random seed.
- Returns:
number of variants per-gene.
- Return type:
nvar (int)
- class qtl_power.rare_variants.RareVariantVCPower
Approximation of power for rare-variant variance component tests based on results from Derkach et al. (2018).
- match_cumulants_ncp(c1, c2, c3, c4)
Obtain the degrees of freedom and non-centrality parameter from cumulants.
- Parameters:
c1 (float) – first cumulant of non-central chi-squared dist.
c2 (float) – second cumulant of non-central chi-squared dist.
c3 (float) – third cumulant of non-central chi-squared dist.
c4 (float) – fourth cumulant of non-central chi-squared dist.
- Returns:
df (int): degrees of freedom for test.
ncp (float): non-centrality parameter.
- Return type:
tuple (int, float)
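The idea of cumulant matching can be illustrated with the closed-form cumulants of a non-central chi-squared distribution, where the k-th cumulant is 2^(k−1) · (k−1)! · (df + k · ncp). The sketch below inverts only the first two cumulants, which is a simplification: the method documented above takes four cumulants and may use a different matching scheme. Both function names are hypothetical.

```python
from math import factorial


def ncx2_cumulant(k, df, ncp):
    """k-th cumulant of a non-central chi-squared distribution."""
    return 2 ** (k - 1) * factorial(k - 1) * (df + k * ncp)


def match_first_two(c1, c2):
    """Recover (df, ncp) from the first two cumulants.

    Uses c1 = df + ncp and c2 = 2 * (df + 2 * ncp).
    """
    ncp = c2 / 2.0 - c1
    df = c1 - ncp
    return df, ncp


c1, c2 = ncx2_cumulant(1, 3, 2.0), ncx2_cumulant(2, 3, 2.0)
print(match_first_two(c1, c2))  # (3.0, 2.0)
```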
- ncp_vc_first_order_model1(ws, ps, n=100, tev=0.1)
Approximation of the non-centrality parameter under model S1 from Derkach et al.
The key assumption is independence between an allele's effect-size and its MAF, from Table S1 in Derkach et al.
- Parameters:
ws (np.array) – numpy array of weights per-variant
ps (np.array) – numpy array of allele frequencies
n (int) – sample size
tev (float) – total explained variance by a locus
- Returns:
df (float): degrees of freedom for variance component test.
ncp (float): non-centrality parameter.
- Return type:
tuple (float, float)
- power_vc_first_order_model1(ws, ps, n=100, tev=0.1, alpha=1e-06, df=1)
Compute the power for detection under model 1 for a variance component test.
- Parameters:
ws (np.array) – numpy array of weights per-variant
ps (np.array) – numpy array of allele frequencies
n (int) – sample size
tev (float) – total explained variance by a locus
alpha (float) – total significance level for estimation of power
df (float) – degrees of freedom for the test
- Returns:
estimated power under this variance component model.
- Return type:
power (float)