The following options are designed to perform an MLM based association analysis. Previous data management options such as --keep, --extract and --maf, REML analysis options such as --reml-priors, --reml-maxit and --reml-no-constrain and multi-threading option --thread-num are still valid for this analysis.
This option will initiate an MLM based association analysis including the candidate SNP
y = a + bx + g + e
where y is the phenotype, a is the mean term, b is the additive effect (fixed effect) of the candidate SNP to be tested for association, x is the SNP genotype indicator variable coded as 0, 1 or 2, g is the polygenic effect (random effect) i.e. the accumulated effect of all SNPs (as captured by the GRM calculated using all SNPs) and e is the residual. For the ease of computation, the genetic variance, var(g), is estimated based on the null model i.e. y = a + g + e and then fixed while testing for the association between each SNP and the trait. This analysis would be similar as that implemented in other software tools such as EMMAX, FaST-LMM and GEMMA.
The results will be saved in the *.mlma file.
This option will implement an MLM based association analysis with the chromosome, on which the candidate SNP is located, excluded from calculating the GRM. We call it MLM leaving-one-chromosome-out (LOCO) analysis. The model is
y = a + bx + g- + e
where g- is the accumulated effect of all SNPs except those on the chromosome where the candidate SNP is located. The var(g-) will be re-estimated each time when a chromosome is excluded from calculating the GRM. The MLM-LOCO analysis is computationally less efficient but more powerful as compared with the MLM analysis including the candidate (--mlma).
The results will be saved in the *.loco.mlma file.
If there are covariates included in the analysis, the covariates will be fitted in the null model, a model including the mean term (fixed effect), covariates (fixed effects), polygenic effects (random effects) and residuals (random effects). By default, in order to improve computational efficiency, the phenotype will be adjusted by the mean and covariates, and the adjusted phenotype will subsequently be used for testing SNP association. However, if SNPs are correlated with the covariates, pre-adjusting the phenotype by the covariates will probably cause loss of power. If this option is specified, the covariates will be fitted together with the SNP for association test. However, this will significantly reduce computational efficiency.
# MLM based association analysis - If you have already computed the GRM
gcta64 --mlma --bfile
test --pheno test.phen --out test --thread-num 10
# MLM based association analysis including the candidate SNP (MLMi)
gcta64 --mlma --bfile test --pheno test.phen --out test --thread-num 10
# MLM leaving-one-chromosome-out (LOCO) analysis
gcta64 --mlma-loco --bfile test --pheno test.phen --out test --thread-num 10
Output file format
test.mlma or test.loco.mlma (columns are chromosome, SNP, physical position, reference allele (the coded effect allele), the other allele, frequency of the reference allele, SNP effect, standard error and p-value).
Chr SNP bp ReferenceAllele OtherAllele Freq b se p
1 qtl2_1 1001 L H 0.366 0.0143857 0.0411682 0.726761
1 qtl2_2 1002 H L 0.326 -0.0240756 0.0421248 0.56764
1 qtl2_3 1003 H L 0.146 -0.0921772 0.0565541 0.103124
1 qtl2_4 1004 H L 0.3865 -0.0771376 0.0394826 0.0507357
1 qtl2_5 1005 H L 0.1665 0.00251276 0.0526821 0.961958
1 qtl2_6 1006 L H 0.119 -0.0153568 0.059891 0.797632
1 qtl2_7 1007 L H 0.1675 -0.0487809 0.0512279 0.340979
An overview of the MLM based association methods:
J, Zaitlen NA, Goddard ME, Visscher PM and Price AL (2014) Mixed model
association methods: advantages and pitfalls. Nat Genet. 2014 Feb;46(2):100-6. [Pubmed ID: 24473328]
REML analysis and GCTA Software: Yang J, Lee SH, Goddard ME and Visscher PM. GCTA: a tool for Genome-wide Complex Trait Analysis. Am J Hum Genet. 2011 Jan 88(1): 76-82. [PubMed ID: 21167468]
Options10. Conditional & joint GWAS analysis