GWAS tools
What are GWAS Tools?
Genome-wide association study (GWAS) tools examine common genetic variants across individuals to determine which of those variants are associated with a given trait. In Validate, GWAS tools are used to identify trait-associated SNPs and estimate the size of their effects.
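To make "estimating the size of an effect" concrete, here is a minimal sketch of a single-SNP association test. It assumes genotypes are coded as allele dosages (0, 1, or 2 copies of one allele) and fits a plain least-squares regression of the phenotype on that dosage. The function name and the toy data are hypothetical, and this is not how the tools covered in this tutorial compute their results: the mixed-model tools additionally account for relatedness and other confounders.

```python
import math


def single_snp_association(genotypes, phenotypes):
    """Regress phenotype on allele dosage for one SNP (illustrative only).

    Returns (beta, standard_error, t_statistic), where beta is the
    estimated change in phenotype per extra copy of the counted allele.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matched genotype/phenotype lists with n >= 3")

    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n

    # Sum of squares of the dosages around their mean.
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")

    # Cross-product of dosage and phenotype deviations.
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))

    beta = sxy / sxx                      # effect size (slope)
    intercept = mean_y - beta * mean_g

    # Residual sum of squares, used for the standard error of the slope.
    rss = sum((y - (intercept + beta * g)) ** 2
              for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se if se > 0 else math.inf
    return beta, se, t_stat


# Hypothetical toy data: six individuals, one SNP, one quantitative trait.
dosages = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, se, t_stat = single_snp_association(dosages, trait)
print(f"beta={beta:.3f} se={se:.3f} t={t_stat:.2f}")
```

For this toy data the estimated effect is about 1.0 trait units per copy of the counted allele. A real GWAS repeats a test of this kind for every SNP and corrects for multiple testing.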
Selecting a GWAS Tool
Validate 0.9 offers several choices of GWAS tool. If none of them suits your needs, you can bring in your own code via iRODS or the file transfer option on the Atmosphere instance. This tutorial covers only the four main GWAS tools in the Validate Workflow:
FaST-LMM
Scales linearly with data size, so analyses of large data sets run faster than with traditional GWAS tools
Ability to save genetic similarity matrices and other components of analysis
Handles epistasis and accounts for multiple confounding factors
Requires either text files or PLINK format for input
GEMMA
Fits univariate and multivariate linear mixed models for marker association with single and multiple phenotypes
Fits Bayesian sparse linear mixed models (BSLMM) for estimating the proportion of phenotypic variance explained (PVE) by typed genotypes, predicting phenotypes, and identifying associated markers
Uses freely available open-source numerical libraries
Input files can be in PLINK or BIMBAM format
Optional covariate file can be included
QxPak
Allows IBD matrices to be included through input files
Offers broad statistical modeling flexibility, including multivariate models, REML, maximum likelihood, and BLUP estimation
Includes QTL analysis, with multi-QTL and multi-trait modeling fully implemented
Custom epistasis modeling
PLINK
Not notably faster or more thorough than any of the tools above
Well suited to example analyses and file format conversion
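Several of the tools above accept PLINK-format input. As a rough illustration of what the text PLINK format holds, the sketch below parses one line of a .map file (chromosome, SNP identifier, genetic distance, base-pair position) and one line of a .ped file (six leading columns for family ID, individual ID, father ID, mother ID, sex, and phenotype, followed by two alleles per SNP). The function names and the sample lines are hypothetical, and the code assumes the default PLINK convention of "0" for a missing allele; it is not part of Validate or PLINK.

```python
def parse_map_line(line):
    """Parse one whitespace-delimited line of a PLINK .map file."""
    chrom, snp_id, genetic_dist, bp_pos = line.split()
    return {
        "chrom": chrom,
        "snp": snp_id,
        "cm": float(genetic_dist),
        "bp": int(bp_pos),
    }


def parse_ped_line(line):
    """Parse one whitespace-delimited line of a PLINK .ped file."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("a .ped line needs at least six leading columns")
    family_id, individual_id, father_id, mother_id, sex, phenotype = fields[:6]

    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("each SNP needs exactly two allele columns")

    # Pair up consecutive allele columns: one (a1, a2) tuple per SNP.
    genotypes = [(alleles[i], alleles[i + 1])
                 for i in range(0, len(alleles), 2)]
    return {
        "family": family_id,
        "individual": individual_id,
        "father": father_id,
        "mother": mother_id,
        "sex": sex,
        "phenotype": phenotype,  # kept as text; coding depends on the study
        "genotypes": genotypes,
    }


def allele_dosage(genotype, counted_allele, missing="0"):
    """Count copies of counted_allele (0, 1, or 2); None if missing."""
    if missing in genotype:
        return None
    return sum(1 for allele in genotype if allele == counted_allele)


# Hypothetical sample lines, not taken from any real data set.
marker = parse_map_line("1 rs0001 0 1000")
person = parse_ped_line("FAM1 IND1 0 0 1 2.5 A A A G 0 0")
dosages = [allele_dosage(g, "G") for g in person["genotypes"]]
print(marker["snp"], person["individual"], dosages)
```

Dosages derived this way are the kind of 0/1/2 coding that association models typically take as input, with missing calls handled separately.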
Each of these GWAS tools is capable of analyzing your dataset. The next section details how to run them.
Further Information
Introduction to SNPs: http://learn.genetics.utah.edu/content/pharma/snips/
The following apps have at one point been used within our workflow, but are no longer included. They can be explored via the links below:
AntEpiSeeker: https://pods.iplantcollaborative.org/wiki/display/DEapps/AntEpiSeeker+2.0
Borda Count: https://pods.iplantcollaborative.org/wiki/display/DEapps/Borda+Count
Random Jungle (which has since been turned into a program called ranger): https://pods.iplantcollaborative.org/wiki/display/DEapps/Random+Jungle+2
This tool is still in development and we are testing it currently. If you notice any issues or have any comments we would greatly appreciate them!
Please contact us at labstapleton@gmail.com. Thank you for using our tools!