The Validate Workflow 

This page is designed to aid users in navigating the Validate Workflow.


What is Validate?

"The purpose of Validate is to provide information on both SNP effect size estimation and identifying SNP capability performance for various GWAS and QTL tools. The eventual goal is two-fold: 1 Publish information about the performance of different tools for different types of simulation parameters (such as population structure and different levels of heritability) somewhere easily viewable for iPlant users. Essentially, we hope to show researchers when best to use a tool as compared with another. 2 Provide a pipeline or workflow for testing installed tools. This is to encourage iterations of the first goal."

- Dustin Landers (The architect of the original Validate program)


The workflow consists of several genome-wide association study (GWAS) tools, together with software that analyzes how well those tools perform.

More specifically, the workflow includes:

  • Simulate: Python-based simulation software that also outputs the known-truth phenotypes for a given population

  • Multiple GWAS tools

    • FaST-LMM: A GWAS analysis tool designed for large data sets; it is used to test all SNPs in a data set for statistical significance

    • GEMMA: A GWAS analysis tool specializing in standard linear mixed models and variations thereof

    • QxPak: A versatile statistics package specializing in statistical genomics and quantitative trait loci (QTL) analyses.

    • PLINK: Open-source software designed to convert data into usable formats and to perform basic, large scale analyses efficiently.

  • Winnow: A Python-compatible known-truth testing tool for genome-wide association studies (i.e. a tool that evaluates GWAS tools)

  • Demonstrate: An R script that produces human-readable visual output from the results files produced by Winnow
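
To make the idea of known-truth testing concrete, here is a minimal Python sketch of the kind of comparison such a step performs: the SNPs a GWAS tool flags as significant are checked against the SNPs that the simulation marked as truly causal. This is an illustration only and is not Winnow's actual code or interface. The function name `known_truth_metrics`, the input layout (a dict of SNP name to p-value), the 0.05 threshold, and the toy data are all assumptions made for this example.

```python
def known_truth_metrics(p_values, causal_snps, threshold=0.05):
    """Score a GWAS tool's calls against a known-truth set of causal SNPs.

    p_values    -- dict mapping SNP name to the p-value reported by the tool
    causal_snps -- iterable of SNP names that the simulation made causal
    threshold   -- p-value cutoff below which a SNP counts as "called"
                   (hypothetical default, chosen only for illustration)
    """
    # Only score SNPs the tool actually reported on.
    tested = set(p_values)
    truth = set(causal_snps) & tested
    called = {snp for snp, p in p_values.items() if p < threshold}

    true_pos = len(called & truth)    # causal and called
    false_pos = len(called - truth)   # called but not causal
    false_neg = len(truth - called)   # causal but missed
    true_neg = len(tested) - true_pos - false_pos - false_neg

    # Guard against division by zero when a class is empty.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else 0.0

    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "true_negatives": true_neg,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Toy data invented for this example; not output from any real tool.
tool_p_values = {"snp1": 0.001, "snp2": 0.20, "snp3": 0.03, "snp4": 0.60, "snp5": 0.04}
simulated_causal = ["snp1", "snp2"]

metrics = known_truth_metrics(tool_p_values, simulated_causal)
print(metrics)
```

Because Simulate records which SNPs truly affect the phenotype, counts like these can be computed exactly, which is what makes it possible to compare tools under different simulation parameters.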

How to get started

  1. We highly recommend watching the webinar “Getting Started with iPlant”, given monthly by Jason Williams, as an introduction to iPlant and some of its features.

  2. Several accounts need to be set up, and some software needs to be downloaded, before you get started. Follow this link and return after you have completed the instructions for setting up accounts.

  3. You can acquaint yourself with Atmosphere here. In general, Atmosphere gives you access to a virtual machine on which all of the programs needed to run the workflow are already installed. This lab has launched several Atmosphere images, although you will likely need only the Validate image unless you want to work with a specific tool. Validate 0.9 is available as an Atmosphere image under the name Validate Workflow v0.9.

  4. It can also be helpful to check here to learn more about Stampede and check here to learn more about the Agave API. Stampede is housed at TACC and is one of the largest supercomputers dedicated to open science. The Agave API is a tool for creating apps and deploying them on Stampede. The workflow can be run entirely on Stampede; however, that process is still under development, so the following pages describe use in Atmosphere.

 

Warning: Learn about allocations

Learn about CyVerse's allocation policies here.

Next Steps

Once you have looked through and completed the steps above, you are ready to begin. You can either start at the Simulate page, found here, or continue scrolling through this page to learn more about the Validate project and find links to useful information.

To learn more about the various offerings of the iPlant Collaborative, please check out the main page for getting started with the iPlant Collaborative.

If you are interested in further developing the Validate workflow, check here, or here for a more statistically oriented guide.

...

You may also want to learn how to install R packages in Atmosphere; instructions can be found here.

To view the source code and additional information for any of these programs, please check the main GitHub repository.

...