Qxpak
What is Qxpak?
This software aims to simplify statistical genetic analyses by implementing a coherent, unified mixed model approach. The goal is to provide software that can be used in a wide variety of situations with ample genetic and statistical modeling flexibility. The main kinds of analyses that QxPak may be used for include regular mixed model solving, QTL, segment analysis, multitrait, association, molecular relation matrix, and sequence-based association studies.
How to Get Started
Required Files
parameter file with input details
data file containing phenotypes and any effect that may be included in the model
pedigree file with parentage information
marker file with genotypes
Parameter file
The parameter file carries the parameters for the program to run and specifies the files that Qxpak will use. When the wrapper script is used, the parameter file refers to those files through the following $ markers.
Placeholder | File |
$data | Data file |
$pedigree | Pedigree file |
$marker | Marker file |
$userInverse | User inverse - inverse of the user-defined covariance matrix file |
$userDirect | User direct - user-defined covariance matrix file |
$haplotype | Haplotype file |
$output | Output file |
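As an illustration of how the $ markers can work, here is a minimal sketch that fills them in with Python's standard `string.Template`. It is not the wrapper's actual code. The template text and file names are hypothetical, and real QxPak parameter keywords are not shown here.

```python
from string import Template

# Hypothetical parameter-file fragment: only the $ markers come from the
# table above, the surrounding wording is made up for illustration.
TEMPLATE_TEXT = """\
data file: $data
pedigree file: $pedigree
marker file: $marker
output file: $output
"""

def fill_placeholders(template_text, file_names):
    """Replace $name markers with file names; unknown markers are left as-is."""
    return Template(template_text).safe_substitute(file_names)

# Hypothetical file names; $output is deliberately not supplied.
filled = fill_placeholders(
    TEMPLATE_TEXT,
    {"data": "traits.dat", "pedigree": "animals.ped", "marker": "chr1.mkr"},
)
print(filled)
```

`safe_substitute` is used so that a marker with no supplied file name stays visible in the result instead of raising an error.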
Data File
The data file is a .dat file, a free-format file without a header that always contains the individual in the first column, whether it be numeric or alphanumeric. Record order is not important. Subsequent columns include traits and effects and may include more than are used in the model. Missing values must be coded as 0. If the actual value is 0, recode it as 0.000001.
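The missing-value rule above is easy to get wrong by hand, so here is a minimal sketch of how records could be formatted before writing the data file. It assumes that missing values are represented as `None` in Python and that columns are separated by single spaces. The function names and sample records are hypothetical.

```python
MISSING_CODE = "0"            # missing values are coded as 0
ZERO_SUBSTITUTE = "0.000001"  # a true zero is recoded to this value

def format_value(value):
    """Apply the missing-value and true-zero recoding to one field."""
    if value is None:
        return MISSING_CODE
    if isinstance(value, (int, float)) and value == 0:
        return ZERO_SUBSTITUTE
    return str(value)

def format_record(individual, values):
    """Build one space-delimited record with the individual in the first column."""
    return " ".join([str(individual)] + [format_value(v) for v in values])

# Hypothetical records: (individual, [trait or effect columns])
records = [("A1", [12.5, None, 0]), ("A2", [0.0, 3, None])]
lines = [format_record(ind, vals) for ind, vals in records]
print("\n".join(lines))
```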
Pedigree File
A pedigree file is required for quantitative trait locus (QTL) analysis. This file includes the individual, father, mother, sex, and breed. The last two are optional when analyzing non-sex chromosomes or within-breed populations. Individuals do not have to be coded; missing parents should be indicated with a 0. Breed is irrelevant unless at least one parent is unknown.
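A minimal sketch of writing pedigree records in the column order described above follows. It assumes space-delimited columns and that unknown parents are passed as `None`. The coding of sex and breed is not specified here, so the values are written through unchanged. All names and sample values are hypothetical.

```python
def pedigree_line(individual, father=None, mother=None, sex=None, breed=None):
    """Format one pedigree record; unknown parents are written as 0."""
    if breed is not None and sex is None:
        # Breed follows sex in the column order, so sex cannot be skipped.
        raise ValueError("breed given without sex")
    fields = [
        individual,
        father if father is not None else 0,
        mother if mother is not None else 0,
    ]
    if sex is not None:
        fields.append(sex)
    if breed is not None:
        fields.append(breed)
    return " ".join(str(f) for f in fields)

print(pedigree_line("S1"))                # founder, both parents unknown
print(pedigree_line("C3", "S1", "D2"))    # both parents known
print(pedigree_line("C4", "S1", None, 1, 2))
```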
Marker File
Two formats are available: “usual” and “transposed”. In the usual format, the first record contains the chromosome name and successive records contain: individual, allele1_mkr1, allele2_mkr1, etc. Missing alleles are specified by 0. The transposed format is appropriate when there are many more markers than individuals. In this format, the first row is a list of individual codes, and successive rows contain: SNP_name, chr_number, ind1_allele1, ind1_allele2, ind2_allele1, ind2_allele2, etc. Unknown markers should be coded as 0, and chromosomes must have numbers rather than names; markers should be arranged by chromosome 1, 2, etc.
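To make the relationship between the two layouts concrete, here is a minimal sketch that converts one chromosome from the usual format to the transposed format. It assumes whitespace-delimited fields and two allele columns per marker in marker order. Because the usual format carries no SNP names or chromosome number, both are supplied by the caller. The function name and sample data are hypothetical.

```python
def usual_to_transposed(usual_lines, chr_number, snp_names):
    """Convert usual-format lines for one chromosome to transposed-format lines."""
    rows = [line.split() for line in usual_lines if line.strip()]
    genotype_rows = rows[1:]  # the first record holds the chromosome name
    expected = 1 + 2 * len(snp_names)
    for row in genotype_rows:
        if len(row) != expected:
            raise ValueError(f"individual {row[0]}: expected {expected} fields")
    # First transposed row: the list of individual codes.
    out = [" ".join(row[0] for row in genotype_rows)]
    for k, snp in enumerate(snp_names):
        fields = [snp, str(chr_number)]
        for row in genotype_rows:
            # Alleles of marker k sit in columns 1 + 2k and 2 + 2k.
            fields += [row[1 + 2 * k], row[2 + 2 * k]]
        out.append(" ".join(fields))
    return out

usual = ["chr1", "A1 1 2 2 2", "A2 0 0 1 2"]  # 0 marks a missing allele
transposed = usual_to_transposed(usual, 1, ["snp1", "snp2"])
print("\n".join(transposed))
```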
Optional Files
User-Defined Covariance Matrix Files
One or two files can be included to allow the inclusion of random effects distributed as N(0,V), where V can be any positive definite matrix stored in the file. The matrix is then inverted to obtain random effects predictions; the inverse can also be supplied to save computation. The parameter file must be modified appropriately to apply the effects to specific columns. The format of these files is: row, column, value, in space-delimited form like the other files.
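The row, column, value layout can be produced from an in-memory matrix as in the sketch below. It assumes 1-based row and column indices and writes every entry of the matrix. Whether QxPak expects individual codes instead of indices, or only one triangle of the matrix, is not stated here and should be checked against the manual. The function name and example matrix are hypothetical.

```python
def matrix_to_triplets(matrix):
    """Return 'row column value' lines for a square matrix (1-based indices assumed)."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    lines = []
    for i, row in enumerate(matrix, start=1):
        for j, value in enumerate(row, start=1):
            lines.append(f"{i} {j} {value}")
    return lines

# Hypothetical 2 x 2 covariance matrix V
V = [[1.0, 0.5],
     [0.5, 1.0]]
triplets = matrix_to_triplets(V)
print("\n".join(triplets))
```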
Haplotype File
Contains known haplotypes, if any. The first record contains the name of the chromosome. Successive records include the individual and the order of markers where phases are known. If several chromosomes are analyzed, the format should be repeated for each.
Outputs
q.0: Contains running output that might be useful for, among other things, checking convergence.
Primary Output File: A variety of different results are reported in the same file.
Haplotype Output File: If the applicable section in the parameter file is specified, the haplotypes sampled at each MCMC iteration are written. The format is: chromosome, MCMC_iteration, individual, phase, alleles.
Z Files: Z files contain the IBD probabilities or SNP configurations.
Other Output Files: There are numerous other undocumented output files.
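The haplotype output format listed above (chromosome, MCMC_iteration, individual, phase, alleles) could be read back with a small parser such as the sketch below. It assumes the fields are whitespace-delimited and that every field after the fourth is an allele, neither of which is confirmed here. The names and the sample line are hypothetical.

```python
from collections import namedtuple

HaplotypeSample = namedtuple(
    "HaplotypeSample", "chromosome iteration individual phase alleles"
)

def parse_haplotype_line(line):
    """Split one haplotype output record into its named fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected chromosome, iteration, individual, phase, alleles")
    chromosome, iteration, individual, phase = fields[:4]
    # Everything after the first four fields is treated as the allele list.
    return HaplotypeSample(chromosome, int(iteration), individual, phase, fields[4:])

sample = parse_haplotype_line("1 250 A1 1 2 1 1 2")  # made-up record
print(sample.individual, sample.iteration, sample.alleles)
```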
Wrapper Script
There is also a wrapper script available here and linked on the Qxpak wiki page. It lets the user supply the file names directly on the command line without having to edit the parameter file repeatedly.
Option | Description |
-p | Name of the parameter file using the $ markers above. |
-d | Name of the data file; the replacement for $data. |
-g | Name of the pedigree file; the replacement for $pedigree. |
-m | Name of the marker file; the replacement for $marker. |
-i | Name of the user-defined inverse file; replacement for $userInverse. |
-t | Name of the user-defined direct file; replacement for $userDirect. |
-h | Name of the haplotype file; replacement for $haplotype. |
-o | Name of the output file; defaults to result.txt. Several other files may be generated depending upon the parameter file. |
Wrapper Script sample code
qxpakwrapper.py -p <parameterFile.par> -d <dataFile.dat> -g <pedigreeFile.ped> -m <markerFile.mkr> -i <userInverse.inverse>
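When the wrapper is run repeatedly with different inputs, the command line can be assembled programmatically. The sketch below only builds and prints the command. The keyword names (`parameter`, `user_inverse`, and so on) and the file names are hypothetical, and only the option letters come from the table above.

```python
import shlex

# Option letters from the table above; the dictionary keys are made up.
FLAGS = {
    "parameter": "-p", "data": "-d", "pedigree": "-g", "marker": "-m",
    "user_inverse": "-i", "user_direct": "-t", "haplotype": "-h", "output": "-o",
}

def build_command(script="qxpakwrapper.py", **files):
    """Return the wrapper command as an argument list, in table order."""
    unknown = set(files) - set(FLAGS)
    if unknown:
        raise ValueError(f"unknown file kinds: {sorted(unknown)}")
    command = [script]
    for key, flag in FLAGS.items():
        if key in files:
            command += [flag, files[key]]
    return command

command = build_command(parameter="run.par", data="traits.dat", marker="chr1.mkr")
print(shlex.join(command))  # shlex.join requires Python 3.8 or later
```

The resulting list can be passed to `subprocess.run` once the wrapper script is available on the system.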
Further Information
QxPak v5.05 manual: qxpak.pdf
Wrapper scripts: qxpakwrapper.py
Example input and output data can be found in the Discovery Environment under Community Data/iplantcollaborative/example_data/qxpak.
This tool is still in development and we are testing it currently. If you notice any issues or have any comments we would greatly appreciate them!
Please contact us at labstapleton@gmail.com. Thank you for using our tools!