PGDSpider tool for data conversion

PGDSpider

PGDSpider is a powerful automated data conversion tool for population genetics and genomics programs. It facilitates data exchange between programs for a wide range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances). Besides the conventional population genetics formats, PGDSpider integrates population genomics data formats commonly used to store and handle next-generation sequencing (NGS) data.

Quick Start

Test Data

Input Files

Example input files can be found in the Community Data folder:

Test input: Community Data -> iplantcollaborative -> example_data -> PGDSpider. Select example_Structure.txt as the input file and PGD_schema.xsd as the schema. The app takes one more input parameter, spid, for which you can use the Arlequine2Structure.spid file. The same PGDSpider folder contains a few more example .spid files for converting data between other popular tools.

Parameters Used in App

When the app is run in the Discovery Environment, use the parameter file provided with the .spid extension.

As described by the authors, this file is required for the program to run. Because PGDSpider can cover up to 30 different data formats, generate a .spid file for your own needs with either the local version of PGDSpider or the online version at http://www.cmpg.unibe.ch/software/PGDSpider/jnlp/PGDSpider2.jnlp
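For readers who run PGDSpider locally instead of through the app, the sketch below shows one way the same three inputs (data file, target format and .spid file) might be assembled into a command line. It assumes the command-line jar is named PGDSpider2-cli.jar and sits in the working directory. It also assumes that the format identifiers STRUCTURE and ARLEQUIN match the example files; these are inferred from the file names, not stated on this page. The helper name build_pgdspider_command is hypothetical.

```python
from pathlib import Path


def build_pgdspider_command(input_file, input_format, output_file,
                            output_format, spid_file,
                            jar="PGDSpider2-cli.jar"):
    """Assemble the argument list for a local PGDSpider command-line run.

    The jar name and the format identifiers passed in are assumptions;
    check them against the PGDSpider version you have installed.
    """
    # The conversion settings must come from a .spid parameter file.
    if Path(spid_file).suffix != ".spid":
        raise ValueError(f"expected a .spid parameter file, got {spid_file!r}")

    return [
        "java", "-jar", jar,
        "-inputfile", str(input_file),
        "-inputformat", input_format,
        "-outputfile", str(output_file),
        "-outputformat", output_format,
        "-spid", str(spid_file),
    ]


if __name__ == "__main__":
    # File names are the example files mentioned on this page.
    cmd = build_pgdspider_command(
        "example_Structure.txt", "STRUCTURE",
        "example_Arlequin.arp", "ARLEQUIN",
        "Arlequine2Structure.spid",
    )
    # Only print the command; running it requires Java and the jar.
    print(" ".join(cmd))
```

The returned list can be handed to subprocess.run once Java and the jar are available. Building the arguments as a list, not as a single string, avoids quoting problems with file names that contain spaces.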

Output Files

The output is a file in the format you requested, and it is saved in your own data directory.

Expected output file: Community Data -> iplantcollaborative -> example_data -> PGDSpider, named example_Arlequin.arp