Rationale

NCBI's fastq-dump can be very slow, even when you have the resources (network, IO, CPU) to go faster and even if you have already downloaded the sra file (see the protip below). This tool speeds up the process by dividing the work across multiple threads. This is possible because fastq-dump has options (-N and -X) to query specific ranges of the sra file. The tool splits the file into as many ranges as the requested number of threads, runs multiple fastq-dump processes in parallel, and concatenates the results back together, as if you had executed a plain fastq-dump call.
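
The range-splitting idea described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function names `split_spots` and `build_commands` are hypothetical, the total spot count is assumed to be known in advance, and the commands are only built, never executed. The `-N` and `-X` options are the fastq-dump range options mentioned above.

```python
def split_spots(total_spots, n_threads):
    """Split spots 1..total_spots into contiguous (first, last) ranges,
    one per thread. Hypothetical helper for illustration only."""
    base, extra = divmod(total_spots, n_threads)
    ranges = []
    start = 1
    for i in range(n_threads):
        # The first `extra` ranges each take one additional spot.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more threads than spots: skip empty ranges
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def build_commands(sra, total_spots, n_threads):
    """Build one fastq-dump argument list per range (nothing is run here)."""
    return [
        ["fastq-dump", "-N", str(first), "-X", str(last), sra]
        for first, last in split_spots(total_spots, n_threads)
    ]


# Example: an assumed total of 10 spots divided among 3 threads.
for cmd in build_commands("SRR4101052.sra", 10, 3):
    print(" ".join(cmd))
```

Each resulting command covers a disjoint range, so running them in parallel and concatenating the outputs in range order would reproduce the order of a single full run.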

Quick Start

To use parallel-fastq-dump, you can either upload your data in SRA format (SRR012345.lite.sra) or specify the SRA accession name represented by that file (SRR012345).
  • Resources: documentation
  • Test Data

    All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:

    Community Data > iplantcollaborative > example_data > ncbi_sra_toolkit_fastq_dump

    Input File(s)

    Use SRR4101052.sra as a test input file, or use the accession SRR4101052.

    Parameters Used in App

    When the app is run in the Discovery Environment, use the following parameters with the above input file(s) to get the output provided in the section below.

    Output File(s)

    Expect a FASTQ file named after the accession as output.

    Related Tutorials

    ChIPseq Using the iPlant Discovery Environment

    parallel-fastq-dump-0.6.1 is invoked using the following:

    1. Input (s)
      1. SRA file or SRA accession number
    2. Optional Parameters
    3. Outputs
      1. Output Folder Name (default - sra_out)

    Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.

    Test Data

    Run parallel-fastq-dump-0.6.1 as follows:

    1. Input (s)
      1. SRA accession number (SRR070570)
    2. Optional Parameters
    3. Outputs
      1. Output Folder Name (default - sra_out)
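
    For readers who want to reproduce this test run outside the Discovery Environment, the settings above roughly correspond to a command line like the one built below. This is a sketch under assumptions: the helper name `parallel_fastq_dump_argv` is hypothetical, the thread count of 4 is an arbitrary choice not taken from this page, and the exact command the app runs internally is not documented here. The `--sra-id`, `--threads`, and `--outdir` options are those of the parallel-fastq-dump command-line tool.

```python
def parallel_fastq_dump_argv(accession, outdir="sra_out", threads=4):
    """Build an argument list for parallel-fastq-dump.

    Hypothetical helper: `threads=4` is an assumed value, and `outdir`
    defaults to the app's default output folder name.
    """
    return [
        "parallel-fastq-dump",
        "--sra-id", accession,
        "--threads", str(threads),
        "--outdir", outdir,
    ]


# Print the command for the test accession; pass the list to
# subprocess.run(...) to execute it where the tool is installed.
print(" ".join(parallel_fastq_dump_argv("SRR070570")))
```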

    Tool Source for App

    https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=softwarehttps://edwards.sdsu.edu/research/=toolkit_doc&f=fastq-dump/