Get Sequences

NCBI GenBank Import can be used to retrieve nucleotide or amino acid sequences from NCBI's GenBank repository by submitting a list of accession or GI numbers.
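
For readers who want to see what a retrieval of this kind looks like outside the app, the following is a minimal sketch that builds a request URL for NCBI's public E-utilities efetch endpoint. It is an illustration only: this page does not state how the app queries GenBank internally, and the function name build_efetch_url and its defaults are hypothetical.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(identifiers, db="nucleotide"):
    """Return an efetch URL requesting FASTA text for the given identifiers.

    identifiers: iterable of accession or GI strings, e.g. the lines of an
    uploaded identifier list. Blank entries are skipped.
    db: Entrez database name, "nucleotide" or "protein" (assumed default).
    """
    ids = [item.strip() for item in identifiers if item.strip()]
    if not ids:
        raise ValueError("no identifiers supplied")
    query = urlencode(
        {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    )
    return f"{EFETCH_BASE}?{query}"
```

The resulting URL could be passed to any HTTP client; NCBI's usage guidelines on request rates apply to such calls.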

Quick Start

To use NCBI GenBank Import, upload your list of identifiers in plain text (.txt) format. Select the list and choose the amount of information to be stored in the sequence header of the output file. Options are:
- Minimal: GI number only,
- Species: GI number, species name and taxon ID,
- Short: GI number and shortened species name,
- Full: GI number, accession, version, species name and taxon ID.

If necessary, enter a length difference threshold to exclude sequences whose length differs from the median length by more than the threshold. Leave the field blank to retrieve all sequences.
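
The length difference threshold can be pictured with a short sketch. It assumes the threshold is an absolute number of residues measured from the median length; this page does not say whether the app uses an absolute or a relative difference, so treat that as an assumption. The function name filter_by_median_length is hypothetical.

```python
from statistics import median


def filter_by_median_length(records, threshold=None):
    """Keep records whose sequence length is within `threshold` of the median.

    records: list of (header, sequence) tuples.
    threshold: maximum allowed absolute difference from the median length
    (assumed interpretation); None keeps every record, mirroring a blank field.
    """
    if threshold is None or not records:
        return list(records)
    median_length = median(len(seq) for _, seq in records)
    return [
        (header, seq)
        for header, seq in records
        if abs(len(seq) - median_length) <= threshold
    ]
```

Under this interpretation, sequences both much longer and much shorter than the median are dropped, while a blank threshold leaves the set unchanged.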

Test Data

All files are located in the Community Data directory of the iPlant Discovery Environment at the following path:

Community Data > iplantcollaborative > example_data > ncbi_genbank_import

Input File(s)

Use the file elongation_factor_alpha_SMALL.txt, available from http://mirrors.iplantcollaborative.org/example_data/get_sequences/elongation_factor_alpha_SMALL.txt, as a test input file.

Parameters Used in App

When the app is run in the Discovery Environment, use the following parameters with the input file(s) above to reproduce the output described in the next section.

Default parameters only, no further configuration needed.

Output File(s)

The expected output is the file sequences.fa, available from http://mirrors.iplantcollaborative.org/example_data/get_sequences/sequences.fa.
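
One way to compare a run against the expected output is to list the FASTA headers in each file and check that they match. The sketch below is a minimal helper for that purpose; the name fasta_headers is hypothetical, and it relies only on the standard FASTA convention that header lines begin with ">".

```python
def fasta_headers(lines):
    """Return the header text (without the leading '>') of each FASTA record.

    lines: any iterable of strings, such as an open file handle.
    """
    return [line[1:].strip() for line in lines if line.startswith(">")]


# Example usage (assumes sequences.fa has been downloaded locally):
#     with open("sequences.fa") as handle:
#         headers = fasta_headers(handle)
#     print(len(headers), "sequences found")
```

The number of headers should not exceed the number of identifiers in the input list, and it may be lower if a length difference threshold was set.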

Tool Source for App

Internally developed. Source code available from iPlant's development repository.

Discovery Environment Applications List

NCBI GenBank Import
