Create BLAST database-2.6.0+
Rationale and Background
The makeblastdb application produces BLAST databases from FASTA files. In the simplest case, the FASTA definition lines are not parsed by makeblastdb and may be completely unstructured. The text in each definition line is stored in the BLAST database and displayed in the BLAST report.
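To make "definition line" concrete, here is a minimal sketch that writes a toy FASTA file and collects the text after each ">" marker. The file name and records are invented for illustration, and makeblastdb itself is not called.

```python
import tempfile
from pathlib import Path

# Hypothetical toy records; the definition lines are free-form text.
TOY_FASTA = """>seq1 my first sequence, any text is allowed here
ACGTACGTAC
>another record with no particular structure
GGCCTTAAGG
"""


def definition_lines(fasta_path):
    """Return the text following '>' for every record in a FASTA file."""
    deflines = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                deflines.append(line[1:].rstrip("\n"))
    return deflines


with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.fna"
    toy_path.write_text(TOY_FASTA)
    toy_deflines = definition_lines(toy_path)
    for text in toy_deflines:
        print(text)
```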
Mandatory arguments
Input file: Path to the input sequence file, containing nucleotide or amino acid sequences in FASTA format
Input Sequence Format: Sequence type of the input file (Nucleotide or Protein)
Input type: Format of the data in the input file (Fasta, ASN1 (txt), or Blastdb)
Prefix to use for database: Database name
Parameters
Title for the database: Title for BLAST database (Default = input file name provided)
File containing masking data (csv format): Comma-separated list of input files containing masking data as produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)
Max per file size: Maximum file size for BLAST database files (Default = 1GB)
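For readers who also run makeblastdb on the command line, the following sketch shows how the fields above might translate into a command. It assumes the app's fields correspond to the standard makeblastdb options -in, -dbtype, -input_type, -out, -title, -mask_data and -max_file_sz. The helper name build_makeblastdb_command is hypothetical and is not part of the app.

```python
def build_makeblastdb_command(input_file, dbtype, prefix, input_type="fasta",
                              title=None, mask_data=None, max_file_sz=None):
    """Assemble a makeblastdb argument list (assumed field-to-flag mapping).

    dbtype is 'nucl' or 'prot'; input_type is 'fasta', 'asn1_txt' or 'blastdb'.
    """
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    cmd = ["makeblastdb", "-in", input_file, "-dbtype", dbtype,
           "-input_type", input_type, "-out", prefix]
    if title is not None:
        cmd += ["-title", title]
    if mask_data:
        # makeblastdb expects masking files as one comma-separated string.
        cmd += ["-mask_data", ",".join(mask_data)]
    if max_file_sz is not None:
        cmd += ["-max_file_sz", max_file_sz]
    return cmd


# Optional parameters left at their defaults.
nucl_cmd = build_makeblastdb_command("plant.118.1.genomic.fna", "nucl", "blastdb_n")
print(" ".join(nucl_cmd))
```

The resulting list can be passed to subprocess.run(nucl_cmd, check=True) on a machine where makeblastdb is installed.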
Test Run
All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:
Community Data > iplantcollaborative > example_data > makeblastdb (/iplant/home/shared/iplantcollaborative/example_data/makeblastdb)
Mandatory arguments:
Input file: plant.118.1.genomic.fna (nucleotide) or plant.118.protein.faa (protein)
Input Sequence Format: Nucleotide or Protein, matching the input file
Input type: Fasta
Prefix to use for database: blastdb_n (nucleotide) or blastdb_p (protein)
Parameters:
Leave these as default
Output
With plant.118.1.genomic.fna as the input file and Nucleotide as the sequence format:
blastdb_n.nhr
blastdb_n.nin
blastdb_n.nsq
With plant.118.protein.faa as the input file and Protein as the sequence format:
blastdb_p.phr
blastdb_p.pin
blastdb_p.psq
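The sketch below checks that the core files of a single-volume database are present after a run. Nucleotide databases use the .nhr, .nin and .nsq extensions; protein databases use .phr, .pin and .psq. The helper names are hypothetical, and very large inputs may be split into numbered volumes, which this check does not cover.

```python
from pathlib import Path

# Core file extensions for a single-volume database, by sequence type.
CORE_EXTENSIONS = {
    "nucl": ("nhr", "nin", "nsq"),
    "prot": ("phr", "pin", "psq"),
}


def expected_db_files(prefix, dbtype):
    """List the core file names expected for a database prefix."""
    return [f"{prefix}.{ext}" for ext in CORE_EXTENSIONS[dbtype]]


def missing_db_files(directory, prefix, dbtype):
    """Return the expected core files that are not present in directory."""
    directory = Path(directory)
    return [name for name in expected_db_files(prefix, dbtype)
            if not (directory / name).is_file()]


print(expected_db_files("blastdb_n", "nucl"))
print(expected_db_files("blastdb_p", "prot"))
```

An empty list from missing_db_files means all three core files were found.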
The Blastp-2.6.0+ and Blastn-2.6.0+ apps take a folder as the database input because a BLAST database consists of multiple files. When creating a database, it is easiest to give the database and the output file the same name, e.g. "mygenome". After the analysis has finished, make a new folder inside the output directory, name it "mygenome", and drag all the database files into it, but not the logs directory. You can then move the "mygenome" folder to one of your other directories so that it is easy to find. When you run Blastp or Blastn, drag and drop the database folder you created into the database input field.
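If the output has been downloaded to a local machine, the same tidy-up can be scripted. This is a minimal sketch under that assumption: the function name collect_database is hypothetical, and the demonstration uses empty placeholder files in a temporary directory in place of real analysis output.

```python
import shutil
import tempfile
from pathlib import Path


def collect_database(output_dir, db_name):
    """Move <db_name>.* files from output_dir into output_dir/<db_name>/."""
    output_dir = Path(output_dir)
    db_dir = output_dir / db_name
    db_dir.mkdir(exist_ok=True)
    moved = []
    for path in sorted(output_dir.iterdir()):
        # Only regular files are moved, so the logs directory stays in place.
        if path.is_file() and path.name.startswith(db_name + "."):
            shutil.move(str(path), str(db_dir / path.name))
            moved.append(path.name)
    return moved


# Demonstration with placeholder files standing in for real output.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    for ext in ("nhr", "nin", "nsq"):
        (out / f"mygenome.{ext}").touch()
    (out / "logs").mkdir()
    demo_moved = collect_database(out, "mygenome")
    demo_logs_untouched = (out / "logs").is_dir()
    demo_in_folder = sorted(p.name for p in (out / "mygenome").iterdir())
    print(demo_moved)
```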
Please work through the documentation and add your comments on the bottom of this page, or email comments to support@cyverse.org or click the intercom button on this page. Thank you.
References
For more options of makeblastdb visit this page