F) Map transcripts
Map transcripts (app: Blat (with options))
Description: Use Blat (with options) to map the translated transcripts in the renamed peptide sequence file from Section D (Rename transcripts) against the RefSeq protein database in FASTA format. (The alignment app Blat (with options) does not require a pre-made index and allows many options to be set.) As described in Section E (Split RefSeq file), the FASTA file for the RefSeq database was split into 3 smaller files to reduce the amount of memory used by Blat and to let the mapping complete in a couple of hours. Documentation: http://www.kentinformatics.com/products.html.
- Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
- Open the Blat (with options) app (Public Applications > NGS > Aligners > Blat (with options)).
- Change 'Analysis Name' to Map_Transcripts_0, add a 'Description' (optional), and use the default 'output folder'.
- Click on the Input sequences files tab.
- Click on the 'reference' field. Browse to the folder that holds the reference sequence (Sample data: Community Data > iplant_training > rna-seq_without_genome > F_map_transcripts > RefseqProtein.0). Select the file, then click on OK.
- Click on the 'query' field. Browse to the folder that holds the renamed .pep file from Section D (Rename transcripts) (Sample data: Community Data > iplant_training > rna-seq_without_genome > F_map_transcripts > BA_transcripts_peptides.fa). Select the file, then click on OK.
- Click on the Output file tab.
- Change the output file name to 'BA_trnsPep_v_refseq0.psl'.
- Click on the Options tab.
- Select 'protein alignment', 'output has no header', 'fine mapping', and 'extend through N'.
- Click on "Launch Analysis".
- Repeat this analysis with any remaining reference sequence files that were generated in Section E (Split RefSeq file) (Sample data: RefseqProtein.1, RefseqProtein.2).
- Change 'Analysis Name' accordingly (e.g., Map_Transcripts_1, Map_Transcripts_2).
- Change the output file name to match the inputs (e.g., BA_trnsPep_v_refseq1.psl, BA_trnsPep_v_refseq2.psl).
- Click on 'Analyses' in the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
- Once launched, an analysis will continue whether the user remains logged in or not.
- Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
- If the analysis fails or takes longer than expected, check these tips for troubleshooting. (Using the sample data, the analysis should be complete in about 2 hours.)
- To re-run an analysis, click the analysis "App" in the 'Analyses' window.
- Access analysis results in one of two ways:
- In the 'Analyses' window click on the analysis "Name" to open the output folder.
- In the 'Data' window, click on user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample at Community Data > iplant_training > rna-seq_without_genome > F_map_transcripts > output_from_sample_data.)
- Blat (the BLAST-like alignment tool) runs faster than Blastp for this step.
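For readers who prefer to run Blat outside the Discovery Environment, the three analyses above can be sketched as command lines. This is a minimal sketch, not the DE app's actual invocation: it assumes a local `blat` executable and assumes that the four options ticked in the Options tab correspond to the standard Blat flags `-prot`, `-noHead`, `-fine`, and `-extendThroughN`. The helper names `blat_command` and `all_commands` are hypothetical; the file names are those of the sample data.

```python
import shlex

# Assumption: these standard Blat flags match the options selected in the
# Options tab ('protein alignment', 'output has no header', 'fine mapping',
# 'extend through N'). The DE app's real command line is not shown here.
BLAT_FLAGS = ["-prot", "-noHead", "-fine", "-extendThroughN"]


def blat_command(reference, query, output):
    """Return the argument list for one run: blat database query [options] output.psl."""
    return ["blat", reference, query, *BLAT_FLAGS, output]


def all_commands(query="BA_transcripts_peptides.fa", n_chunks=3):
    """Build one command per RefSeq chunk (RefseqProtein.0, .1, .2 in the sample data)."""
    return [
        blat_command(f"RefseqProtein.{i}", query, f"BA_trnsPep_v_refseq{i}.psl")
        for i in range(n_chunks)
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    for cmd in all_commands():
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; each printed line could then be pasted into a shell or passed to `subprocess.run` on a machine where Blat is installed.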