I) Reformat Blat results

Reformat Blat results (app: Cut Columns)

Description: To rename the contig sequences so that they include their Blat matches in refseq_protein, first build a list for use with Rename contigs 2.0 in the next section. The list should contain the existing contig names in one column and the information to be added in another. Cut Columns can be used to create this tab-delimited list. Documentation: http://www.gnu.org/software/coreutils/manual/html_node/cut-invocation.html.

  1. Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
  2. Open the Cut Columns app (Public Applications > General Utilities > Text and Tabular Data > Cut Columns).
    1. Change 'Analysis Name' to Reformat_Blat_Results, add a 'Description' (optional), and use the default 'output folder'.
  3. Click on the Input data tab.
    1. Click on the 'Select a tabular text data file' field. Browse to the folder containing the file that contains the best matches collected in Section H (Identify best matches) (Sample data: Community Data > iplant_training > rna-seq_without_genome > I_reformat_blat_results > BA_pep_vs_Refseq_pep.psl). Select the file, then click OK.
  4. Click on the Options tab.
    1. Click the 'Enter comma-separated list of columns to extract (ie. c1,c3)' field. Enter 'c10,c14' for the columns to extract.
  5. Click on "Launch Analysis".
  6. Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
    1. Once launched, an analysis will continue whether the user remains logged in or not.
    2. Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
    3. If the analysis fails or does not proceed in the anticipated timeline, check these tips for troubleshooting. (Using the sample data, the analysis should be complete in less than 5 minutes.)
    4. To re-run an analysis, click the analysis "App" in the 'Analyses' window.
  7. Access analysis results in one of two ways:
    1. In the 'Analyses' window click on the analysis "Name" to open the output folder.
    2. In the 'Data' window, click on your user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample at Community Data > iplant_training > rna-seq_without_genome > I_reformat_blat_results > output_from_sample_data.)
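  8. The output file consists of columns 10 and 14 from the input psl file. In the standard psl layout these are the query name (here, the contig name) and the target name (the refseq_protein match).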
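
For readers who want to check the result outside the Discovery Environment, the column extraction described in the steps above can be approximated with a short Python sketch. It is not the Cut Columns app itself. The function names `parse_column_spec` and `cut_columns` are hypothetical, and the sample row is invented for illustration. The sketch assumes a tab-delimited psl file with no header lines and emits columns in the order given in the specification.

```python
def parse_column_spec(spec):
    """Turn a spec such as 'c10,c14' into zero-based indices, e.g. [9, 13]."""
    indices = []
    for token in spec.split(","):
        token = token.strip()
        if not token.startswith("c") or not token[1:].isdigit():
            raise ValueError(f"unrecognized column token: {token!r}")
        number = int(token[1:])
        if number < 1:
            raise ValueError("column numbers start at 1")
        indices.append(number - 1)
    return indices


def cut_columns(lines, spec):
    """Yield the selected tab-delimited columns of each non-blank line."""
    indices = parse_column_spec(spec)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        yield "\t".join(fields[i] for i in indices)


if __name__ == "__main__":
    # Invented 21-field psl-style row (assumption: real rows have the same layout).
    example_row = "\t".join([
        "100", "0", "0", "0", "0", "0", "0", "0", "+",
        "contig_1", "120", "0", "100",
        "ref_A", "300", "10", "110", "1", "100,", "0,", "10,",
    ])
    for out_line in cut_columns([example_row], "c10,c14"):
        print(out_line)
```

To apply the sketch to a real file, pass an open file handle as `lines`, for example `cut_columns(open("BA_pep_vs_Refseq_pep.psl"), "c10,c14")`. Each output line then holds a contig name and its matched name separated by a tab, which is the two-column shape the Description calls for.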