Clean_fasta_header

Rationale and background

The Clean_fasta_header app removes everything after the "|" character in each header line of a FASTA file. Many bioinformatics tools do not handle the special character "|" well, so it is important to remove it from FASTA headers before running downstream analyses. This app takes care of that one special character.
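
To make the effect on a header concrete, the following is a minimal Python sketch of the kind of transformation described above. It is not the app's actual implementation, which is not shown here. It assumes that the "|" itself is dropped along with everything after it, and the header names in `example` are hypothetical.

```python
def clean_header(line: str) -> str:
    """Drop the first "|" and everything after it from a FASTA header line.

    Assumption: the "|" itself is removed too. Non-header lines are
    returned unchanged.
    """
    if line.startswith(">"):
        return line.split("|", 1)[0].rstrip()
    return line


def clean_fasta(lines):
    """Yield FASTA lines with cleaned headers; sequence lines pass through."""
    for line in lines:
        yield clean_header(line.rstrip("\n"))


# Hypothetical input records, for illustration only.
example = [">scaffold1|size12345|extra", "ACGTACGT", ">scaffold2", "TTGA"]
cleaned = list(clean_fasta(example))
print("\n".join(cleaned))
```

Only lines starting with ">" are touched, so sequence data is left as it is, and headers without a "|" are unchanged.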

Prerequisites

  1. A CyVerse account (Register for a CyVerse account at https://user.cyverse.org/).

  2. An up-to-date Java-enabled web browser. (Firefox recommended. If you wish to work with your own large datasets and upload them using iCommands, Chrome is not suitable due to its issues in utilizing 64-bit Java.)

  3. Input: 

    Select one of the options below to supply the sequence whose FASTA headers will be modified. A custom reference genome or any FASTA sequence with "|" in its headers can be used here.

    1. Custom reference genome

    2. Reference genome from the Discovery Environment (DE)

  4. Output Folder name: Name of the output folder (default "output")

Test/sample data

This tutorial uses test data stored in the Data Store at Community Data > iplantcollaborative > example_data > clean_fasta_header

  1. Input:

    1. Reference genomes: Acromyrmex_echinatior

  2. Output Folder name: Use default folder name - "output"

Output

  1. logs

  2. Output

    1. genome.cleaned.fas
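
After downloading the output, a quick sanity check is to confirm that no header line still contains "|". The sketch below is one possible way to do that in Python. The helper name `headers_with_pipe` is hypothetical, and the check assumes the output is a plain-text FASTA file.

```python
def headers_with_pipe(path):
    """Return the header lines in a FASTA file that still contain "|"."""
    with open(path) as handle:
        return [
            line.rstrip("\n")
            for line in handle
            if line.startswith(">") and "|" in line
        ]


# Example call, assuming genome.cleaned.fas is in the current directory:
#   leftover = headers_with_pipe("genome.cleaned.fas")
#   print(len(leftover), "header(s) still contain '|'")
```

An empty list means every header was cleaned. Only header lines are inspected, so sequence lines are ignored.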