GATC06

  1. Abyss-PE
    1. Status: Matt built it and started on the Apps API (runs in paired-end mode and ...). He expects completion by next week.
    2. Issues: The first draft will be limited to a single input library of sequence files
    3. Testing: Working with simulated maize chr 10; 5-6 Gb sequence; 50 nt reads
  2. FASTQC
    1. Status: Basically done; runs in DE. Matt has used it in tutorials.
    2. Next steps: Apps API, but not required
    3. Testing: Works on ChIP-seq data
    4. Output: A directory containing the HTML report and images. Matt renders it as PDF and PNG, and also zips the directory so users can download the results.
  3. Blastall (NCBI)
    1. Status: Deployed on Ranger, but needs to be built on Lonestar
    2. Can do testing using a local copy in Matt's "collaborator" folder
  4. Blast database RefSeq v47 plants/plastids
    1. Database formatted, tested, and copied to iRODS
  5. Blastx parsing and analysis script
    1. Status: Complete; runs on blue helix (CSHL)
    2. Next step: Move to Lonestar
  6. Analysis & Annotation script based on blastx to RefSeq
    1. Analysis script almost complete
    2. Reports:
      1. Detailed (per contig) output on top hit and characteristics of alignment
      2. Summary statistics on several quality metrics
      3. Diversity summary based on species in RefSeq
        1. Issue: Might want to break this out into a separate analysis, because it requires all hits, not just top hits
      4. Readme: guide to results
  7. R script for plotting of results
    1. Status: Not needed at this point
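
To illustrate the issue noted under item 6 (the diversity summary needs all hits, while the per-contig report needs only the top hit), here is a minimal Python sketch. It is not the actual analysis script. It assumes the blastx results are in BLAST's standard 12-column tab-delimited format, and that a subject-to-species lookup is available; the `SPECIES_OF` table, the sample rows, and all function names are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import Counter

# Standard 12 columns of BLAST tabular output.
COLUMNS = [
    "query", "subject", "pct_identity", "aln_length", "mismatches",
    "gap_opens", "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score",
]

def parse_hits(handle):
    """Yield one dict per alignment line, skipping blanks and comments."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bit_score"] = float(hit["bit_score"])
        yield hit

def top_hits(hits):
    """Return {contig: best hit}, where best means highest bit score."""
    best = {}
    for hit in hits:
        current = best.get(hit["query"])
        if current is None or hit["bit_score"] > current["bit_score"]:
            best[hit["query"]] = hit
    return best

def species_counts(hits, species_of):
    """Count hits per species using a subject-to-species lookup."""
    return Counter(species_of.get(h["subject"], "unknown") for h in hits)

# Hypothetical sample data: two contigs with two hits each.
SAMPLE = (
    "contig1\trefA\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "contig1\trefB\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-20\t90\n"
    "contig2\trefC\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-40\t150\n"
    "contig2\trefD\t88.0\t100\t12\t0\t1\t300\t1\t100\t1e-38\t140\n"
)

# Hypothetical lookup; tabular output alone does not carry species names.
SPECIES_OF = {
    "refA": "Zea mays",
    "refB": "Oryza sativa",
    "refC": "Zea mays",
    "refD": "Sorghum bicolor",
}

hits = list(parse_hits(io.StringIO(SAMPLE)))
best = top_hits(hits)
top_only = species_counts(best.values(), SPECIES_OF)
all_hits = species_counts(hits, SPECIES_OF)
```

With this sample, `top_only` contains a single species while `all_hits` contains three, which is why a diversity summary built from top hits alone would understate the species represented.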