03-02-2015

Updates:

03/02/2015:

1. More work on the concept map:

    will use bowtie2 to map the sRNA sequences to the Col-0 genome sequence;

    will use cufflinks2 to quantify how many reads map within each gene;

    both the bowtie2 and cufflinks2 apps are available in iPlant.

2. Input/output:

sRNA sequence profiles:

GSE62801: flowers

GSM1533527 Col0 replicate 1 small RNA

GSM1533528 Col0 replicate 2 small RNA

GSM1533529 Col0 replicate 3 small RNA

GSM1533542 dcl3 replicate 1 small RNA

GSM1533543 dcl3 replicate 2 small RNA

GSM1533544 dcl3 replicate 3 small RNA

GSE14695:
GSM366868 Whole-aerial_Col-0

GSM366870 Whole-aerial_dcl2-1dcl3-1dcl4-2
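
To keep the two series straight when the files are downloaded, the accessions above can be written down as a small sample sheet. This is only a sketch: the `Sample` tuple, its field names, and the `by_group` helper are made up for these notes, and the group labels are copied from the sample descriptions as listed here.

```python
from collections import namedtuple

# Hypothetical record type for these notes; field names are our own choice.
Sample = namedtuple("Sample", ["series", "accession", "group", "replicate"])

SAMPLES = [
    Sample("GSE62801", "GSM1533527", "Col0", 1),
    Sample("GSE62801", "GSM1533528", "Col0", 2),
    Sample("GSE62801", "GSM1533529", "Col0", 3),
    Sample("GSE62801", "GSM1533542", "dcl3", 1),
    Sample("GSE62801", "GSM1533543", "dcl3", 2),
    Sample("GSE62801", "GSM1533544", "dcl3", 3),
    # GSE14695 samples have no replicate number in the notes.
    Sample("GSE14695", "GSM366868", "Col-0", None),
    Sample("GSE14695", "GSM366870", "dcl2-1dcl3-1dcl4-2", None),
]


def by_group(samples):
    """Group accessions by (series, group), keeping the listed order."""
    groups = {}
    for s in samples:
        groups.setdefault((s.series, s.group), []).append(s.accession)
    return groups


if __name__ == "__main__":
    for key, accessions in by_group(SAMPLES).items():
        print(key, accessions)
```

Grouping by series as well as genotype keeps the flower Col0 replicates separate from the whole-aerial Col-0 sample.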

The Col-0 genome sequence and annotation files are needed.

QUESTION: Which file format should be used for bowtie2 in iPlant?

What do the FTP/HTTP download options mean? The files are .txt.gz.

How do we get the data into iPlant?

How much storage space do we have?
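
One way to start answering the file format question is to look inside a downloaded file before uploading it. The sketch below is an assumption-laden helper, not part of iPlant or bowtie2: `open_maybe_gzip` and `sniff_format` are hypothetical names, and the check only looks at the first non-empty line (">" for FASTA, "@" for FASTQ, anything else reported as "other", for example a plain text table).

```python
import gzip


def open_maybe_gzip(path):
    """Open a file as text, using gzip if it starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def sniff_format(path):
    """Guess the format from the first non-empty line of the file."""
    with open_maybe_gzip(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "other"
    return "empty"
```

A file reported as "other" would need converting to FASTA or FASTQ before it can be used as bowtie2 query input.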

bowtie2: 

    input: 

        reference Col-0 genome sequence: fasta;

        query sRNA sequences: short reads (fastq/fasta/csfastq);

    output:

        SAM format;
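
For reference outside iPlant, the same inputs and output correspond roughly to the command lines built below. This is a sketch under assumptions: the file names are placeholders, the helper names are our own, and the commands are only assembled as argument lists, not run. The flags used (`-x` index, `-U` unpaired reads, `-S` SAM output, `-f` for FASTA reads, `-p` threads) are standard bowtie2 options.

```python
def bowtie2_build_cmd(reference_fasta, index_base):
    """Command to build a bowtie2 index from the reference genome FASTA."""
    return ["bowtie2-build", reference_fasta, index_base]


def bowtie2_align_cmd(index_base, reads, out_sam, reads_are_fasta=False, threads=1):
    """Command to align unpaired short reads and write SAM output."""
    cmd = ["bowtie2", "-p", str(threads), "-x", index_base]
    if reads_are_fasta:
        cmd.append("-f")  # reads are FASTA; bowtie2 expects FASTQ by default
    cmd += ["-U", reads, "-S", out_sam]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, not the real download names.
    print(" ".join(bowtie2_build_cmd("col0_genome.fa", "col0")))
    print(" ".join(bowtie2_align_cmd("col0", "sample_reads.fastq", "sample.sam")))
```

To actually run a command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where bowtie2 is installed.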

Convert the SAM output to sorted BAM format for cufflinks.
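
The SAM to BAM step is usually done with samtools. The sketch below assumes samtools 1.x command syntax (older 0.1.x releases use a different `sort` syntax), and the helper name and intermediate file name are our own. It only builds the two commands, it does not run them.

```python
def sam_to_sorted_bam_cmds(in_sam, sorted_bam):
    """Return the two samtools commands: SAM -> BAM, then coordinate sort."""
    unsorted_bam = sorted_bam + ".unsorted.bam"  # hypothetical intermediate file
    return [
        ["samtools", "view", "-b", "-o", unsorted_bam, in_sam],
        ["samtools", "sort", "-o", sorted_bam, unsorted_bam],
    ]


if __name__ == "__main__":
    for cmd in sam_to_sorted_bam_cmds("sample.sam", "sample.sorted.bam"):
        print(" ".join(cmd))
```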

cufflinks:

    input: sorted BAM + annotation.gtf;

    output: transcripts.gtf (sRNAs.gtf), with expression reported as FPKM.

    FPKM: Fragments Per Kilobase of exon per Million fragments mapped.
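
To make the cufflinks step and the FPKM definition concrete, here is a minimal sketch. The helper names and file names are placeholders. `-G` (quantify against the given annotation), `-o` (output directory) and `-p` (threads) are standard cufflinks options. The `fpkm` function is the plain textbook formula, not cufflinks' own estimator, which also handles multi-mapped reads and isoforms.

```python
def cufflinks_cmd(sorted_bam, annotation_gtf, out_dir, threads=1):
    """Command to quantify reads against a reference annotation."""
    return ["cufflinks", "-p", str(threads), "-G", annotation_gtf,
            "-o", out_dir, sorted_bam]


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # fragments / (length in kb) / (total mapped in millions)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    print(" ".join(cufflinks_cmd("sample.sorted.bam", "annotation.gtf", "cufflinks_out")))
    # 100 fragments on a 1 kb gene with 1 million mapped fragments
    print(fpkm(100, 1000, 1_000_000))
```

With 100 fragments on a 1 kb gene and 1 million mapped fragments, both normalisations divide by 1, so the FPKM equals the fragment count.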

Future tasks:

download all sequence files;

try bowtie2 in iPlant.