2. Genetics of being Phytophthora?
• Objective: Find a coding sequence that is unique
to Phytophthora.
• What is the starting material?
– 16 million RNA-Seq reads are assembled against the P. sojae
reference sequence to generate splice junctions. These
junctions are evaluated using some of the best available
algorithms.
• http://vmd.vbi.vt.edu/download/data/workshop2010/
– Coverage.wig
– ps1V1.fasta
3. Transcript discovery
• Sort the coverage file by the number of read
hits in column 4.
• Keep the top 25% of regions (the upper quartile).
• Remove sequences longer than 1000 bases or
shorter than 10 bases.
• Fetch the matching sequences from the ps1V1.fasta file.
• Split the FASTA file into N equal parts.
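The steps above can be sketched in Python. This is a minimal illustration, not the workshop's actual script. It assumes the coverage file has four whitespace-separated columns (scaffold, start, end, hits) with 0-based, half-open coordinates, and that the scaffold names match the FASTA headers. All function names here are hypothetical.

```python
import math

def parse_coverage(lines):
    """Parse 4-column records: scaffold, start, end, hits (assumed layout)."""
    records = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 4 or line.startswith("#"):
            continue
        if fields[0] in ("track", "browser"):
            continue
        records.append((fields[0], int(fields[1]), int(fields[2]), float(fields[3])))
    return records

def select_regions(records, top_fraction=0.25, min_len=10, max_len=1000):
    """Sort by hits (column 4), keep the top fraction, then filter by length."""
    ranked = sorted(records, key=lambda r: r[3], reverse=True)
    keep = math.ceil(len(ranked) * top_fraction)
    return [r for r in ranked[:keep] if min_len <= r[2] - r[1] <= max_len]

def read_fasta(lines):
    """Read FASTA records into a dict keyed by the first word of each header."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}

def fetch_regions(regions, sequences):
    """Cut each selected region out of its scaffold sequence."""
    return [(f"{s}:{start}-{end}", sequences[s][start:end])
            for s, start, end, _ in regions]

def split_into_parts(items, n):
    """Split a list into n parts whose sizes differ by at most one."""
    size, extra = divmod(len(items), n)
    parts, pos = [], 0
    for i in range(n):
        step = size + (1 if i < extra else 0)
        parts.append(items[pos:pos + step])
        pos += step
    return parts
```

In practice, the lines would come from open("Coverage.wig") and open("ps1V1.fasta"), and each part returned by split_into_parts would be written to its own FASTA file for the annotation steps.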
4. Annotation Steps
• BLAST against P. sojae gene
models (vmd.vbi.vt.edu/toolkit).
• Check coding potential with P. sojae codon usage
tables.
– If a hit is found, get the gene model, compare
the splice sites, and correct the model.
– If no hit is found, BLAST against
P. ramorum / H. arabidopsidis / P. infestans coding
sequences.
– Check whether the match agrees with the splice junctions; if
not, the gene models in those organisms are
INCORRECT.
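The splice-site comparison can be sketched as a set comparison between the introns implied by a gene model and the junctions observed in the RNA-Seq data. This is an illustrative sketch only. It assumes exons are given as (start, end) pairs and junctions as (donor, acceptor) pairs in the same coordinate system. The function names are hypothetical.

```python
def introns_from_exons(exons):
    """Return the introns implied by a gene model as (exon_end, next_exon_start)."""
    ordered = sorted(exons)
    return {(left_end, right_start)
            for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])}

def compare_junctions(model_exons, observed_junctions):
    """Classify junctions as confirmed, unsupported (model only), or novel (reads only)."""
    model = introns_from_exons(model_exons)
    observed = set(observed_junctions)
    return {
        "confirmed": sorted(model & observed),
        "unsupported": sorted(model - observed),
        "novel": sorted(observed - model),
    }
```

A gene model with entries under "unsupported" or "novel" is a candidate for correction. An exact-match comparison like this is strict, so a real pipeline might also allow a small coordinate tolerance.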
5. Annotation
• BLAST against the nr database. If no BLAST hit is
found to any coding sequence in the nr
database, you have most probably found a
new gene.
• Check whether the sequence contains a signal
peptide or targeting peptide to determine whether
the protein is secreted.
• Run a MEME motif search on the
sequence.
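The "no hit in nr" check can be sketched as a filter over BLAST tabular output. This sketch assumes the results were saved in the standard 12-column tabular format (query ID in column 1, e-value in column 11). The e-value cutoff of 1e-5 is an illustrative assumption, not a value given in the workshop, and the function name is hypothetical.

```python
def queries_without_hits(query_ids, blast_tabular_lines, max_evalue=1e-5):
    """Return the query IDs that have no BLAST hit at or below max_evalue."""
    hit_ids = set()
    for line in blast_tabular_lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        # Column 1 is the query ID; column 11 is the e-value.
        if float(fields[10]) <= max_evalue:
            hit_ids.add(fields[0])
    return [q for q in query_ids if q not in hit_ids]
```

The IDs returned are the new-gene candidates to carry forward to the signal peptide check and the MEME motif search.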