3. What are Annotations?
Functional
Structural
Function
cAMP-dependent and sulfonylurea-sensitive anion transporter. Key
gatekeeper influencing intracellular cholesterol transport.
Subcellular location
Membrane; Multi-pass membrane protein (Ref.13, Ref.14).
Domain
Multifunctional polypeptide with two homologous halves, each
containing a hydrophobic membrane-anchoring domain and an ATP
binding cassette (ABC) domain.
8. Next Gen Genome Annotation 2013-14
• Coelacanth
• Pine
• Sacred Lotus
• Conus bullatus
• Pigeon
• King Cobra
• Hymenopterids
• Fusarium circinatum
• Cardiocondyla obscurior
• Burmese Python
• Sarcocystis neurona
• Spotted Gar
• Apple maggot fly
9. The ‘NextGen’ Genome Project
Lab/Small Group Funding
Short-read Genome Sequencing
RNASeq Data
Genome/Transcriptome Assembly
Gene Annotation
Genome Database / Blast Server
Manual curation
New assembly
Reannotate/Merge annotations
10. MAKER
• The Annotation Problem
• How MAKER Works
• Why Choose MAKER
• Working with MAKER
11. The Source of Annotations
RNA and Protein Evidence
Accurate Gene Annotations
Ab Initio Computational Evidence
12. Annotating the Genome – Apollo View
current evidence
gene annotations
genome assembly
http://apollo.berkeleybop.org/
13. Identify and mask repetitive elements
current evidence
genome assembly
http://www.repeatmasker.org
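As a rough illustration of what masking means for the downstream steps, here is a minimal Python sketch. It assumes the repeat coordinates are already known (for example, from a repeat finder's output) and are given as 0-based, half-open intervals. The function names `soft_mask` and `hard_mask` are hypothetical and are not part of RepeatMasker or MAKER.

```python
def soft_mask(sequence, repeats):
    """Lowercase the bases inside each repeat interval.

    `repeats` is a list of (start, end) pairs, assumed 0-based and
    half-open. Bases outside the intervals are uppercased.
    """
    masked = list(sequence.upper())
    for start, end in repeats:
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = masked[i].lower()
    return "".join(masked)


def hard_mask(sequence, repeats):
    """Replace the bases inside each repeat interval with 'N'."""
    masked = list(sequence.upper())
    for start, end in repeats:
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = "N"
    return "".join(masked)


if __name__ == "__main__":
    genome = "ACGTACGTAC"
    repeats = [(2, 5)]  # hypothetical repeat at positions 2..4
    print(soft_mask(genome, repeats))
    print(hard_mask(genome, repeats))
```

Soft masking keeps the underlying sequence recoverable, while hard masking discards it; which one a given tool expects is a configuration detail to check in that tool's documentation.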
14. Generate ab initio gene predictions
ab initio predictions
SNAP
GeneMark
Augustus
current evidence
genome assembly
http://korflab.ucdavis.edu/
15. Align RNA and protein evidence
ab initio predictions
protein - BLASTX
EST - BLASTN
altEST - TBLASTX
current evidence
genome assembly
http://blast.ncbi.nlm.nih.gov
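The pairing of evidence type and BLAST program on this slide follows from which side of the comparison needs translating. The Python sketch below simply records that pairing as a lookup table. The dictionary and function names are hypothetical, and the "genome side" and "evidence side" wording is an assumption about how the alignment is set up, not a description of MAKER's internals.

```python
# Evidence type -> BLAST program, as listed on the slide.
BLAST_FOR_EVIDENCE = {
    "protein": "blastx",
    "est": "blastn",
    "altest": "tblastx",
}

# For each program: (genome side translated?, evidence side translated?)
TRANSLATION = {
    "blastn": (False, False),   # nucleotide vs nucleotide
    "blastx": (True, False),    # translated nucleotide vs protein
    "tblastx": (True, True),    # translated vs translated nucleotide
}


def describe(evidence_type):
    """Return a one-line description of how an evidence type is aligned."""
    program = BLAST_FOR_EVIDENCE[evidence_type.lower()]
    genome_translated, evidence_translated = TRANSLATION[program]
    genome = "translated genome" if genome_translated else "genome"
    if evidence_translated:
        evidence = "translated " + evidence_type
    else:
        evidence = evidence_type
    return f"{program.upper()}: {genome} vs {evidence}"


if __name__ == "__main__":
    for kind in ("protein", "EST", "altEST"):
        print(describe(kind))
```

TBLASTX compares both sequences at the protein level, which is why it suits ESTs from a different species (altEST), where nucleotide identity may be too low for BLASTN.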
16. Polish BLAST alignments with Exonerate
ab initio predictions
polished protein
polished EST
current evidence
genome assembly
http://www.ebi.ac.uk/~guy/exonerate/
28. MAKER
The Genome Annotation Pipeline
GMOD Summer Course
May 19, 2014
Barry Moore/Carson Holt
Yandell Lab
University of Utah
29. MAKER2 Use Cases
1. De novo annotation providing quality metrics
2. Merging multiple annotation sets
3. Re-annotation with new evidence
4. Mapping annotations forward to a new assembly
5. Generating GMOD-compliant output
– GBrowse/JBrowse
– Apollo
– Tripal
31. Sensitivity, Specificity, Accuracy as a Measure of Annotation Quality
Gold Standard Genes
SN    SP    AC      Case
1.0   1.0   100%    Perfect Accuracy
32. Sensitivity, Specificity, Accuracy as a Measure of Annotation Quality
Gold Standard Genes
SN    SP    AC      Case
1.0   1.0   100%    Perfect Accuracy
1.0   0.5   80%     Poor Specificity
33. Sensitivity, Specificity, Accuracy as a Measure of Annotation Quality
Gold Standard Genes
SN    SP    AC      Case
1.0   1.0   100%    Perfect Accuracy
1.0   0.5   80%     Poor Specificity
0.5   1.0   80%     Poor Sensitivity
34. Sensitivity, Specificity, Accuracy as a Measure of Annotation Quality
Gold Standard Genes
SN    SP    AC      Case
1.0   1.0   100%    Perfect Accuracy
1.0   0.5   80%     Poor Specificity
0.5   1.0   80%     Poor Sensitivity
0.5   0.5   50%     Poor Specificity and Sensitivity
Guigó R et al. Genome Biol. 2006
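The SN and SP columns can be reproduced with a small nucleotide-level calculation. The Python sketch below assumes the convention commonly used in gene prediction, where sensitivity is the fraction of gold-standard bases that were predicted and specificity is the fraction of predicted bases that are in the gold standard. Exon coordinates are assumed 0-based and half-open, and the function names are hypothetical. The accuracy column is left out because the slides do not state how it is computed.

```python
def nucleotide_set(exons):
    """Expand (start, end) exon intervals into a set of base positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end))
    return positions


def sensitivity_specificity(gold_exons, predicted_exons):
    """Return (SN, SP) at the nucleotide level.

    SN = TP / (TP + FN): share of gold-standard bases that were predicted.
    SP = TP / (TP + FP): share of predicted bases that are gold standard.
    """
    gold = nucleotide_set(gold_exons)
    predicted = nucleotide_set(predicted_exons)
    true_positives = len(gold & predicted)
    sn = true_positives / len(gold) if gold else 0.0
    sp = true_positives / len(predicted) if predicted else 0.0
    return sn, sp


if __name__ == "__main__":
    gold = [(0, 100)]
    cases = {
        "exact match": [(0, 100)],
        "over-prediction": [(0, 200)],
        "under-prediction": [(0, 50)],
        "shifted": [(50, 150)],
    }
    for label, prediction in cases.items():
        print(label, sensitivity_specificity(gold, prediction))
```

The four toy predictions correspond to the four rows of the table: an exact match, a prediction twice as long as the gene, a prediction covering half the gene, and a prediction that overlaps only half of it.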
50. MAKER Runtime Features
• Fill out a config file with input data and parameters.
• Parallelize:
– Run with MPI, or
– Simply start multiple instances in the same directory.
• Re-run MAKER in the same directory and it won't redo completed work.
• Restart aborted jobs without losing any work.
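The "won't redo completed work" behaviour is an instance of a general checkpointing pattern. The Python sketch below illustrates that pattern only; it is not MAKER's implementation. The function name `run_with_checkpoints`, the log file format, and the per-contig granularity are all assumptions made for the example.

```python
import os


def run_with_checkpoints(contigs, annotate, log_path):
    """Annotate each contig once, recording finished contigs in a log.

    On a re-run, contigs already listed in the log are skipped, so an
    aborted job resumes where it stopped. Returns the contigs processed
    during this call.
    """
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as handle:
            done = {line.strip() for line in handle if line.strip()}

    processed = []
    with open(log_path, "a") as log:
        for name in contigs:
            if name in done:
                continue  # finished in an earlier run
            annotate(name)
            log.write(name + "\n")
            log.flush()  # record progress before moving on
            processed.append(name)
    return processed


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        log_file = os.path.join(workdir, "completed.log")
        contigs = ["contig1", "contig2", "contig3"]
        print(run_with_checkpoints(contigs, lambda c: None, log_file))
        print(run_with_checkpoints(contigs, lambda c: None, log_file))
```

Because progress is written after every contig, the second call has nothing left to do. Running several instances against one shared directory would additionally need some form of locking, which this sketch does not attempt.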
55. Acknowledgements
• Mark Yandell
– Carson Holt
– Mike Campbell
– Daniel Ence
– Steven Flygare
– Zev Kronenberg
– Qing Li
– Marc Singleton
– Brett Kennedy
– Brandi Cantarel
– Hadi Islam
– Shawn Rynearson
– Nicole Ruiz
– Keith Simmons
– Bret Heale
• Alejandro Alvarado
– Eric Ross
• Jason Stajich
• Sophia Robb
• Kevin Childs
• Shin-Han Shiu
• Ning Jiang
• Yanni Sun