TAIR: A Sustainable Community Resource
for Arabidopsis Research
International Conference on Arabidopsis Research (ICAR 2016), GyeongJu, Korea
1. TAIR: a sustainable community resource for Arabidopsis
research (Eva Huala)
2. Using biological ontologies to accelerate progress in plant
biology research (Donghui Li)
3. Community annotation: making your data and publication
more discoverable (Donghui Li)
Using biological ontologies to accelerate
progress in plant biology research
Donghui Li
TAIR/Phoenix Bioinformatics
Every year, an average of:
• Over 3000 Arabidopsis research articles are added
• Over 2000 papers are associated with genes
• Over 400 articles have gene function, expression or
phenotype data extracted
• Over 5000 experiment-based annotations are added
using controlled vocabularies (GO and PO ontologies)
Producing a ‘gold standard’ annotated reference plant genome
Highly structured, searchable, computable
functional annotations
• How do we use biological ontologies to annotate Arabidopsis
gene function?
• How to read/interpret annotations?
• What can you do with these annotations?
Outline
Why do we need ontologies?
Inconsistency in free text:
Different names for the same concept
translation, protein synthesis
Same name for different concepts
Bud initiation?
A Gene Ontology (GO) term
Accession: GO:0006412
Name: translation
Ontology: biological_process
Synonyms: protein anabolism, protein biosynthesis, protein biosynthetic
process, protein formation, protein synthesis, protein translation
Definition: The cellular metabolic process in which a protein is formed,
using the sequence of a mature mRNA molecule to specify the
sequence of amino acids in a polypeptide chain. Translation is
mediated by the ribosome, and begins with the formation of a ternary
complex between aminoacylated initiator methionine tRNA, GTP, and
initiation factor 2, which subsequently associates with the small subunit
of the ribosome and an mRNA. Translation ends with the release of a
polypeptide chain from the ribosome. Source: GOC:go_curators
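To illustrate how a single controlled term resolves the "different names for the same concept" problem, here is a minimal Python sketch that maps free-text labels onto the accession shown above. The `GOTerm` class and the `build_lookup` and `normalize` helpers are hypothetical names used only for illustration and are not part of any GO library; the accession, name and synonyms are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical container for the fields shown on the slide."""
    accession: str
    name: str
    ontology: str
    synonyms: tuple = field(default_factory=tuple)


TRANSLATION = GOTerm(
    accession="GO:0006412",
    name="translation",
    ontology="biological_process",
    synonyms=(
        "protein anabolism", "protein biosynthesis",
        "protein biosynthetic process", "protein formation",
        "protein synthesis", "protein translation",
    ),
)


def build_lookup(terms):
    """Map every lower-cased name or synonym to its accession."""
    lookup = {}
    for term in terms:
        for label in (term.name, *term.synonyms):
            lookup[label.lower()] = term.accession
    return lookup


LOOKUP = build_lookup([TRANSLATION])


def normalize(free_text):
    """Return the accession for a free-text label, or None if unknown."""
    return LOOKUP.get(free_text.strip().lower())
```

With this lookup, "translation" and "Protein synthesis" resolve to the same accession, while a label that is not in the table returns `None` and would need a curator's decision.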
molecular function: catalytic or binding activities (e.g. kinase activity, DNA binding activity)
biological process: biological goal or objective (e.g. protein translation, mitosis)
cellular component: location or complex (e.g. nucleus, ribosome, proteasome)
More info at www.geneontology.org
Gene Ontology (GO)
Terms in an ontology are connected by is_a and part_of relationships
Annotation at different depths of the ontology
Retrieval at higher nodes in the ontology
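The idea of annotating at different depths and retrieving at higher nodes can be sketched in a few lines of Python. The graph below is a toy example: the edges, term names and gene identifiers are illustrative assumptions and are not copied from GO or TAIR. The sketch shows only the principle that a query at a parent term also returns genes annotated to its descendants.

```python
from collections import defaultdict

# Toy edges (child, relation, parent); illustrative only, not copied from GO.
EDGES = [
    ("cytosolic ribosome", "is_a", "ribosome"),
    ("ribosome", "part_of", "cytoplasm"),
    ("cytoplasm", "part_of", "cell"),
    ("nucleus", "part_of", "cell"),
]

# Hypothetical gene identifiers annotated at different depths.
ANNOTATIONS = {
    "geneA": {"cytosolic ribosome"},
    "geneB": {"ribosome"},
    "geneC": {"nucleus"},
}

PARENTS = defaultdict(set)
for child, _relation, parent in EDGES:
    PARENTS[child].add(parent)


def ancestors(term):
    """All terms reachable by following is_a / part_of edges upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_for(term):
    """Genes annotated to the term itself or to any of its descendants."""
    hits = set()
    for gene, terms in ANNOTATIONS.items():
        if any(t == term or term in ancestors(t) for t in terms):
            hits.add(gene)
    return sorted(hits)
```

Querying the specific term returns only the gene annotated there, while querying a higher node gathers every gene annotated anywhere beneath it.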
Manual literature annotation
Anatomy of a GO annotation: gene product, GO term, evidence code, reference
Experimental evidence codes (EXP)
IDA Inferred from Direct Assay (enzyme assays, in situ hybridization)
IMP Inferred from Mutant Phenotype (analysis of visible trait)
IPI Inferred from Physical Interaction (yeast-2-hybrid)
IEP Inferred from Expression Pattern (RT-PCR, Western blot)
IGI Inferred from Genetic Interaction (double mutant analysis)
Computational Analysis Evidence Codes (non-EXP)
ISS Inferred from Sequence or Structural Similarity
- based on published sequence alignment
IEA Inferred from Electronic Annotation
- InterPro2GO
Examples
http://geneontology.org/page/guide-go-evidence-codes
Commonly used evidence codes
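When working with downloaded annotations it is often useful to separate experimental from computational evidence. The following is a minimal Python sketch that groups only the codes listed on these slides. The function names and the (gene, GO ID, evidence code) tuple layout are assumptions for illustration, and codes not listed here are reported as "unlisted" rather than guessed.

```python
# Codes as grouped on the slides.
EXPERIMENTAL = {"IDA", "IMP", "IPI", "IEP", "IGI"}
COMPUTATIONAL = {"ISS", "IEA"}


def evidence_class(code):
    """Return 'EXP', 'non-EXP' or 'unlisted' for an evidence code."""
    code = code.strip().upper()
    if code in EXPERIMENTAL:
        return "EXP"
    if code in COMPUTATIONAL:
        return "non-EXP"
    return "unlisted"


def experimental_only(annotations):
    """Keep (gene, go_id, evidence_code) tuples backed by experiments."""
    return [a for a in annotations if evidence_class(a[2]) == "EXP"]
```

For example, a list holding one IMP-based and one IEA-based annotation is reduced to the IMP-based entry.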
Evidence code    Annotation counts    %
EXP                         95,435    34.7
  IDA                       56,271    20.4
  IEP                        6,651     2.4
  IGI                        4,286     1.6
  IMP                       19,441     7.1
  IPI                        8,786     3.2
Non-EXP                    179,801    65.3
Total                      275,236    100
Summary of Arabidopsis GO annotations in TAIR
Notes: 9,186 unique publications used in EXP annotations
Based on TAIR ATH_GO_GOSLIM.txt 2016-06-05
Based on annotation data as of May 24, 2016
- Query gene function information
- GO annotation projection
- Functional categorization
- Term enrichment
Application: What can you do with TAIR GO/PO annotations?
Get annotations for individual genes from the TAIR locus page
Gene Ontology
annotations
Plant Ontology
annotations
Other functional information:
Gene summary
Polymorphism
Phenotype
Publications
Gene symbols
Get annotations for a list of genes
Find genes annotated to a GO/PO term
Download all GO/PO annotations
GO annotations by species
(Chart comparing Rat, Human, Mouse, Arabidopsis, Zebrafish, Worm, Chicken, Fly, Yeast, Rice and E. coli)
Source: http://geneontology.org/page/current-go-statistics 2016-06-03
Annotating new plant genomes by projecting GO terms from Arabidopsis
onto other non-model plant species based on gene orthology
EnsemblPlants Compara
• Use the Compara pipeline to build orthology
• Automatically transfer GO annotations to plant orthologs
Rules
• orthologs must share at least 40% peptide identity with each other
• only GO annotations with an evidence type of IDA, IEP, IGI, IMP or IPI are projected
• no annotations with a 'NOT' qualifier are projected
• annotations to the GO:0005515 protein binding term are not projected
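The four projection rules can be expressed as a single filter. This is a minimal Python sketch under stated assumptions: the `Annotation` class, `is_projectable` and `projected_terms` are hypothetical names, the gene identifiers are placeholders, and nothing here reflects the actual Ensembl Compara code. The qualifier is assumed to be pipe-separated, as in GAF annotation files.

```python
from dataclasses import dataclass

PROJECTABLE_EVIDENCE = {"IDA", "IEP", "IGI", "IMP", "IPI"}
PROTEIN_BINDING = "GO:0005515"
MIN_PEPTIDE_IDENTITY = 40.0  # percent


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one source annotation."""
    gene: str
    go_id: str
    evidence: str
    qualifier: str = ""  # e.g. "NOT"; assumed pipe-separated


def is_projectable(annotation, percent_identity):
    """Apply the four rules listed on the slide."""
    qualifiers = annotation.qualifier.upper().split("|")
    return (
        percent_identity >= MIN_PEPTIDE_IDENTITY
        and annotation.evidence in PROJECTABLE_EVIDENCE
        and "NOT" not in qualifiers
        and annotation.go_id != PROTEIN_BINDING
    )


def projected_terms(annotations, percent_identity):
    """GO IDs that would be transferred to an ortholog at this identity."""
    return [a.go_id for a in annotations if is_projectable(a, percent_identity)]
```

An IDA annotation to GO:0006412 passes at 55% identity, while the same annotation at 39.9% identity, an IEA annotation, a 'NOT' annotation, or an IPI annotation to GO:0005515 is filtered out.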
TAIR’s functional categorization tool
(Figure: charts for Cellular component, Molecular function and Biological process; table columns: Functional category, Gene count)
Overrepresentation statistical test:
In my list of genes, are any functional classes (for
example a GO process) found more often than
expected when compared with the reference list?
Term enrichment analysis
GOC provides a term enrichment tool powered by PANTHER
pantherdb.org geneontology.org
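Input 1
Input 2
ID mapping
Use up-to-date annotations
Output (expected count): 168/26684 = 0.63% of reference-list genes are annotated to the term; 0.63% × 442 input genes = 2.78 genes expected by chance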
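The arithmetic on this slide can be reproduced in Python. The sketch reads 168 as the number of reference-list genes annotated to the term, 26684 as the reference list size and 442 as the input list size, which is an interpretation of the slide figures. The observed count of 12 is a made-up value for illustration. The binomial upper tail is one common choice of overrepresentation test, and the slides do not state which statistic the tool applies.

```python
from math import comb


def expected_count(term_in_reference, reference_size, input_size):
    """Genes expected in the input list if the term occurs at the reference rate."""
    return input_size * term_in_reference / reference_size


def binomial_upper_tail(observed, input_size, p):
    """P(X >= observed) for X ~ Binomial(input_size, p)."""
    return sum(
        comb(input_size, k) * p**k * (1 - p) ** (input_size - k)
        for k in range(observed, input_size + 1)
    )


rate = 168 / 26684                          # fraction of reference genes with the term
expected = expected_count(168, 26684, 442)  # about 2.78, as on the slide

observed = 12                               # hypothetical observed count, not from the slides
fold_enrichment = observed / expected
p_value = binomial_upper_tail(observed, 442, rate)
```

The direct summation is adequate for list sizes like these; for much larger lists a statistics library would be a safer way to compute the tail probability.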
Model for the regulation of long-term drought
responses in Q. suber root
Model for ABA-dependent drought response in cork oak
1 The main activity of TAIR curators is producing a ‘gold standard’
annotated reference genome dataset by integrating
experimental data from the research literature. New annotations
are constantly added.
2 One common use of TAIR is to infer the function of genes in
agriculturally important species based on orthology to
Arabidopsis genes.
3 TAIR’s annotations are used in applications such as functional
categorization and term enrichment. It is important to use the latest
annotation file from TAIR.
Summary
Community annotation: making your data and
publication more discoverable
Donghui Li
Community annotation on TAIR
Why should everyone participate -
increased exposure of your work
1. Pre-publication: register your gene symbol to minimize accidental duplications in gene nomenclature
2. Preparing your manuscript: include AGI locus identifiers
3. Post-publication: submit your annotation to us (any journal)
Tips to make your research more discoverable
AT1G56650 PAP1 PRODUCTION OF ANTHOCYANIN PIGMENT 1
AT2G01180 PAP1 PHOSPHATIDIC ACID PHOSPHATASE 1
AT2G27190 PAP1 PURPLE ACID PHOSPHATASE 1
AT3G16500 PAP1 PHYTOCHROME-ASSOCIATED PROTEIN 1
Gene name duplication makes it harder to find the
right gene
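The PAP1 example can be made concrete with a small lookup. This is a minimal Python sketch using only the four records listed above; the `resolve` helper and the dictionary names are hypothetical. It shows that the symbol alone is ambiguous, while each AGI locus identifier points to exactly one gene.

```python
from collections import defaultdict

# (AGI locus, symbol, full name) as listed on the slide.
RECORDS = [
    ("AT1G56650", "PAP1", "PRODUCTION OF ANTHOCYANIN PIGMENT 1"),
    ("AT2G01180", "PAP1", "PHOSPHATIDIC ACID PHOSPHATASE 1"),
    ("AT2G27190", "PAP1", "PURPLE ACID PHOSPHATASE 1"),
    ("AT3G16500", "PAP1", "PHYTOCHROME-ASSOCIATED PROTEIN 1"),
]

LOCI_BY_SYMBOL = defaultdict(list)
FULL_NAME_BY_LOCUS = {}
for locus, symbol, full_name in RECORDS:
    LOCI_BY_SYMBOL[symbol].append(locus)
    FULL_NAME_BY_LOCUS[locus] = full_name


def resolve(symbol):
    """Return every locus that shares the given symbol."""
    return list(LOCI_BY_SYMBOL.get(symbol.upper(), []))
```

Here `resolve("PAP1")` returns four loci, so a paper that cites only the symbol leaves the reader to guess which gene is meant.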
Plant Cell Physiol. 2010 Jun;51(6):866-76
Plant Cell Physiol. Jun;51(6):877-83
Conflicting nomenclature and errors in publications are not
uncommon
PMID:21447788
Mandatory requirement for publishing in some journals
Always include AGI codes
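A quick check that a manuscript mentions AGI codes can be scripted. The Python sketch below assumes the common locus layout of "AT", a chromosome character (1 to 5, C or M), "G" and five digits. The pattern and the `find_agi_codes` helper are illustrative assumptions and not an official validator.

```python
import re

# Assumed layout: AT + chromosome (1-5, C, M) + G + five digits.
AGI_PATTERN = re.compile(r"\bAT[1-5CM]G\d{5}\b", re.IGNORECASE)


def find_agi_codes(text):
    """Return the distinct AGI locus codes in order of first appearance."""
    seen = []
    for match in AGI_PATTERN.finditer(text):
        code = match.group(0).upper()
        if code not in seen:
            seen.append(code)
    return seen
```

Running this over a draft and getting an empty list is a simple signal that locus identifiers still need to be added next to the gene symbols.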
How to submit
Requires a login so we can credit submitter
no subscription required
Video tutorial
Provide ‘evidence with’ as comments
Multiple genes?
• “I do profit a lot from the data on TAIR, thus
this submission is a small contribution to
extend the data present on TAIR.”
• “I gratefully did it [data submission] because I
already benefit from similar information for
other genes.”
Community feedback
Q&A
AT3G25070
AT2G32700
IPI - protein interacting partner
IGI - other mutated loci in a double or triple mutant
Some (but not all) annotations have supporting information in
the Evidence with field
Pay attention to the NOT qualifier in relationship type
Reactome

Using Biological Ontologies to Accelerate Progress in Plant Biology Research

  • 1. TAIR: A Sustainable Community Resource for Arabidopsis Research. International Conference on Arabidopsis Research (ICAR 2016), GyeongJu, Korea
  • 2. (1) TAIR: a sustainable community resource for Arabidopsis research (Eva Huala); (2) Using biological ontologies to accelerate progress in plant biology research (Donghui Li); (3) Community annotation: making your data and publication more discoverable (Donghui Li)
  • 3. Using biological ontologies to accelerate progress in plant biology research. Donghui Li, TAIR/Phoenix Bioinformatics
  • 4. Every year, an average of:
      • Over 3000 Arabidopsis research articles are added
      • Over 2000 papers are associated with genes
      • Over 400 articles have gene function, expression or phenotype data extracted
      • Over 5000 experiment-based annotations are added using controlled vocabularies (GO and PO ontologies)
      Producing a ‘gold standard’ annotated reference plant genome: highly structured, searchable, computable functional annotations
  • 7. Outline
      • How do we use biological ontologies to annotate Arabidopsis gene function?
      • How to read/interpret annotations?
      • What can you do with these annotations?
  • 8. Why do we need ontologies? Inconsistency in free text:
      • Different names for the same concept: translation, protein synthesis
      • Same name for different concepts: bud initiation?
  • 9. A Gene Ontology (GO) term
      Accession: GO:0006412
      Name: translation
      Ontology: biological_process
      Synonyms: protein anabolism, protein biosynthesis, protein biosynthetic process, protein formation, protein synthesis, protein translation
      Definition: The cellular metabolic process in which a protein is formed, using the sequence of a mature mRNA molecule to specify the sequence of amino acids in a polypeptide chain. Translation is mediated by the ribosome, and begins with the formation of a ternary complex between aminoacylated initiator methionine tRNA, GTP, and initiation factor 2, which subsequently associates with the small subunit of the ribosome and an mRNA. Translation ends with the release of a polypeptide chain from the ribosome.
      Source: GOC:go_curators
  • 10. Gene Ontology (GO)
      • molecular function: catalytic / binding activities (kinase activity, DNA binding activity)
      • biological process: biological goal or objective (protein translation, mitosis)
      • cellular component: location or complex (nucleus, ribosome, proteasome)
      More info at www.geneontology.org
  • 11. Terms in an ontology are connected by relationships: is_a, part_of
  • 12. Annotation at different depths of the ontology (is_a, part_of)
  • 13. Retrieval at higher nodes in the ontology (is_a, part_of)
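The idea on slides 11 to 13 (terms linked by is_a and part_of, genes annotated at any depth, retrieval at a higher node picking up everything below it) can be sketched in a few lines of Python. The tiny graph and the gene names below are illustrative assumptions, not the actual GO structure or real TAIR annotations.

```python
# Toy ontology: child term -> list of (relationship, parent term).
# Illustrative only; this is NOT the real GO graph.
PARENTS = {
    "nucleolus": [("part_of", "nucleus")],
    "nucleus": [("is_a", "organelle")],
    "ribosome": [("is_a", "organelle")],
    "organelle": [("part_of", "cell")],
    "cell": [],
}

# Hypothetical gene -> directly annotated terms (at different depths).
ANNOTATIONS = {
    "geneA": {"nucleolus"},
    "geneB": {"nucleus"},
    "geneC": {"ribosome"},
}


def ancestors(term):
    """Return the term itself plus every term reachable via is_a / part_of."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parent for _, parent in PARENTS.get(current, []))
    return seen


def genes_for(query_term):
    """Genes annotated to query_term or to any of its descendant terms."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(query_term in ancestors(term) for term in terms)
    )


print(genes_for("nucleolus"))  # most specific term
print(genes_for("nucleus"))    # higher node also retrieves geneA
print(genes_for("organelle"))  # higher still
```

Querying a higher node returns a superset of the genes found at the more specific terms beneath it, which is why annotating to the most specific supported term loses nothing at retrieval time.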
  • 15. Anatomy of a GO annotation: gene product, GO term, evidence code, reference
  • 16. Commonly used evidence codes (examples in parentheses)
      Experimental evidence codes (EXP)
      • IDA: Inferred from Direct Assay (enzyme assays, in situ hybridization)
      • IMP: Inferred from Mutant Phenotype (analysis of visible trait)
      • IPI: Inferred from Physical Interaction (yeast two-hybrid)
      • IEP: Inferred from Expression Pattern (RT-PCR, Western blot)
      • IGI: Inferred from Genetic Interaction (double mutant analysis)
      http://geneontology.org/page/guide-go-evidence-codes
  • 17. Commonly used evidence codes (examples in parentheses)
      Experimental evidence codes (EXP)
      • IDA: Inferred from Direct Assay (enzyme assays, in situ hybridization)
      • IMP: Inferred from Mutant Phenotype (analysis of visible trait)
      • IPI: Inferred from Physical Interaction (yeast two-hybrid)
      • IEP: Inferred from Expression Pattern (RT-PCR, Western blot)
      • IGI: Inferred from Genetic Interaction (double mutant analysis)
      Computational analysis evidence codes (non-EXP)
      • ISS: Inferred from Sequence or Structural Similarity (based on published sequence alignment)
      • IEA: Inferred from Electronic Annotation (InterPro2GO)
      http://geneontology.org/page/guide-go-evidence-codes
  • 18. Summary of Arabidopsis GO annotations in TAIR
      Evidence code   Annotation counts   %
      EXP             95,435              34.7
        IDA           56,271              20.4
        IEP           6,651               2.4
        IGI           4,286               1.6
        IMP           19,441              7.1
        IPI           8,786               3.2
      Non-EXP         179,801             66.2
      Total           275,236             101
      Notes: 9,186 unique publications used in EXP annotations. Based on TAIR ATH_GO_GOSLIM.txt, 2016-06-05.
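As a quick check on the figures in slide 18, the percentage shares can be recomputed from the annotation counts. This is only a sketch using the counts quoted on the slide; the variable and function names are my own. By this arithmetic the five experimental codes sum to the EXP total of 95,435, and the non-EXP share comes out at roughly 65.3% rather than the 66.2% printed on the slide.

```python
# Annotation counts as quoted on the slide (TAIR ATH_GO_GOSLIM.txt, 2016-06-05).
EXP_COUNTS = {"IDA": 56271, "IEP": 6651, "IGI": 4286, "IMP": 19441, "IPI": 8786}
NON_EXP_COUNT = 179801

exp_total = sum(EXP_COUNTS.values())
grand_total = exp_total + NON_EXP_COUNT


def share(count):
    """Percentage of all annotations, rounded to one decimal place."""
    return round(100 * count / grand_total, 1)


print("EXP", exp_total, share(exp_total))
for code, count in EXP_COUNTS.items():
    print(" ", code, count, share(count))
print("Non-EXP", NON_EXP_COUNT, share(NON_EXP_COUNT))
print("Total", grand_total)
```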
  • 19. Summary of Arabidopsis GO annotations in TAIR (based on annotation data as of May 24, 2016)
  • 20. Application: What can you do with TAIR GO/PO annotations?
      - Query gene function information
      - GO annotation projection
      - Functional categorization
      - Term enrichment
  • 22. Get annotations for individual genes from the TAIR locus page: Gene Ontology annotations, Plant Ontology annotations
  • 23. Get annotations for individual genes from the TAIR locus page. Other functional information: gene summary, polymorphism, phenotype, publications, gene symbols
  • 24. Get annotations for a list of genes
  • 25. Get annotations for a list of genes
  • 26. Get annotations for a list of genes
  • 27. Find genes annotated to a GO/PO term
  • 30. Download all GO/PO annotations
  • 33. Annotating new plant genomes by projecting GO terms from Arabidopsis onto other non-model plant species based on gene orthology (EnsemblPlants Compara)
      • Use the Compara pipeline to build orthology
      • Automatically transfer GO annotations to plant orthologs
      Rules:
      • orthologs must have at least 40% peptide identity to each other
      • only GO annotations with an evidence type of IDA, IEP, IGI, IMP or IPI are projected
      • no annotations with a 'NOT' qualifier are projected
      • annotations to the GO:0005515 protein binding term are not projected
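The four projection rules on slide 33 amount to a simple filter over annotations. Below is a minimal sketch of that filter, not the Ensembl Compara implementation: the `Annotation` class, the `is_projectable` function, the single `percent_identity` argument, and the pipe-separated qualifier string are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds and code lists taken from the rules on the slide.
EXPERIMENTAL_CODES = {"IDA", "IEP", "IGI", "IMP", "IPI"}
PROTEIN_BINDING = "GO:0005515"
MIN_IDENTITY = 40.0


@dataclass(frozen=True)
class Annotation:
    """Hypothetical minimal annotation record."""
    gene: str
    go_id: str
    evidence: str
    qualifier: str = ""  # assumed pipe-separated, e.g. "NOT" or "NOT|contributes_to"


def is_projectable(annotation, percent_identity):
    """Apply the four slide rules to one annotation and one ortholog pair."""
    qualifiers = annotation.qualifier.upper().split("|")
    return (
        percent_identity >= MIN_IDENTITY
        and annotation.evidence in EXPERIMENTAL_CODES
        and "NOT" not in qualifiers
        and annotation.go_id != PROTEIN_BINDING
    )


# Hypothetical examples; gene names are placeholders.
direct_assay = Annotation("geneX", "GO:0006412", "IDA")
electronic = Annotation("geneX", "GO:0006412", "IEA")
negated = Annotation("geneX", "GO:0006412", "IMP", qualifier="NOT")
binding = Annotation("geneX", "GO:0005515", "IPI")

print(is_projectable(direct_assay, 55.0))  # passes all four rules
print(is_projectable(direct_assay, 39.9))  # fails the identity threshold
print(is_projectable(electronic, 80.0))    # fails the evidence rule
```

The slide states the identity rule as mutual ("to each other"); a fuller version would presumably check the identity in both directions of the ortholog pair.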
  • 39. Term enrichment analysis. Overrepresentation statistical test: in my list of genes, are any functional classes (for example a GO process) found more often than expected when compared with the reference list? [Figure: gene count per functional category, biological process]
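The overrepresentation question on slide 39 can be illustrated with a one-sided hypergeometric test, which is one common way to frame it. This is a sketch only: it does not claim to reproduce the statistics used by any particular tool, it applies no multiple-testing correction, and the gene counts in the example are made up.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k term-annotated genes in a list
    of n genes drawn from a reference of N genes, K of which carry the term."""
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


# Hypothetical numbers: reference list of 10 genes, 3 annotated to the term;
# my list has 2 genes and both carry the term.
p = hypergeom_pvalue(k=2, n=2, K=3, N=10)
print(p)
```

A small p-value suggests the term appears in the gene list more often than expected from the reference list; in practice one test is run per term, so the p-values need correcting for multiple testing.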
  • 40. GOC provides a term enrichment tool powered by PANTHER (pantherdb.org, geneontology.org)
  • 41. Input 1; Input 2; ID mapping. Use up-to-date annotations.
  • 43. Model for the regulation of long-term drought responses in Q. suber root; model for ABA-dependent drought response in cork oak
  • 44. Summary
      1. The main activity of TAIR curators is producing a ‘gold standard’ annotated reference genome dataset by integrating experimental data from the research literature. New annotations are constantly added.
      2. One common use of TAIR is to infer the function of genes in agriculturally important species based on orthology to Arabidopsis genes.
      3. TAIR’s annotations are used in applications such as functional categorization and term enrichment. It is important to use the latest annotation file from TAIR.
  • 45. Community annotation: making your data and publication more discoverable. Donghui Li
  • 47. Why should everyone participate? Increased exposure of your work
  • 50. Tips to make your research more discoverable
      1. Pre-publication: register your gene symbol to minimize accidental duplications in gene nomenclature
      2. Preparing your manuscript: include AGI locus identifiers
      3. Post-publication: submit your annotation to us (any journal)
  • 51. Gene name duplication makes it harder to find the right gene
      AT1G56650   PAP1   PRODUCTION OF ANTHOCYANIN PIGMENT 1
      AT2G01180   PAP1   PHOSPHATIDIC ACID PHOSPHATASE 1
      AT2G27190   PAP1   PURPLE ACID PHOSPHATASE 1
      AT3G16500   PAP1   PHYTOCHROME-ASSOCIATED PROTEIN 1
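The PAP1 example on slide 51 shows why a gene symbol alone cannot identify a locus. A short sketch that indexes the four records from the slide by symbol makes the ambiguity explicit; the function names are illustrative, not part of any TAIR tool.

```python
from collections import defaultdict

# (AGI locus identifier, symbol, full name), as listed on the slide.
RECORDS = [
    ("AT1G56650", "PAP1", "PRODUCTION OF ANTHOCYANIN PIGMENT 1"),
    ("AT2G01180", "PAP1", "PHOSPHATIDIC ACID PHOSPHATASE 1"),
    ("AT2G27190", "PAP1", "PURPLE ACID PHOSPHATASE 1"),
    ("AT3G16500", "PAP1", "PHYTOCHROME-ASSOCIATED PROTEIN 1"),
]


def loci_by_symbol(records):
    """Group AGI locus identifiers under each gene symbol."""
    index = defaultdict(list)
    for locus, symbol, _full_name in records:
        index[symbol].append(locus)
    return dict(index)


def ambiguous_symbols(records):
    """Symbols that point to more than one locus."""
    return {
        symbol: loci
        for symbol, loci in loci_by_symbol(records).items()
        if len(loci) > 1
    }


print(ambiguous_symbols(RECORDS))
```

The AGI locus identifier is unique per locus, which is why stating it in a manuscript removes the ambiguity that the symbol carries.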
  • 52. Conflicting nomenclature / errors in publications are not uncommon: Plant Cell Physiol. 2010 Jun;51(6):866-76; Plant Cell Physiol. Jun;51(6):877-83
  • 53. Always include AGI codes. Mandatory requirement for publishing in some journals. PMID:21447788
  • 55. Requires a login so we can credit the submitter; no subscription required. Video tutorial
  • 59. Community feedback
      • “I do profit a lot from the data on TAIR, thus this submission is a small contribution to extend the data present on TAIR.”
      • “I gratefully did it [data submission] because I already benefit from similar information for other genes.”
  • 60. Q&A
  • 61. Some (but not all) annotations have supporting information in the Evidence with field (AT3G25070, AT2G32700)
      • IPI - protein interacting partner
      • IGI - other mutated loci in a double or triple mutant
  • 62. Pay attention to the NOT qualifier in relationship type