WikiGenomes and Chlambase: Microbial genomics data in Wikidata.
Tim E. Putman¹, Sebastian Burgstaller-Muehlbacher¹, Andra Waagmeester², Chunlei Wu¹,
Kevin Hybiske³, Benjamin M. Good¹, and Andrew I. Su¹
¹ Department of Integrative, Structural and Computational Biology, The Scripps Research Institute, La Jolla, USA; sulab.org
² Micelio, Antwerp, Belgium
³ Division of Allergy and Infectious Diseases, Department of Medicine, University of Washington
Motivation
Wikidata provides an extensible, open framework that is well suited to aggregating
distributed data in a centralized database. It supports:
• complex querying based on a semantic data model
• domain-specific web applications that allow the user to
both read and write data
Here, we describe the use of Wikidata to integrate microbial genomics data
using WikiGenomes and a Chlamydia-specific instance called Chlambase.
Data model and implementation
A) Semantic microbial data model consisting of a hierarchical taxonomic schema
and separate entities for gene and protein. The nodes are Wikidata ‘items’, and
‘properties’ define the relationships. B) Python-based ‘bot’ software for gathering
data from different resources and reading and writing directly to Wikidata
(https://github.com/SuLab/WikidataIntegrator).
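As a rough illustration of what a bot has to assemble before writing a gene item, the sketch below builds statements in the Wikibase JSON statement layout using only the Python standard library. It does not use the WikidataIntegrator API, and it is not the authors' code. The record fields (`taxon_qid`, `entrez_id`, and so on) and the placeholder identifier values are hypothetical; the property and item numbers are the ones shown on the poster.

```python
import json


def string_claim(prop: str, value: str) -> dict:
    """Statement whose value is a plain string, e.g. an external identifier."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def item_claim(prop: str, qid: str) -> dict:
    """Statement whose value is another Wikidata item."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(qid[1:]),
                    "id": qid,
                },
                "type": "wikibase-entityid",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


def gene_claims(record: dict) -> list:
    """Turn one source record (hypothetical field names) into gene statements."""
    claims = [
        item_claim("P703", record["taxon_qid"]),    # found in taxon
        item_claim("P688", record["protein_qid"]),  # encodes
    ]
    # External identifiers are added only when the source provides them.
    for key, prop in (("entrez_id", "P351"), ("locus_tag", "P2393")):
        if record.get(key):
            claims.append(string_claim(prop, record[key]))
    return claims


# Item numbers come from the poster; the identifier values are placeholders.
record = {
    "taxon_qid": "Q20800254",    # C. trachomatis 434/BU (strain)
    "protein_qid": "Q21153984",  # TRPA (protein)
    "entrez_id": "0000000",      # placeholder, not a real Entrez ID
    "locus_tag": "EXAMPLE_0000", # placeholder, not a real locus tag
}
claims = gene_claims(record)
print(json.dumps(claims[0], indent=2))
```

Keeping statement construction separate from the write step makes it possible to inspect or compare the statements against an existing item before anything is sent to Wikidata.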
Scope and diversity of microbial data
A) Various data sources for microbial genetic data. B) Cumulative sum of bacterial
and eukaryotic genome assemblies submitted to NCBI GenBank by year.
Modeling microbial interactions
Figure nodes and their data sources:
• C. trachomatis genome (www.ncbi.nlm.nih.gov/genome/)
• indole (www.drugbank.ca/)
• Chlamydia trachomatis: genes (www.ncbi.nlm.nih.gov/gene/)
• Human: indoleamine 2,3-dioxygenase (www.uniprot.org/)
• tryptophanase (www.uniprot.org/)
• C. trachomatis: tryptophan synthase alpha and beta (www.uniprot.org/)
• C. trachomatis: tryptophan synthase (www.rhea-db.org)
• C. trachomatis: trpRBA operon (www.operondb.jp/; Akers et al. 2006)
A) The interactions between host, pathogen,
microbiome, and small molecules that lead to
pathogen persistence during a chlamydial infection in
humans (originally hypothesized by Caldwell et al.
2003). Blue URLs indicate the source of the data, and edges
are defined by properties in Wikidata. B) SPARQL
query results for organisms that are capable of
producing indole.
B. Organisms that produce indole
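The poster does not reproduce the SPARQL query behind panel B, so the following is only one plausible way to phrase it, not the authors' query. It assumes that indole producers can be approximated as taxa with a protein annotated with the Gene Ontology term for tryptophanase activity (taken here to be GO:0009034, which should be verified), using Gene Ontology ID (P686) together with molecular function (P680) and found in taxon (P703). The function names and the User-Agent string are illustrative.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(go_id: str = "GO:0009034") -> str:
    """SPARQL for taxa having a protein annotated with the given GO term.

    The default GO ID (assumed to be tryptophanase activity) is an assumption.
    """
    return f"""
SELECT DISTINCT ?taxon ?taxonLabel WHERE {{
  ?function wdt:P686 "{go_id}" .
  ?protein wdt:P680 ?function ;
           wdt:P703 ?taxon .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


def parse_taxa(results: dict) -> list:
    """Extract (QID, label) pairs from a SPARQL JSON results document."""
    rows = []
    for binding in results["results"]["bindings"]:
        qid = binding["taxon"]["value"].rsplit("/", 1)[-1]
        label = binding.get("taxonLabel", {}).get("value", qid)
        rows.append((qid, label))
    return rows


def fetch_taxa(go_id: str = "GO:0009034") -> list:
    """Run the query against the public endpoint (needs network access)."""
    params = urllib.parse.urlencode({"query": build_query(go_id), "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "indole-query-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_taxa(json.load(response))
```

Calling `fetch_taxa()` would return whatever Wikidata currently holds; the result depends on how completely proteins are annotated, so it may differ from the list shown on the poster.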
Acknowledgements
We would like to thank Lynn Schriml and Elvira Mitraka of the University of Maryland, the members
of The Apollo Project and the many members of the Wikidata community for valuable contributions
to this project.
References/Funding
Caldwell et al. 2003 (PMID:12782678)
Putman et al. 2016 (PMID:27022157)
Burgstaller-Muehlbacher et al. 2015 (PMID:26989148)
This work is supported by the National Institutes of
Health under grants GM089820 and GM114833.
Domain Specific Portals into Wikidata
WikiGenomes serves as a centralized and generalizable microbial genomics database
for the Long Tail of sequenced genomes. WikiGenomes engages domain experts by
providing integrated gene reports that are otherwise difficult or tedious to access.
WikiGenomes also provides an easy interface that supports community annotation,
which is then immediately written to Wikidata.
Additional nodes from the interaction figure:
• L-tryptophan (www.drugbank.ca/)
• N-Formylkynurenine (www.drugbank.ca/)
Data model figure, Wikidata items:
• Bacteria (Q10876), domain
• C. trachomatis (Q131065), species
• C. trachomatis 434/BU (Q20800254), strain
• trpA (Q21153861), gene
• TRPA (Q21153984), protein
Data model figure, Wikidata properties:
• parent taxon (P171)
• subclass of (P279)
• found in taxon (P703)
• encodes (P688)
• encoded by (P702)
• Entrez ID (P351)
• UniProt ID (P352)
• RefSeq ID (P637)
• locus tag (P2393)
• genomic start (P644)
• genomic stop (P645)
• molecular function (P680)
• cell component (P681)
• biological process (P682)
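The item and property labels from the data model figure can be read as a small graph of subject, property, value statements. The sketch below encodes that reading in plain Python. The assignment of properties to edges (for example, that the strain points to the species through parent taxon) is an interpretation of the figure labels rather than something the poster states explicitly, and the helper names are illustrative.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str  # Wikidata item ID
    prop: str     # Wikidata property ID
    value: str    # Wikidata item ID


LABELS = {
    "Q131065": "C. trachomatis (species)",
    "Q20800254": "C. trachomatis 434/BU (strain)",
    "Q21153861": "trpA (gene)",
    "Q21153984": "TRPA (protein)",
    "P171": "parent taxon",
    "P688": "encodes",
    "P702": "encoded by",
    "P703": "found in taxon",
}

# Edges as interpreted from the figure labels (an assumption, see lead-in).
STATEMENTS = [
    Statement("Q21153861", "P688", "Q21153984"),  # gene encodes protein
    Statement("Q21153984", "P702", "Q21153861"),  # protein encoded by gene
    Statement("Q21153861", "P703", "Q20800254"),  # gene found in strain
    Statement("Q21153984", "P703", "Q20800254"),  # protein found in strain
    Statement("Q20800254", "P171", "Q131065"),    # strain's parent is species
]


def values(subject: str, prop: str) -> list:
    """All values of one property on one item."""
    return [s.value for s in STATEMENTS if s.subject == subject and s.prop == prop]


def lineage(taxon: str) -> list:
    """Follow parent taxon (P171) links upward as far as the data goes."""
    chain = [taxon]
    while True:
        parents = values(chain[-1], "P171")
        if not parents or parents[0] in chain:
            return chain
        chain.append(parents[0])


def describe(statement: Statement) -> str:
    """Human-readable rendering of one statement."""
    return " | ".join(LABELS.get(part, part) for part in statement)


for statement in STATEMENTS:
    print(describe(statement))
```

Because gene and protein are separate items, a question such as "which species does this protein belong to" becomes a two-step walk: found in taxon to reach the strain, then parent taxon to climb the taxonomy.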
Join the team!
bit.ly/genewikidata; sulab.org

WikiGenomes Poster (ISMB)
