Gene Tree/Species Tree Reconciliation

Phylotastic Hackathon
June 4, 2012
iPlant Tree of Life (iPTOL)
• Tree Reconciliation
• Big Trees
• Data Assembly
• Trait Evolution
• Data Integration
• Tree Visualization
Gene Tree Reconciliation

Projection of gene trees onto a species tree
•   gene duplications
•   gene losses
•   lineage sorting
•   horizontal transfer
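
To make the "projection" idea concrete, here is a minimal sketch of one common parsimony approach, last-common-ancestor (LCA) mapping. It is an illustration only and is not claimed to be the method used by any of the tools named in this talk. The toy trees, the gene-to-species table and the function names are all hypothetical, and the sketch labels only duplications and speciations; losses, lineage sorting and horizontal transfer are not modelled.

```python
# Hypothetical toy data: trees are nested tuples, leaves are strings.
SPECIES_TREE = (("A", "B"), "C")
GENE_TREE = (("a1", "b1"), ("a2", "c1"))
GENE_TO_SPECIES = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}


def leaf_set(tree):
    """Return the frozenset of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_lca(species_tree, wanted):
    """Return the smallest species-tree node whose leaves contain `wanted`."""
    node = species_tree
    while not isinstance(node, str):
        for child in node:
            if wanted <= leaf_set(child):
                node = child  # all wanted species sit below this child
                break
        else:
            return node  # wanted species span several children
    return node


def reconcile(gene_tree, species_tree, gene_to_species):
    """Map a gene-tree node to the species tree and label internal nodes.

    Returns (mapped_species_node, events), where events lists one
    (event_type, species_node) pair per internal gene-tree node.
    """
    if isinstance(gene_tree, str):
        return gene_to_species[gene_tree], []
    events, child_maps = [], []
    for child in gene_tree:
        mapped, child_events = reconcile(child, species_tree, gene_to_species)
        child_maps.append(mapped)
        events.extend(child_events)
    wanted = frozenset().union(*(leaf_set(m) for m in child_maps))
    mapped = species_lca(species_tree, wanted)
    # LCA rule: a node mapping to the same place as one of its children
    # is inferred to be a duplication; otherwise it is a speciation.
    kind = "duplication" if mapped in child_maps else "speciation"
    events.append((kind, mapped))
    return mapped, events


if __name__ == "__main__":
    root_map, events = reconcile(GENE_TREE, SPECIES_TREE, GENE_TO_SPECIES)
    for kind, node in events:
        print(kind, "at species node", node)
```

In this toy case the root of the gene tree maps to the root of the species tree, as does one of its children, so the root is inferred to be a duplication.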
Gene Tree Reconciliation

• Locating gene duplications allows us to
  identify orthologs and paralogs
• Identify gene composition in inferred ancestral
  genomes
• Map the positions of ancestral polyploidy
  events
• Contribute to the study of the “fate” of
  duplicated genes
• Address questions of gene family coevolution
Existing TR Cyberinfrastructure

• Generate reconciliations: TreeBeST, primeGSR
• EC gene trees
• Visualize reconciliations: primeTV, fltreebest
Extending TR Cyberinfrastructure

• Increased interoperability among the
  component pieces
• Query the location of gene duplications on the
  species tree
• Integrate tree visualization tools that scale to
  many thousands of nodes
• Allow for the storage and analysis of multiple
  reconciliations for a single gene tree within a
  single database structure
Extending TR Cyberinfrastructure

• Generate reconciliations: TreeBeST, primeGSR, NOTUNG
• Functional annotation: annot8r
• Database: gene trees, species trees, reconciled trees, ontology
• Visualize reconciliations: primeTV, fltreebest
Tree Reconciliation GUI



                 Queries
                 • BLAST
                 • GO Term
                 • Locus Name
                 • Gene Family
                   Name
Current Limitations

• Users query against a pre-computed set of
  reconciliations
  • We generate the species trees
  • We generate the gene trees given alignments
  • We generate reconciliation mappings
• Reconciliation visualization is currently tied to
  the database
• Users can NOT submit their own data (gene
  trees or alignments) for reconciliation
Making TR Phylotastic

• Allow users to generate reconciliations
  using their own data
  • Supply a species tree OR
  • Supply a gene family alignment
Phylotastic Components

• Name resolution
  • Given a gene tree or alignments determine the
    species list

• Tree Pruner
    • Given the species list above, generate the
      species tree required for reconciliation

• NEXML encoding
    • Return reconciled tree using NEXML
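
The first two components can be sketched as follows. This is a toy illustration under stated assumptions, not the Phylotastic services themselves: the nested-tuple trees, the `gene_Species` label convention and the function names are all hypothetical, and real name resolution is considerably harder than splitting a label.

```python
# Hypothetical toy inputs; trees are nested tuples and leaves are strings.
# The "gene_Species" leaf-label convention is an assumption for this sketch.
SPECIES_TREE = ((("Athaliana", "Ptrichocarpa"), "Vvinifera"), ("Osativa", "Zmays"))
GENE_TREE = (("g1_Athaliana", "g2_Vvinifera"), ("g3_Athaliana", "g4_Zmays"))


def leaves(tree):
    """Return the leaf labels of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def species_list(gene_tree):
    """Name resolution step: derive the species set from gene labels."""
    return {label.rsplit("_", 1)[1] for label in leaves(gene_tree)}


def prune(tree, keep):
    """Tree pruner step: restrict `tree` to the species in `keep`.

    Subtrees with no kept leaves are dropped, and nodes left with a
    single child are collapsed into that child.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


if __name__ == "__main__":
    wanted = species_list(GENE_TREE)
    print(sorted(wanted))
    print(prune(SPECIES_TREE, wanted))
```

The pruned tree contains only the species present in the gene tree, which is the species tree a reconciliation step would need as input.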
A Phylotastic DNA Subway ..

Phylotastic reconciliation


Editor's Notes

  1. This iPlant-sponsored Tree Reconciliation Working Group is one of six main working groups that are part of the iPlant Tree of Life program. The overall goals of the iPToL project are to develop the cyberinfrastructure needed to assemble, visualize and analyze the plant tree of life. The goals of the Tree Reconciliation Working Group include the development of database tools for 'post-tree' analysis of the reconciliation of gene trees to species trees. This is post-tree in the sense that the species tree is taken as a given that will result from work being developed by the Big Trees group.
  2. Gene tree reconciliations allow us to map processes and events from the gene tree onto the species tree. These include gene duplications, gene losses, lineage sorting and horizontal transfer.
  3. The utility of gene tree reconciliation … Ancestral polyploidy events are a major component of plant genome evolution.
  4. Existing tools for gene tree reconciliation include: software to generate reconciliations (TreeBeST, primeGSR); software to visualize these reconciliations (primeTV, fltreebest); and databases such as Ensembl Compara that allow us to store reconciled gene trees as well as information regarding the sequences, alignments and locations of the genes comprising the reconciled gene families.
  5. Our initial goals in extending cyberinfrastructure for gene tree reconciliation involved developing a static database of precomputed reconciliations.
  6. We extended the Ensembl Compara database design to include precomputed species trees, precomputed gene trees and a reconciliation mapping between the two. We have also added support for ontologies to tag attributes of trees, nodes, functional gene annotation and developed a Tree … We have high-throughput pipelines for TreeBeST, primeGSR and NOTUNG to generate large numbers of reconciliations and load these into the database. We can also populate functional annotation of genes using input from the annot8r functional annotation program. We have also developed a new interface for visualizing reconciled trees. This interface allows for visualizing reconciled trees stored in the database and supports queries to find reconciled trees within the database.
  7. The GUI allows simultaneous viewing of the species tree and a gene tree reconciled to that species tree. These trees “interact” such that selecting branches in one tree can highlight nodes and edges in the other.
  8. The gene tree node colors highlight the location of duplication and speciation events ..
  9. .. the species tree maps the location of duplication events from the gene tree onto the species tree. Duplication events are shown here as green triangles.
  10. The GUI also provides a way to find reconciled gene families within the database. Queries: BLAST: search for gene families in the database that match a DNA or protein sequence query. GO Term: search for gene families that have been annotated with a specific GO term. Locus Name: identify the gene families that contain a known locus name. Gene Family Name: jump directly to a gene family by name.
  11. Having reconciliations mapped to a database that can be queried like this is awesome, and allows us to ask new questions,
  12. Having reconciliations mapped to a database that can be queried like this is awesome, and allows us to ask new questions,
  13. A difficulty here is determining the species source of the gene given the gene information. The third component, shown here as NEXML encoding, would depend in part on the standards used by Phylotastic for communication among the components of the Phylotastic workflows. See Daniel Packer’s GSoC project for notes on NEXML encoding.
  14. The DNA Subway is an AWESOME education tool that takes users through the process of genome annotation. Starting with genome sequence data (such as a sequenced BAC), students find the genes and can even generate gene trees, using their annotated gene as a query sequence for automated generation of a gene tree. The ‘Prospect Genome’ track currently dead-ends at this gene tree. Given a system that could accept that gene tree as input for reconciliation, it would be possible to generate a reconciled gene tree, which would provide an awesome way to introduce students to the concepts of orthology and paralogy using data that they have generated themselves starting with raw genome sequence. In this case the initial input is unannotated genome sequence, so it would be possible to go from raw genome sequence data to reconciled gene trees using an intuitive interface that is simple enough to use in undergraduate education. This is awesome because this could be student-generated sequence data that has never been annotated before, and the pipeline could result in a set of student-derived reconciled gene trees.