15. Mechanisms of genomic evolution
Gene family evolution
Homeologues
Horizontal transfer
Aggregated genomic distributions of sitewise phenomena
16. Plant and Fungal Trees of Life (PAFTOL)
• 2020 Output
• Genus-level phylogenies inferred from genetic data for plants and fungi
• You can help
17. Thanks – joe.parker@kew.org
BI&SA: Abigail Barker, Rob Turner, John Iacona, James Crowe
PAFTOL: Bill Baker, Delivery Group, Steering Group
QMUL: Steve Rossiter, Kalina Davies, Georgia Tsagkogeorga
WIMM/NDM: Guillaume Stewart-Jones, Emma Bowles
SSI Fellowship
• Software Sustainability Institute-funded hackathon series
• Community building
Sequence data will become ubiquitous, as computers are now, with much of it available in real time and collected by non-experts
Exploiting this requires informatics
Talk about virus work: relate progress to the HIV sequence compendium and the Ebola monitoring programme
Why we are interested in evolution etc
Evolutionary biology mechanisms as well as facts
My central thesis is that comparative analyses of evolutionary processes using genetic data must be statistical…
… only informatics approaches let us do this properly…
… in the very near future all of us will be comparative statistical evolutionary genomicists, or ‘phylogenomics bods’
BGI leader Jun Wang and others (Mick Watson) have even suggested that AI is needed
GenBank doubling roughly every 18 months (2007)
ENA doubling roughly every 10 months
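As a rough worked example of what these doubling times imply, the sketch below assumes simple exponential growth with a constant doubling time, which is an idealisation. The function name `growth_factor` and the ten-year horizon are illustrative choices, not taken from any database tooling.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Fold increase after `months_elapsed`, assuming a constant doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months_elapsed / doubling_months)


# Approximate doubling times quoted in the notes (assumed constant here).
DOUBLING_MONTHS = {"GenBank": 18, "ENA": 10}

if __name__ == "__main__":
    horizon = 120  # ten years, in months (illustrative)
    for name, doubling in DOUBLING_MONTHS.items():
        fold = growth_factor(horizon, doubling)
        print(f"{name}: roughly {fold:,.0f}-fold growth over {horizon} months")
```

Under this assumption a 10-month doubling time compounds to 2^12 = 4096-fold growth in ten years, against roughly 100-fold for an 18-month doubling time.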
Total holdings:
As of 15 August 2015, GenBank release 209.0 has 187,066,846 loci, 199,823,644,287 bases, from 187,066,846 reported sequences.
Like going from the 7 books pictured to nearly a million…
My evidence that phylogenetics is becoming phylogenomics?
Some past work showing how informatics enables statistical application of the comparative method to biological questions
---
Basis of our method
Two convergence hypotheses
Large numbers of loci, including sensory
Simulated sites
Random control tree correction
22 mammals, 2326 loci, ~600,000 sites
Convergence signals across genome?
Loci linked to sensory perception?
The phylogenies
OK that’s the ‘genomics’ bit covered. Now the ‘real time’ bit
Analysing, as fast as sequencing, in the field / real-time
Science fiction?
The MinION
Iterative analysis key to doing better science, not just more
Place natural variation into context
Reduce ascertainment bias, increase taxon discovery
Make use of cloud resources and get phylogenetics into the field
DEMO
0) Delete extra reads in m/raw
1) Start Metrichor + workflow (1-D MAP 005) “app store”
2) Start RRA and tidy windows “prototype”
3) Copy reads from pooled -> raw “simulate”
4) Cross fingers
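The "simulate" step, copying previously sequenced reads from a pooled folder into the raw folder so the pipeline sees them arrive, could be scripted roughly as follows. This is a minimal sketch under assumptions: the function name, batch size, delay and the directory names in the usage note are hypothetical, and it is not the script used in the demo.

```python
import shutil
import time
from pathlib import Path


def simulate_read_arrival(pooled_dir, raw_dir, batch_size=10, delay_s=1.0):
    """Copy read files from pooled_dir to raw_dir in batches.

    Pauses delay_s seconds between batches to mimic reads arriving from a
    sequencer in real time. Returns the number of files copied.
    """
    pooled = Path(pooled_dir)
    raw = Path(raw_dir)
    raw.mkdir(parents=True, exist_ok=True)

    # Sort so the arrival order is reproducible between runs.
    reads = sorted(p for p in pooled.iterdir() if p.is_file())
    copied = 0
    for start in range(0, len(reads), batch_size):
        for read in reads[start:start + batch_size]:
            shutil.copy2(read, raw / read.name)
            copied += 1
        # No need to wait after the final batch.
        if start + batch_size < len(reads):
            time.sleep(delay_s)
    return copied
```

For example, `simulate_read_arrival("pooled", "raw", batch_size=5, delay_s=2.0)` would release five reads every two seconds; both folder names here are placeholders for wherever the reads actually live.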
Microscope? Is ‘camera’ more apt?
From a specialised unwieldy device to commercialised one with thousands of users, app store, magazines, forums etc
Where will we be in 2020? 2025?
Future work – implications of the RTP / MinION and fellows
Real-time phylogenomics and the MinION for monitoring, turbotaxonomy / ID
Applications, use cases
Wacky theoretical stuff
Of course I think the tools are cool, but I’m a scientist, not a tech, for a reason
By which we mean evolution on genomes and their landscape/architecture:
gene family evolution, homology/paralogy/orthology
horizontal transfer
non-independence of sitewise processes (Fourier etc.)
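One simple way to look for non-independence among sitewise values is lag autocorrelation along the alignment; a Fourier analysis would examine the same structure in the frequency domain. The sketch below is illustrative only: it assumes a plain list of per-site statistics, and the function name and toy signal are made up for the example.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a sitewise series at the given lag.

    Values near 0 suggest sites `lag` positions apart vary independently;
    values far from 0 suggest structure along the sequence.
    """
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(values)")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


if __name__ == "__main__":
    # Toy signal with a period of three sites, loosely mimicking a
    # codon-position effect (purely illustrative data).
    toy = [1.0, 0.0, 0.0] * 20
    for lag in (1, 2, 3):
        print(lag, round(autocorrelation(toy, lag), 3))
```

On the toy signal the lag-3 value is strongly positive while lags 1 and 2 are negative, which is the kind of periodic dependence a sitewise analysis would need to account for.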
What it is
Genus level phylogenies dynamically (automatically, continuously) inferred from genomic data
Challenges
A LOT of data needed
Implications of ubiquitous sequencing
Data deluge
My WP
How you can help, tissue etc now
MinIONs with you in a few years?
Thank people involved so far