Managing Genetic Ancestry at Scale
Jason Clark
@jclark1985
Copyright 2015 Monsanto Company
Food is a looming issue as populations rise and farm acres shrink
By 2050, the world will grow by 2 billion people. That is as many people as currently live in North and South America combined, TWICE!
Breeding for a Better Harvest
Making crops yield better under dwindling resources requires huge advances in breeding
[Figure labels: FEED, FOOD, 10K YEARS]
Plant Breeding in a Nutshell
Tracking Our Plant Ancestry
Plant
Plant ID    attributes...
1           ...
2           ...
3           ...

Plant Relationship
Plant ID    Parent ID
3           1
3           2

[Chart: # of Inserts (M) per year, 2005 to 2015]
Our R&D pipeline can be cyclical
8-10 years
Ask a question about an Ancestry
Can you return to me all ancestors of a given plant?
It’s Complicated
This is a single breeding line!
Our reads do not scale…
[Chart: query response time, Response (s)]
At a depth of 15, we killed the query at 1.5 hours
Database indexes do not help
Identifying each set of related materials potentially requires a full scan of an index: O(m log n)
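As a rough illustration of where that cost comes from, the sketch below models the Plant Relationship table as a sorted list of rows and uses a binary search as a stand-in for an index seek. The data, the function names, and the seek counter are all hypothetical; a real database index behaves differently in detail, but the shape of the work is similar: one logarithmic lookup for every plant visited.

```python
from bisect import bisect_left

# Hypothetical stand-in for an indexed Plant Relationship table:
# rows of (plant_id, parent_id), kept sorted by plant_id like an index.
RELATIONSHIPS = sorted([(3, 1), (3, 2), (5, 3), (5, 4)])


def parents_of(plant_id):
    """Find the parent rows for one plant with a binary search (one seek)."""
    i = bisect_left(RELATIONSHIPS, (plant_id,))
    parents = []
    while i < len(RELATIONSHIPS) and RELATIONSHIPS[i][0] == plant_id:
        parents.append(RELATIONSHIPS[i][1])
        i += 1
    return parents


def ancestors_of(plant_id):
    """Collect all ancestors and count how many index seeks were needed."""
    seen, frontier, seeks = set(), [plant_id], 0
    while frontier:
        current = frontier.pop()
        seeks += 1  # one O(log n) seek per visited plant, m seeks overall
        for parent in parents_of(current):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen, seeks
```

Calling ancestors_of(5) on this toy data visits five plants and therefore performs five seeks; the number of seeks grows with the size of the ancestry, not just with the size of the answer for a single generation.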
Ask a question about an Ancestry
Can you return to me all ancestors of a given plant?
Index Free Adjacency (IFA)
A single index hit finds my starting point; all other relationship identification is O(1)
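The idea can be sketched with plain object references. This is only a toy model under assumed names (PlantNode, nodes, ancestor_ids) and made-up data, not a description of how Neo4j stores records: a dictionary lookup plays the role of the single index hit, and every later hop follows a reference held by the node itself.

```python
class PlantNode:
    """A node that holds direct references to its parent nodes."""

    def __init__(self, plant_id):
        self.plant_id = plant_id
        self.parents = []  # direct object references, no index involved


# Hypothetical data: plant 3 has parents 1 and 2, plant 5 has parents 3 and 4.
nodes = {i: PlantNode(i) for i in range(1, 6)}
nodes[3].parents = [nodes[1], nodes[2]]
nodes[5].parents = [nodes[3], nodes[4]]


def ancestor_ids(start_id):
    """One lookup for the start node, then follow references only."""
    start = nodes[start_id]  # the single index hit
    seen, stack = set(), list(start.parents)
    while stack:
        node = stack.pop()
        if node.plant_id not in seen:
            seen.add(node.plant_id)
            stack.extend(node.parents)  # constant-time hop per relationship
    return seen
```

Here the cost of a traversal depends on how many relationships are followed, not on how many plants exist in total.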
We were looking for…
• Something that can accurately represent the domain model
• Query performance that remains near constant as we ask questions about particular plants
• Something that easily lends itself to TDD
• Ideally open source with a low barrier to entry
VS.
~700M nodes
~1.2B relationships
Ask a question about an Ancestry
Can you return to me all ancestors of a given plant?
Enabling Innovation
Providing the ability to consume raw trees gives our consumers a way to leverage the power of the graph database on top of our ancestry grammar
The team identified a basic set of features and codified patterns to identify important features in an ancestry
Derived at query time
• Return “raw” ancestral trees to consumers
• Allow on-demand pruning of raw trees
• Promote language consistency across business consumers
In RESTful Style
/materials/1/parents
{
  "nodes": [
    { "id": 1, "attr1": "foo" },
    { "id": 2, "attr1": "bar" }
  ],
  "relationships": [
    { "from": 1, "to": 2, "relation": "PARENT" }
  ]
}
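A payload of this shape could be assembled by walking child-to-parent edges and emitting one node entry per material and one relationship entry per edge. The sketch below is a minimal illustration with hypothetical in-memory data (ATTRS, PARENTS) and a hypothetical function name (raw_tree); it assumes the endpoint returns the whole tree above the material, which the slides do not state explicitly.

```python
import json

# Hypothetical in-memory data; attribute names mirror the example response.
ATTRS = {1: {"attr1": "foo"}, 2: {"attr1": "bar"}}
PARENTS = {1: [2]}  # child id -> list of parent ids


def raw_tree(material_id):
    """Build a nodes/relationships payload by walking child -> parent edges."""
    nodes, relationships = [], []
    seen, queue = set(), [material_id]
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        nodes.append({"id": current, **ATTRS.get(current, {})})
        for parent in PARENTS.get(current, []):
            relationships.append(
                {"from": current, "to": parent, "relation": "PARENT"}
            )
            queue.append(parent)
    return {"nodes": nodes, "relationships": relationships}


print(json.dumps(raw_tree(1), indent=2))
```

Keeping nodes and relationships in separate flat lists means a consumer can rebuild or prune the tree without the server having to guess which nesting the consumer wants.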
Predefined Ancestral Milestones
Given where I am on the Earth now, where is the closest sandwich shop “X”?
The team identified a basic set of features and codified patterns to identify important features in an ancestry
Derived at query time
• Traverse raw crossing records at query time
• Deriving at query time lets patterns adapt more easily to changes in business process
• Prevents data decay
Binary Cross Milestone
GET /materials/5/binary-cross
{
  "male": {
    "id": 1
  },
  "female": {
    "id": 2
  }
}
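One way to picture a milestone derived at query time is a function that reads raw crossing records and reports the two parents by role. The record layout (child, parent, role), the role labels, and the function name below are assumptions made for illustration; the slides do not show how roles are actually stored.

```python
# Hypothetical raw crossing records: (child_id, parent_id, role).
CROSSING_RECORDS = [
    (5, 1, "MALE"),
    (5, 2, "FEMALE"),
]


def binary_cross(material_id):
    """Derive the male and female parents of a material at query time."""
    result = {}
    for child, parent, role in CROSSING_RECORDS:
        if child == material_id:
            result[role.lower()] = {"id": parent}
    # Report a binary cross only when both a male and a female parent exist.
    if set(result) != {"male", "female"}:
        return None
    return result
```

Because nothing is precomputed, a change in how crosses are recorded only requires changing this derivation, not rewriting stored results.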
Let’s ask a more complex question
Do any ancestors of a given plant show a strong resistance to a particular disease?
Who are the first of my ancestors to immigrate from Germany and Ireland to America?
Decorating the Ancestry
[Figure: ancestry graph with genotype nodes labeled G]
Genotype nodes act as simple pointers to remote systems
Ask a complex question
/materials/1/parents?until=genotyped-ancestor&props=genotypes
{
  "nodes": [
    { "id": 1 },
    { "id": 2 },
    { "id": 3, "genotypes": [{ "id": 1234 }] }
  ],
  "relationships": [
    { "from": 1, "to": 2, "relation": "PARENT" },
    { "from": 2, "to": 3, "relation": "PARENT" }
  ]
}
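The until parameter can be read as a stop condition on the upward traversal: keep climbing until an ancestor carries a genotype pointer, then prune that branch. The sketch below reproduces the example response under that reading. The data, the function name, and the exact pruning rule are assumptions, not the API's documented behavior.

```python
# Hypothetical data matching the example response above.
PARENTS = {1: [2], 2: [3], 3: [4]}   # child id -> parent ids
GENOTYPES = {3: [{"id": 1234}]}      # material id -> genotype pointers


def parents_until_genotyped(material_id):
    """Walk up the ancestry, pruning each branch at its first genotyped ancestor."""
    nodes, relationships = [], []
    seen, queue = set(), [material_id]
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        node = {"id": current}
        if current in GENOTYPES:
            node["genotypes"] = GENOTYPES[current]
        nodes.append(node)
        if current in GENOTYPES and current != material_id:
            continue  # stop condition reached: do not climb past this ancestor
        for parent in PARENTS.get(current, []):
            relationships.append(
                {"from": current, "to": parent, "relation": "PARENT"}
            )
            queue.append(parent)
    return {"nodes": nodes, "relationships": relationships}
```

In this toy data, material 4 sits above the genotyped material 3 and is therefore left out of the result, which is the on-demand pruning the earlier slides describe.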
Architecture
Informing our Ancestry backbone of additional data that identify significant events in a line’s history allows our APIs to evolve and adapt as our agronomic practices change.
Take this with you…
• Untie yourself from your database indexes
• Let Neo4j do the heavy lifting
• Value is added even when the graph is not the system of record
• Keep the storage model as close to the mental model as possible
Thank You
engineering.monsanto.com
discover.monsanto.com

Managing Genetic Ancestry at Scale with Neo4j and Kafka - StampedeCon 2015

  • 1. Managing Genetic Ancestry at Scale. Jason Clark (@jclark1985). Copyright 2015 Monsanto Company
  • 2. Food is a looming issue as populations rise and farm acres shrink. By 2050, the world's population will grow by 2 billion people. That is twice as many people as currently live in North and South America combined.
  • 3. Breeding for a Better Harvest. Making crops yield better under dwindling resources requires huge advances in breeding. (Figure labels: Feed, Food, 10K years.)
  • 4. Plant Breeding in a Nutshell
  • 5. Tracking Our Plant Ancestry. Plant table: Plant ID (rows 1, 2, 3) plus attributes. Plant Relationship table (Plant ID, Parent ID): rows (3, 1) and (3, 2). [Chart: # of inserts (millions) per year, 2005 to 2015.]
  • 6. Our R&D pipeline can be cyclical (8-10 years)
  • 7. Ask a question about an ancestry: Can you return to me all ancestors of a given plant?
  • 8. It's Complicated. This is a single breeding line!
  • 9. Our reads do not scale… [Chart: response time in seconds.] At a depth of 15, we killed the query at 1.5 hours.
  • 10. Database indexes do not help. Identifying each set of related materials potentially requires a full scan of an index: O(m log n).
  • 11. Ask a question about an ancestry: Can you return to me all ancestors of a given plant?
  • 12. Index Free Adjacency (IFA). A single index hit finds my starting point; all other relationship identification is O(1).
  • 13. We were looking for… Something that can accurately represent the domain model
  • 14. We were looking for… Query performance to remain near constant as we ask questions about particular plants
  • 15. We were looking for… Something that easily lends itself to TDD
  • 16. We were looking for… Ideally open source with a low barrier to entry
  • 18. VS. ~700M nodes, ~1.2B relationships
  • 19. Ask a question about an ancestry: Can you return to me all ancestors of a given plant?
  • 20. Enabling Innovation. Providing the ability to consume raw trees gives our consumers a way to leverage the power of the graph database on top of our ancestry grammar. Team identified a basic set of features. Codify patterns to identify important features in an ancestry. Derived at query time.
      • Return "raw" ancestral trees to consumers
      • Allow on-demand pruning of raw trees
      • Promote language consistency across business consumers
  • 21. In RESTful Style. /materials/1/parents
        {
          "nodes": [
            { "id": 1, "attr1": "foo" },
            { "id": 2, "attr1": "bar" }
          ],
          "relationships": [
            { "from": 1, "to": 2, "relation": "PARENT" }
          ]
        }
  • 22. Predefined Ancestral Milestones. Given where I am on the Earth now, where is the closest sandwich shop "X"? Team identified a basic set of features. Codify patterns to identify important features in an ancestry. Derived at query time.
      • Traverse raw crossing records at query time
      • Derivation at query time allows patterns to more easily adapt to changes in business process
      • Prevents data decay
  • 23. Binary Cross Milestone. GET /materials/5/binary-cross
        {
          "male": { "id": 1 },
          "female": { "id": 2 }
        }
  • 24. Let's ask a more complex question: Do any ancestors of a given plant show a strong resistance to a particular disease? By analogy: Who are the first of my ancestors to immigrate from Germany and Ireland to America?
  • 25. Decorating the Ancestry. Genotype nodes (labeled G in the diagram) act as simple pointers to remote systems.
  • 26. Ask a complex question. /materials/1/parents?until=genotyped-ancestor&props=genotypes
        {
          "nodes": [
            { "id": 1 },
            { "id": 2 },
            { "id": 3, "genotypes": [ { "id": 1234 } ] }
          ],
          "relationships": [
            { "from": 1, "to": 2, "relation": "PARENT" },
            { "from": 2, "to": 3, "relation": "PARENT" }
          ]
        }
  • 27. Architecture. Informing our ancestry backbone of additional data that identify significant events in a line's history allows our APIs to evolve and adapt as our agronomic practices change.
  • 28. Take this with you…
      • Untie yourself from your database indexes
      • Let Neo4j do the heavy lifting
      • Value added even as non system of record
      • Keep the storage model as close to mental model as possible
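
The traversal idea behind slides 12, 21 and 26 can be illustrated with a small Python sketch. It is not the team's implementation, which runs on Neo4j. The `PARENTS` and `GENOTYPES` dictionaries and the `ancestry` function are hypothetical names introduced only for illustration. The dictionary stands in for index-free adjacency: once the starting plant is found, each parent is reached by a direct lookup rather than a scan of a relationship index. The `until_genotyped` flag loosely mimics the `until=genotyped-ancestor` parameter from slide 26, under the assumption that a branch stops at the first ancestor carrying genotype pointers.

```python
from collections import deque

# Hypothetical in-memory stand-in for the graph: each plant ID maps directly
# to its parent IDs, so following a relationship is a dictionary lookup.
PARENTS = {
    1: [2],
    2: [3],
    3: [4, 5],
    4: [],
    5: [],
}

# Hypothetical genotype pointers keyed by plant ID (compare slide 25).
GENOTYPES = {3: [{"id": 1234}]}


def ancestry(start, parents, genotypes=None, until_genotyped=False):
    """Breadth-first walk from `start` toward its ancestors.

    Returns a dict shaped like the slide 21 payload. When `until_genotyped`
    is true, a branch stops expanding at the first ancestor that has
    genotype pointers (an assumed reading of `until=genotyped-ancestor`).
    """
    genotypes = genotypes or {}
    nodes, relationships = [], []
    seen = {start}
    queue = deque([start])
    while queue:
        plant = queue.popleft()
        node = {"id": plant}
        if until_genotyped and plant in genotypes:
            node["genotypes"] = genotypes[plant]
        nodes.append(node)
        # Prune: do not walk past a genotyped ancestor.
        if until_genotyped and plant != start and plant in genotypes:
            continue
        for parent in parents.get(plant, []):
            relationships.append(
                {"from": plant, "to": parent, "relation": "PARENT"}
            )
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return {"nodes": nodes, "relationships": relationships}


if __name__ == "__main__":
    # Raw tree, then the same tree pruned at the first genotyped ancestor.
    print(ancestry(1, PARENTS))
    print(ancestry(1, PARENTS, GENOTYPES, until_genotyped=True))
```

The work done per plant depends only on how many parents it has, not on the total number of relationships stored, which is the property the talk attributes to index-free adjacency.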