Graph Databases on
the Edge
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group
An Inc. 5000 Company in 2018 and 2017
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
AnzoGraph DB
• Triplestore with labeled properties
• Built for diverse data harmonization and analytics at scale (trillions of triples and more)
• Graph analytics like PageRank and shortest path
• BI-style analytics like graph views, named queries, aggregates, built-in data science functions
• Native support for inferencing and ontologies
[Figure: key-value store contrasted with a graph model. Vertex labels: Order, Product, Location. Edge labels: Used_with, Receive_payment, Sets_Up. Annotation: Involved in Prior Fraud Cases.]
Customers Achieve Sustainable Competitive Advantage By Adopting Graph Databases
New Products & Services Leveraging Data Relationships
• First to market, up and running in days, not weeks or months
• Reduced churn, increasing engagement and uncovering fraud
• Achieved new company vision centered around Business Graph
• Leapfrogged the competition with a 360 degree view of the customer
Reimagine Existing Applications, and Innovate with Data Relationships
• Kept the business running when data growth threatened to stop it
• Drastically reduced project complexity and risk
• Increased revenue and delighted customers by improving user experience
• Brought new offering to market to compete with Amazon Prime & Fresh, and Google Express
Graph Growth Ahead
“The application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science.”
“Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries.”
Source: February 2019 press release by Gartner, https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo
What Can Be Vertices?
• Things
– Bank accounts
– Customer accounts
    • Mobile phones
– Products
– Trading networks, auctions
– Water, power, gas grids
– Disease, drugs, molecules
    • Interactions, transmission
– Insurance policies
– Machines, servers, URLs
– Sensor networks
• People
– Customers, families
– Employees
– Affinity groups, clubs
    • Politics, causes, doctors
    • Professionals (LinkedIn)
– Companies, institutions
• Places
– Map locations
    • Cities, landmarks
– Retail stores
– Houses or buildings
– Communication networks
– Transportation hubs
    • Airports, shipping lanes, etc.
What Can Be Edges?
• People
– Relationships
– Ideas, preferences
– Email, phone calls, SMS, IM
– Collaborations
• Places
– Roads, routes, railways
– Water, power, gas, pipelines, telephone lines
– Anything with GPS coordinates
• Things
– Events
– Money, transactions
– Purchases
– Pressure, temperature
– Diseases
– Contraband
– URLs
– Phone calls
– Citations
– Weights, scores
– Timestamps
Social Network “path exists” Performance
• Experiment:
– 1000 persons
– Average 50 friends per person
– pathExists(a,b) limited to depth 4

Database               # persons    Query time
Relational database    1000         2000 ms
Graph db               1000         2 ms
Graph db               1000000      2 ms
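The pathExists(a,b) check in the experiment can be pictured as a depth-limited breadth-first search. The sketch below is only an illustration of that idea in Python, not the code used in the benchmark; the function name `path_exists`, the adjacency-dictionary layout, and the toy network are all assumptions.

```python
from collections import deque

def path_exists(friends, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops."""
    if a == b:
        return True
    seen = {a}
    frontier = deque([(a, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False

# Hypothetical toy network: a chain 1-2-3-4-5-6 plus an isolated person 7.
friends = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5], 7: []}

print(path_exists(friends, 1, 5))  # 5 is four hops from 1
print(path_exists(friends, 1, 6))  # 6 is five hops from 1, past the limit
print(path_exists(friends, 1, 7))  # 7 is not connected
```

The search only touches the neighborhood of the start vertex, which is one way to read the flat graph db timings in the table: the work depends on the friends reachable within four hops, not on the total number of persons stored.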
Healthcare Fraud
• Monitor drugs and treatments
– Excessive prescribers
– Excessive consumers
• Patients connected to
– Doctors, pharmacies
• Use Graph Access
– Find outliers and investigate
– Find X actual frauds
[Figure label: Excessive relationships]
Relational DBs Can’t Handle Data Relationships Well
• Cannot model or store data and relationships without complexity
• Performance degrades with number and levels of relationships, and database size
• Query complexity grows with need for JOINs
• Adding new types of data and relationships requires schema redesign, increasing time to market
• Slow development
• Poor performance
• Low scalability
• Hard to maintain
… making traditional databases inappropriate when data relationships are valuable in real time
Use the Right Database for the Right Job
Graph databases are designed for data relationships
[Figure: spectrum from Other NoSQL through Relational DBMS to Graph DB, running from discrete, minimally connected data to connected data focused on data relationships]
• Development benefits
– Model maintenance
• Deployment benefits
– Performance
– Minimal resource usage
Graph Visualization
Graph Algorithms
PageRank
[Figure: pages A, B, C and D, each starting at 1.0. Links A to B and A to C carry 1*0.85/2 each; links B to C, C to A and D to C carry 1*0.85 each.]
New value of a page = sum of inputs + 0.15
Example, page C:
  +0.150
  page D  +0.850
  page B  +0.850
  page A  +0.425
  C total  2.275
Sources:
http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm
PageRank: Results After the 1st Iteration
[Figure: Page A 1.0, Page B 0.575, Page C 2.275, Page D 0.15; the links carry the same inputs as before (1*0.85/2 or 1*0.85)]
A: +0.150, page C +0.850, A total 1.000
B: +0.150, page A +0.425, B total 0.575
D: +0.150, D total 0.150
Source: http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
PageRank Iterations

End of iteration    A result    B result    C result    D result
 1                  1.000       0.575       2.275       0.150
 2                  2.084       0.575       1.191       0.150
 3                  1.163       1.036       1.652       0.150
 4                  1.554       0.644       1.652       0.150
 5                  1.554       0.810       1.485       0.150
 6                  1.413       0.810       1.627       0.150
 7                  1.533       0.750       1.567       0.150
 8                  1.482       0.801       1.567       0.150
 9                  1.482       0.780       1.588       0.150
10                  1.500       0.780       1.570       0.150
11                  1.485       0.788       1.578       0.150
12                  1.491       0.781       1.578       0.150
13                  1.491       0.784       1.575       0.150
14                  1.489       0.784       1.577       0.150
15                  1.491       0.783       1.576       0.150
16                  1.490       0.784       1.576       0.150
17                  1.490       0.783       1.577       0.150
18                  1.490       0.783       1.576       0.150
19                  1.490       0.783       1.577       0.150
20                  1.490       0.783       1.577       0.150
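The iteration table can be reproduced with a few lines of Python. This is a minimal sketch under two assumptions: the link structure (A to B and C, B to C, C to A, D to C) is inferred from the per-page sums on the slides, and every page is updated from the previous iteration's values. The names `links`, `pagerank_step` and `pagerank` are illustrative.

```python
DAMPING = 0.85

# Assumed link structure, inferred from the slide arithmetic:
# A -> B, A -> C, B -> C, C -> A, D -> C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

def pagerank_step(ranks):
    """One iteration: new rank = sum of inputs + 0.15."""
    new_ranks = {page: 1 - DAMPING for page in ranks}
    for page, targets in links.items():
        # A page passes 0.85 of its rank, split evenly over its outbound links.
        share = DAMPING * ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks

def pagerank(iterations=20):
    """Start every page at 1.0 and return the ranks after each iteration."""
    ranks = {page: 1.0 for page in links}
    history = []
    for _ in range(iterations):
        ranks = pagerank_step(ranks)
        history.append(ranks)
    return history

history = pagerank()
for i, ranks in enumerate(history, start=1):
    print(i, " ".join(f"{page}={ranks[page]:.3f}" for page in "ABCD"))
```

Page D never receives a link, so it stays at the 0.15 floor throughout, which matches the last column of the table.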
PageRank: 20 Iterations Until Convergence
[Figure: Page A 1.49, Page B 0.78, Page C 1.58, Page D 0.15]
• Page C is the most important web page
• Page C increases page A’s importance
Betweenness
• Find bridges across different communities
• High score = edge links different communities
[Figure: graph with two vertices labeled “Bridge vertex”]
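To make the bridge idea concrete, here is a small Python sketch of vertex betweenness for an unweighted, undirected graph, using Brandes' accumulation scheme. It is an illustration under assumptions, not the implementation of any particular graph database; the six-vertex graph (two triangles joined by one edge) is hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Vertex betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    scores = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating path dependencies.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in scores.items()}

# Hypothetical graph: triangles a-b-c and d-e-f joined by the edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = betweenness(graph)
print(sorted(scores.items(), key=lambda item: -item[1]))
```

In this toy graph the two endpoints of the connecting edge, c and d, lie on every shortest path between the triangles, so they are the bridge vertices with the highest scores.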
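Closeness
• Scores a vertex by the lengths of the shortest paths between it and every other vertex; a high score means the vertex is close to the rest of the graph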
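One common way to turn shortest paths into a closeness score is (number of reachable vertices) / (sum of shortest-path distances to them). The Python sketch below assumes that definition and an unweighted graph; the graph itself is a hypothetical example.

```python
from collections import deque

def closeness(graph, source):
    """(reachable vertices) / (sum of shortest-path distances from source)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Hypothetical graph: triangles a-b-c and d-e-f joined by the edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
for vertex in graph:
    print(vertex, round(closeness(graph, vertex), 3))
```

Vertices c and d sit in the middle of the graph, so their distances to everyone else are shorter and their scores higher than those of the outer vertices.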
Eigenvector Centrality
• Measures the importance of a vertex by the importance of its neighbors
[Figure: a vertex whose neighbors are all labeled “important” is labeled “must be important”]
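"Important because its neighbors are important" is a circular definition, and it is usually resolved by iterating until the scores stop changing. The Python sketch below uses a simple power iteration in which each vertex keeps its own score and adds its neighbors' scores, then rescales. The update rule, the iteration count, and the example graph are assumptions chosen for illustration.

```python
def eigenvector_centrality(graph, iterations=100):
    """Power iteration: score = own score + sum of neighbor scores, rescaled."""
    scores = {v: 1.0 for v in graph}
    for _ in range(iterations):
        # Keeping the vertex's own score avoids oscillation on bipartite graphs.
        updated = {v: scores[v] + sum(scores[w] for w in graph[v]) for v in graph}
        largest = max(updated.values())
        scores = {v: s / largest for v, s in updated.items()}
    return scores

# Hypothetical graph: triangles a-b-c and d-e-f joined by the edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = eigenvector_centrality(graph)
for vertex in sorted(scores, key=scores.get, reverse=True):
    print(vertex, round(scores[vertex], 3))
```

Vertices c and d end up with the top score because each is linked to the other high-scoring vertex as well as to its own triangle.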
Clustering Coefficient: Cascading Churn
If two people churn, what is the likelihood others will?
The two churners affect the central influencer
Finally: All contacts churn.
Individual-focused model underestimates churn by 6X.
SELECT *
FROM LocalClusteringCoefficient(
ON Calls as edges
PARTITION BY caller_from
ON caller_from as vertices
PARTITION BY caller_id
targetKey('caller_to')
directed('f')
degreeRange('[3:]')
accumulate('personId')
);
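For readers without access to the SQL function above, the quantity it is named after can be sketched in plain Python. This assumes the standard definition for an undirected graph: the fraction of a vertex's neighbor pairs that are themselves connected. The call graph and names below are hypothetical, and the sketch does not reproduce the function's other arguments.

```python
from itertools import combinations

def local_clustering(graph, vertex):
    """Fraction of the vertex's neighbor pairs that are directly connected."""
    neighbors = graph[vertex]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    linked = sum(1 for u, w in combinations(neighbors, 2) if w in graph[u])
    return 2 * linked / (k * (k - 1))

# Hypothetical call graph: ann talks to bob, cat and dan; bob and cat also talk.
calls = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}
for person in calls:
    print(person, round(local_clustering(calls, person), 3))
```

A high coefficient means a person's contacts also talk to each other, which is the kind of tight group where one churner is likely to pull others along.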
Loopy Belief Propagation
• Loopy belief propagation works by peer pressure
– Node X gets a final belief value by listening to its neighbors
– Values from nodes with known beliefs propagate through the graph
• Adjacent nodes send messages saying “update your beliefs”
– Based on priors, conditional probabilities, and evidence
• Keep passing messages until a stable belief state is reached
See https://www.ics.uci.edu/~welling/teaching/ICS279/GBP-vision.pdf
Great Questions for Graph Databases
• In what order did a specific set of related events happen?
• Are there patterns of events in our data that seem to be related by time?
• How far apart in a (social or physical) network are two “actors” and how strong is their relationship?
• What are the identifiable social groups and what are the general patterns of such groups?
• How important is any given “actor” in any given network and event?
• What type of messages emanate from a specific area?
How to Identify a Graph Workload
• The workload is described with words like “network, hierarchy, tree, ancestry, structure”
• You are planning to use relational performance tricks
• Your queries will be about pathing
• You are limiting queries by their complexity
• You are looking for “non-obvious” patterns in the data
Graph Modeling
The Domain Model
Actions
Model actions as vertices or as edges, depending on what you want as vertices
(Bill)-[:SENT]->(email)-[:TO]->(Jim)
OR
(Bill)-[:EMAILED]->(Jim)
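The trade-off between the two patterns shows up as soon as you query them. The Python sketch below stores each model as a list of (source, label, target) edges and asks the same question of both: who did Bill email? The second recipient "Ann", the identifier "email_1", and the helper names are hypothetical additions for illustration.

```python
# Model 1: the email is its own vertex.
# (Bill)-[:SENT]->(email)-[:TO]->(Jim)
email_as_vertex = [
    ("Bill", "SENT", "email_1"),
    ("email_1", "TO", "Jim"),
    ("email_1", "TO", "Ann"),  # hypothetical second recipient: one more edge
]

# Model 2: the action is a single edge.
# (Bill)-[:EMAILED]->(Jim)
email_as_edge = [
    ("Bill", "EMAILED", "Jim"),
    ("Bill", "EMAILED", "Ann"),
]

def targets(edges, source, label):
    """Vertices reached from `source` over edges with the given label."""
    return {dst for src, lbl, dst in edges if src == source and lbl == label}

def recipients_via_vertex(sender):
    # Two hops: sender -> email -> recipient.
    return {
        person
        for email in targets(email_as_vertex, sender, "SENT")
        for person in targets(email_as_vertex, email, "TO")
    }

def recipients_via_edge(sender):
    # One hop: sender -> recipient.
    return targets(email_as_edge, sender, "EMAILED")

print(recipients_via_vertex("Bill") == recipients_via_edge("Bill"))
```

Both models answer the question, but only the first keeps a vertex for the email itself, so it can record that Jim and Ann received the same message; the second is shorter to traverse when only the sender-recipient link matters.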
Semantic Graphs
• Subject: John R Peterson; Predicate: Knows; Object: Frank T Smith
• Subject: Triple #1; Predicate: Confidence Percent; Object: 70
• Subject: Triple #1; Predicate: Provenance; Object: Mary L Jones
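The second and third statements use a whole triple as their subject. A minimal Python sketch of that idea follows, assuming triples are plain tuples so that one triple can appear inside another; the `Triple` class and the `about` helper are illustrative names, not part of any triplestore API.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: object
    predicate: str
    obj: object

# Triple #1 from the slide.
t1 = Triple("John R Peterson", "Knows", "Frank T Smith")

# Statements about Triple #1 use the triple itself as their subject.
statements = [
    t1,
    Triple(t1, "Confidence Percent", 70),
    Triple(t1, "Provenance", "Mary L Jones"),
]

def about(target):
    """Collect predicate/object pairs for every statement whose subject is `target`."""
    return {t.predicate: t.obj for t in statements if t.subject == target}

print(about(t1))
```

This keeps the original fact unchanged while attaching confidence and provenance to it, which is the pattern the three bullets describe.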
Triple Store
• A triple is a data entity composed of subject-predicate-object
– “Bob is 35”
– “Bob knows Fred”
– “William likes running”
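As a rough picture of how such triples are queried, the Python sketch below keeps them in a list and matches on any combination of subject, predicate and object, with None acting as a wildcard. It is a toy under stated assumptions: the `match` function is hypothetical and stands in for what a real triplestore would do through a query language.

```python
# The three example triples from the slide, stored as (subject, predicate, object).
triples = [
    ("Bob", "is", 35),
    ("Bob", "knows", "Fred"),
    ("William", "likes", "running"),
]

def match(subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        (s, p, o)
        for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

print(match(subject="Bob"))        # everything known about Bob
print(match(predicate="knows"))    # every "knows" relationship
print(match(obj="running"))        # who or what is related to running
```

Because every fact has the same three-part shape, new kinds of facts can be added without changing any schema.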
Conclusion
• Graph is a fast-growing data category
• It’s all about the use case; good for graph:
– Real-time recommendations
– Fraud detection
– Network and IT operations
– Identity and access management
– Graph-based search
– Identifying relative importance
• Reimagine your data as a graph
– The whiteboard model is the physical model
• Remember PageRank
Graph Databases on
the Edge
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” OnAlytica
President, McKnight Consulting Group
An Inc. 5000 Company in 2018 and 2017
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET

More Related Content

What's hot

Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Big Data Spain
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Gregg Barrett
 
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Bernhard Rieder
 
An Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceAn Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceWesley Eldridge
 
Data ethics and machine learning: discrimination, algorithmic bias, and how t...
Data ethics and machine learning: discrimination, algorithmic bias, and how t...Data ethics and machine learning: discrimination, algorithmic bias, and how t...
Data ethics and machine learning: discrimination, algorithmic bias, and how t...Data Driven Innovation
 
Pie chart or pizza: identifying chart types and their virality on Twitter
Pie chart or pizza: identifying chart types and their virality on TwitterPie chart or pizza: identifying chart types and their virality on Twitter
Pie chart or pizza: identifying chart types and their virality on TwitterElena Simperl
 
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddj
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddjData-driven journalism: What is there to learn? (Stanford, June 2010) #ddj
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddjMirko Lorenz
 
Figures of the Many - Quantitative Concepts for Qualitative Thinking
Figures of the Many - Quantitative Concepts for Qualitative ThinkingFigures of the Many - Quantitative Concepts for Qualitative Thinking
Figures of the Many - Quantitative Concepts for Qualitative ThinkingBernhard Rieder
 

What's hot (11)

The data we want
The data we wantThe data we want
The data we want
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?
 
Data stories
Data storiesData stories
Data stories
 
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
 
An Obligatory Introduction to Data Science
An Obligatory Introduction to Data ScienceAn Obligatory Introduction to Data Science
An Obligatory Introduction to Data Science
 
Data ethics and machine learning: discrimination, algorithmic bias, and how t...
Data ethics and machine learning: discrimination, algorithmic bias, and how t...Data ethics and machine learning: discrimination, algorithmic bias, and how t...
Data ethics and machine learning: discrimination, algorithmic bias, and how t...
 
Pie chart or pizza: identifying chart types and their virality on Twitter
Pie chart or pizza: identifying chart types and their virality on TwitterPie chart or pizza: identifying chart types and their virality on Twitter
Pie chart or pizza: identifying chart types and their virality on Twitter
 
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddj
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddjData-driven journalism: What is there to learn? (Stanford, June 2010) #ddj
Data-driven journalism: What is there to learn? (Stanford, June 2010) #ddj
 
Broad Data
Broad DataBroad Data
Broad Data
 
Figures of the Many - Quantitative Concepts for Qualitative Thinking
Figures of the Many - Quantitative Concepts for Qualitative ThinkingFigures of the Many - Quantitative Concepts for Qualitative Thinking
Figures of the Many - Quantitative Concepts for Qualitative Thinking
 

Similar to ADV Slides: Graph Databases on the Edge

Advanced Analytics: Graph Database Use Cases
Advanced Analytics: Graph Database Use CasesAdvanced Analytics: Graph Database Use Cases
Advanced Analytics: Graph Database Use CasesDATAVERSITY
 
chương 1 - Tổng quan về khai phá dữ liệu.pdf
chương 1 - Tổng quan về khai phá dữ liệu.pdfchương 1 - Tổng quan về khai phá dữ liệu.pdf
chương 1 - Tổng quan về khai phá dữ liệu.pdfphongnguyen312110237
 
Meet 1 - Introduction Data Mining - Dedi Darwis.pdf
Meet 1 - Introduction Data Mining - Dedi Darwis.pdfMeet 1 - Introduction Data Mining - Dedi Darwis.pdf
Meet 1 - Introduction Data Mining - Dedi Darwis.pdf09372002dedi
 
Introduction: Relational to Graphs
Introduction: Relational to GraphsIntroduction: Relational to Graphs
Introduction: Relational to GraphsNeo4j
 
Knowledge Graphs Webinar- 11/7/2017
Knowledge Graphs Webinar- 11/7/2017Knowledge Graphs Webinar- 11/7/2017
Knowledge Graphs Webinar- 11/7/2017Neo4j
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...teodroscampaus
 
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4j
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4jNeo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4j
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4jNeo4j
 
BigData and Beyond
BigData and BeyondBigData and Beyond
BigData and BeyondJohn Avery
 
Thwart Fraud Using Graph-Enhanced Machine Learning and AI
Thwart Fraud Using Graph-Enhanced Machine Learning and AIThwart Fraud Using Graph-Enhanced Machine Learning and AI
Thwart Fraud Using Graph-Enhanced Machine Learning and AINeo4j
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Caserta
 
Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven RamageGeospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven RamageSteven Ramage
 
Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015StampedeCon
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Stefan Urbanek
 
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Caserta
 
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DATAVERSITY
 
Data-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture RequirementsData-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture RequirementsDATAVERSITY
 
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016Caserta
 

Similar to ADV Slides: Graph Databases on the Edge (20)

Advanced Analytics: Graph Database Use Cases
Advanced Analytics: Graph Database Use CasesAdvanced Analytics: Graph Database Use Cases
Advanced Analytics: Graph Database Use Cases
 
chương 1 - Tổng quan về khai phá dữ liệu.pdf
chương 1 - Tổng quan về khai phá dữ liệu.pdfchương 1 - Tổng quan về khai phá dữ liệu.pdf
chương 1 - Tổng quan về khai phá dữ liệu.pdf
 
datamining-lect1.pptx
datamining-lect1.pptxdatamining-lect1.pptx
datamining-lect1.pptx
 
Meet 1 - Introduction Data Mining - Dedi Darwis.pdf
Meet 1 - Introduction Data Mining - Dedi Darwis.pdfMeet 1 - Introduction Data Mining - Dedi Darwis.pdf
Meet 1 - Introduction Data Mining - Dedi Darwis.pdf
 
Introduction: Relational to Graphs
Introduction: Relational to GraphsIntroduction: Relational to Graphs
Introduction: Relational to Graphs
 
Data Mining Lecture_1.pptx
Data Mining Lecture_1.pptxData Mining Lecture_1.pptx
Data Mining Lecture_1.pptx
 
Knowledge Graphs Webinar- 11/7/2017
Knowledge Graphs Webinar- 11/7/2017Knowledge Graphs Webinar- 11/7/2017
Knowledge Graphs Webinar- 11/7/2017
 
01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...01-introduction.ppt the paper that you can unless you want to join me because...
01-introduction.ppt the paper that you can unless you want to join me because...
 
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4j
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4jNeo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4j
Neo4j GraphTour New York_ State of the State_Amit Chaudhry Neo4j
 
BigData and Beyond
BigData and BeyondBigData and Beyond
BigData and Beyond
 
Thwart Fraud Using Graph-Enhanced Machine Learning and AI
Thwart Fraud Using Graph-Enhanced Machine Learning and AIThwart Fraud Using Graph-Enhanced Machine Learning and AI
Thwart Fraud Using Graph-Enhanced Machine Learning and AI
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics
 
Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven RamageGeospatial Intelligence Middle East 2013_Big Data_Steven Ramage
Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage
 
Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015Graph Database Use Cases - StampedeCon 2015
Graph Database Use Cases - StampedeCon 2015
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
Forces and Threats in a Data Warehouse (and why metadata and architecture is ...
 
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
Data Intelligence: How the Amalgamation of Data, Science, and Technology is C...
 
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
DataEd Slides: Unlock Business Value Using Reference and Master Data Manageme...
 
Data-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture RequirementsData-Ed Online Webinar: Data Architecture Requirements
Data-Ed Online Webinar: Data Architecture Requirements
 
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

ADV Slides: Graph Databases on the Edge

  • 1. Graph Databases on the Edge
      Presented by: William McKnight
      “#1 Global Influencer in Data Warehousing” Onalytica
      President, McKnight Consulting Group
      An Inc. 5000 Company in 2018 and 2017
      @williammcknight
      www.mcknightcg.com
      (214) 514-1444
      Second Thursday of Every Month, at 2:00 ET
  • 8. AnzoGraph DB
      • Triplestore with labeled properties
      • Built for diverse data harmonization and analytics at scale (trillions of triples and more)
      • Graph analytics like PageRank and shortest path
      • BI-style analytics like graph views, named queries, aggregates, and built-in data science functions
      • Native support for inferencing and ontologies
  • 16. Customers Achieve Sustainable Competitive Advantage By Adopting Graph Databases
      New Products & Services Leveraging Data Relationships
      • First to market, up and running in days, not weeks or months
      • Reduced churn, increasing engagement and uncovering fraud
      • Achieved new company vision centered around Business Graph
      • Leapfrogged the competition with a 360-degree view of the customer
      Reimagine Existing Applications, and Innovate with Data Relationships
      • Kept the business running when data growth threatened to stop it
      • Drastically reduced project complexity and risk
      • Increased revenue and delighted customers by improving user experience
      • Brought new offering to market to compete with Amazon Prime & Fresh, and Google Express
  • 17. Graph Growth Ahead
      “The application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science.”
      “Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries.”
      Source: February 2019 press release by Gartner, https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo
  • 18. What Can Be Vertices?
      • Things
        – Bank accounts
        – Customer accounts
          • Mobile phones
        – Products
        – Trading networks, auctions
        – Water, power, gas grids
        – Disease, drugs, molecules
          • Interactions, transmission
        – Insurance policies
        – Machines, servers, URLs
        – Sensor networks
      • People
        – Customers, families
        – Employees
        – Affinity groups, clubs
          • Politics, causes, doctors
          • Professionals (LinkedIn)
        – Companies, institutions
      • Places
        – Map locations
          • Cities, landmarks
        – Retail stores
        – Houses or buildings
        – Communication networks
        – Transportation hubs
          • Airports, shipping lanes, etc.
  • 19. What Can Be Edges?
      • People
        – Relationships
        – Ideas, preferences
        – Email, phone calls, SMS, IM
        – Collaborations
      • Places
        – Roads, routes, railways
        – Water, power, gas, pipelines, telephone lines
        – Anything with GPS coordinates
      • Things
        – Events
        – Money, transactions
        – Purchases
        – Pressure, temperature
        – Diseases
        – Contraband
        – URLs
        – Phone calls
        – Citations
        – Weights, scores
        – Timestamps
  • 20. Social Network “path exists” Performance
      • Experiment:
        • 1000 persons
        • Average 50 friends per person
        • pathExists(a,b) limited to depth 4

      Database              # persons   Query time
      Relational database   1000        2000 ms
      Graph db              1000        2 ms
      Graph db              1000000     2 ms
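The depth-limited pathExists(a,b) check from the experiment can be sketched as a breadth-first search. This is only an illustration, not the benchmark's code: the `path_exists` function, the adjacency-dict representation, and the six-person chain are all assumptions made for the example.

```python
from collections import deque


def path_exists(friends, a, b, max_depth=4):
    """Return True if b is reachable from a in at most max_depth hops."""
    if a == b:
        return True
    seen = {a}
    frontier = deque([(a, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == b:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Hypothetical friendship chain: 0 - 1 - 2 - 3 - 4 - 5
friends = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}

print(path_exists(friends, 0, 4))  # 4 hops away, within the limit
print(path_exists(friends, 0, 5))  # 5 hops away, beyond the limit
```

The search only visits vertices near the starting person, which is one plausible reason a traversal of this kind need not slow down as the total number of persons grows.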
  • 21. Healthcare Fraud: Excessive Relationships
      • Monitor drugs and treatments
        – Excessive prescribers
        – Excessive consumers
      • Patients connected to
        – Doctors, pharmacies
      • Use Graph Access
        – Find outliers and investigate
        – Find X actual frauds
  • 22. Relational DBs Can’t Handle Data Relationships Well
      • Cannot model or store data and relationships without complexity
      • Performance degrades with number and levels of relationships, and database size
      • Query complexity grows with need for JOINs
      • Adding new types of data and relationships requires schema redesign, increasing time to market
      Slow development, poor performance, low scalability, hard to maintain … making traditional databases inappropriate when data relationships are valuable in real time
  • 23. Use the Right Database for the Right Job
      Spectrum from Discrete Data (minimally connected data) to Connected Data (focused on data relationships): Other NoSQL, Relational DBMS, Graph DB
      Graph Databases are designed for data relationships
      • Development Benefits: model maintenance
      • Deployment Benefits: performance, minimal resource usage
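  • 26. PageRank
      Initial ranks: Page A 1.0, Page B 1.0, Page C 1.0, Page D 1.0
      Each link passes on 0.85 times the source page's rank, split evenly across that page's outgoing links: 1*0.85/2 on each of the two links that share a source page, and 1*0.85 on each of the other three links
      New rank = sum of inputs + 0.15
      Sources: http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet); http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm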
  • 27. PageRank: Results After the 1st Iteration
      Link inputs: 1*0.85/2, 1*0.85/2, 1*0.85, 1*0.85, 1*0.85
      Page A: 0.15 + 0.85 (page C) = Total 1.00
      Page B: 0.150 + 0.425 (page A) = Total 0.575
      Page C: 0.150 + 0.850 (page D) + 0.850 (page B) + 0.425 (page A) = Total 2.275
      Page D: 0.150 = Total 0.150
      Source: http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
  • 28. PageRank Iterations

      End of iteration   A result   B result   C result   D result
      1                  1.000      0.575      2.275      0.150
      2                  2.084      0.575      1.191      0.150
      3                  1.163      1.036      1.652      0.150
      4                  1.554      0.644      1.652      0.150
      5                  1.554      0.810      1.485      0.150
      6                  1.413      0.810      1.627      0.150
      7                  1.533      0.750      1.567      0.150
      8                  1.482      0.801      1.567      0.150
      9                  1.482      0.780      1.588      0.150
      10                 1.500      0.780      1.570      0.150
      11                 1.485      0.788      1.578      0.150
      12                 1.491      0.781      1.578      0.150
      13                 1.491      0.784      1.575      0.150
      14                 1.489      0.784      1.577      0.150
      15                 1.491      0.783      1.576      0.150
      16                 1.490      0.784      1.576      0.150
      17                 1.490      0.783      1.577      0.150
      18                 1.490      0.783      1.576      0.150
      19                 1.490      0.783      1.577      0.150
      20                 1.490      0.783      1.577      0.150
  • 29. PageRank: 20 Iterations Until Convergence
      Page A 1.49, Page B 0.78, Page C 1.58, Page D 0.15
      Page C is the most important web page
      Page C increases Page A's importance
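The PageRank slides can be reproduced with a few lines of Python. This is a minimal sketch, not the spreadsheet the slides refer to. The link structure (A links to B and C; B, C, and D each have one outgoing link) is not stated on the slides; it is inferred here from the per-page contributions shown after the first iteration, and the function names are made up for the example.

```python
DAMPING = 0.85

# Assumed link structure, inferred from the slide's first-iteration contributions:
# A -> B, A -> C, B -> C, C -> A, D -> C
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_step(ranks, links, damping=DAMPING):
    """One synchronous update: new rank = (1 - damping) + sum of inputs."""
    new_ranks = {page: 1 - damping for page in ranks}
    for page, targets in links.items():
        share = damping * ranks[page] / len(targets)  # rank split across out-links
        for target in targets:
            new_ranks[target] += share
    return new_ranks


def pagerank(links, iterations):
    """Start every page at 1.0 and apply the update a fixed number of times."""
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links)
    return ranks


first = pagerank(links, 1)
final = pagerank(links, 20)
for page in sorted(final):
    print(page, round(first[page], 3), round(final[page], 3))
```

Every update uses only the previous iteration's values, which is what makes the second row of the iteration table come out as 2.084, 0.575, 1.191, 0.150.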
  • 30. Betweenness
      • Find bridges across different communities
      • High score = edge links different communities
      (Figure: two bridge vertices labeled)
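One way to make the bridge idea concrete is to compute vertex betweenness on a tiny graph. The sketch below uses Brandes' accumulation scheme for unweighted, undirected graphs; the slide does not name an algorithm, so this choice, the `betweenness` function, and the two-triangle example graph are all assumptions for illustration.

```python
from collections import deque


def betweenness(graph):
    """Vertex betweenness for an unweighted, undirected graph (Brandes-style)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                       # vertices in order of discovery
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):        # accumulate dependencies back to source
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: total / 2 for v, total in score.items()}


# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by the edge c - d
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = betweenness(graph)
print(scores)
```

In this example only the two vertices on the connecting edge get a non-zero score, because every shortest path between the two triangles runs through them.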
  • 31. Closeness
      • Based on the shortest paths between a vertex and every other vertex
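A common formulation of closeness, assumed here since the slide gives no formula, is the number of other reachable vertices divided by the sum of shortest-path distances to them. The helper names and the five-vertex path graph are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(graph, v):
    """(reachable vertices - 1) / (sum of shortest-path distances from v)."""
    dist = bfs_distances(graph, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical path graph: a - b - c - d - e
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
print(closeness(graph, "c"), closeness(graph, "a"))
```

The middle vertex scores higher than an end vertex because its distances to everyone else are shorter on average.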
  • 32. Eigen Centrality
      • Measures the importance of a vertex by the importance of its neighbors
      (Figure: a vertex whose three neighbors are each labeled “important” is labeled “must be important”)
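The "important because its neighbors are important" idea can be sketched with power iteration. This is an illustrative implementation under stated assumptions: each vertex keeps its own score and adds its neighbors' scores on every round (a shift that avoids oscillation on bipartite graphs), and the example graph and vertex names are invented.

```python
import math


def eigenvector_centrality(graph, iterations=100):
    """Power iteration: score = own score + neighbors' scores, then normalize."""
    score = {v: 1.0 for v in graph}
    for _ in range(iterations):
        new = {v: score[v] + sum(score[w] for w in graph[v]) for v in graph}
        norm = math.sqrt(sum(x * x for x in new.values()))
        score = {v: x / norm for v, x in new.items()}
    return score


# Hypothetical undirected graph: hub h with four neighbors, plus a tail m - p - leaf
graph = {
    "h": ["x1", "x2", "x3", "m"],
    "x1": ["h"], "x2": ["h"], "x3": ["h"],
    "m": ["h", "p"],
    "p": ["m", "leaf"],
    "leaf": ["p"],
}
scores = eigenvector_centrality(graph)
print(sorted(scores, key=scores.get, reverse=True))
```

Both x2 and leaf have a single neighbor, yet x2 ends up with the higher score because its one neighbor is the hub.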
  • 33. Clustering Coefficient: Cascading Churn
      If two people churn, what is the likelihood others will?
      The two churners affect the central influencer.
      Finally: All contacts churn.
      Individual-focused model underestimates churn by 6X.
      SELECT * FROM LocalClusteringCoefficient(
        ON Calls as edges PARTITION BY caller_from
        ON caller_from as vertices PARTITION BY caller_id
        targetKey('caller_to')
        directed('f')
        degreeRange('[3:]')
        accumulate('personId')
      );
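For readers without access to the SQL function above, the quantity it is named after can be sketched directly. The local clustering coefficient of a vertex is the fraction of pairs of its neighbors that are themselves connected. The code below is a plain undirected version and makes no claim about how LocalClusteringCoefficient is implemented or what its arguments do; the call graph and names are hypothetical.

```python
from itertools import combinations


def local_clustering(graph, v):
    """Fraction of neighbor pairs of v that are directly connected."""
    neighbors = graph[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2 * linked / (k * (k - 1))


# Hypothetical undirected call graph: who has called whom
calls = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}
for person in sorted(calls):
    print(person, round(local_clustering(calls, person), 3))
```

A high coefficient means a person's contacts also talk to each other, which is the tightly knit situation in which churn could plausibly cascade.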
  • 34. Loopy Belief Propagation
      • Loopy belief works by peer pressure
        – Node X gets a final belief value by listening to its neighbors
        – Nodes with known values propagate through the graph
      • Adjacent nodes send message saying “update your beliefs”
        – Based on priors, conditional probabilities, and evidence
      • Keep passing messages until a stable belief state is reached
      See https://www.ics.uci.edu/~welling/teaching/ICS279/GBP-vision.pdf
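The message-passing loop described above can be sketched for two-state nodes. This is a simplified sum-product version under several assumptions not found on the slide: every edge shares one compatibility table, each node has a prior over states 0 and 1, and a fixed number of rounds stands in for a convergence check. The chain example and its probabilities are invented.

```python
import math


def loopy_bp(nodes, edges, prior, compat, iterations=50):
    """Sum-product message passing; returns a belief [p0, p1] for each node."""
    neighbors = {v: [] for v in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # msg[(i, j)] is what node i tells node j about j's two states.
    msg = {(i, j): [0.5, 0.5] for i in nodes for j in neighbors[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in msg:
            out = []
            for sj in (0, 1):
                total = 0.0
                for si in (0, 1):
                    # Combine i's prior with messages from all neighbors except j.
                    incoming = math.prod(
                        msg[(k, i)][si] for k in neighbors[i] if k != j
                    )
                    total += prior[i][si] * compat[si][sj] * incoming
                out.append(total)
            z = out[0] + out[1]
            new[(i, j)] = [out[0] / z, out[1] / z]
        msg = new
    beliefs = {}
    for v in nodes:
        b = [
            prior[v][s] * math.prod(msg[(k, v)][s] for k in neighbors[v])
            for s in (0, 1)
        ]
        z = b[0] + b[1]
        beliefs[v] = [b[0] / z, b[1] / z]
    return beliefs


# Hypothetical chain x - y - z: x is strongly believed to be in state 1,
# y and z start undecided, and neighbors tend to agree (0.8 vs 0.2).
nodes = ["x", "y", "z"]
edges = [("x", "y"), ("y", "z")]
prior = {"x": [0.1, 0.9], "y": [0.5, 0.5], "z": [0.5, 0.5]}
compat = [[0.8, 0.2], [0.2, 0.8]]
beliefs = loopy_bp(nodes, edges, prior, compat)
print({v: round(b[1], 3) for v, b in beliefs.items()})
```

On a graph without cycles, like this chain, the beliefs equal the exact marginals; on graphs with loops the same procedure generally yields an approximation.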
  • 35. Great Questions for Graph Databases
      • In what order did a specific set of related events happen?
      • Are there patterns of events in our data that seem to be related by time?
      • How far apart in a (social or physical) network are two “actors” and how strong is their relationship?
      • What are the identifiable social groups and what are the general patterns of such groups?
      • How important is any given “actor” in any given network and event?
      • What type of messages emanate from a specific area?
  • 36. How to Identify a Graph Workload
      • Workload is identified by “network, hierarchy, tree, ancestry, structure” words
      • You are planning to use relational performance tricks
      • Your queries will be about pathing
      • You are limiting queries by their complexity
      • You are looking for “non-obvious” patterns in the data
  • 39. Actions
      Model actions depending on what you want as vertices:
      (Bill)-[:SENT]->(email)-[:TO]->(Jim)
      OR
      (Bill)-[:EMAILED]->(Jim)
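The two modeling options can be compared with plain edge lists. This is only a sketch of the trade-off, not a graph database API: the tuple layout, the helper functions, and the `email-1` identifier are assumptions for the example.

```python
# Option 1: the email is its own vertex, so it can carry further relationships.
edges_with_email_vertex = [
    ("Bill", "SENT", "email-1"),   # "email-1" is a hypothetical vertex id
    ("email-1", "TO", "Jim"),
]

# Option 2: the action is collapsed into a single edge.
edges_direct = [("Bill", "EMAILED", "Jim")]


def targets(edges, source, label):
    """Vertices reached from source over edges with the given label."""
    return [dst for src, lbl, dst in edges if src == source and lbl == label]


def recipients_via_email_vertex(edges, sender):
    """Two hops: sender -SENT-> email -TO-> recipient."""
    return [
        recipient
        for email in targets(edges, sender, "SENT")
        for recipient in targets(edges, email, "TO")
    ]


def recipients_direct(edges, sender):
    """One hop: sender -EMAILED-> recipient."""
    return targets(edges, sender, "EMAILED")


print(recipients_via_email_vertex(edges_with_email_vertex, "Bill"))
print(recipients_direct(edges_direct, "Bill"))
```

Both models answer "whom did Bill email?", but the first needs two hops and in exchange leaves room to attach more edges to the email itself.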
  • 40. Semantic Graphs
      • Subject: John R Peterson; Predicate: Knows; Object: Frank T Smith
      • Subject: Triple #1; Predicate: Confidence Percent; Object: 70
      • Subject: Triple #1; Predicate: Provenance; Object: Mary L Jones
  • 41. Triple Store
      • A triple is a data entity composed of subject-predicate-object
        – “Bob is 35”
        – “Bob knows Fred”
        – “William likes running”
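The triples on these two slides can be held in a toy in-memory store to show how subject-predicate-object matching works, including statements made about another triple. This is a sketch only: a real triplestore would typically use IRIs and a query language, and the `match` helper and the `triple-1` identifier are assumptions for the example.

```python
# Triples from the Triple Store slide, stored as (subject, predicate, object).
store = [
    ("Bob", "is", 35),
    ("Bob", "knows", "Fred"),
    ("William", "likes", "running"),
]

# Semantic Graphs slide: triple #1 gets an id so other triples can describe it.
statements = {"triple-1": ("John R Peterson", "Knows", "Frank T Smith")}
annotations = [
    ("triple-1", "Confidence Percent", 70),
    ("triple-1", "Provenance", "Mary L Jones"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


print(match(store, subject="Bob"))
print(match(store, predicate="likes"))
print(statements["triple-1"], match(annotations, subject="triple-1"))
```

Treating a triple as the subject of further triples is how the confidence and provenance facts attach to "John R Peterson Knows Frank T Smith" without changing that statement.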
  • 42. Conclusion
      • Graph is a fast-growing data category
      • It’s all about the use case; good for graph:
        – Real-time recommendations
        – Fraud detection
        – Network and IT operations
        – Identity and access management
        – Graph-based search
        – Identifying relative importance
      • Reimagine your data as a graph
        – The whiteboard model is the physical model
      • Remember PageRank