Graph Database Use Cases
Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
An Inc. 5000 Company in 2018 and 2017
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
2023 Advanced Analytics Topics
1. 2023 Trends in Enterprise Analytics
2. Showing ROI for your Analytic Project
3. Architecture, Products and Total Cost of Ownership of the Leading
Machine Learning Stacks
4. Competitive Analytic Architectures: Comparing the Data Mesh, Data
Fabric, Data Lakehouse and Data Cloud
5. Why Analytics Leaders deploy Master Data Management
6. What Does Information Management Maturity Look Like in 2023
7. Understanding the Modern Applications of Graph Databases
8. Common Misconceptions About Master Data Management
9. Organizational Change Management: Will it Hold Back Artificial
Intelligence Deployments?
10. Open-Source vs Commercial Vendor Software in the Enterprise
11. Data Quality: The ROI of Adding Intelligence to Data
12. Strategies for Machine Learning Success
Relational DBs Can’t Handle Data
Relationships Well
• Cannot model or store data and
relationships without complexity
• Performance degrades with number
and levels of relationships, and
database size
• Query complexity grows with need for
JOINs
• Adding new types of data and
relationships requires schema redesign,
increasing time to market
Slow development
Poor performance
Low scalability
Hard to maintain
… making traditional databases inappropriate
when data relationships are valuable in real-time
Use the Right Database for the Right Job
[Figure: a spectrum running from discrete, minimally connected data to connected data focused on data relationships, with Other NoSQL, Relational DBMS and Graph DB positioned along it]
Graph Databases are designed for data relationships
• Development Benefits
– Model maintenance
• Deployment Benefits
– Performance
– Minimal resource usage
What Can Be Vertices?
• Things
– Bank accounts
– Customer accounts
  • Mobile phones
– Products
– Trading networks, auctions
– Water, power, gas grids
– Disease, drugs, molecules
  • Interactions, transmission
– Insurance policies
– Machines, servers, URLs
– Sensor networks
• People
– Customers, families
– Employees
– Affinity groups, clubs
  • Politics, causes, doctors
  • Professionals (LinkedIn)
– Companies, institutions
• Places
– Map locations
  • Cities, landmarks
– Retail stores
– Houses or buildings
– Communication networks
– Transportation hubs
  • Airports, shipping lanes, etc.
What Can Be Edges?
• People
– Relationships
– Ideas, preferences
– Email, phone calls, SMS, IM
– Collaborations
• Places
– Roads, routes, railways
– Water, power, gas, pipelines, telephone lines
– Anything with GPS coordinates
• Things
– Events
– Money Transactions
– Purchases
– Pressure
– Diseases
– Contraband
– URLs
– Phone calls
– Citations
– Weights, scores
– Timestamps
Actions
Model an action as its own vertex or as an edge, depending on
what you want as vertices
(Bill)-[:SENT]->(email)-[:TO]->(Jim)
OR
(Bill)-[:EMAILED]->(Jim)
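The two modeling choices above can be sketched with a toy in-memory graph. This is only an illustration under stated assumptions: the `Graph` class, the `email-1` identifier and the `subject` property are hypothetical and do not come from the slides or from any particular graph database.

```python
from collections import defaultdict


class Graph:
    """Tiny illustrative property graph (hypothetical helper, not a real product API)."""

    def __init__(self):
        self.props = {}               # vertex id -> property dict
        self.out = defaultdict(list)  # vertex id -> [(edge type, target id)]

    def add_vertex(self, vid, **props):
        self.props[vid] = props

    def add_edge(self, src, etype, dst):
        self.out[src].append((etype, dst))

    def neighbors(self, src, etype):
        return [dst for t, dst in self.out[src] if t == etype]


# Option 1: the email is a vertex, so it can carry its own properties.
g1 = Graph()
g1.add_vertex("Bill")
g1.add_vertex("Jim")
g1.add_vertex("email-1", subject="Q3 forecast")  # made-up property
g1.add_edge("Bill", "SENT", "email-1")
g1.add_edge("email-1", "TO", "Jim")

# Option 2: the action is a single edge between the two people.
g2 = Graph()
g2.add_vertex("Bill")
g2.add_vertex("Jim")
g2.add_edge("Bill", "EMAILED", "Jim")

# The same question ("whom did Bill email?") takes two hops in option 1
# and one hop in option 2.
recipients_v = [r for e in g1.neighbors("Bill", "SENT") for r in g1.neighbors(e, "TO")]
recipients_e = g2.neighbors("Bill", "EMAILED")
print(recipients_v, recipients_e)
```

Option 1 costs an extra hop per query but leaves room for per-email data; option 2 is shorter to traverse when only the person-to-person link matters.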
Property Graph: The Domain Model
Semantic/RDF/Knowledge Graphs
• A triple is a data entity composed of subject-predicate-object
– “Bob is 35”
– “Bob knows Fred”
– “William likes running”
• In the image:
– Subject: John R Peterson Predicate: Knows Object: Frank T Smith
– Subject: Triple #1 Predicate: Confidence Percent Object: 70
– Subject: Triple #1 Predicate: Provenance Object: Mary L Jones
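A minimal sketch of the three triples listed above, assuming a plain Python list as the triple store. The `Triple` type, the `about` helper and the camel-case predicate names are illustrative only; a real RDF store would use IRIs and its own reification or RDF-star mechanism.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: object   # a name, or another Triple when a statement is about a statement
    predicate: str
    obj: object


# Triple #1 from the slide.
t1 = Triple("John R Peterson", "knows", "Frank T Smith")

# Two further triples whose subject is Triple #1 itself.
store = [
    t1,
    Triple(t1, "confidencePercent", 70),
    Triple(t1, "provenance", "Mary L Jones"),
]


def about(store, subject):
    """Return every predicate/object pair recorded for the given subject."""
    return {t.predicate: t.obj for t in store if t.subject == subject}


meta = about(store, t1)
print(meta)
```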
Graph Visualization
Graph Algorithms
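PageRank
[Figure: four pages, A, B, C and D, each starting with a rank of 1.0. Page A has two outgoing links, to B and C, each carrying 1*0.85/2. Pages B and D each link to C, and page C links to A; each of these links carries 1*0.85.]
New rank = sum of inputs + 0.15
Page C after the first iteration:
+0.150
page D +0.850
page B +0.850
page A +0.425
C Total 2.275
http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm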
PageRank: Results After the 1st Iteration
Page A 1.0
Page B 0.575
Page C 2.275
Page D 0.15
A: +0.15, page C +0.85, A Total 1.00
B: +0.150, page A +0.425, B Total 0.575
D: +0.150, D Total 0.150
PageRank Iterations
End of iteration   A result   B result   C result   D result
1                  1.000      0.575      2.275      0.150
2                  2.084      0.575      1.191      0.150
3                  1.163      1.036      1.652      0.150
4                  1.554      0.644      1.652      0.150
5                  1.554      0.810      1.485      0.150
6                  1.413      0.810      1.627      0.150
7                  1.533      0.750      1.567      0.150
8                  1.482      0.801      1.567      0.150
9                  1.482      0.780      1.588      0.150
10                 1.500      0.780      1.570      0.150
11                 1.485      0.788      1.578      0.150
12                 1.491      0.781      1.578      0.150
13                 1.491      0.784      1.575      0.150
14                 1.489      0.784      1.577      0.150
15                 1.491      0.783      1.576      0.150
16                 1.490      0.784      1.576      0.150
17                 1.490      0.783      1.577      0.150
18                 1.490      0.783      1.576      0.150
19                 1.490      0.783      1.577      0.150
20                 1.490      0.783      1.577      0.150
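The iteration table can be reproduced with a short sketch. It assumes the link structure implied by the contributions on the slides (A links to B and C; B, C and D each have one link, to C, A and C respectively), a damping factor of 0.85, and the slide's unnormalized form in which every page starts at 1.0 and receives a flat 0.15. The names `links`, `pagerank_step` and `pagerank` are illustrative, and the code assumes every page has at least one outgoing link.

```python
DAMPING = 0.85

# Outgoing links per page, inferred from the per-page contributions on the slides.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_step(ranks, links, damping=DAMPING):
    """One synchronous update: new rank = (1 - damping) + sum of incoming shares."""
    new = {page: 1.0 - damping for page in ranks}
    for page, targets in links.items():
        share = damping * ranks[page] / len(targets)  # rank is split across out-links
        for target in targets:
            new[target] += share
    return new


def pagerank(links, iterations, damping=DAMPING):
    """Return the rank dictionary after each iteration, starting from 1.0 everywhere."""
    ranks = {page: 1.0 for page in links}
    history = []
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links, damping)
        history.append(ranks)
    return history


history = pagerank(links, 20)
for i, ranks in enumerate(history, start=1):
    print(i, " ".join(f"{ranks[p]:.3f}" for p in "ABCD"))
```

Because all four pages have outgoing links, the ranks always sum to 4.0, which is a quick sanity check on any row of the table.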
PageRank: 20 Iterations Until Convergence
Page A 1.49
Page B 0.78
Page C 1.58 (most important web page)
Page D 0.15
Page C increases page A's importance
Betweenness
• Find bridges across different communities
• High score = edge links different communities
[Figure: two bridge vertices, each linking otherwise separate communities]
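As a rough illustration of how bridges surface, the sketch below computes vertex betweenness with Brandes' algorithm on a made-up graph of two triangles joined by a single link. The slide speaks of edges; this sketch scores vertices instead, which is an assumption made to keep it short. The graph and the function name are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Vertex betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # vertices in order of discovery
        preds = {v: [] for v in adj}        # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}         # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                        # back-propagate dependencies
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Two communities (a, b, c) and (d, e, f) joined only by the c-d link.
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = betweenness(adj)
print(scores)
```

Only the two vertices on the connecting link receive a non-zero score, which is the "bridge" signal the slide describes.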
Closeness
• Based on the shortest paths between any two vertices: a vertex
scores high when its paths to all other vertices are short
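A minimal sketch of closeness on an unweighted graph, assuming the common definition (number of other reachable vertices divided by the sum of shortest-path distances to them). The four-vertex path graph and the function names are made up for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """Closeness = reachable others / total distance to them (0.0 if isolated)."""
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


# A simple chain: a - b - c - d
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = closeness(adj)
print(scores)
```

The middle vertices of the chain score higher than the ends because their paths to everyone else are shorter.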
Eigen Centrality
• Measures the importance of a vertex by the importance of its neighbors
[Figure: a vertex whose neighbors are all important must itself be important]
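One way to compute this is power iteration: repeatedly replace each vertex's score with the sum of its neighbors' scores, then rescale. The sketch below assumes an undirected, connected graph that contains a triangle (so the iteration settles); the graph and names are illustrative only.

```python
import math


def eigenvector_centrality(adj, iterations=100):
    """Power iteration on the adjacency structure, normalized to unit length."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: sum(score[n] for n in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values()))
        score = {v: x / norm for v, x in new.items()}
    return score


# Triangle a-b-c with a pendant vertex d attached to c.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
centrality = eigenvector_centrality(adj)
ranking = sorted(centrality, key=centrality.get, reverse=True)
print(ranking)
```

Vertex d has only one neighbor, yet it still inherits some score from c; vertex c ranks first because it is connected to every other vertex.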
Clustering Coefficient: Cascading Churn
If two people churn,
what is the likelihood
others will?
The two churners affect
the central influencer
Finally: All contacts churn.
An Individual-focused model underestimates
churn by 6X.
SELECT *
FROM LocalClusteringCoefficient(
    ON Calls as edges
        PARTITION BY caller_from
    ON caller_from as vertices
        PARTITION BY caller_id
    targetKey('caller_to')
    directed('f')
    degreeRange('[3:]')
    accumulate('personId')
);
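For readers without access to the SQL function above, the same measure can be sketched in plain Python. This assumes an undirected call graph stored as a dictionary of neighbor sets; the caller names are made up, and the sketch does not reproduce the `degreeRange` or `accumulate` options of the query.

```python
from itertools import combinations


def local_clustering(adj):
    """Fraction of a vertex's neighbor pairs that are themselves connected."""
    result = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            result[v] = 0.0   # fewer than two neighbors: no pairs to close
            continue
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        result[v] = 2 * closed / (k * (k - 1))
    return result


# Hypothetical call graph: ann talks to bob, cy and dee; bob and cy also talk.
calls = {
    "ann": {"bob", "cy", "dee"},
    "bob": {"ann", "cy"},
    "cy": {"ann", "bob"},
    "dee": {"ann"},
}
lcc = local_clustering(calls)
print(lcc)
```

A high coefficient means a caller's contacts also call each other, which is the tightly knit setting in which churn is described as cascading.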
Great Questions for Graph Databases
• In what order did a specific set of related events
happen?
• Are there patterns of events in our data that seem
to be related by time?
• How far apart in a (social or physical) network are
two “actors” and how strong is their relationship?
• What are the identifiable social groups and what are
the general patterns of such groups?
• How important is any given “actor” in any given
network and event?
• What type of messages emanate from a specific
area?
How to Identify a Graph Workload
• Workload is identified by “network,
hierarchy, tree, ancestry, structure” words
• You are planning to use relational
performance tricks
• Your queries will be about pathing
• You are limiting queries by their complexity
• You are looking for “non-obvious” patterns
in the data
Healthcare Fraud
• Monitor drugs and treatments
– Excessive prescribers
– Excessive consumers
• Patients connected to
– Doctors, pharmacies, medications
• Use Graph Access
– Find outliers and investigate
[Figure callout: excessive relationships]
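A rough sketch of the "find outliers" step, assuming prescriptions are available as (doctor, patient) pairs and that "excessive" means a doctor connected to more than three times the median number of distinct patients. Both the data and the threshold are invented for illustration; a real investigation would choose its own rule.

```python
from collections import defaultdict
from statistics import median


def excessive_prescribers(prescriptions, factor=3.0):
    """Flag doctors whose distinct-patient count exceeds factor x the median."""
    patients = defaultdict(set)
    for doctor, patient in prescriptions:
        patients[doctor].add(patient)
    degree = {doctor: len(p) for doctor, p in patients.items()}
    cutoff = factor * median(degree.values())
    return sorted(d for d, k in degree.items() if k > cutoff)


# Hypothetical doctor-patient edges.
prescriptions = [
    ("dr_a", "p1"), ("dr_a", "p2"),
    ("dr_b", "p2"), ("dr_b", "p3"),
    ("dr_c", "p1"), ("dr_c", "p3"), ("dr_c", "p4"),
]
prescriptions += [("dr_x", f"p{i}") for i in range(1, 13)]  # 12 distinct patients

outliers = excessive_prescribers(prescriptions)
print(outliers)
```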
Online Shopping
• Bring fast context to a shopping experience
• Need to recall past similar interactions
• Need probabilistic models
– Product catalog
– Shopper attributes
Major Insurer
• Insight into risk environment
• Risks such as
– People appearing in multiple policies and
claims
– Premium leakage, e.g., underestimated mileage,
undeclared drivers, false garaging
– Padded claims
• Policyholder graph with risk indicators
– Risk indicators spread in graph
• Worker’s Compensation Fraud
Television, Magazine and Media
• Analyze content and consumption for
personalization
• Most users don’t “log in”
• Identified anonymous users through unique
cookies
– Cookies were unstable; the third-party data used to enrich
them needed to be vetted
• Determine valuable (connected) providers,
audience segments
• Enabled evaluation of the accuracy of vendor
data
– And cut the cost of using unreliable data
Cybersecurity
• Can categorize new websites and sources
• Continuously updated knowledge of
classifications, risk scores and identification
of new cyber threats
Automotive
• Identify which robotic parts were about to
fail so they could replace the failing parts all
at once
• Able to reconcile data to the same piece of
the production line machinery
• Able to identify when a part is about to fail
so they can pre-plan and avoid unnecessary
breaks in the production assembly line
Pharmaceutical/Research
• Need to connect data from disparate parts of
the company to increase research and
operational efficiency, increase output, and
accelerate drug research
– Allow analysts to quickly and easily access the full
body of institutional knowledge
• Graph allowed bioinformaticians to more
easily identify useful signals within large sets of
noisy data and to answer highly-specific
questions
• Link targets, genes, and disease data across
different parts of the company
Financial Services
• Anti-Money Laundering
– Identify connections
– Display the connections surrounding a specific point
– Identify which connections and situations of interest lead to productive investigations and inform work
[Figure: graph of Company, Trading Partner, Customer and Creditor vertices]
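"Display the connections surrounding a specific point" amounts to a bounded neighborhood query. The sketch below assumes an undirected adjacency dictionary and collects every vertex within a given number of hops of a starting vertex. The vertices "Shell Co" and "Offshore Account" and all of the links are hypothetical additions for the example.

```python
from collections import deque


def neighborhood(adj, start, max_hops):
    """Breadth-first search that stops expanding at max_hops; returns hop counts."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if hops[v] == max_hops:
            continue  # do not expand beyond the hop limit
        for w in adj.get(v, ()):
            if w not in hops:
                hops[w] = hops[v] + 1
                queue.append(w)
    return hops


# Illustrative graph; "Shell Co" and "Offshore Account" are made-up vertices.
adj = {
    "Company": ["Trading Partner", "Customer", "Creditor"],
    "Trading Partner": ["Company", "Shell Co"],
    "Customer": ["Company"],
    "Creditor": ["Company", "Shell Co"],
    "Shell Co": ["Trading Partner", "Creditor", "Offshore Account"],
    "Offshore Account": ["Shell Co"],
}
within_two = neighborhood(adj, "Company", 2)
print(within_two)
```

Raising `max_hops` widens the view around the point of interest, at the cost of pulling in more of the graph.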
Conclusion
• Graph is a fast-growing data category
• It’s all about the Use Case; Good for Graph:
– Real-time recommendations
– Fraud detection
– Network and IT operations
– Identity and access management
– Graph-based search
– Identifying relative importance
• Reimagine your data as a graph
– The whiteboard model is the physical model
• Remember PageRank
Graph Database Use Cases
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” OnAlytica
President, McKnight Consulting Group
An Inc. 5000 Company in 2018 and 2017
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET

Advanced Analytics: Graph Database Use Cases

  • 1. Graph Database Use Cases Presented by: William McKnight “#1 Global Influencer in Big Data” Thinkers360 President, McKnight Consulting Group An Inc. 5000 Company in 2018 and 2017 @williammcknight www.mcknightcg.com (214) 514-1444 Second Thursday of Every Month, at 2:00 ET
  • 2. 2023 Advanced Analytics Topics
      1. 2023 Trends in Enterprise Analytics
      2. Showing ROI for your Analytic Project
      3. Architecture, Products and Total Cost of Ownership of the Leading Machine Learning Stacks
      4. Competitive Analytic Architectures: Comparing the Data Mesh, Data Fabric, Data Lakehouse and Data Cloud
      5. Why Analytics Leaders deploy Master Data Management
      6. What Does Information Management Maturity Look Like in 2023
      7. Understanding the Modern Applications of Graph Databases
      8. Common Misconceptions About Master Data Management
      9. Organizational Change Management: Will it Hold Back Artificial Intelligence Deployments?
      10. Open-Source vs Commercial Vendor Software in the Enterprise
      11. Data Quality: The ROI of Adding Intelligence to Data
      12. Strategies for Machine Learning Success
  • 3. Relational DBs Can’t Handle Data Relationships Well
      • Cannot model or store data and relationships without complexity
      • Performance degrades with number and levels of relationships, and database size
      • Query complexity grows with need for JOINs
      • Adding new types of data and relationships requires schema redesign, increasing time to market
      Result: slow development, poor performance, low scalability, hard to maintain … making traditional databases inappropriate when data relationships are valuable in real-time
  • 4. Use the Right Database for the Right Job
      • Graph Databases are designed for data relationships
      • Development Benefits: model maintenance
      • Deployment Benefits: performance, minimal resource usage
      [Figure: Other NoSQL, Relational DBMS and Graph DB placed on a range from discrete, minimally connected data to connected data focused on data relationships]
  • 5. What Can Be Vertices?
      • Things
        – Bank accounts
        – Customer accounts
          • Mobile phones
        – Products
        – Trading networks, auctions
        – Water, power, gas grids
        – Disease, drugs, molecules
          • Interactions, transmission
        – Insurance policies
        – Machines, servers, URLs
        – Sensor networks
      • People
        – Customers, families
        – Employees
        – Affinity groups, clubs
          • Politics, causes, doctors
          • Professionals (LinkedIn)
        – Companies, institutions
      • Places
        – Map locations
          • Cities, landmarks
        – Retail stores
        – Houses or buildings
        – Communication networks
        – Transportation hubs
          • Airports, shipping lanes, etc.
  • 6. What Can Be Edges?
      • People
        – Relationships
        – Ideas, preferences
        – Email, phone calls, SMS, IM
        – Collaborations
      • Places
        – Roads, routes, railways
        – Water, power, gas, pipelines, telephone lines
        – Anything with GPS coordinates
      • Things
        – Events
        – Money Transactions
        – Purchases
        – Pressure
        – Diseases
        – Contraband
        – URLs
        – Phone calls
        – Citations
        – Weights, scores
        – Timestamps
  • 7. Actions
      Model actions depending on what you want as vertices:
      (Bill)-[:SENT]->(email)-[:TO]->(Jim)
      OR
      (Bill)-[:EMAILED]->(Jim)
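The two patterns on this slide can be compared with a small sketch. It assumes a plain in-memory edge list rather than any particular graph database; the helper names (`targets`, `recipients_via_vertex`, `recipients_via_edge`) are hypothetical and only illustrate the trade-off: the first model needs two hops but gives the email its own vertex, the second answers the same question in one hop.

```python
# Hypothetical in-memory edge lists; vertex and edge labels mirror the slide.

# Model 1: the email is its own vertex, so it could carry properties
# (subject, timestamp) or link to more than one recipient.
email_as_vertex = [
    ("Bill", "SENT", "email"),
    ("email", "TO", "Jim"),
]

# Model 2: the action is collapsed into a single edge between people.
email_as_edge = [
    ("Bill", "EMAILED", "Jim"),
]


def targets(edges, source, label):
    """Return the vertices reached from `source` over edges with `label`."""
    return [dst for src, lbl, dst in edges if src == source and lbl == label]


def recipients_via_vertex(edges, sender):
    """Two hops: sender -SENT-> email -TO-> recipient."""
    return [
        recipient
        for email in targets(edges, sender, "SENT")
        for recipient in targets(edges, email, "TO")
    ]


def recipients_via_edge(edges, sender):
    """One hop: sender -EMAILED-> recipient."""
    return targets(edges, sender, "EMAILED")


print(recipients_via_vertex(email_as_vertex, "Bill"))
print(recipients_via_edge(email_as_edge, "Bill"))
```

Both calls answer "whom did Bill email?"; which model fits depends on whether the email itself is something you want to query.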
  • 8. Property Graph: The Domain Model
  • 9. Semantic/RDF/Knowledge Graphs
      • A triple is a data entity composed of subject-predicate-object
        – “Bob is 35”
        – “Bob knows Fred”
        – “William likes running”
      • In the image:
        – Subject: John R Peterson; Predicate: Knows; Object: Frank T Smith
        – Subject: Triple #1; Predicate: Confidence Percent; Object: 70
        – Subject: Triple #1; Predicate: Provenance; Object: Mary L Jones
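As a rough illustration of the triples on this slide, the sketch below stores them as Python tuples and lets a triple appear as the subject of another triple, which is how the confidence and provenance statements refer to "Triple #1". The `Triple` type and the `match` helper are assumptions made for this example; a real RDF store would use IRIs and a query language such as SPARQL.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement (hypothetical minimal type)."""
    subject: object
    predicate: str
    obj: object


# "Triple #1" from the slide's image.
triple_1 = Triple("John R Peterson", "Knows", "Frank T Smith")

store = [
    Triple("Bob", "is", 35),
    Triple("Bob", "knows", "Fred"),
    Triple("William", "likes", "running"),
    triple_1,
    # Statements about a statement: the subject is itself a triple.
    Triple(triple_1, "Confidence Percent", 70),
    Triple(triple_1, "Provenance", "Mary L Jones"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything asserted about Bob, then the confidence attached to Triple #1.
print(match(store, subject="Bob"))
print(match(store, subject=triple_1, predicate="Confidence Percent")[0].obj)
```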
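  • 12. PageRank
      • Page A, Page B, Page C and Page D each start at 1.0
      • Each page passes on 0.85 of its rank, split evenly across its outbound links: 1*0.85/2 per link for the page with two links, 1*0.85 for each page with one link
      • New rank = sum of inputs + 0.15
      http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
      http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm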
  • 13. PageRank: Results After 1st Iteration
      • Page A: +0.15, +0.85 from page C; total 1.00
      • Page B: +0.150, +0.425 from page A; total 0.575
      • Page C: +0.150, +0.850 from page D, +0.850 from page B, +0.425 from page A; total 2.275
      • Page D: +0.150; total 0.150
      http://www.whitelines.nl/html/google-page-rank.html (see spreadsheet)
  • 14. PageRank Iterations
      End of iteration   A result   B result   C result   D result
      1                  1.000      0.575      2.275      0.150
      2                  2.084      0.575      1.191      0.150
      3                  1.163      1.036      1.652      0.150
      4                  1.554      0.644      1.652      0.150
      5                  1.554      0.810      1.485      0.150
      6                  1.413      0.810      1.627      0.150
      7                  1.533      0.750      1.567      0.150
      8                  1.482      0.801      1.567      0.150
      9                  1.482      0.780      1.588      0.150
      10                 1.500      0.780      1.570      0.150
      11                 1.485      0.788      1.578      0.150
      12                 1.491      0.781      1.578      0.150
      13                 1.491      0.784      1.575      0.150
      14                 1.489      0.784      1.577      0.150
      15                 1.491      0.783      1.576      0.150
      16                 1.490      0.784      1.576      0.150
      17                 1.490      0.783      1.577      0.150
      18                 1.490      0.783      1.576      0.150
      19                 1.490      0.783      1.577      0.150
      20                 1.490      0.783      1.577      0.150
  • 15. PageRank: 20 Iterations Until Convergence
      • Page A 1.49
      • Page B 0.78
      • Page C 1.58 (most important web page)
      • Page D 0.15
      • Page C increases page A importance
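The four-page PageRank walk-through can be reproduced with a short sketch. The link structure below is an assumption inferred from the contributions listed on the slides (A links to B and C; B, C and D each have one outbound link); the slides themselves only show it as a diagram. The update uses the slide's rule, new rank = sum of inputs + 0.15, with every page recomputed from the previous iteration's values. The sketch assumes every page has at least one outbound link.

```python
DAMPING = 0.85  # share of a page's rank passed along its outbound links

# Outbound links, inferred from the per-page contributions on the slides.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank_step(ranks, links, damping=DAMPING):
    """One synchronous update: every page starts at 1 - damping (0.15),
    then receives damping * rank / out-degree from each page linking to it."""
    new_ranks = {page: 1.0 - damping for page in ranks}
    for page, outbound in links.items():
        share = damping * ranks[page] / len(outbound)
        for target in outbound:
            new_ranks[target] += share
    return new_ranks


def pagerank(links, iterations=20):
    """Run the update from an initial rank of 1.0 per page and
    return the ranks after each iteration."""
    ranks = {page: 1.0 for page in links}
    history = []
    for _ in range(iterations):
        ranks = pagerank_step(ranks, links)
        history.append(ranks)
    return history


for i, ranks in enumerate(pagerank(links), start=1):
    print(i, " ".join(f"{page}={ranks[page]:.3f}" for page in "ABCD"))
```

Page D has no inbound links, so it stays at 0.15 throughout, while A, B and C oscillate before settling.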
  • 16. Betweenness
      • Find bridges across different communities
      • High score = edge links different communities
      [Figure: communities joined through bridge vertices]
  • 17. Closeness
      • Based on the shortest paths between any two vertices: a vertex scores high when its shortest paths to the other vertices are short
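A minimal sketch of closeness on an unweighted, undirected graph follows. It assumes one common definition, (number of reachable vertices) / (sum of shortest-path lengths to them), which the slide does not spell out; the five-vertex chain is a made-up example, not data from the presentation.

```python
from collections import deque


def bfs_distances(graph, start):
    """Shortest-path length (in hops) from `start` to every reachable vertex."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in graph[vertex]:
            if neighbor not in distances:
                distances[neighbor] = distances[vertex] + 1
                queue.append(neighbor)
    return distances


def closeness(graph, vertex):
    """Reachable vertices divided by the total distance to them.
    Returns 0.0 for an isolated vertex."""
    distances = bfs_distances(graph, vertex)
    total = sum(distances.values())
    if total == 0:
        return 0.0
    return (len(distances) - 1) / total


# Hypothetical chain a - b - c - d - e, stored as adjacency lists.
chain = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

for v in chain:
    print(v, round(closeness(chain, v), 3))
```

The middle vertex `c` scores highest because its shortest paths to everyone else are the shortest on average.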
  • 18. Eigen Centrality
      • Measures the importance of a vertex by the importance of its neighbors
      [Figure: a vertex whose neighbors are all important must be important]
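The idea that a vertex inherits importance from its neighbors can be sketched with power iteration: repeatedly set each vertex's score to the sum of its neighbors' scores, then rescale. This is an illustrative sketch, not the presenter's method; the small graph is hypothetical, and the code assumes a connected, undirected, non-bipartite graph with at least one edge so the iteration settles.

```python
import math


def eigenvector_centrality(graph, iterations=100):
    """Power iteration: a vertex's score is the sum of its neighbors'
    scores, rescaled each round so the vector has unit length."""
    scores = {vertex: 1.0 for vertex in graph}
    for _ in range(iterations):
        summed = {
            vertex: sum(scores[neighbor] for neighbor in graph[vertex])
            for vertex in graph
        }
        norm = math.sqrt(sum(value * value for value in summed.values()))
        scores = {vertex: value / norm for vertex, value in summed.items()}
    return scores


# Hypothetical graph: triangle a-b-c plus d attached only to a.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}

scores = eigenvector_centrality(graph)
for vertex, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(vertex, round(score, 3))
```

Here `d` has only one neighbor, yet it still earns a non-trivial score because that neighbor is the best-connected vertex.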
  • 19. Clustering Coefficient: Cascading Churn
      • If two people churn, what is the likelihood others will?
      • The two churners affect the central influencer
      • Finally: all contacts churn
      • An individual-focused model underestimates churn by 6X
      SELECT * FROM LocalClusteringCoefficient(
        ON Calls as edges PARTITION BY caller_from
        ON caller_from as vertices PARTITION BY caller_id
        targetKey('caller_to')
        directed('f')
        degreeRange('[3:]')
        accumulate('personId')
      );
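For readers without access to the SQL function above, the quantity it is named after can be sketched in plain Python. This assumes the standard undirected definition of the local clustering coefficient (links among a vertex's neighbors divided by the number of possible links among them) and does not claim to match that function's exact output; the call graph and names are made up.

```python
from itertools import combinations


def local_clustering_coefficient(graph, vertex):
    """Fraction of pairs of `vertex`'s neighbors that are themselves
    connected. Returns 0.0 when the vertex has fewer than two neighbors."""
    neighbors = graph[vertex]
    degree = len(neighbors)
    if degree < 2:
        return 0.0
    linked_pairs = sum(
        1 for first, second in combinations(neighbors, 2)
        if second in graph[first]
    )
    return 2 * linked_pairs / (degree * (degree - 1))


# Hypothetical undirected call graph: who has spoken with whom.
calls = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}

for person in sorted(calls):
    print(person, round(local_clustering_coefficient(calls, person), 3))
```

A high coefficient means a person's contacts also talk to each other, which is the tightly knit setting where churn is likely to cascade.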
  • 20. Great Questions for Graph Databases
      • In what order did a specific set of related events happen?
      • Are there patterns of events in our data that seem to be related by time?
      • How far apart in a (social or physical) network are two “actors” and how strong is their relationship?
      • What are the identifiable social groups and what are the general patterns of such groups?
      • How important is any given “actor” in any given network and event?
      • What type of messages emanate from a specific area?
  • 21. How to Identify a Graph Workload
      • Workload is identified by “network, hierarchy, tree, ancestry, structure” words
      • You are planning to use relational performance tricks
      • Your queries will be about pathing
      • You are limiting queries by their complexity
      • You are looking for “non-obvious” patterns in the data
  • 22. Healthcare Fraud
      • Excessive relationships
      • Monitor drugs and treatments
        – Excessive prescribers
        – Excessive consumers
      • Patients connected to
        – Doctors, pharmacies, medications
      • Use Graph Access
        – Find outliers and investigate
  • 23. Online Shopping
      • Bring fast context to a shopping experience
      • Need to recall past similar interactions
      • Need probabilistic models
        – Product catalog
        – Shopper attributes
  • 24. Major Insurer
      • Insight into risk environment
      • Risks such as
        – People appearing in multiple policies and claims
        – Premium leakage, i.e., underestimated mileage, undeclared drivers, false garaging
        – Padded claims
      • Policyholder graph with risk indicators
        – Risk indicators spread in graph
      • Worker’s Compensation Fraud
  • 25. Television, Magazine and Media
      • Analyze content and consumption for personalization
      • Most users don’t “log in”
      • Identified anonymous users through unique cookies
        – Cookies unstable, used third-party to enrich; needed to vet
      • Determine valuable (connected) providers, audience segments
      • Enabled evaluation of the accuracy of vendor data
        – And cut the cost of using unreliable data
  • 26. Cybersecurity
      • Can categorize new websites and sources
      • Continuously updated knowledge of classifications, risk scores and identification of new cyber threats
  • 27. Automotive
      • Identify which robotic parts were about to fail so they could replace the failing parts all at once
      • Able to reconcile data to the same piece of the production line machinery
      • Able to identify when a part is about to fail so they can pre-plan and avoid unnecessary breaks in the production assembly line
  • 28. Pharmaceutical/Research
      • Need to connect data from disparate parts of the company to increase research and operational efficiency, increase output, and accelerate drug research
        – Allow analysts to quickly and easily access the full body of institutional knowledge
      • Graph allowed bioinformaticians to more easily identify useful signals within large sets of noisy data and to answer highly specific questions
      • Link targets, genes, and disease data across different parts of the company
  • 29. Financial Services
      • Anti-Money Laundering
        – Identify connections
        – Display the connections surrounding a specific point
        – Identify which connections and situations of interest lead to productive investigations and inform work
      [Figure: graph linking Company, Trading Partner, Customer and Creditor]
  • 30. Conclusion
      • Graph is a fast-growing data category
      • It’s all about the Use Case; Good for Graph:
        – Real-time recommendations
        – Fraud detection
        – Network and IT operations
        – Identity and access management
        – Graph-based search
        – Identifying relative importance
      • Reimagine your data as a graph
        – The whiteboard model is the physical model
      • Remember PageRank
  • 31. Graph Database Use Cases Presented by: William McKnight “#1 Global Influencer in Data Warehousing” OnAlytica President, McKnight Consulting Group An Inc. 5000 Company in 2018 and 2017 @williammcknight www.mcknightcg.com (214) 514-1444 Second Thursday of Every Month, at 2:00 ET