Scalability and Graph Analytics with Neo4j
Stefan Kolmar
VP Field Engineering - Neo4j
I Remember...
The Evolution of Databases
TRADITIONAL OLTP/RELATIONAL BIG DATA TECHNOLOGY
The classic challenges for Telcos
Large Data Volumes
CDRs
Network Metrics
Customer Metrics
Dynamic Access
What Is Different in Neo4j?
Index-Free Adjacency
[Chart: Response Time (y-axis) against Connectedness and Size of Data Set (x-axis)]
Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
1000x Advantage: “Minutes to milliseconds”
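The idea behind index-free adjacency can be illustrated with a minimal Python sketch. This is not Neo4j's storage engine; the `Node` class, the `within_hops` function and the toy graph are hypothetical names made up for illustration. Each node holds direct references to its neighbours, so a traversal only touches the part of the graph it actually walks.

```python
from collections import deque


class Node:
    """A graph node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.out = []  # direct pointers to adjacent Node objects

    def connect(self, other):
        self.out.append(other)


def within_hops(start, max_hops):
    """Breadth-first traversal that follows neighbour pointers only.

    The work done depends on the neighbourhood that is visited,
    not on the total number of nodes stored.
    """
    seen = {start.name: 0}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for nxt in node.out:
            if nxt.name not in seen:
                seen[nxt.name] = depth + 1
                queue.append((nxt, depth + 1))
    return seen


# Hypothetical toy network: a -> b -> c -> d, plus an unrelated node z.
a, b, c, d, z = (Node(n) for n in "abcdz")
a.connect(b)
b.connect(c)
c.connect(d)

reachable = within_hops(a, 2)
print(reachable)
```

The unrelated node `z` is never visited, which is the point of the sketch: adding more unconnected data does not add work to this traversal.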
The Largest Investment
in Graph Databases
Multi-tenancy with Neo4j 4.0
Multi-Database: Use Cases
• B2B SaaS: Greatly simplified management of DB infrastructure for your customers.
• Multi-tenancy: A single instance of Neo4j Server/Cluster may serve multiple customers/users within an organization.
• Rapid Testing/Development/Deployment: Manage separate databases for development, testing, staging, etc. in a single infrastructure.
• Scalability: Disjoint data is organized in physically separate structures, providing strong isolation.
• Cloud-Friendly: Databases can be associated with cloud storage and easily detached from one server and attached to another.
Configure and Manage Neo4j Multi-Database
Administration commands:
● CREATE|DROP|START|STOP DATABASE name
Use commands:
● HTTP API: http://server:port/.../database
● Browser & Cypher Shell: :USE database
● Drivers: Session(database)
● Browser:
[Diagram labels: Network Mgmt, Customer Relations]
Unbounded Scalability in Neo4j 4.0
Causal Clustering with Neo4j
Fabric: Distributed Graph Query
• Scale-out model
• Two ways of using:
  • Operate over single large, decomposed graph
  • Query across disjoint graphs, per business domain
Data Scientists: Run analysis on large, distributed databases.
Developers: Develop large scale applications on laptops/desktops and deploy in a network of Neo4j clusters.
Enterprises: Keep data in designated geographies. Analyse graphs without replicating or moving them.
Cypher Queries
SQL
Cypher in Neo4j
MATCH (boss)-[:MANAGES*0..3]->(sub),
(sub)-[:MANAGES*1..3]->(report)
RETURN boss.name AS Boss,
sub.name AS Subordinate,
count(report) AS Total
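To make the grouping in this query concrete, here is a small Python sketch that emulates it on a tree-shaped hierarchy. It assumes unique names and a strict tree, where every pair of nodes is joined by at most one MANAGES path. The `MANAGES` dictionary and the function names are hypothetical, and this is not how Cypher evaluates the pattern internally.

```python
from collections import defaultdict

# Hypothetical MANAGES tree: manager -> direct reports.
MANAGES = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Dan"],
    "Cat": [],
    "Dan": [],
}


def descendants(start, min_hops, max_hops):
    """Return nodes reachable from start in min_hops..max_hops MANAGES steps."""
    found = []
    frontier = [start]
    for depth in range(0, max_hops + 1):
        if depth >= min_hops:
            found.extend(frontier)
        frontier = [r for m in frontier for r in MANAGES.get(m, [])]
    return found


def management_report():
    """Count reports per (boss, sub) pair, like count(report) grouped by both names."""
    totals = defaultdict(int)
    for boss in MANAGES:
        for sub in descendants(boss, 0, 3):        # (boss)-[:MANAGES*0..3]->(sub)
            for _report in descendants(sub, 1, 3):  # (sub)-[:MANAGES*1..3]->(report)
                totals[(boss, sub)] += 1
    return dict(totals)


report = management_report()
for (boss, sub), total in sorted(report.items()):
    print(boss, sub, total)
```

Pairs whose subordinate has no reports do not appear in the result, just as the Cypher pattern produces no row when the second path cannot be matched.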
Multi-graph Cypher Queries
SQL
UNWIND corporate.graphIds() AS gid
CALL {
  USE corporate.graph( gid )
  MATCH (boss)-[:MANAGES*0..3]->(sub),
        (sub)-[:MANAGES*1..3]->(report)
  RETURN boss.name AS Boss,
         sub.name AS Subordinate,
         count(report) AS Total
}
RETURN Boss, Subordinate, Total ORDER BY Total
Cypher in Neo4j 4.0
• Executes queries in parallel on multiple databases, combining or aggregating results.
• Chains queries together from multiple databases for sophisticated real-time analyses.
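The fan-out and merge pattern described in these two bullets can be sketched in plain Python. This is only an illustration of the pattern, not Fabric itself: `GRAPHS`, `run_on_graph` and `fan_out` are hypothetical stand-ins, and the rows are made-up sample data in place of real query results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-graph results, standing in for the rows each
# database would return for the inner MATCH ... RETURN block.
GRAPHS = {
    0: [("Ann", "Bob", 4), ("Ann", "Cat", 2)],
    1: [("Eve", "Fay", 7)],
    2: [("Gus", "Hal", 1), ("Gus", "Ian", 5)],
}


def run_on_graph(gid):
    """Stand-in for executing the subquery against one graph."""
    return GRAPHS[gid]


def fan_out(graph_ids):
    """Run the subquery on every graph concurrently, then merge and sort."""
    with ThreadPoolExecutor(max_workers=len(graph_ids)) as pool:
        partial_results = list(pool.map(run_on_graph, graph_ids))
    rows = [row for part in partial_results for row in part]
    return sorted(rows, key=lambda row: row[2])  # mirrors ORDER BY Total


combined = fan_out(list(GRAPHS))
for boss, sub, total in combined:
    print(boss, sub, total)
```

Each graph is queried independently, and only the final merge and sort happen in one place, which is what lets the work spread across databases.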
How will this help a Telco to scale?
The foundation: Causal Cluster
The evolution: Fabric
Large Data Volumes: CDRs, Network Metrics, Customer Metrics
Scaling R/W access
[Diagram: a user connected to two Neo4j DBMSs. Cluster A: Core 1, Core 2, Core 3, Replica 1, Replica 2. Cluster B: Core 1, Core 2, Core 3. Network Metrics labels NM1, NM2 and NM3 appear on the cluster members.]
http://ldbcouncil.org/developer/snb and https://neo4j.com/fosdem20
Neo4j 4.0 Scalability in Action
Sharding the LDBC Social Network Benchmark
Data Model
• 1 shard for the Persons graph
• N shards for the Forums graph
Up to 300x reduced latency
Up to 10x Performance improvement
Scalability → Security?
Security and Data Privacy
[Figure labels: Bob, Joe]
• Based on Role-based Access Control for graphs
• Restrictions on what data can be seen by different users, applied to all database interactions
• Implicit security view of the data for each user through schema-based security definitions
• Grant/Deny permissions to traverse, read or write data based on node labels, relationship types or database and property names
• Security rules are replicated across the cluster via roles that are associated with the users
[Diagram labels: Baseline_Personnel_Security_Standard, Security_Check, Counter_Terrorism_Check, Developed_Vetting]
Security and Data Privacy in Practice
• Call Centre Agent:
-> needs Doctor’s name
-> not allowed to read diagnosis
• Doctor:
-> ability to view patient records and
-> ability to view patient diagnoses
Constraints
// Doctors get wide-ranging access
GRANT ACCESS ON DATABASE healthcare TO doctor;
GRANT TRAVERSE {*} ON GRAPH healthcare TO doctor;
GRANT READ {*} ON GRAPH healthcare TO doctor;
GRANT WRITE ON GRAPH healthcare TO doctor;
Security Config
// Agents get narrower access
GRANT ACCESS ON DATABASE healthcare TO agent;
GRANT TRAVERSE {*} ON GRAPH healthcare TO agent;
GRANT READ {name} ON GRAPH healthcare NODES Doctor TO agent;
GRANT READ {name} ON GRAPH healthcare NODES Patient TO agent;
Call Centre Agent
MATCH (:CallcenterAgent {name: 'Alice'})
<-[:CALLED]-(p:Patient)-[:HAS_DIAGNOSIS]-(dia)
<-[:ESTABLISHED]-(d:Doctor)
RETURN p.name, d.name, dia.name;
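The effect of these grants on what each role sees can be sketched in Python. This is a toy model of property-level read filtering, not Neo4j's security implementation. `READ_GRANTS`, `visible_properties`, the `Diagnosis` label, the `ssn` property and the sample values are all hypothetical, and the property key follows the lowercase `name` used in the query.

```python
# Hypothetical in-memory model of read privileges:
# role -> label -> set of readable property keys ("*" means all).
READ_GRANTS = {
    "doctor": {"*": {"*"}},
    "agent": {"Doctor": {"name"}, "Patient": {"name"}},
}


def visible_properties(role, label, properties):
    """Return only the properties that the role may read on this label."""
    grants = READ_GRANTS.get(role, {})
    allowed = grants.get(label, set()) | grants.get("*", set())
    if "*" in allowed:
        return dict(properties)
    return {k: v for k, v in properties.items() if k in allowed}


# Made-up sample records for illustration only.
patient = {"name": "Jack", "ssn": "123"}
diagnosis = {"name": "Flu"}

print(visible_properties("agent", "Patient", patient))
print(visible_properties("agent", "Diagnosis", diagnosis))
print(visible_properties("doctor", "Diagnosis", diagnosis))
```

In this model the agent can still reach the diagnosis node, which corresponds to the TRAVERSE grant, but reads nothing from it, while the doctor sees every property.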
Reactive Architecture Neo4j 4.0
• Flow control throughout the stack, allowing the client application to fully control the production and flow of records within a result
• Synchronous/Asynchronous execution
• Based on reactive streams with non-blocking backpressure library
• Client applications can pull or discard the whole result or N elements
• Can also be gracefully cancelled
• Exposed through a reactive API in Drivers v4.0
• Use Cases:
  • Long queries with large result sets
  • Paged results
  • Thin/small clients
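The "pull N elements, then discard" behaviour can be illustrated with plain Python generators. This sketch is synchronous and uses neither the Reactive Streams API nor a Neo4j driver. `record_stream` and `pull` are hypothetical names, and it only shows that the producer builds records when the client asks for them.

```python
import itertools


def record_stream(log):
    """Lazily produce records; nothing is computed until it is requested."""
    for i in itertools.count(1):
        log.append(i)  # remember what the producer actually built
        yield {"row": i}


def pull(stream, n):
    """Request at most n records from the stream."""
    return list(itertools.islice(stream, n))


produced = []
stream = record_stream(produced)

first_page = pull(stream, 3)   # client asks for 3 records
second_page = pull(stream, 2)  # later asks for 2 more
stream.close()                 # client discards the rest of the result

print(first_page, second_page, produced)
```

Because the stream is conceptually unbounded here, a client that fetched everything eagerly would never finish. Demand-driven pulling keeps the producer's work equal to what the client requested.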
Graph Data Science
Science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions. Uses multi-disciplinary workflows that may include queries, statistics, algorithms and machine learning.
Graph Recipes & Analytics: Answers specific questions to gain insights from connections in existing/historical data. Approaches typically include global queries and algorithms and direct use of results.
Graph Enhanced ML & AI: Training models (ML) with graph structured data to be used to emulate human, probabilistic decisions within a solution/application (AI system).
The Graph Data Science Library
Optimized for Analytics
• Leverage custom data structures optimized for global traversals and aggregation
• Flexibly decompose and reshape your graph for specific use cases
Algorithms for Insights
• Robust algorithms that are highly parallelized and scale to billions of nodes
• Early access to dozens of experimental implementations
Intuitive Interface
• Drastically simplified and standardized API that enables custom configurations
• Documentation, training, and examples so getting started is simple
Product Supported & Under Active Development
Graph Data Science
Analytics projections:
- Specialized data structure for algorithms, capable of supporting billions of nodes
- Cypher loaders for experimentation
- Quickly reshape, combine, aggregate, and deduplicate your transactional data
- Support for multiple node labels, relationship types, and properties
- Manage multiple in-memory analytics graphs for different workloads
- Memory footprint allowing large scale use
Graph algorithms & more:
- 40+ algorithms in 5 categories: community, centrality, similarity, pathfinding, and link prediction
- Helper algorithms like graph generation, one-hot encoding, and random walk
- Early previews of new implementations in the alpha & beta namespaces
- Supported, scalable algorithms include seeding, determinism, and incremental calculations
- Estimate mode for memory requirements
Graph Data Science Algorithms
Generally Unsupervised
A subset of data science algorithms that come from network science,
Graph Algorithms enable reasoning about network structure.
Pathfinding and Search
Centrality (Importance)
Community Detection
Heuristic Link Prediction
Similarity
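As an example of the centrality category, here is a minimal power-iteration PageRank in Python. It is a textbook-style sketch, not the Graph Data Science Library's implementation or API. The `pagerank` function, its default parameters and the toy edge list are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out_links[src] or nodes  # dangling nodes spread evenly
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy network: three sites all route to one hub.
edges = [("site1", "hub"), ("site2", "hub"), ("site3", "hub"), ("hub", "site1")]
scores = pagerank(edges)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

The node that many others point to ends up with the highest score, which is the sense of "importance" that centrality algorithms capture.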
Conclusions
• Neo4j provides
  • Scalability for Telcos
  • Carrier grade high availability with Causal Cluster
  • Security features to fulfill privacy requirements
  • Graph Analytics to provide Data Science infrastructure for Telcos

More Related Content

What's hot

What's hot (20)

Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4j
 
Risk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep LearningRisk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep Learning
 
Graphdatenbank Neo4j: Konzept, Positionierung, Status Region DACH - Bruno Un...
 Graphdatenbank Neo4j: Konzept, Positionierung, Status Region DACH - Bruno Un... Graphdatenbank Neo4j: Konzept, Positionierung, Status Region DACH - Bruno Un...
Graphdatenbank Neo4j: Konzept, Positionierung, Status Region DACH - Bruno Un...
 
Big Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationBig Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data Democratization
 
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricUsing Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
 
GraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform OverviewGraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform Overview
 
Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4j
 
The Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge GraphThe Business Case for Semantic Web Ontology & Knowledge Graph
The Business Case for Semantic Web Ontology & Knowledge Graph
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
A field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial TimesA field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial Times
 
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 
Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?
 
Roadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyRoadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph Strategy
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
 
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes ...
 
Fireside Chat with Bloor Research: State of the Graph Database Market 2020
Fireside Chat with Bloor Research: State of the Graph Database Market 2020Fireside Chat with Bloor Research: State of the Graph Database Market 2020
Fireside Chat with Bloor Research: State of the Graph Database Market 2020
 
Future of Data Platform in Cloud Native world
Future of Data Platform in Cloud Native worldFuture of Data Platform in Cloud Native world
Future of Data Platform in Cloud Native world
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning
 
How OpenTable uses Big Data to impact growth by Raman Marya
How OpenTable uses Big Data to impact growth by Raman MaryaHow OpenTable uses Big Data to impact growth by Raman Marya
How OpenTable uses Big Data to impact growth by Raman Marya
 

Similar to Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Precisely
 
GraphTour London 2020 - What's New, Jim Webber
GraphTour London 2020 -  What's New, Jim WebberGraphTour London 2020 -  What's New, Jim Webber
GraphTour London 2020 - What's New, Jim Webber
Neo4j
 

Similar to Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j (20)

Microsoft Azure Big Data Analytics
Microsoft Azure Big Data AnalyticsMicrosoft Azure Big Data Analytics
Microsoft Azure Big Data Analytics
 
Squirrel – Enabling Accessible Analytics for All
Squirrel – Enabling Accessible Analytics for AllSquirrel – Enabling Accessible Analytics for All
Squirrel – Enabling Accessible Analytics for All
 
BIG DATA ANALYTICS MEANS “IN-DATABASE” ANALYTICS
BIG DATA ANALYTICS MEANS “IN-DATABASE” ANALYTICSBIG DATA ANALYTICS MEANS “IN-DATABASE” ANALYTICS
BIG DATA ANALYTICS MEANS “IN-DATABASE” ANALYTICS
 
Big Data Analytics: From SQL to Machine Learning and Graph Analysis
Big Data Analytics: From SQL to Machine Learning and Graph AnalysisBig Data Analytics: From SQL to Machine Learning and Graph Analysis
Big Data Analytics: From SQL to Machine Learning and Graph Analysis
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
Data Treatment MongoDB
Data Treatment MongoDBData Treatment MongoDB
Data Treatment MongoDB
 
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the CloudFSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & Bénéfices
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FS
 
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
6° Sessione - Ambiti applicativi nella ricerca di tecnologie statistiche avan...
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSetsWebinar: The Modern Streaming Data Stack with Kinetica & StreamSets
Webinar: The Modern Streaming Data Stack with Kinetica & StreamSets
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use Cases
 
Integration Patterns for Big Data Applications
Integration Patterns for Big Data ApplicationsIntegration Patterns for Big Data Applications
Integration Patterns for Big Data Applications
 
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data AnalyticsHow to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
How to Architect a Serverless Cloud Data Lake for Enhanced Data Analytics
 
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
Artur Fejklowicz - “Data Lake architecture” AI&BigDataDay 2017
 
Analytics&IoT
Analytics&IoTAnalytics&IoT
Analytics&IoT
 
GraphTour London 2020 - What's New, Jim Webber
GraphTour London 2020 -  What's New, Jim WebberGraphTour London 2020 -  What's New, Jim Webber
GraphTour London 2020 - What's New, Jim Webber
 
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...
 

More from Neo4j

More from Neo4j (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansQIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
QIAGEN: Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 

Recently uploaded

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 

Recently uploaded (20)

Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 

Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j

  • 1. Scalability and Graph Analytics with Neo4j Stefan Kolmar VP Field Engineering - Neo4j
  • 3. The Evolution of Databases
  • 4. The Evolution of Databases TRADITIONAL OLTP/RELATIONAL
  • 5. The Evolution of Databases TRADITIONAL OLTP/RELATIONAL BIG DATA TECHNOLOGY
  • 8. The classic challenges for Telcos: Large Data Volumes (CDRs, Network Metrics, Customer Metrics)
  • 9. The classic challenges for Telcos: Large Data Volumes (CDRs, Network Metrics, Customer Metrics); Dynamic Access
  • 10. What Is Different in Neo4j? Index-Free Adjacency
  • 11. [Chart: Response Time vs. Connectedness and Size of Data Set. Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections. Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections. Annotations: "1000x Advantage", “Minutes to milliseconds”]
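
The idea behind index-free adjacency (slides 10 and 11) can be sketched in a few lines of Python. This is a toy in-memory model, not Neo4j's storage engine: the `Node` class, both helper functions and the sample data are hypothetical. It contrasts following stored neighbour references with a naive, unindexed join that re-scans a global edge table on every hop.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node that holds direct references to its neighbours."""
    name: str
    out: list = field(default_factory=list)  # adjacent Node objects


def reachable_via_pointers(start, hops):
    """Follow stored neighbour references; work depends only on the visited part."""
    frontier, seen = [start], {start.name}
    for _ in range(hops):
        nxt = []
        for node in frontier:
            for nb in node.out:  # direct reference, no lookup in a global structure
                if nb.name not in seen:
                    seen.add(nb.name)
                    nxt.append(nb)
        frontier = nxt
    return seen - {start.name}


def reachable_via_scan(edge_table, start, hops):
    """Naive join-style traversal: every hop scans the whole edge table."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        nxt = {dst for src, dst in edge_table if src in frontier and dst not in seen}
        seen |= nxt
        frontier = nxt
    return seen - {start}


# Hypothetical sample graph: a -> b -> c -> d
edges = [("a", "b"), ("b", "c"), ("c", "d")]
nodes = {n: Node(n) for pair in edges for n in pair}
for src, dst in edges:
    nodes[src].out.append(nodes[dst])

two_hops_pointers = reachable_via_pointers(nodes["a"], 2)
two_hops_scan = reachable_via_scan(edges, "a", 2)
```

Both functions return the same set of names; the difference is that the scan version touches every row of `edges` on each hop, while the pointer version only touches the neighbours of nodes it has already reached.
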
  • 12. The Largest Investment in Graph Databases
  • 14. Multi-Database: Use Cases
      • B2B SaaS: Greatly simplified management of DB infrastructure for your customers.
      • Multi-tenancy: A single instance of Neo4j Server/Cluster may serve multiple customers/users within an organization.
      • Rapid Testing/Development/Deployment: Manage separate databases for development, testing, staging, etc. in a single infrastructure.
      • Scalability: Disjoint data is organized in physically separate structures, with strong isolation.
      • Cloud-Friendly: Databases can be associated with cloud storage and easily detached from one server and attached to another.
  • 15. Configure and Manage Neo4j Multi-Database
      Administration commands:
      ● CREATE|DROP|START|STOP DATABASE name
      Use commands:
      ● HTTP API: http://server:port/.../database
      ● Browser & Cypher Shell: :USE database
      ● Drivers: Session(database)
      ● Browser:
      [Diagram labels: Network Mgmt, Customer Relations]
  • 18. Fabric: Distributed Graph Query
      • Scale-out model
      • Two ways of using:
          • Operate over single large, decomposed graph
          • Query across disjoint graphs, per business domain
      Data Scientists: Run analysis on large, distributed databases.
      Developers: Develop large scale applications on laptops/desktops and deploy in a network of Neo4j clusters.
      Enterprises: Keep data in designated geographies. Analyse graphs without replicating or moving them.
  • 19. Cypher Queries
      [Panel labels: SQL; Cypher in Neo4j]
      Cypher in Neo4j:
      MATCH (boss)-[:MANAGES*0..3]->(sub),
            (sub)-[:MANAGES*1..3]->(report)
      RETURN boss.name AS Boss, sub.name AS Subordinate, count(report) AS Total
  • 20. Multi-graph Cypher Queries
      [Panel labels: SQL; Cypher in Neo4j 4.0]
      Cypher in Neo4j 4.0:
      UNWIND corporate.graphIds() AS gid
      CALL {
        USE corporate.graph( gid )
        MATCH (boss)-[:MANAGES*0..3]->(sub),
              (sub)-[:MANAGES*1..3]->(report)
        RETURN boss.name AS Boss, sub.name AS Subordinate, count(report) AS Total
      }
      RETURN Boss, Subordinate, Total
      ORDER BY Total
      • Executes queries in parallel on multiple databases, combining or aggregating results.
      • Chains queries together from multiple databases for sophisticated real-time analyses.
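
The fan-out pattern in the multi-graph query on slide 20 (run the same subquery against each graph, then combine and order the rows) can be mimicked in plain Python. This is a toy sketch, not how Fabric is implemented: `GRAPHS`, `count_reports` and `query_all_graphs` are hypothetical stand-ins, each "graph" is just a manager-to-reports mapping, and for brevity it counts only direct reports rather than the variable-length `MANAGES` paths used in the Cypher.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: graph id -> {manager: [direct reports]}
GRAPHS = {
    0: {"Ann": ["Bo", "Cy"], "Bo": ["Di"]},
    1: {"Eve": ["Fay"]},
    2: {"Gus": ["Hal", "Ida", "Jo"]},
}


def count_reports(gid):
    """Per-graph subquery: one row (graph id, manager, total) per manager."""
    return [(gid, boss, len(reports)) for boss, reports in GRAPHS[gid].items()]


def query_all_graphs(graph_ids):
    """Run the subquery on every graph in parallel, then combine and sort."""
    with ThreadPoolExecutor() as pool:
        per_graph = list(pool.map(count_reports, graph_ids))
    rows = [row for result in per_graph for row in result]
    return sorted(rows, key=lambda row: row[2])  # mirrors ORDER BY Total


rows = query_all_graphs(sorted(GRAPHS))
```

The outer function plays the role of the `UNWIND ... CALL { USE ... }` wrapper: it knows nothing about the contents of each graph, only how to dispatch the subquery and merge what comes back.
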
  • 21. How will this help a Telco to scale? The foundation: Causal Cluster. The evolution: Fabric. [Diagram: "Large Data Volumes (CDRs, Network Metrics, Customer Metrics)" shown three times; label: Scaling R/W access]
  • 23. [Diagram labels: user; NEO4J DBMS (twice); Cluster A with Core 1, Core 2, Core 3, Replica 1, Replica 2; Cluster B with Core 1, Core 2, Core 3; Network Metrics databases NM1, NM2 and NM3 shown on the cluster members]
  • 24. Neo4j 4.0 Scalability in Action: Sharding the LDBC Social Network Benchmark. Data Model. (http://ldbcouncil.org/developer/snb and https://neo4j.com/fosdem20)
  • 25. Neo4j 4.0 Scalability in Action: Sharding the LDBC Social Network Benchmark (http://ldbcouncil.org/developer/snb and https://neo4j.com/fosdem20)
      • 1-shard for the Persons graph
      • N-shards for the Forums graph
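
The layout on slide 25 (one shard for Persons, N shards for Forums) implies some rule for deciding which Forums shard holds a given forum. The slides do not say what that rule is; the sketch below assumes simple hash partitioning on a forum id, and every name in it (`forum_shard`, `shard_for`, `N_FORUM_SHARDS`, the shard names) is invented for illustration.

```python
import zlib

N_FORUM_SHARDS = 4           # assumption: the "N" in "N-shards for the Forums graph"
PERSONS_SHARD = "persons"    # hypothetical name for the single Persons shard


def forum_shard(forum_id: str, n_shards: int = N_FORUM_SHARDS) -> str:
    """Map a forum id to one of n_shards using a stable CRC32 hash."""
    return f"forums-{zlib.crc32(forum_id.encode('utf-8')) % n_shards}"


def shard_for(kind: str, key: str) -> str:
    """Route a record: all persons go to one shard, forums are spread out."""
    if kind == "person":
        return PERSONS_SHARD
    if kind == "forum":
        return forum_shard(key)
    raise ValueError(f"unknown record kind: {kind}")


placement = {fid: shard_for("forum", fid) for fid in ("f1", "f2", "f3", "f4", "f5")}
```

CRC32 is used instead of Python's built-in `hash()` because string hashes are randomised per process, which would move forums between shards on every restart.
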
  • 26. Neo4j 4.0 Scalability in Action: Sharding the LDBC Social Network Benchmark (http://ldbcouncil.org/developer/snb and https://neo4j.com/fosdem20)
      • Up to 300x reduced latency
      • Up to 10x performance improvement
  • 28. Security and Data Privacy
      • Based on Role-based Access Control for graphs
      • Restrictions on what data can be seen by different users, applied to all database interactions
      • Implicit security view of the data for each user through schema-based security definitions
      • Grant/Deny permissions to traverse, read or write data based on node labels, relationship types or database and property names
      • Security rules are replicated across the cluster via roles that are associated with the users
      [Diagram labels: Bob, Joe; Baseline_Personnel_Security_Standard, Security_Check, Counter_Terrorism_Check, Developed_Vetting]
  • 29. Security and Data Privacy in Practice
  • 30. Constraints
      • Call Centre Agent:
          -> needs Doctor’s name
          -> not allowed to read diagnosis
      • Doctor:
          -> ability to view patient records
          -> ability to view patient diagnoses
  • 31. Security Config
      // Doctors get wide-ranging access
      GRANT ACCESS ON DATABASE healthcare TO doctor;
      GRANT TRAVERSE {*} ON GRAPH healthcare TO doctor;
      GRANT READ {*} ON GRAPH healthcare TO doctor;
      GRANT WRITE ON GRAPH healthcare TO doctor;
      // Agents get narrower access
      GRANT ACCESS ON DATABASE healthcare TO agent;
      GRANT TRAVERSE {*} ON GRAPH healthcare TO agent;
      GRANT READ {Name} ON GRAPH healthcare NODES Doctor TO agent;
      GRANT READ {Name} ON GRAPH healthcare NODES Patient TO agent;
  • 32. Call Centre Agent
      MATCH (:CallcenterAgent {name: 'Alice'})
            <-[:CALLED]-(p:Patient)-[:HAS_DIAGNOSIS]-(dia)
            <-[:ESTABLISHED]-(d:Doctor)
      RETURN p.name, d.name, dia.name;
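
A toy Python model of how the grants on slide 31 shape what each role can see in the query on slide 32. This is an illustration only, not Neo4j's security engine: `READ_GRANTS`, `can_read`, `visible`, the `Diagnosis` label and the sample records are hypothetical, property keys are written in lower case for simplicity, and the sketch assumes that a property a role may not read simply comes back as a missing value (`None`).

```python
# Hypothetical grant table: role -> node label -> readable property keys ("*" = all)
READ_GRANTS = {
    "doctor": {"*": {"*"}},
    "agent": {"Doctor": {"name"}, "Patient": {"name"}},
}


def can_read(role, label, key):
    """True if the role holds a read grant covering this label and property."""
    grants = READ_GRANTS.get(role, {})
    for granted_label in (label, "*"):
        keys = grants.get(granted_label, set())
        if key in keys or "*" in keys:
            return True
    return False


def visible(role, label, props):
    """Return the properties as the role sees them; unreadable ones become None."""
    return {k: (v if can_read(role, label, k) else None) for k, v in props.items()}


# Hypothetical sample data
patient = ("Patient", {"name": "Pat", "ssn": "123"})
diagnosis = ("Diagnosis", {"name": "Flu"})

agent_view = [visible("agent", *patient), visible("agent", *diagnosis)]
doctor_view = [visible("doctor", *patient), visible("doctor", *diagnosis)]
```

In this model the agent can still reach the diagnosis node (traversal is not restricted here) but gets no value for its name, which matches the constraint that the agent may see names of doctors and patients and nothing about the diagnosis.
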
  • 34. Reactive Architecture
      • Flow control throughout the stack, allowing for the client application to fully control the production and flow of records within a result
      • Synchronous/Asynchronous execution
      • Based on reactive streams with non-blocking backpressure library
      • Client applications can pull or discard the whole result or N elements
      • Can also be gracefully cancelled
      • Exposed through a reactive API in Drivers v4.0
      • Use Cases:
          • Long queries with large result sets
          • Paged results
          • Thin/small clients
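
The pull-based flow control described on slide 34 can be illustrated with a Python generator: nothing is produced until the consumer asks for it, the consumer decides how many records to take, and it can stop early. This is a language-level analogy only, not the reactive driver API; `result_stream`, `pull` and the `produced` list are hypothetical.

```python
from itertools import islice

produced = []  # records the producer has actually generated so far


def result_stream(total):
    """Lazily yield records; each one is generated only when requested."""
    for i in range(total):
        produced.append(i)
        yield {"row": i}


def pull(stream, n):
    """Pull at most n records from the stream."""
    return list(islice(stream, n))


stream = result_stream(1_000_000)
first_page = pull(stream, 3)     # consumer asks for 3 records
second_page = pull(stream, 2)    # and later for 2 more
stream.close()                   # discard the rest without producing it
leftover = pull(stream, 10)      # a closed stream yields nothing
```

Although the stream could deliver a million rows, only the five that were pulled are ever generated, which is the property that makes paged results and thin clients practical.
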
  • 35. Graph Data Science
      Science-driven approach to gain knowledge from the relationships and structures in data, typically to power predictions. Uses multi-disciplinary workflows that may include queries, statistics, algorithms and machine learning.
      • Graph Recipes & Analytics: Answers specific questions to gain insights from connections in existing/historical data. Approaches typically include global queries and algorithms and direct use of results.
      • Graph Enhanced ML & AI: Training models (ML) with graph structured data to be used to emulate human, probabilistic decisions within a solution/application (AI system).
  • 36. The Graph Data Science Library
      • Optimized for Analytics: Leverage custom data structures optimized for global traversals and aggregation. Flexibly decompose and reshape your graph for specific use cases.
      • Algorithms for Insights: Robust algorithms that are highly parallelized and scale to billions of nodes. Early access to dozens of experimental implementations.
      • Intuitive Interface: Drastically simplified and standardized API that enables custom configurations. Documentation, training, and examples so getting started is simple.
      • Product Supported & Under Active Development
  • 37. Graph Data Science
      Analytics projections:
      - Specialized data structure for algorithms, capable of supporting billions of nodes
      - Cypher loaders for experimentation
      - Quickly reshape, combine, aggregate, and deduplicate your transactional data
      - Support for multiple node labels, relationship types, and properties
      - Manage multiple in-memory analytics graphs for different workloads
      - Memory footprint allowing large scale use
      Graph algorithms & more:
      - 40+ algorithms in 5 categories: community, centrality, similarity, pathfinding, and link prediction
      - Helper algorithms like graph generation, one hot encoding, and random walk
      - Early previews to new implementations in the alpha & beta name spaces
      - Supported, scalable algorithms include seeding, determinism, and incremental calculations
      - Estimate mode for memory requirements
  • 38. Graph Data Science Algorithms (Generally Unsupervised)
      A subset of data science algorithms that come from network science, Graph Algorithms enable reasoning about network structure.
      • Pathfinding and Search
      • Centrality (Importance)
      • Community Detection
      • Heuristic Link Prediction
      • Similarity
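
As one concrete instance of the Centrality category, here is a minimal PageRank by power iteration in plain Python. It is a textbook sketch for intuition, not the Graph Data Science library's implementation or API; the `pagerank` function, the damping factor of 0.85, the fixed iteration count and the sample graph are assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on {node: [successors]}; returns {node: score}."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical sample graph: everyone links to "hub"; "hub" links back to "a"
graph = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = pagerank(graph)
most_central = max(scores, key=scores.get)
```

The scores form a probability distribution over nodes, so they can be compared directly: a node scores highly when many nodes, or a few highly ranked ones, link to it.
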
  • 39. Conclusions
      • Neo4j provides
          • Scalability for Telcos
          • Carrier-grade high availability with Causal Cluster
          • Security features to fulfill privacy requirements
          • Graph Analytics to provide Data Science infrastructure for Telcos
  • 40. Scalability and Graph Analytics with Neo4j Stefan Kolmar VP Field Engineering - Neo4j