The Neo4j graph database is the fastest growing database engine in the market and has hundreds of customer references across Europe and globally, solving significant technology problems for large enterprises in the finance, telco, retail, utilities, logistics and internet sectors. Typical use cases are recommendations, fraud detection, master data management (MDM), network and software analysis and optimization, and identity and access management.
2. Who am I?
• Rik Van Bruggen
• 5y of graph madness
• Lots of blogging, writing, podcasting
• Blog.bruggen.com
• Learningneo4j.net
• Graphistania.com
• @rvanbruggen
• Don’t forget to #graphday!
3. Neo4j: The Graph Database Leader
[Figure: timeline of Neo4j "firsts", 2000 to 2017. The year of each milestone is not recoverable from the extracted text.]
• Invented property graph model
• First native graph DB in 24/7 production
• First modern open-source commercial graph DB
• Introduced graph DB as a NoSQL category
• First and only declarative query language for property graph
• Extended graph data model to labeled property graph
• Published O’Reilly book on graph DB
• First visual development environment for graphs
• First cost-based graph query optimizer
• Launched openCypher as “SQL for graphs” standard
• First database with native graph storage and processing
• First built-in graph ETL in Cypher
• V3.1: Security Foundation for data security and compliance; introduced 3rd-gen clustering architecture with causal consistency
• V3.2: Multi-data center support with network topology awareness; scale, performance, governance for global internet apps
9. Relational Database
A way of representing data
Good for:
• Well-understood data structures that don’t change too frequently
• Known problems involving discrete parts of the data, or minimal connectivity
10. Graph Database
A way of representing data
Good for:
• Dynamic systems: where the data topology is difficult to predict
• Dynamic requirements: those that evolve with the business
• Problems where the relationships in data contribute meaning & value
Relational Database
Good for:
• Well-understood data structures that don’t change too frequently
• Known problems involving discrete parts of the data, or minimal connectivity
21. Neo4j Use Cases
Real Time Recommendations
Master Data Management
Fraud Detection
Identity & Access Management
Graph Based Search
Network & IT-Operations
22. “As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.”
Marcos Wada, Software Developer, Walmart
23. Neo4j is the heart of Cisco HMP: used for
governance and single source of truth and a
one-stop shop for all of Cisco’s hierarchies.
24. “Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.”
Large Intelligence Agency
Cyber Security Expert
25. Uses Neo4j to manage the digital assets inside
of its next generation in-flight entertainment
system.
26. Uses Neo4j for network topology
analysis for big telco service
providers
27. UBS was the recipient of the 2014 Graphie Award for “Best Identity And Access Management App”
29. Neo4j 3.1 in Review
• Security Foundation
• Causal Clustering: State-of-the-Art Distributed Architecture for Graphs
30. Introducing Neo4j 3.2
May 2017 GA
• Enterprise scale for global applications
• Continuous improvement in native performance
• Enterprise governance for the connected enterprise
31. Enterprise Scale for Global Applications
Causal Clusters can now span data centers
• Clusters can be subdivided into groups and spread
across DCs
• Read-time choice of consistency at global scale:
“Read Any”, “Read-your-own-Writes”
Tiered Subclusters boost performance
• Speeds local reads and writes
• Replica servers pull from nearest
replicas minimizing WAN traffic
Topology-aware stack insulates developers & apps
from the many complexities of clustering
Improved Cloud Delivery via RPM, Azure and AWS EC2
[Figure: cluster servers divided into a dc1 group and a dc2 group]
32. New in Neo4j 3.2
Multi-Data Center Support for Global Internet Apps
Support global-scale apps across continental data centers—via a single switch
• Each server in a Global Causal Cluster is aware of its role in the topology
• Local data-center load balancing drives performance and availability
• Local tiered hierarchies speed updates
[Figure: server groups labeled sa, uk, us_east and hk]
33. Groups can include cores or just tiered replicas
Hierarchical Replica Server Updates
[Figure: core servers (C) updating tiers of read replicas (R)]
34. Fast, Local Reads and Writes with
Global Causal Consistency Across the Cluster
Reads occur at the highest speed from a local replica server,
which gets refreshed by local cores
[Figure: core servers (C) and read replicas (R), some replicas used for analysis, with READ and WRITE paths marked]
Writes are written to a local core server, which propagates
the new data to other local cores, and then to remote core servers
35. Global Read-Your-Own-Writes:
Choices at Read time: Immediacy Or Full Consistency
Readers can choose between immediate access to Replica data
or waiting for any pending writes to propagate to the Replica
Neo4j drivers maintain knowledge of server locations and
transaction IDs so developers and users don’t need to
[Figure: core servers (C) and read replicas (R), some replicas used for analysis, with READ and WRITE paths marked]
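The read-time choice described on this slide can be illustrated with a small simulation. This is only a sketch of the idea, not Neo4j's driver or cluster code: the `Core` and `Replica` classes, the integer transaction IDs and the `bookmark` parameter are all hypothetical names introduced for illustration, and "waiting for propagation" is simulated by having the replica pull immediately.

```python
class Core:
    """Hypothetical core server: keeps an ordered log of committed writes."""

    def __init__(self):
        self.log = []  # position + 1 serves as the transaction ID

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)  # transaction ID the client can keep as a bookmark


class Replica:
    """Hypothetical read replica that applies the core's log asynchronously."""

    def __init__(self, core):
        self.core = core
        self.applied = 0  # highest transaction ID applied so far
        self.data = {}

    def pull(self):
        # Apply every log entry that has not been applied yet.
        for key, value in self.core.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.core.log)

    def read(self, key, bookmark=None):
        # No bookmark: answer immediately from whatever has been applied.
        # With a bookmark: catch up to that transaction first. A real system
        # would wait for propagation; this sketch simply pulls on the spot.
        if bookmark is not None and self.applied < bookmark:
            self.pull()
        return self.data.get(key)


core = Core()
replica = Replica(core)
bookmark = core.write("name", "Rik")

stale = replica.read("name")                     # immediate, may miss the write
fresh = replica.read("name", bookmark=bookmark)  # read-your-own-writes
```

In this toy model the first read returns nothing because the replica has not caught up, while the bookmarked read sees the write. The slide's point is that the drivers track this bookkeeping so application code does not have to.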
36. Enterprise Governance
Neo4j is IT friendly
Node Keys: new type of schema constraint
• Tied to labels, nodes can have any number of Node Keys
• Ensure graph integrity by enforcing existence and uniqueness
• Improves data exchange across multiple data sources
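What "enforcing existence and uniqueness" means for a Node Key can be sketched in a few lines. The code below is a toy in-memory model, not Neo4j code: `LabelStore`, `NodeKeyViolation` and the property names are hypothetical, and a single key per label is assumed to keep the example short.

```python
class NodeKeyViolation(Exception):
    """Raised when a node would break the key constraint (hypothetical name)."""


class LabelStore:
    """Toy store for nodes of one label, guarded by one composite key."""

    def __init__(self, label, key_properties):
        self.label = label
        self.key_properties = tuple(key_properties)
        self._by_key = {}  # key tuple -> node properties

    def create(self, **props):
        # Existence: every key property must be present and non-null.
        missing = [p for p in self.key_properties if props.get(p) is None]
        if missing:
            raise NodeKeyViolation(f"{self.label} is missing {missing}")
        # Uniqueness: no other node may carry the same combination of values.
        key = tuple(props[p] for p in self.key_properties)
        if key in self._by_key:
            raise NodeKeyViolation(f"{self.label} already has key {key}")
        self._by_key[key] = props
        return props

    def __len__(self):
        return len(self._by_key)


people = LabelStore("Person", ["first_name", "surname"])
people.create(first_name="Rik", surname="Van Bruggen", blog="blog.bruggen.com")
```

A second `create` with the same first name and surname, or one without a surname, raises `NodeKeyViolation`. That combination of checks is what makes the key usable as a stable identifier when data from several sources is merged.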
Kerberos encrypted-authentication module add-on
• Supports three-tier integration of client, directory
and database
Causal Clustering available on CAPI-Flash hardware
from IBM Power8 via add-on
Better metrics in Query Monitor to reveal query
behavior and resource consumption
37. Native Graph Performance Improvements
• Native Label index improves write speed by 30-300%
• Composite indexes supercharge lookup speeds
• Cypher’s depth query in DISTINCT function
eliminates repetitious traversals through
deep levels creating exponential time savings
• New Compiled Cypher runtime in Enterprise
Edition to speed common queries by 300%
• Cost-based optimizer replaces rule-based
optimizer (which has been deprecated)
• Snappier Neo4j Browser with new more flexible
JavaScript framework
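The claim about DISTINCT and deep traversals can be made concrete with a general illustration. The sketch below does not show how Cypher implements this; it only shows why asking for distinct end nodes allows a traversal to remember what it has already reached instead of walking every path. The layered example graph and all function names are assumptions made for the demonstration.

```python
def build_layers(depth):
    """Acyclic example graph: each layer has 2 nodes, fully linked to the next."""
    graph = {"start": [(1, 0), (1, 1)]}
    for layer in range(1, depth):
        for slot in (0, 1):
            graph[(layer, slot)] = [(layer + 1, 0), (layer + 1, 1)]
    return graph


def reach_by_paths(graph, start, depth):
    """Walk every path up to `depth` hops, counting relationship expansions."""
    seen, expansions = set(), 0
    stack = [(start, 0)]
    while stack:
        node, hops = stack.pop()
        if hops == depth:
            continue
        for nxt in graph.get(node, []):
            expansions += 1
            seen.add(nxt)
            stack.append((nxt, hops + 1))
    return seen, expansions


def reach_distinct(graph, start, depth):
    """Breadth-first search that expands each newly reached node only once."""
    seen, expansions = set(), 0
    frontier = {start}
    for _ in range(depth):
        next_frontier = set()
        for node in frontier:
            for nxt in graph.get(node, []):
                expansions += 1
                if nxt not in seen:
                    seen.add(nxt)
                    next_frontier.add(nxt)
        frontier = next_frontier
    return seen, expansions


graph = build_layers(10)
nodes_a, work_paths = reach_by_paths(graph, "start", 10)
nodes_b, work_distinct = reach_distinct(graph, "start", 10)
```

Both functions find the same 20 nodes, but on this graph the path walk performs 2046 expansions against 38 for the deduplicating search, and the gap doubles with every extra level.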
38. One More Thing!
New Cypher Editor in Neo4j Browser
Syntax Highlighting
Auto Complete for Labels, Relationship
Types, Properties, and Variables
Command Auto Complete
Over a decade of leadership in the graph space. The seed was planted back in 2000, when our founders invented the property graph model, but it wasn’t until 2010 that we contributed the first GA version, Neo4j 1.0, to the open source community and started building a commercial engine around it. We have had a series of firsts: we introduced Cypher, the first and only declarative query language for property graphs, and launched GraphConnect and the O’Reilly book to build out the category. The market rewarded us with commercial success, and by the end of 2015 we had 150 paying customers and 50k monthly downloads. The V3.1 and 3.2 releases make Neo4j ready to develop and deploy mission-critical, internet-based, enterprise graph applications.
First, not everyone in the room would know what a graph is.
What this means for your data structure
A graph is connected data.
That essentially means data points that have relationships with other data points.
Or a hotel that has rooms, which have availability
Or it could be people who know other people, who in turn know other people: people who studied together, who work at the same place, who studied with others who work somewhere else, and so on.
…forming an extremely powerful foundation from which you can derive value.
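To make "connected data" concrete, here is a minimal sketch of the two examples from these notes (a hotel with rooms and availability, and people who know people) as a toy in-memory graph. It is not Neo4j code; `TinyGraph`, the relationship types and the node names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class TinyGraph:
    """Toy property graph: nodes with properties, typed directed relationships."""

    def __init__(self):
        self.nodes = {}               # node id -> property dict
        self.out = defaultdict(list)  # node id -> [(relationship type, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def relate(self, source, rel_type, target):
        self.out[source].append((rel_type, target))

    def neighbours(self, node_id, rel_type):
        return [t for r, t in self.out.get(node_id, []) if r == rel_type]


def friends_of_friends(graph, person):
    """People two KNOWS hops away who are not already direct contacts."""
    direct = set(graph.neighbours(person, "KNOWS"))
    second = set()
    for friend in direct:
        second.update(graph.neighbours(friend, "KNOWS"))
    return sorted(second - direct - {person})


g = TinyGraph()

# A hotel that has rooms, which have availability.
g.add_node("hotel", kind="Hotel")
g.add_node("room101", kind="Room")
g.add_node("june1", kind="Availability")
g.relate("hotel", "HAS_ROOM", "room101")
g.relate("room101", "AVAILABLE_ON", "june1")

# People who know other people, who know other people.
for name in ("ann", "bob", "cat", "dan"):
    g.add_node(name, kind="Person")
g.relate("ann", "KNOWS", "bob")
g.relate("bob", "KNOWS", "ann")
g.relate("bob", "KNOWS", "cat")
g.relate("cat", "KNOWS", "dan")

suggestions = friends_of_friends(g, "ann")
```

Here `suggestions` contains only `"cat"`: the value comes from following relationships, not from any single record. A graph database stores and queries relationships like these directly rather than reconstructing them at query time.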
And deriving value from data-relationships is exactly what some of the most successful companies in the world have done.
Google created perhaps the most valuable advertising system of all time on top of its search engine, which is based on relationships between web pages.
LinkedIn created perhaps the most valuable HR tool ever, based on relationships among professionals.
And this is also what PayPal did, creating a peer-to-peer transaction service based on relationships.
Establishes Neo4j as the enterprise standard graph technology