SlideShare uma empresa Scribd logo
1 de 43
Introduction to
Graph Databases
  Chicago Graph Database Meet-Up
          Max De Marzi
About Me
    Built the Neography Gem (Ruby
    Wrapper to the Neo4j REST API)
    Playing with Neo4j since 10/2009


•   My Blog: http://maxdemarzi.com
•   Find me on Twitter: @maxdemarzi
•   Email me: maxdemarzi@gmail.com
•   GitHub: http://github.com/maxdemarzi
Agenda
•   Trends in Data
•   NOSQL
•   What is a Graph?
•   What is a Graph Database?
•   What is Neo4j?
Trends in Data
Data is getting bigger:
“Every 2 days we
create as much
information as we did
up to 2003”

– Eric Schmidt, Google
Data is more connected:
•   Text (content)
•   HyperText (added pointers)
•   RSS (joined those pointers)
•   Blogs (added pingbacks)
•   Tagging (grouped related data)
•   RDF (described connected data)
•   GGG (content + pointers + relationships +
    descriptions)
Data is more Semi-Structured:
• If you tried to collect all the data of every
  movie ever made, how would you model it?
• Actors, Characters, Locations, Dates, Costs,
  Ratings, Showings, Ticket Sales, etc.
NOSQL
Not Only SQL
Less than 10% of the NOSQL Vendors
Key Value Stores
• Most Based on Dynamo: Amazon Highly
  Available Key-Value Store
• Data Model:
  – Global key-value mapping
  – Big scalable HashMap
  – Highly fault tolerant (typically)
• Examples:
  – Redis, Riak, Voldemort
Key Value Stores: Pros and Cons
• Pros:
  – Simple data model
  – Scalable
• Cons
  – Create your own “foreign keys”
  – Poor for complex data
Column Family
• Most Based on BigTable: Google’s Distributed
  Storage System for Structured Data
• Data Model:
  – A big table, with column families
  – Map Reduce for querying/processing
• Examples:
  – HBase, HyperTable, Cassandra
Column Family: Pros and Cons
• Pros:
  – Supports Simi-Structured Data
  – Naturally Indexed (columns)
  – Scalable
• Cons
  – Poor for interconnected data
Document Databases
• Data Model:
  – A collection of documents
  – A document is a key value collection
  – Index-centric, lots of map-reduce
• Examples:
  – CouchDB, MongoDB
Document Databases: Pros and Cons
• Pros:
  – Simple, powerful data model
  – Scalable
• Cons
  – Poor for interconnected data
  – Query model limited to keys and indexes
  – Map reduce for larger queries
Graph Databases
• Data Model:
  – Nodes and Relationships
• Examples:
  – Neo4j, OrientDB, InfiniteGraph, AllegroGraph
Graph Databases: Pros and Cons
• Pros:
  – Powerful data model, as general as RDBMS
  – Connected data locally indexed
  – Easy to query
• Cons
  – Sharding ( lots of people working on this)
     • Scales UP reasonably well
  – Requires rewiring your brain
Living in a NOSQL World
                                  RDBMS
                                Graph
                               Databases
Complexity




                                           Document
                                           Databases




                                                       BigTable
                                                        Clones

                                                                  Key-Value
             Relational                                             Store
             Databases




                           90% of
                          Use Cases
                                           Size
What is a Graph?
What is a Graph?
• An abstract representation of a set of objects
  where some pairs are connected by links.

             Object (Vertex, Node)

             Link (Edge, Arc, Relationship)
Different Kinds of Graphs
• Undirected Graph
• Directed Graph

• Pseudo Graph
• Multi Graph

• Hyper Graph
More Kinds of Graphs
• Weighted Graph

• Labeled Graph

• Property Graph
What is a Graph Database?
• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of
  a local step (or hop) remains the same
• Plus an Index for lookups
Compared to Relational Databases
 Optimized for aggregation   Optimized for connections
Compared to Key Value Stores
Optimized for simple look-ups   Optimized for traversing connected data
Compared to Key Value Stores
Optimized for “trees” of data   Optimized for seeing the forest and the
                                trees, and the branches, and the trunks
What is Neo4j?
What is Neo4j?
• A Graph Database + Lucene Index
• Property Graph
• Full ACID
  (atomicity, consistency, isolation, durability)
• High Availability (with Enterprise Edition)
• 32 Billion Nodes, 32 Billion Relationships,
  64 Billion Properties
• Embedded Server
• REST API
Good For
• Highly connected data (social networks)
• Recommendations (e-commerce)
• Path Finding (how do I know you?)

• A* (Least Cost path)
• Data First Schema (bottom-up, but you still
  need to design)
Property Graph
// then traverse to find results
    start n=(people-index, name, “Andreas”)
    match (n)--()--(foaf) return foaf




n
Cypher
Pattern Matching Query Language (like SQL for graphs)
 // get node 0

 start a=(0) return a

 // traverse from node 1

 start a=(1) match (a)-->(b) return b

 // return friends of friends

 start a=(1) match (a)--()--(c) return c
Gremlin
A Graph Scripting DSL (groovy-based)
 // get node 0

 g.v(0)

 // nodes with incoming relationship

 g.v(0).in

 // outgoing “KNOWS” relationship

 g.v(0).out(“KNOWS”)
If you’ve ever
•   Joined more than 7 tables together
•   Modeled a graph in a table
•   Written a recursive CTE
•   Tried to write some crazy stored procedure
    with multiple recursive self and inner joins

    You should use Neo4j
Language    LanguageCountry          Country

language_code     language_code      country_code
language_name     country_code       country_name
word_count        primary            flag_uri




       Language                             Country

name                                 name
                    IS_SPOKEN_IN
code                                 code
word_count           as_primary      flag_uri
name: “Canada”
                 languages_spoken: “[ „English‟, „French‟ ]”




                           language:“English”     spoken_in
                                                               name: “USA”




name: “Canada”




                 language:“French”    spoken_in
                                                     name: “France”
Country

                 name
                 flag_uri
                 language_name
                 number_of_words
                 yes_in_langauge
                 no_in_language
                 currency_code
                 currency_name

       Country
                                          Language
name                               name
flag_uri                SPEAKS
                                   number_of_words
                                   yes
                                   no
                        Currency
                   code
                   name
Neo4j Data Browser
Neo4j Console
console.neo4j.org
Try it right now:
start n=node(*) match n-[r:LOVES]->m return n, type(r), m
Notice the two nodes in red, they are your result set.
What does a Graph look like?
Questions?




  ?
Thank you!
 http://maxdemarzi.com

Mais conteúdo relacionado

Mais procurados

Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
Neo4j
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph Databases
InfiniteGraph
 

Mais procurados (20)

Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4j
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
 
Graph Databases for Master Data Management
Graph Databases for Master Data ManagementGraph Databases for Master Data Management
Graph Databases for Master Data Management
 
Introducing Neo4j
Introducing Neo4jIntroducing Neo4j
Introducing Neo4j
 
Introduction: Relational to Graphs
Introduction: Relational to GraphsIntroduction: Relational to Graphs
Introduction: Relational to Graphs
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
Graph databases
Graph databasesGraph databases
Graph databases
 
Document Database
Document DatabaseDocument Database
Document Database
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph Databases
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypher
 
Introduction to Graph Database
Introduction to Graph DatabaseIntroduction to Graph Database
Introduction to Graph Database
 
Intro to Neo4j
Intro to Neo4jIntro to Neo4j
Intro to Neo4j
 
Neo4J : Introduction to Graph Database
Neo4J : Introduction to Graph DatabaseNeo4J : Introduction to Graph Database
Neo4J : Introduction to Graph Database
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Neo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic trainingNeo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic training
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data ScienceGet Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
Get Started with the Most Advanced Edition Yet of Neo4j Graph Data Science
 
Neo4j Presentation
Neo4j PresentationNeo4j Presentation
Neo4j Presentation
 

Destaque

Neo4j - 5 cool graph examples
Neo4j - 5 cool graph examplesNeo4j - 5 cool graph examples
Neo4j - 5 cool graph examples
Peter Neubauer
 
Introduction to Gremlin
Introduction to GremlinIntroduction to Gremlin
Introduction to Gremlin
Max De Marzi
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
BigBlueHat
 
Introduction to graph databases GraphDays
Introduction to graph databases  GraphDaysIntroduction to graph databases  GraphDays
Introduction to graph databases GraphDays
Neo4j
 

Destaque (17)

Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
 
Getting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4jGetting started with Graph Databases & Neo4j
Getting started with Graph Databases & Neo4j
 
Neo4j - 5 cool graph examples
Neo4j - 5 cool graph examplesNeo4j - 5 cool graph examples
Neo4j - 5 cool graph examples
 
Introduction to Gremlin
Introduction to GremlinIntroduction to Gremlin
Introduction to Gremlin
 
Relational vs. Non-Relational
Relational vs. Non-RelationalRelational vs. Non-Relational
Relational vs. Non-Relational
 
Semantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational DatabasesSemantic Graph Databases: The Evolution of Relational Databases
Semantic Graph Databases: The Evolution of Relational Databases
 
Graph Database, a little connected tour - Castano
Graph Database, a little connected tour - CastanoGraph Database, a little connected tour - Castano
Graph Database, a little connected tour - Castano
 
Designing and Building a Graph Database Application – Architectural Choices, ...
Designing and Building a Graph Database Application – Architectural Choices, ...Designing and Building a Graph Database Application – Architectural Choices, ...
Designing and Building a Graph Database Application – Architectural Choices, ...
 
Graph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBayGraph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBay
 
Converting Relational to Graph Databases
Converting Relational to Graph DatabasesConverting Relational to Graph Databases
Converting Relational to Graph Databases
 
Relational to Graph - Import
Relational to Graph - ImportRelational to Graph - Import
Relational to Graph - Import
 
Neo4j - graph database for recommendations
Neo4j - graph database for recommendationsNeo4j - graph database for recommendations
Neo4j - graph database for recommendations
 
Lju Lazarevic
Lju LazarevicLju Lazarevic
Lju Lazarevic
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
Introduction to graph databases GraphDays
Introduction to graph databases  GraphDaysIntroduction to graph databases  GraphDays
Introduction to graph databases GraphDays
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 

Semelhante a Introduction to Graph Databases

Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4j
Sina Khorami
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
George Stathis
 
NOSQL Databases for the .NET Developer
NOSQL Databases for the .NET DeveloperNOSQL Databases for the .NET Developer
NOSQL Databases for the .NET Developer
Jesus Rodriguez
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
Korea Sdec
 

Semelhante a Introduction to Graph Databases (20)

NoSQL, Neo4J for Java Developers , OracleWeek-2012
NoSQL, Neo4J for Java Developers , OracleWeek-2012NoSQL, Neo4J for Java Developers , OracleWeek-2012
NoSQL, Neo4J for Java Developers , OracleWeek-2012
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
managing big data
managing big datamanaging big data
managing big data
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011
 
Gerry McNicol Graph Databases
Gerry McNicol Graph DatabasesGerry McNicol Graph Databases
Gerry McNicol Graph Databases
 
Betabit - syrwag 2018-03-28
Betabit - syrwag 2018-03-28Betabit - syrwag 2018-03-28
Betabit - syrwag 2018-03-28
 
Graph Database and Neo4j
Graph Database and Neo4jGraph Database and Neo4j
Graph Database and Neo4j
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Graph Databases & OrientDB
Graph Databases & OrientDBGraph Databases & OrientDB
Graph Databases & OrientDB
 
How Graph Databases used in Police Department?
How Graph Databases used in Police Department?How Graph Databases used in Police Department?
How Graph Databases used in Police Department?
 
NOSQL Databases for the .NET Developer
NOSQL Databases for the .NET DeveloperNOSQL Databases for the .NET Developer
NOSQL Databases for the .NET Developer
 
Intro to Neo4j with Ruby
Intro to Neo4j with RubyIntro to Neo4j with Ruby
Intro to Neo4j with Ruby
 
No SQL : Which way to go? Presented at DDDMelbourne 2015
No SQL : Which way to go?  Presented at DDDMelbourne 2015No SQL : Which way to go?  Presented at DDDMelbourne 2015
No SQL : Which way to go? Presented at DDDMelbourne 2015
 
NoSQL, which way to go?
NoSQL, which way to go?NoSQL, which way to go?
NoSQL, which way to go?
 
Hadoop with Python
Hadoop with PythonHadoop with Python
Hadoop with Python
 
Intro to Graphs for Fedict
Intro to Graphs for FedictIntro to Graphs for Fedict
Intro to Graphs for Fedict
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 
Graph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft EcosystemGraph Databases in the Microsoft Ecosystem
Graph Databases in the Microsoft Ecosystem
 

Mais de Max De Marzi

Mais de Max De Marzi (20)

DataDay 2023 Presentation
DataDay 2023 PresentationDataDay 2023 Presentation
DataDay 2023 Presentation
 
DataDay 2023 Presentation - Notes
DataDay 2023 Presentation - NotesDataDay 2023 Presentation - Notes
DataDay 2023 Presentation - Notes
 
Developer Intro Deck-PowerPoint - Download for Speaker Notes
Developer Intro Deck-PowerPoint - Download for Speaker NotesDeveloper Intro Deck-PowerPoint - Download for Speaker Notes
Developer Intro Deck-PowerPoint - Download for Speaker Notes
 
Outrageous Ideas for Graph Databases
Outrageous Ideas for Graph DatabasesOutrageous Ideas for Graph Databases
Outrageous Ideas for Graph Databases
 
Neo4j Training Cypher
Neo4j Training CypherNeo4j Training Cypher
Neo4j Training Cypher
 
Neo4j Training Modeling
Neo4j Training ModelingNeo4j Training Modeling
Neo4j Training Modeling
 
Neo4j Training Introduction
Neo4j Training IntroductionNeo4j Training Introduction
Neo4j Training Introduction
 
Detenga el fraude complejo con Neo4j
Detenga el fraude complejo con Neo4jDetenga el fraude complejo con Neo4j
Detenga el fraude complejo con Neo4j
 
Data Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4jData Modeling Tricks for Neo4j
Data Modeling Tricks for Neo4j
 
Fraud Detection and Neo4j
Fraud Detection and Neo4j Fraud Detection and Neo4j
Fraud Detection and Neo4j
 
Detecion de Fraude con Neo4j
Detecion de Fraude con Neo4jDetecion de Fraude con Neo4j
Detecion de Fraude con Neo4j
 
Neo4j Data Science Presentation
Neo4j Data Science PresentationNeo4j Data Science Presentation
Neo4j Data Science Presentation
 
Neo4j Stored Procedure Training Part 2
Neo4j Stored Procedure Training Part 2Neo4j Stored Procedure Training Part 2
Neo4j Stored Procedure Training Part 2
 
Neo4j Stored Procedure Training Part 1
Neo4j Stored Procedure Training Part 1Neo4j Stored Procedure Training Part 1
Neo4j Stored Procedure Training Part 1
 
Decision Trees in Neo4j
Decision Trees in Neo4jDecision Trees in Neo4j
Decision Trees in Neo4j
 
Neo4j y Fraude Spanish
Neo4j y Fraude SpanishNeo4j y Fraude Spanish
Neo4j y Fraude Spanish
 
Data modeling with neo4j tutorial
Data modeling with neo4j tutorialData modeling with neo4j tutorial
Data modeling with neo4j tutorial
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j Fundamentals
 
Fraud Detection Class Slides
Fraud Detection Class SlidesFraud Detection Class Slides
Fraud Detection Class Slides
 
Neo4j in Depth
Neo4j in DepthNeo4j in Depth
Neo4j in Depth
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Último (20)

ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Introduction to Graph Databases

  • 1. Introduction to Graph Databases Chicago Graph Database Meet-Up Max De Marzi
  • 2. About Me Built the Neography Gem (Ruby Wrapper to the Neo4j REST API) Playing with Neo4j since 10/2009 • My Blog: http://maxdemarzi.com • Find me on Twitter: @maxdemarzi • Email me: maxdemarzi@gmail.com • GitHub: http://github.com/maxdemarzi
  • 3. Agenda • Trends in Data • NOSQL • What is a Graph? • What is a Graph Database? • What is Neo4j?
  • 5. Data is getting bigger: “Every 2 days we create as much information as we did up to 2003” – Eric Schmidt, Google
  • 6. Data is more connected: • Text (content) • HyperText (added pointers) • RSS (joined those pointers) • Blogs (added pingbacks) • Tagging (grouped related data) • RDF (described connected data) • GGG (content + pointers + relationships + descriptions)
  • 7. Data is more Semi-Structured: • If you tried to collect all the data of every movie ever made, how would you model it? • Actors, Characters, Locations, Dates, Costs, Ratings, Showings, Ticket Sales, etc.
  • 9. Less than 10% of the NOSQL Vendors
  • 10. Key Value Stores • Most Based on Dynamo: Amazon Highly Available Key-Value Store • Data Model: – Global key-value mapping – Big scalable HashMap – Highly fault tolerant (typically) • Examples: – Redis, Riak, Voldemort
  • 11. Key Value Stores: Pros and Cons • Pros: – Simple data model – Scalable • Cons – Create your own “foreign keys” – Poor for complex data
  • 12. Column Family • Most Based on BigTable: Google’s Distributed Storage System for Structured Data • Data Model: – A big table, with column families – Map Reduce for querying/processing • Examples: – HBase, HyperTable, Cassandra
  • 13. Column Family: Pros and Cons • Pros: – Supports Simi-Structured Data – Naturally Indexed (columns) – Scalable • Cons – Poor for interconnected data
  • 14. Document Databases • Data Model: – A collection of documents – A document is a key value collection – Index-centric, lots of map-reduce • Examples: – CouchDB, MongoDB
  • 15. Document Databases: Pros and Cons • Pros: – Simple, powerful data model – Scalable • Cons – Poor for interconnected data – Query model limited to keys and indexes – Map reduce for larger queries
  • 16. Graph Databases • Data Model: – Nodes and Relationships • Examples: – Neo4j, OrientDB, InfiniteGraph, AllegroGraph
  • 17. Graph Databases: Pros and Cons • Pros: – Powerful data model, as general as RDBMS – Connected data locally indexed – Easy to query • Cons – Sharding ( lots of people working on this) • Scales UP reasonably well – Requires rewiring your brain
  • 18. Living in a NOSQL World RDBMS Graph Databases Complexity Document Databases BigTable Clones Key-Value Relational Store Databases 90% of Use Cases Size
  • 19. What is a Graph?
  • 20. What is a Graph? • An abstract representation of a set of objects where some pairs are connected by links. Object (Vertex, Node) Link (Edge, Arc, Relationship)
  • 21. Different Kinds of Graphs • Undirected Graph • Directed Graph • Pseudo Graph • Multi Graph • Hyper Graph
  • 22. More Kinds of Graphs • Weighted Graph • Labeled Graph • Property Graph
  • 23. What is a Graph Database? • A database with an explicit graph structure • Each node knows its adjacent nodes • As the number of nodes increases, the cost of a local step (or hop) remains the same • Plus an Index for lookups
  • 24. Compared to Relational Databases Optimized for aggregation Optimized for connections
  • 25. Compared to Key Value Stores Optimized for simple look-ups Optimized for traversing connected data
  • 26. Compared to Key Value Stores Optimized for “trees” of data Optimized for seeing the forest and the trees, and the branches, and the trunks
  • 28. What is Neo4j? • A Graph Database + Lucene Index • Property Graph • Full ACID (atomicity, consistency, isolation, durability) • High Availability (with Enterprise Edition) • 32 Billion Nodes, 32 Billion Relationships, 64 Billion Properties • Embedded Server • REST API
  • 29. Good For • Highly connected data (social networks) • Recommendations (e-commerce) • Path Finding (how do I know you?) • A* (Least Cost path) • Data First Schema (bottom-up, but you still need to design)
  • 31. // then traverse to find results start n=(people-index, name, “Andreas”) match (n)--()--(foaf) return foaf n
  • 32. Cypher Pattern Matching Query Language (like SQL for graphs) // get node 0 start a=(0) return a // traverse from node 1 start a=(1) match (a)-->(b) return b // return friends of friends start a=(1) match (a)--()--(c) return c
  • 33. Gremlin A Graph Scripting DSL (groovy-based) // get node 0 g.v(0) // nodes with incoming relationship g.v(0).in // outgoing “KNOWS” relationship g.v(0).out(“KNOWS”)
  • 34. If you’ve ever • Joined more than 7 tables together • Modeled a graph in a table • Written a recursive CTE • Tried to write some crazy stored procedure with multiple recursive self and inner joins You should use Neo4j
  • 35. Language LanguageCountry Country language_code language_code country_code language_name country_code country_name word_count primary flag_uri Language Country name name IS_SPOKEN_IN code code word_count as_primary flag_uri
  • 36. name: “Canada” languages_spoken: “[ „English‟, „French‟ ]” language:“English” spoken_in name: “USA” name: “Canada” language:“French” spoken_in name: “France”
  • 37. Country name flag_uri language_name number_of_words yes_in_langauge no_in_language currency_code currency_name Country Language name name flag_uri SPEAKS number_of_words yes no Currency code name
  • 40. console.neo4j.org Try it right now: start n=node(*) match n-[r:LOVES]->m return n, type(r), m Notice the two nodes in red, they are your result set.
  • 41. What does a Graph look like?

Notas do Editor

  1. An undirected graph is one in which edges have no orientation. The edge (a, b) is identical to the edge (b, a).A directed graph or digraph is an ordered pair D = (V, A)A pseudo graph is a graph with loopsA multi graph allows for multiple edges between nodesA hyper graph allows an edge to join more than two nodes
  2. A weighted graph has a number assigned to each edgeAlabeled graph has a label assigned to each node or edgeA property graph has keys and values for each node or edge
  3. Atomic = all or nothing, consistent = stay consistent from one tx to another, isolation = no tx will mess with another tx, durability = once tx committed, it stays