Graph Databases and Neo4j

Tobias Ivarsson
Hacker @ Neo Technology

twitter: @thobe / #neo4j
email: tobias@neotechnology.com
web: http://www.neo4j.org/
web: http://www.thobe.org/

NOSQL - Why now?
    Four trends

Trend 1: Data size
Each year more and more digital data is created. Over two years we create more
digital data than all the data created in history before that.

Exabytes (10¹⁸ bytes) of data stored per year:
   Year    Exabytes
   2006    161
   2007    253
   2008    397
   2009    623
   2010    988
Data source: IDC 2007

Trend 2: Connectedness
Over time data has evolved to be more and more interlinked and connected.
Hypertext has links, blogs have pingback, tagging groups all related data.

[Figure: information connectivity plotted against time, 1990 to 2020, with the time axis marked web 1.0, web 2.0 and “web 3.0”. In rising order of connectivity: text documents, hypertext, RSS, blogs, wikis, user-generated content, tagging, folksonomies, RDF, ontologies, Giant Global Graph (GGG).]

Trend 3: Semi-structure
๏ Individualization of content
   • In the salary lists of the 1970s, all elements had exactly one job
   • In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
๏ All encompassing “entire world views”
   • Store more data about each entity
๏ Trend accelerated by the decentralization of content generation
     that is the hallmark of the age of participation (“web 2.0”)

Trend 4: Architecture
๏ 1980s: Mainframe applications
   • One application on top of one DB
๏ 1990s: Database as integration hub
   • Several applications sharing a single DB
๏ 2000s: (moving towards) Decoupled services with their own backend
   • Each application has its own DB

Why NOSQL Now?

๏ Trend 1: Size
๏ Trend 2: Connectedness
๏ Trend 3: Semi-structure
๏ Trend 4: Architecture

RDBMS performance
We are building applications today that have complexity requirements that a
Relational Database cannot handle with sufficient performance.

[Figure: performance plotted against data complexity, with one curve labelled “Relational database” and one labelled “Requirement of application”. Use cases marked along the data complexity axis: “Salary List”, “Majority of Webapps”, “Social network”, “Semantic Trading”, “custom”.]

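Scaling to size vs. Scaling to complexity
[Figure: size plotted against complexity. From largest size and lowest complexity to highest complexity: Key/Value stores, Bigtable clones, Document databases, Graph databases. Graph databases are annotated “Billions of nodes and relationships”. A marker on the chart reads “> 90% of use cases”.]
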
Graph databases focus on the structure of data
Graph databases focus on the structure of the data, scaling to the
complexity of the data and of the application.

What is Neo4j?
๏ Neo4j is a Graph Database
   • Non-relational (“#nosql”), transactional (ACID), embedded
   • Data is stored as a Graph / Network
      ‣Nodes and relationships with properties
      ‣“Property Graph” or “edge-labeled multidigraph”
   • Schema free, bottom-up data model design
๏ Neo4j is Open Source / Free (as in speech) Software
   • AGPLv3
   • Commercial (“dual license”) license available
      ‣ First server is free (as in beer), next is inexpensive
   • Prices are available at http://neotechnology.com/
   • Contact us if you have questions and/or special license needs
     (e.g. if you want an evaluation license)

More about Neo4j
๏ Neo4j is stable
   • In 24/7 operation since 2003
๏ Neo4j is in active development
   • Neo Technology received VC funding October 2009
๏ Neo4j delivers high performance graph operations
   • traverses 1’000’000+ relationships / second on commodity hardware

The Neo4j Graph data model
• Nodes
• Relationships between Nodes
• Relationships have Labels
• Relationships are directed, but traversed at equal speed in both directions
• The semantics of the direction is up to the application
  (LIVES WITH is symmetric, LOVES is not)
• Nodes have key-value properties
• Relationships have key-value properties

[Diagram: three nodes with properties (name: “James”, age: 32, twitter: “@spam”; name: “Mary”, age: 35; brand: “Volvo”, model: “V70”) connected by the relationships LIVES WITH, LOVES (twice), OWNS (with the property item type: “car”) and DRIVES.]

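The data model described above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not the Neo4j API: the class names `Node` and `Relationship`, the method names, and the choice of which person owns or drives the car are assumptions made for the example.

```python
class Relationship:
    """A directed, labelled relationship with key-value properties."""

    def __init__(self, start, end, label, properties=None):
        self.start = start
        self.end = end
        self.label = label
        self.properties = dict(properties or {})


class Node:
    """A node with key-value properties and the relationships attached to it."""

    def __init__(self, **properties):
        self.properties = properties
        self.relationships = []

    def relate_to(self, other, label, properties=None):
        """Create a directed relationship from this node to `other`."""
        rel = Relationship(self, other, label, properties)
        # Register the relationship on both end nodes, so that it can be
        # followed from either side without searching the whole graph.
        self.relationships.append(rel)
        other.relationships.append(rel)
        return rel

    def neighbours(self, label, direction="both"):
        """Nodes reachable over `label` ("outgoing", "incoming" or "both")."""
        found = []
        for rel in self.relationships:
            if rel.label != label:
                continue
            if rel.start is self and direction in ("outgoing", "both"):
                found.append(rel.end)
            elif rel.end is self and direction in ("incoming", "both"):
                found.append(rel.start)
        return found


james = Node(name="James", age=32, twitter="@spam")
mary = Node(name="Mary", age=35)
car = Node(brand="Volvo", model="V70")

james.relate_to(mary, "LIVES WITH")                # direction ignored by the application
james.relate_to(mary, "LOVES")                     # LOVES needs one relationship
mary.relate_to(james, "LOVES")                     # per direction
james.relate_to(car, "OWNS", {"item type": "car"})  # assumed owner
mary.relate_to(car, "DRIVES")                       # assumed driver
```

Asking `mary.neighbours("LIVES WITH")` finds James even though the relationship was created from James to Mary, which is the sense in which the direction is left to the application.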
Graphs are all around us
         A       B        C      D                 ...
   1     17      3.14     3      17.79333333333
   2     42      10.11    14     30.33
   3     316     6.66     1      2104.56
   4     32      9.11     592    0.492432432432
   5                             2153.175765766
  ...
Even if this spreadsheet looks like it could be a fit for an RDBMS, it isn’t:
   • RDBMSes have problems with extending indefinitely on both rows and columns
   • Formulas and data dependencies would quickly lead to heavy join operations

Graphs are all around us
         A       B        C      D                 ...
   1     17      3.14     3      = A1 * B1 / C1
   2     42      10.11    14     = A2 * B2 / C2
   3     316     6.66     1      = A3 * B3 / C3
   4     32      9.11     592    = A4 * B4 / C4
   5                             = SUM(D1:D4)
  ...
With data dependencies the spreadsheet turns out to be a graph.

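To make the "spreadsheet is a graph" point concrete, here is a minimal Python sketch that stores each formula cell together with the cells it depends on and evaluates a cell by walking those dependency edges. The names `cells`, `ratio`, `total` and `evaluate` are invented for this example, and it assumes the total in D5 sums D1 to D4, which is what reproduces the 2153.1757... value shown in the spreadsheet.

```python
def ratio(a, b, c):
    """The row formula from the slide: A * B / C."""
    return a * b / c


def total(*values):
    """The SUM formula."""
    return sum(values)


# A cell is either a plain value or a (function, dependencies) pair.
# Each dependency is an edge from the formula cell to the cell it reads.
cells = {
    "A1": 17,  "B1": 3.14,  "C1": 3,   "D1": (ratio, ["A1", "B1", "C1"]),
    "A2": 42,  "B2": 10.11, "C2": 14,  "D2": (ratio, ["A2", "B2", "C2"]),
    "A3": 316, "B3": 6.66,  "C3": 1,   "D3": (ratio, ["A3", "B3", "C3"]),
    "A4": 32,  "B4": 9.11,  "C4": 592, "D4": (ratio, ["A4", "B4", "C4"]),
    "D5": (total, ["D1", "D2", "D3", "D4"]),
}


def evaluate(name, cache=None):
    """Evaluate a cell by first evaluating everything it depends on.

    This is a depth-first walk over the dependency graph. The sketch has
    no cycle detection, so a circular reference would recurse forever.
    """
    cache = {} if cache is None else cache
    if name not in cache:
        content = cells[name]
        if isinstance(content, tuple):
            function, depends_on = content
            arguments = [evaluate(dep, cache) for dep in depends_on]
            cache[name] = function(*arguments)
        else:
            cache[name] = content
    return cache[name]


print("D5 = %.6f" % evaluate("D5"))
```

In a relational schema each of these dependency hops would become a join; here it is a dictionary lookup per edge.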
Graphs are all around us
If we add external data sources the problem becomes even more interesting...

Graphs are whiteboard friendly
An application domain model outlined on a whiteboard or piece of paper would be
translated to an ER-diagram, then normalized to fit a Relational Database.
With a Graph Database the model from the whiteboard is implemented directly.

[Figures: the domain model as an ER-diagram with 1 and * cardinalities, and as a graph with nodes labelled “thobe”, “Joe project blog”, “Wardrobe Strength”, “Hello Joe”, “Modularizing Jython” and “Neo4j performance analysis”.]
Image credits: Tobias Ivarsson

Query Languages
๏ Traversal APIs
   • Neo4j core traversers
   • Blueprint pipes
๏ SPARQL - “SQL for linked data” - query by graph pattern matching
   SELECT ?person WHERE {
       ?person neo4j:KNOWS ?friend .
       ?friend neo4j:KNOWS ?foe .
       ?foe neo4j:name "Larry Ellison" .
   }
   Find all persons that KNOWS a friend that KNOWS someone named “Larry Ellison”.

๏ Gremlin - “perl for graphs” - query by traversal
   ./outE[@label='KNOWS']/inV[@age > 30]/@name
   Give me the names of all the people I know that are older than 30.

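The difference between the two query styles can be sketched in plain Python. This is not SPARQL, Gremlin or any Neo4j API; the sample people, their ages and the function names are hypothetical and exist only to show that pattern matching binds variables over the whole graph, while a traversal starts at one node and walks outwards.

```python
# Hypothetical sample data. Properties are schema free, so not every
# person has an "age".
people = {
    "me":    {"name": "Me", "age": 28},
    "anna":  {"name": "Anna", "age": 34},
    "bob":   {"name": "Bob", "age": 25},
    "larry": {"name": "Larry Ellison"},
}
knows = [("me", "anna"), ("me", "bob"), ("anna", "larry")]  # directed KNOWS edges


def known_by(person):
    """Follow the outgoing KNOWS edges of one person."""
    return [end for start, end in knows if start == person]


def names_of_known_older_than(person, age):
    """Query by traversal: start at `person` and walk one step outwards."""
    names = []
    for friend in known_by(person):
        properties = people[friend]
        if "age" in properties and properties["age"] > age:
            names.append(properties["name"])
    return names


def persons_knowing_a_friend_of(name):
    """Query by pattern matching: try every binding of person, friend, foe."""
    return sorted({
        person
        for person, friend in knows
        for other_friend, foe in knows
        if friend == other_friend and people[foe]["name"] == name
    })


print(names_of_known_older_than("me", 30))
print(persons_knowing_a_friend_of("Larry Ellison"))
```

The traversal only touches the edges around its start node, whereas the pattern match in this naive form compares every pair of edges.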
Data manipulation API
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;

// Relationship types used in this example
enum RelTypes implements RelationshipType { KNOWS }

GraphDatabaseService graphDb = getGraphDbInstanceSomehow();
// All modifications have to be made inside a transaction
Transaction tx = graphDb.beginTx();
try {
   // Create Thomas 'Neo' Anderson
   Node mrAnderson = graphDb.createNode();
   mrAnderson.setProperty( "name", "Thomas Anderson" );
   mrAnderson.setProperty( "age", 29 );

   // Create Morpheus
   Node morpheus = graphDb.createNode();
   morpheus.setProperty( "name", "Morpheus" );
   morpheus.setProperty( "rank", "Captain" );
   morpheus.setProperty( "occupation", "Total badass" );

   // Create relationship representing they know each other
   mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
   // ... similarly for Trinity, Cypher, Agent Smith, Architect
   tx.success();   // mark the transaction as successful
} finally {
   tx.finish();    // commits if success() was called, otherwise rolls back
}
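The try / success / finally / finish shape above is easy to misread, so here is a toy Python model of it. `ToyTransaction` and `write` are made-up names, not part of Neo4j; the sketch only assumes the behaviour the pattern relies on, namely that `finish()` commits when `success()` was called and discards the changes otherwise.

```python
class ToyTransaction:
    """Buffers writes; finish() applies them only if success() was called."""

    def __init__(self, store):
        self.store = store
        self.pending = {}
        self.successful = False

    def set(self, key, value):
        self.pending[key] = value            # not visible in the store yet

    def success(self):
        self.successful = True               # mark for commit

    def finish(self):
        if self.successful:
            self.store.update(self.pending)  # commit
        self.pending = {}                    # without success(): roll back


def write(store, key, value, fail=False):
    """Write one value using the same shape as the Java example."""
    tx = ToyTransaction(store)
    try:
        tx.set(key, value)
        if fail:
            raise RuntimeError("simulated failure before success()")
        tx.success()
    finally:
        tx.finish()                          # runs on both paths


store = {}
write(store, "name", "Morpheus")
try:
    write(store, "rank", "Captain", fail=True)
except RuntimeError:
    pass
print(store)
```

Because the exception is raised before `success()`, the second write never reaches the store, while `finish()` still runs and cleans up.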
Graph traversals
[Diagram: six nodes with properties (name: “Thomas Anderson”, age: 29; name: “Morpheus”, rank: “Captain”, occupation: “Total badass”; name: “Trinity”; name: “Cypher”, last name: “Reagan”; name: “Agent Smith”, version: “1.0b”, language: “C++”; name: “The Architect”) connected by five KNOWS relationships, one LOVES relationship and one CODED BY relationship. Relationship properties shown: disclosure: “public”; disclosure: “secret”; since: “meeting the oracle”; since: “a year before the movie”; cooperates on: “The Nebuchadnezzar”.]

import neo4j
class Friends(neo4j.Traversal): # Traversals ≈ queries in Neo4j
   types = [ neo4j.Outgoing.KNOWS ]               # follow outgoing KNOWS only
   order = neo4j.BREADTH_FIRST
   stop = neo4j.STOP_AT_END_OF_GRAPH
   returnable = neo4j.RETURN_ALL_BUT_START_NODE
# mr_anderson is the start node, obtained beforehand
for friend_node in Friends(mr_anderson):
   print("%s (@ depth=%s)" % ( friend_node["name"],
     friend_node.depth ))

Output:
   Morpheus (@ depth=1)
   Trinity (@ depth=1)
   Cypher (@ depth=2)
   Agent Smith (@ depth=3)

Finding a place to start
๏ Traversals need a Node to start from
    • QUESTION: How do I find the start Node?
    • ANSWER: You use an Index
๏ Indexes in Neo4j are different from Indexes in Relational Databases
    • RDBMSes use them for Joining
    • Neo4j uses them for simple lookup
IndexService index = getGraphDbIndexServiceSomehow();

Node mrAnderson = index.getSingleNode( "name",
                                        "Thomas Anderson" );

performTraversalFrom( mrAnderson );
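To make the "simple lookup" role concrete, here is a rough Python sketch of an exact-match index used only to find a start node. `SimpleIndex` and its method names are hypothetical and are not the Neo4j `IndexService` API; nodes are modelled as plain dictionaries.

```python
class SimpleIndex:
    """Toy exact-match index: maps (key, value) to the nodes stored under it."""

    def __init__(self):
        self._entries = {}

    def index(self, node, key, value):
        # Register the node under the given key/value pair.
        self._entries.setdefault((key, value), []).append(node)

    def get_single_node(self, key, value):
        # Return the only match, None if there is no match,
        # and fail loudly if the lookup is ambiguous.
        hits = self._entries.get((key, value), [])
        if len(hits) > 1:
            raise LookupError("more than one node for %s=%r" % (key, value))
        return hits[0] if hits else None

nodes = [
    {"name": "Thomas Anderson", "age": 29},
    {"name": "Morpheus", "rank": "Captain"},
]
idx = SimpleIndex()
for node in nodes:
    idx.index(node, "name", node["name"])

# The index is only used to find where the traversal starts.
mr_anderson = idx.get_single_node("name", "Thomas Anderson")
```

Once the start node is found, the index plays no further part; everything else is reached by following relationships.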
Indexes in Neo4j
๏ The Graph *is* the main index
   • Use relationship labels for navigation
   • Build index structures *in the graph*
     ‣ Search trees, tag clouds, geospatial indexes, etc.
     ‣ Linked/skip lists or other data structures in the graph
     ‣ We have utility libraries for this
๏ External indexes used *for lookup*
   • Finding one or more points to start traversals from
   • Major difference from RDBMSes, which use indexes for everything
A domain object implemented in Neo4j
public interface Person {
   String getName();
   void setName( String firstName, String lastName );
}

public final class PersonImpl implements Person {
   private final Node underlyingNode;
   public PersonImpl( Node underlyingNode ) {
       this.underlyingNode = underlyingNode;
   }
   public String getName() {
       return String.format("%s %s",
          underlyingNode.getProperty("first name"),
          underlyingNode.getProperty("last name") );
   }
   public void setName(String firstName, String lastName) {
       underlyingNode.setProperty("first name", firstName);
       underlyingNode.setProperty("last name", lastName);
   }
}
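The wrapper pattern is not specific to Java. As a hedged illustration, the Python sketch below keeps all state on the underlying node, which is assumed here to be a plain dictionary rather than a real Neo4j `Node`; the `Person` class and its method names are hypothetical.

```python
class Person:
    """Domain object that holds no state of its own; everything lives on the node."""

    def __init__(self, underlying_node):
        self._node = underlying_node

    @property
    def name(self):
        # Read straight from the node on every call.
        return "%s %s" % (self._node["first name"], self._node["last name"])

    def set_name(self, first_name, last_name):
        # Write straight through to the node.
        self._node["first name"] = first_name
        self._node["last name"] = last_name

node = {}
person = Person(node)
person.set_name("Thomas", "Anderson")

# A second wrapper around the same node sees the same state.
same_person = Person(node)
```

Because the wrapper caches nothing, any number of wrapper instances around one node stay consistent with each other.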
Neo4j as Software Transactional Memory
๏ Implement objects as wrappers around Nodes and Relationships
   • Neo4j is fast enough to allow you to read all state from the
      Node/Relationship
๏ Mutating operations require transactions
   • The changes are isolated from all other threads until committed
   • Multiple mutations can be committed atomically
๏ Nested transactions are flattened
   • Makes it possible to have methods open their own transaction
๏ Fits nicely with the OO paradigm
   • More focus on data than on objects (compared with Object DBs)
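How flattened nested transactions behave can be sketched in a few lines of Python. This is a single-threaded toy, not Neo4j's transaction manager: the `Store` class and the `rename` helper are hypothetical, and thread isolation is left out entirely. It shows only that an inner transaction joins the outer one, and that nothing becomes visible until the outermost block commits.

```python
class Store:
    """Toy key-value store with flattened nested transactions."""

    def __init__(self):
        self.committed = {}
        self._pending = None   # buffered writes of the open transaction
        self._depth = 0        # number of nested 'with' blocks currently open
        self._failed = False

    def __enter__(self):
        if self._depth == 0:
            self._pending, self._failed = {}, False
        self._depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._depth -= 1
        if exc_type is not None:
            self._failed = True            # any failure dooms the whole transaction
        if self._depth == 0:
            if not self._failed:
                self.committed.update(self._pending)   # commit everything at once
            self._pending = None
        return False                       # never swallow exceptions

    def set(self, key, value):
        if self._pending is None:
            raise RuntimeError("mutating operations require a transaction")
        self._pending[key] = value

def rename(store, first, last):
    with store:                            # opens its own transaction...
        store.set("first name", first)
        store.set("last name", last)

store = Store()
with store:                                # ...which is flattened into this one
    rename(store, "Thomas", "Anderson")
    store.set("age", 29)
    visible_inside = dict(store.committed) # nothing committed yet
```

Flattening is what lets `rename` be called both on its own and from inside a larger unit of work without changing its code.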
Why not use an O/R mapper?
๏ Model evolution in ORMs is a hard problem
   • virtually unsupported in most ORM systems
๏ SQL is “compatible” across many RDBMSs
   • data is still locked in
๏ Each ORM maps object models differently
   • Moving to another ORM == legacy schema support
      ‣except your legacy schema is a strange auto-generated one
๏ Object/Graph Mapping is always done the same way
   • allows you to keep your data through application changes
   • or share data between multiple implementations
What an ORM doesn’t do

๏ Deep traversals
๏ Graph algorithms
๏ Shortest path(s)
๏ Routing
๏ etc.
Path exists in social network
๏ Each person has on average 50 friends

The performance impact in Neo4j depends only on the degree of each node. In an RDBMS it depends on the number of entries in the tables involved in the join(s).

[Figure: social graph with the persons Tobias, Emil, Johan and Peter]

  Database                  # persons    query time
  Relational database           1 000    2 000 ms
  Neo4j Graph Database          1 000    2 ms
  Neo4j Graph Database      1 000 000    2 ms
  Relational database       1 000 000    way too long...
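Why the cost follows node degree and not total data size can be seen in a small sketch. The Python below is an illustration only: the `path_exists` function, the `max_depth` parameter and the friendship edges are assumptions (the slide names the persons but not who knows whom), and no timing claims are made for it.

```python
from collections import deque

def path_exists(friends, start, goal, max_depth=4):
    """Breadth-first search that only touches the neighbours of visited
    persons; persons outside that neighbourhood are never read."""
    if start == goal:
        return True
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue                      # do not expand beyond the depth limit
        for friend in friends.get(person, ()):
            if friend == goal:
                return True
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return False

# Hypothetical friendships between the persons named on the slide.
friends = {
    "Tobias": ["Emil", "Johan"],
    "Emil": ["Tobias", "Peter"],
    "Johan": ["Tobias"],
    "Peter": ["Emil"],
    "Stranger": [],
}

connected = path_exists(friends, "Tobias", "Peter")
```

Adding a million unrelated persons to `friends` would not change the work done for this query, since the search only expands the neighbours it actually reaches.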
On-line real time routing with Neo4j
๏ 20 million Nodes - represent places
๏ 62 million Edges - represent direct roads between places
   • These edges have a length property, for the length of the road
๏ Average optimal route, 100 separate roads, found in 100 ms
๏ Worst case route we could find:
   • Optimal route is 5500 separate roads
   • Total length ~770 km
   • Found in less than 3 seconds
๏ Uses A* “best first” search

There’s a difference between least number of hops and least cost.
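The difference between fewest hops and least cost is easy to show with a tiny A* search. The Python sketch below is not the Neo4j Graph-Algos implementation: the function names, the place names A, B, C, Z, their coordinates and the road lengths are all made up for illustration. The estimate is the straight-line (haversine) distance, which never overestimates as long as every road is at least as long as the straight line between its endpoints.

```python
import heapq
from math import radians, sin, cos, asin, sqrt

def straight_line_km(a, b):
    """Great-circle (haversine) distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def a_star(coords, roads, start, goal):
    """Return (total_length, path) of the cheapest route, or None if unreachable."""
    frontier = [(straight_line_km(coords[start], coords[goal]), 0.0, start, [start])]
    best = {start: 0.0}                   # cheapest known cost to each place
    while frontier:
        _, cost, place, path = heapq.heappop(frontier)
        if place == goal:
            return cost, path
        if cost > best.get(place, float("inf")):
            continue                      # stale queue entry
        for neighbour, length in roads.get(place, ()):
            new_cost = cost + length
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                estimate = new_cost + straight_line_km(coords[neighbour], coords[goal])
                heapq.heappush(frontier, (estimate, new_cost, neighbour, path + [neighbour]))
    return None

# Hypothetical places on the equator, one degree of longitude apart.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 2.0), "Z": (5.0, 5.0)}
# Road lengths in km: the direct A-C road is one hop but a long detour.
roads = {
    "A": [("C", 400.0), ("B", 112.0)],
    "B": [("A", 112.0), ("C", 112.0)],
    "C": [("A", 400.0), ("B", 112.0)],
}

route = a_star(coords, roads, "A", "C")
```

Here the one-hop road from A to C is 400 km long, while the two-hop route through B totals 224 km, so the least-cost search returns the route with more hops.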
Routing with Neo4j - using Neo4j Graph-Algos
# The cost evaluator - for choosing the best next node
class GeoCostEvaluator
    include EstimateEvaluator
    def getCost(node, goal)
        straight_path_distance(
           node.getProperty("lat"), node.getProperty("lon"),
           goal.getProperty("lat"), goal.getProperty("lon") )
    end
end

# Instantiate the A* search function
path_finder = AStar.new( Neo4j::instance,
   RelationshipExpander.forTypes(
       DynamicRelationshipType.withName("road"),
          Direction::BOTH ),
   DoubleEvaluator.new("length"), GeoCostEvaluator.new )

# Find the best path between New York City and San Francisco
best_path = path_finder.findSinglePath( NYC, SF )
Newest addition: Neo4j lets you REST
๏ Hello Neo4j REST server - Neo4j no longer needs to be embedded
๏ Opens up Neo4j to your favorite platform (even if that isn’t Java)
   • PHP, .NET, etc. - libraries already exist!
   • http://wiki.neo4j.org/content/Getting_Started_REST
๏ Uses JSON for state transfer + browsable HTML for introspection
๏ Atomic modification operations
๏ Brand new declarative traversal framework
   • Extensible using your favorite scripting language
      ‣ JavaScript is included. Jython, JRuby, etc. supported
Other cool Graph Databases
๏ Sones GraphDB
   • Graph Query Language - a SQL-like query language for graphs
๏ Franz Inc. AllegroGraph
๏ HypergraphDB
๏ InfoGrid
๏ Twitter’s FlockDB
   • Optimized for the Twitter use case - one level relationships
๏ Interestingly we all have different approaches
One database to rule them all

Up until recently there was only one Database, the RDBMS.
The days of a single database that rules all are over.

            Image credits: The Lord of the Rings, New Line Cinema
Use best suited storage for each kind of data
The era of using RDBMSes for all problems is over. Instead we should use the database most suited for the problem at hand.

Image credits: Unknown :’(
Polyglot persistence
... we could even use multiple databases in conjunction, and let each database handle the things it does best.

[Figure: Document database ({...})]
Polyglot persistence
SQL && NOSQL

All databases are welcome!
SQL and NOSQL - it is Not Only SQL!

[Figure: Document database ({...})]
Finding out more
๏ http://neo4j.org/ - project website
      ‣ http://api.neo4j.org/ and http://components.neo4j.org/
      ‣ http://wiki.neo4j.org/ - HowTos, Tutorials, Examples, FAQ, etc.
      ‣ http://planet.neo4j.org/ - aggregation of blogs about Neo4j
๏ http://neotechnology.com/ - commercial licensing
๏ http://twitter.com/neo4j/team - follow the Neo4j team
๏ http://nosql.mypopescu.com/ - good source for news on NOSQL;
     monitors Neo4j and other NOSQL solutions
๏ http://highscalability.com/ - has published a few articles about Neo4j
Buzzword summary

Semi structured, SPARQL, AGPLv3, ACID transactions, Open Source, Object mapping, Gremlin, Shortest path, In-Graph indexes, NOSQL, A* routing, whiteboard friendly, RESTful, Traversal, Query language, Embedded, Beer, Schema free, Software Transactional Memory, Right tool for the right job, Scaling to complexity, Free Software, Polyglot persistence

http://neo4j.org/
http://neotechnology.com

The Perfect Storm: The Impact of Analytics, Big Data and Analytics
 
Spring Into the Cloud
Spring Into the CloudSpring Into the Cloud
Spring Into the Cloud
 
Oracle unified directory_11g
Oracle unified directory_11gOracle unified directory_11g
Oracle unified directory_11g
 
CloudFest Denver When Worlds Collide: HTML5 Meets the Cloud
CloudFest Denver When Worlds Collide: HTML5 Meets the CloudCloudFest Denver When Worlds Collide: HTML5 Meets the Cloud
CloudFest Denver When Worlds Collide: HTML5 Meets the Cloud
 
An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)
 
MongoDB
MongoDBMongoDB
MongoDB
 

Mais de Tobias Lindaaker

Choosing the right NOSQL database
Choosing the right NOSQL databaseChoosing the right NOSQL database
Choosing the right NOSQL databaseTobias Lindaaker
 
[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent ProgrammingTobias Lindaaker
 
Persistent graphs in Python with Neo4j
Persistent graphs in Python with Neo4jPersistent graphs in Python with Neo4j
Persistent graphs in Python with Neo4jTobias Lindaaker
 
A Better Python for the JVM
A Better Python for the JVMA Better Python for the JVM
A Better Python for the JVMTobias Lindaaker
 
A Better Python for the JVM
A Better Python for the JVMA Better Python for the JVM
A Better Python for the JVMTobias Lindaaker
 
Exploiting Concurrency with Dynamic Languages
Exploiting Concurrency with Dynamic LanguagesExploiting Concurrency with Dynamic Languages
Exploiting Concurrency with Dynamic LanguagesTobias Lindaaker
 

Mais de Tobias Lindaaker (8)

NOSQL Overview
NOSQL OverviewNOSQL Overview
NOSQL Overview
 
JDK Power Tools
JDK Power ToolsJDK Power Tools
JDK Power Tools
 
Choosing the right NOSQL database
Choosing the right NOSQL databaseChoosing the right NOSQL database
Choosing the right NOSQL database
 
[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming[JavaOne 2011] Models for Concurrent Programming
[JavaOne 2011] Models for Concurrent Programming
 
Persistent graphs in Python with Neo4j
Persistent graphs in Python with Neo4jPersistent graphs in Python with Neo4j
Persistent graphs in Python with Neo4j
 
A Better Python for the JVM
A Better Python for the JVMA Better Python for the JVM
A Better Python for the JVM
 
A Better Python for the JVM
A Better Python for the JVMA Better Python for the JVM
A Better Python for the JVM
 
Exploiting Concurrency with Dynamic Languages
Exploiting Concurrency with Dynamic LanguagesExploiting Concurrency with Dynamic Languages
Exploiting Concurrency with Dynamic Languages
 

Último

Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 

Último (20)

Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 

NOSQLEU - Graph Databases and Neo4j

  • 1. Graph Databases and Neo4j
      Tobias Ivarsson, Hacker @ Neo Technology
      twitter: @thobe / #neo4j
      email: tobias@neotechnology.com
      web: http://www.neo4j.org/
      web: http://www.thobe.org/
  • 2. NOSQL - Why now? Four trends
  • 3. Trend 1: Data size
      Each year more and more digital data is created. Over two years we create more digital data than all the data created in history before that.
      ExaBytes (10¹⁸) of data stored per year:
        2006: 161
        2007: 253
        2008: 397
        2009: 623
        2010: 988
      Data source: IDC 2007
  • 4. Trend 2: Connectedness
      Over time data has evolved to be more and more interlinked and connected. Hypertext has links, blogs have pingback, tagging groups all related data.
      [Chart: information connectivity over time, 1990 to 2020, across web 1.0, web 2.0 and “web 3.0”. In rising order: Text documents, Hypertext, RSS, Blogs, User-generated content, Wikis, Tagging, Folksonomies, RDF, Ontologies, Giant Global Graph (GGG)]
  • 5. Trend 3: Semi-structure
      ๏ Individualization of content
          • In the salary lists of the 1970s, all elements had exactly one job
          • In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
      ๏ All encompassing “entire world views”
          • Store more data about each entity
      ๏ Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
  • 6. Trend 4: Architecture
      1980s: Mainframe applications
      [Diagram: one Application on one DB]
  • 7. Trend 4: Architecture
      1990s: Database as integration hub
      [Diagram: three Applications sharing one DB]
  • 8. Trend 4: Architecture
      2000s: (moving towards) Decoupled services with their own backend
      [Diagram: three Applications, each with its own DB]
  • 9. Why NOSQL Now?
      ๏ Trend 1: Size
      ๏ Trend 2: Connectedness
      ๏ Trend 3: Semi-structure
      ๏ Trend 4: Architecture
  • 10. RDBMS performance
      We are building applications today that have complexity requirements that a Relational Database cannot handle with sufficient performance.
      [Chart: performance against data complexity, with one curve for the relational database and one line for the requirement of the application. Marked along the data complexity axis: Salary List, Majority of Webapps, Social network, Semantic Trading, custom]
  • 11. Scaling to size vs. Scaling to complexity
      [Chart: size against complexity. From largest size to highest complexity: Key/Value stores, Bigtable clones, Document databases, Graph databases. Annotations: “Billions of nodes and relationships” and “> 90% of use cases”]
  • 12. Graph Databases focus on structure of data
      Graph databases focus on the structure of the data, scaling to the complexity of the data and of the application.
  • 13. What is Neo4j?
      ๏ Neo4j is a Graph Database
          • Non-relational (“#nosql”), transactional (ACID), embedded
          • Data is stored as a Graph / Network
              ‣ Nodes and relationships with properties
              ‣ “Property Graph” or “edge-labeled multidigraph”
          • Schema free, bottom-up data model design
      ๏ Neo4j is Open Source / Free (as in speech) Software
          • AGPLv3
          • Commercial (“dual license”) license available
              ‣ First server is free (as in beer), next is inexpensive
      Prices are available at http://neotechnology.com/. Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license).
  • 14. More about Neo4j
      ๏ Neo4j is stable
          • In 24/7 operation since 2003
      ๏ Neo4j is in active development
          • Neo Technology received VC funding October 2009
      ๏ Neo4j delivers high performance graph operations
          • Traverses 1’000’000+ relationships / second on commodity hardware
  • 15. The Neo4j Graph data model
      • Nodes
      • Relationships between Nodes
      • Relationships have Labels
      • Relationships are directed, but traversed at equal speed in both directions
      • The semantics of the direction is up to the application (LIVES WITH is symmetric, LOVES is not)
      • Nodes have key-value properties
      • Relationships have key-value properties
      [Example graph, built up step by step: a node with name: “Mary”, age: 35 and a node with name: “James”, age: 32, twitter: “@spam”, connected by LIVES WITH and by LOVES in both directions; a node with brand: “Volvo”, model: “V70”, attached by OWNS (with property item type: “car”) and DRIVES]
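The data model on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the Neo4j API: the names `Node`, `Relationship`, `relate` and `neighbours` are hypothetical, and which person OWNS or DRIVES the car is an assumption, since the transcript does not preserve that.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    properties: dict = field(default_factory=dict)
    outgoing: list = field(default_factory=list)  # relationships starting here
    incoming: list = field(default_factory=list)  # relationships ending here


@dataclass(eq=False)
class Relationship:
    start: Node
    label: str
    end: Node
    properties: dict = field(default_factory=dict)


def relate(start, label, end, properties=None):
    """Create a directed, labeled relationship and register it on both ends."""
    rel = Relationship(start, label, end, properties or {})
    start.outgoing.append(rel)
    end.incoming.append(rel)
    return rel


def neighbours(node, label):
    """Nodes connected over `label`, ignoring the stored direction."""
    found = [rel.end for rel in node.outgoing if rel.label == label]
    found += [rel.start for rel in node.incoming if rel.label == label]
    return found


mary = Node({"name": "Mary", "age": 35})
james = Node({"name": "James", "age": 32, "twitter": "@spam"})
car = Node({"brand": "Volvo", "model": "V70"})

relate(mary, "LIVES WITH", james)   # stored once, read from either side
relate(mary, "LOVES", james)
relate(james, "LOVES", mary)        # LOVES needs one relationship per direction
relate(mary, "OWNS", car, {"item type": "car"})   # assumed owner
relate(james, "DRIVES", car)                      # assumed driver
```

Because every relationship is registered on both of its nodes, `neighbours(james, "LIVES WITH")` finds Mary even though the relationship was stored as starting at Mary, which is the sense in which direction is left to the application.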
  • 21. Graphs are all around us
              A      B       C     D
          1   17     3.14    3     17.79333333333
          2   42     10.11   14    30.33
          3   316    6.66    1     2104.56
          4   32     9.11    592   0.492432432432
          5                        2153.175765766
      Even if this spreadsheet looks like it could be a fit for a RDBMS it isn’t:
      • RDBMSes have problems with extending indefinitely on both rows and columns
      • Formulas and data dependencies would quickly lead to heavy join operations
  • 22. Graphs are all around us
              A      B       C     D
          1   17     3.14    3     = A1 * B1 / C1
          2   42     10.11   14    = A2 * B2 / C2
          3   316    6.66    1     = A3 * B3 / C3
          4   32     9.11    592   = A4 * B4 / C4
          5                        = SUM(D1:D4)
      With data dependencies the spreadsheet turns out to be a graph.
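One way to see the spreadsheet as a graph is to store, for each formula cell, the cells it depends on and to evaluate by following those edges. The sketch below is an illustration in plain Python, not something taken from the deck; the `cells` layout and the `evaluate` helper are hypothetical, and the total is taken over D1 to D4, which is what the value shown on the previous slide corresponds to.

```python
# Each cell is either a constant or a pair: (formula, [cells it depends on]).
cells = {
    "A1": 17,  "B1": 3.14,  "C1": 3,
    "A2": 42,  "B2": 10.11, "C2": 14,
    "A3": 316, "B3": 6.66,  "C3": 1,
    "A4": 32,  "B4": 9.11,  "C4": 592,
}
for row in (1, 2, 3, 4):
    cells[f"D{row}"] = (lambda a, b, c: a * b / c,
                        [f"A{row}", f"B{row}", f"C{row}"])
cells["D5"] = (lambda *values: sum(values), ["D1", "D2", "D3", "D4"])


def evaluate(name, cache=None):
    """Evaluate a cell by walking its dependency edges depth first."""
    cache = {} if cache is None else cache
    if name not in cache:
        cell = cells[name]
        if isinstance(cell, tuple):
            formula, dependencies = cell
            cache[name] = formula(*(evaluate(d, cache) for d in dependencies))
        else:
            cache[name] = cell
    return cache[name]


total = evaluate("D5")
```

Each dependency list is a set of outgoing edges, so evaluating D5 is a traversal of the dependency graph rather than a series of joins.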
  • 24. Graphs are all around us
      If we add external data sources the problem becomes even more interesting...
          17     3.14    3     = A1 * B1 / C1
          42     10.11   14    = A2 * B2 / C2
          316    6.66    1     = A3 * B3 / C3
          32     9.11    592   = A4 * B4 / C4
                               = SUM(D1:D4)
  • 26. Graphs are whiteboard friendly
      An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database. With a Graph Database the model from the whiteboard is implemented directly.
      [Whiteboard sketch, annotated first with 1 and * multiplicities, then with example instances: thobe, Joe, project blog, Wardrobe Strength, Hello Joe, Modularizing Jython, Neo4j performance analysis]
      Image credits: Tobias Ivarsson
  • 29. Query Languages
      ๏ Traversal APIs
          • Neo4j core traversers
          • Blueprint pipes
      ๏ SPARQL - “SQL for linked data” - query by graph pattern matching
          Find all persons that KNOWS a friend that KNOWS someone named “Larry Ellison”.
            SELECT ?person WHERE {
              ?person neo4j:KNOWS ?friend .
              ?friend neo4j:KNOWS ?foe .
              ?foe neo4j:name "Larry Ellison" .
            }
      ๏ Gremlin - “perl for graphs” - query by traversal
          Give me the names of all the people I know that are older than 30.
            ./outE[@label='KNOWS']/inV[@age > 30]/@name
  • 30. Data manipulation API
        GraphDatabaseService graphDb = getGraphDbInstanceSomehow();
        Transaction tx = graphDb.beginTx();
        try {
            // Create Thomas 'Neo' Anderson
            Node mrAnderson = graphDb.createNode();
            mrAnderson.setProperty( "name", "Thomas Anderson" );
            mrAnderson.setProperty( "age", 29 );

            // Create Morpheus
            Node morpheus = graphDb.createNode();
            morpheus.setProperty( "name", "Morpheus" );
            morpheus.setProperty( "rank", "Captain" );
            morpheus.setProperty( "occupation", "Total bad ass" );

            // Create relationship representing they know each other
            mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );

            // ... similarly for Trinity, Cypher, Agent Smith, Architect
            tx.success();
        } finally {
            tx.finish();
        }
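The try/finally pattern on this slide can be read as: nothing becomes visible unless success() was called before finish(). The following toy sketch mimics those semantics in plain Python, assuming a dict as the store. `ToyTransaction`, its `set` method and `create_person` are hypothetical names for illustration and say nothing about how Neo4j implements transactions.

```python
class ToyTransaction:
    """Buffers writes and applies them only if success() was called."""

    def __init__(self, store):
        self._store = store
        self._pending = {}
        self._successful = False

    def set(self, key, value):
        self._pending[key] = value          # not visible in the store yet

    def success(self):
        self._successful = True             # mark only, nothing is written

    def finish(self):
        if self._successful:
            self._store.update(self._pending)   # apply all changes together
        self._pending = {}                      # otherwise they are dropped


def create_person(store, name, age):
    tx = ToyTransaction(store)
    try:
        tx.set("name", name)
        if age < 0:
            raise ValueError("age must not be negative")
        tx.set("age", age)
        tx.success()
    finally:
        tx.finish()


store = {}
create_person(store, "Thomas Anderson", 29)

failed = {}
try:
    create_person(failed, "Morpheus", -1)
except ValueError:
    pass  # finish() ran without success(), so nothing was written
```

Putting finish() in the finally clause is what guarantees that an exception between beginTx() and success() discards the partial changes instead of leaving them half applied.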
  • 32. Graph traversals
      [Example graph. Nodes:
          name: “Thomas Anderson”, age: 29
          name: “Morpheus”, rank: “Captain”, occupation: “Total badass”
          name: “Trinity”
          name: “Cypher”, last name: “Reagan”
          name: “Agent Smith”, version: “1.0b”, language: “C++”
          name: “The Architect”
       Relationships: KNOWS, LOVES and CODED BY. Relationship properties include disclosure: “public”, disclosure: “secret”, since: “meeting the oracle”, since: “a year before the movie”, cooperates on: “The Nebuchadnezzar”]
        import neo4j

        class Friends(neo4j.Traversal):  # Traversals ≈ queries in Neo4j
            types = [ neo4j.Outgoing.KNOWS ]
            order = neo4j.BREADTH_FIRST
            stop = neo4j.STOP_AT_END_OF_GRAPH
            returnable = neo4j.RETURN_ALL_BUT_START_NODE

        for friend_node in Friends(mr_anderson):
            print("%s (@ depth=%s)" % ( friend_node["name"],
                                        friend_node.depth ))
      Output, revealed one line at a time:
        Morpheus (@ depth=1)
        Trinity (@ depth=1)
        Cypher (@ depth=2)
        Agent Smith (@ depth=3)
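What the Friends traversal does can be approximated without any database: a breadth-first walk over outgoing KNOWS relationships that returns every node except the start. The sketch below uses plain Python. The `KNOWS` adjacency table is an assumption chosen to be consistent with the output shown on the slide, because the transcript does not preserve which node each relationship connects.

```python
from collections import deque

# Outgoing KNOWS relationships (assumed, see the note above).
KNOWS = {
    "Thomas Anderson": ["Morpheus", "Trinity"],
    "Morpheus": ["Trinity", "Cypher"],
    "Cypher": ["Agent Smith"],
}


def friends(start):
    """Yield (name, depth) in breadth-first order, excluding the start node."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth > 0:
            yield name, depth
        for friend in KNOWS.get(name, []):
            if friend not in seen:      # visit each node at most once
                seen.add(friend)
                queue.append((friend, depth + 1))


for name, depth in friends("Thomas Anderson"):
    print("%s (@ depth=%s)" % (name, depth))
```

The walk ends when no unvisited node is reachable, which corresponds to STOP_AT_END_OF_GRAPH; The Architect never appears because it is attached by CODED BY rather than KNOWS.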
  • 40. Finding a place to start
      ๏ Traversals need a Node to start from
          • QUESTION: How do I find the start Node?
          • ANSWER: You use an Index
      ๏ Indexes in Neo4j are different from Indexes in Relational Databases
          • RDBMSes use them for Joining
          • Neo4j uses them for simple lookup
        IndexService index = getGraphDbIndexServiceSomehow();
        Node mrAnderson = index.getSingleNode( "name", "Thomas Anderson" );
        performTraversalFrom( mrAnderson );
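The lookup role described here amounts to an exact-match map from a key and value to a node, used once to find where the traversal begins. A minimal sketch in plain Python follows; `ToyIndex` and its methods are hypothetical and only imitate the shape of the getSingleNode call on the slide.

```python
class ToyIndex:
    """Exact-match lookup from (key, value) to nodes. Not the Neo4j IndexService."""

    def __init__(self):
        self._entries = {}

    def index(self, node, key, value):
        self._entries.setdefault((key, value), []).append(node)

    def get_single_node(self, key, value):
        hits = self._entries.get((key, value), [])
        if len(hits) > 1:
            raise LookupError("more than one node for %s=%r" % (key, value))
        return hits[0] if hits else None


# Nodes are plain dicts of properties in this sketch.
mr_anderson = {"name": "Thomas Anderson", "age": 29}
morpheus = {"name": "Morpheus", "rank": "Captain"}

index = ToyIndex()
for node in (mr_anderson, morpheus):
    index.index(node, "name", node["name"])

start = index.get_single_node("name", "Thomas Anderson")
```

The index is consulted only for the starting point; everything after that follows relationships from node to node, so no further lookups are needed.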
  • 41. Indexes in Neo4j
      ๏ The Graph *is* the main index
          • Use relationship labels for navigation
          • Build index structures *in the graph*
              ‣ Search trees, tag clouds, geospatial indexes, etc.
              ‣ Linked/skip lists or other data structures in the graph
              ‣ We have utility libraries for this
      ๏ External indexes used *for lookup*
          • Finding a (number of) points to start traversals from
          • Major difference from RDBMSes, which use indexes for everything
  • 42. A domain object implemented in Neo4j
    Code:
        public interface Person {
            String getName();
            void setName( String firstName, String lastName );
        }

        public final class PersonImpl implements Person {
            // All state lives in the node; the wrapper holds no copy of it
            private final Node underlyingNode;

            public PersonImpl( Node underlyingNode ) {
                this.underlyingNode = underlyingNode;
            }

            public String getName() {
                return String.format( "%s %s",
                    underlyingNode.getProperty( "first name" ),
                    underlyingNode.getProperty( "last name" ) );
            }

            public void setName( String firstName, String lastName ) {
                underlyingNode.setProperty( "first name", firstName );
                underlyingNode.setProperty( "last name", lastName );
            }
        }
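    The same wrapper pattern can be sketched in Python with a plain dictionary standing in for the node. This is an assumption-laden illustration: `Person` and `set_name` are illustrative names, and a real node would be a database object rather than a dict. The point it shows is that the wrapper caches nothing, so two wrappers around one node always agree.

```python
class Person:
    """Domain object that reads and writes its state through the node."""

    def __init__(self, underlying_node):
        self._node = underlying_node  # a dict stands in for a graph node

    @property
    def name(self):
        return "%s %s" % (self._node["first name"], self._node["last name"])

    def set_name(self, first_name, last_name):
        self._node["first name"] = first_name
        self._node["last name"] = last_name


node = {"first name": "Thomas", "last name": "Anderson"}
a = Person(node)
b = Person(node)

a.set_name("Neo", "Anderson")
print(b.name)  # b sees the change because both wrappers share the node
```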
  • 43. Neo4j as Software Transactional Memory
    ๏ Implement objects as wrappers around Nodes and Relationships
      • Neo4j is fast enough to allow you to read all state from the Node/Relationship
    ๏ Mutating operations require transactions
      • The changes are isolated from all other threads until committed
      • Multiple mutations can be committed atomically
    ๏ Nested transactions are flattened
      • Makes it possible to have methods open their own transaction
    ๏ Fits nicely with the OO paradigm
      • More focus on data than on objects (compared with object databases)
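    Flattening nested transactions can be sketched with a depth counter: only the outermost transaction commits, and a failure anywhere rolls back everything. The code below is a single-threaded illustration in plain Python, not Neo4j's transaction API; `Store`, `begin` and `rename` are hypothetical names, and isolation between threads is not modelled.

```python
class Store:
    """Toy key/value store with flattened nested transactions."""

    def __init__(self):
        self.committed = {}
        self._pending = None
        self._depth = 0
        self._failed = False

    def begin(self):
        return _Transaction(self)

    def set(self, key, value):
        if self._depth == 0:
            raise RuntimeError("mutating operations require a transaction")
        self._pending[key] = value


class _Transaction:
    def __init__(self, store):
        self._store = store

    def __enter__(self):
        s = self._store
        if s._depth == 0:
            s._pending = {}  # the outermost transaction starts a change set
        s._depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        s = self._store
        s._depth -= 1
        if exc_type is not None:
            s._failed = True  # any failure marks the whole transaction
        if s._depth == 0:
            if not s._failed:
                s.committed.update(s._pending)  # all changes land together
            s._pending, s._failed = None, False
        return False  # never swallow exceptions


def rename(store, first, last):
    # A method can open its own transaction; it joins an outer one if present.
    with store.begin():
        store.set("first name", first)
        store.set("last name", last)


store = Store()
with store.begin():
    rename(store, "Thomas", "Anderson")
    store.set("age", 29)
print(store.committed)
```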
  • 44. Why not use an O/R mapper?
    ๏ Model evolution in ORMs is a hard problem
      • Virtually unsupported in most ORM systems
    ๏ SQL is “compatible” across many RDBMSs
      • Data is still locked in
    ๏ Each ORM maps object models differently
      • Moving to another ORM == legacy schema support
        ‣ Except your legacy schema is a strange auto-generated one
    ๏ Object/Graph Mapping is always done the same way
      • Allows you to keep your data through application changes
      • Or share data between multiple implementations
  • 45. What an ORM doesn’t do
    ๏ Deep traversals
    ๏ Graph algorithms
    ๏ Shortest path(s)
    ๏ Routing
    ๏ etc.
  • 46. Path exists in social network
    ๏ Each person has on average 50 friends
    The performance impact in Neo4j depends only on the degree of each node. In an RDBMS it depends on the number of entries in the tables involved in the join(s).
    [Figure: a friend graph with the nodes Tobias, Emil, Johan and Peter]

    Database               # persons    Query time
    Relational database        1 000    2 000 ms
    Neo4j Graph Database       1 000    2 ms
    Neo4j Graph Database   1 000 000    2 ms
    Relational database    1 000 000    way too long...
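    A "path exists" query can be sketched as a depth-limited breadth-first expansion. The work done at each step is bounded by the number of friends of the people currently on the frontier, not by the total number of persons stored. The code is plain Python and only illustrative: the function `path_exists`, the `max_depth` parameter and the friendships between Tobias, Emil, Johan and Peter are assumptions, since the slide states neither the edges nor a depth limit.

```python
def path_exists(friends, start, goal, max_depth):
    """Return True if goal is reachable from start in at most max_depth hops."""
    frontier = {start}
    seen = {start}
    for _ in range(max_depth):
        if goal in frontier:
            return True
        next_frontier = set()
        for person in frontier:
            # Cost of this step: the degree of each person on the frontier.
            for friend in friends.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    next_frontier.add(friend)
        frontier = next_frontier
        if not frontier:
            break
    return goal in frontier


# Assumed friendships, stored in both directions.
FRIENDS = {
    "Tobias": ["Emil"],
    "Emil": ["Tobias", "Johan"],
    "Johan": ["Emil", "Peter"],
    "Peter": ["Johan"],
}

print(path_exists(FRIENDS, "Tobias", "Peter", max_depth=3))
```

    Adding a million unrelated persons to `FRIENDS` would not change the number of steps taken for this query, which is the property the table above is pointing at.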
  • 51. On-line real time routing with Neo4j
    ๏ 20 million Nodes - represent places
    ๏ 62 million Edges - represent direct roads between places
      • These edges have a length property, for the length of the road
    ๏ Average optimal route, 100 separate roads, found in 100 ms
    ๏ Worst case route we could find:
      • Optimal route is 5500 separate roads
      • Total length ~770 km
      • Found in less than 3 seconds
    ๏ Uses A* “best first” search
    Note: There’s a difference between least number of hops and least cost.
  • 52. Routing with Neo4j - using Neo4j Graph-Algos
    Code:
        # The cost evaluator - for choosing the best next node
        class GeoCostEvaluator
          include EstimateEvaluator
          def getCost(node, goal)
            straight_path_distance(
              node.getProperty("lat"), node.getProperty("lon"),
              goal.getProperty("lat"), goal.getProperty("lon") )
          end
        end

        # Instantiate the A* search function
        path_finder = AStar.new( Neo4j::instance,
          RelationshipExpander.forTypes(
            DynamicRelationshipType.withName("road"), Direction::BOTH ),
          DoubleEvaluator.new("length"),
          GeoCostEvaluator.new )

        # Find the best path between New York City and San Francisco
        best_path = path_finder.findSinglePath( NYC, SF )
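    The roles of the two evaluators can be illustrated with a small A* search in plain Python. This is a sketch, not the Neo4j Graph-Algos implementation: `a_star`, `add_road` and the toy map are hypothetical, coordinates are flat x/y pairs instead of lat/lon, and every road is assumed to be at least as long as the straight line between its ends so that the estimate never overshoots.

```python
import heapq
import math


def a_star(roads, coords, start, goal):
    """Return (path, cost) of the cheapest route, or (None, inf) if none."""

    def estimate(node):
        # Straight-line distance to the goal (the "estimate evaluator").
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    best = {start: 0.0}
    queue = [(estimate(start), 0.0, start, [start])]
    while queue:
        _, cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if cost > best.get(node, math.inf):
            continue  # a cheaper way to this node was already expanded
        for other, length in roads.get(node, []):
            new_cost = cost + length  # the "length" cost evaluator
            if new_cost < best.get(other, math.inf):
                best[other] = new_cost
                heapq.heappush(
                    queue,
                    (new_cost + estimate(other), new_cost, other, path + [other]),
                )
    return None, math.inf


def add_road(roads, a, b, length):
    # Roads can be driven in both directions (Direction::BOTH on the slide).
    roads.setdefault(a, []).append((b, length))
    roads.setdefault(b, []).append((a, length))


coords = {"A": (0, 0), "B": (1, 0), "E": (2, 0), "C": (3, 0)}
roads = {}
add_road(roads, "A", "C", 5)  # one hop, but a long winding road
add_road(roads, "A", "B", 1)
add_road(roads, "B", "E", 1)
add_road(roads, "E", "C", 1)

path, cost = a_star(roads, coords, "A", "C")
print(path, cost)
```

    On this toy map the route with the fewest hops is the direct A to C road at cost 5, while the cheapest route takes three hops at cost 3, which is the hops-versus-cost distinction from the previous slide.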
  • 53. Newest addition: Neo4j lets you REST
    ๏ Hello Neo4j REST server - Neo4j no longer needs to be embedded
    ๏ Opens up Neo4j to your favorite platform (even if that isn’t Java)
      • PHP, .NET, etc. - libraries already exist!
      • http://wiki.neo4j.org/content/Getting_Started_REST
    ๏ Uses JSON for state transfer + browsable HTML for introspection
    ๏ Atomic modification operations
    ๏ Brand new declarative traversal framework
      • Extensible using your favorite scripting language
        ‣ JavaScript is included. Jython, JRuby, etc. are supported
  • 54. Other cool Graph Databases
    ๏ Sones GraphDB
      • Graph Query Language - a SQL-like query language for graphs
    ๏ Franz Inc. AllegroGraph
    ๏ HypergraphDB
    ๏ InfoGrid
    ๏ Twitter’s FlockDB
      • Optimized for the Twitter use case - one-level relationships
    ๏ Interestingly, we all have different approaches
  • 55. One database to rule them all
    Up until recently there was only one database, the RDBMS. The days of a single database that rules them all are over.
    Image credits: The Lord of the Rings, New Line Cinema
  • 56. Use best suited storage for each kind of data
    The era of using RDBMSes for all problems is over. Instead we should use the database most suited for the problem at hand.
    Image credits: Unknown :’(
  • 57. Polyglot persistence
    ... we could even use multiple databases in conjunction, and let each database handle the things it does best.
    [Figure: databases side by side, one labelled “Document” holding {...} entries]
  • 58. Polyglot persistence
    SQL && NOSQL
    All databases are welcome! SQL and NOSQL - it is Not Only SQL!
    [Figure: databases side by side, one labelled “Document” holding {...} entries]
  • 59. Finding out more
    ๏ http://neo4j.org/ - project website
      ‣ http://api.neo4j.org/ and http://components.neo4j.org/
      ‣ http://wiki.neo4j.org/ - HowTos, Tutorials, Examples, FAQ, etc.
      ‣ http://planet.neo4j.org/ - aggregation of blogs about Neo4j
    ๏ http://neotechnology.com/ - commercial licensing
    ๏ http://twitter.com/neo4j/team - follow the Neo4j team
    ๏ http://nosql.mypopescu.com/ - monitors Neo4j and other NOSQL solutions; a good source for news on NOSQL
    ๏ http://highscalability.com/ - has published a few articles about Neo4j
  • 60. Buzzword summary
    http://neo4j.org/
    Semi structured; SPARQL; AGPLv3; ACID transactions; Open Source; Object mapping; Gremlin; Shortest path; In-Graph indexes; NOSQL; A* routing; whiteboard friendly; RESTful; Traversal; Query language; Embedded; Beer; Schema free; Software Transactional Memory; Right tool for the right job; Scaling to complexity; Free Software; Polyglot persistence