AgensGraph: a Multi-model Graph Database based on PostgreSQL
1. AgensGraph: a Multi-Model Graph Database Based on PostgreSQL
Kisung Kim (kskim@bitnine.net)
Bitnine R&D Center
2017-1-14
2. Who am I
• Kisung Kim, Ph.D., Chief Technology Officer of Bitnine Global Inc.
• Researched query optimization for graph-structured data during
doctoral studies
• Developed a distributed relational database engine at TmaxSoft
• Leading the development of a new graph database, AgensGraph, at
Bitnine Global
3. What is a Graph Database?
Images from http://www.slideshare.net/debanjanmahata/an-introduction-to-nosql-graph-databases-and-neo4j
4. What is a Graph Database?
• Relationships are first-class citizens in a graph database
• A graph database keeps your data connected
               Relational Database    Graph Database
Entity         Row                    Node (Vertex)
Relationship   Row                    Relationship (Edge)
5. What is a Graph Database?
• Handles data from a different point of view
• The data model is similar to the entity-relationship model
• Gartner says it represents a radical change in how data is
organized and processed
6. Cypher Query Language
• Declarative query language for the property graph model
• Inspired by SQL and SPARQL
– Designed to be human-readable query language
• Developed by Neo Technology Inc. since 2011
• Current version is 3.0
• openCypher (http://opencypher.org)
– Open project for participating in the development of the query language
7. Cypher Query Example
Make two nodes
CREATE (:person {id: 1, name: 'Kisung Kim', birthday: '1980-01-05'});
CREATE (:company {id: 1, name: 'Bitnine Global'});
Make a relationship between the two nodes
MATCH (p:person {id: 1}), (c:company {id: 1})
CREATE (p)-[:workFor {title: 'CTO', since: 2014}]->(c);
[Figure: (Kisung Kim) -[workFor]-> (Bitnine Global)]
8. Cypher Query Example
Querying
MATCH (p:person {name: 'Kisung Kim'})-[:workFor]->(c:company)
RETURN (p), (c)
No Table Definitions and No Joins
Query with variable length relationships
MATCH (p:person {name: 'Kisung Kim'})-[:knows*..3]->(f:person)
RETURN (f)
[Figure: (Kisung Kim) -[workFor]-> (?)]
[Figure: (Kisung Kim) -[knows]-> (?) -[knows]-> (?) -[knows]-> (?)]
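As a rough illustration of what the variable-length pattern `[:knows*..3]` asks for, the sketch below collects every vertex that can be reached from a start vertex by one to three directed edges. It is a simplified stand-in, not how any Cypher engine evaluates the pattern: it returns each reachable vertex once instead of one row per matching path, and the function name `reachable_within` and the sample `knows` edges are hypothetical.

```python
from collections import defaultdict

def reachable_within(edges, start, max_hops):
    """Return the vertexes reachable from `start` by 1..max_hops directed edges."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)

    found = set()
    frontier = {start}
    for _ in range(max_hops):
        # Expand every vertex on the current frontier by exactly one hop.
        frontier = {dst for src in frontier for dst in adjacency[src]}
        if not frontier:
            break
        found |= frontier
    return found

# Hypothetical "knows" edges, for illustration only.
knows = [("Kisung", "A"), ("A", "B"), ("B", "C"), ("C", "D")]
print(sorted(reachable_within(knows, "Kisung", 3)))  # prints ['A', 'B', 'C']
```

Vertex "D" is four hops away, so it falls outside the `*..3` bound.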
9. GraphDB to PostgreSQL Case
• From Hipolabs
http://engineering.hipolabs.com/graphdb-to-postgresql/
10. Graph Database and Hybrid Database
Magic Quadrant for Operational Database Management Systems, Gartner, 2016
11. So, What We Want to Make
• Hybrid database engine with graph and relational model
• Cypher query processing on PostgreSQL
• Online transactional graph database
• Disk-based persistent graph storage
(PostgreSQL) -[:processes]-> (Cypher)
12. Why Did We Choose PostgreSQL?
• Fully-featured, enterprise-ready open source database
• Graph processing actually uses relational algebra
– A graph is serialized as tables on disk
– Every graph traversal step is in principle a join
(from LDBC documentation)
• It is important to optimize joins and speed up join processing
– PostgreSQL has an excellent query optimizer
• And... the rich PostgreSQL ecosystem
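The claim that every traversal step is in principle a join can be made concrete with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the `vertex` and `edge` tables, the `employers` function, and the sample row for 'Alice' are assumptions made for illustration, not the AgensGraph schema. The one-hop pattern `(p)-[:workFor]->(c)` becomes two joins: vertex to edge, then edge to vertex.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertex (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edge (start_id INTEGER, end_id INTEGER, label TEXT);
    INSERT INTO vertex VALUES (1, 'Kisung Kim'), (2, 'Bitnine Global'), (3, 'Alice');
    INSERT INTO edge VALUES (1, 2, 'workFor'), (3, 2, 'workFor');
""")

def employers(conn, person_name):
    """One traversal step, (p)-[:workFor]->(c), written as joins."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM vertex AS p
        JOIN edge   AS e ON e.start_id = p.id   -- follow outgoing edges of p
        JOIN vertex AS c ON c.id = e.end_id     -- land on the end vertex
        WHERE p.name = ? AND e.label = 'workFor'
        ORDER BY c.name
        """,
        (person_name,),
    ).fetchall()
    return [name for (name,) in rows]

print(employers(conn, "Kisung Kim"))
```

Each additional hop in a pattern adds another pair of joins, which is why join ordering and join speed dominate the cost of graph pattern matching on a relational engine.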
13. Challenges
• How to store graph data
– Efficient structure for graph pattern matching
– At the same time, efficient for transaction processing
• How to process graph queries
– Processing complex graph pattern matching: variable length path,
shortest path
– Mismatches between graph data model & relational data model
– Graph query optimization
14. Graph Storage
• Graph data is stored on disk, decomposed into vertexes
and edges
• When processing graph pattern matching, it is essential to
find adjacent vertexes or edges efficiently
– Given a start vertex, find end vertexes
– Given an end vertex, find start vertexes
15. Two Graph Databases
Solution   Company          Latest Version   Features
Neo4j      Neo Technology   3.1              Most famous graph database, Cypher; O(1) access using fixed-size array
Titan      Datastax         -                Distributed graph system based on Cassandra
16. Graph Storage – Neo4j
• Fixed-size array for nodes and relationships
• Relationships of a node are organized as a doubly-linked list
• Index-free adjacency
• O(1) access for adjacent edges: follow the pointer
From Graph Databases 2nd ed. O’Reilly, 2015
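The idea of fixed-size records with pointer chains can be sketched as follows. This is a toy model loosely based on the bullets above, not Neo4j's actual on-disk format: the record layouts, the `RecordStore` class, and its method names are all assumptions, and the chains keep only "next" pointers (singly linked) for brevity, whereas the slide describes doubly-linked lists. Because every record has the same size, the record for ID `i` sits at byte offset `i * size`, so no index lookup is needed.

```python
import struct

NIL = -1                       # marks "no relationship"
NODE = struct.Struct("<q")     # ID of the first relationship in the node's chain
REL = struct.Struct("<qqqq")   # start, end, next in start's chain, next in end's chain

class RecordStore:
    def __init__(self):
        self.nodes = bytearray()
        self.rels = bytearray()

    def add_node(self):
        node_id = len(self.nodes) // NODE.size
        self.nodes += NODE.pack(NIL)
        return node_id

    def _first_rel(self, node_id):
        # O(1): the record offset is computed directly from the ID.
        return NODE.unpack_from(self.nodes, node_id * NODE.size)[0]

    def add_rel(self, start, end):
        rel_id = len(self.rels) // REL.size
        # The new record becomes the head of both endpoint chains.
        self.rels += REL.pack(start, end, self._first_rel(start), self._first_rel(end))
        NODE.pack_into(self.nodes, start * NODE.size, rel_id)
        NODE.pack_into(self.nodes, end * NODE.size, rel_id)
        return rel_id

    def neighbors(self, node_id):
        """Follow the node's relationship chain and collect the other endpoints."""
        result = []
        rel_id = self._first_rel(node_id)
        while rel_id != NIL:
            start, end, next_start, next_end = REL.unpack_from(self.rels, rel_id * REL.size)
            if start == node_id:
                result.append(end)
                rel_id = next_start
            else:
                result.append(start)
                rel_id = next_end
        return result

store = RecordStore()
a, b, c = store.add_node(), store.add_node(), store.add_node()
store.add_rel(a, b)
store.add_rel(a, c)
store.add_rel(c, a)
print(store.neighbors(a))
```

Traversal never consults an index: it reads the node record, then hops from relationship record to relationship record by ID, which is the "follow the pointer" behavior the slide calls index-free adjacency.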
17. Graph Storage – Titan (DSE Graph)
• Titan stores graphs in adjacency list format
• Each edge is stored twice
• Vertex and edge lists are stored in a backend store such as HBase,
Cassandra, or BerkeleyDB
From http://s3.thinkaurelius.com/docs/titan/1.0.0/data-model.html
18. Graph Storage – AgensGraph
• A fixed-size array is hard to implement in PostgreSQL
– Tuples are moved when updated
• Titan’s big row approach is also inadequate
• We chose B-tree indexes for graph traversal
Graph
  Vertex table: Vertex ID, Properties
    B-tree on (Vertex ID)
  Edge table: Edge ID, Start Vertex ID, End Vertex ID, Properties
    B-tree on (Start, End)
    B-tree on (End, Start)
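To show why two composite indexes, (Start, End) and (End, Start), cover traversal in both directions, here is a minimal sketch that uses sorted lists and Python's `bisect` module as a stand-in for B-trees. The `EdgeIndex` class and its method names are hypothetical and this is not AgensGraph code. A lookup is a prefix range scan: seek to the first key whose leading column matches, then read forward while it still matches.

```python
import bisect

class EdgeIndex:
    def __init__(self, edges):
        # Two sorted key lists play the role of the two composite B-trees.
        self.by_start = sorted((s, e) for s, e in edges)   # (Start, End)
        self.by_end = sorted((e, s) for s, e in edges)     # (End, Start)

    @staticmethod
    def _prefix_scan(keys, first):
        # (first,) sorts before every (first, x), so this seeks to the
        # first key whose leading column equals `first`.
        i = bisect.bisect_left(keys, (first,))
        matches = []
        while i < len(keys) and keys[i][0] == first:
            matches.append(keys[i][1])
            i += 1
        return matches

    def end_vertexes(self, start):
        """Given a start vertex, find its end vertexes."""
        return self._prefix_scan(self.by_start, start)

    def start_vertexes(self, end):
        """Given an end vertex, find its start vertexes."""
        return self._prefix_scan(self.by_end, end)

index = EdgeIndex([(1, 2), (1, 3), (2, 3)])
print(index.end_vertexes(1), index.start_vertexes(3))
```

Since the second column of each key is the vertex on the other side of the edge, the answer comes straight from the index keys, which is the property an index-only scan relies on.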
19. Index Problems
• The current B-tree has several disadvantages for our workload
– A composite index is preferable, but it increases the index size
– There are many duplicate keys (vertex IDs) on start_ID or end_ID
– Property updates incur insertions into B-trees
• We are developing a new index with a bucket structure (like a
GIN index), indirect indexing, and support for index-only scans
for graph traversals
20. Graph Storage – AgensGraph
• Vertexes and edges are grouped into labels
• Labels are organized as a label hierarchy
• We use PostgreSQL’s table inheritance feature
[Figure: label tables ag_vertex, Person, Message, Comment, Post, each with columns (Vertex ID, Properties)]
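The effect of mapping labels onto inherited tables can be sketched in a few lines: scanning a parent label also returns the vertexes stored under its child labels, much as a query on a parent table in PostgreSQL includes rows of inheriting tables by default. The `LabelHierarchy` class is hypothetical, and the parent/child arrangement used in the example (Comment and Post under Message, everything under ag_vertex) is an assumption for illustration, since the slide text does not spell out the hierarchy.

```python
class LabelHierarchy:
    def __init__(self):
        self._parent = {}   # label -> parent label (None for the root)
        self._rows = {}     # label -> list of (vertex_id, properties)

    def create_label(self, name, parent=None):
        if parent is not None and parent not in self._rows:
            raise KeyError(f"unknown parent label: {parent}")
        self._parent[name] = parent
        self._rows[name] = []

    def insert(self, label, vertex_id, properties):
        self._rows[label].append((vertex_id, properties))

    def _descends_from(self, label, ancestor):
        # Walk up the parent chain until the ancestor or the root is reached.
        while label is not None:
            if label == ancestor:
                return True
            label = self._parent[label]
        return False

    def scan(self, label):
        """Return rows stored under `label` or under any label inheriting from it."""
        return [row
                for name, rows in self._rows.items()
                if self._descends_from(name, label)
                for row in rows]

labels = LabelHierarchy()
labels.create_label("ag_vertex")
labels.create_label("person", parent="ag_vertex")
labels.create_label("message", parent="ag_vertex")
labels.create_label("comment", parent="message")   # assumed hierarchy
labels.create_label("post", parent="message")      # assumed hierarchy
labels.insert("person", 1, {"name": "Kisung Kim"})
labels.insert("comment", 2, {"text": "hello"})
labels.insert("post", 3, {"title": "AgensGraph"})
print([vertex_id for vertex_id, _ in labels.scan("message")])
```

With this arrangement a pattern on a parent label needs no explicit union over child labels, because the hierarchy resolves which stored rows qualify.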
21. Current Status
• AgensGraph v0.9
(https://github.com/bitnine-oss/agens-graph or http://bitnine.net/downloads/)
– Graph data model and DDL on PostgreSQL 9.6
– Cypher query processing (70% of OpenCypher spec.)
– Integrated query processing (Cypher + SQL)
– Client libraries (JDBC, ODBC, Python)
– Monitoring and development using Tadpole DB Hub
22. Tadpole for Agens Graph
• Tadpole DB Hub is an open-source project for managing unified
infrastructure (https://github.com/hangum/TadpoleForDBTools)
• Supports various databases, including PostgreSQL and Agens Graph
• Features of Tadpole for Agens Graph
– Monitoring Agens Graph server
– Cypher query browser and graph visualization
24. Future Roadmap
• Distributed graph database
– Plan to exploit Postgres-XL
• Specialized storage and index for graph traversals
• Dictionary compression for JSONB (ZSON)
• Graph query optimization using graph statistics
• Integration with big data systems
– HDFS Storage
– Graph analysis using GraphX
25. Join Us
• AgensGraph is an open-source project: https://github.com/bitnine-oss/agens-graph
• We also wish to contribute to the PostgreSQL community
• Graph database meetup in Silicon Valley
– http://www.meetup.com/Graph-Database-in-Silicon-Valley/