Exploiting graph databases to discover value in complex Big Data. Lunch will be provided while you discover the power of graph database technology for your Big Data needs.
Bring your charged laptops to this upcoming meetup to walk through how to get started with InfiniteGraph. Nick Quinn, Senior Software Developer for InfiniteGraph, will walk you through the initial installation of InfiniteGraph and the HelloGraph sample to get you started with your graph database. Download InfiniteGraph for free here: http://www.objectivity.com/downloads
Once we get through the tutorial, there will be time for Q&A and more hands-on support from additional members of the InfiniteGraph technical team.
If you have a complex Big Data problem and want to discover deeper connections and relationships within your data to create next-generation applications for social networks, healthcare, finance, telecom and security, this is a must-attend event! Get started quickly with our enterprise-proven, massively scalable and distributed graph database!
2. What are we talking about today?
•Big Data and Databases
•What is a Graph Database?
•What is InfiniteGraph?
•Demo and Q&A – Hands On
– Installing InfiniteGraph
• https://download.infinitegraph.com
– FlightPlan Sample
• http://wiki.infinitegraph.com “Download Examples” FlightPlanSample.zip
Images Courtesy of IMDB (www.imdb.com)
3. NoSQL 2013
• Developers are embracing choice
• More than Dynamo and BigTable clones
• Incorporates specialized data models like Document, Object and Graph
• 100+ projects and products (Wikipedia)
• ~250 Meetup.com Groups (5 meetups this week!)
• NoSQL fans consume 12% of the world’s Beer & Pizza
11/13/2013
4. NoSQL and Big Data – What’s the Connection?
Big data is a loosely defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools (Wikipedia)
• Making big data “appear” smaller
• Partitioning, replication & distributed query
• Storage model optimizations
• Consistency trade-offs
• Simplified query models
• Dynamic views
5. The Specialist!
• Everyone specializes
– Doctors, Lawyers, Bankers, Developers
• Why was data so normalized for so long?
• NoSQL is all about the data specialist
• Specializing in…
– Distribution / deployment
– Physical data storage
– Logical data model
– Query mechanism
7. NoSQL Landscape - How it all stacks up!
Data Model           Performance  Scalability  Flexibility  Complexity  Functionality
Key–value Stores     high         high         high         none        variable (none)
Column Store         high         high         moderate     low         minimal
Document Store       high         variable     high         low         variable (low)
Graph Database       variable     variable     high         high        graph theory
Relational Database  variable     variable     low          moderate    relational algebra
Source: http://wikipedia.org/wiki/NoSQL
9. The Physical Data Model
• Becoming a relationship specialist…
Rows/Columns/Tables

Meetings
P1     P2   Place   Time
Alice  Bob  Denver  5-27-10

Calls
From  To       Time   Duration
Bob   Carlos   13:20  25
Bob   Charlie  17:10  15

Payments
From    To       Date     Amount
Carlos  Charlie  5-12-10  100000

Relationship/Graph Optimized

Alice --Met (5-27-10)-- Bob
Bob --Called (13:20)--> Carlos
Bob --Called (17:10)--> Charlie
Carlos --Paid (100000)--> Charlie
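The contrast on this slide can be sketched in a few lines of plain Java. This is only an illustration and not InfiniteGraph's storage format: the class name RelationshipModelSketch, the Edge type and its fields are hypothetical, and the symmetric "Met" relationship is stored in one direction for brevity. Each person keeps a list of outgoing relationships, so finding what Bob did is a single lookup instead of a scan across three tables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: relationships stored next to the vertex they leave from.
class RelationshipModelSketch {
    // One labeled edge; "detail" holds the value shown on the slide (date, time or amount).
    static final class Edge {
        final String label;
        final String target;
        final String detail;

        Edge(String label, String target, String detail) {
            this.label = label;
            this.target = target;
            this.detail = detail;
        }

        @Override
        public String toString() {
            return label + " " + target + " (" + detail + ")";
        }
    }

    private final Map<String, List<Edge>> outgoing = new LinkedHashMap<>();

    void connect(String from, String label, String to, String detail) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(label, to, detail));
    }

    // All relationships leaving one person, found without touching any other person's data.
    List<Edge> edgesFrom(String person) {
        return outgoing.getOrDefault(person, Collections.emptyList());
    }

    // The four relationships from the slide's Meetings, Calls and Payments tables.
    static RelationshipModelSketch slideData() {
        RelationshipModelSketch g = new RelationshipModelSketch();
        g.connect("Alice", "Met", "Bob", "5-27-10");
        g.connect("Bob", "Called", "Carlos", "13:20");
        g.connect("Bob", "Called", "Charlie", "17:10");
        g.connect("Carlos", "Paid", "Charlie", "100000");
        return g;
    }

    public static void main(String[] args) {
        // prints [Called Carlos (13:20), Called Charlie (17:10)]
        System.out.println(slideData().edgesFrom("Bob"));
    }
}
```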
10. Sometimes Big Data is just Fast Data!
• Some data is only actionable momentarily
– Intelligence
– IT Security
– Site/page visit
– Financial / trading behavior
• Presents a different type of challenge
• Latency of batch data processing becomes problematic
11. Scaling Writes
• Big/Fast data demands write performance
• Most NoSQL solutions allow you to scale writes by…
– Partitioning the data
– Understanding your consistency requirements
– Allowing you to defer conflicts
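As a rough illustration of the first point, partitioning the data, here is a minimal hash-partitioning sketch in plain Java. It is an assumption-laden toy, not how InfiniteGraph or any particular NoSQL product places data: the class PartitionedWriteSketch and its methods are hypothetical, the partitions are in-memory maps, and consistency and conflict handling are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: each key is routed to exactly one partition by its hash,
// so writes to different partitions never contend with each other.
class PartitionedWriteSketch {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedWriteSketch(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void write(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    // Reads use the same routing rule, so they look in the partition that took the write.
    String read(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    public static void main(String[] args) {
        PartitionedWriteSketch store = new PartitionedWriteSketch(4);
        store.write("alice", "met bob");
        System.out.println(store.read("alice")); // prints met bob
    }
}
```

In a real deployment each partition would live on a different host, which is where the consistency and conflict questions in the other two bullets come in.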
17. Why InfiniteGraph™?
• Objectivity/DB is a proven foundation
– Building highly connected databases since 1993
– A complete database management system
• Concurrency, transactions, cache, schema, query, indexing
• It’s a Graph Specialist!
– Simple but powerful API tailored for navigation through data
– Easy to configure distribution model
19. Fully Distributed Data Model
[Figure: an AddVertex() call passes through the IG Core/API and ADP Placement layers to the Distributed Object and Relationship Persistence Layer, which spans HostA, HostB and HostC (Zone 1) and HostX (Zone 2).]
20. InfiniteGraph is a Complete Database
• InfiniteGraph helps manage the things you don’t want to do, but want to have done:
– Concurrency
• Transactions (commit/rollback)
• Controlled multi-user reading during updates
– Schema Control
• Build complex data structures, make changes easily and migrate existing data
– Distribution
• Sharing large amounts of distributed data between distributed processes
– Indexes
• Choose built-in key-value, b-tree or other indexes
– Cache
• Keep large sections of the graphs in configurable memory caches
25. Why are Graphs Different?
[Figure: Application(s) call a Distributed API, which fans out to four Processors over Partition 1, Partition 2, Partition 3, ... Partition n.]
26. Optimizing Distributed Navigation
• Detect local hops and perform in-memory traversal
– Intelligently cache frequently accessed remote data
• Route tasks to other hosts when it is optimal
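[Figure: an Application calls the Distributed API, which dispatches to two Processors over Partition 1 and Partition 2. Vertices A, B, C, D, E, F, G, X and Y are spread across the two partitions, with a path labeled P(A,B,C,D).]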
27. Super Simple API
// Create four Person vertices and add each one to the graph database.
Person alice = new Person("Alice");
helloGraphDB.addVertex( alice );
Person bob = new Person("Bob");
helloGraphDB.addVertex( bob );
Person carlos = new Person("Carlos");
helloGraphDB.addVertex( carlos );
Person charlie = new Person("Charlie");
helloGraphDB.addVertex( charlie );
30. Graph Traversal (Navigation) Queries
• Use an instance of the Navigator class to perform a navigation query.
• A navigation instance is highly customizable, but consists of the following basic parts:
– The vertex from which to start the navigation query.
– A guide strategy, which is a high-level navigational aid. You can create a custom guide or use one of several built-in guide strategies.
• Guide.Strategy.NONE
• Guide.Strategy.SIMPLE_BREADTH_FIRST
• Guide.Strategy.SIMPLE_DEPTH_FIRST
– Qualifiers
• A path qualifier
• A result qualifier
– Handlers
• A result handler
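To make the roles of these parts concrete, here is a small plain-Java sketch of how a start vertex, a path qualifier, a result qualifier and a result handler can fit together. It deliberately does not use the InfiniteGraph Navigator API: NavigationSketch, navigate, pathsTo and sampleGraph are hypothetical names, the graph is a simple adjacency map, and the guide is fixed to breadth-first order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration of the parts of a navigation query (not the InfiniteGraph API).
class NavigationSketch {
    static void navigate(Map<String, List<String>> graph, String start,
                         Predicate<List<String>> pathQualifier,
                         Predicate<List<String>> resultQualifier,
                         Consumer<List<String>> resultHandler) {
        Deque<List<String>> frontier = new ArrayDeque<>();
        frontier.add(Collections.singletonList(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.remove();
            if (!pathQualifier.test(path)) continue;         // prune: stop following this path
            if (resultQualifier.test(path)) resultHandler.accept(path); // report a qualifying path
            String last = path.get(path.size() - 1);
            for (String next : graph.getOrDefault(last, Collections.emptyList())) {
                if (path.contains(next)) continue;            // never revisit a vertex on the same path
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                frontier.add(extended);
            }
        }
    }

    // Sample data: Alice -> Bob -> {Carlos, Charlie}, Carlos -> Charlie.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("Alice", Arrays.asList("Bob"));
        g.put("Bob", Arrays.asList("Carlos", "Charlie"));
        g.put("Carlos", Arrays.asList("Charlie"));
        return g;
    }

    // Collects every path from start that ends at target and has at most maxVertices vertices.
    static List<List<String>> pathsTo(String start, String target, int maxVertices) {
        List<List<String>> results = new ArrayList<>();
        navigate(sampleGraph(), start,
                path -> path.size() <= maxVertices,
                path -> path.get(path.size() - 1).equals(target),
                results::add);
        return results;
    }

    public static void main(String[] args) {
        // prints [[Alice, Bob, Charlie], [Alice, Bob, Carlos, Charlie]]
        System.out.println(pathsTo("Alice", "Charlie", 4));
    }
}
```

The path qualifier decides whether a partial path is worth extending, the result qualifier decides whether a path counts as an answer, and the handler receives each answer as it is found.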
31. Schema – It’s not your enemy! (well, not all the time...)
• Schema vs Schema-less
– Database religion
– No time for a full debate here
– InfiniteGraph supports schema
– Planning to also support optional properties on schema types
• Graph Views: A Great Use Case for Schema!
– Filter by type and predicate during navigation
– Connection Inference!
32. Graph Views and Bacon!
• Filter out uninteresting projects connected to Kevin Bacon
GraphView view = new GraphView();
// Exclude all instances of TvShow from navigation
view.excludeClass(myDb.getTypeId(TvShow.class.getName()));
// Exclude all movies made for TV/video
view.excludeClass(myDb.getTypeId(Movie.class.getName()), "details.madeForTv || details.madeForVideo");
// Exclude WorkedOn, but include ActedIn where characterName does not contain "Himself"
view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));
view.includeClass(myDb.getTypeId(ActedIn.class.getName()), "!CONTAINS(characterName, \"Himself\")");
[Figure: Kevin Bacon (Actor) connected to the TV show The Following as Ryan Hardy, to the movie Apollo 13 as Jack Swigert, and to the movie Behind the Scenes as Himself.]
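The idea of filtering by type and predicate can be sketched without the database. The following plain-Java toy is not the InfiniteGraph GraphView class: GraphViewSketch, Element, excludeType, includeType and admits are hypothetical names, types are flat strings with no inheritance, and predicates are Java lambdas rather than predicate strings. An include rule for a type takes precedence over exclusions, which is an assumption made for this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration of a type-and-predicate filter applied during navigation.
class GraphViewSketch {
    // A vertex or edge reduced to a type name plus a property map.
    static final class Element {
        final String type;
        final Map<String, Object> props;

        Element(String type, Map<String, Object> props) {
            this.type = type;
            this.props = props;
        }
    }

    private final Set<String> excludedTypes = new HashSet<>();
    private final Map<String, Predicate<Element>> excludedWhen = new HashMap<>();
    private final Map<String, Predicate<Element>> includedWhen = new HashMap<>();

    void excludeType(String type) { excludedTypes.add(type); }
    void excludeType(String type, Predicate<Element> when) { excludedWhen.put(type, when); }
    void includeType(String type, Predicate<Element> when) { includedWhen.put(type, when); }

    // Decides whether navigation may pass through the element.
    boolean admits(Element e) {
        Predicate<Element> include = includedWhen.get(e.type);
        if (include != null) return include.test(e);
        if (excludedTypes.contains(e.type)) return false;
        Predicate<Element> exclude = excludedWhen.get(e.type);
        return exclude == null || !exclude.test(e);
    }

    static Element element(String type, String key, Object value) {
        return new Element(type, Collections.singletonMap(key, value));
    }

    // Mirrors the rules on the slide using the hypothetical methods above.
    static GraphViewSketch baconView() {
        GraphViewSketch view = new GraphViewSketch();
        view.excludeType("TvShow");
        view.excludeType("Movie", e -> Boolean.TRUE.equals(e.props.get("madeForTv")));
        view.excludeType("WorkedOn");
        view.includeType("ActedIn",
                e -> !String.valueOf(e.props.get("characterName")).contains("Himself"));
        return view;
    }

    public static void main(String[] args) {
        GraphViewSketch view = baconView();
        System.out.println(view.admits(element("ActedIn", "characterName", "Jack Swigert"))); // prints true
        System.out.println(view.admits(element("ActedIn", "characterName", "Himself")));      // prints false
    }
}
```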
Relationships and connections are EVERYWHERE. Examples include CRM, Telecom, Intelligence, Research, Healthcare, Finance and yes, social networks too. But notice that it’s absolutely not just about social networks in the Facebook sense. ANY application that needs to find connections and relationships separated by more than 2 degrees is a good candidate for InfiniteGraph.
SIMPLE_BREADTH_FIRST: Traversal from a given vertex proceeds to all related vertices that are one degree of separation out before backtracking to traverse to related vertices that are two degrees of separation out, and so forth.
SIMPLE_DEPTH_FIRST: Traversal from a given vertex continues down a path until it reaches an endpoint before backtracking to the originating vertex to check for additional outgoing paths, and so forth.
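The difference in visiting order is easy to see on a tiny graph. The sketch below is generic textbook breadth-first and depth-first traversal in plain Java, offered as an illustration of the two orderings rather than as InfiniteGraph's guide implementation; the class TraversalOrderSketch and the five-vertex sample graph are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of breadth-first versus depth-first visiting order.
class TraversalOrderSketch {
    // Breadth-first: visit every vertex one hop out before any vertex two hops out.
    static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            order.add(v);
            for (String n : graph.getOrDefault(v, Collections.emptyList())) {
                if (seen.add(n)) queue.add(n);
            }
        }
        return order;
    }

    // Depth-first: follow one path to its end, then backtrack to try the next branch.
    static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        visit(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<String, List<String>> graph, String v,
                              Set<String> seen, List<String> order) {
        if (!seen.add(v)) return;
        order.add(v);
        for (String n : graph.getOrDefault(v, Collections.emptyList())) {
            visit(graph, n, seen, order);
        }
    }

    // Sample graph: A -> {B, C}, B -> D, C -> E.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("A", Arrays.asList("B", "C"));
        g.put("B", Arrays.asList("D"));
        g.put("C", Arrays.asList("E"));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(sampleGraph(), "A")); // prints [A, B, C, D, E]
        System.out.println(depthFirst(sampleGraph(), "A"));   // prints [A, B, D, C, E]
    }
}
```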