www.Objectivity.com

Getting Started with Graph Databases
Nick Quinn
Principal Engineer, InfiniteGraph

11/13/2013

What are we talking about today?
• Big Data and Databases
• What is a Graph Database?
• What is InfiniteGraph?
• Demo and Q&A – Hands On
  – Installing InfiniteGraph
    • https://download.infinitegraph.com
  – FlightPlan Sample
    • http://wiki.infinitegraph.com > “Download Examples” > FlightPlanSample.zip
Images Courtesy of IMDB (www.imdb.com)
NoSQL 2013
• Developers are embracing choice
• More than Dynamo and BigTable clones
• Incorporates specialized data models like
Document, Object and Graph
• 100+ projects and products (Wikipedia)
• ~250 Meetup.com Groups (5 meetups this week!)
• NoSQL fans consume 12% of the world’s Beer & Pizza

NoSQL and BigData – What’s the Connection?
Big data is a loosely-defined term used to describe data sets
so large and complex that they become awkward to work
with using on-hand database management tools (Wikipedia)

• Making big data “appear” smaller
• Partitioning, replication & distributed query
• Storage model optimizations
• Consistency trade-offs
• Simplified query models
• Dynamic views

The Specialist!
• Everyone specializes
  – Doctors, Lawyers, Bankers, Developers
• Why was data so normalized for so long?
• NoSQL is all about the data specialist
• Specializing in…
  – Distribution / deployment
  – Physical data storage
  – Logical data model
  – Query mechanism

Polyglot NoSQL Architectures
[Diagram labels: Users; Applications; Business; RDBMS; Document; Graph Database; External / Legacy Data; Transformation; MDM; Distributed Data Processing Platform; Partitioned Distributed DB (often Document / KV)]

NoSQL Landscape - How it all stacks up!

Data Model           Performance  Scalability  Flexibility  Complexity  Functionality
Key–value Stores     high         high         high         none        variable (none)
Column Store         high         high         moderate     low         minimal
Document Store       high         variable     high         low         variable (low)
Graph Database       variable     variable     high         high        graph theory
Relational Database  variable     variable     low          moderate    relational algebra

From: http://wikipedia.org/wiki/NoSQL

Navigational Query Performance

The Physical Data Model
• Becoming a relationship specialist…

Rows/Columns/Tables

Meetings
P1      P2       Place    Time
Alice   Bob      Denver   5-27-10

Calls
From    To       Time     Duration
Bob     Carlos   13:20    25
Bob     Charlie  17:10    15

Payments
From    To       Date     Amount
Carlos  Charlie  5-12-10  100000

Relationship/Graph Optimized

Alice   –[Met 5-27-10]–   Bob
Bob     –[Called 13:20]→  Carlos
Bob     –[Called 17:10]→  Charlie
Carlos  –[Paid 100000]→   Charlie

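
The difference between the two physical models can be sketched in plain Java. This is only an illustration under stated assumptions, not how InfiniteGraph or any relational engine stores data: the class and method names (PhysicalModelSketch, CallRecord, calleesByScan, calleesByAdjacency) are made up for this example. The table-style lookup walks every row of Calls, while the graph-style lookup touches only the edges attached to the starting vertex.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PhysicalModelSketch {
    /** One row of the Calls table from the slide (hypothetical class). */
    static final class CallRecord {
        final String from;
        final String to;
        final String time;
        final int duration;

        CallRecord(String from, String to, String time, int duration) {
            this.from = from;
            this.to = to;
            this.time = time;
            this.duration = duration;
        }
    }

    /** The two call records shown on the slide. */
    static List<CallRecord> sampleCalls() {
        List<CallRecord> calls = new ArrayList<>();
        calls.add(new CallRecord("Bob", "Carlos", "13:20", 25));
        calls.add(new CallRecord("Bob", "Charlie", "17:10", 15));
        return calls;
    }

    /** Table style: the lookup inspects every row of the Calls table. */
    static List<String> calleesByScan(List<CallRecord> table, String caller) {
        List<String> callees = new ArrayList<>();
        for (CallRecord row : table) {
            if (row.from.equals(caller)) {
                callees.add(row.to);
            }
        }
        return callees;
    }

    /** Graph style: group each call under the vertex it starts from, once. */
    static Map<String, List<CallRecord>> buildAdjacency(List<CallRecord> table) {
        Map<String, List<CallRecord>> outgoing = new HashMap<>();
        for (CallRecord row : table) {
            outgoing.computeIfAbsent(row.from, k -> new ArrayList<>()).add(row);
        }
        return outgoing;
    }

    /** Graph style lookup: only the caller's own edges are visited. */
    static List<String> calleesByAdjacency(Map<String, List<CallRecord>> outgoing, String caller) {
        List<String> callees = new ArrayList<>();
        for (CallRecord edge : outgoing.getOrDefault(caller, new ArrayList<>())) {
            callees.add(edge.to);
        }
        return callees;
    }

    public static void main(String[] args) {
        List<CallRecord> table = sampleCalls();
        System.out.println(calleesByScan(table, "Bob"));
        System.out.println(calleesByAdjacency(buildAdjacency(table), "Bob"));
    }
}
```

Both lookups return the same people. What changes is how much data is visited per hop, and that cost is paid again at every step of a multi-hop navigation.
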
Sometimes Big Data is just Fast Data!
• Some data is only actionable momentarily
  – Intelligence
  – IT Security
  – Site/page visit
  – Financial / trading behavior
• Presents a different type of challenge
• Latency of batch data processing becomes problematic

Scaling Writes
• Big/Fast data demands write performance
• Most NoSQL solutions allow you to scale writes by…
– Partitioning the data
– Understanding your consistency requirements
– Allowing you to defer conflicts
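
As a rough illustration of the first point, partitioning the data, here is a minimal hash-partitioning sketch in plain Java. It is an assumption-laden toy, not the mechanism of any particular NoSQL product: the class name PartitionedWriteSketch and its methods are hypothetical. Each key is owned by exactly one partition, so writers that touch different partitions never wait on the same lock.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionedWriteSketch {
    // One in-memory list per partition; a real store would use one host or file per partition.
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedWriteSketch(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Maps a key to a partition index in [0, partitionCount). */
    int partitionFor(String key) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends the record to the partition that owns the key, locking only that partition. */
    void write(String key, String value) {
        List<String> target = partitions.get(partitionFor(key));
        synchronized (target) {
            target.add(key + "=" + value);
        }
    }

    /** Total number of records across all partitions. */
    int totalSize() {
        int total = 0;
        for (List<String> partition : partitions) {
            synchronized (partition) {
                total += partition.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        PartitionedWriteSketch store = new PartitionedWriteSketch(4);
        for (int i = 0; i < 100; i++) {
            store.write("vertex-" + i, "payload");
        }
        System.out.println("records written: " + store.totalSize());
    }
}
```

The sketch leaves out the other two bullets: it does not model relaxed consistency or deferred conflict resolution.
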

Why a Graph Database?

Relationships are everywhere
• CRM, Sales & Marketing
• Network Mgmt, Telecom
• Intelligence (Government & Business)
• PLM (Product Lifecycle Mgmt)
• Finance
• Social Networks
• Healthcare
• Research: Genomics

Exploding Connections
• More often than not… graphs are big!

The Graph Database Landscape
• Neo4j
• Titan (Aurelius)
• AllegroGraph (RDF)
• FlockDB (Twitter)
• DEX (Sparsity)
• OrientDB (Document)
• + 24 others (from wikipedia.org)
Copyright © InfiniteGraph
The Graph Database Landscape Cont’d
• Graph Analytics: High latency, Batch Processing, offline
  – Apache Giraph
  – GraphLab
  – Intel’s Graph Builder
• Visual Analytics: In Memory, High Performance, Poor Scalability
  – Tom Sawyer
  – D3.js
  – KeyLines
  – InfoVis
• TinkerPop stack (Blueprints/Gremlin)
  – 16 implementations and counting…

Why InfiniteGraph™?
• Objectivity/DB is a proven foundation
  – Building highly connected databases since 1993
  – A complete database management system
    • Concurrency, transactions, cache, schema, query, indexing
• It’s a Graph Specialist!
  – Simple but powerful API tailored for navigation through data
  – Easy to configure distribution model

InfiniteGraph™ Basic Architecture
[Diagram labels, top to bottom: User Apps; Blueprints; InfiniteGraph - Core/API (Management Extensions, Navigation Execution, Placement, Session / TX Management, Configuration); Distributed Object and Relationship Persistence Layer]

Fully Distributed Data Model
[Diagram labels: AddVertex(); IG Core/API; ADP Placement; Distributed Object and Relationship Persistence Layer; Zone 1: HostA, HostB, HostC; Zone 2: HostX]

InfiniteGraph is a Complete Database
• InfiniteGraph helps manage the things you don’t want to do, but want to have done:
  – Concurrency
    • Transactions (commit/rollback)
    • Controlled multi-user reading during updates
  – Schema Control
    • Build complex data structures, make changes easily and migrate existing data
  – Distribution
    • Sharing large amounts of distributed data between distributed processes
  – Indexes
    • Choose built-in key-value, b-tree or other indexes
  – Cache
    • Keep large sections of the graphs in configurable memory caches

Scaling Graph Writes
[Diagram labels: App-1 (Ingest V1), (E12 {V1, V2}); App-2 (Ingest V2), (E23 {V2, V3}); App-3 (Ingest V3); InfiniteGraph; Objectivity/DB Persistence Layer; stored objects V1, E12, V2, E23, V3]

High Performance Edge Ingest
[Diagram labels: IG Core/API; Pipeline; Pipeline Containers; Agent; Target Containers C1, C2, C3; edges E12, E23, E(1->2), E(2->1), E(2->3), E(3->1), E(3->2)]

Result…
[Chart: Nodes and Edges per second, axis from 0 to 500000; labels: 1 client, 2 clients, 4 clients, 8 clients; Single Host, 2 Hosts, 4 Hosts, 8 Hosts]

Scaling Reads and Query
Partitioning and Read Replicas… easy, right?
[Diagram labels: Application(s); Distributed API; four Processors; Partition 1, Partition 2, Partition 3, Partition ...n]

Why are Graphs Different?
[Diagram labels: Application(s); Distributed API; four Processors; Partition 1, Partition 2, Partition 3, Partition ...n]

Optimizing Distributed Navigation
• Detect local hops and perform in-memory traversal
  – Intelligently cache frequently accessed remote data
• Route tasks to other hosts when it is optimal
[Diagram labels: Application; Distributed API; two Processors; Partition 1, Partition 2; vertices A, B, C, D, E, F, G, X, Y; path P(A,B,C,D)]

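
To make "local hop" concrete, the following plain-Java sketch classifies each hop of a path by whether both endpoints live on the same partition. It is a simplification for illustration, not InfiniteGraph's navigation engine: the class HopLocalitySketch, its remoteHops method, and the vertex-to-partition assignment below are all assumptions (the slide does not state which vertex sits on which partition).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HopLocalitySketch {
    /**
     * Counts hops whose two endpoints are on different partitions.
     * Every vertex in the path must have an entry in partitionOf.
     */
    static int remoteHops(List<String> path, Map<String, Integer> partitionOf) {
        int remote = 0;
        for (int i = 0; i + 1 < path.size(); i++) {
            int from = partitionOf.get(path.get(i));
            int to = partitionOf.get(path.get(i + 1));
            if (from != to) {
                remote++; // candidate for routing the task to the other host
            }
        }
        return remote;
    }

    /** Hypothetical placement: A, B and C on partition 1, D on partition 2. */
    static Map<String, Integer> samplePlacement() {
        Map<String, Integer> partitionOf = new HashMap<>();
        partitionOf.put("A", 1);
        partitionOf.put("B", 1);
        partitionOf.put("C", 1);
        partitionOf.put("D", 2);
        return partitionOf;
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("A", "B", "C", "D");
        int remote = remoteHops(path, samplePlacement());
        int local = (path.size() - 1) - remote;
        System.out.println("local hops: " + local + ", remote hops: " + remote);
    }
}
```

With this assumed placement, the hops A to B and B to C stay in memory on one host and only C to D crosses a partition boundary, which is where caching remote data or shipping the task to the other host would matter.
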
Super Simple API
// Create each Person vertex and add it to the graph database
Person alice = new Person("Alice");
helloGraphDB.addVertex( alice );
Person bob = new Person("Bob");
helloGraphDB.addVertex( bob );
Person carlos = new Person("Carlos");
helloGraphDB.addVertex( carlos );
Person charlie = new Person("Charlie");
helloGraphDB.addVertex( charlie );

Adding Edges
MyEdgeType edge = new MyEdgeType();
vertexA.addEdge ( edge, vertexB, EdgeKind.???, weight );

Meeting denverMeeting = new Meeting("Denver", "5-27-10");
alice.addEdge(denverMeeting, bob, EdgeKind.BIDIRECTIONAL, (short)1);
Call bobToCarlos = new Call(getRandomJulyTime());
bob.addEdge(bobToCarlos, carlos, EdgeKind.OUTGOING, (short)0);

Payment payment = new Payment(100000.00);
carlos.addEdge(payment, charlie, EdgeKind.OUTGOING, (short)2);
Call bobToCharlie = new Call(getRandomJulyTime());
bob.addEdge(bobToCharlie, charlie, EdgeKind.INCOMING, (short)0);

The Result…

Graph Traversal (Navigation) Queries
• Use an instance of the Navigator class to perform a navigation query.
• A Navigator instance is highly customizable, but it consists of the following basic parts:
  – The vertex from which to start the navigation query.
  – A guide strategy, which is a high-level navigational aid. You can create a custom guide or use one of the built-in guide strategies:
    • Guide.Strategy.NONE
    • Guide.Strategy.SIMPLE_BREADTH_FIRST
    • Guide.Strategy.SIMPLE_DEPTH_FIRST
  – Qualifiers
    • A path qualifier
    • A result qualifier
  – Handlers
    • A result handler
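
The roles of these parts can be sketched without the InfiniteGraph library. The code below is a plain-Java stand-in, not the Navigator API: NavigationSketch, navigate and sampleGraph are hypothetical names, the qualifiers are ordinary Predicate objects, and the handler is a Consumer. It walks paths in breadth-first order (loosely comparable to a breadth-first guide), uses the path qualifier to decide which partial paths are worth extending, and passes every path accepted by the result qualifier to the result handler. The sample graph follows the Meetings, Calls and Payments tables from the earlier slide.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

class NavigationSketch {
    /** Breadth-first navigation over simple paths that begin at start. */
    static void navigate(Map<String, List<String>> adjacency,
                         String start,
                         Predicate<List<String>> pathQualifier,
                         Predicate<List<String>> resultQualifier,
                         Consumer<List<String>> resultHandler) {
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(Collections.singletonList(start));
        while (!queue.isEmpty()) {
            List<String> path = queue.poll();
            if (resultQualifier.test(path)) {
                resultHandler.accept(path); // report a qualifying path
            }
            String last = path.get(path.size() - 1);
            for (String next : adjacency.getOrDefault(last, Collections.emptyList())) {
                if (path.contains(next)) {
                    continue; // never revisit a vertex already on this path
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                if (pathQualifier.test(extended)) {
                    queue.add(extended); // only qualified paths are explored further
                }
            }
        }
    }

    /** Alice met Bob (both directions); Bob called Carlos and Charlie; Carlos paid Charlie. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> adjacency = new HashMap<>();
        adjacency.put("Alice", Arrays.asList("Bob"));
        adjacency.put("Bob", Arrays.asList("Alice", "Carlos", "Charlie"));
        adjacency.put("Carlos", Arrays.asList("Charlie"));
        return adjacency;
    }

    public static void main(String[] args) {
        navigate(sampleGraph(), "Alice",
                path -> path.size() <= 4,                              // path qualifier: at most 3 hops
                path -> path.get(path.size() - 1).equals("Charlie"),   // result qualifier: ends at Charlie
                path -> System.out.println(path));                     // result handler: print the path
    }
}
```

Starting from Alice, this reports the two-hop path through Bob first and then the three-hop path through Bob and Carlos, because shorter paths leave the queue before longer ones.
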

Schema – It’s not your enemy! (well, not all the time...)
• Schema vs Schema-less
  – Database religion
  – No time for a full debate here
  – InfiniteGraph supports schema
  – Planning to also support optional properties on schema types
• Graph Views: A Great Use Case for Schema!
  – Filter by type and predicate during navigation
  – Connection Inference!

Graph Views and Bacon!
• Filter out uninteresting projects connected to Kevin Bacon
GraphView view = new GraphView();
// Exclude all instances of TvShow from navigation
view.excludeClass(myDb.getTypeId(TvShow.class.getName()));
// Exclude all movies made for TV/Video
view.excludeClass(myDb.getTypeId(Movie.class.getName()),
    "details.madeForTv || details.madeForVideo");
// Include ActedIn edges whose characterName does not contain "Himself"
view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));
view.includeClass(myDb.getTypeId(ActedIn.class.getName()),
    "!CONTAINS(characterName, \"Himself\")");

[Diagram labels: Actor: Kevin Bacon; TV Show: The Following (Ryan Hardy); Movie: Apollo 13 (Jack Swigert); Movie: Behind the Scenes (Himself)]

Tools To Suit the Solution

Demo
• Installing InfiniteGraph
• FlightPlan Sample

Mais conteúdo relacionado

Mais de InfiniteGraph

Making Sense of Graph Databases
Making Sense of Graph DatabasesMaking Sense of Graph Databases
Making Sense of Graph DatabasesInfiniteGraph
 
Webinar 3/12/14: Using Social Media to Drive Value
Webinar 3/12/14: Using Social Media to Drive ValueWebinar 3/12/14: Using Social Media to Drive Value
Webinar 3/12/14: Using Social Media to Drive ValueInfiniteGraph
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessInfiniteGraph
 
The Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesThe Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesInfiniteGraph
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataInfiniteGraph
 
PowerOfRelationshipsInBigData_SVNoSQL
PowerOfRelationshipsInBigData_SVNoSQLPowerOfRelationshipsInBigData_SVNoSQL
PowerOfRelationshipsInBigData_SVNoSQLInfiniteGraph
 
Objectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseObjectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseInfiniteGraph
 
Making sense of the Graph Revolution
Making sense of the Graph RevolutionMaking sense of the Graph Revolution
Making sense of the Graph RevolutionInfiniteGraph
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph DatabasesInfiniteGraph
 
Using A Distributed Graph Database To Make Sense Of Disparate Data Stores
Using A Distributed Graph Database To Make Sense Of Disparate Data StoresUsing A Distributed Graph Database To Make Sense Of Disparate Data Stores
Using A Distributed Graph Database To Make Sense Of Disparate Data StoresInfiniteGraph
 
Turning Big Data into Smart Data with Graph Technologies
Turning Big Data into Smart Data with Graph TechnologiesTurning Big Data into Smart Data with Graph Technologies
Turning Big Data into Smart Data with Graph TechnologiesInfiniteGraph
 
NoSQL Technology and Real-time, Accurate Predictive Analytics
NoSQL Technology and Real-time, Accurate Predictive AnalyticsNoSQL Technology and Real-time, Accurate Predictive Analytics
NoSQL Technology and Real-time, Accurate Predictive AnalyticsInfiniteGraph
 
How we Learned to Stop Worrying and Solve the Distributed Graph Problem
How we Learned to Stop Worrying and Solve the Distributed Graph ProblemHow we Learned to Stop Worrying and Solve the Distributed Graph Problem
How we Learned to Stop Worrying and Solve the Distributed Graph ProblemInfiniteGraph
 
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...InfiniteGraph
 
Vodafone xone fev142013v3 ext
Vodafone xone fev142013v3 extVodafone xone fev142013v3 ext
Vodafone xone fev142013v3 extInfiniteGraph
 
Dbta Webinar Realize Value of Big Data with graph 011713
Dbta Webinar Realize Value of Big Data with graph  011713Dbta Webinar Realize Value of Big Data with graph  011713
Dbta Webinar Realize Value of Big Data with graph 011713InfiniteGraph
 
Oracle no sql overview brief
Oracle no sql overview briefOracle no sql overview brief
Oracle no sql overview briefInfiniteGraph
 
Infinite graph nosql meetup dec 2012
Infinite graph nosql meetup dec 2012Infinite graph nosql meetup dec 2012
Infinite graph nosql meetup dec 2012InfiniteGraph
 
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyOracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyInfiniteGraph
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012InfiniteGraph
 

Mais de InfiniteGraph (20)

Making Sense of Graph Databases
Making Sense of Graph DatabasesMaking Sense of Graph Databases
Making Sense of Graph Databases
 
Webinar 3/12/14: Using Social Media to Drive Value
Webinar 3/12/14: Using Social Media to Drive ValueWebinar 3/12/14: Using Social Media to Drive Value
Webinar 3/12/14: Using Social Media to Drive Value
 
NoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-lessNoSQL Simplified: Schema vs. Schema-less
NoSQL Simplified: Schema vs. Schema-less
 
The Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use CasesThe Value of Explicit Schema for Graph Use Cases
The Value of Explicit Schema for Graph Use Cases
 
Solution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big DataSolution Use Case Demo: The Power of Relationships in Your Big Data
Solution Use Case Demo: The Power of Relationships in Your Big Data
 
PowerOfRelationshipsInBigData_SVNoSQL
PowerOfRelationshipsInBigData_SVNoSQLPowerOfRelationshipsInBigData_SVNoSQL
PowerOfRelationshipsInBigData_SVNoSQL
 
Objectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseObjectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL Database
 
Making sense of the Graph Revolution
Making sense of the Graph RevolutionMaking sense of the Graph Revolution
Making sense of the Graph Revolution
 
An Introduction to Graph Databases
An Introduction to Graph DatabasesAn Introduction to Graph Databases
An Introduction to Graph Databases
 
Using A Distributed Graph Database To Make Sense Of Disparate Data Stores
Using A Distributed Graph Database To Make Sense Of Disparate Data StoresUsing A Distributed Graph Database To Make Sense Of Disparate Data Stores
Using A Distributed Graph Database To Make Sense Of Disparate Data Stores
 
Turning Big Data into Smart Data with Graph Technologies
Turning Big Data into Smart Data with Graph TechnologiesTurning Big Data into Smart Data with Graph Technologies
Turning Big Data into Smart Data with Graph Technologies
 
NoSQL Technology and Real-time, Accurate Predictive Analytics
NoSQL Technology and Real-time, Accurate Predictive AnalyticsNoSQL Technology and Real-time, Accurate Predictive Analytics
NoSQL Technology and Real-time, Accurate Predictive Analytics
 
How we Learned to Stop Worrying and Solve the Distributed Graph Problem
How we Learned to Stop Worrying and Solve the Distributed Graph ProblemHow we Learned to Stop Worrying and Solve the Distributed Graph Problem
How we Learned to Stop Worrying and Solve the Distributed Graph Problem
 
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
Everything Goes Better With Bacon: Revisiting the Six Degrees Problem with a ...
 
Vodafone xone fev142013v3 ext
Vodafone xone fev142013v3 extVodafone xone fev142013v3 ext
Vodafone xone fev142013v3 ext
 
Dbta Webinar Realize Value of Big Data with graph 011713
Dbta Webinar Realize Value of Big Data with graph  011713Dbta Webinar Realize Value of Big Data with graph  011713
Dbta Webinar Realize Value of Big Data with graph 011713
 
Oracle no sql overview brief
Oracle no sql overview briefOracle no sql overview brief
Oracle no sql overview brief
 
Infinite graph nosql meetup dec 2012
Infinite graph nosql meetup dec 2012Infinite graph nosql meetup dec 2012
Infinite graph nosql meetup dec 2012
 
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph TechnologyOracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
Oracle NoSQL DB & InfiniteGraph - Trends in Big Data and Graph Technology
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012
 

Último

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 

Getting Started with Graph Databases

  • 1. www.Objectivity.com Getting Started with Graph Databases Nick Quinn Principal Engineer, InfiniteGraph 11/13/2013 1
  • 2. What are we talking about today? •Big Data and Databases •What is a Graph Database? •What is InfiniteGraph? •Demo and Q&A – Hands On – Installing InfiniteGraph • https://download.infinitegraph.com – FlightPlan Sample • http://wiki.infinitegraph.com  “Download Examples”  FlightPlanSample.zip Images Courtesy of IMDB (www.imdb.com)
  • 3. NoSQL 2013 • Developers are embracing choice • More than Dynamo and BigTable clones • Incorporates specialized data models like Document, Object and Graph • 100+ projects and products (Wikipedia) • ~250 Meetup.com Groups (5 meetups this week!) • NoSQL fans consume 12% of the worlds Beer & Pizza 11/13/2013
  • 4. NoSQL and BigData – What’s the Connection ? big data is a loosely-defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools (wikipedia) • • • • • • Making big data “appear” smaller Partitioning, replication & distributed query Storage model optimizations Consistency trade offs Simplified query models Dynamic views 11/13/2013 4
  • 5. The Specialist ! • Everyone specializes – Doctors, Lawyers, Bankers, Developers  • Why was data so normalized for so long ! • NoSQL is all about the data specialist • Specializing in… – – – – 11/13/2013 Distribution / deployment Physical data storage Logical data model Query mechanism 5
  • 6. Polyglot NoSQL Architectures Users Applications RDBMS Document Graph Database 6 Business External / Legacy Data 11/13/2013 Distributed Data Processing Platform Transformation MDM Partitioned Distributed DB (often Document / KV)
  • 7. NoSQL Landscape - How it all stacks up! Data Model Performance Scalability Flexibility Complexity Functionality Key–value Stores high high high none variable (none) Column Store high high moderate low minimal Document Store high variable high low variable (low) Graph Database variable variable high high graph theory Relational Database variable variable low moderate relational algebra. From…http://wikipedia.org/wiki/NoSQL 11/13/2013 7
  • 9. The Physical Data Model • Becoming a relationship specialist… Rows/Columns/Tables Relationship/Graph Optimized Meetings P1 Alice P2 Bob Place Denver Time 5-27-10 Alice Met 5-27-10 Charlie Calls From Bob Bob To Carlos Charlie Time 13:20 17:10 Duration 25 15 Called 13:20 Called 17:10 Carlos Bob Paid 100000 Payments From Date Amount Carlos 11/13/2013 To Charlie 5-12-10 100000 9
  • 10. Sometimes Big Data is just Fast Data ! • Some data is only actionable momentarily – – – – Intelligence IT Security Site/page visit Financial / trading behavior • Presents a different type of challenge • Latency of batch data processing becomes problematic 11/13/2013 10
  • 11. Scaling Writes • Big/Fast data demands write performance • Most NoSQL solutions allow you to scale writes by… – Partitioning the data – Understanding your consistency requirements – Allowing you to defer conflicts 11/13/2013 11
  • 12. Why a Graph Database ? 11/13/2013 12
  • 13. Relationships are everywhere CRM, Sales & Marketing Network Mgmt, Telecom Intelligence (Government & Business) PLM (Product Lifecycle Mgmt) Finance Social Networks 11/13/2013 Healthcare Research: Genomics 13
  • 14. Exploding Connections • More often than not… graphs are big ! 11/13/2013 14
  • 15. The Graph Database Landscape • Neo4J • Titan (Aurelius) • AllegroGraph (RDF) • FlockDB (Twitter) • DEX (Sparsity) • OrientDB (Document) • + 24 others (from wikipedia.org) Copyright © InfiniteGraph
  • 16. The Graph Database Landscape Cont’d • Graph Analytics: High latency, Batch Processing, offline – Apache Giraph – GraphLab – Intel’s Graph Builder • Visual Analytics: In Memory, High Performance, Poor Scalability – – – – Tom Sawyer D3JS KeyLines InfoVis • Tinkerpop stack (Blueprints/Gremlin) – 16 implementations and counting… Copyright © InfiniteGraph
  • 17. Why InfiniteGraph™? • Objectivity/DB is a proven foundation – Building highly connected databases since 1993 – A complete database management system • Concurrency, transactions, cache, schema, query, indexing • It’s a Graph Specialist ! – Simple but powerful API tailored for navigation through data – Easy to configure distribution model 11/13/2013 17
  • 18. InfiniteGraph™ Basic Architecture User Apps Blueprints InfiniteGraph - Core/API Management Extensions Navigation Execution Placement Session / TX Management Configuration Distributed Object and Relationship Persistence Layer 11/13/2013 18
  • 19. Fully Distributed Data Model [Diagram labels: AddVertex(); IG Core/API; ADP Placement; Distributed Object and Relationship Persistence Layer; Zone 1: HostA, HostB, HostC; Zone 2: HostX]
  • 20. InfiniteGraph is a Complete Database • InfiniteGraph helps manage the things you don’t want to do, but want to have done: – Concurrency • Transactions (commit/rollback) • Controlled multi-user reading during updates – Schema Control • Build complex data structures, make changes easily and migrate existing data – Distribution • Sharing large amounts of distributed data between distributed processes – Indexes • Choose built-in key-value, b-tree or other indexes – Cache • Keep large sections of the graphs in configurable memory caches
  • 21. Scaling Graph Writes [Diagram: App-1 (Ingest V1; E12 {V1→V2}), App-2 (Ingest V2; E23 {V2→V3}) and App-3 (Ingest V3) write through InfiniteGraph to the Objectivity/DB Persistence Layer, which stores V1, E12, V2, E23, V3]
  • 22. High Performance Edge Ingest [Diagram labels: IG Core/API; Pipeline; Pipeline Containers; Agent; Target Containers C1, C2, C3; sample edges E(1->2), E(2->1), E(2->3), E(3->1), E(3->2)]
  • 23. Result… [Chart: Nodes and Edges per second (y-axis, 0 to 500,000) by number of hosts (single host, 2, 4, 8), with one series each for 1, 2, 4 and 8 clients]
  • 24. Scaling Reads and Query • Partitioning and read replicas… easy, right? [Diagram: Application(s) → Distributed API → four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n]
  • 25. Why are Graphs Different? [Diagram: Application(s) → Distributed API → four Processors, one each over Partition 1, Partition 2, Partition 3, Partition ...n]
  • 26. Optimizing Distributed Navigation • Detect local hops and perform in-memory traversal – Intelligently cache frequently accessed remote data • Route tasks to other hosts when it is optimal [Diagram: Application → Distributed API → two Processors over Partition 1 and Partition 2; vertices A, B, C, D, E, F, G, X, Y; path P(A,B,C,D)]
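The idea of detecting local hops can be sketched in a few lines. This example assumes a lookup table that maps each vertex to the partition holding it, and counts how many hops of a path cross a partition boundary. `HopClassifier` is a hypothetical name, and the vertex placement used in `main` is invented for illustration, since the slide's diagram does not survive in this transcript.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: classifies the hops of a path as local or remote. */
final class HopClassifier {

    /**
     * Counts hops whose two endpoints live on different partitions.
     * Assumes every vertex on the path has an entry in partitionOf.
     */
    static int remoteHops(Map<String, Integer> partitionOf, List<String> path) {
        int remote = 0;
        for (int i = 1; i < path.size(); i++) {
            int from = partitionOf.get(path.get(i - 1));
            int to = partitionOf.get(path.get(i));
            if (from != to) {
                remote++;
            }
        }
        return remote;
    }

    public static void main(String[] args) {
        // Assumed placement: A, B and C on partition 1, D on partition 2.
        Map<String, Integer> partitionOf = new HashMap<>();
        partitionOf.put("A", 1);
        partitionOf.put("B", 1);
        partitionOf.put("C", 1);
        partitionOf.put("D", 2);

        List<String> path = Arrays.asList("A", "B", "C", "D");
        int remote = remoteHops(partitionOf, path);
        int local = path.size() - 1 - remote;
        System.out.println("local hops: " + local + ", remote hops: " + remote);
    }
}
```

A traversal engine could use a count like this to decide whether to keep working in memory on the current host or hand the remaining work to the host that owns the next vertex.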
  • 27. Super Simple API
      // Create four Person vertices and add each one to the graph database.
      Person alice = new Person("Alice");
      helloGraphDB.addVertex(alice);
      Person bob = new Person("Bob");
      helloGraphDB.addVertex(bob);
      Person carlos = new Person("Carlos");
      helloGraphDB.addVertex(carlos);
      Person charlie = new Person("Charlie");
      helloGraphDB.addVertex(charlie);
  • 28. Adding Edges
      // General form: pick an EdgeKind and a weight for each edge.
      MyEdgeType edge = new MyEdgeType();
      vertexA.addEdge(edge, vertexB, EdgeKind.???, weight);
      // Examples: each edge object carries its own data.
      Meeting denverMeeting = new Meeting("Denver", "5-27-10");
      alice.addEdge(denverMeeting, bob, EdgeKind.BIDIRECTIONAL, (short) 1);
      Call bobToCarlos = new Call(getRandomJulyTime());
      bob.addEdge(bobToCarlos, carlos, EdgeKind.OUTGOING, (short) 0);
      Payment payment = new Payment(10000.00);
      carlos.addEdge(payment, charlie, EdgeKind.OUTGOING, (short) 2);
      Call bobToCharlie = new Call(getRandomJulyTime());
      bob.addEdge(bobToCharlie, charlie, EdgeKind.INCOMING, (short) 0);
  • 30. Graph Traversal (Navigation) Queries • Use an instance of the Navigator class to perform a navigation query. • A Navigator instance is highly customizable, but consists of the following basic parts: – The vertex from which to start the navigation query. – A guide strategy, which is a high-level navigational aid. You can create a custom guide or use one of the built-in guide strategies: • Guide.Strategy.NONE • Guide.Strategy.SIMPLE_BREADTH_FIRST • Guide.Strategy.SIMPLE_DEPTH_FIRST – Qualifiers • A path qualifier • A result qualifier – Handlers • A result handler
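The parts listed above (start vertex, guide, path qualifier, result qualifier, result handler) can be illustrated without the InfiniteGraph library. The sketch below is plain Java over an adjacency map and does not use or reproduce the real Navigator API. `NavigationSketch`, its `navigate` method and the parameter shapes are assumptions made for illustration, and the guide is hard-wired to breadth-first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical sketch of a qualified navigation; not the InfiniteGraph Navigator. */
final class NavigationSketch {

    static void navigate(Map<String, List<String>> graph,
                         String start,
                         Predicate<List<String>> pathQualifier,
                         Predicate<List<String>> resultQualifier,
                         Consumer<List<String>> resultHandler) {
        Deque<List<String>> frontier = new ArrayDeque<>();
        frontier.add(Collections.singletonList(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.poll(); // FIFO order gives a breadth-first guide
            if (!pathQualifier.test(path)) {
                continue; // prune: do not report or extend this path
            }
            if (resultQualifier.test(path)) {
                resultHandler.accept(path); // hand a qualifying path to the caller
            }
            String last = path.get(path.size() - 1);
            for (String next : graph.getOrDefault(last, Collections.emptyList())) {
                if (path.contains(next)) {
                    continue; // never revisit a vertex already on this path
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                frontier.add(extended);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("alice", Arrays.asList("bob"));
        graph.put("bob", Arrays.asList("carlos", "charlie"));
        graph.put("carlos", Arrays.asList("charlie"));

        // Paths of at most three vertices that end at charlie.
        navigate(graph, "alice",
                path -> path.size() <= 3,
                path -> path.get(path.size() - 1).equals("charlie"),
                path -> System.out.println(path));
    }
}
```

The path qualifier bounds the search, the result qualifier selects which surviving paths count as answers, and the handler decides what to do with each answer. Swapping the FIFO queue for a stack would turn the same loop into a depth-first guide.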
  • 31. Schema – It’s not your enemy! (well, not all the time...) • Schema vs Schema-less – Database religion – No time for a full debate here – InfiniteGraph supports schema – Planning to also support optional properties on schema types • Graph Views: A Great Use Case for Schema! – Filter by type and predicate during navigation – Connection Inference!
  • 32. Graph Views and Bacon! • Filter out uninteresting projects connected to Kevin Bacon
      GraphView view = new GraphView();
      // Exclude all instances of TvShow from navigation
      view.excludeClass(myDb.getTypeId(TvShow.class.getName()));
      // Exclude all movies made for TV/video
      view.excludeClass(myDb.getTypeId(Movie.class.getName()), "details.madeForTv || details.madeForVideo");
      // Include ActedIn edges whose characterName does not contain "Himself"
      view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));
      view.includeClass(myDb.getTypeId(ActedIn.class.getName()), "!CONTAINS(characterName, \"Himself\")");
    [Diagram labels: Actor: Kevin Bacon; TV Show: The Following; Movie: Apollo 13; Behind the Scenes; character names: Ryan Hardy, Jack Swigert, Himself]
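Filtering by type and predicate can be mimicked in plain Java to show what a view does during navigation. This is a minimal sketch, not the InfiniteGraph `GraphView` implementation. `GraphViewSketch`, `Node`, `excludeType` and `allows` are hypothetical names, and a Java `Predicate` stands in for the predicate strings shown on the slide.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of type-and-predicate filtering; not the InfiniteGraph GraphView. */
final class GraphViewSketch {

    /** Minimal stand-in for a typed vertex with one property. */
    static final class Node {
        final String type;
        final String name;
        final boolean madeForTv;

        Node(String type, String name, boolean madeForTv) {
            this.type = type;
            this.name = name;
            this.madeForTv = madeForTv;
        }
    }

    // Maps a type name to the condition under which nodes of that type are hidden.
    private final Map<String, Predicate<Node>> exclusions = new HashMap<>();

    /** Hides every node of the given type. */
    void excludeType(String type) {
        exclusions.put(type, node -> true);
    }

    /** Hides nodes of the given type only when the predicate matches. */
    void excludeType(String type, Predicate<Node> when) {
        exclusions.put(type, when);
    }

    /** A node is visible unless an exclusion rule for its type matches it. */
    boolean allows(Node node) {
        Predicate<Node> rule = exclusions.get(node.type);
        return rule == null || !rule.test(node);
    }

    public static void main(String[] args) {
        GraphViewSketch view = new GraphViewSketch();
        view.excludeType("TvShow");
        view.excludeType("Movie", node -> node.madeForTv);

        Node[] nodes = {
            new Node("Movie", "Apollo 13", false),
            new Node("TvShow", "The Following", false),
            new Node("Movie", "Example TV Movie", true) // invented sample entry
        };
        for (Node node : nodes) {
            System.out.println(node.name + " visible: " + view.allows(node));
        }
    }
}
```

A traversal that consults `allows` before following a hop never sees the hidden nodes, which is why a view can shrink the search space as well as the result set.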
  • 33. Tools To Suit the Solution

Editor's Notes

  1. Kevin Norwood Bacon, born 1958 in Pennsylvania
  2. Relationships and connections are EVERYWHERE. Examples include CRM, Telecom, Intelligence, Research, Healthcare, Finance and yes, social networks too. But notice, it’s absolutely not just about social networks in the Facebook sense. ANY application that needs to find connections and relationships separated by more than 2 degrees is a good candidate for InfiniteGraph.
  3. SIMPLE_BREADTH_FIRST: Traversal from a given vertex proceeds to all related vertices that are one degree of separation out before backtracking to traverse to related vertices that are two degrees of separation out, and so forth. SIMPLE_DEPTH_FIRST: Traversal from a given vertex continues down a path until it reaches an endpoint before backtracking to the originating vertex to check for additional outgoing paths, and so forth.
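The difference between the two strategies in note 3 is easiest to see in the order in which vertices are visited. The following is a small sketch in plain Java over an adjacency map. It illustrates the textbook algorithms only, not InfiniteGraph's internal guides, and `TraversalOrder` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: visiting order of breadth-first versus depth-first traversal. */
final class TraversalOrder {

    /** Visits all vertices one hop out, then two hops out, and so on. */
    static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            String vertex = queue.poll();
            order.add(vertex);
            for (String next : graph.getOrDefault(vertex, Collections.emptyList())) {
                if (seen.add(next)) { // add returns false if the vertex was already seen
                    queue.add(next);
                }
            }
        }
        return order;
    }

    /** Follows one path to its end before backtracking to try the next one. */
    static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> order = new ArrayList<>();
        visit(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<String, List<String>> graph, String vertex,
                              Set<String> seen, List<String> order) {
        if (!seen.add(vertex)) {
            return;
        }
        order.add(vertex);
        for (String next : graph.getOrDefault(vertex, Collections.emptyList())) {
            visit(graph, next, seen, order);
        }
    }

    public static void main(String[] args) {
        // A -> B, C;  B -> D;  C -> E
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("A", Arrays.asList("B", "C"));
        graph.put("B", Arrays.asList("D"));
        graph.put("C", Arrays.asList("E"));

        System.out.println(breadthFirst(graph, "A")); // prints [A, B, C, D, E]
        System.out.println(depthFirst(graph, "A"));   // prints [A, B, D, C, E]
    }
}
```

Breadth-first reaches both one-hop neighbours (B, C) before any two-hop vertex, while depth-first exhausts the path through B (down to D) before returning to A's second neighbour.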