SlideShare uma empresa Scribd logo
1 de 31
Baixar para ler offline
AURELIUS
THINKAURELIUS.COM
TITAN
Distributed Graph Computing
Matthias Broecheler, CTO
@mbroecheler
June XI, MMXIII
#CASSANDRA13
This presentation introduces Titan, Faunus, and scalable graph
computing in general. We present a case study of how Pearson
builds an education social network on top of Titan, Faunus, and
Cassandra to support learning in the 21st century.
Titan is an open source distributed graph database build on top
of Cassandra that can power real-time applications with
thousands of concurrent users over graphs with billions of
edges. Faunus is an open source global graph processing engine
build on top of Hadoop and compatible with Cassandra that can
analyze graphs, compute graph statistics, and execute global
traversals. Titan and Faunus are components of the Aurelius
Graph Cluster which enables scalable graph computation and
powers applications in social networking, recommendation
engines, advertisement optimization, knowledge
representation, health care, education, and security.
Thank You!
JOFF L?KO?MNM
@?;NOL? MOAA?MNCIHM
<OA L?JILNM
=IGGOHCNS MOJJILN
June 14th
2012
September
2012
December
2012
March
2013
May
2013
Alpha
Release
Titan
0.1.0
Titan
0.2.0
Titan
0.3.0
Titan
0.3.1
%RJ?LCG?HN;F L?F?;M? I@ ;
>CMNLC<ON?>m IJ?H rMIOL=?
AL;JB >;N;<;M?
&CLMN MN;<F? L?F?;M?
2?QLCN? I@ =IL?
)H>?RCHA h %F;MNC=3?;L=B
0?L@ILG;H=? "OA@CRCHA
June 14th
2012
September
2012
December
2012
March
2013
May
2013
Alpha
Release
Titan
0.1.0
Titan
0.2.0
Titan
0.3.0
Titan
0.3.1
%RJ?LCG?HN;F L?F?;M? I@ ;
>CMNLC<ON?>m IJ?H rMIOL=?
AL;JB >;N;<;M?
&CLMN MN;<F? L?F?;M?
2?QLCN? I@ =IL?
)H>?RCHA h %F;MNC=3?;L=B
0?L@ILG;H=? "OA@CRCHA
Faunus Release
Titan
Graph Database
>CMNLC<ON?>
L?;F NCG?
IJ?H
MIOL=?
name: Hercules
type: demigod
name: Cerberus
type: monster
battled
time:12
6?LN?R
%>A? ,;<?F
%>A?
0LIJ?LNS
Value in Relationships
low
 high
Key-Value
7B?H MBIOF> SIO OM? ; 'L;JB $;N;<;M?g
K
 V
BigTable
K
 V
 V
 V
 V
Document
Relational
Graph
"
Educating the Planet
Educating the Planet
Person
Person
Student
 Teacher
Course
Institution
Concept
Discussion
Comment
Share
enrolledIn
teaches
relatesTo
hasCourse
belongsTo
follows
author
references
hasComment
 relatesTo
author
partOf
relatesTo
Person
Person
Student
 Teacher
Course
Institution
Concept
Discussion
Comment
Share
enrolledIn
teaches
relatesTo
hasCourse
belongsTo
follows
author
references
hasComment
 relatesTo
author
partOf
relatesTo
Titan
Integrative Data Model
CH ; JIFSAFIN
MNIL;A? QILF>
Student
Person
Teacher
Course
Institution
Concept
Discussion
Comment
Share
enrolledIn
teaches
relatesTo
hasCourse
belongsTo
follows
author
references
hasComment
 relatesTo
author
partOf
DiscussionRank
relatesTo
Titan
Analyze Relationships
CH L?;F NCG?
Scaling Titan
HOG<?L I@
NL;HM;=NCIHM
MCT? I@ NB? AL;JB
121 Billion Edges
6.2 Billion Vertices
U -CFFCIH 5HCP?LMCNC?M
0F;=?G?HN 'LIOJ
BCU .4RF
1.1 million edges / sec
OMCHA <;N=B GI>?
Data Ingestion
^ GU .G?>COG
x = [] as Set;!
m = user.out('follows').aggregate(x)[0..(num*2-1)]!
!.out('follows').except(x)[0..limit]!
!.groupCount.cap.next();!
m.sort{-it.value}[0..(num-1)]!
._().transform{ [userid: it.key.id, !
! ! ! ! ! ! !points: it.value]};!
&IFFIQ 2?=IGG?H>;NCIH
Generic
Graph API
Dataflow
Processing
Traversal
Language
Object-Graph
Mapper
Graph
Algorithms
Graph
Server
=IIF MNO@@
=IGCHA
2%34 h *3/.4CN;H’M
%=IMSMN?G
KO?LS
F;HAO;A?
10,200 transactions / sec
UZ L;H>IGFS =BIM?H =IGJF?R
NL;P?LM;F N?GJF;N?M
Throughput
Transaction Description Avg (ms) Stdev (ms)
Student retrieves all content for a
single course in their course list
279.32 81.83
Student follows another student 193.72 22.77
Student is recommended people
to follow
241.33 256.48
Student reads their stream and
shares an item with followers
284.07 68.20
Student retrieves their profile 53.740 22.61
Student reads the most recent
comments for their courses
211.07 45.56
Scaling Titan
N?=BHC=;F J?LMJ?=NCP?
Vertex Representation
time: 1
5
8
4
9
2
7
mother
battled
battled
battled
fought
time: 4
time: 7
 CH>O=?>IL>?L
name:
Hercules
type:
demigod
5
Property
Property
Edge
Edge
Edge
Edge
Edge
LIQ CH>C=?M
@IL @;MN
P?LN?R =?HNLC=
KO?LC?M
label id +
direction
primary key
 edge id
Δ
vertex id
signature
properties
other
properties
Edge Representation
Column
 Value
=IGJL?MM?> M?LC;FCT?> I<D?=NM
P;LC;<F? FIHA ?H=I>CHA
Token Ring
Graph Partitioning
;MMCAHM C>M NI G;J
P?LNC=?M CHNI “IJNCG;F”
NIE?H L;HA?
,INM I@ CHN?L?MNCHA KO?MNCIHM @IL@ONOL? QILE
OM?M "/0
Aurelius Graph Cluster
Stores a massive-scale
property graph allowing real-
time traversals and updates
Batch processing of large
graphs with Hadoop
Runs global graph algorithms
on large, compressed,
in-memory graphs
Map/Reduce
 Load & Compress
Analysis results
back into Titan
Bulk Load
TITAN FAUNUS FULGORA
Apache 2
aureliusgraphs@googlegroups.com
titan.thinkaurelius.com
 faunus.thinkaurelius.com
Special Thanks
Steve Hill (@kindageeky)
Director Architecture & Innovation
at Pearson Education
AURELIUS
THINKAURELIUS.COM
We are Hiring

Mais conteúdo relacionado

Semelhante a C* Summit 2013: Distributed Graph Computing with Titan and Faunus by Matthias Broecheler

THIC MedIX Summer 2015 Poster
THIC MedIX Summer 2015 PosterTHIC MedIX Summer 2015 Poster
THIC MedIX Summer 2015 Poster
Diana Zajac
 
Trilinos progress, challenges and future plans
Trilinos progress, challenges and future plansTrilinos progress, challenges and future plans
Trilinos progress, challenges and future plans
M Reza Rahmati
 
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Dawen Liang
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarin
n5712036
 
OneBookHigher_poster_ver7
OneBookHigher_poster_ver7OneBookHigher_poster_ver7
OneBookHigher_poster_ver7
Roselyn Luhur
 

Semelhante a C* Summit 2013: Distributed Graph Computing with Titan and Faunus by Matthias Broecheler (20)

BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONSBIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
BIG GRAPH: TOOLS, TECHNIQUES, ISSUES, CHALLENGES AND FUTURE DIRECTIONS
 
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
Big Graph : Tools, Techniques, Issues, Challenges and Future Directions
 
Big data processing using - Hadoop Technology
Big data processing using - Hadoop TechnologyBig data processing using - Hadoop Technology
Big data processing using - Hadoop Technology
 
THIC MedIX Summer 2015 Poster
THIC MedIX Summer 2015 PosterTHIC MedIX Summer 2015 Poster
THIC MedIX Summer 2015 Poster
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
Trilinos progress, challenges and future plans
Trilinos progress, challenges and future plansTrilinos progress, challenges and future plans
Trilinos progress, challenges and future plans
 
Propagating Data Policies - A User Study
Propagating Data Policies - A User StudyPropagating Data Policies - A User Study
Propagating Data Policies - A User Study
 
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
 
Open Analytics Environment
Open Analytics EnvironmentOpen Analytics Environment
Open Analytics Environment
 
Massive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and ApplicationsMassive Data Analysis- Challenges and Applications
Massive Data Analysis- Challenges and Applications
 
On Big Data
On Big DataOn Big Data
On Big Data
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research
 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22
 
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
Factorization Meets the Item Embedding: Regularizing Matrix Factorization wit...
 
Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017 Proposed Talk Outline for Pycon2017
Proposed Talk Outline for Pycon2017
 
The Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape OverviewThe Python ecosystem for data science - Landscape Overview
The Python ecosystem for data science - Landscape Overview
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarin
 
OneBookHigher_poster_ver7
OneBookHigher_poster_ver7OneBookHigher_poster_ver7
OneBookHigher_poster_ver7
 
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
 

Mais de DataStax Academy

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

Mais de DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

C* Summit 2013: Distributed Graph Computing with Titan and Faunus by Matthias Broecheler