Graph Analysis with One Trillion Edges
on Apache Giraph
2/13/2014
Avery Ching, Facebook
Strata
Motivation
Apache Giraph
• Inspired by Google’s Pregel but runs on Hadoop
• “Think like a vertex”
• Maximum value vertex example
[Figure: vertex values on Processor 1 and Processor 2 over time; every vertex ends with the maximum value, 5]
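The maximum-value example can be sketched as a single-process simulation. This is only an illustration of the "think like a vertex" model, not Giraph code: the class `MaxValueSketch`, the method `run`, and the three-vertex chain in `main` are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Single-process simulation of a maximum-value vertex program; it does not use the Giraph API. */
class MaxValueSketch {

    /**
     * Runs synchronous supersteps until no vertex sends a message.
     * initialValues[v] is the starting value of vertex v; neighbors[v] lists its out-neighbors.
     */
    static int[] run(int[] initialValues, int[][] neighbors) {
        int n = initialValues.length;
        int[] values = initialValues.clone();
        List<List<Integer>> inbox = emptyBoxes(n);
        for (int superstep = 0; ; superstep++) {
            List<List<Integer>> outbox = emptyBoxes(n);
            boolean anySent = false;
            for (int v = 0; v < n; v++) {
                boolean changed = false;
                for (int message : inbox.get(v)) {
                    if (message > values[v]) {
                        values[v] = message;
                        changed = true;
                    }
                }
                // Superstep 0: every vertex announces its value.
                // Later supersteps: only vertices whose value grew speak again.
                if (superstep == 0 || changed) {
                    for (int target : neighbors[v]) {
                        outbox.get(target).add(values[v]);
                        anySent = true;
                    }
                }
            }
            if (!anySent) {
                return values; // no messages in flight: every vertex has halted
            }
            inbox = outbox;
        }
    }

    private static List<List<Integer>> emptyBoxes(int n) {
        List<List<Integer>> boxes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boxes.add(new ArrayList<>());
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Hypothetical 3-vertex chain 0 - 1 - 2 with values 5, 1, 2.
        int[] result = run(new int[] {5, 1, 2}, new int[][] {{1}, {0, 2}, {1}});
        System.out.println(Arrays.toString(result)); // prints [5, 5, 5]
    }
}
```

Each vertex only looks at its own value and its incoming messages, which is what lets the work be split across processors.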
Giraph on Hadoop / Yarn
Giraph runs on
• MapReduce (Hadoop 0.20.x, Hadoop 0.20.203, Hadoop 1.x)
• YARN (Hadoop 2.0.x)
Apache Giraph data flow

[Figure: three stages involving the Master, Worker 0, and Worker 1]
Loading the graph
• Input format provides Split 0 to Split 3
• Workers load / send graph
Compute / Iterate
• In-memory graph held as Part 0 to Part 3
• Workers compute / send messages
• Send stats / iterate!
Storing the graph
• Workers write Part 0 to Part 3 through the output format
Beyond Pregel
Sharded aggregators
Master computation
Composable computation
Use case: k-means clustering
Cluster input vectors into k clusters

• Assign each input vector to the closest centroid
• Update centroid locations based on assignments
[Figure: three panels with centroids c0, c1, c2: random centroid location, assignment to centroid, update centroids]
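The two k-means steps (assign, then update) can be sketched on a single machine as follows. This is a minimal illustration that assumes squared Euclidean distance and plain `double[]` vectors; the class and method names are made up for the sketch and are not taken from the talk.

```java
import java.util.Arrays;

/** One k-means iteration on a single machine; a sketch, not Giraph code. */
class KMeansStepSketch {

    /** Index of the centroid with the smallest squared Euclidean distance to the point. */
    static int closest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double distance = 0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    /** Assigns every point to its closest centroid, then returns the updated centroids. */
    static double[][] step(double[][] points, double[][] centroids) {
        int k = centroids.length;
        int dim = centroids[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (double[] point : points) {
            int c = closest(point, centroids);
            counts[c]++;
            for (int d = 0; d < dim; d++) {
                sums[c][d] += point[d];
            }
        }
        double[][] updated = new double[k][dim];
        for (int c = 0; c < k; c++) {
            for (int d = 0; d < dim; d++) {
                // A centroid with no assigned points keeps its old location.
                updated[c][d] = counts[c] == 0 ? centroids[c][d] : sums[c][d] / counts[c];
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 2}, {10, 10}, {10, 12}};
        double[][] centroids = {{1, 1}, {9, 9}};
        System.out.println(Arrays.deepToString(step(points, centroids)));
        // prints [[0.0, 1.0], [10.0, 11.0]]
    }
}
```

The per-centroid sums and counts are the only state shared between points, so they are the natural candidates for shared aggregation when the points are distributed.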
k-means in Giraph

Partitioning the problem
Input vectors → vertices

• Partitioned across machines
Centroids → aggregators

• Shared data across all machines
Problem solved... right?
[Figure: input vectors split between Worker 0 and Worker 1, with centroids c0, c1, c2]
Problem 1: Massive dimensions
Cluster Facebook members by friendships?

• 1 billion members (dimensions)
• k clusters
Each worker sends the master at most

• 1B * 2 bytes (enough to count up to 5k friends) * k = 2 * k GB
Master receives up to 2 * k * workers GB

• Saturated network link
• OOM
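The sizing above can be written out step by step. The symbol W for the number of workers and the sample values k = 10 and W = 100 are illustrative assumptions, not figures from the talk.

```latex
\begin{align*}
\text{per worker} &\le 10^{9}\ \text{dimensions} \times 2\ \text{B} \times k
                   = 2k \times 10^{9}\ \text{B} = 2k\ \text{GB} \\
\text{at the master} &\le 2k\ \text{GB} \times W \\
\text{e.g. } k = 10,\ W = 100 &: \quad 2 \times 10 \times 100\ \text{GB} = 2{,}000\ \text{GB} = 2\ \text{TB}
\end{align*}
```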
Sharded aggregators
Master handles all aggregators
[Figure: workers 0 to 2 each hold partial agg 0, 1, 2; the master holds final agg 0, 1, 2]
Aggregators sharded to workers
[Figure: final agg 0, 1, 2 are spread over workers 0 to 2, each built from the matching partial aggs; the master and all workers end up with final agg 0, 1, 2]

• Share aggregator load across workers
• Future work - tree-based optimizations (not yet a problem)
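A rough way to see the load sharing is to count how many partial values each machine has to receive. The sketch below assumes sum aggregators and a round-robin owner rule (`aggregator % numWorkers`); both are assumptions for illustration and say nothing about how Giraph actually assigns owners.

```java
import java.util.Arrays;

/** Simulates sum aggregators whose final values are owned by workers instead of the master. */
class ShardedAggregatorSketch {

    /** Owner of an aggregator; round-robin by index is an assumption made for this sketch. */
    static int owner(int aggregator, int numWorkers) {
        return aggregator % numWorkers;
    }

    /**
     * partials[w][a] is the partial sum worker w computed for aggregator a.
     * receivedPerWorker[w] is incremented for each partial value routed to worker w.
     * Returns the final value of every aggregator.
     */
    static long[] reduceSharded(long[][] partials, long[] receivedPerWorker) {
        int numWorkers = partials.length;
        int numAggregators = partials[0].length;
        long[] finals = new long[numAggregators];
        for (int w = 0; w < numWorkers; w++) {
            for (int a = 0; a < numAggregators; a++) {
                receivedPerWorker[owner(a, numWorkers)]++; // the partial goes to the owner, not the master
                finals[a] += partials[w][a];                // the owner folds it into the final value
            }
        }
        return finals;
    }

    public static void main(String[] args) {
        long[][] partials = {{1, 2, 3}, {10, 20, 30}, {100, 200, 300}};
        long[] received = new long[3];
        System.out.println(Arrays.toString(reduceSharded(partials, received))); // prints [111, 222, 333]
        System.out.println(Arrays.toString(received));                           // prints [3, 3, 3]
    }
}
```

With three workers and three aggregators, each owner receives three partial values, where a single master would have received all nine.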
Problem 2: Edge cut metric
Clusters should reduce the number of cut edges
Two phases

• Send your cluster id along all out edges
• Aggregate edges with different cluster ids
Calculate no more than once an hour?
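The two phases can be sketched on a single machine as follows. The edge list, the `clusterOf` array, and the class name are hypothetical; in a distributed run the counter would be a sum aggregator rather than a local variable.

```java
/** Counts edges whose endpoints sit in different clusters; a local sketch of the two phases. */
class EdgeCutSketch {

    /** edges[i] = {source, target}; clusterOf[v] is the cluster id of vertex v. */
    static long countCutEdges(int[][] edges, int[] clusterOf) {
        long cut = 0;
        for (int[] edge : edges) {
            // Phase 1: the source sends its cluster id along the out edge.
            int sentClusterId = clusterOf[edge[0]];
            // Phase 2: the target compares it with its own id and adds 1 if they differ.
            if (sentClusterId != clusterOf[edge[1]]) {
                cut++;
            }
        }
        return cut;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {2, 3}, {3, 0}};
        int[] clusterOf = {0, 0, 1, 1};
        System.out.println(countCutEdges(edges, clusterOf)); // prints 2
    }
}
```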
Master computation
Serial computation on master

• Communicates with workers via aggregators
• Added to Giraph by Stanford GPS team
[Figure: timeline for Master, Worker 0, and Worker 1; runs of k-means supersteps are interrupted by a start cut / end cut pair]
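The scheduling decision made serially on the master can be sketched as a pure function of the superstep number. This is an assumption-laden simplification: it switches phases every fixed number of supersteps rather than by wall-clock time, and `MasterPhaseSketch`, `phaseFor`, and `kMeansStepsPerCut` are hypothetical names, not part of Giraph.

```java
/** Chooses which phase runs in each superstep, the kind of serial decision a master makes. */
class MasterPhaseSketch {

    enum Phase { K_MEANS, START_EDGE_CUT, END_EDGE_CUT }

    /** After every kMeansStepsPerCut k-means supersteps, run the two edge-cut supersteps. */
    static Phase phaseFor(long superstep, int kMeansStepsPerCut) {
        long cycleLength = kMeansStepsPerCut + 2L;
        long position = superstep % cycleLength;
        if (position < kMeansStepsPerCut) {
            return Phase.K_MEANS;
        }
        return position == kMeansStepsPerCut ? Phase.START_EDGE_CUT : Phase.END_EDGE_CUT;
    }

    public static void main(String[] args) {
        for (long superstep = 0; superstep < 8; superstep++) {
            System.out.println(superstep + " " + phaseFor(superstep, 2));
        }
    }
}
```

With `kMeansStepsPerCut = 2` the sequence is k-means, k-means, start cut, end cut, and then it repeats. The workers would learn the chosen phase through an aggregator, as the slide describes.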
Problem 3: More phases, more problems
Add a stage to initialize the centroids
Add random input vectors to centroids

• Add a few random friends
Two phases

• Randomly sample input vertices to add
• Send messages to a few random neighbors

Problem 3: (continued)
Cannot easily support different messages, combiners
Vertex compute code getting messy

if (phase == INITIALIZE_SELF)
  // Randomly add to centroid
else if (phase == INITIALIZE_FRIEND)
  // Add my vector to centroid if a friend selected me
else if (phase == K_MEANS)
  // Do k-means
else if (phase == START_EDGE_CUT) ...
Composable computation
Decouple vertex from computation
Master sets the computation, combiner classes
Reusable and composable

Computation                          | In message       | Out message      | Combiner
Add random centroid / random friends | Null             | Centroid message | N/A
Add to centroid                      | Centroid message | Null             | N/A
K-means                              | Null             | Null             | N/A
Start edge cut                       | Null             | Cluster          | Cluster combiner
End edge cut                         | Cluster          | Null             | N/A
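One way to picture the decoupling is a lookup from phase to a small computation object, replacing a single method that branches on the phase. The sketch below is a simplified stand-in: `VertexComputation`, `registry`, and `runOnVertex` are hypothetical names, and the string results only label what each computation would do.

```java
import java.util.EnumMap;
import java.util.Map;

/** Maps each phase to its own computation instead of one long if/else chain. */
class ComposableComputationSketch {

    enum Phase { INITIALIZE_SELF, INITIALIZE_FRIEND, K_MEANS, START_EDGE_CUT, END_EDGE_CUT }

    /** Hypothetical stand-in for a per-phase computation class; not Giraph's own interface. */
    interface VertexComputation {
        String compute(String vertexId);
    }

    /** Each phase gets its own small computation, so each can use its own message type. */
    static Map<Phase, VertexComputation> registry() {
        Map<Phase, VertexComputation> byPhase = new EnumMap<>(Phase.class);
        byPhase.put(Phase.INITIALIZE_SELF, id -> id + ": randomly add to centroid");
        byPhase.put(Phase.INITIALIZE_FRIEND, id -> id + ": add my vector to centroid if a friend selected me");
        byPhase.put(Phase.K_MEANS, id -> id + ": do k-means");
        byPhase.put(Phase.START_EDGE_CUT, id -> id + ": send cluster id along out edges");
        byPhase.put(Phase.END_EDGE_CUT, id -> id + ": aggregate edges with different cluster ids");
        return byPhase;
    }

    /** The master picks the phase; workers run whichever computation is registered for it. */
    static String runOnVertex(Phase chosenByMaster, String vertexId) {
        return registry().get(chosenByMaster).compute(vertexId);
    }

    public static void main(String[] args) {
        System.out.println(runOnVertex(Phase.K_MEANS, "v1")); // prints v1: do k-means
    }
}
```

Because each computation is a separate unit, the same edge-cut pair could be reused after a different clustering algorithm without touching the k-means code.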
Composable computation (cont)
Balanced Label Propagation
1. Compute candidates to move to partitions
2. Probabilistically move vertices
3. Continue if halting condition not met (i.e. < n vertices moved?)

Affinity Propagation
1. Calculate and send responsibilities
2. Calculate and send availabilities
3. Update exemplars
4. Continue if halting condition not met (i.e. < n vertices changed exemplars?)
Faster than Hive?
Application                  | Graph Size  | CPU Time Speedup | Elapsed Time Speedup
Page rank (single iteration) | 400B+ edges | 26x              | 120x
Friends of friends score     | 71B+ edges  | 12.5x            | 48x

Apache Giraph scalability
[Figure: Scalability of workers (200B edges); seconds (0 to 500) vs. # of workers (50 to 300), Giraph compared with ideal]
[Figure: Scalability of edges (50 workers); seconds (0 to 500) vs. # of edges (1E+09 to 2E+11), Giraph compared with ideal]
A billion edges isn’t cool. 

You know what’s cool?
A TRILLION edges.
Page rank on 200 machines
with 1 trillion
(1,000,000,000,000) edges
<4 minutes / iteration!
* Results from 6/30/2013 with one-to-all messaging + request processing improvements
Why balanced partitioning
Random partitioning == good balance
BUT ignores entity affinity

[Figure: vertices 0 to 11 placed by random partitioning]
Balanced partitioning application
Results from one service:
Cache hit rate grew from 70% to 85%, bandwidth cut in half
[Figure: vertices 0 to 11 after balanced partitioning]
Balanced label propagation results

* Loosely based on Ugander and Backstrom, "Balanced label propagation for partitioning massive graphs", WSDM '13
Avoiding out-of-core
Example: Mutual friends calculation between neighbors

1. Send your friends a list of your friends
2. Intersect with your friend list

Mutual friends computed at each vertex of the example graph:
A   C:{D}  D:{C}
B   E:{}
C   A:{D}  D:{A,E}  E:{D}
D   A:{C}  C:{A,E}  E:{C}
E   B:{}  C:{D}  D:{C}

1.23B members (as of 1/2014)
200+ average friends (2011 S1)
8-byte ids (longs)
= 394 TB / 100 GB machines
3,940 machines (not including the graph)
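The 394 TB figure can be reproduced from the numbers on the slide. The slide does not spell out the multiplication, so the reading below is an assumption: each member sends a list of 200 ids to each of 200 friends, and TB and GB are decimal units.

```latex
\begin{align*}
\text{ids sent} &= 1.23\times10^{9}\ \text{members} \times 200\ \text{friends} \times 200\ \text{ids per list}
                 = 4.92\times10^{13} \\
\text{bytes}    &= 4.92\times10^{13} \times 8\ \text{B} \approx 3.94\times10^{14}\ \text{B} \approx 394\ \text{TB} \\
\text{machines} &= \frac{394\ \text{TB}}{100\ \text{GB per machine}} = 3{,}940
\end{align*}
```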
Superstep splitting
Subsets of sources/destinations edges per superstep
* Currently manual - future work automatic!

Sources: A (on), B (off)
Destinations: A (on), B (off)

Sources: A (on), B (off)
Destinations: A (off), B (on)

Sources: A (off), B (on)
Destinations: A (on), B (off)

Sources: A (off), B (on)
Destinations: A (off), B (on)

[Figure: the same graph of A and B vertices shown four times, once per source/destination combination]
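The effect of splitting can be sketched with the mutual-friends calculation: the same result is produced, but only the edges of one source/destination combination carry messages at a time. The sketch below is an illustration under assumptions, not Giraph code: vertices are grouped by `id % groups`, the five-vertex graph (A..E as 0..4) is inferred from the friend sets in the mutual-friends example, and the class, field, and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Mutual friends computed in groups * groups sub-supersteps; groups == 1 means no splitting. */
class SuperstepSplittingSketch {

    /** Mutual-friend sets plus the largest number of ids sent in any one sub-superstep. */
    static class Result {
        final Map<String, Set<Integer>> mutual = new TreeMap<>();
        long peakIdsInFlight = 0;
    }

    /** friends[v] lists the friends of vertex v; the key "d<-s" holds the friends d and s share. */
    static Result mutualFriends(int[][] friends, int groups) {
        Result result = new Result();
        for (int srcGroup = 0; srcGroup < groups; srcGroup++) {
            for (int dstGroup = 0; dstGroup < groups; dstGroup++) {
                long idsInFlight = 0;
                for (int src = 0; src < friends.length; src++) {
                    if (src % groups != srcGroup) {
                        continue; // this source is "off" in the current sub-superstep
                    }
                    for (int dst : friends[src]) {
                        if (dst % groups != dstGroup) {
                            continue; // this destination is "off"
                        }
                        idsInFlight += friends[src].length; // the message carries src's whole friend list
                        Set<Integer> common = toSet(friends[src]);
                        common.retainAll(toSet(friends[dst]));
                        result.mutual.put(dst + "<-" + src, common);
                    }
                }
                result.peakIdsInFlight = Math.max(result.peakIdsInFlight, idsInFlight);
            }
        }
        return result;
    }

    private static Set<Integer> toSet(int[] ids) {
        Set<Integer> set = new TreeSet<>();
        for (int id : ids) {
            set.add(id);
        }
        return set;
    }

    public static void main(String[] args) {
        int[][] friends = {{2, 3}, {4}, {0, 3, 4}, {0, 2, 4}, {1, 2, 3}};
        Result whole = mutualFriends(friends, 1);
        Result split = mutualFriends(friends, 2);
        System.out.println(whole.mutual.equals(split.mutual));                      // prints true
        System.out.println(whole.peakIdsInFlight + " vs " + split.peakIdsInFlight); // prints 32 vs 11
    }
}
```

On this small graph the peak drops from 32 ids to 11 while the results stay identical. The price is four passes over the edges instead of one.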
Debugging with GiraphicJam
Giraph in Production
Over 1.5 years in production
Over 100 jobs processed a week
30+ applications in our internal application repository
Sample production job - 700B+ edges
Very stable

• Checkpointing disabled (highly loaded HDFS adds instability)
• Retries handle intermittent failures
Giraph roadmap

2/12 - 0.1
5/13 - 1.0
Spring 2014 - 1.1
Relaxing BSP - 1.2?

• Giraph++ (IBM research)
• Giraphx (University at Buffalo, SUNY)
Future work
Evaluate alternative computing models
Performance
Lower the barrier to entry
Applications
Our team
Pavan Athivarapu
Avery Ching
Maja Kabiljo
Greg Malewicz
Sambavi Muthukrishnan