Pregel and Giraph

Published in: Data & Analytics

  1. Pregel
  2. Pregel
     • A system for large-scale graph processing
     • Sufficiently flexible to express arbitrary graph algorithms
     • Easy to use
  3. Pregel: Model of Computation
     • Vertex state
     • Termination condition: all vertices are inactive and no messages are in transit
  4. Pregel: Model of Computation
     • Sequence of supersteps
     • Invoke compute() for each active vertex
     • Each vertex can
       – Modify its state and its outgoing edges
       – Receive messages
       – Send messages to other vertices
  5. Pregel: Model of Computation
  6. Pregel: Model of Computation
  7. Pregel API
  8. Pregel API
     • Combiners
     • Aggregators
     • Topology mutations
     • Input and output
  9. Giraph
  10. Why not implement Giraph with multiple MapReduce jobs?
     • Too much disk I/O, no in-memory caching, and every superstep becomes a separate job
  11. Giraph is a single map-only job in Hadoop
     • Hadoop is purely a resource manager for Giraph; all communication is done through Netty-based IPC
  12. Maximum vertex value implementation
  13. Giraph components
     • Master
       – One active master at a time
       – Assigns partition owners to workers prior to each superstep
       – Synchronizes supersteps
     • Worker
       – Loads the graph from input
       – Performs the computation and messaging for its assigned partitions
  14. Graph distribution
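
The computation model and the maximum vertex value example from the slides can be tried out with a small single-process simulation. This is only a sketch under stated assumptions: the function name `pregel_max`, its parameters, and the toy graph are hypothetical, it does not use the Pregel or Giraph API, and the slide's own implementation is not reproduced here. Each vertex runs its compute step, sends its value along its outgoing edges when the value changes, and votes to halt; a max combiner merges the messages addressed to the same vertex.

```python
def pregel_max(values, edges, max_supersteps=50):
    """Toy Pregel-style loop that propagates the maximum vertex value.

    values: dict mapping vertex id -> initial value (hypothetical input format)
    edges:  dict mapping vertex id -> list of out-neighbour ids
    Returns (final values, number of supersteps executed).
    """
    values = dict(values)
    active = set(values)   # every vertex starts active
    inbox = {}             # vertex id -> combined message for this superstep
    superstep = 0

    while (active or inbox) and superstep < max_supersteps:
        outbox = {}
        # A halted vertex is reactivated when a message arrives for it.
        for v in active | set(inbox):
            # compute() for one vertex: take the max of own value and message.
            incoming = [inbox[v]] if v in inbox else []
            new_value = max([values[v]] + incoming)
            changed = new_value != values[v]
            values[v] = new_value
            # Send only in the first superstep or when the value changed.
            if superstep == 0 or changed:
                for target in edges.get(v, []):
                    outbox.setdefault(target, []).append(new_value)
        # Every vertex votes to halt at the end of its compute step.
        active = set()
        # Combiner: only the largest message per target vertex is delivered.
        inbox = {target: max(msgs) for target, msgs in outbox.items()}
        superstep += 1

    return values, superstep


if __name__ == "__main__":
    # Hypothetical four-vertex chain with edges in both directions.
    initial = {1: 3, 2: 6, 3: 2, 4: 1}
    chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(pregel_max(initial, chain))
```

The loop ends when no vertex is active and no message is waiting to be delivered, which is the termination condition from the model-of-computation slide. Messages produced in one superstep become visible only in the next one, so the order in which vertices are visited inside a superstep does not affect the result.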
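
The master's job of assigning partition owners to workers, and the resulting graph distribution, can be illustrated in the same spirit. The sketch below assumes integer vertex ids, a modulo rule for mapping a vertex to a partition, and round-robin assignment of partitions to workers. All three are assumptions made for illustration, as are the names `partition_of`, `assign_partitions`, and `distribute`; the slides do not say which partitioning scheme Giraph applies.

```python
def partition_of(vertex_id, num_partitions):
    """Map a vertex to a partition (assumed rule: id modulo partition count)."""
    return vertex_id % num_partitions


def assign_partitions(num_partitions, workers):
    """Master side: pick an owner for each partition (assumed round-robin)."""
    return {p: workers[p % len(workers)] for p in range(num_partitions)}


def distribute(vertex_ids, num_partitions, workers):
    """Return, for each worker, the vertices whose partitions it owns."""
    owners = assign_partitions(num_partitions, workers)
    placement = {w: [] for w in workers}
    for v in vertex_ids:
        placement[owners[partition_of(v, num_partitions)]].append(v)
    return placement


if __name__ == "__main__":
    # Hypothetical setup: 8 vertices, 4 partitions, 2 workers.
    print(distribute(range(8), 4, ["w0", "w1"]))
```

Using more partitions than workers, as in this example, lets the master move whole partitions between workers without touching individual vertices.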
