TDC2017 | POA BigData Track - Scalability, Performance, and Architecture of Apache Kafka for Event-Driven Architectures

  1. @alexandregamma Apache Kafka
  2. Alexandre Gama - Senior Software Engineer - Team Leader - Teacher - Speaker / Writer
  3. @alexandregamma Agenda - Apache Kafka 101 - Kafka High Level Architecture - Kafka Low Level Architecture
  4. @alexandregamma Apache Kafka 101 - Where is it used? - What is Apache Kafka? - What is it used for? - Why is it so special?
  5-13. Companies Using Kafka - 1⁄3 of all Fortune 500 - Top 10 travel companies - 7 of top 10 banks - 8 of top 10 insurance companies - 9 of top 10 telecom companies - LinkedIn - Microsoft - Netflix - Elo7 (the best one :P)
  14-17. What is Kafka? - Distributed Streaming Platform - Publish and Subscribe - React to Events in Real Time
  18. Kafka Big Picture
  19-23. What is it used for? - Real Time Streams - Collect Big Data - 1,000,000,000,000 - Real Time Analysis
  24-28. What can it work with?
  29-34. Why is it so special? - Distributed Streaming Platform - High Throughput and Real Time - Reliability - Awesome Replication - Fault Tolerance
  35. @alexandregamma High Level Architecture - Brokers - Topic Architecture - Producer Architecture - Consumer Architecture
  36. Big Picture Again
  37. Kafka Broker
  38. Kafka Cluster
  39. Producer -> Broker
  40. Broker -> Consumer
  41-46. Topic Architecture - Kafka stores Topics in logs - Topic log is broken up into Partitions - Partitions for Speed, Scalability, and Size - Record order is maintained only within a single Partition - Topic Partitions are a unit of Parallelism
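The Topic Architecture bullets can be illustrated with a toy in-memory model. This is a minimal sketch, not Kafka's implementation: the `Topic` class, the topic name, and the keys are hypothetical, and `zlib.crc32` stands in for the murmur2 hash that Kafka's default Java partitioner applies to keyed records.

```python
import zlib


class Topic:
    """Toy topic: a fixed list of append-only partition logs (hypothetical model)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # A stable hash sends every record with the same key to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        # The offset is simply the record's position inside its partition.
        return partition, len(log) - 1


topic = Topic("orders", num_partitions=3)
for i in range(6):
    topic.append("user-a", f"a{i}")
    topic.append("user-b", f"b{i}")

# Order is preserved for one key because all of its records share a partition.
partition_a = topic.partition_for("user-a")
values_a = [v for k, v in topic.partitions[partition_a] if k == "user-a"]
print(values_a)
```

Records for different keys may land in different partitions, so no ordering is implied between them. That is the trade-off the slide names: partitions buy parallelism at the cost of topic-wide ordering.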
  47-50. Producer Architecture - Load Balancing - Async Messages - Durability
  51. Load Balancing: Routing Tier?
  52. Wrong Broker?
  53. Wrong Broker - Load Balancing
  54. Strategy
  55-57. Async Messages
  58-59. Batch Messages
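The Async Messages and Batch Messages slides carry only titles, so the following is a sketch under assumptions rather than a reproduction of the talk. It models a producer that buffers records and sends them in groups once a size limit is reached. `BatchingProducer` and `max_batch_bytes` are hypothetical names. The real Kafka producer controls batching with the `batch.size` and `linger.ms` settings, and only the size trigger is modelled here.

```python
class BatchingProducer:
    """Toy producer that groups records into batches before sending (hypothetical)."""

    def __init__(self, send_fn, max_batch_bytes=64):
        self.send_fn = send_fn              # called once per batch, like one network request
        self.max_batch_bytes = max_batch_bytes
        self.buffer = []
        self.buffered_bytes = 0

    def send(self, record: bytes):
        # Returns immediately; flush first if this record would overflow the batch.
        if self.buffer and self.buffered_bytes + len(record) > self.max_batch_bytes:
            self.flush()
        self.buffer.append(record)
        self.buffered_bytes += len(record)

    def flush(self):
        if self.buffer:
            self.send_fn(list(self.buffer))
            self.buffer.clear()
            self.buffered_bytes = 0


requests = []
producer = BatchingProducer(requests.append, max_batch_bytes=64)
for _ in range(10):
    producer.send(b"x" * 20)
producer.flush()

# Ten 20-byte records travel in four requests instead of ten.
print([len(batch) for batch in requests])  # [3, 3, 3, 1]
```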
  60. Curiosity: Buffer Memory on Producer
  61. Curiosity 2: Producer Retries on Errors
  62. Curiosity 3: Messages in Flight
  63-67. Producer Durability - But Before...
  68-71. Producer Durability
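The Producer Durability slides are title-only in this transcript. Assuming they refer to the producer `acks` setting (a real Kafka option whose values include `0`, `1`, and `all`), the sketch below models what each level confirms before the producer is acknowledged. The `Replica` and `Partition` classes are hypothetical simplifications, and every follower is treated as in sync.

```python
class Replica:
    def __init__(self):
        self.log = []


class Partition:
    """Toy partition with one leader and several followers (hypothetical model)."""

    def __init__(self, num_followers):
        self.leader = Replica()
        self.followers = [Replica() for _ in range(num_followers)]

    def replicate(self):
        # Each follower copies whatever it is missing from the leader.
        for follower in self.followers:
            follower.log.extend(self.leader.log[len(follower.log):])

    def produce(self, record, acks):
        """Return how many replicas are confirmed to hold the record at ack time."""
        self.leader.log.append(record)
        if acks == "0":
            return 0  # fire and forget: the producer never learns whether the write landed
        if acks == "1":
            return 1  # only the leader's write is confirmed
        if acks == "all":
            self.replicate()  # wait until every (in-sync) follower has the record
            return 1 + len(self.followers)
        raise ValueError(f"unsupported acks value: {acks!r}")


confirmed = {acks: Partition(num_followers=2).produce("payment-1", acks)
             for acks in ("0", "1", "all")}
print(confirmed)  # {'0': 0, '1': 1, 'all': 3}

# With acks=1 the record exists only on the leader until replication runs.
risky = Partition(num_followers=2)
risky.produce("payment-2", acks="1")
lost_if_leader_dies = "payment-2" not in risky.followers[0].log
print(lost_if_leader_dies)  # True
```

Higher `acks` levels trade latency for durability: the producer waits longer but fewer failure scenarios lose the record.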
  72-75. Consumer Architecture - Big Picture - Consumer Group - ZooKeeper
  76-77. Consuming Messages
  78. Pull vs Push
  79. Pull Strategy
  80. Offset
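The pull strategy and the offset can be sketched as follows. This is an illustrative model only, and `PartitionLog` and `PullConsumer` are hypothetical names. The point it shows is that the consumer, not the broker, decides when to fetch and from which position, so it can read at its own pace or re-read old records.

```python
class PartitionLog:
    """Toy partition: an append-only list addressed by offset (hypothetical)."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def fetch(self, offset, max_records):
        # The log is not modified by reads; it just serves a slice.
        return self.records[offset:offset + max_records]


class PullConsumer:
    def __init__(self, log, start_offset=0):
        self.log = log
        self.offset = start_offset  # position of the next record to read

    def poll(self, max_records=2):
        batch = self.log.fetch(self.offset, max_records)
        self.offset += len(batch)   # advance only past what was actually received
        return batch


log = PartitionLog()
for i in range(5):
    log.append(f"event-{i}")

consumer = PullConsumer(log)
first = consumer.poll()    # ['event-0', 'event-1']
second = consumer.poll()   # ['event-2', 'event-3']

# A second consumer keeps its own offset and can replay from the beginning.
replayer = PullConsumer(log, start_offset=0)
replayed = replayer.poll(max_records=5)
```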
  81-85. Consumer Group
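The Consumer Group slides are title-only here. In Kafka, each partition of a topic is read by at most one consumer within a group, so the partition count caps the group's parallelism. The sketch below shows that rule with a simple round-robin split. The `assign` function and the consumer names are hypothetical, and Kafka's real assignors are more elaborate.

```python
def assign(partitions, consumers):
    """Give each partition to exactly one consumer of the group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

two_consumers = assign(partitions, ["c1", "c2"])
print(two_consumers)   # {'c1': [0, 2], 'c2': [1, 3]}

# More consumers than partitions: the extra consumer gets nothing to read.
five_consumers = assign(partitions, ["c1", "c2", "c3", "c4", "c5"])
print(five_consumers["c5"])  # []
```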
  86. @alexandregamma Low Level Architecture - Persistence - Efficiency
  87-97. Persistence - Data: ● Filesystem vs Memory ● Disk is slow? ● Linear Writes vs Random Writes ● JVM Memory ● Compact Byte Structure - Cache: ● Flush vs Persisted Log ● Kernel Pagecache ● O(1) -> Data Size ● OS -> Read Ahead ● OS -> Write Behind
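The Persistence bullets (linear writes, a compact byte structure, reliance on the OS cache, constant-time access regardless of data size) can be sketched with a small append-only file. This is an assumption-laden illustration, not Kafka's on-disk format: `AppendOnlyLog`, the 4-byte length prefix, and the file name are hypothetical, and the in-memory position list stands in for a real index.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Length-prefixed records written sequentially to one file (hypothetical format)."""

    def __init__(self, path):
        self.file = open(path, "w+b")
        self.positions = []  # offset -> byte position of the record in the file

    def append(self, payload: bytes) -> int:
        self.file.seek(0, os.SEEK_END)       # writes only ever go to the end of the file
        self.positions.append(self.file.tell())
        self.file.write(struct.pack(">I", len(payload)) + payload)
        self.file.flush()                    # hands the bytes to the OS; no fsync is forced
        return len(self.positions) - 1

    def read(self, offset: int) -> bytes:
        # One seek plus one bounded read, independent of how large the log is.
        self.file.seek(self.positions[offset])
        (length,) = struct.unpack(">I", self.file.read(4))
        return self.file.read(length)

    def close(self):
        self.file.close()


with tempfile.TemporaryDirectory() as tmp:
    log = AppendOnlyLog(os.path.join(tmp, "00000000.log"))
    offsets = [log.append(m) for m in (b"first", b"second", b"third")]
    middle = log.read(1)
    log.close()

print(offsets, middle)  # [0, 1, 2] b'second'
```

Because the file is only appended to, the disk sees sequential writes, and recently written data tends to still be in the kernel page cache when consumers read it.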
  98. Efficiency
  99-112. Required!
  113-116. Zero Copy
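The Zero Copy slides are title-only in this transcript. The idea is to move bytes from a file to a socket inside the kernel instead of copying them through application buffers. Kafka does this on the JVM through `FileChannel.transferTo`. The sketch below shows the same idea with Python's `socket.sendfile`, which uses `os.sendfile` where available and falls back to plain `send()` otherwise. The file name, the payload, and the socket pair standing in for a broker and a consumer are assumptions for illustration.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path, sock):
    """Stream a whole file into a connected socket and return the bytes sent."""
    with open(path, "rb") as f:
        # No read() into a Python buffer: the transfer is delegated to the OS.
        return sock.sendfile(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "segment.log")
    payload = b"kafka-record\n" * 100
    with open(path, "wb") as f:
        f.write(payload)

    # A connected socket pair plays the roles of broker and consumer.
    broker_side, consumer_side = socket.socketpair()
    with broker_side, consumer_side:
        sent = send_file_zero_copy(path, broker_side)
        received = b""
        while len(received) < sent:
            received += consumer_side.recv(4096)

print(sent == len(payload), received == payload)
```

The payload is kept small so it fits in the socket buffer and the single-threaded demo cannot block.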
  117. Thank You!
