
Apache Kafka – (Pattern and) Anti-Pattern

Speaker: Damien Gasparina, Engineer, Confluent

Here's how to fail at Apache Kafka brilliantly!

https://www.meetup.com/Paris-Data-Engineers/events/260694777/


  1. Apache Kafka® – Pattern and Anti-Pattern: How to fail at Apache Kafka brilliantly!
  2. This is me: Engineer at Confluent, ex-MongoDB, ex-Oracle, cute smooth curly hair
  3. Distributed logs: scalable, fault tolerant, concurrent, strongly ordered, retentive
  4. Kafka: a streaming platform – the log, connectors, producer, consumer, streaming engine
  5. Producer, consumer, partitions, Kafka cluster
  6. … which is true for any data system
  7. And today we will discuss the most common ones
  8. Problem #1: Not thinking about durability * How to fail *
  9. Durability is achieved through replication
  10. When will the cluster acknowledge?
  11. As soon as it is on the page cache of the leader…
  12. Acks: acks=1 (default value) is good for latency; acks=all is good for durability
  13. Replication before acknowledging: with acks=all, the leader waits for data replication
  14. acks=all means the leader will wait for the full set of in-sync replicas to acknowledge the record
  15. But only the in-sync replicas…
  16. … which could be only one server
  17. min.insync.replicas: the minimum number of replicas that must acknowledge (default: 1)
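To make acks=all meaningful, min.insync.replicas can be raised at topic creation time. A minimal sketch (the topic name, partition count, and replica count below are illustrative, and assume a 3-broker cluster):

```shell
# Create a topic whose writes must reach at least 2 in-sync replicas
# before the leader acknowledges them
kafka-topics --zookeeper localhost:2181 --create \
  --topic my-topic \
  --partitions 6 \
  --replication-factor 3 \
  --config min.insync.replicas=2
```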
  18. Data durability while producing: tune it with the acks and min.insync.replicas parameters
  19. The default values are optimized for availability & latency. If durability is more important, tune them!
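A producer configured for durability rather than the defaults might look like this (a sketch, not the deck's own code; the broker address is a placeholder):

```java
import java.util.Properties;

public class DurableProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // Durability over latency: wait for all in-sync replicas to acknowledge.
        // Pair this with a topic-level min.insync.replicas >= 2 so that
        // "all" means more than just the leader.
        props.put("acks", "all");
        return props;
    }
}
```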
  20. For consumers, it's a lot easier: a consumer can read only replicated data
  21. If you are using transactions: isolation.level=read_committed (default: read_uncommitted)
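A consumer opting into read_committed could be sketched as follows (the group id and broker address are made up):

```java
import java.util.Properties;

public class TransactionalConsumerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "my-group");                // hypothetical group id
        // Only read messages from committed transactions;
        // the default is read_uncommitted
        props.put("isolation.level", "read_committed");
        return props;
    }
}
```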
  22. Think about data durability and decide on the best trade-off for you * How to succeed *
  23. Problem #2: Focusing only on the happy path * How to fail *
  24. Some parameters to know
  25. Built-in retries: retries will cause the client to resend any record whose send fails with a potentially transient error. Default value: MAX_INT (before AK 2.1: 0…)
  26. What happens in case of an issue with retries? "No longer leader": the leader moved to a different broker
  27. Use built-in retries! * How to succeed *
  28. But you are exposed to a different kind of issue
  29. Message duplication: the producer times out and resends a record the broker already wrote
  30. Built-in idempotency: enable.idempotence, when set to 'true', ensures the producer writes exactly one copy of each message. Default value: false
  31. With retries and idempotency, each resent record carries the same ID…
  32. … so the broker can discard the duplicates
  33. Use built-in idempotency! * How to succeed *
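Putting slides 25–33 together: a producer with built-in retries and idempotency enabled (a sketch; the broker address is a placeholder):

```java
import java.util.Properties;

public class IdempotentProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // Retry transient send failures (default since AK 2.1)
        props.put("retries", Integer.toString(Integer.MAX_VALUE));
        // The broker de-duplicates retried batches, so retries
        // no longer cause message duplication
        props.put("enable.idempotence", "true");
        // Idempotence requires acks=all
        props.put("acks", "all");
        return props;
    }
}
```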
  34. Problem #3: No idempotent consumer * How to fail *
  35. At least once (default), at most once, exactly once?
  36. Embrace at least once! * How to succeed *
  37. Rely on Kafka Streams with exactly-once! * How to succeed *
  38. Problem #4: No exception handling * How to fail *
  39. Producer, consumer: what to do in case of an error?
  40. Plenty of potential issues
  41. 1) A message cannot be processed 2) A message doesn't have the expected schema
  42. Multiple strategies
  43. 1) Retry:

```java
while (this.getRunning()) {
    ConsumerRecords<String, String> consumerRecords;
    try {
        consumerRecords = consumer.poll(1000);
    } catch (Exception e) {
        Logger.error(e);
        continue;
    }
    for (var record : consumerRecords) {
        try {
            /* Process the message */
        } catch (Exception e) {
            /* … */
        }
    }
}
```
  44. 2) Write to a dead letter queue and continue:

```java
while (this.getRunning()) {
    var consumerRecords = consumer.poll(1000);
    for (var record : consumerRecords) {
        try {
            /* Process the message */
        } catch (Exception e) {
            producer.send(new ProducerRecord<>("dead-my-topic", …));
            Logger.error(e);
        }
    }
}
```
  45. 3) Ignore & continue:

```java
kafkaProducer.send(record, (metadata, exception) -> {
    if (exception != null) {
        /* Something bad happened */
        /* But those are ephemeral data anyway */
        Logger.error(exception);
    }
});
```
  46. There is no silver-bullet solution
  47. Handle the exceptions! * How to succeed * https://eng.uber.com/reliable-reprocessing/
  48. Problem #5: No data governance * How to fail *
  49. Changes in producers might impact consumers
  50. Sharing datasets: as soon as producer and consumer are part of different teams, consider using a Schema Registry! * How to succeed *
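With Confluent's Schema Registry, producers typically swap in the Avro serializer and point at the registry. A sketch, assuming the Confluent serializers are on the classpath (both URLs are placeholders):

```java
import java.util.Properties;

public class AvroProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");          // placeholder address
        props.put("schema.registry.url", "http://localhost:8081"); // placeholder address
        // The Confluent Avro serializers register and validate schemas,
        // so an incompatible producer change fails fast instead of
        // silently breaking consumers
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        return props;
    }
}
```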
  51. Consider using quotas as soon as there is multi-tenancy (several producer and consumer applications sharing one cluster)
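Quotas are set per client (or per user) with the kafka-configs tool; the client id and byte rates below are illustrative:

```shell
# Throttle a hypothetical client id "application-2" to ~1 MB/s produce
# and ~2 MB/s consume (rates are in bytes/second)
kafka-configs --zookeeper localhost:2181 --alter \
  --entity-type clients --entity-name application-2 \
  --add-config 'producer_byte_rate=1048576,consumer_byte_rate=2097152'
```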
  52. Problem #6: Installing in prod Sunday night * How to fail *
  53. A typical devops team
  54. Misconfiguration! If you use the default configuration… you will have surprises! * How to fail *
  55. The best thing to do is to… read the goddam manual! Running Apache Kafka in Production, Running Apache ZooKeeper in Production * How to succeed *
  56. Not using security * How to fail *
  57. Ordering? * How to fail *
  58. Too many partitions * How to fail *
  59. Too few partitions * How to fail *
  60. Calling external services in Kafka Streams * How to fail *
  61. No monitoring * How to fail *
  62. Problems: data durability, error handling & transient issues, data governance, misconfiguration, monitoring
  63. DO: understand Apache Kafka, configure your clients & cluster properly, follow engineering best practices – https://www.confluent.io/white-paper/optimizing-your-apache-kafka-deployment/
  64. And if you cannot do all of this, consider using a cloud service
  65. And if you can: we are hiring!
  66. References:
      • Running Apache Kafka in production: https://docs.confluent.io/current/kafka/deployment.html
      • Running Apache ZooKeeper in production: https://docs.confluent.io/current/zookeeper/deployment.html
      • Confluent Reference Architecture: https://www.confluent.io/resources/apache-kafka-confluent-enterprise-reference-architecture/
      • The goddam documentation: https://docs.confluent.io/current/getting-started.html
      • Kafka Summit: https://kafka-summit.org/
  67. I hope it has been useful :)
  68. Questions?
