O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Reactive Kafka

4.049 visualizações

Publicada em

SpringOne Platform 2016
Speaker: Rajini Sivaram; Principal Software Engineer, Pivotal

Apache Kafka is a distributed, scalable, high-throughput messaging bus. Over the last few years, Kafka has emerged as a key building block for data-intensive distributed applications. As a high performance message bus, Kafka enables the development of distributed applications using the microservices architecture.

Reactive Streams simplifies the development of asynchronous systems using non-blocking back pressure. The reactive framework enables the development of asynchronous microservices by providing a side-effect free functional API with minimal overhead that supports low latency, non-blocking end-to-end data flows.

After an introduction to Kafka and reactive streams, we will explore the development of a reactive streams interface for Kafka and the use of this interface for building robust applications using the microservices architecture.

Publicada em: Tecnologia

Reactive Kafka

  1. 1. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Kafka Rajini Sivaram
  2. 2. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Safe Harbor Statement •  The following is intended to outline the general direction of Pivotal's offerings. It is intended for information purposes only and may not be incorporated into any contract. Any information regarding pre-release of Pivotal offerings, future updates or other planned modifications is subject to ongoing evaluation by Pivotal and is subject to change. This information is provided without warranty or any kind, express or implied, and is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions regarding Pivotal's offerings. These purchasing decisions should only be based on features currently available. The development, release, and timing of any features or functionality described for Pivotal's offerings in this presentation remain at the sole discretion of Pivotal. Pivotal has no obligation to update forward looking information in this presentation. 2
  3. 3. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ About me •  Principal Software Engineer at Pivotal UK •  Previously at IBM •  Message Hub developer – Kafka-as-a-Service on Bluemix •  Contributor to Apache Kafka 3
  4. 4. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Agenda Ø  Introduction to Apache Kafka Ø  Clients and tools for Kafka Ø  Non-reactive Java clients Ø  Introduction to Reactive Streams Ø  Reactive Kafka Ø  Reactive API for Kafka Ø  Performance evaluation Ø  Summary 4 Reactive Kafka
  5. 5. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ What is Kafka? •  Message Bus •  Publish/Subscribe •  Fast •  Scalable •  Durable •  Highly available •  Commit Log Service •  Distributed •  Replicated •  Partitioned 5
  6. 6. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka as a Message Bus 6 Ø  Decouples producers and consumers Ø  Messages are stored for a configurable retention interval Ø  Consumers can replay message stream
  7. 7. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka as a Commit Log Service 7 Broker 1 Broker 2 Broker 3 Broker 4 Broker 5 Partition 0 Partition 1 topicA Follower Leader Ø  Distributed Ø  Replicated Ø  Partition Ø  Immutable, Ordered Stream Ø  Enables scalability and parallelism
  8. 8. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Message Format 8 Message (v1) => Crc MagicByte Attributes Key Value o    Crc => int32 o    MagicByte => int8 o    Attributes => int8 o    Timestamp => int64 o    Key => bytes o    Value => bytes Header Typically used to choose partition Opaque array of bytes CRC Magic Attr Timestamp Key Length Key Bytes Value Length Value Bytes
  9. 9. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Topics and Logs 9 On-disk: Ø  /kafka-logs Ø  topicA-0 Ø  00000000000000000000.index Ø  00000000000000000000.log Ø  topicA-1 Ø  00000000000000000000.index Ø  00000000000000000000.log Ø  00000000000008012974.index Ø  00000000000008012974.log Ø  recovery-point-offset-checkpoint Ø  replication-offset-checkpoint Ø  cleaner-offset-checkpoint Ø  Topics Ø  Partitions Ø  Offsets Ø  Logs Ø  Log segments Ø  Log compaction
  10. 10. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka BrokerKafka Broker Kafka Cluster 10 Kafka Broker Kafka Cluster Kafka BrokerKafka BrokerZookeeper Server Zookeeper Cluster Monitoring
  11. 11. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Message transmission 11 Broker 1 Broker 2 Broker 3 Broker 4 Broker 5 Partition 0 Partition 1 topicA Follower Leader ProducerConsumer Ø  Replication Ø  Async persistence to disk Ø  In-sync replicas Ø  Zero-copy writes þ
  12. 12. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Broker 5Broker 4 High Availability 12 Broker 1 Broker 2 Broker 3 Partition 0 Partition 1 topicA Follower Leader Controller Zookeeper Ø  Rolling upgrades
  13. 13. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Messaging guarantees •  Message ordering •  Ordered partitions •  Messages sent by a producer to a partition are appended to log in that order •  Consumer instances see messages in the order stored in the log •  Delivery semantics •  At least once •  At most once •  Exactly once – not currently supported within Kafka 13
  14. 14. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Broker configuration 14 Configuration Description advertised.listeners Listeners published to ZK for clients to use auto.leader.rebalance.enable Background thread periodically checks and triggers rebalance if required replica.lag.time.max.ms If follower hasn’t fetched or consumed up to log end for this period, follower is removed from ISR unclean.leader.election.enable Indicates whether to enable replicas not in the ISR set to be elected as leader as a last resort, even though doing so may result in data loss broker.rack Used for rack-aware replica assignment for improved fault tolerance etc. etc.
  15. 15. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Topic configuration 15 Configuration Description cleanup.policy Compact or delete retention.bytes Maximum log size retention.ms Maximum time logs are retained (7 days by default) max.message.bytes Largest message size allowed for topic segment.bytes Maximum log segment size min.insync.replicas For producer acks=all, minimum number of replicas for a successful write etc. etc.
  16. 16. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Security •  TLS •  Encryption •  Client authentication •  SASL Authentication •  GSSAPI/Kerberos •  PLAIN •  Access Control •  Operations: CRUD, ClusterAction, All •  ACL Resources: Cluster, Topic, Consumer groups •  Quotas 16 Kafka Zookeeper Kafka Kafka Client Proxy Proxy Proxy TLS/ Plaintext TLS
  17. 17. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka Use cases •  Messaging •  Micro-service architecture •  Commit Log •  Stream processing •  Event sourcing •  See http://kafka.apache.org/documentation.html#uses for more examples 17
  18. 18. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Agenda Ø  Introduction to Apache Kafka Ø  Clients and tools for Kafka Ø  Non-reactive Java clients Ø  Introduction to Reactive Streams Ø  Reactive Kafka Ø  Reactive API for Kafka Ø  Performance evaluation Ø  Summary 18
  19. 19. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka clients 19 Kafka BrokerKafka BrokerKafka Broker Kafka Connect REST proxy Ø  Large ecosystem of clients, tools, connectors Kafka Streams Schema registry C/C++ Consumer Go Consumer Java ConsumerJava Producer C/C++ Producer Go Producer
  20. 20. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka producer: non-reactive Java example 20 producer = new KafkaProducer<>(producerProps); for (int i = 0; i < count; i++) { String message = "Message_" + i; producer.send(new ProducerRecord<>(topic, i, message), new Callback() { public void onCompletion(RecordMetadata metadata, Exception exception) { if (exception == null) System.out.println("Message sent successfully, metadata=" + metadata); else log.error("Send failed", exception); } }); }
  21. 21. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka producer: non-reactive Java 8 21 producer = new KafkaProducer<>(producerProps); for (int i = 0; i < count; i++) { String message = "Message_" + i; producer.send(new ProducerRecord<>(topic, i, message), (metadata, exception) -> { if (exception == null) System.out.println("Message sent successfully, metadata=" + metadata); else log.error("Send failed", exception); });}
  22. 22. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Producer sequence diagram 22 Producer Broker 1 Broker 2 Find leader Send message to leader Determine partition Add to local queue Producer Network Thread send() Wait for metadata Broker 3 Fetch Send response Callback Message sent successfully, metadata=demo-topic-2@62 Bootstrap Leader ISR
  23. 23. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Producer configuration 23 Configuration Description acks 0, 1, all buffer.memory Total memory for local buffer max.block.ms Max block time to wait for metadata or buffer space retries Automatic retries max.in.flight.requests.per.connection Max number of unacked requests on a connection batch.size Max size for batch of records sent to broker linger.ms Time to wait for batch etc. etc.
  24. 24. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka Consumer Ø  Messaging models Ø  Publish-subscribe Ø  Point-to-point 24
  25. 25. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Broker Consuming from Kafka •  Push vs Pull •  Push •  Pull •  Kafka •  Producers push messages, consumers pull messages •  Long poll •  Offset management •  Auto-commit •  Manual commit •  External offset management 25 __consumer_offsets commit() Compacted topic
  26. 26. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka consumer: non-reactive Java example 26 consumer.subscribe(topics, new ConsumerRebalanceListener() { public void onPartitionsAssigned(Collection<TopicPartition> partitions) {} public void onPartitionsRevoked(Collection<TopicPartition> partitions) {} }); while (received < count) { ConsumerRecords<Integer, String> records = consumer.poll(1000); // 1 second for (ConsumerRecord<Integer, String> record : records) { System.out.println("Received message: " + record); received++; } }
  27. 27. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Consumer sequence diagram 27 Broker 1 Broker 2Consumer Broker 3 subscribe() poll() Create Subscription Find group coordinator Consumer Client-side coordinator Bootstrap Leader Group coordinator Rebalance Join group Sync Group onPartitionsRevoked() Fetch topic metadata onPartitionsAssigned() Fetch offsets
  28. 28. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Consumer sequence diagram 28 Broker 1 Broker 2Consumer Broker 3Consumer Heartbeatpoll() Prefetch Client-side coordinator Bootstrap Leader Group coordinator Received message: ConsumerRecord(topic = demo-topic, partition = 2, offset = 62 …, , key = 5, value = Message_5 Fetch messages
  29. 29. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Consumer configuration 29 Configuration Description group.id Consumer group fetch.min.bytes Min bytes to wait for per fetch max.partition.fetch.bytes Max bytes per partition per fetch fetch.max.wait.ms Max time server blocks before sending response max.poll.records Max messages per poll heartbeat.interval.ms Keeps session active and facilitates group rebalancing session.timeout.ms Timeout to detect consumer instance failure etc. etc.
  30. 30. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka Connect 30 Kafka BrokerKafka BrokerKafka Broker KafkaConnect KafkaConnect External Data Sources External Data Sinks
  31. 31. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka Streams 31 Kafka Broker Processor Topology Kafka Streams Ø  Simple threading model Ø  Non-reactive Ø  Back-pressure not required
  32. 32. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Agenda Ø  Introduction to Apache Kafka Ø  Clients and tools for Kafka Ø  Non-reactive Java clients Ø  Introduction to Reactive Streams Ø  Reactive Kafka Ø  Reactive API for Kafka Ø  Performance evaluation Ø  Summary 32
  33. 33. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Streams •  Fundamental shift from imperative-style to asynchronous, non-blocking functional-style code •  Serve more heterogeneous requests concurrently •  Handle remote operations more efficiently •  Composable •  Back-pressure •  Consumer requests data from emitter •  Start receiving data only when consumer is ready to process data •  Control the amount of inflight data 33
  34. 34. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Project Reactor 34 Flux Mono
  35. 35. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 35 GATEWAY Service B Message Broker Datastore Service A End-to-End Flow Control
  36. 36. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Agenda Ø  Introduction to Apache Kafka Ø  Clients and tools for Kafka Ø  Non-reactive Java clients Ø  Introduction to Reactive Streams Ø  Reactive Kafka Ø  Reactive API for Kafka Ø  Performance evaluation Ø  Summary 36 Reactive Kafka
  37. 37. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Kafka •  Bringing reactive streams and Kafka together •  Functional-style reactive API for Kafka •  Back-pressure •  Composable API •  Implementation •  As a shim over non-reactive Java API for Kafka 37 Reactive Kafka
  38. 38. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Kafka API 38 public static <K, V> KafkaSender<K, V> create(SenderConfig<K, V> config) {} public Mono<RecordMetadata> send(ProducerRecord<K, V> record) {} public static <K, V> KafkaFlux<K, V> listenOn(FluxConfig<K, V> config, Collection<String> topics) {} public enum AckMode { AUTO_ACK, ATMOST_ONCE, MANUAL_ACK, MANUAL_COMMIT } KafkaSender: KafkaFlux:
  39. 39. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Kafka Producer 39 KafkaSender<Integer, String> sender = KafkaSender.create(senderConfig); Flux<RecordMetadata> flux = Flux.range(1, count) .flatMap(i -> sender.send(new ProducerRecord<>(topic, i, "Message_" + i))); .doOnNext(metadata -> { log.info("Message sent successfully, metadata=" + metadata); latch.countDown(); }) .doOnError(e-> log.error("Send failed", e)) … flux.subscribe();
  40. 40. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive KafkaSender sequence 40 1 2 2 3 3 flatMap({ }, maxConcurrency=256, prefetch=32) 2 31 request(256) request(32) Kafka Broker Kafka Broker Kafka Broker 1
  41. 41. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Producer sample log (edited for readability) [main] onSubscribe(r.c.publisher.FluxRange$RangeSubscription) [main] onSubscribe(r.c.publisher.FluxFlatMap$FlatMapMain) [main] request(unbounded) [main] request(256) [main] onNext(1) [main] ProducerConfig values: [main] onNext(2) .... [single-1] onNext(demo-topic-2@64) [single-1] Message sent successfully, metadata=demo-topic-2@64 41 Kafka producer created
  42. 42. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive Kafka Consumer 42 KafkaFlux<Integer, String> kafkaFlux = KafkaFlux.listenOn(fluxConfig, Collections.singleton(topic)) .doOnPartitionsAssigned(partitions -> log.debug(assigned {}", partitions)) .doOnPartitionsRevoked(partitions -> log.debug(”revoked {}", partitions)); kafkaFlux.subscribe(message -> { ConsumerOffset offset = message.consumerOffset(); ConsumerRecord<Integer, String> record = message.consumerRecord(); log.info("Received message: {} commitOffset: {}”, record.value(), offset); latch.countDown(); });
  43. 43. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive KafkaFlux sequence 43 concatMap({ }) Kafka Broker Kafka Broker Kafka Broker H merge H P P P P P H P C H P C Poll Commit Heartbeat
  44. 44. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Consumer sample log (edited for readability) [main] onSubscribe(r.c.publisher.EmitterProcessor$EmitterSubscriber) [main] onSubscribe(r.c.publisher.FluxFlatMap$FlatMapMain) [main] request(256) [main] onNext(reactor.kafka.internals.FluxManager$InitEvent) [main] request(unbounded) [main] onNext(r.k.internals.FluxManager$PollEvent) [sample-group-1] ConsumerConfig values: [sample-group-1] Discovered coordinator … group sample-group. [sample-group-1] (Re-)joining group sample-group [sample-group-1] onPartitionsAssigned [demo-topic-2, demo-topic-1] [parallel-1] onNext(org.apache.kafka.clients.consumer.ConsumerRecords) [parallel-1] ConsumerRecord(topic = demo-topic, partition = 2, offset = 64, …, key = 5, value = Message_5) [parallel-1] Received message: Message_5 commitOffset: demo-topic-2@65 44 Kafka consumer created
  45. 45. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Reactive consumer configuration 45 Configuration Description pollTimeout Timeout specified in each consumer.poll() commitInterval Duration between auto commits commitBatchSize Maximum batch size before auto commit is triggered maxAutoCommitAttempts Number of consecutive attempts to auto commit before raising an error closeTimeout Timeout for graceful close (for both producer and consumer) Ø  All non-reactive producer and properties configurable for reactive clients
  46. 46. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Agenda Ø  Introduction to Apache Kafka Ø  Clients and tools for Kafka Ø  Non-reactive Java clients Ø  Introduction to Reactive Streams Ø  Reactive Kafka Ø  Reactive API for Kafka Ø  Performance evaluation Ø  Summary 46 Reactive Kafka
  47. 47. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Performance: Test configuration (Rackspace) 47 Test Driver Kafka Cluster ZK Cluster Kafka Broker Zookeeper Kafka Broker Zookeeper Kafka Broker Zookeeper
  48. 48. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Rackspace machine configuration •  OnMetal I/O v2 instances for Kafka •  CPU: Dual 2.6 GHz, 10 core Intel® Xeon® E5-2660 v3 •  RAM: 128 GB •  Storage: Dual 1.6 TB PCIe flash cards •  Network: Redundant 10 Gb / s connections in a high availability bond •  Bare Metal Compute instances for Test driver and HTTP proxy •  CPU: Dual 2.4 Ghz, 6 core Intel® Xeon® E5-2620 v3 •  RAM: 64 GB •  Network: Redundant 10 Gb / s connections in a high availability bond 48
  49. 49. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Producer performance 49 0 0.5 1 1.5 2 Millionmessages/second Message Size Throughput(million msgs/sec) Non-reactive Reactive 1 10 100 1000 MB/second Message Size Throughput (MB/second) Non-reactive Reactive
  50. 50. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Consumer Performance 50 0 1 2 3 4 5 Million messages/second Message Size Throughput (million msgs/sec) Non-reac0ve Reac0ve 1 10 100 1000 MB/second Message Size Throughput (MB/second) Non-reac0ve Reac0ve
  51. 51. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ End-to-End Latency 51 0 1 2 3 Latency (ms) Message Size Latency 50% (millis) Non-reac0ve Reac0ve 0 1 2 3 Latency (ms) Message Size Latency 75% (millis) Non-reac0ve Reac0ve
  52. 52. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Performance: Kafka-REST vs Reactive HTTP 52 HTTP Test Driver Kafka-REST Proxy or Spring-Reactive-Web Proxy Kafka Cluster ZK Cluster Kafka Broker Zookeeper Kafka Broker Zookeeper Kafka Broker Zookeeper
  53. 53. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Spring Reactive Web Framework HTTP->Kafka 53 @Profile("kafkahttp") @RestController public class KafkaHttpController { private final ReactiveKafkaHttpSender sender; public KafkaHttpController(ReactiveKafkaHttpSender sender) { this.sender = sender; } @RequestMapping(path= "/kafkahttp/{topic}", method = RequestMethod.POST) public Flux<SendResponse> sendToKafka(@PathVariable String topic, @RequestBody Flux<Records> binaryStream) { return this.sender.sendToKafka(topic, binaryStream) .map(metadata -> new SendResponse(metadata)); }
  54. 54. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ HTTP->Kafka 54 @Profile("kafkahttp”) public class ReactiveKafkaHttpSender { private final KafkaSender<String, byte[]> kafkaSender; public ReactiveKafkaHttpSender(KafkaSender<String, byte[]> kafkaSender) { this.kafkaSender = kafkaSender; } public Flux<RecordMetadata> sendToKafka(String topic, Publisher<Records> recordStream) { return Flux.from(recordStream) .concatMap(records -> Flux.fromArray(records.getRecords())) .concatMap(r -> kafkaSender.send(new ProducerRecord<>(topic, r.getValue()))); }
  55. 55. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ HTTP->Kafka: Kafka-REST vs Reactive 55 0 10000 20000 30000 40000 50000 1 10 50 100 1000 2000 4000 6000 8000 Messages/second Concurrency Throughput(messages/second) Non-reac0ve Reac0ve 0 20 40 60 80 1 10 50 100 1000 2000 4000 6000 8000 CPU (%) Concurrency CPU usage Non-reac0ve Reac0ve Ø  Higher throughput, better CPU utilization with reactive
  56. 56. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ HTTP->Kafka: Kafka-REST vs Reactive 56 0 200 400 600 800 1 10 50 100 1000 2000 4000 6000 8000 Latency (ms) Concurrency Latency 50% (millis) Non-reac0ve Reac0ve Ø  Lower latency at high concurrency with reactive 0 200 400 600 800 1 10 50 100 1000 2000 4000 6000 8000 Latency (ms) Concurrency Latency 75% (millis) Non-reac0ve Reac0ve
  57. 57. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ ETL: MongoDB => Kafka => ElasticSearch 57 Kafka BrokerKafka BrokerKafka Broker Ø  ETL pipeline: Extract, Transform, Load Ø  Can be done with Kafka Connect Ø  Reactive API useful when pipeline includes multiple remote components
  58. 58. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ MongoDB => Kafka => ElasticSearch 58 kafkaConnector.createFlux() .doOnSubscribe(s -> mongoSource.findAll() // Extract from MongoDB source .flatMap(person -> kafkaConnector.store(person)) .subscribe() ) .map(k-> new PersonRecord(k).transform()) // Transform .window(sinkBufferSize, sinkBufferTimeout) .flatMap(list -> elasticSearchSink.put(list)); // Load into ElasticSearch
  59. 59. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Composition 59 mongoSource .findAll() // Extract .flatMap(person -> kafkaConnector.store(person)) // Store in Kafka .map(person -> personalDataSource.fetch(person)) // Remote operation .window(sinkBufferSize, sinkBufferTimeout) .flatMap(list -> elasticSearchSink.put(list)); // Load
  60. 60. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Summary •  Apache Kafka •  Distributed, highly available, scalable, high throughput, low-latency •  Vast ecosystem of clients and tools o  Works well without the need for back-pressure for Kafka interactions •  Reactive Streams •  Functional-style, Back-pressure •  Reactive Kafka •  Full functionality of non-reactive clients •  May see performance drop in some scenarios •  Reactive pipeline with Kafka as well as non-Kafka components o  Benefits from non-blocking back-pressure o  Better concurrency, more efficient use of resources, better CPU utilization 60 Reactive Kafka
  61. 61. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Kafka clients - revisited 61 Kafka BrokerKafka BrokerKafka Broker Kafka Connect REST proxy Ø  Large ecosystem of clients, tools, connectors Kafka Streams Schema registry C/C++ Consumer Go Consumer Java ConsumerJava Producer C/C++ Producer Go Producer ReactiveProducer ReactiveConsumer
  62. 62. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Want to find out more? •  Apache Kafka •  http://kafka.apache.org/documentation.html •  https://engineering.linkedin.com/distributed-systems/log-what-every- software-engineer-should-know-about-real-time-datas-unifying •  https://cwiki.apache.org/confluence/display/KAFKA/Clients •  Project Reactor •  https://projectreactor.io/docs/ •  Reactive Kafka •  https://github.com/reactor/reactor-kafka 62
  63. 63. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Learn More. Stay Connected. rsivaram@pivotal.io Related Sessions: Reactor 3.0, a JVM Foundation for Java 8 and Reactive Streams Designing, Implementing, and Using Reactive APIs From Imperative To Reactive Web Apps Spring for Apache Kafka @springcentral spring.io/blog @pivotal pivotal.io/blog @pivotalcf http://engineering.pivotal.io
  64. 64. Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Thank you for listening. @springcentral spring.io/blog @pivotal pivotal.io/blog @pivotalcf http://engineering.pivotal.io Questions?

×