You have built an event-driven system leveraging Apache Kafka. Now you face the challenge of integrating traditional synchronous request-response capabilities, such as user interaction, through an HTTP web service.
There are various techniques, each with advantages and disadvantages. This talk discusses multiple options on how to do a request-response over Kafka — showcasing producers and consumers using single and multiple topics, and more advanced considerations using the interactive queries of ksqlDB and Kafka Streams.
Advanced considerations discussed:
What a consumer rebalance means to your active request-responses.
What are the options for blocking on the asynchronous response in the web service?
How can the CQRS (Command Query Responsibility Segregation) be leveraged with the interactive state stores of Kafka Streams and ksqlDB?
Interactive queries against ksqlDB and Kafka Streams state stores are not available during a rebalance. What active Kafka development will make interactive queries a more feasible option?
Would a custom state store help with rebalancing limitations?
Can custom partitioning be used for proper routing, and what impacts could that have to the other services in your ecosystem?
We will explore the above considerations with an interactive quiz application built using Apache Kafka, Kafka Streams, and ksqlDB. With a proper implementation in place, your request-response application can scale and perform well while still answering every request.
2. Messaging
[Diagram labels: Producer, Consumer, Command/Response]
Request (Command) Driven
• Something that should happen
• Tell others what to do
• Presumption of a response
• Ask questions of others
Event Driven
• Something that has happened
• Tell others what you did
• No presumption of a response
• Others determine what they do
4. Quizzer - Streams Application
[Diagram labels: API, Database, Submit, Next, Result, Quiz, Users, Aggregate (KTable), Questions, Difficulty, Global KTable, KTable, KTable, Start, Status]
5. Quizzer - Streams Application
• https://events.confluent.io/meetups
• "Building a Web Application with Kafka as your
Database", March 24th, 2020
• "Interactive Kafka Streams", May 21st, 2020
12. The Legacy App: Blocking
Legacy application, expects a 200/OK response
[Diagram labels: Web Application, 200 OK, 200 OK]
• How do you block in the web application?
• How do you ensure that the web application instance that publishes to Kafka is the one that consumes the reply from the response topic?
14. Blocking Options
• Techniques to block for a response message in a JVM application.
• Countdown Latch
• Deferred Result (Spring MVC)
15. Blocking - Countdown Latch
• Algorithm
• Publish to Kafka
• Block on Latch
• Release Latch (Consumer from Kafka in separate thread)
16. Blocking - Countdown Latch
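// Blocks the request thread until the consumer thread delivers the response or the timeout expires.
Object waitForResponse(String requestId) throws InterruptedException {
    CountDownLatch l = new CountDownLatch(1);
    // the response may have arrived first and registered its own latch
    CountDownLatch racer = latchByRequestId.putIfAbsent(requestId, l);
    if (racer != null) l = racer;
    try {
        // block
        boolean success = l.await(timeoutMs, TimeUnit.MILLISECONDS);
        if (success) {
            // remove response from shared map
            return responseByRequestId.remove(requestId);
        } else {
            throw new CorrelationException("Timeout: " + requestId);
        }
    } finally {
        // always drop the latch so the shared map does not grow without bound
        latchByRequestId.remove(requestId);
    }
}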
17. Blocking - Countdown Latch
// Called from the Kafka consumer thread for each response message.
void addResponse(String id, Object response) {
    CountDownLatch l = new CountDownLatch(1);
    CountDownLatch r = latchByRequestId.putIfAbsent(id, l);
    if (r != null) l = r; // usually: the request thread registered its latch first
    // make response available for blocking thread
    responseByRequestId.put(id, response);
    l.countDown(); // unblock
}
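The two methods on slides 16 and 17 can be combined into one small class to see the hand-off between the request thread and the consumer thread. The following is a minimal sketch under stated assumptions: the class name `LatchCorrelator` is hypothetical, the timeout is passed as a parameter, and `java.util.concurrent.TimeoutException` stands in for the `CorrelationException` used in the slides. No Kafka client is involved; a plain thread plays the role of the consumer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical holder class; the slides only show the two methods.
class LatchCorrelator {
    private final ConcurrentMap<String, CountDownLatch> latchByRequestId = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Object> responseByRequestId = new ConcurrentHashMap<>();

    // Called by the request thread after it has published to Kafka.
    Object waitForResponse(String requestId, long timeoutMs)
            throws InterruptedException, TimeoutException {
        // Whichever side arrives first creates the latch; the other side reuses it.
        CountDownLatch latch =
                latchByRequestId.computeIfAbsent(requestId, id -> new CountDownLatch(1));
        try {
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("Timeout: " + requestId);
            }
            return responseByRequestId.remove(requestId);
        } finally {
            // Drop the latch on success and on timeout so the map stays small.
            latchByRequestId.remove(requestId);
        }
    }

    // Called by the consumer thread for every response it reads.
    // Note: a response that arrives after the timeout is still stored here,
    // so a real implementation would also need an eviction policy.
    void addResponse(String requestId, Object response) {
        CountDownLatch latch =
                latchByRequestId.computeIfAbsent(requestId, id -> new CountDownLatch(1));
        responseByRequestId.put(requestId, response); // publish before releasing the latch
        latch.countDown();
    }

    public static void main(String[] args) throws Exception {
        LatchCorrelator correlator = new LatchCorrelator();
        // Stand-in for the Kafka consumer thread.
        new Thread(() -> correlator.addResponse("req-1", "answer")).start();
        System.out.println(correlator.waitForResponse("req-1", 1000));
    }
}
```

Storing the response before calling `countDown()` is what guarantees the woken request thread finds it in the map.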
18. Blocking - Countdown Latch
• Pros
• Standard Java code (since 1.5)
• Can be used anywhere
• Cons
• Blocks request thread
• Limits incoming requests (Servlet Container)
• Increases resource consumption
19. Blocking - Deferred Result
• Offloads to secondary thread
• Less coding
• Specific to Spring MVC
• CompletableFuture interface supported
• Other Web frameworks have this too
20. Blocking - Deferred Result
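// bounded cache of pending requests, keyed by request id
Cache<String, DeferredResult<ResponseEntity<JsonNode>>> cache =
    CacheBuilder.newBuilder().maximumSize(1000).build();
DeferredResult<ResponseEntity<JsonNode>> waitForResponse(String requestId) {
    // 5000 ms timeout for the deferred result
    DeferredResult<ResponseEntity<JsonNode>> deferredResult = new DeferredResult<>(5000L);
    cache.put(requestId, deferredResult);
    return deferredResult; // no actual waiting here, Spring does that
}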
21. Blocking - Deferred Result
// Called from the Kafka consumer thread for each response message.
void addResponse(String requestId, JsonNode resp) {
    DeferredResult<ResponseEntity<JsonNode>> result = cache.getIfPresent(requestId);
    if (result != null) {
        ResponseEntity<JsonNode> content = new ResponseEntity<>(resp, OK);
        result.setResult(content); // unblocks response
        cache.invalidate(requestId);
    }
}
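Slide 19 notes that a CompletableFuture can be used in place of a DeferredResult. The same correlation can be sketched with only `java.util.concurrent`, which may help when the web framework is not Spring MVC. This is an illustrative sketch, not the talk's code: the class `FutureCorrelator` and its method names are hypothetical, and `orTimeout` assumes Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

// Hypothetical class: pending requests are futures instead of latches.
class FutureCorrelator<T> {
    private final ConcurrentMap<String, CompletableFuture<T>> pending = new ConcurrentHashMap<>();

    // Called by the request thread; returns immediately with a future
    // that the web framework (or the caller) can wait on.
    CompletableFuture<T> register(String requestId, long timeoutMs) {
        CompletableFuture<T> future = new CompletableFuture<>();
        pending.put(requestId, future);
        // orTimeout fails the future with a TimeoutException if no response arrives.
        // The callback removes the entry however the future ends.
        future.orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
              .whenComplete((value, error) -> pending.remove(requestId, future));
        return future;
    }

    // Called by the consumer thread; returns false when this instance
    // is not waiting on the given request id.
    boolean complete(String requestId, T response) {
        CompletableFuture<T> future = pending.remove(requestId);
        return future != null && future.complete(response);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

No request thread is parked while the response is outstanding, which is the main difference from the latch approach.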
25. Consuming - 1 Topic & Assignment
• every web application assigns itself to all partitions
• request-id in Kafka Header
• response topic must have header (automatic in Kafka Streams)
• key free for other uses, doesn't have to be the request-id
• all web applications get all messages
• discard messages where request-id doesn't exist
• don't deserialize key/value before checking header
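The last three bullets describe a filter: look at the request-id header first, and only pay for deserialization when the message belongs to this instance. A minimal sketch of that idea follows. It deliberately avoids the Kafka client so it runs on its own: the `Map<String, byte[]>` stands in for a record's headers, `byte[]` stands in for a value consumed without deserialization, and the class name `HeaderFirstFilter` is hypothetical. The header name `request-id` is taken from the slide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical filter: checks the header before touching the payload.
class HeaderFirstFilter<T> {
    private final Set<String> pendingRequestIds = ConcurrentHashMap.newKeySet();
    private final Function<byte[], T> valueDeserializer;

    HeaderFirstFilter(Function<byte[], T> valueDeserializer) {
        this.valueDeserializer = valueDeserializer;
    }

    // Called when this web application publishes a request.
    void expect(String requestId) {
        pendingRequestIds.add(requestId);
    }

    // headers: header name -> raw bytes (stand-in for a consumed record's headers).
    // Returns the deserialized value only if this instance is waiting for it.
    Optional<T> accept(Map<String, byte[]> headers, byte[] rawValue) {
        byte[] idBytes = headers.get("request-id");
        if (idBytes == null) {
            return Optional.empty(); // no correlation header: not a response for us
        }
        String requestId = new String(idBytes, StandardCharsets.UTF_8);
        if (!pendingRequestIds.remove(requestId)) {
            return Optional.empty(); // another instance's request: discard cheaply
        }
        return Optional.of(valueDeserializer.apply(rawValue)); // deserialize only now
    }
}
```

With a real consumer, the equivalent would be to consume keys and values as byte arrays and read the header from each record before deserializing anything.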
26. Consuming - 1 Topic & Assignment
• Pros
• Can spin up additional web applications without creating topics
• Not limited to the number of partitions
• Correlation ID (request-id) does not have to be the key.
• No pause with a consumer group rebalancing.
27. Consuming - 1 Topic & Assignment
• Cons
• Every web application has to consume every message
• Have to check and deserialize request-id header
29. Consuming - Topic / Web App
• Every web application gets its own topic
• additional header, resp-topic.
• Streaming application responds to the topic defined in
the header
• TopicNameExtractor (access to headers)
.to((k, v, context) ->
bytesToString(context.headers().lastHeader("resp-topic").value()));
30. Consuming - Topic / Web App
• Pros
• Only consume messages you produced
• No pause from a consumer group rebalancing
• no additional burden or assumption on
use of key.
31. Consuming - Topic / Web App
• Cons
• More work on the streaming application to respond to the proper topic.
• Must create a topic for every web application instance
• Responses are spread across multiple topics
33. Consuming - 1 Topic & Subscribe
• consumer.subscribe(List.of("response-topic"), rebalListener)
• considerations
• is the topic key based on data from the incoming request?
• how sophisticated is your Load Balancer?
34. Consuming - 1 Topic & Subscribe
• Topic key is a known value (quiz_id vs request_id)
• route to all, "Not Me"
• Topic key is not a known value (request_id)
• round-robin route to a web service, which checks the hash before using a generated key.
• or have the LB generate the request-id and perform the hash before routing (LB needs more info)
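The "check hash before using generated key" option can be sketched as a retry loop: generate candidate request IDs until one maps to a partition this instance currently owns. The code below is a sketch under stated assumptions. The class `OwnedKeyGenerator` and its parameters are hypothetical, and the partition function is passed in by the caller because it has to match whatever partitioner writes the response topic. The `hashCode`-based function in the usage note is only a stand-in and is not Kafka's default partitioner.

```java
import java.util.Set;
import java.util.UUID;
import java.util.function.ToIntFunction;

// Hypothetical helper: picks a request-id whose partition is assigned to this instance.
class OwnedKeyGenerator {
    static String nextOwnedRequestId(Set<Integer> assignedPartitions,
                                     ToIntFunction<String> partitionOf,
                                     int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String candidate = UUID.randomUUID().toString();
            // Keep the candidate only if its response would land on a partition we consume.
            if (assignedPartitions.contains(partitionOf.applyAsInt(candidate))) {
                return candidate;
            }
        }
        // Happens when nothing is assigned, for example in the middle of a rebalance.
        throw new IllegalStateException(
                "No owned request-id found in " + maxAttempts + " attempts");
    }
}
```

A call might look like `OwnedKeyGenerator.nextOwnedRequestId(Set.of(2, 5), k -> Math.floorMod(k.hashCode(), 6), 1000)`. The assignment can change between generating the ID and receiving the response, so this narrows the rebalance problem rather than removing it.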
36. Consuming - 1 Topic & Subscribe
• Pros
• Leverages most common Consumer Group Pattern
• No burden on streaming applications
• KIP-429: Kafka Consumer Incremental Rebalance Protocol
• Only a single consumer processes the message
37. Consuming - 1 Topic & Subscribe
• Cons
• More coordination depending on topic key
• Responses paused during a rebalancing
• Partitions move between consumers on a rebalance
• Key and partitioning concerns are minimized when used with CQRS.
39. Command Query Responsibility Segregation: Querying
• No need to block in the web application.
• No need to route the request back to a specific instance
• Requires fully accessible state
[Diagram labels: Web Application, 200 OK, 202 Accepted]
40. Command Query Responsibility Segregation: Querying
• No need to block in the web application.
• Route the request back to the same instance
• State / Web Application State
[Diagram labels: Web Application, 202 Accepted, 200 OK]
42. Leveraging HTTP Redirects
• 303 See Other
• Client will redirect and convert to GET
• Unfortunately, browsers handle the Location header themselves, so AJAX solutions require additional work.
• CORS and allowed headers
• Building your own rules requires a specific API contract
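The "redirect and convert to GET" behavior can be observed with the JDK's built-in HTTP server and client. This is a sketch under stated assumptions: the paths `/submit` and `/result/req-1` and the class name `SeeOtherSketch` are hypothetical, and the result handler simply echoes the method and path where a real service would query a state store. It assumes Java 11 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SeeOtherSketch {
    // Starts a throwaway server on a free local port.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        // Command side: accept the POST, then point the client at the query URL.
        server.createContext("/submit", exchange -> {
            exchange.getRequestBody().readAllBytes(); // drain the request body
            exchange.getResponseHeaders().add("Location", "/result/req-1");
            exchange.sendResponseHeaders(303, -1); // 303 See Other, no body
            exchange.close();
        });
        // Query side: echoes the method and path it was called with.
        server.createContext("/result/", exchange -> {
            String text = exchange.getRequestMethod() + " " + exchange.getRequestURI().getPath();
            byte[] body = text.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Sends a POST and lets the client follow the redirect.
    static String postAndFollow(int port) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/submit"))
                .POST(HttpRequest.BodyPublishers.ofString("{\"answer\":\"a\"}"))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Calling `postAndFollow(server.getAddress().getPort())` returns the body produced by the result handler, showing that the second request arrived as a GET on the redirect target.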
44. State Stores
• Global State Store
• Doesn't matter which Web Service handles the query
• examples
• microservice (web service doesn't need to know)
• ksqlDB (while it might shard the data, it is queried as a single unit)
• any key=value datastore (Cassandra, Mongo, MemCache)
45. State Stores
• Embedded Shard State Store
• Need to route/reroute query to the correct Web Service
• Leverage Load Balancer
• Inter web-service communication (as in Kafka Streams Metadata API)
• Kafka Streams Examples / Microservices / OrdersService.java / fetchFromOtherHost
https://github.com/confluentinc/kafka-streams-examples/blob/master/src/main/java/io/confluent/examples/streams/microservices/OrdersService.java
46. Kafka Streams State Stores
• 1 Topic & subscribe() streams consumer within Web Service
• KIPs
• KIP-429 (Kafka 2.4): Kafka Consumer Incremental Rebalance Protocol
• Allow consumer.poll() to return data in the middle of rebalance (https://issues.apache.org/jira/browse/KAFKA-8421) (Kafka 2.5)
• KIP-535 (Kafka 2.5): Allow state stores to serve stale reads during rebalance
• KIP-562 (Kafka 2.5): Allow fetching a key from a single partition rather than iterating over all the stores on an instance
• KIP-441 (Expected, Kafka 2.6): Smooth Scaling Out for Kafka Streams
47. Kafka Streams State Stores
• Things to consider
• Minimize duration of data being stored
• Isolate (minimize) topologies, reduce session.timeout.ms
• standby replicas
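The last two considerations map to configuration properties. The following sketch builds the settings as plain `java.util.Properties` so it runs without the Kafka libraries. The application id, bootstrap address, and the specific values are assumptions chosen for illustration, not recommendations from the talk. Only the property names `num.standby.replicas` and `session.timeout.ms` correspond to the bullets above.

```java
import java.util.Properties;

// Hypothetical settings holder for the Streams application that serves queries.
class QuizzerStreamsSettings {
    static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "quizzer-query");     // assumed name
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        // Keep a warm copy of each state store on another instance so a
        // failover does not have to rebuild the store from the changelog.
        props.put("num.standby.replicas", "1");
        // Detect a dead instance sooner so its partitions are reassigned faster.
        // Illustrative value; it must stay within the limits the brokers allow.
        props.put("session.timeout.ms", "6000");
        return props;
    }
}
```

A shorter session timeout trades faster failover for a higher chance of spurious rebalances when an instance is merely slow, so the value is worth tuning per deployment.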
48. ksqlDB State Store
• leverage client for table queries
• Table must be created by a ksqlDB operation
• latest_by_offset() function works well for this
• want state stores to be self-cleaning
• leverage windowing
• ksqlDB state store queries handle all windowed stores
49. ksqlDB State Store
create stream QUIZ_NEXT with (KAFKA_TOPIC='quiz_next', VALUE_FORMAT='avro');
create table KSQL_QUIZ_NEXT as
select request_id,
latest_by_offset(quiz_id) as quiz_id,
latest_by_offset(user_id) as user_id,
latest_by_offset(question_id) as question_id,
latest_by_offset(statement) as statement,
latest_by_offset(a) as a,
latest_by_offset(b) as b,
latest_by_offset(c) as c,
latest_by_offset(d) as d,
latest_by_offset(difficulty) as difficulty
from QUIZ_NEXT
window tumbling (size 30 seconds)
group by request_id;
50. ksqlDB State Store
create stream QUIZ_RESULT with (KAFKA_TOPIC='quiz_result', VALUE_FORMAT='avro');
create table KSQL_QUIZ_RESULT as
select request_id,
latest_by_offset(quiz_id) as quiz_id,
latest_by_offset(user_id) as user_id,
latest_by_offset(user_name) as user_name,
latest_by_offset(questions) as questions,
latest_by_offset(correct) as correct
from QUIZ_RESULT
window tumbling (size 30 seconds)
group by request_id;
51. ksqlDB Queries
BatchedQueryResult result =
    client.executeQuery(
        "SELECT * FROM KSQL_QUIZ_NEXT where " +
        "REQUEST_ID='" + requestId + "';");
List<Row> list = result.get();
// the table is windowed, so several rows may come back; this code uses the last one
int last = list.size() - 1;
map.put("quiz_id", list.get(last).getString("QUIZ_ID"));
...
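Because the query on slide 51 is built by string concatenation, the request ID should be validated before it reaches the SQL text. A minimal sketch follows, assuming request IDs are UUIDs (the talk does not say what format they use). The class `PullQueryBuilder` is hypothetical; the table and column names are the ones from the slides.

```java
import java.util.UUID;

// Hypothetical helper that only lets well-formed request ids into the query text.
class PullQueryBuilder {
    static String quizNextQuery(String requestId) {
        // Assumption: request ids are UUIDs. UUID.fromString throws
        // IllegalArgumentException for anything else, including input with quotes.
        String safeId = UUID.fromString(requestId).toString();
        return "SELECT * FROM KSQL_QUIZ_NEXT WHERE REQUEST_ID='" + safeId + "';";
    }
}
```

If request IDs are not UUIDs, the same idea applies with a stricter check of your own, such as an allow-list of characters.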
52. Final Thoughts
• If using consumer groups, design with rebalancing in mind.
• Explore options with your Load Balancer.
• CQRS w/ Kafka Streams as your State Store
• Kafka 2.5+
• Minimize Topology Complexity
• Minimize changelog data by leveraging windowing or proper tombstoning.
53. Resources
• Book: Designing Event-Driven Systems
• https://www.confluent.io/wp-content/uploads/confluent-designing-event-driven-systems.pdf
• Source Code
• http://github.com/nbuesing/quizzer