Here we'll talk about our experience with Kafka at Wirecard Brasil. We'll discuss a real system, its older architecture, how it evolved, and how we solved our problems using Apache Kafka.
Afterwards, let's go over what we learned: what we did well, what we did wrong and what problems we faced!
For full immersion, there's also a medium article where you can better understand everything!
https://medium.com/@caueferreira/a-real-showcase-of-kafka-at-wirecard-brazil-9b9c2055fcce
2. WHO AM I?
CAUÊ FERREIRA
▸ Senior Software Engineer at Wirecard Brasil for 7 years
▸ About 10 years of experience in development
▸ Loves to solve problems and architect new solutions
Medium, Github, Linkedin, Twitter: caueferreira
4. WHAT WE WILL NOT COVER DURING THIS PRESENTATION
WHAT THIS PRESENTATION IS NOT ABOUT
▸ Coding
▸ How to implement
▸ Tutorials
5. SO, WHAT WILL WE FIND IN THIS PRESENTATION?
WHAT IS HERE?
▸ The architecture of a real service
▸ The evolution of this architecture
▸ The final solution of the architecture
▸ Why we decided to move to Kafka Streams
▸ What challenges we faced during the development
▸ What problems we had while implementing it
▸ How we are today and what lies ahead
7. JUST A REGULAR HTTP CALL
https://sandbox.moip.com.br/v2/orders?q=caue.ferreira
&filters=status::in(PAID,WAITING)|paymentMethod::in(CREDIT_CARD,BOLETO)|
value::bt(5000,10000)&limit=3&offset=0
REPORTS-API
the status must be either PAID or WAITING
the paymentMethod should be either CREDIT_CARD or BOLETO
the value must be between 5000 and 10000
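The filter string above can be assembled programmatically. A minimal sketch, assuming the `in`/`bt` operators and the `::`/`|` separators shown in the example URL; the helper names here are hypothetical and not part of the real API:

```python
# Builds a Reports-API style filter string as seen in the example request.
# Helper names are illustrative, not part of the actual API client.

def in_filter(field, *values):
    """Clause in the style of status::in(PAID,WAITING)."""
    return f"{field}::in({','.join(values)})"

def bt_filter(field, low, high):
    """Clause in the style of value::bt(5000,10000) -- between."""
    return f"{field}::bt({low},{high})"

def build_query(q, filters, limit=3, offset=0):
    """Join clauses with '|' as in the example request."""
    return f"q={q}&filters={'|'.join(filters)}&limit={limit}&offset={offset}"

query = build_query(
    "caue.ferreira",
    [
        in_filter("status", "PAID", "WAITING"),
        in_filter("paymentMethod", "CREDIT_CARD", "BOLETO"),
        bt_filter("value", 5000, 10000),
    ],
)
print(query)
```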
9. OUR FIRST ARCHITECTURE
INVOICE’S SERVICE FIRST ARCHITECTURE
[Architecture diagram: Invoice-API writes to its database and publishes updates to RabbitMQ; Invoice-Sync listens to the queue, gets data from Order-API, and updates the invoice in the database; Reports-Sync retrieves data from the Invoice database and inserts documents into Elasticsearch.]
10. A CLASSIC ELASTIC SYNC PROBLEM
WHAT WENT WRONG?
▸ Reports-Sync was a single application responsible for syncing data from several other applications
▸ Reports-Sync retrieved data in chunks from Invoice's database, eventually leading to unsynchronised data
▸ Reports-Api/Sync lacked a real owner
▸ The business logic of building an invoice was scattered across more than one application
▸ Any downtime in Reports-Sync would affect the syncing of several applications
▸ Any downtime in Reports-Api would render other applications unable to retrieve invoices from Elasticsearch
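The chunked-sync race above can be illustrated with a toy simulation (all names hypothetical): if a row in a chunk that was already read is updated while later chunks are still being fetched, that update is missed until the next full pass.

```python
# Toy simulation of the chunk-based sync race: Reports-Sync reads the
# invoice table in fixed-size chunks while concurrent writers keep
# updating rows, so rows changed after their chunk was read stay stale.

def sync_pass(table, chunk_size, update_during_pass):
    """Read `table` (a list of (id, version) rows) in chunks, applying
    `update_during_pass` (id -> new version) between chunk reads as a
    stand-in for concurrent writes."""
    synced = {}
    for start in range(0, len(table), chunk_size):
        chunk = table[start:start + chunk_size]
        for row_id, version in chunk:
            synced[row_id] = version
        # concurrent writers update the table between chunk reads
        table = [(rid, update_during_pass.get(rid, v)) for rid, v in table]
    return synced

table = [("a", 1), ("b", 1), ("c", 1), ("d", 1)]
# while the pass runs, invoice "a" (already read) and "d" (not yet read) change
result = sync_pass(table, chunk_size=2, update_during_pass={"a": 2, "d": 2})
print(result)  # "a" stays stale at version 1; "d" picks up version 2
```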
12. OUR SOLUTION, BEFORE KAFKA
INVOICE’S SERVICE SECOND ARCHITECTURE
[Architecture diagram: Invoice-API writes to its database and publishes updates to RabbitMQ; Invoice-Sync listens to the queue, gets data from Order-API, updates the invoice in the database, and inserts documents into Invoice's own Elasticsearch.]
13. WHAT DID WE DO WELL?
WHAT WERE OUR HITS?
▸ Invoice now has ownership of every aspect of itself
▸ Invoice no longer depends on any Reports service
▸ The business logic of building an invoice is no longer scattered across multiple applications
▸ Invoice had a new elastic search cluster
15. OUR FIRST IDEA WAS TO IMPLEMENT A CDC PATTERN,
RETRIEVING THE DATA FROM THE DATABASE. TO DO SO WE
NEEDED TO CREATE A KAFKA CONNECT INSTANCE, AND WE
DECIDED TO USE DEBEZIUM AS OUR SOURCE CONNECTOR TO
RETRIEVE ALL DATA FROM THE DATABASE BINLOG AND
INSERT IT INTO THE KAFKA CONNECT TOPIC.
Just a phrase from the article
WHAT ARE WE AIMING FOR?
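To make the CDC idea concrete, this is roughly what registering a Debezium source connector with Kafka Connect's REST API looks like. All hostnames, credentials, and topic names below are hypothetical, and the exact property names vary across Debezium versions; this sketch follows the Debezium 1.x MySQL connector shape, which is not necessarily the connector or version used at Wirecard.

```json
{
  "name": "invoice-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "invoice-db",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "********",
    "database.server.id": "184054",
    "database.server.name": "invoice",
    "table.whitelist": "invoice.invoices",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.invoice"
  }
}
```

Posted to the Connect worker's `/connectors` endpoint, this reads row changes from the database binlog and produces them as events to a Kafka topic.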
16. OUR FINAL ARCHITECTURE USING KAFKA
INVOICE’S CURRENT ARCHITECTURE
[Architecture diagram: Kafka Connect (database connector) retrieves data from the database binlog file and produces events to the Kafka cluster (three Kafka brokers coordinated by ZooKeeper); Invoice-Sync retrieves data from the topic, gets the invoice by ID from Invoice-API, retrieves payment info from Order-API, and produces a transformed event back to the cluster; Kafka Connect (Elasticsearch connector) retrieves data from that topic and inserts documents into Elasticsearch.]
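The enrichment step Invoice-Sync performs between the two Kafka Connect stages can be sketched as a pure transform. All field names here (`payload.after`, `amount`, etc.) are illustrative; the real Debezium event and invoice schemas differ:

```python
# Sketch of the Invoice-Sync enrichment: a Debezium-style change event
# plus payment data fetched from Order-API becomes the document that is
# produced to the topic read by the Elasticsearch sink connector.

def to_document(change_event, payment_info):
    """Build the Elasticsearch-bound document from a change event,
    enriched with payment info retrieved from Order-API."""
    row = change_event["payload"]["after"]          # row state after the change
    return {
        "id": row["id"],
        "status": row["status"],
        "amount": row["amount"],
        "paymentMethod": payment_info["method"],    # enrichment from Order-API
    }

event = {"payload": {"after": {"id": "INV-1", "status": "PAID", "amount": 7500}}}
doc = to_document(event, {"method": "CREDIT_CARD"})
print(doc)
```

Keeping the transform pure like this makes it trivial to unit-test independently of Kafka.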
17. WHAT DID WE DO WELL?
WHAT WERE OUR HITS?
▸ We have a really stable and robust environment; about twenty projects are
already using Kafka Streams at Wirecard Brazil
▸ We considerably reduced the number of tickets opened because of
missing or out-of-date resources
▸ The Invoice API had a memory leak that every now and then would
shut down the application; not anymore
▸ While it is true that the Invoice API uses Kafka exclusively to produce
documents to its Elasticsearch, we are already producing invoice events,
so any application that wants to work with those resources can get
them without requesting data from the Invoice API, thus not overloading it.
19. WHAT DID WE LEARN?
WHAT HAPPENED?
▸ We started with a single Kafka Connect to rule all connectors across all applications;
it didn't take long for us to realize that we were back to the Reports-Api problem.
▸ Pay attention to Kafka variables
▸ Load retroactive data
▸ Avro and badly parsed resources
▸ A Kafka connect for each application
▸ Check your retention
▸ Kafka Manager
▸ Complexity and learning curve
21. WHERE ARE WE HEADING NOW?
WHAT IS NEXT?
▸ There are still several applications of ours that aren't using
Kafka
▸ We still have some applications using the Reports service
▸ Some applications are not using Avro
▸ Infrastructure as code, immutable
▸ We should update our Kafka version
▸ Try the new Cloud-Native Experience for Kafka
22. THERE IS ALSO A POST ON MEDIUM
LINK TO THE ARTICLE
You can find it at my Medium account,
https://medium.com/@caueferreira
23. WHERE CAN I FIND THIS PRESENTATION
LINK TO THE PRESENTATION
You can also find it at my LinkedIn account,
https://linkedin.com/in/caueferreira