Dive into Streams with Brooklin
Celia Kung
LinkedIn
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/linkedin-streams-brooklin/
Presented at QCon New York
www.qconnewyork.com
Outline
Background
Scenarios
Application Use Cases
Architecture
Current and Future
Background
Nearline Applications
• Require near real-time response
• Thousands of applications at LinkedIn
○ E.g. Live search indices, Notifications
Nearline Applications
• Require continuous, low-latency access to data
○ Data could be spread across multiple database systems
• Need an easy way to move data to applications
○ App devs should focus on event processing and not on
data access
Heterogeneous Data Systems
Espresso (LinkedIn’s document store)
Microsoft EventHubs
Building the Right Infrastructure
• Build separate, specialized
solutions to stream data from
and to each different system?
○ Slows down development
○ Hard to manage!
[Diagram labels: Microsoft EventHubs, ...; Streaming System A, B, C, D, ...; Nearline Applications]
Need a centralized, managed, and extensible service
to continuously deliver data in near real-time
Brooklin
Brooklin
• Streams are dynamically
provisioned and individually
configured
• Extensible: Plug-in support for
additional sources/destinations
• Streaming data pipeline service
• Propagates data from many source
types to many destination types
• Multitenant: Can run several
thousand streams simultaneously
Pluggable Sources & Destinations
[Diagram labels: Sources: Databases (Espresso), Messaging Systems (Kafka, EventHubs, Kinesis); Destinations: Kafka, EventHubs, Kinesis; Applications]
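The plug-in idea can be illustrated with a small sketch. The names `Source`, `Destination`, and `run_pipeline` are hypothetical and are not Brooklin's actual API; the in-memory classes stand in for real systems. The point is only that the pipeline depends on two narrow interfaces, so supporting a new system means adding one class.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class Source(ABC):
    """Hypothetical plug-in interface for reading events from a system."""

    @abstractmethod
    def read(self) -> Iterable[bytes]:
        ...


class Destination(ABC):
    """Hypothetical plug-in interface for writing events to a system."""

    @abstractmethod
    def write(self, event: bytes) -> None:
        ...


class InMemorySource(Source):
    """Stand-in for a real source such as a database change log."""

    def __init__(self, events: List[bytes]) -> None:
        self._events = list(events)

    def read(self) -> Iterable[bytes]:
        return iter(self._events)


class InMemoryDestination(Destination):
    """Stand-in for a real destination such as a messaging topic."""

    def __init__(self) -> None:
        self.received: List[bytes] = []

    def write(self, event: bytes) -> None:
        self.received.append(event)


def run_pipeline(source: Source, destination: Destination) -> int:
    """Copy every event from source to destination and return the count."""
    count = 0
    for event in source.read():
        destination.write(event)
        count += 1
    return count
```

`run_pipeline` never names a concrete system, so any pairing of source and destination classes can be combined without changing it.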
Scenarios
Scenario 1: Change Data Capture
Capturing Live Updates
1. Member updates her profile to reflect her recent job change
2. LinkedIn wants to inform her colleagues of this change
[Diagram, built up over several slides: the News Feed, Search Indices, Notifications, Standardization, and other services each query the Member DB for updates]
Change Data Capture (CDC)
• Brooklin can stream database updates to a change stream
• Data processing applications consume from change streams
• Isolation: Applications are decoupled from the sources and don’t compete for resources with online queries
• Applications can be at different points in change timelines
[Diagram: Member DB updates flow into a Messaging System, from which the News Feed, Notifications, Standardization, and Search Indices services consume]
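A minimal sketch of why a change stream lets applications sit at different points in the change timeline. The `ChangeStream` class is a hypothetical in-memory stand-in for the messaging system, not Brooklin or Kafka code: each consumer keeps its own read position, so a slow consumer neither blocks the others nor touches the source database.

```python
from typing import Dict, List


class ChangeStream:
    """Append-only log of change events with one read position per consumer."""

    def __init__(self) -> None:
        self._log: List[dict] = []
        self._offsets: Dict[str, int] = {}

    def append(self, change: dict) -> None:
        """Record one database change at the end of the log."""
        self._log.append(change)

    def poll(self, consumer: str, max_events: int = 10) -> List[dict]:
        """Return the next batch for this consumer and advance its position."""
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch
```

For example, a news feed consumer that polls two events at a time and a search indexer that polls everything at once both see the same changes in the same order, just at different times.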
Scenario 2: Streaming Bridge
Stream Data from X to Y
● Across…
○ cloud services
○ clusters
○ data centers
Streaming Bridge
• Data pipe to move data between different environments
• Enforce policy: Encryption, Obfuscation, Data formats
Mirroring Kafka Data
● Aggregating data from all data centers into a centralized place
● Moving data between LinkedIn and external cloud services (e.g. Azure)
● Brooklin has replaced Kafka MirrorMaker (KMM) at LinkedIn
○ Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation
Use Brooklin to Mirror Kafka Data
[Diagram labels: Sources and Destinations, each with Messaging systems (including Microsoft EventHubs) and Databases]
Kafka MirrorMaker Topology
[Diagram: Datacenters A, B, C, ..., each with tracking, metrics, aggregate tracking, and aggregate metrics clusters, connected by many separate KMM boxes]
Brooklin Kafka Mirroring Topology
[Diagram: Datacenters A, B, C, ..., each with tracking, metrics, aggregate tracking, and aggregate metrics clusters, with one Brooklin box shown per datacenter]
Brooklin Kafka Mirroring
● Optimized for stability and operability
● Manually pause and resume mirroring at every level
○ Entire pipeline, topic, topic-partition
● Can auto-pause partitions facing mirroring issues
○ Auto-resumes the partitions after a configurable duration
● Flow of messages from other partitions is unaffected
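The pause behaviour listed above can be sketched as a small bookkeeping class. `PartitionPauser` and its method names are hypothetical, not Brooklin's API, and the clock is passed in explicitly to keep the example deterministic. It tracks manual pauses and timed auto-pauses per topic-partition, so a paused partition never affects its neighbours.

```python
from typing import Dict, Set, Tuple

TopicPartition = Tuple[str, int]


class PartitionPauser:
    """Tracks manually paused and auto-paused topic-partitions."""

    def __init__(self, auto_resume_after_s: float) -> None:
        self._auto_resume_after_s = auto_resume_after_s
        self._manual: Set[TopicPartition] = set()
        # Maps an auto-paused partition to the time at which it resumes.
        self._auto: Dict[TopicPartition, float] = {}

    def pause(self, tp: TopicPartition) -> None:
        """Manual pause: stays in effect until resume() is called."""
        self._manual.add(tp)

    def resume(self, tp: TopicPartition) -> None:
        """Clear both manual and automatic pauses for this partition."""
        self._manual.discard(tp)
        self._auto.pop(tp, None)

    def auto_pause(self, tp: TopicPartition, now: float) -> None:
        """Pause a misbehaving partition for the configured duration."""
        self._auto[tp] = now + self._auto_resume_after_s

    def is_paused(self, tp: TopicPartition, now: float) -> bool:
        if tp in self._manual:
            return True
        resume_at = self._auto.get(tp)
        if resume_at is None:
            return False
        if now >= resume_at:
            # The configured duration has elapsed: auto-resume.
            del self._auto[tp]
            return False
        return True
```

A mirroring loop would call `is_paused` before fetching from each partition and `auto_pause` when a send to that partition keeps failing.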
Application Use Cases
• Security
• Cache
• Search Indices
• ETL or Data warehouse
• Materialized Views or Replication
• Repartitioning
• Adjunct Data
• Bridge
• Serde, Encryption, Policy
• Standardization, Notifications …
Architecture
Example: Stream updates made to Member Profile
[Diagram: Capturing Live Updates, with Member DB updates flowing to the News Feed Service]
● Scenario: Stream Espresso Member Profile updates into Kafka
○ Source Database: Espresso (Member DB, Profile table)
○ Destination: Kafka
○ Application: News Feed service
Datastream
• Describes the data pipeline
• Mapping between source and destination
• Holds the configuration for the pipeline
Name: MemberProfileChangeStream
Source: MemberDB/ProfileTable
  Type: Espresso
  Partitions: 8
Destination: ProfileTopic
  Type: Kafka
  Partitions: 8
Metadata:
  Application: News Feed service
  Owner: newsfeed@linkedin.com
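The datastream definition above maps naturally onto a small data structure. The sketch below uses the values from the slide; the class names, field names, and JSON layout are assumptions made for illustration and are not Brooklin's actual schema or wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class Endpoint:
    """One side of the pipeline: a source or a destination."""
    name: str
    type: str
    partitions: int


@dataclass
class Datastream:
    """Describes a pipeline: source-to-destination mapping plus metadata."""
    name: str
    source: Endpoint
    destination: Endpoint
    metadata: Dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses to plain dictionaries.
        return json.dumps(asdict(self), indent=2)


stream = Datastream(
    name="MemberProfileChangeStream",
    source=Endpoint("MemberDB/ProfileTable", "Espresso", 8),
    destination=Endpoint("ProfileTopic", "Kafka", 8),
    metadata={
        "application": "News Feed service",
        "owner": "newsfeed@linkedin.com",
    },
)
payload = stream.to_json()
```

A payload like this is the kind of body a client could send when creating a datastream over REST.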
1. Client makes REST call to create datastream (create POST /datastream)
2. Create request goes to any Brooklin instance
3. Datastream is written to ZooKeeper
4. Leader coordinator is notified of new datastream
5. Leader coordinator calculates work distribution
6. Leader coordinator writes the assignments to ZK
7. ZooKeeper is used to communicate the assignments
8. Coordinators hand task assignments to consumers
9. Consumers start streaming data from the source
10. Consumers propagate data to producers
11. Producers write data to the destination
12. App consumes from Kafka
13. Destinations can be shared by apps
[Diagram, repeated for each step: Brooklin Client, Load Balancer, ZooKeeper, and three Brooklin Instances, each with a Datastream Management Service (DMS), a Coordinator (one is the Leader), an Espresso Consumer, and a Kafka Producer; Member DB is the source and the News Feed service is the consuming application. Step 13 shows three consuming services.]
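The slides do not say how the leader coordinator calculates the work distribution. As one possible illustration, the sketch below spreads the partitions of a datastream across instances round-robin; the function name and the strategy are assumptions, not a description of Brooklin's assignment logic.

```python
from typing import Dict, List


def assign_round_robin(partitions: int, instances: List[str]) -> Dict[str, List[int]]:
    """Assign partition numbers 0..partitions-1 to instances in turn."""
    if not instances:
        raise ValueError("at least one instance is required")
    assignment: Dict[str, List[int]] = {instance: [] for instance in instances}
    for partition in range(partitions):
        owner = instances[partition % len(instances)]
        assignment[owner].append(partition)
    return assignment
```

With the 8-partition datastream from the example and three instances, two instances receive three partitions each and the third receives two. In the flow above, the leader would then write such a mapping to ZooKeeper for the other coordinators to read.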
Brooklin Architecture
[Diagram: ZooKeeper and several Brooklin Instances, each containing a Coordinator (one is the Leader), a Datastream Management Service (DMS), Consumers A and B, and Producers X, Y, and Z]
Current & Future
Current
Sources & Destinations
• Consumers:
○ Espresso
○ Oracle
○ Kafka
○ EventHubs
○ Kinesis
• Producers:
○ Kafka
○ EventHubs
Features
• APIs are standardized to support additional sources and destinations
• Multitenant: Can power thousands of datastreams across several source and destination types
• Guarantees: At-least-once delivery, order is maintained at partition level
• Kafka mirroring improvements: finer control of pipelines (pause/auto-pause partitions), improved latency with flushless-produce mode
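The delivery guarantee (at-least-once, ordered per partition) can be illustrated with a checkpointing sketch. `PartitionMirror` is a hypothetical class, not Brooklin code: the checkpoint advances only after a send returns successfully, so a failure leads to a resend rather than a skipped event, and events for one partition are always sent in order.

```python
from typing import Callable, List


class PartitionMirror:
    """Copies one partition's events in order, tracking a checkpoint."""

    def __init__(self, events: List[str]) -> None:
        self.events = events
        self.checkpoint = 0  # index of the next event to send

    def run(self, send: Callable[[str], None]) -> None:
        """Send events from the checkpoint onward; send() may raise."""
        while self.checkpoint < len(self.events):
            send(self.events[self.checkpoint])
            # Reached only if send() did not raise, so an event is never
            # skipped; it may be sent twice if the failure came after the
            # destination had already accepted it.
            self.checkpoint += 1
```

If a send fails after the destination has stored the event, calling `run` again resends that event from the unchanged checkpoint. The destination then holds a duplicate, but nothing is lost or reordered within the partition.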
Brooklin in Production
Brooklin streams with Espresso, Oracle, or EventHubs as the source
• 38B messages/day
• 2K+ datastreams
• 1K+ unique sources
• 200+ applications
Brooklin in Production
Brooklin streams mirroring Kafka data
• 2T+ messages/day
• 200+ datastreams
• 10K+ topics
Future
Sources & Destinations
• Consumers:
○ MySQL
○ Cosmos DB
○ Azure SQL
• Producers:
○ Azure Blob storage
○ Kinesis
○ Cosmos DB
○ Azure SQL
○ Couchbase
Open Source
• Plan to open source Brooklin in 2019 (soon!)
Optimizations
• Brooklin auto-scaling
• Passthrough compression
• Read optimizations: Read once, Write multiple
Thank you
Questions?
Celia Kung
Email: ckung@linkedin.com
LinkedIn: /in/celiakkung/
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
linkedin-streams-brooklin/

Mais conteúdo relacionado

Mais procurados

(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot ComponentsBIOVIA
 
Hia 1693-effective application-development_in_iib
Hia 1693-effective application-development_in_iibHia 1693-effective application-development_in_iib
Hia 1693-effective application-development_in_iibAndrew Coleman
 
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1Introduction to the Spark MLLib Toolkit in IBM Streams V4.1
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1lisanl
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand WSO2
 
Hia 1691-using iib-to_support_api_economy
Hia 1691-using iib-to_support_api_economyHia 1691-using iib-to_support_api_economy
Hia 1691-using iib-to_support_api_economyAndrew Coleman
 
Enterprise Integration Patterns
Enterprise Integration PatternsEnterprise Integration Patterns
Enterprise Integration PatternsSergey Podolsky
 
Lucid logistics case study
Lucid logistics case studyLucid logistics case study
Lucid logistics case studyVMware Tanzu
 
S2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchS2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchGunnar Hillert
 
Hia 1689-techinical introduction-to_iib
Hia 1689-techinical introduction-to_iibHia 1689-techinical introduction-to_iib
Hia 1689-techinical introduction-to_iibAndrew Coleman
 
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...Chocolatey Software
 

Mais procurados (12)

(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components(ATS3-DEV05) Coding up Pipeline Pilot Components
(ATS3-DEV05) Coding up Pipeline Pilot Components
 
Hia 1693-effective application-development_in_iib
Hia 1693-effective application-development_in_iibHia 1693-effective application-development_in_iib
Hia 1693-effective application-development_in_iib
 
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1Introduction to the Spark MLLib Toolkit in IBM Streams V4.1
Introduction to the Spark MLLib Toolkit in IBM Streams V4.1
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand
 
Hia 1691-using iib-to_support_api_economy
Hia 1691-using iib-to_support_api_economyHia 1691-using iib-to_support_api_economy
Hia 1691-using iib-to_support_api_economy
 
Enterprise Integration Patterns
Enterprise Integration PatternsEnterprise Integration Patterns
Enterprise Integration Patterns
 
Lucid logistics case study
Lucid logistics case studyLucid logistics case study
Lucid logistics case study
 
S2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring BatchS2GX 2012 - Introduction to Spring Integration and Spring Batch
S2GX 2012 - Introduction to Spring Integration and Spring Batch
 
Hia 1689-techinical introduction-to_iib
Hia 1689-techinical introduction-to_iibHia 1689-techinical introduction-to_iib
Hia 1689-techinical introduction-to_iib
 
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
Facilitating continuous delivery in a FinTech world with Salt, Jenkins, Nexus...
 
Mule real-world-old
Mule real-world-oldMule real-world-old
Mule real-world-old
 
Spring integration
Spring integrationSpring integration
Spring integration
 

Semelhante a A Dive into Streams @LinkedIn with Brooklin

Dive into Streams with Brooklin
Dive into Streams with BrooklinDive into Streams with Brooklin
Dive into Streams with BrooklinCelia Kung
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...confluent
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...HostedbyConfluent
 
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn confluent
 
Immutable Infrastructure: Rise of the Machine Images
Immutable Infrastructure: Rise of the Machine ImagesImmutable Infrastructure: Rise of the Machine Images
Immutable Infrastructure: Rise of the Machine ImagesC4Media
 
Design Microservice Architectures the Right Way
Design Microservice Architectures the Right WayDesign Microservice Architectures the Right Way
Design Microservice Architectures the Right WayC4Media
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Societyconfluent
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineSunil Nagaraj
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk
 
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedInMore Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedInCelia Kung
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaGuido Schmutz
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Nitin Kumar
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Replyconfluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters
 
What's New in IBM Streams V4.2
What's New in IBM Streams V4.2What's New in IBM Streams V4.2
What's New in IBM Streams V4.2lisanl
 
A Microservices Journey - Susanne Kaiser
A Microservices Journey - Susanne KaiserA Microservices Journey - Susanne Kaiser
A Microservices Journey - Susanne KaiserThoughtworks
 

Semelhante a A Dive into Streams @LinkedIn with Brooklin (20)

Dive into Streams with Brooklin
Dive into Streams with BrooklinDive into Streams with Brooklin
Dive into Streams with Brooklin
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
 
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
 
Immutable Infrastructure: Rise of the Machine Images
Immutable Infrastructure: Rise of the Machine ImagesImmutable Infrastructure: Rise of the Machine Images
Immutable Infrastructure: Rise of the Machine Images
 
Design Microservice Architectures the Right Way
Design Microservice Architectures the Right WayDesign Microservice Architectures the Right Way
Design Microservice Architectures the Right Way
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Society
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture Pipeline
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search Dojo
 
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedInMore Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Building Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache KafkaBuilding Event-Driven (Micro)Services with Apache Kafka
Building Event-Driven (Micro)Services with Apache Kafka
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
 
Confluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with ReplyConfluent Partner Tech Talk with Reply
Confluent Partner Tech Talk with Reply
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 Keynote
 
What's New in IBM Streams V4.2
What's New in IBM Streams V4.2What's New in IBM Streams V4.2
What's New in IBM Streams V4.2
 
A Microservices Journey - Susanne Kaiser
A Microservices Journey - Susanne KaiserA Microservices Journey - Susanne Kaiser
A Microservices Journey - Susanne Kaiser
 

Mais de C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

A Dive into Streams @LinkedIn with Brooklin

  • 1. Dive into Streams with Brooklin Celia Kung LinkedIn
  • 6. Nearline Applications • Require near real-time response • Thousands of applications at LinkedIn ○ E.g. Live search indices, Notifications
  • 7. Nearline Applications • Require continuous, low-latency access to data ○ Data could be spread across multiple database systems • Need an easy way to move data to applications ○ App devs should focus on event processing and not on data access
  • 9. Building the Right Infrastructure • Build separate, specialized solutions to stream data from and to each different system? ○ Slows down development ○ Hard to manage! Microsoft EventHubs ... Streaming System C Streaming System D Streaming System A Streaming System B ... Nearline Applications
  • 10. Need a centralized, managed, and extensible service to continuously deliver data in near real-time
  • 12. Brooklin • Streams are dynamically provisioned and individually configured • Extensible: Plug-in support for additional sources/destinations • Streaming data pipeline service • Propagates data from many source types to many destination types • Multitenant: Can run several thousand streams simultaneously
  • 16. Capturing Live Updates 1. Member updates her profile to reflect her recent job change
  • 17. Capturing Live Updates 2. LinkedIn wants to inform her colleagues of this change
  • 18. Capturing Live Updates Member DB Updates News Feed Service query
  • 19. Capturing Live Updates Member DB Updates News Feed Service Search Indices Service query query
  • 20. Capturing Live Updates Member DB Updates News Feed Service query Notifications Service Standardization Service ... query query query Search Indices Service
  • 23. Change Data Capture (CDC) • Brooklin can stream database updates to a change stream • Data processing applications consume from change streams • Isolation: Applications are decoupled from the sources and don’t compete for resources with online queries • Applications can be at different points in change timelines
  • 24. Change Data Capture (CDC) [Diagram: Member DB updates flow into a messaging system, which is consumed by the News Feed Service, Notifications Service, Standardization Service, and Search Indices Service]
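The isolation and "different points in change timelines" properties of slide 23 can be illustrated with a minimal sketch. It assumes an in-memory list as a stand-in for the change stream; `ChangeStream` and `Consumer` are hypothetical names for illustration, not Brooklin or Kafka APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeStream:
    """Append-only log of change events (hypothetical stand-in for a messaging topic)."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)


@dataclass
class Consumer:
    """Reads the stream from its own offset, independently of other consumers."""
    name: str
    offset: int = 0

    def poll(self, stream: ChangeStream, max_events: int) -> list:
        batch = stream.events[self.offset:self.offset + max_events]
        self.offset += len(batch)
        return batch


# Database updates are captured once into the change stream.
stream = ChangeStream()
for member_id in (1, 2, 3):
    stream.append({"table": "Profile", "member_id": member_id, "op": "UPDATE"})

# Each application consumes at its own pace without querying the source database.
news_feed = Consumer("news-feed")
search = Consumer("search-indices")
news_feed.poll(stream, max_events=3)  # fully caught up
search.poll(stream, max_events=1)     # still behind, and that is fine
```

Because each consumer tracks only its own offset, a slow application never holds back a fast one, and none of them adds query load to the source.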
  • 26. Stream Data from X to Y ● Across… ○ cloud services ○ clusters ○ data centers
  • 27. Streaming Bridge • Data pipe to move data between different environments • Enforce policy: Encryption, Obfuscation, Data formats
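As a rough illustration of enforcing a policy while data crosses the bridge, the sketch below obfuscates selected fields of an event before it is forwarded. The function name, the event shape, and the choice of an unsalted SHA-256 digest are all assumptions made for the example; the slides do not say how Brooklin implements obfuscation.

```python
import hashlib


def obfuscate(event: dict, sensitive_fields: set) -> dict:
    """Return a copy of the event with sensitive fields replaced by a digest.

    Hypothetical policy step: an unsalted hash is used only to keep the
    example short and is not a recommendation for real data protection.
    """
    out = dict(event)
    for name in sensitive_fields:
        if name in out:
            raw = str(out[name]).encode("utf-8")
            out[name] = hashlib.sha256(raw).hexdigest()
    return out


original = {"member_id": 42, "email": "member@example.com"}
forwarded = obfuscate(original, {"email"})
```

The source event is left untouched, so the same record can be forwarded to several destinations under different policies.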
  • 28. Mirroring Kafka Data ● Aggregating data from all data centers into a centralized place ● Moving data between LinkedIn and external cloud services (e.g. Azure) ● Brooklin has replaced Kafka MirrorMaker (KMM) at LinkedIn ○ Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation
  • 29. Use Brooklin to Mirror Kafka Data [Diagram: Sources (messaging systems, Microsoft EventHubs, databases) and Destinations (messaging systems, Microsoft EventHubs, databases)]
  • 30. Kafka MirrorMaker Topology [Diagram: Datacenters A, B, C, ..., each containing tracking, aggregate tracking, metrics, and aggregate metrics, connected by many separate KMM instances]
  • 31. Brooklin Kafka Mirroring Topology [Diagram: Datacenters A, B, C, ..., each containing tracking, aggregate tracking, metrics, and aggregate metrics, with a Brooklin box per datacenter]
  • 32. Brooklin Kafka Mirroring ● Optimized for stability and operability ● Manually pause and resume mirroring at every level ○ Entire pipeline, topic, topic-partition ● Can auto-pause partitions facing mirroring issues ○ Auto-resumes the partitions after a configurable duration ● Flow of messages from other partitions is unaffected
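The auto-pause behavior on slide 32 can be sketched as a small state tracker keyed by topic-partition. This is a simplified model under stated assumptions: `PartitionMirror`, its method names, and the explicit `now` argument are hypothetical and are not taken from Brooklin's code.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionMirror:
    """Tracks pause state per topic-partition (hypothetical helper, not Brooklin code)."""
    auto_resume_after_s: float
    paused_until: dict = field(default_factory=dict)  # (topic, partition) -> resume time

    def report_error(self, topic: str, partition: int, now: float) -> None:
        # Auto-pause only the partition that is facing mirroring issues.
        self.paused_until[(topic, partition)] = now + self.auto_resume_after_s

    def is_active(self, topic: str, partition: int, now: float) -> bool:
        resume_at = self.paused_until.get((topic, partition))
        if resume_at is None:
            return True  # never paused: messages keep flowing
        if now >= resume_at:
            # The configured pause duration has elapsed: auto-resume.
            del self.paused_until[(topic, partition)]
            return True
        return False


mirror = PartitionMirror(auto_resume_after_s=60.0)
mirror.report_error("tracking", 0, now=0.0)
```

With this state, `mirror.is_active("tracking", 0, now=30.0)` is false while `mirror.is_active("tracking", 1, now=30.0)` stays true, which mirrors the point that other partitions are unaffected.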
  • 36. Application Use Cases • Security • Cache • Search Indices • ETL or Data warehouse
  • 37. Application Use Cases • Security • Cache • Search Indices • ETL or Data warehouse • Materialized Views or Replication
  • 38. Application Use Cases • Security • Cache • Search Indices • ETL or Data warehouse • Materialized Views or Replication • Repartitioning
  • 41. Application Use Cases • Bridge • Serde, Encryption, Policy • Adjunct Data
  • 42. Application Use Cases • Bridge • Serde, Encryption, Policy • Standardization, Notifications … • Adjunct Data
  • 44. Example: Stream updates made to Member Profile
  • 45. Capturing Live Updates Member DB Updates News Feed Service
  • 46. Example ● Scenario: Stream Espresso Member Profile updates into Kafka ○ Source Database: Espresso (Member DB, Profile table) ○ Destination: Kafka ○ Application: News Feed service
  • 47. Datastream • Describes the data pipeline • Mapping between source and destination • Holds the configuration for the pipeline. Example datastream: Name: MemberProfileChangeStream; Source: MemberDB/ProfileTable (Type: Espresso, Partitions: 8); Destination: ProfileTopic (Type: Kafka, Partitions: 8); Metadata: Application: News Feed service, Owner: newsfeed@linkedin.com
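The datastream on slide 47 can be modeled as a small record and serialized for a create request. The field names and JSON shape below are assumptions chosen to match the slide; they are not Brooklin's actual REST schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Endpoint:
    """One side of the pipeline (hypothetical shape based on the slide)."""
    name: str
    type: str
    partitions: int


@dataclass
class Datastream:
    """Describes the pipeline: source-to-destination mapping plus configuration."""
    name: str
    source: Endpoint
    destination: Endpoint
    metadata: dict


datastream = Datastream(
    name="MemberProfileChangeStream",
    source=Endpoint("MemberDB/ProfileTable", "Espresso", 8),
    destination=Endpoint("ProfileTopic", "Kafka", 8),
    metadata={"application": "News Feed service", "owner": "newsfeed@linkedin.com"},
)

# A body like this could accompany the POST /datastream call shown on slide 48.
payload = json.dumps(asdict(datastream), indent=2)
```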
  • 48. 1. Client makes REST call to create datastream (create: POST /datastream) [Diagram, repeated on slides 48 to 60: Brooklin Client, Load Balancer, ZooKeeper, Member DB, News Feed service, and three Brooklin instances, each with a Datastream Management Service (DMS), an Espresso Consumer, a Coordinator (one is the Leader), and a Kafka Producer]
  • 49. 2. Create request goes to any Brooklin instance
  • 50. 3. Datastream is written to ZooKeeper
  • 51. 4. Leader coordinator is notified of new datastream
  • 52. 5. Leader coordinator calculates work distribution
  • 53. 6. Leader coordinator writes the assignments to ZK
  • 54. 7. ZooKeeper is used to communicate the assignments
  • 55. 8. Coordinators hand task assignments to consumers
  • 56. 9. Consumers start streaming data from the source
  • 57. 10. Consumers propagate data to producers
  • 58. 11. Producers write data to the destination
  • 59. 12. App consumes from Kafka
  • 60. 13. Destinations can be shared by apps [Diagram shows three News Feed service boxes]
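Step 5 of the walkthrough, where the leader coordinator calculates the work distribution, can be sketched as follows. The slides do not say which strategy Brooklin uses, so round-robin is an assumption here, and `assign_tasks` and the instance names are hypothetical.

```python
from collections import defaultdict


def assign_tasks(partitions: int, instances: list) -> dict:
    """Spread source partitions across instances in round-robin order.

    Assumed strategy for illustration only; the real coordinator may
    distribute work differently.
    """
    if not instances:
        raise ValueError("need at least one Brooklin instance")
    assignments = defaultdict(list)
    for partition in range(partitions):
        owner = instances[partition % len(instances)]
        assignments[owner].append(partition)
    return dict(assignments)


# Eight source partitions (as in the example datastream) over three instances.
assignments = assign_tasks(8, ["instance-1", "instance-2", "instance-3"])
```

In the flow above, the leader would then write a mapping like this to ZooKeeper (step 6) so that each instance's coordinator can pick up its own share (steps 7 and 8).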
  • 61. Brooklin Architecture [Diagram: ZooKeeper and three Brooklin instances; each instance has a Datastream Management Service (DMS), a Coordinator (one is the Leader), consumers (ConsumerA, ConsumerB), and producers (ProducerX, ProducerY, ProducerZ)]
  • 63. Current. Sources & Destinations: • Consumers: ○ Espresso ○ Oracle ○ Kafka ○ EventHubs ○ Kinesis • Producers: ○ Kafka ○ EventHubs. Features: • APIs are standardized to support additional sources and destinations • Multitenant: Can power thousands of datastreams across several source and destination types • Guarantees: At-least-once delivery, order is maintained at partition level • Kafka mirroring improvements: finer control of pipelines (pause/auto-pause partitions), improved latency with flushless-produce mode
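The two guarantees on slide 63, at-least-once delivery and per-partition ordering, can be illustrated with the common pattern of committing a checkpoint only after a successful write. This is a generic sketch for a single partition, not Brooklin's implementation; `mirror` and `crash_before_commit_at` are hypothetical names used to simulate a failure.

```python
def mirror(source: list, destination: list, checkpoint: int,
           crash_before_commit_at=None) -> int:
    """Copy source[checkpoint:] to destination and return the new checkpoint.

    The checkpoint advances only after the write succeeds, so a crash between
    the write and the commit causes a re-send (a duplicate) but never a loss.
    """
    for offset in range(checkpoint, len(source)):
        destination.append(source[offset])   # write to the destination first
        if offset == crash_before_commit_at:
            return checkpoint                # simulated crash: commit is skipped
        checkpoint = offset + 1              # commit only after a successful write
    return checkpoint


source = ["e0", "e1", "e2"]
destination = []
cp = mirror(source, destination, checkpoint=0, crash_before_commit_at=1)
cp = mirror(source, destination, checkpoint=cp)  # restart from the last checkpoint
```

After the restart the destination holds `e1` twice but nothing is missing and the relative order is unchanged, which is why consumers of an at-least-once stream are usually written to tolerate duplicates.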
  • 64. Brooklin in Production: Brooklin streams with Espresso, Oracle, or EventHubs as the source • 38B messages/day • 2K+ datastreams • 1K+ unique sources • 200+ applications
  • 66. Future. Sources & Destinations: • Consumers: ○ MySQL ○ Cosmos DB ○ Azure SQL • Producers: ○ Azure Blob storage ○ Kinesis ○ Cosmos DB ○ Azure SQL ○ Couchbase. Open Source: • Plan to open source Brooklin in 2019 (soon!). Optimizations: • Brooklin auto-scaling • Passthrough compression • Read optimizations: Read once, Write multiple