SlideShare uma empresa Scribd logo
1 de 39
Baixar para ler offline
The Data Driven Network
Kapil Surlaker
Director of Engineering
Bridging Batch and Streaming Data
Integration with Gobblin
Shirshanka Das
Gobblin team
26th Apr, 2017
Big Data Meetup
github.com/linkedin/gobblin
@ApacheGobblin
gitter.im/gobblin
Data Integration: key requirements
Source, Sink
Diversity
Batch
+
Streaming
Data
Quality
So, we built
SFTP
JDBC
REST
Simplifying Data Integration
@LinkedIn
Hundreds of TB per day
Thousands of datasets
~30 different source systems
80%+ of data ingest
Open source @ github.com/linkedin/gobblin
Adopted by LinkedIn, Intel, Swisscom, Prezi, PayPal,
CERN, NerdWallet and many more…
Apache incubation under way
SFTP
Azure
StorageAzure
Storage
4
Other Open Source Systems in this Space
Sqoop, Flume, Falcon, Nifi, Kafka Connect
Flink, Spark, Samza, Apex
Similar in pieces, dissimilar in aggregate
Most are tied to a specific execution model (batch / stream)
Most are tied to a specific implementation, ecosystem
(Kafka, Hadoop etc)
: Under the Hood
5
6
Gobblin: The Logical Pipeline
7
WorkUnit
A logical unit of work, typically bounded but not necessary.
Kafka Topic: LoginEvent, Partition: 10, Offsets: 10-200
HDFS Folder: /data/Login, File: part-0.avro
Hive Dataset: Tracking.Login, date-partition=mm-dd-yy-hh
8
Source: A provider of WorkUnits
(typically a system like Kafka, HDFS etc.)
9
Task: A unit of execution that operates on a WorkUnit
Extracts records from the source, writes to the destination
Ends when WorkUnit is exhausted of records
(assigned to Thread in ThreadPool, Mapper in Map-Reduce etc.)
10
Extractor: A provider of records given a WorkUnit
Connects to Data Source
Deserializer of records
11
Converter: A 1:N mapper of input records to output records
Multiple converters can be chained
(e.g. Avro <-> JSON, Schema project, Encrypt)
12
Quality Checker: Can check if the quality of the output is
satisfactory
Row-level (e.g. time value check)
Task-level (e.g. audit check, schema compatibility)
13
Writer: Writes to the destination
Connection to the destination, Serializer of records
Sync / Async
e.g. FsWriter, KafkaWriter, CouchbaseWriter
14
Publisher: Finalizes / Commits the data
Used for destinations that support atomicity
(e.g. move tmp staging directory to final
output directory on HDFS)
15
Gobblin: The Logical Pipeline
16
State Store (HDFS, S3, MySQL, ZK, …)
Load config
previous watermarks
save watermarks
Gobblin: The Logical Pipeline
Stateful
^
: Pipeline Specification
17
Gobblin: Pipeline Specification
job.name=PullFromWikipedia	
job.group=Wikipedia	
job.description=A	getting	started	example	for	Gobblin	
source.class=gobblin.example.wikipedia.WikipediaSource	
source.page.titles=LinkedIn,Wikipedia:Sandbox	
source.revisions.cnt=5	
wikipedia.api.rooturl=https://en.wikipedia.org/w/api.php	
wikipedia.avro.schema={"namespace":	“example.wikipedia.avro”	
,…"null"]}]}	
gobblin.wikipediaSource.maxRevisionsPerPage=10	
converter.classes=gobblin.example.wikipedia.WikipediaConverter	
Pipeline Name, Description
Source
+ configuration
source.revisions.cnt=5	
wikipedia.api.rooturl=https://en.wikipedia.org/w/api.php	
wikipedia.avro.schema={"namespace":	“example.wikipedia.avro”	
,…"null"]}]}	
gobblin.wikipediaSource.maxRevisionsPerPage=10	
converter.classes=gobblin.example.wikipedia.WikipediaConverter	
extract.namespace=gobblin.example.wikipedia	
writer.destination.type=HDFS	
writer.output.format=AVRO	
writer.partitioner.class=gobblin.example.wikipedia.WikipediaPartitioner	
data.publisher.type=gobblin.publisher.BaseDataPublisher
Gobblin: Pipeline Specification
Converter
Writer
+ configuration
converter.classes=gobblin.example.wikipedia.WikipediaConverter	
extract.namespace=gobblin.example.wikipedia	
writer.destination.type=HDFS	
writer.output.format=AVRO	
writer.partitioner.class=gobblin.example.wikipedia.WikipediaPartitioner	
data.publisher.type=gobblin.publisher.BaseDataPublisher
Gobblin: Pipeline Specification
Publisher
Gobblin: Pipeline Deployment
Bare Metal / AWS / Azure / VM
Standalone:
Single Instance
Small Medium Large
AWS (EC2)
Hadoop (YARN / MR)
Standalone Cluster
Pipeline Specification
Static Cluster Elastic ClusterOne Box
One Spec
Multiple Environments
Execution Model: Batch versus Streaming
Batch
Determine work, Acquire slots, Run, Checkpoint, Repeat
+ Cost-efficient, deterministic, repeatable
- Higher latency
- Setup, Checkpoint costs dominate if “micro-batching”
Execution Model: Batch versus Streaming
Streaming
Determine work streams, Run continuously, Checkpoint periodically
+ Low latency
- Higher-cost because it is harder to provision
accurately
- More sophistication needed to deal with change
Batch
Execution Model Scorecard
Batch
Streaming
Streaming
Streaming
Streaming
Batch
Batch
JDBC <->HDFS Kafka ->HDFS
HDFS ->Kafka Kafka <->Kinesis
Can we run in both models
using the same system?
26
Gobblin: The Logical Pipeline
27
Batch
Determine work
Streaming
Determine work
- unbounded WorkUnit
Pipeline Stages: Start
28
Batch
Acquire slots, Run
Streaming
Run continuously
Checkpoint periodically
Shutdown gracefully
Pipeline Stages: Run
Watermark Manager
State Storage
notify ack
shutdown
29
Batch
Checkpoint, Commit
Streaming
Do nothing
- NoOpPublisher
Pipeline Stages: End
Enabling Streaming mode
task.executionMode = streaming
Standalone:
Single Instance
AWS
Hadoop (YARN / MR)
Standalone Cluster
A Streaming Pipeline Spec: Kafka 2 Kafka
# A sample pull file that copies an input Kafka topic and
# produces to an output Kafka topic with sampling
job.name=Kafka2KafkaStreaming
job.group=Kafka
job.description=This is a job that runs forever, copies an input Kafka
topic to an output Kafka topic
job.lock.enabled=false
source.class=gobblin.source….KafkaSimpleStreamingSource
Pipeline Name, Description
job.description=This is a job that runs forever, copies an input Kafka
topic to an output Kafka topic
job.lock.enabled=false
source.class=gobblin.source….KafkaSimpleStreamingSource
gobblin.streaming.kafka.topic.key.deserializer=org.apache.kafka.com
mon.serialization.StringDeserializer
gobblin.streaming.kafka.topic.value.deserializer=org.apache.kafka.co
mmon.serialization.ByteArrayDeserializer
gobblin.streaming.kafka.topic.singleton=test
kafka.brokers=localhost:9092
# Sample 10% of the records
Source, configuration
A Streaming Pipeline Spec: Kafka 2 Kafka
mmon.serialization.ByteArrayDeserializer
gobblin.streaming.kafka.topic.singleton=test
kafka.brokers=localhost:9092
# Sample 10% of the records
converter.classes=gobblin.converter.SamplingConverter
converter.sample.ratio=0.10
writer.builder.class=gobblin.kafka.writer.KafkaDataWriterBuilder
writer.kafka.topic=test_copied
writer.kafka.producerConfig.bootstrap.servers=localhost:9092
writer.kafka.producerConfig.value.serializer=org.apache.kafka.comm
on.serialization.ByteArraySerializer
A Streaming Pipeline Spec: Kafka 2 Kafka
Converter, configuration
# Sample 10% of the records
converter.classes=gobblin.converter.SamplingConverter
converter.sample.ratio=0.10
writer.builder.class=gobblin.kafka.writer.KafkaDataWriterBuilder
writer.kafka.topic=test_copied
writer.kafka.producerConfig.bootstrap.servers=localhost:9092
writer.kafka.producerConfig.value.serializer=org.apache.kafka.comm
on.serialization.ByteArraySerializer
data.publisher.type=gobblin.publisher.NoopPublisher
task.executionMode=STREAMING
A Streaming Pipeline Spec: Kafka 2 Kafka
Writer, configuration
Publisher
data.publisher.type=gobblin.publisher.NoopPublisher
task.executionMode=STREAMING
# Configure watermark storage for streaming
#streaming.watermarkStateStore.type=zk
#streaming.watermarkStateStore.config.state.store.zk.connectString=
localhost:2181
# Configure watermark commit settings for streaming
#streaming.watermark.commitIntervalMillis=2000
A Streaming Pipeline Spec: Kafka 2 Kafka
Execution Mode,
watermark storage configuration
Gobblin Streaming: Cluster view
Cluster of processes
Apache Helix:
work-unit assignment,
fault-tolerance,
reassignment Cluster
Master
Helix
Worker 1
Worker 2
Worker 3
Sink
(Kafka,
HDFS,
…)
Stream Source
Active Workstreams in Gobblin
Gobblin as a Service
Global orchestrator with REST API for submitting logical flow specifications
Logical flow specifications compile down to physical pipeline specs
Global Throttling
Throttling capability to ensure Gobblin respects quotas globally (e.g. api calls, network b/w,
Hadoop namenode etc.)
Generic: can be used outside Gobblin
Metadata driven
Integration with Metadata Service (c.f. WhereHows)
Policy driven replication, permissions, encryption etc.
Roadmap
Final LinkedIn Gobblin 0.10.0 release
Apache Incubator code donation and release
More Streaming runtimes
Integration with Apache Samza, LinkedIn Brooklin
GDPR Compliance: Data purge for Hadoop and other systems
Security improvements
Credential storage, Secure specs
39
Gobblin Team @ LinkedIn

Mais conteúdo relacionado

Mais procurados

How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
Vinay Kumar Chella
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance Tuning
Albert Chen
 
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
Insight Technology, Inc.
 

Mais procurados (20)

ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scal...
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Best Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWSBest Practices for Using Apache Spark on AWS
Best Practices for Using Apache Spark on AWS
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
 
mnesia脑裂问题综述
mnesia脑裂问题综述mnesia脑裂问题综述
mnesia脑裂问题综述
 
Apache Airflow overview
Apache Airflow overviewApache Airflow overview
Apache Airflow overview
 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxy
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance Tuning
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Snowflake SnowPro Certification Exam Cheat Sheet
Snowflake SnowPro Certification Exam Cheat SheetSnowflake SnowPro Certification Exam Cheat Sheet
Snowflake SnowPro Certification Exam Cheat Sheet
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
"It can always get worse!" – Lessons Learned in over 20 years working with Or...
"It can always get worse!" – Lessons Learned in over 20 years working with Or..."It can always get worse!" – Lessons Learned in over 20 years working with Or...
"It can always get worse!" – Lessons Learned in over 20 years working with Or...
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
C22 Oracle Database を監視しようぜ! by 山下正/内山義夫
 
Understanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIsUnderstanding Query Plans and Spark UIs
Understanding Query Plans and Spark UIs
 

Destaque

Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Shirshanka Das
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Shirshanka Das
 
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
Ontico
 
Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4
William Myers
 
Using Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and AnalyticsUsing Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and Analytics
Perficient, Inc.
 

Destaque (20)

Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Bigger Faster Easier: LinkedIn Hadoop Summit 2015
Bigger Faster Easier: LinkedIn Hadoop Summit 2015Bigger Faster Easier: LinkedIn Hadoop Summit 2015
Bigger Faster Easier: LinkedIn Hadoop Summit 2015
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Aksyon radyo
Aksyon radyoAksyon radyo
Aksyon radyo
 
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
Брокер сообщений Kafka в условиях повышенной нагрузки / Артём Выборнов (Rambl...
 
LinkedIn Segmentation & Targeting Platform: A Big Data Application
LinkedIn Segmentation & Targeting Platform: A Big Data ApplicationLinkedIn Segmentation & Targeting Platform: A Big Data Application
LinkedIn Segmentation & Targeting Platform: A Big Data Application
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 
Personal branding playbook
Personal branding playbookPersonal branding playbook
Personal branding playbook
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 
Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4
 
Using Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and AnalyticsUsing Big Data for Improved Healthcare Operations and Analytics
Using Big Data for Improved Healthcare Operations and Analytics
 
Unlocking the Experts
Unlocking the ExpertsUnlocking the Experts
Unlocking the Experts
 
Participatory Design: Bringing Users Into Your Process
Participatory Design: Bringing Users Into Your ProcessParticipatory Design: Bringing Users Into Your Process
Participatory Design: Bringing Users Into Your Process
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
 
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
 

Semelhante a Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetup @ LinkedIn Apr 2017

Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingStructured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Databricks
 
Spark Streaming Info
Spark Streaming InfoSpark Streaming Info
Spark Streaming Info
Doug Chang
 

Semelhante a Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetup @ LinkedIn Apr 2017 (20)

Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingStructured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?
 
What is apache Kafka?
What is apache Kafka?What is apache Kafka?
What is apache Kafka?
 
On-premise Spark as a Service with YARN
On-premise Spark as a Service with YARN On-premise Spark as a Service with YARN
On-premise Spark as a Service with YARN
 
Spark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim DowlingSpark Summit EU talk by Jim Dowling
Spark Summit EU talk by Jim Dowling
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Multi-tenant Flink as-a-service with Kafka on Hopsworks
Multi-tenant Flink as-a-service with Kafka on HopsworksMulti-tenant Flink as-a-service with Kafka on Hopsworks
Multi-tenant Flink as-a-service with Kafka on Hopsworks
 
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
Jim Dowling - Multi-tenant Flink-as-a-Service on YARN
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Always on. 2018-10 Reactive Summit
Always on. 2018-10 Reactive SummitAlways on. 2018-10 Reactive Summit
Always on. 2018-10 Reactive Summit
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Building Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza SeattleBuilding Scalable Data Pipelines - 2016 DataPalooza Seattle
Building Scalable Data Pipelines - 2016 DataPalooza Seattle
 
From Zero to Stream Processing
From Zero to Stream ProcessingFrom Zero to Stream Processing
From Zero to Stream Processing
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
 
Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!
 
Spark Streaming Info
Spark Streaming InfoSpark Streaming Info
Spark Streaming Info
 
Java/Scala Lab: Руслан Шевченко - Implementation of CSP (Communication Sequen...
Java/Scala Lab: Руслан Шевченко - Implementation of CSP (Communication Sequen...Java/Scala Lab: Руслан Шевченко - Implementation of CSP (Communication Sequen...
Java/Scala Lab: Руслан Шевченко - Implementation of CSP (Communication Sequen...
 
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
Tips for Apache Flink on Kafka with Olena Babenko | Kafka Summit London 2022
 

Último

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 

Último (20)

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 

Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetup @ LinkedIn Apr 2017