December 2010
Kappa Architecture
Our Experience
Who am I
CDO at ASPgems
Former President of Hispalinux (the Spanish
LUG)
Author of “La Pastilla Roja”, the first Spanish book
about free software.
Menu
A little context about Kappa Architecture
What’s Kappa Architecture
What is not Kappa Architecture
How we implement it
Real use cases with KA
A little context
On July 2, 2014, Jay Kreps coined the term
Kappa Architecture in an article for
O’Reilly Radar.
Who is Jay Kreps
Jay has been involved in lots of projects:
Author of the essay:
The Log: What every software engineer
should know about real-time data's
unifying abstraction (12/16/2013)
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
Jay Kreps
Author of the book: I ♥ Logs
Jay Kreps
Involved in projects such as:
Apache Kafka
Apache Samza
Voldemort
Azkaban
Ex-LinkedIn
Now co-founder and CEO of Confluent
Lambda Architecture
It looks something like this:
https://www.mapr.com/developercentral/lambda-architecture
Lambda Architecture
The batch layer provides the following
functionality:
managing the master dataset, an
immutable, append-only set of raw
data;
pre-computing arbitrary query
functions, called batch views.
https://www.mapr.com/developercentral/lambda-architecture
Lambda Architecture
Serving layer
This layer indexes the batch views so
that they can be queried ad hoc
with low latency.
Speed layer
This layer accommodates all requests
that are subject to low latency
requirements. Using fast and
incremental algorithms, the speed
layer deals with recent data only.
Lambda Architecture
Batch layer datasets can be stored in a distributed
filesystem, while MapReduce can be used to create
batch views that are fed to the serving layer.
The serving layer can be implemented using NoSQL
technologies such as HBase, Apache Druid, etc.
Querying can be implemented with technologies such as
Apache Drill or Impala.
The speed layer can be realized with data streaming
technologies such as Apache Storm or Spark Streaming.
https://www.mapr.com/developercentral/lambda-architecture
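The three layers described above can be sketched in a few lines of plain Python. This is only an illustration, not code from any of the frameworks named: the event shape, the `build_batch_view` and `query` helpers and the `SpeedLayer` class are hypothetical names, and in-memory counters stand in for the distributed stores.

```python
from collections import Counter

# Hypothetical master dataset: an immutable, append-only list of raw events.
master_dataset = [
    {"user": "ana", "ts": 1},
    {"user": "bob", "ts": 2},
    {"user": "ana", "ts": 3},
]


def build_batch_view(events):
    """Batch layer: recompute the view from scratch over all stored events."""
    return Counter(e["user"] for e in events)


class SpeedLayer:
    """Speed layer: incrementally counts only events that arrived after the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, event):
        self.realtime_view[event["user"]] += 1


def query(batch_view, realtime_view, user):
    """Serving side: merge the batch view and the real-time view to answer a query."""
    return batch_view[user] + realtime_view[user]


batch_view = build_batch_view(master_dataset)

speed = SpeedLayer()
speed.on_event({"user": "ana", "ts": 4})  # recent data not yet in a batch view
speed.on_event({"user": "cid", "ts": 5})

print(query(batch_view, speed.realtime_view, "ana"))
```

Note that the same counting logic is written twice, once in `build_batch_view` and once in `SpeedLayer.on_event`. In a real deployment those two copies live in different frameworks.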
Pros of Lambda
Architecture
It retains the input data unchanged.
It encourages modeling data transformations as a
series of data states derived from the original input.
Lambda Architecture takes into account the problem
of reprocessing data.
This happens all the time: the code will
change, and you will need to reprocess all the
information. There are many reasons for it, and you
will need to live with this.
Cons of Lambda
Architecture
Maintaining code that needs to produce the same
result on two complex distributed systems is
painful.
The code for MapReduce is very different from the
code for Storm or Apache Spark.
It is not only about different code; it is also about
debugging and interaction with other products
(Hive, Oozie, Cascading, etc.).
In the end, it is a problem of different and
diverging programming paradigms.
So what is Kappa
Architecture
Jay Kreps's proposal is quite simple:
Use Kafka (or another system) that lets you
retain the full log of the data you need to
reprocess.
When you want to reprocess, start a
second instance of your stream processing job
that starts processing from the beginning of
the retained data, but direct its output to
a new output table.
So what is Kappa
Architecture
part II
When the second job has caught up, switch the
application to read from the new table.
Stop the old version of the job, and delete the
old output table.
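The reprocessing steps above can be sketched as a small simulation. This is a minimal sketch under stated assumptions: a Python list stands in for the retained Kafka log, dictionaries stand in for output tables, and `job_v1`, `job_v2`, `run_job` and the table names are hypothetical. The rule change in `job_v2` is invented purely to show that the new table differs from the old one.

```python
# Hypothetical retained log: an ordered, replayable list standing in for a Kafka topic.
log = [("ana", 10), ("bob", 5), ("ana", 7)]


def job_v1(record, table):
    """Current version of the stream job: sum amounts per user."""
    user, amount = record
    table[user] = table.get(user, 0) + amount


def job_v2(record, table):
    """New code version (invented rule change): ignore amounts below 6."""
    user, amount = record
    if amount >= 6:
        table[user] = table.get(user, 0) + amount


def run_job(job, log, table, start_offset=0):
    """Process records from start_offset onward and return the next offset to read."""
    for record in log[start_offset:]:
        job(record, table)
    return len(log)


tables = {"output_v1": {}}
serving = "output_v1"  # the table the application currently reads
offset_v1 = run_job(job_v1, log, tables["output_v1"])

# Start a second instance from the beginning of the log, writing to a new table.
tables["output_v2"] = {}
offset_v2 = run_job(job_v2, log, tables["output_v2"], start_offset=0)

# New data keeps arriving while both instances are running.
log.append(("bob", 9))
offset_v1 = run_job(job_v1, log, tables["output_v1"], offset_v1)
offset_v2 = run_job(job_v2, log, tables["output_v2"], offset_v2)

# Once the second job has caught up, switch the application and drop the old table.
if offset_v2 == len(log):
    serving = "output_v2"
    del tables["output_v1"]

print(serving, tables[serving])
```

Keeping `output_v1` until the switch is what makes it possible to compare results, or to stay on the old table if the new version misbehaves.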
So what is Kappa
Architecture
This architecture looks something like this:
So what is Kappa
Architecture
The first benefit is that you only need to
reprocess when you change the code.
You can check whether the new version works correctly
and, if not, revert to the old output table.
You can mirror a Kafka topic to HDFS so you are
not limited by the Kafka retention configuration.
You have only one codebase to maintain, on a single
framework.
So what is Kappa
Architecture
The real advantage is not about efficiency at all
(for example, you will temporarily need extra storage
while reprocessing). It is about allowing your team
to develop, test, debug and operate their systems
on top of a single processing framework.
What is not Kappa
Architecture
It is not a silver bullet that solves every
Big Data problem.
It is not a prescribed list of technologies. You
can implement it with your favorite frameworks.
It is not a rigid set of rules, but it helps to keep
complex projects simple.
How we use Kappa
Architecture
We usually start working on projects with a complex
structure, like LinkedIn's looked at an early stage.
That is very common.
How we use Kappa
Architecture
We try to refactor the data flows to fit a
Kappa Architecture.
How we use Kappa
Architecture
We use Kafka as our stream data platform.
Instead of Samza, we feel more comfortable with
Spark Streaming.
At ASPgems we chose Apache Spark as our
analytics engine, and not only for Spark
Streaming.
How we use Kappa
Architecture
In the end, Kappa Architecture is a design pattern
for us.
We use/clone this pattern in almost all our projects.
We have projects of every size, data volume
and speed requirement, and they all fit the Kappa
Architecture.
Use Cases
Telefónica - MSS
We use KA to calculate near-real-time KPIs and
SLAs related to the managed security system.
We simplified the data flow of the input data.
Kafka is the streaming data platform.
As MPP we use Cassandra.
IoT - OBD II
One of our clients installs on-board devices in
its customers' cars.
We implemented an API to get all the information
in real time and inject it into Kafka.
The business rules are implemented in a CEP
running on Apache Spark Streaming.
As MPP we use Elasticsearch.
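To give a feel for what a CEP-style business rule over this kind of stream can look like, here is a minimal sketch in plain Python. It is not the rule engine used in the project: the event fields (`car_id`, `ts`, `rpm`), the thresholds and the `detect_sustained_high_rpm` function are all hypothetical, and a generator over a list stands in for a Spark Streaming job reading from Kafka.

```python
from collections import defaultdict, deque

WINDOW = 3        # hypothetical: number of consecutive readings to inspect
RPM_LIMIT = 4000  # hypothetical threshold


def detect_sustained_high_rpm(readings, window=WINDOW, limit=RPM_LIMIT):
    """Yield (car_id, ts) whenever a car's last `window` readings all exceed `limit`."""
    recent = defaultdict(lambda: deque(maxlen=window))  # per-car sliding window
    for r in readings:
        buf = recent[r["car_id"]]
        buf.append(r["rpm"])
        if len(buf) == window and all(v > limit for v in buf):
            yield (r["car_id"], r["ts"])


# Invented sample readings, interleaved across two cars.
readings = [
    {"car_id": "A", "ts": 1, "rpm": 4200},
    {"car_id": "B", "ts": 1, "rpm": 2000},
    {"car_id": "A", "ts": 2, "rpm": 4500},
    {"car_id": "A", "ts": 3, "rpm": 4100},
    {"car_id": "A", "ts": 4, "rpm": 3000},
]

alerts = list(detect_sustained_high_rpm(readings))
print(alerts)
```

The rule keeps only a small per-car window of state, which is the property that lets this kind of logic run continuously over an unbounded stream.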
Insurance Company
We implemented Kappa Architecture to process
clickstream data in real time and cluster users.
We show the content and offers that best fit each user.
Energy Facility
We implemented Kappa Architecture to process
and predict energy consumption.
Our customer's facility includes energy storage systems,
and we collect all the information about that
storage (ultracapacitors and batteries).
We process this information to calculate the
effective lifetime of the components and their
degradation.
Questions
Thank you
Juantomás García
juantomas@aspgems.com
@juantomas

Switzerland Constitution 2002.pdf.........EfruzAsilolu
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制vexqp
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制vexqp
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjurptikerjasaptiker
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss ConfederationEfruzAsilolu
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格q6pzkpark
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 

Último (20)

Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 

ASPgems - kappa architecture

  • 2. Who am I: CDO at ASPgems. Former President of Hispalinux (Spanish LUG). Author of “La Pastilla Roja”, the first Spanish book about Free Software.
  • 3. Menu: A little context about Kappa Architecture. What Kappa Architecture is. What Kappa Architecture is not. How we implement it. Real use cases with KA.
  • 4. A little context: On July 2, 2014, Jay Kreps coined the term Kappa Architecture in an article for O’Reilly Radar.
  • 5. Who is Jay Kreps: Jay has been involved in lots of projects. Author of the essay: The Log: What every software engineer should know about real-time data's unifying abstraction (12/16/2013) https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 6. Jay Kreps Author of the book: I ♥ Logs
  • 7. Jay Kreps: Involved with projects such as Apache Kafka, Apache Samza, Voldemort, and Azkaban. Ex-LinkedIn. Now co-founder and CEO of Confluent.
  • 8. Lambda Architecture: Looks something like this: https://www.mapr.com/developercentral/lambda-architecture
  • 9. Lambda Architecture: The batch layer provides the following functionality: managing the master dataset, an immutable, append-only set of raw data; and pre-computing arbitrary query functions, called batch views. https://www.mapr.com/developercentral/lambda-architecture
  • 10. Lambda Architecture: Serving layer: this layer indexes the batch views so that they can be queried ad hoc with low latency. Speed layer: this layer accommodates all requests that are subject to low latency requirements. Using fast and incremental algorithms, the speed layer deals with recent data only.
  • 11. Lambda Architecture: Batch layer datasets can be stored in a distributed filesystem, while MapReduce can be used to create batch views that are fed to the serving layer. The serving layer can be implemented using NoSQL technologies such as HBase, Apache Druid, etc. Querying can be implemented with technologies such as Apache Drill or Impala. The speed layer can be realized with data streaming technologies such as Apache Storm or Spark Streaming. https://www.mapr.com/developercentral/lambda-architecture
  • 12. Pros of Lambda Architecture: It retains the input data unchanged. Think of modeling data transformations as a series of data states derived from the original input. Lambda Architecture takes into account the problem of reprocessing data. This happens all the time: the code will change, and you will need to reprocess all the information. There are lots of reasons for this, and you will need to live with it.
  • 13. Cons of Lambda Architecture: Maintaining code that needs to produce the same result on two complex distributed systems is painful. The code for MapReduce is very different from the code for Storm/Apache Spark. It is not only about different code; it is also about debugging and interaction with other products (Hive, Oozie, Cascading, etc.). In the end, it is a problem of different and diverging programming paradigms.
  • 14. So what is Kappa Architecture: Jay Kreps's proposal is simple. Use Kafka (or another system) that lets you retain the full log of the data you need to reprocess. When you want to reprocess, start a second instance of your stream processing job that starts processing from the beginning of the retained data, but direct its output to a new output table.
  • 15. So what is Kappa Architecture part II When the second job has caught up, switch the application to read from the new table. Stop the old version of the job, and delete the old output table.
  • 16. So what is Kappa Architecture This architecture looks something like this:
  • 17. So what is Kappa Architecture: The first benefit is that you need to reprocess only when you change the code. You can check whether the new version is working correctly and, if not, revert to the old output table. You can mirror a Kafka topic to HDFS so you are not limited by the Kafka retention configuration. You have only one codebase to maintain, on a single framework.
  • 18. So what is Kappa Architecture: The real advantage is not about efficiency at all (for example, you will need extra temporary storage while reprocessing). It is about allowing your team to develop, test, debug and operate their systems on top of a single processing framework.
  • 19. What is not Kappa Architecture: It is not a silver bullet that solves every Big Data problem. It is not a prescribed list of technologies; you can implement it with your favorite frameworks. It is not a rigid set of rules, but it helps keep complex projects simple.
  • 20. How we use Kappa Architecture: We start working on projects with a complex structure, like LinkedIn's at an early stage. That’s very common.
  • 21. How we use Kappa Architecture
  • 22. How we use Kappa Architecture: We try to refactor the data flows to fit into a Kappa Architecture.
  • 23. How we use Kappa Architecture
  • 24. How we use Kappa Architecture: We use Kafka as our Stream Data Platform. Instead of Samza, we feel more comfortable with Spark Streaming. At ASPgems we chose Apache Spark as our analytics engine, and not only for Spark Streaming.
  • 25. How we use Kappa Architecture: In the end, Kappa Architecture is a design pattern for us. We use/clone this pattern in almost all our projects. We have projects of every size, data volume, and speed requirement, and they fit the Kappa Architecture.
  • 27. Telefónica - MSS: We use KA to calculate near-real-time KPIs and SLAs related to the managed security system. We simplify the flow of the input data. Kafka is the streaming data platform. As MPP we use Cassandra.
  • 28. IoT - OBD II: One of our clients installs on-board devices in its customers' cars. We implemented an API to get all the information in real time and inject it into Kafka. The business rules are implemented in a CEP running on Apache Spark Streaming. As MPP we use Elasticsearch.
  • 29. Insurance Company: We implemented Kappa Architecture to process the click stream in real time and cluster users. We show the content and offers that best fit each user.
  • 30. Energy Facility: We implemented Kappa Architecture to process and predict energy consumption. Our customer includes energy storage systems, and we get all the information about that storage (ultra-capacitors and batteries). We process this information to calculate the effective lifetime of the components and their degradation.
  • 32. December 2010. Thank you. Juantomás García juantomas@aspgems.com @juantomas
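
The reprocessing procedure from slides 14 and 15 can be sketched in a few lines. This is a minimal in-memory illustration, not how Kafka or Spark Streaming are actually driven: the retained log is a plain Python list, output tables are dictionaries, and every name here (`run_job`, `reprocess`, `sum_v1`, `sum_v2`, `kpi_v1`, `kpi_v2`) is hypothetical. Because the log in the sketch is finite, "caught up" simply means the second job has reached the end of the list.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, int]                  # hypothetical event shape: (key, value)
Table = Dict[str, int]                   # an output table keyed by event key
Logic = Callable[[Table, Event], None]   # one version of the job's code


def run_job(log: List[Event], logic: Logic) -> Table:
    """Run one job instance from the beginning of the retained log."""
    table: Table = {}
    for event in log:
        logic(table, event)
    return table


def sum_v1(table: Table, event: Event) -> None:
    """Old code version: sum every value per key."""
    key, value = event
    table[key] = table.get(key, 0) + value


def sum_v2(table: Table, event: Event) -> None:
    """New code version (assumed change): ignore negative values."""
    key, value = event
    if value >= 0:
        table[key] = table.get(key, 0) + value


def reprocess(log: List[Event], tables: Dict[str, Table],
              serving: Dict[str, str], new_name: str, new_logic: Logic) -> None:
    """Replay the log with new code, switch readers, drop the old table."""
    old_name = serving["active"]
    # Second job instance: starts at the beginning, writes to a NEW table.
    tables[new_name] = run_job(log, new_logic)
    # The new job has caught up: point the application at the new table.
    serving["active"] = new_name
    # Stop the old job and delete its output table.
    del tables[old_name]
```

A possible usage: build `tables = {"kpi_v1": run_job(log, sum_v1)}` with `serving = {"active": "kpi_v1"}`, then call `reprocess(log, tables, serving, "kpi_v2", sum_v2)`. Postponing the final `del` until the new table has been checked keeps the rollback option mentioned on slide 17.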