BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF
HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH
Architecture of Big Data Solutions (Architektur von Big Data Lösungen)
Guido Schmutz (guido.schmutz@trivadis.com)
@gschmutz
Guido Schmutz
Working for Trivadis for more than 20 years
Oracle ACE Director for Fusion Middleware and SOA
Co-author of several books
Consultant, Trainer, Software Architect for Java, SOA & Big Data / Fast Data
Member of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
Agenda
1. Introduction
2. Big Data Reference Architectures
• Traditional Big Data
• Event / Stream-Processing
• Lambda Architecture
• Kappa Architecture
• Unified Architecture
• Microservices Architecture
3. Big Data Ecosystem – many choices sorted!
Introduction
Big Data Definition (4 Vs)
+ Time to action? Big Data + Real-Time = Stream Processing
Characteristics of Big Data: its Volume, Velocity and Variety in combination
Reliable Data Ingestion in Big Data/IoT
How to do Big Data? Why is structure / architecture important?
Why talk about Big Data Architectures?
Choosing the right architecture is key for any (big data) project
Big Data is still a rather young field and therefore a “moving target”
No standard architectures are available that have been in use for years
In recent years, some architectures and best practices have evolved
Know your use cases before choosing your architecture / technologies
Having a reference architecture in place helps in choosing the right/matching technologies
Important Properties for Choosing a (Big) Data Architecture
Latency
Keep raw and un-interpreted data “forever”?
Volume, Velocity, Variety, Veracity
Ad-hoc query capabilities needed?
Robustness & Fault Tolerance
Scalability
…
Big Data Reference Architectures -
Traditional Big Data
“Traditional Architecture” for Big Data
[Diagram. Data Sources: Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine. Data Ingestion: Pushing Ingestion, Pulling Ingestion, Channel. (Analytical) Data Processing: Raw Data (Reservoir), Batch compute, Computed Information, Result Store, Query Engine. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
“Traditional Architecture” for Big Data – Hadoop Technology Mapping
[Diagram: the building blocks of the traditional architecture (Data Sources, Data Ingestion, (Analytical) Data Processing, Data Consumer) annotated with Hadoop technologies. Only the block labels survived extraction; the technology names are not recoverable.]
“Traditional Architecture” for Big Data – Spark Technology Mapping
[Diagram: the building blocks of the traditional architecture (Data Sources, Data Ingestion, (Analytical) Data Processing, Data Consumer) annotated with Spark technologies. Only the block labels survived extraction; the technology names are not recoverable.]
“Traditional Architecture” for Big Data – Feeding in High-Volume Event Streams
[Diagram: the building blocks of the traditional architecture (Data Sources, Data Ingestion, (Analytical) Data Processing, Data Consumer), with two question marks added to the diagram for the high-volume event streams.]
Traditional Architecture for Big Data
• Batch Processing - “Data at Rest”
• Not for low latency use cases
• Responses are delivered “after the fact”
• Maximum value of the identified situation is lost
• Decisions are made on old and stale data
• Spark Core is a faster alternative to Hadoop MapReduce, but it is still batch processing
• The Spark ecosystem offers many additional advanced analytics capabilities (machine learning, graph processing, …)
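To illustrate batch processing over data at rest, here is a minimal Python sketch. The event records, the field names and the `batch_compute` function are hypothetical and not taken from the slides; in practice this loop would typically be a Hadoop MapReduce or Spark job over a distributed file system.

```python
from collections import defaultdict

# Hypothetical raw data reservoir: events that were stored earlier ("data at rest").
RAW_EVENTS = [
    {"sensor": "s1", "value": 20.0},
    {"sensor": "s2", "value": 30.0},
    {"sensor": "s1", "value": 22.0},
]


def batch_compute(raw_events):
    """Scan all stored events and return the average value per sensor."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for event in raw_events:
        sums[event["sensor"]] += event["value"]
        counts[event["sensor"]] += 1
    return {sensor: sums[sensor] / counts[sensor] for sensor in sums}


# The result store is only as fresh as the last batch run.
result_store = batch_compute(RAW_EVENTS)
print(result_store)
```

Any event that arrives after the run is invisible to consumers until the next run, which is why this style does not suit low-latency use cases.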
Big Data Reference Architectures –
Event/Stream Processing
Event / Stream Processing – “Data in Motion”
“Data in motion”
Events are analyzed and processed in real time as they arrive
Decisions are timely, contextual and based on fresh data
Decision latency is eliminated
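As a rough illustration of processing data in motion, the following Python sketch handles each event the moment it arrives and keeps only a small running state. The class name `ThresholdAlerter`, the threshold and the sample values are assumptions made for this example, not part of the slides.

```python
class ThresholdAlerter:
    """Handles one event at a time and keeps only a small running state."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.mean = 0.0

    def on_event(self, value):
        """Update the running mean and return True if this event needs an alert."""
        self.count += 1
        # Incremental mean: no need to store or rescan earlier events.
        self.mean += (value - self.mean) / self.count
        return value > self.threshold


alerter = ThresholdAlerter(threshold=100.0)
alerts = [alerter.on_event(v) for v in (90.0, 110.0, 100.0)]
print(alerts, alerter.mean)
```

The decision for each event is available as soon as `on_event` returns, without waiting for a batch run.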
Event / Stream Processing Architecture
[Diagram. Data Sources: Content, Logfiles, Social, RDBMS, ERP, Sensor, Machine. Data Ingestion: Channel. (Analytical) Real-Time Data Processing: Stream/Event Processing, Messaging, Result Store. Other labels: Batch compute. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
Challenges for Ingesting Data
Multitude of sensors
Real-Time Streaming
Multiple Firmware versions
Bad Data from damaged sensors
Regulatory Constraints
Data Quality
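Two of these challenges, multiple firmware versions and bad data from damaged sensors, are often handled by a normalize-and-validate step at ingestion time. The sketch below is only one possible approach; the firmware field layouts, the temperature range and all function names are hypothetical.

```python
def normalize(reading):
    """Map readings from two hypothetical firmware versions onto one schema."""
    if reading.get("fw") == 1:
        return {"sensor": reading["id"], "temp_c": reading["temp"]}
    if reading.get("fw") == 2:
        return {"sensor": reading["sensor_id"], "temp_c": reading["temperature_c"]}
    raise ValueError("unknown firmware version")


def is_plausible(record, low=-50.0, high=150.0):
    """Reject values outside an assumed physical range (e.g. a damaged sensor)."""
    return low <= record["temp_c"] <= high


def ingest(readings):
    """Return (accepted, rejected) so that bad data is set aside, not silently lost."""
    accepted, rejected = [], []
    for reading in readings:
        try:
            record = normalize(reading)
        except (KeyError, ValueError):
            rejected.append(reading)
            continue
        if is_plausible(record):
            accepted.append(record)
        else:
            rejected.append(reading)
    return accepted, rejected


accepted, rejected = ingest([
    {"fw": 1, "id": "a", "temp": 21.5},
    {"fw": 2, "sensor_id": "b", "temperature_c": 9999.0},
    {"fw": 3, "id": "c", "temp": 20.0},
])
print(accepted, rejected)
```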
Continuous Data Ingestion
[Diagram. Sources: DB Source, File Source, IoT Sensor, Log, Social. Gateways and connectors: IoT GW, CDC GW, Dataflow GW, Message GW, Connect, CDC, Dataflow, REST, Native, Queue. Event Hub: Topics. Downstream: Stream Processing, Big Data.]
Event / Stream Processing Architecture – Open Source Technology Mapping
[Diagram: the event / stream processing building blocks (Data Sources, Data Ingestion with Channel, (Analytical) Real-Time Data Processing with Stream/Event Processing, Messaging and Result Store, Data Consumer) annotated with open source technologies. Only the block labels survived extraction; the technology names are not recoverable.]
Event / Stream Processing Architecture – Oracle Technology Mapping
[Diagram: the event / stream processing building blocks (Data Sources, Data Ingestion with Channel, (Analytical) Real-Time Data Processing with Stream/Event Processing, Messaging and Result Store, Data Consumer) annotated with Oracle technologies. Only the block labels survived extraction; the technology names are not recoverable.]
Event / Stream Processing Architecture
The solution for low-latency use cases
Process each event separately => low latency
Process events in micro-batches => increases latency but offers better reliability
Previously known as “Complex Event Processing”
Keep the data moving / Data in Motion instead of Data at Rest => raw events are not stored
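The difference between the two processing styles can be sketched in a few lines of Python. This is a simplified illustration: the function names are made up, and the micro-batches are cut by a fixed event count here, whereas real engines usually cut them by a time interval.

```python
def process_per_event(events, handle):
    """Hand every event to the handler as soon as it is seen."""
    for event in events:
        handle([event])


def process_micro_batches(events, batch_size, handle):
    """Buffer events and hand them to the handler in small groups."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            handle(batch)
            batch = []
    if batch:  # flush the incomplete last batch
        handle(batch)


per_event_calls, micro_batch_calls = [], []
process_per_event(range(5), per_event_calls.append)
process_micro_batches(range(5), 2, micro_batch_calls.append)
print(per_event_calls)
print(micro_batch_calls)
```

In the micro-batch variant the first event waits until its batch is full before anything is handled, which is where the extra latency comes from.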
Event / Stream Processing Architecture - Keep raw event data
[Diagram: the event / stream processing building blocks (Data Sources, Data Ingestion with Channel, (Analytical) Real-Time Data Processing with Stream/Event Processing, Messaging and Result Store, Data Consumer), extended with an (Analytical) Batch Data Processing area that holds the Raw Data (Reservoir). Other labels: Batch compute. Legend: Data in Motion, Data at Rest.]
Big Data Reference Architectures -
Lambda Architecture for Big Data
“Lambda Architecture” for Big Data
[Diagram. Data Sources: Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine. Data Ingestion: Channel, Pulling Ingestion. (Analytical) Batch Data Processing: Raw Data (Reservoir), Batch compute, Computed Information, Result Store, Query Engine. (Analytical) Real-Time Data Processing: Stream/Event Processing, Messaging, Result Store. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
Lambda Architecture for Big Data
Combines (Big) Data at Rest with (Fast) Data in Motion
Closes the latency gap left by high-latency batch processing
Keeps the raw information forever
Makes it possible to rerun analytics operations on the whole data set if necessary
=> because the old run had an error or
=> because we have found a better algorithm we want to apply
Functionality has to be implemented twice
• Once for batch
• Once for real-time streaming
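The points above can be sketched as follows. This is a minimal Python illustration, not the author's implementation: the page-view events and the names `batch_layer`, `speed_layer` and `query` are assumptions (they follow the usual Lambda vocabulary, which the slides do not spell out). Note how the counting logic appears twice, once per path.

```python
from collections import Counter


def batch_layer(master_dataset):
    """Recompute the batch view from all raw events kept in the reservoir."""
    return Counter(event["page"] for event in master_dataset)


def speed_layer(recent_events):
    """The same logic again, for events the batch view does not cover yet."""
    view = Counter()
    for event in recent_events:
        view[event["page"]] += 1
    return view


def query(page, batch_view, realtime_view):
    """Answer a query by merging the batch result with the real-time result."""
    return batch_view[page] + realtime_view[page]


master_dataset = [{"page": "/home"}, {"page": "/home"}, {"page": "/about"}]
recent_events = [{"page": "/home"}]

batch_view = batch_layer(master_dataset)
realtime_view = speed_layer(recent_events)
home_views = query("/home", batch_view, realtime_view)
print(home_views)
```

Because the raw events are kept, `batch_layer` can be rerun over the whole data set after a bug fix or with a better algorithm.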
Big Data Reference Architectures -
“Kappa” Architecture
“Kappa Architecture” for Big Data
[Diagram. Data Sources: Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine. Data Ingestion: Channel. “Raw Data Reservoir”: Raw Data (Reservoir), Batch compute, Computed Information. (Analytical) Real-Time Data Processing: Stream/Event Processing, Messaging, Result Store, Queryable State. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
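The slide shows the Kappa architecture only as a diagram. As commonly described, the idea is that a replayable event log plays the role of the raw data store and a single stream-processing code path serves both live processing and reprocessing. The Python sketch below illustrates that idea under these assumptions; `EventLog` is an in-memory stand-in for a real messaging system, and all names and data are hypothetical.

```python
class EventLog:
    """Append-only log; an in-memory stand-in for a replayable messaging system."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset=0):
        return iter(self._events[offset:])


def build_view(log, transform):
    """Replay the log from the start through one stream-processing function."""
    view = {}
    for event in log.read_from(0):
        key, value = transform(event)
        view[key] = view.get(key, 0) + value
    return view


log = EventLog()
for amount in (5, 7, 11):
    log.append({"account": "a1", "amount": amount})

view_v1 = build_view(log, lambda e: (e["account"], e["amount"]))
# Changed logic: replay the same log with the new function, no separate batch job.
view_v2 = build_view(log, lambda e: (e["account"], e["amount"] * 2))
print(view_v1, view_v2)
```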
Organizing NoSQL Data Stores – Different Types
• Key Value Store: K1 = V1, K2 = V2, K3 = V3
• Document store: { k1: v1, k2: v2, k3: [v1, v2, v3] }
• Wide-column store, rows addressed by Rowkey (RK) with a variable set of column keys (CK) per row:
  RK1: CK1 = V1, CK2 = V2, CK3 = V3, CK4 = V4, …
  RK2: CK1 = V1, CK4 = V4, CK6 = V6, …
  …: …, CK3 = V3, …
• Graph store (illustration not recoverable)
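The four store types can be approximated with plain Python data structures. This is only a sketch of the data models, not of any product; the graph data and the `neighbours` helper are hypothetical, while the other values mirror the slide.

```python
# Key-value store: an opaque value looked up by its key.
key_value = {"K1": "V1", "K2": "V2", "K3": "V3"}

# Document store: each document is a nested, self-describing structure.
document = {"k1": "v1", "k2": "v2", "k3": ["v1", "v2", "v3"]}

# Wide-column store: row key -> sparse set of column keys and values.
wide_column = {
    "RK1": {"CK1": "V1", "CK2": "V2", "CK3": "V3", "CK4": "V4"},
    "RK2": {"CK1": "V1", "CK4": "V4", "CK6": "V6"},
}

# Graph store: nodes plus labelled edges (hypothetical example data).
graph = {
    "nodes": {"n1", "n2", "n3"},
    "edges": [("n1", "knows", "n2"), ("n2", "knows", "n3")],
}


def neighbours(graph, node):
    """Return the nodes reachable from `node` over one outgoing edge."""
    return [dst for src, _label, dst in graph["edges"] if src == node]


columns_rk2 = sorted(wide_column["RK2"])
print(columns_rk2, neighbours(graph, "n1"))
```

Rows in the wide-column model do not need to share the same columns: `RK2` has no `CK2`, but it has a `CK6` that `RK1` lacks.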
Organizing NoSQL Data Stores – and the Products
Key Value Store
Wide-column store
Document store
Graph store
[The product logos shown per store type are not recoverable from the extraction.]
Big Data Reference Architectures -
“Unified” Architecture
“Unified Architecture” for Big Data
[Diagram. Data Sources: Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine. Data Ingestion: Channel. (Analytical) Batch Data Processing (Calculate Models of incoming data): Raw Data (Reservoir), Batch compute, Computed Information, Prediction Models, Result Store. (Analytical) Real-Time Data Processing: Stream/Event Processing, Messaging, Result Store, Queryable State. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
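The diagram suggests that the batch side calculates prediction models from stored data, which the real-time side then applies to incoming events. A minimal Python sketch of that split follows, assuming a deliberately trivial "model" (mean and standard deviation); the function names, the factor `k` and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def train_model(history):
    """Batch side: derive a simple model (mean and spread) from stored data."""
    return {"mean": mean(history), "std": pstdev(history)}


def score(model, value, k=2.0):
    """Stream side: flag an incoming value that deviates more than k std devs."""
    return abs(value - model["mean"]) > k * model["std"]


history = [10.0, 12.0, 8.0, 10.0]       # data at rest
model = train_model(history)            # recalculated periodically in batch
flags = [score(model, v) for v in (10.5, 20.0)]  # applied per event in motion
print(model, flags)
```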
Event Driven (Micro-) Service
Architectures
Event-Driven (Micro-) Services Architecture
[Diagram. Data Sources: Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine. Data Ingestion: Channel. “Raw Data Reservoir”: Raw Data (Reservoir), Batch compute, Computed Information, Result Store. Microservice 1 and Microservice 2, each with Service and State; API. Data Consumer: Reports, Service, Analytic Tools, Alerting Tools. Legend: Data in Motion, Data at Rest.]
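As a rough sketch of the pattern in the diagram, the Python example below has two services that each build their own state from events on a shared channel and expose it through a small API. `EventBus` is an in-memory stand-in for a real messaging system, and the service names, topic name and events are hypothetical.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for the messaging channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(event)


class OrderCountService:
    """Microservice with private state built from the events it consumes."""

    def __init__(self, bus):
        self._orders_per_customer = defaultdict(int)
        bus.subscribe("orders", self._on_order)

    def _on_order(self, event):
        self._orders_per_customer[event["customer"]] += 1

    def get_order_count(self, customer):  # the service's API
        return self._orders_per_customer[customer]


class RevenueService:
    """Second microservice: same events, different state."""

    def __init__(self, bus):
        self._revenue = 0.0
        bus.subscribe("orders", self._on_order)

    def _on_order(self, event):
        self._revenue += event["amount"]

    def get_revenue(self):  # the service's API
        return self._revenue


bus = EventBus()
counts = OrderCountService(bus)
revenue = RevenueService(bus)
bus.publish("orders", {"customer": "c1", "amount": 10.0})
bus.publish("orders", {"customer": "c1", "amount": 5.5})
print(counts.get_order_count("c1"), revenue.get_revenue())
```

The services never call each other; they are coupled only through the events on the channel.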
Big Data Ecosystem – many
choices sorted!
Building Blocks for (Big) Data Processing
• Data Acquisition
• Format
• File System
• Stream Processing
• Batch SQL
• Graph DBMS
• Document DBMS
• Relational DBMS
• Visualization
• IoT
• Messaging
• Analytics
• OLAP DBMS
• Query Federation
• Table-Style DBMS
• Key Value DBMS
• Batch Processing
• In-Memory
Big Data Ecosystem – many choices sorted!
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
41 Architektur von Big Data Lösungen

Mais conteúdo relacionado

Mais procurados

Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Guido Schmutz
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Altan Khendup
 
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinReal Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinGuido Schmutz
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cGuido Schmutz
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Big Data Spain
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big dataTrieu Nguyen
 
Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»Anna Shymchenko
 
Twitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitTwitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitGuido Schmutz
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Big Data Spain
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...Big Data Spain
 
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...Romeo Kienzler
 
From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleDr. Mirko Kämpf
 
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about..."Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...Kai Wähner
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive AnalyticsInfochimps, a CSC Big Data Business
 
Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Big Data Spain
 
Migrate and Modernize Hadoop-Based Security Policies for Databricks
Migrate and Modernize Hadoop-Based Security Policies for DatabricksMigrate and Modernize Hadoop-Based Security Policies for Databricks
Migrate and Modernize Hadoop-Based Security Policies for DatabricksDatabricks
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Big Data Spain
 

Mais procurados (20)

Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?Internet of Things - Are traditional architectures good enough?
Internet of Things - Are traditional architectures good enough?
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
 
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinReal Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»Владимир Слободянюк «DWH & BigData – architecture approaches»
Владимир Слободянюк «DWH & BigData – architecture approaches»
 
Twitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitTwitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in Echtzeit
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...
 
From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on Scale
 
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about..."Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...
"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about...
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics
 
Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...
 
Migrate and Modernize Hadoop-Based Security Policies for Databricks
Migrate and Modernize Hadoop-Based Security Policies for DatabricksMigrate and Modernize Hadoop-Based Security Policies for Databricks
Migrate and Modernize Hadoop-Based Security Policies for Databricks
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
 

Semelhante a Architektur von Big Data Lösungen

Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Guido Schmutz
 
Data Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platformsData Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platformsGuido Schmutz
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksGuido Schmutz
 
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesSC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesBigData_Europe
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Matt Stubbs
 
Internet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataInternet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataGuido Schmutz
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...Guido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...Alex Liu
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesJames Serra
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big DataFrank Kienle
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Shirshanka Das
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Yael Garten
 
Self-Service IoT Data Analytics with StreamPipes
Self-Service IoT Data Analytics with StreamPipesSelf-Service IoT Data Analytics with StreamPipes
Self-Service IoT Data Analytics with StreamPipesApache StreamPipes
 

Semelhante a Architektur von Big Data Lösungen (20)

Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016
 
Data Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platformsData Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platforms
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Stream Processing – Concepts and Frameworks
Stream Processing – Concepts and FrameworksStream Processing – Concepts and Frameworks
Stream Processing – Concepts and Frameworks
 
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologiesSC4 Workshop 1: Simon Scerri: Existing tools and technologies
SC4 Workshop 1: Simon Scerri: Existing tools and technologies
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
Big Data LDN 2018: FORTUNE 100 LESSONS ON ARCHITECTING DATA LAKES FOR REAL-TI...
 
Internet of Things (IoT) and Big Data
Internet of Things (IoT) and Big DataInternet of Things (IoT) and Big Data
Internet of Things (IoT) and Big Data
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...
IoT Architecture - Are Traditional Architectures Good Enough or do we Need Ne...
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
BUILDING BETTER PREDICTIVE MODELS WITH COGNITIVE ASSISTANCE IN A DATA SCIENCE...
 
Big Data: It’s all about the Use Cases
Big Data: It’s all about the Use CasesBig Data: It’s all about the Use Cases
Big Data: It’s all about the Use Cases
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
 
Internet of Things and Big Data
Internet of Things and Big DataInternet of Things and Big Data
Internet of Things and Big Data
 
Self-Service IoT Data Analytics with StreamPipes
Self-Service IoT Data Analytics with StreamPipesSelf-Service IoT Data Analytics with StreamPipes
Self-Service IoT Data Analytics with StreamPipes
 


Architektur von Big Data Lösungen (Architecture of Big Data Solutions)

  • 1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH Architektur von Big Data Lösungen Guido Schmutz (guido.schmutz@trivadis.com) @gschmutz
  • 2. Guido Schmutz: Working for Trivadis for more than 20 years; Oracle ACE Director for Fusion Middleware and SOA; co-author of several books; Consultant, Trainer, Software Architect for Java, SOA & Big Data / Fast Data; Member of Trivadis Architecture Board; Technology Manager @ Trivadis; more than 30 years of software development experience. Contact: guido.schmutz@trivadis.com; Blog: http://guidoschmutz.wordpress.com; Slideshare: http://www.slideshare.net/gschmutz; Twitter: gschmutz
  • 3. Agenda: 1. Introduction; 2. Big Data Reference Architectures (Traditional Big Data; Event / Stream-Processing; Lambda Architecture; Kappa Architecture; Unified Architecture; Microservices Architecture); 3. Big Data Ecosystem – many choices sorted!
  • 4. Introduction
  • 5. Big Data Definition (4 Vs). Characteristics of Big Data: its Volume, Velocity and Variety in combination. + Time to action? Big Data + Real-Time = Stream Processing. Reliable Data Ingestion in Big Data/IoT
  • 6. How to do Big Data? Why is structure / architecture important?
  • 7. Why talk about Big Data Architectures? Choosing the right architecture is key for any (big data) project. Big Data is still a rather young field and therefore a “moving target”: no standard architectures are available that have been used for years. In recent years, some architectures and best practices have evolved. Know your use cases before choosing your architecture / technologies. Having a reference architecture in place helps in choosing the right/matching technologies.
  • 8. Important Properties for choosing a (Big) Data Architecture: Latency; Keep raw and un-interpreted data “forever”?; Volume, Velocity, Variety, Veracity; Ad-hoc query capabilities needed?; Robustness & Fault Tolerance; Scalability; …
  • 9. Big Data Reference Architectures - Traditional Big Data
  • 10. “Traditional Architecture” for Big Data [diagram labels: Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion; Pulling Ingestion; Pushing Ingestion; Channel; (Analytical) Data Processing; Raw Data (Reservoir); Batch compute; Computed Information; Result Store; Query Engine; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
  • 11. “Traditional Architecture” for Big Data – Hadoop Technology Mapping [diagram: same labels as slide 10]
  • 12. “Traditional Architecture” for Big Data – Spark Technology Mapping [diagram: same labels as slide 10]
  • 13. “Traditional Architecture” for Big Data – Feeding in High-Volume Event Streams [diagram: same labels as slide 10, with two “?” markers]
  • 14. Traditional Architecture for Big Data • Batch Processing - “Data at Rest” • Not for low-latency use cases • Responses are delivered “after the fact” • Maximum value of the identified situation is lost • Decisions are made on old and stale data • Spark Core is a faster alternative to Hadoop MapReduce, but still Batch Processing • The Spark ecosystem offers many additional advanced analytics capabilities (machine learning, graph processing, …)
  • 15. Big Data Reference Architectures – Event/Stream Processing
  • 16. Event / Stream Processing – “Data in Motion”: Events are analyzed and processed in real-time as they arrive. Decisions are timely, contextual and based on fresh data. Decision latency is eliminated.
  • 17. Event / Stream Processing Architecture [diagram labels: Data Sources (Content, Logfiles, Social, RDBMS, ERP, Sensor, Machine); Data Ingestion; Channel; (Analytical) Real-Time Data Processing; Stream/Event Processing; Messaging; Result Store; Batch compute; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
  • 18. Challenges for Ingesting Data: Multitude of sensors; Real-Time Streaming; Multiple Firmware versions; Bad Data from damaged sensors; Regulatory Constraints; Data Quality
  • 19. Continuous Data Ingestion [diagram labels: DB Source; File Source; Social; IoT Sensor; Log; CDC; CDC GW; Connect; Native; REST; IoT GW; Dataflow; Dataflow GW; Message GW; Queue; Event Hub; Topic; Stream Processing; Big Data]
  • 20. Continuous Data Ingestion [diagram: same labels as slide 19]
  • 21. Event / Stream Processing Architecture – Open Source Technology Mapping [diagram: same labels as slide 17]
  • 22. Event / Stream Processing Architecture – Oracle Technology Mapping [diagram: same labels as slide 17]
  • 23. Event / Stream Processing Architecture: The solution for low-latency use cases. Process each event separately => low latency. Process events in micro-batches => increases latency but offers better reliability. Previously known as “Complex Event Processing”. Keep the data moving / Data in Motion instead of Data at Rest => raw events are not stored.
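The difference between per-event processing and micro-batching on slide 23 can be illustrated with a small sketch. It is not tied to any particular stream engine; the event shape, the `process_per_event` and `micro_batches` helpers, and the temperature threshold are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]  # hypothetical event shape: {"sensor": ..., "temp": ...}


def process_per_event(events: Iterable[Event], handle: Callable[[Event], None]) -> None:
    """Handle every event as soon as it arrives (lowest latency)."""
    for event in events:
        handle(event)


def micro_batches(events: Iterable[Event], batch_size: int) -> Iterator[List[Event]]:
    """Group events into small batches; an event waits until its batch is full."""
    batch: List[Event] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial batch
        yield batch


readings: List[Event] = [{"sensor": "s1", "temp": t} for t in (20, 21, 35, 22, 36)]

# Per-event: record an alert the moment a hot reading is seen.
alerts: List[Event] = []


def alert_if_hot(event: Event) -> None:
    if event["temp"] > 30:
        alerts.append(event)


process_per_event(readings, alert_if_hot)

# Micro-batch: compute one aggregate (the maximum) per batch of two readings.
batch_max = [max(e["temp"] for e in b) for b in micro_batches(readings, batch_size=2)]
```

In the per-event variant a result is available after each single reading, while the micro-batch variant only produces a result once a batch is complete, which is where the added latency comes from.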
  • 24. Event / Stream Processing Architecture - Keep raw event data [diagram labels: Data Sources (Content, Logfiles, Social, RDBMS, ERP, Sensor, Machine); Data Ingestion; Channel; (Analytical) Real-Time Data Processing; Stream/Event Processing; Messaging; Result Store; (Analytical) Batch Data Processing; Raw Data (Reservoir); Batch compute; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
  • 25. Big Data Reference Architectures - Lambda Architecture for Big Data
  • 26. “Lambda Architecture” for Big Data [diagram labels: Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion; Pulling Ingestion; Channel; (Analytical) Batch Data Processing; Raw Data (Reservoir); Batch compute; Computed Information; Result Store; Query Engine; (Analytical) Real-Time Data Processing; Stream/Event Processing; Batch compute; Messaging; Result Store; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
  • 27. Lambda Architecture for Big Data: Combines (Big) Data at Rest with (Fast) Data in Motion. Closes the gap from high-latency batch processing. Keeps the raw information forever. Makes it possible to rerun analytics operations on the whole data set if necessary => because the old run had an error or => because we have found a better algorithm we want to apply. Functionality has to be implemented twice: • once for batch • once for real-time streaming
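A minimal sketch of the Lambda idea from slide 27, assuming a page-view counting use case that is not in the slides: the batch layer recomputes counts from the retained raw events, the speed layer counts only events after the batch cutoff, and a query merges both. Note that the counting logic appears twice, once per layer. The names `batch_view`, `SpeedView` and `query` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Event = Tuple[str, int]  # (page, timestamp); a hypothetical event shape


def batch_view(master_dataset: Iterable[Event], up_to: int) -> Counter:
    """Batch layer: recompute page counts from all raw events up to a cutoff."""
    return Counter(page for page, ts in master_dataset if ts <= up_to)


class SpeedView:
    """Speed layer: the same counting logic again, applied incrementally."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def on_event(self, event: Event) -> None:
        page, _ts = event
        self.counts[page] += 1


def query(page: str, batch: Counter, speed: SpeedView) -> int:
    """Serving side: merge the batch result with the recent real-time result."""
    return batch[page] + speed.counts[page]


raw_events: List[Event] = [("home", 1), ("cart", 2), ("home", 3), ("home", 4), ("cart", 5)]
cutoff = 3  # the last batch run covered events with timestamp <= 3

batch = batch_view(raw_events, up_to=cutoff)
speed = SpeedView()
for ev in raw_events:
    if ev[1] > cutoff:  # only events the batch run has not seen yet
        speed.on_event(ev)

home_total = query("home", batch, speed)
cart_total = query("cart", batch, speed)
```

Because the raw events are kept, `batch_view` can be rerun over the whole data set with a corrected or improved algorithm, which is the rerun capability the slide describes.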
  • 28. Big Data Reference Architectures - “Kappa” Architecture
  • 29. “Kappa Architecture” for Big Data [diagram labels: Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion; Channel; “Raw Data Reservoir”; Raw Data (Reservoir); Batch compute; Computed Information; (Analytical) Real-Time Data Processing; Stream/Event Processing; Messaging; Result Store; Queryable State; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
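As a rough sketch of the Kappa approach, assuming the commonly described pattern in which a single stream-processing code path is used and reprocessing means replaying the retained raw events through a new version of the job: the in-memory `EventLog` below merely stands in for a durable log, and all names and the sensor data are hypothetical.

```python
from typing import Callable, Dict, Iterator, List

Event = Dict[str, object]
State = Dict[str, float]


class EventLog:
    """Append-only log standing in for the retained raw data."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def replay(self, from_offset: int = 0) -> Iterator[Event]:
        return iter(self._events[from_offset:])


def run_job(log: EventLog, update: Callable[[State, Event], None]) -> State:
    """Run a streaming job from the start of the log and return its state."""
    state: State = {}
    for event in log.replay():
        update(state, event)
    return state


def total_v1(state: State, event: Event) -> None:
    sensor = str(event["sensor"])
    state[sensor] = state.get(sensor, 0.0) + float(event["value"])


def total_v2(state: State, event: Event) -> None:
    # Improved logic (an assumption for illustration): ignore negative readings.
    if float(event["value"]) >= 0:
        total_v1(state, event)


log = EventLog()
for sensor, value in (("s1", 2.0), ("s1", -1.0), ("s2", 3.0), ("s1", 4.0)):
    log.append({"sensor": sensor, "value": value})

state_v1 = run_job(log, total_v1)
state_v2 = run_job(log, total_v2)  # reprocessing = replaying the same log
```

There is only one processing path here; the corrected result comes from replaying the log with `total_v2` rather than from a separate batch implementation.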
  • 30. Organizing NoSQL Data Stores – Different Types: Key Value Store; Wide-column store; Document store; Graph store. Key-value example: K1 → V1, K2 → V2, K3 → V3. Document example: { k1: v1, k2: v2, k3: [v1, v2, v3] }. Wide-column example: row key RK1 with columns CK1=V1, CK2=V2, CK3=V3, CK4=V4, …; row key RK2 with columns CK1=V1, CK4=V4, CK6=V6, …; a further row (…) with column CK3=V3
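The four store types on slide 30 can be approximated with plain Python structures. This is only a sketch of the data shapes, not of any product; the key-value, document and wide-column values mirror the slide, while the graph example and the `neighbours` helper are assumptions added for illustration.

```python
from typing import Dict, List, Set, Tuple

# Key-value store: an opaque value per key.
key_value: Dict[str, str] = {"K1": "V1", "K2": "V2", "K3": "V3"}

# Document store: a nested, self-describing document.
document = {"k1": "v1", "k2": "v2", "k3": ["v1", "v2", "v3"]}

# Wide-column store: each row key holds its own, possibly different, set of columns.
wide_column: Dict[str, Dict[str, str]] = {
    "RK1": {"CK1": "V1", "CK2": "V2", "CK3": "V3", "CK4": "V4"},
    "RK2": {"CK1": "V1", "CK4": "V4", "CK6": "V6"},
}

# Graph store (hypothetical example, not from the slide): nodes plus directed edges.
nodes: Set[str] = {"a", "b", "c"}
edges: Set[Tuple[str, str]] = {("a", "b"), ("b", "c")}


def neighbours(node: str) -> List[str]:
    """Return the nodes reachable from `node` over one outgoing edge."""
    return sorted(dst for src, dst in edges if src == node)


value = key_value["K2"]                            # lookup by key only
second_item = document["k3"][1]                    # navigate inside the document
missing_column = wide_column["RK2"].get("CK2")     # RK2 simply has no CK2 column
next_hops = neighbours("a")                        # follow relationships
```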
  • 31. Organizing NoSQL Data Stores – and the Products: Key Value Store; Wide-column store; Document store; Graph store
  • 32. Big Data Reference Architectures - “Unified” Architecture
  • 33. “Unified Architecture” for Big Data [diagram labels: Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion; Channel; (Analytical) Batch Data Processing (Calculate Models of incoming data); Raw Data (Reservoir); Batch compute; Computed Information; Prediction Models; Result Store; (Analytical) Real-Time Data Processing; Stream/Event Processing; Batch compute; Messaging; Result Store; Queryable State; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
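One way to read the “Unified” diagram is that the batch side calculates models from historical data and the real-time side applies those prediction models to incoming events. The sketch below assumes a deliberately simple model (mean and standard deviation used for anomaly scoring); the model choice, the function names and the threshold are illustrative assumptions, not taken from the slides.

```python
from statistics import mean, pstdev
from typing import Dict, List


def train_model(history: List[float]) -> Dict[str, float]:
    """Batch side: derive a simple model from the stored raw data."""
    return {"mean": mean(history), "std": pstdev(history)}


def is_anomaly(model: Dict[str, float], value: float, threshold: float = 3.0) -> bool:
    """Real-time side: score a single incoming event against the model."""
    if model["std"] == 0:
        return value != model["mean"]
    return abs(value - model["mean"]) / model["std"] > threshold


history = [10.0, 12.0, 11.0, 9.0, 13.0]   # data at rest
model = train_model(history)               # recalculated periodically in batch

incoming = [11.5, 30.0]                    # data in motion
flags = [is_anomaly(model, v) for v in incoming]
```

The streaming code stays small because the expensive part, building the model, happens in the batch layer and only the resulting parameters are handed over.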
  • 34. Event Driven (Micro-) Service Architectures
  • 35. Event-Driven (Micro-) Services Architecture [diagram labels: Data Sources (Content, RDBMS, Social, ERP, Logfiles, Sensor, Machine); Data Ingestion; Channel; “Raw Data Reservoir”; Raw Data (Reservoir); Batch compute; Computed Information; Result Store; Microservice 1; Microservice 2; Service; State; API; Data Consumer (Reports, Service, Analytic Tools, Alerting Tools); legend: Data in Motion, Data at Rest]
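A small sketch of the event-driven microservice idea, assuming an in-memory publish/subscribe bus in place of a real event hub: each service keeps its own state and reacts to events rather than calling the other service directly. The `EventBus`, `OrderService` and `RevenueService` names and the order example are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """In-memory stand-in for an event hub with named topics."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


class OrderService:
    """Microservice 1: owns the order state and publishes events about changes."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self.orders: Dict[str, float] = {}

    def place_order(self, order_id: str, amount: float) -> None:
        self.orders[order_id] = amount
        self._bus.publish("order-placed", {"order_id": order_id, "amount": amount})


class RevenueService:
    """Microservice 2: builds its own state purely from the events it consumes."""

    def __init__(self, bus: EventBus) -> None:
        self.total = 0.0
        bus.subscribe("order-placed", self._on_order_placed)

    def _on_order_placed(self, event: Event) -> None:
        self.total += float(event["amount"])

    def get_total(self) -> float:
        """API exposed to data consumers."""
        return self.total


bus = EventBus()
orders = OrderService(bus)
revenue = RevenueService(bus)
orders.place_order("o-1", 10.0)
orders.place_order("o-2", 5.5)
total = revenue.get_total()
```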
  • 36. Big Data Ecosystem – many choices sorted!
  • 37. Building Blocks for (Big) Data Processing: Data Acquisition; Format; File System; Stream Processing; Batch SQL; Graph DBMS; Document DBMS; Relational DBMS; Visualization; IoT; Messaging; Analytics; OLAP DBMS; Query Federation; Table-Style DBMS; Key Value DBMS; Batch Processing; In-Memory
  • 38. Big Data Ecosystem – many choices sorted!