© 2016 MapR Technologies
Streaming Patterns, Revolutionary Architectures
Carol McDonald
Agenda
Streams Core Components
•  Topics, Partitions
•  Fault Tolerance
•  High Availability
Patterns
•  Event Sourcing
•  Duality of Streams and Databases
•  Command Query Responsibility Segregation
•  Polyglot Persistence, Multiple Materialized Views
•  Turning the Database Upside Down
Real World Examples
•  Fraud Detection
•  Healthcare Exchange
Which products are we discussing?
Streams Core Components
What’s a Stream?
A stream is an unbounded sequence of events carried from a set of producers to a set of consumers.
Producers → Events_Stream → Consumers
What is Streaming Data? Got Some Examples?
Data collection devices:
•  Smart Machinery
•  Phones and Tablets
•  Home Automation
•  RFID Systems
•  Digital Signage
•  Security Systems
•  Medical Devices
Why Streams?
Many Big Data sources are Event Oriented
Trigger Events:
•  Stock Prices
•  User Activity
•  Sensor Data
Event Data → Streams (Topics) → Real-Time Analytics
Analyze Data
What if you need to analyze data as it arrives?
Batch Processing with HDFS
6:01 P.M.: 72°
6:02 P.M.: 75°
6:03 P.M.: 77°
6:04 P.M.: 85°
6:05 P.M.: 90°
6:06 P.M.: 85°
6:07 P.M.: 77°
6:08 P.M.: 75°
Analyze: “It was hot at 6:05 yesterday!” (90°)
Event Processing with Streams
Stream, Topic: Temperature
6:05 P.M.: 90°
Turn on the air conditioning!
Organize Data
What if you need to organize data as it arrives?
Integrating Many Data Sources and Applications
Sources (Producers), Applications (Consumers)
Unorganized, Complicated, and Tightly Coupled.
Organize Data into Topics with MapR Streams
Topics organize events into categories and decouple producers from consumers.
MapR Cluster (accessed through the Kafka API):
•  Topic: Pressure
•  Topic: Temperature
•  Topic: Warnings
Consumers
Process High Volume of Data
What if you need to process a high volume of data as it arrives?
What if BP had detected problems before the oil hit the water?
•  1M samples/sec
•  High performance at scale is necessary!
Legacy Messaging
Millions of Sources → Legacy Message Queue → Hundreds of Destinations
•  Message rate <100K/s
•  Publish: insert, Acks
•  Consume: delete, Acks
Mechanisms for Decoupling
Traditional message queues?
•  Huge performance hit for persistence:
–  Message acknowledgement per message per consumer
–  Lots of non-sequential disk I/O when messages are added/removed
Scalable Messaging with MapR Streams
Server 1
Partition1: Topic - Pressure
Partition1: Topic - Temperature
Partition1: Topic - Warning
Server 2
Partition2: Topic - Pressure
Partition2: Topic - Temperature
Partition2: Topic - Warning
Server 3
Partition3: Topic - Pressure
Partition3: Topic - Temperature
Partition3: Topic - Warning
Topics are partitioned for throughput and scalability
Producers are load balanced between partitions
Kafka API
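A minimal sketch of how a producer-side partitioner could spread messages across partitions. The class and method names are hypothetical, and `String.hashCode` is only a stand-in for whatever hash the real client library uses; the sketch just illustrates the two usual strategies (key-based and round-robin).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the MapR Streams or Kafka client partitioner.
final class PartitionChooser {
    private final int numPartitions;
    private final AtomicInteger next = new AtomicInteger();

    PartitionChooser(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** Keyed messages: the same key always maps to the same partition. */
    int forKey(String key) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Unkeyed messages: rotate through the partitions in turn. */
    int roundRobin() {
        return Math.floorMod(next.getAndIncrement(), numPartitions);
    }
}
```

With three partitions, successive `roundRobin()` calls yield 0, 1, 2, 0, and so on, while `forKey("pump-7")` returns the same partition every time, which keeps the events of one key in order.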
Consumer groups can read in parallel
Kafka API
Core Components: Partitions
Partitions:
–  Messages are appended in order
Offset:
–  Sequential id of a message in a partition
MapR Cluster (Producers → partitions → Consumers):
•  Topic: Admission / Server 1, Partition 1: 6 5 4 3 2 1
•  Topic: Admission / Server 2, Partition 2: 3 2 1
•  Topic: Admission / Server 3, Partition 3: 5 4 3 2 1
In each partition the highest offset is the new message and offset 1 is the old message.
Read Cursors
•  Read cursor: offset ID of the most recently read message
•  Producers append new messages to the tail
•  Consumers read from the head
MapR Cluster: Producers → 6 5 4 3 2 1 → Consumer groups, each with its own read cursor
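The partition and read-cursor ideas can be sketched as a tiny in-memory log. This is an illustration only, with hypothetical names: offsets here start at 0, and the cursor stores the next offset to read rather than the last one read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for one partition.
final class PartitionLog {
    private final List<String> messages = new ArrayList<>();
    // consumer group -> next offset to read (one cursor per group)
    private final Map<String, Integer> cursors = new HashMap<>();

    /** Producers append at the tail; the return value is the new message's offset. */
    int append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Returns the group's unread messages and advances its cursor. Nothing is deleted. */
    List<String> poll(String group) {
        int from = cursors.getOrDefault(group, 0);
        List<String> unread = new ArrayList<>(messages.subList(from, messages.size()));
        cursors.put(group, messages.size());
        return unread;
    }
}
```

Because reading only moves a cursor, a second consumer group that polls later still sees every message from offset 0.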
Events are delivered in the order they are received, like a queue.
Partitioned, Sequential Access = High Performance
Unlike a queue, events are persisted even after they’re delivered
Messages remain on the partition, available to other consumers
Minimizes non-sequential disk reads and writes
MapR Cluster (1 Server), Topic: Warning, Partition 1: 3 2 1 (Unread Events)
Consumer → Poll → Client Library → Get Unread → 3 2 1
Considering a Messaging Platform
Kafka-esque Logs?
•  Sequential writing/reading disk:
–  Messages are persisted sequentially as produced, and read sequentially when consumed
•  Performance plus persistence:
–  Performance of up to a billion messages per second at millisecond-level delivery times
Kafka model is BLAZING fast
•  Kafka 0.9 API with message sizes at 200 bytes
•  MapR Streams on a 5 node cluster sustained 18 million events / sec
•  Throughput of 3.5GB/s and over 1.5 trillion events / day
When Are Messages Deleted?
•  Messages can be persisted forever
Or
•  Older messages can be deleted automatically based on time to live
MapR Cluster (1 Server), Partition 1: 6 5 4 3 2 1 (offset 1 is the older message)
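Time-to-live deletion can be sketched as follows. The names are hypothetical, and treating a TTL of zero or less as "keep forever" is an assumption of this sketch, not a documented MapR setting.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of time-based retention on one partition.
final class TtlLog {
    private static final class Entry {
        final long timestampMs;
        final String payload;

        Entry(long timestampMs, String payload) {
            this.timestampMs = timestampMs;
            this.payload = payload;
        }
    }

    private final Deque<Entry> entries = new ArrayDeque<>();
    private final long ttlMs; // assumption: <= 0 means messages are kept forever

    TtlLog(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    void append(long nowMs, String payload) {
        entries.addLast(new Entry(nowMs, payload));
    }

    /** Drops messages older than the TTL. Oldest messages sit at the head, so stop at the first live one. */
    int expire(long nowMs) {
        if (ttlMs <= 0) {
            return 0;
        }
        int removed = 0;
        while (!entries.isEmpty() && nowMs - entries.peekFirst().timestampMs > ttlMs) {
            entries.removeFirst();
            removed++;
        }
        return removed;
    }

    int size() {
        return entries.size();
    }

    /** Payload of the oldest retained message, or null if the log is empty. */
    String oldest() {
        Entry e = entries.peekFirst();
        return e == null ? null : e.payload;
    }
}
```

Since messages are appended in time order, expiry only ever trims the old end of the log, which keeps deletion sequential as well.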
Parallelism When Reading
To read messages from the same Topic in parallel:
•  create consumer groups
•  consumers with same group.id
•  partitions assigned dynamically round-robin
Consumer group: Oil Wells
•  Consumer A
•  Consumer B
•  Consumer C
MapR Cluster, Topic: Warning, Partitions 1 to 5
Fault Tolerance Consumption: Partitions Re-Assigned Dynamically
If a consumer goes offline, its partitions are re-assigned
Consumer group.id: Oil Wells
•  Consumer A
•  Consumer C
MapR Cluster, Topic: Warning, Partitions 1 to 5
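Round-robin assignment and re-assignment after a failure can be sketched with one function that is simply called again with the surviving consumers. The class name and the 1-based partition numbering are choices of this sketch; the actual assignment performed by the client library may differ in detail.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of round-robin partition assignment within a consumer group.
final class GroupAssignment {
    /** Deals partitions 1..numPartitions to the consumers in turn. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String consumer : consumers) {
            result.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return result;
        }
        for (int p = 1; p <= numPartitions; p++) {
            String owner = consumers.get((p - 1) % consumers.size());
            result.get(owner).add(p);
        }
        return result;
    }
}
```

For the five Warning partitions, consumers A, B, C receive {A=[1, 4], B=[2, 5], C=[3]}. If B goes offline, calling `assign` with A and C yields {A=[1, 3, 5], C=[2, 4]}, so no partition is left unread.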
Processing Same Message for Different Views
Pub Sub: Multiple Consumers, Multiple Destinations
Producers → Kafka API → MapR-FS → Kafka API → Consumers
Partition Fault Tolerance
Message Recovery
What if you need to recover messages in case of server failure?
Partitions are Replicated for Fault Tolerance
•  Server 1: Partition1: Warning, Partition2: Warning Replica, Partition3: Warning Replica
•  Server 2: Partition2: Warning, Partition1: Warning Replica, Partition3: Warning Replica
•  Server 3: Partition3: Warning, Partition1: Warning Replica, Partition2: Warning Replica
Producers publish to the Warning topic.
Consumers: Security Investigation & Event Management, Operational Intelligence, Real-time Analytics
Streams and High Availability
Core Components: Streams
•  Stream:
–  collection of topics managed together
•  Manage stream:
–  replication
–  security
–  time-to-live
–  number of partitions
Producers → Stream (topics: Pressure, Temperature, Warning) → Consumers
Replication → second Stream (topics: Pressure, Temperature, Warning) with its own Producers and Consumers
Real-time Access
What if you need real-time access to live data distributed across multiple clusters and multiple data centers?
Lack of Global Replication
Topic: C
Streams and Replication
Streams:
•  are a collection of topics
•  can be replicated worldwide
Topic: A, Topic: B, Topic: C → replicating to another cluster → Topic: A, Topic: B, Topic: C
Streams and Replication
Topic: A
Topic: B
Topic: C
Fail Over
Streams:
•  high availability
•  disaster recovery
Replicating Streams: Master-Slave Replication
Master: Venezuela Cluster, Metrics Stream, topic Metrics
Slave: Venezuela_HA Cluster, Metrics Stream, topic Metrics (High Availability backup for Venezuela)
Producers, Consumers
Replicating Streams: Many-to-One Replication
Many: Venezuela and Mexico, each with a Metrics Stream (topic Metrics), Producers, and Consumers
One: Houston, Metrics Stream (topic Metrics), Consumers
Analyze all data from Houston
Replicating Streams: Multi-Master Replication
Seoul: Metrics Stream (topic Metrics), Producers, Consumers
San Francisco: Metrics Stream (topic Metrics), Producers, Consumers
Both send and receive updates
Stream Replication
Three Streams, each with topics Pressure, Temperature, Warning, replicated over the WAN
Ship picks up containers…
Singapore
Arrives at destination…
Tokyo
While en route to next destination…
Washington
Where does the data live…
Singapore, Tokyo, Washington
What is important about this?
Data is generated on the ship
•  Must have an easy way (i.e. foolproof) to move the data off the ship
Each port stores the data from the ship
•  Moving data between locations
•  Analytics could happen at any location
This is a multi-data center time series data use case
•  Events from sensors = metrics
•  Same concepts as data center monitoring
Patterns
Event Sourcing
Imagine each event as a change to an entry in a database.
Change log (credit, debit events):
1: WillO : Deposit : 100.00
2: BradA : Deposit : 50.00
3: BradA : Withdraw : 30.00
4: WillO : Withdraw : 20.00
Updates → current account balances:
Account Id Balance
WillO 80.00
BradA 20.00
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
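Replaying the change log to rebuild the balances table can be sketched as below. The class, enum, and field names are hypothetical; only the four events and the resulting balances come from the slide.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of event sourcing: the table is derived from the log.
final class AccountReplay {
    enum Kind { DEPOSIT, WITHDRAW }

    static final class Event {
        final String account;
        final Kind kind;
        final BigDecimal amount;

        Event(String account, Kind kind, String amount) {
            this.account = account;
            this.kind = kind;
            this.amount = new BigDecimal(amount);
        }
    }

    // The four events shown on the slide, in log order.
    static final List<Event> SLIDE_LOG = Arrays.asList(
            new Event("WillO", Kind.DEPOSIT, "100.00"),
            new Event("BradA", Kind.DEPOSIT, "50.00"),
            new Event("BradA", Kind.WITHDRAW, "30.00"),
            new Event("WillO", Kind.WITHDRAW, "20.00"));

    /** Applies every event in order and returns the current balance per account. */
    static Map<String, BigDecimal> replay(List<Event> log) {
        Map<String, BigDecimal> balances = new LinkedHashMap<>();
        for (Event e : log) {
            BigDecimal delta = e.kind == Kind.DEPOSIT ? e.amount : e.amount.negate();
            balances.merge(e.account, delta, BigDecimal::add);
        }
        return balances;
    }
}
```

`AccountReplay.replay(AccountReplay.SLIDE_LOG)` gives WillO 80.00 and BradA 20.00, matching the table. The reverse is not possible: the table alone cannot tell you which deposits and withdrawals produced it.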
Duality of Streams and Tables:
•  Database: captures data at rest
•  Stream: captures data change
Replication through the Change Log:
•  Master: Append writes
•  Slave: Apply writes in order
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
Which Makes a Better System of Record?
Which of these can be used to reconstruct the other?
Change Log:
1: WillO : Deposit : 100.00
2: BradA : Deposit : 50.00
3: BradA : Withdraw : 30.00
4: WillO : Withdraw : 20.00
Current balances:
Account Id Balance
WillO 80.00
BradA 20.00
Rewind: Reprocessing Events
MapR Cluster: Producers → 6 5 4 3 2 1
Consumer: reprocess from oldest message
Create new view, index, cache
Rewind: Reprocessing Events
MapR Cluster: Producers → 6 5 4 3 2 1
Consumer: reprocess up to the newest message to build the new view
Read from new view
Event Sourcing, Command Query Responsibility Segregation:
Turning the Database Upside Down
Events → Updates → Key-Val, Document, Graph, Wide Column, Time Series, Relational, ???
What Else Do I Use My Stream For?
Lineage - “how did BradA’s balance get so low?”
Auditing - “who deposited/withdrew from BradA’s account?”
History - “what was the status of the accounts last year?”
Integrity - “can I trust this data hasn’t been tampered with?”
•  Yup - Streams are immutable
0: WillO : Deposit : 100.00
1: BradA : Deposit : 50.00
2: BradA : Withdraw : 30.00
3: WillO : Withdraw: 20.00
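A sketch of how lineage and history questions could be answered straight from the immutable log. The class name, the compact `account:kind:amount` encoding, and the method names are assumptions of this example; the array index plays the role of the offset (0 to 3, as on the slide).

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of querying a persisted stream.
final class StreamQueries {
    // The slide's log; the array index is the offset.
    static final String[] LOG = {
        "WillO:Deposit:100.00",
        "BradA:Deposit:50.00",
        "BradA:Withdraw:30.00",
        "WillO:Withdraw:20.00"
    };

    /** History: balance of an account after replaying offsets 0..upToOffset inclusive. */
    static BigDecimal balanceAt(String account, int upToOffset) {
        BigDecimal balance = BigDecimal.ZERO;
        int last = Math.min(upToOffset, LOG.length - 1);
        for (int offset = 0; offset <= last; offset++) {
            String[] f = LOG[offset].split(":");
            if (!f[0].equals(account)) {
                continue;
            }
            BigDecimal amount = new BigDecimal(f[2]);
            balance = f[1].equals("Deposit") ? balance.add(amount) : balance.subtract(amount);
        }
        return balance;
    }

    /** Lineage and auditing: offsets of every event that touched the account. */
    static List<Integer> lineage(String account) {
        List<Integer> offsets = new ArrayList<>();
        for (int offset = 0; offset < LOG.length; offset++) {
            if (LOG[offset].startsWith(account + ":")) {
                offsets.add(offset);
            }
        }
        return offsets;
    }
}
```

`lineage("BradA")` returns offsets 1 and 2, which explains how the balance got so low, and `balanceAt("BradA", 1)` shows the balance was 50.00 before the withdrawal.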
What Do I Need For This to Work?
Infinitely persisted events
A way to query your persisted stream data
An integrated security model across the stream and databases
Fraud Detection
Point of Sale -> Data Center: is the transaction fraud?
•  Lots of requests
•  Need answer within ~50 to 100 milliseconds
Point of Sale sends location, time, card# to the Data Center; the Data Center answers fraud yes/no.
Traditional Solution
POS 1..n → Fraud detector ↔ Last card use
1.  Look up last card use
2.  Compute the card velocity:
•  Subtract last location, time from current location, time
3.  Update last card use
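The card velocity in step 2 can be sketched as distance divided by elapsed time. The haversine formula for great-circle distance, the 900 km/h threshold, and all names below are assumptions of this example; the slide only says to subtract the last location and time from the current ones.

```java
// Hypothetical illustration of a card-velocity check.
final class CardVelocity {
    static final double EARTH_RADIUS_KM = 6371.0;
    static final double MAX_PLAUSIBLE_KMH = 900.0; // assumed threshold, roughly airliner speed

    /** Great-circle distance between two points (haversine formula), in kilometres. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Speed implied by two consecutive uses of the same card, in km/h. */
    static double velocityKmh(double lastLat, double lastLon, long lastEpochSec,
                              double lat, double lon, long epochSec) {
        double km = distanceKm(lastLat, lastLon, lat, lon);
        double hours = (epochSec - lastEpochSec) / 3600.0;
        if (hours <= 0) {
            // Same instant: only plausible if the card did not move.
            return km == 0 ? 0 : Double.POSITIVE_INFINITY;
        }
        return km / hours;
    }

    static boolean suspicious(double velocityKmh) {
        return velocityKmh > MAX_PLAUSIBLE_KMH;
    }
}
```

A card used at two points about 111 km apart within one minute implies several thousand km/h, which this sketch flags; the same two uses an hour apart would not be flagged.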
What Happens Next?
Several fraud detectors (each serving POS 1..n) share one Last card use table:
1.  Look up last card use
2.  Compute the card velocity
3.  Update last card use
Bottleneck!
Service Isolation: Separate Read from Write
POS 1..n → Fraud detector → card activity → Updater → Last card use
Fraud detector: Read last card use
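Separate Read Model from the Write Model:
Command Query Responsibility Segregation
POS 1..n → Fraud detector → card activity → Updater → Last card use
•  Read: the fraud detector reads last card use
•  Event: the fraud detector publishes last card use to card activity
•  Write: the updater writes last card use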
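The read/write separation can be sketched with an in-memory queue standing in for the card activity topic and a map standing in for the last-card-use table. All names are hypothetical; the point is only that the fraud detector reads and publishes, while the updater is the sole writer.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration of CQRS for the fraud-detection example.
final class CardActivityCqrs {
    private final Queue<String[]> cardActivity = new ArrayDeque<>(); // stand-in for the topic
    private final Map<String, String> lastCardUse = new HashMap<>(); // read model

    /** Fraud detector: reads the last use, then publishes an event. It never writes the table. */
    String check(String card, String use) {
        String previous = lastCardUse.get(card); // null if the card has not been seen
        cardActivity.add(new String[] { card, use });
        return previous;
    }

    /** Updater: the only writer of the read model; drains pending events in order. */
    int runUpdater() {
        int applied = 0;
        String[] event;
        while ((event = cardActivity.poll()) != null) {
            lastCardUse.put(event[0], event[1]);
            applied++;
        }
        return applied;
    }
}
```

Because the detector no longer updates the table, more detectors can be added without contending on writes; the trade-off is that the read model lags until the updater has run.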
Event Sourcing: New Uses of Data
Processing Same Message for Multiple Views
POS 1..n → Fraud detector → card activity → Updater → Last card use
The same card activity messages also feed: Card location history, Other
Scaling Through Isolation allows Multiple Consumers
Multiple fraud detectors can use the same message queue
•  De-coupling and isolation are key
•  Propagate events, not table updates
Each fraud detector (POS 1..n) has its own Updater and Last card use table, fed by the shared card activity stream.
Decoupled Architecture
Producers → Activity Handler → Historical, Interesting Data, Real-time Analysis, Results Dashboard, Anomaly Detection
More than one component can make use of the same stream of messages for a variety of uses.
Lessons
De-coupling and isolation are key
Propagate events, not table updates
Building Enterprise Software vs Internet Companies
Enterprise Software:
•  Complexity of domain => Business logic, Business rules
•  Banking, Healthcare, Telecom
•  Compliance => Security
Internet Companies:
•  Volume of data => Complex data infrastructure
•  Large Scale Availability, Recovery
Reference: Martin Kleppmann
Building Enterprise Software vs Internet Companies
Enterprise Software:
Event Sourcing
Internet Companies:
Stream Processing
Reference Martin Kleppmann
Real World Solution
Credit Card Fraud Model Building
Fraud Stream Processing Architecture
Source → Data Ingest → Stream Processing → NoSQL Storage → Serve
•  Streams: Topic: A, Topic: B, Topic: C
•  Storage: MapR-FS, MapR-DB
Fraud Processing
Streams (Messaging): topics raw, enriched, alerts
Stream Processing: process, derive features, Model
Batch Processing: raw, enriched, alerts stored in MapR-FS and MapR-DB; build model, update model
Fraud Event Processing
Streams (Messaging) → Stream Processing → NoSQL Storage (MapR-FS, MapR-DB)
Topics: Raw, Enriched, Fraud
1.  Parse raw event
2.  Read card holder profile from MapR-DB
3.  Derive features
4.  Get prediction from model with features
5.  Publish not fraud to enriched topic
6.  Publish fraud to fraud topic
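The six steps above can be sketched end to end with in-memory stand-ins. Everything here is hypothetical: the `card,amount` event format, the map that replaces the MapR-DB profile lookup, the two lists that replace the topics, the feature choice, and the model, which is just a function from features to a fraud score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of the fraud event-processing steps.
final class FraudRouter {
    final Map<String, Double> avgSpendByCard = new HashMap<>(); // stand-in for profiles in MapR-DB
    final List<String> enrichedTopic = new ArrayList<>();       // stand-in for the enriched topic
    final List<String> fraudTopic = new ArrayList<>();          // stand-in for the fraud topic
    private final ToDoubleFunction<double[]> model;             // features -> fraud score in [0, 1]
    private final double threshold;

    FraudRouter(ToDoubleFunction<double[]> model, double threshold) {
        this.model = model;
        this.threshold = threshold;
    }

    void process(String rawEvent) {
        // 1. Parse the raw event (assumed format: "card,amount").
        String[] fields = rawEvent.split(",");
        String card = fields[0].trim();
        double amount = Double.parseDouble(fields[1].trim());
        // 2. Read the card holder profile.
        double avg = avgSpendByCard.getOrDefault(card, amount);
        // 3. Derive features: the amount and its ratio to the holder's average spend.
        double[] features = { amount, avg > 0 ? amount / avg : 1.0 };
        // 4. Get a prediction from the model.
        double score = model.applyAsDouble(features);
        // 5 and 6. Publish to the enriched topic or to the fraud topic.
        String enriched = rawEvent + ",score=" + score;
        (score >= threshold ? fraudTopic : enrichedTopic).add(enriched);
    }
}
```

A usage example: with `new FraudRouter(f -> f[1] > 5 ? 0.9 : 0.1, 0.5)` and an average spend of 20 for card `c1`, the event `c1,25` lands in the enriched list and `c1,500` lands in the fraud list.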
Fraud Processing Same Message for Different Views
•  Partition1: Topic - Raw Trans, Topic - Enriched, Topic - Fraud Alert
•  Partition2: Topic - Raw Trans, Topic - Enriched, Topic - Fraud Alert
•  Partition3: Topic - Raw Trans, Topic - Enriched, Topic - Fraud Alert
Consumers of each topic write to MapR-FS and MapR-DB
Real World Solution
Transforming the Health Care Ecosystem
“The Stream is the System of Record”
–Brad Anderson, VP Big Data Informatics
Electronic Medical Records feed:
•  JSON DB (MapR-DB)
•  Graph DB (Titan on MapR-DB)
•  Search Engine (Elastic-Search)
Liaison ALLOY™ Platform
Data Integration: ingest, transform, syndicate
Data Management: master, deduplicate, harmonize, relate, merge, tokenize, store / persist, analyze, summarize, report, distill, recommend, explore, query, sandbox, batch transform, learn, traverse
Use Case: Streaming System of Record for Healthcare
Objective:
•  Build a flexible, secure healthcare exchange
Challenges:
•  Many different data models
•  Security and privacy issues
•  HIPAA compliance
Records → Records Analysis Applications
ALLOY Health: Exchange State HIE
Provider Organizations: Clinical Data, Financial Data
Clinical Data Viewer
Analytics queries like:
•  What are the outcomes in the entire state on diabetes?
•  Are there doctors that are doing this better than others?
2,000+ Practices, 200+ Labs, 30,000+ Clinicians
OrdersAnywhere:
•  PORTAL (no EHR)
•  EHR with HL7 ONLY
•  EHR with WORKFLOW INTEGRATION
•  RADIOLOGY
•  LAB
This is a PAIN!
COMPLIANCE: SECURITY CONTROLS, COMPLIANCE FEATURES, PRIVACY
•  PCI DSS 3.0
•  21 CFR Part 11
•  SSAE16 / SOC2
•  HIPAA/HITECH
WHY NOW?
http://bit.ly/29aBatK
2014 FQ4 profit: $ -440 M
Total Cost Estimate: $ -12 B
Why Now? The Relational Database Is Not the Only Tool
Context and Relationships:
•  Patient 1234: patient_id 1234, Name Jon Smith, Age 50
•  Patient 999: patient_id 999, Name Jonathan Smith, DOB Jun 1965
•  Provider 86: provider_id 86, Name Dr. Nora Paige, Specialty Diabetes
•  Prescription 9876: rx_id 9876, Name Sitagliptin, Dosage 325mg
Relationships: Visited, Prescribed, WasPrescribed
WHY NOW? Mind the Gap
Streaming System of Record for Healthcare
Records → Applications → Stream (Topic: 6 5 4 3 2 1) → Micro Services → Materialized Views → API
•  Streaming System of Record: the stream topic
•  Materialized Views: Search, Graph DB, JSON, HBase
The Promised Land
App → API → pre-processor → Immutable Log: Raw Data, Document Log (MapR-FS)
Workflows build materialized views from the log:
•  Key/Value (MapR-DB)
•  Search Engine
•  Graph (ArangoDB)
•  Time Series (OpenTSDB)
CEP
Micro services serve the materialized views to Apps
Compliance Auditor
The Promised Land
Auditor smiley faces
•  Data Lineage
•  Audit Logging
•  Wire-level encryption
•  At Rest encryption
Replication
•  Disaster Recovery
•  EU – data can’t leave
Non-Stream / Non-“Big Data”
•  Software Development Lifecycle
•  System Hardening
•  Separation of Concerns
–  Dev vs Ops
•  Patch Management
Solution
Design/architecture solved some
•  Streams
•  Data Lineage/System of Record
•  Kappa Architecture (Kreps/Kleppmann)
MapR solved others
•  Unified Security
•  Replication DC to DC
•  Converge Kafka/HBase/Hadoop to one cluster
•  Multi-tenancy (lots of topics, for lots of tenants)
API
Sample Producer: All Together
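import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SampleProducer {
    // MapR Streams topic path: /<stream path>:<topic name>
    public static String topic = "/streams/pump:warning";
    public static KafkaProducer<String, String> producer;

    public static void main(String[] args) {
        producer = setUpProducer();
        for (int i = 0; i < 3; i++) {
            String txt = "msg " + i;
            ProducerRecord<String, String> rec = new ProducerRecord<>(topic, txt);
            producer.send(rec); // asynchronous; close() below flushes pending sends
            System.out.println("Sent msg number " + i);
        }
        producer.close();
    }

    // Assumed helper (not shown on the slide): String serializers for key and value.
    // With plain Apache Kafka you would also set bootstrap.servers.
    static KafkaProducer<String, String> setUpProducer() {
        Properties props = new Properties();
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return new KafkaProducer<>(props);
    }
}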
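Sample Consumer: All Together
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyConsumer {
    public static String topic = "/stream/pump:warning";
    public static KafkaConsumer<String, String> consumer;
    static final long pollTimeOut = 1000; // milliseconds; value assumed, not on the slide

    public static void main(String[] args) {
        configureConsumer(args);
        consumer.subscribe(Arrays.asList(topic)); // subscribe takes a list of topics
        try {
            while (true) {
                ConsumerRecords<String, String> msg = consumer.poll(pollTimeOut);
                for (ConsumerRecord<String, String> record : msg) {
                    System.out.println("read " + record.toString());
                }
            }
        } finally {
            consumer.close(); // reached only when poll() throws
        }
    }

    // Assumed helper (not shown on the slide); the group.id value is hypothetical.
    static void configureConsumer(String[] args) {
        Properties props = new Properties();
        props.put("group.id", "pump-readers");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumer = new KafkaConsumer<>(props);
    }
}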
Summary
Can we get “Extreme”?
Extreme
1+ Trillion Events
•  per day
Millions of Producers
•  Billions of events per second
Multiple Consumers
•  Potentially for every event
Multiple Data Centers
•  Plan for success
•  Plan for drastic failure
Average Reality
Think that is crazy? Consider having 100 servers and performing:
Monitoring and Application logs…
•  100 metrics per server
•  60 samples per minute
•  50 metrics per request
•  1,000 log entries per request (abnormally small, depends on level)
•  1 million requests per day
~2 billion events per day, for one small(ish) use case
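The "Average Reality" estimate can be checked with a few lines of arithmetic. The sketch assumes that each of the 100 metrics on each server is sampled 60 times per minute and that every request emits 50 metrics plus 1,000 log entries; the class and method names are hypothetical.

```java
// Hypothetical illustration of the events-per-day estimate.
final class EventVolume {
    /** Monitoring events: servers x metrics x samples per minute x minutes per day. */
    static long metricEventsPerDay(long servers, long metricsPerServer, long samplesPerMinute) {
        return servers * metricsPerServer * samplesPerMinute * 60 * 24;
    }

    /** Application events: each request emits its metrics plus its log entries. */
    static long requestEventsPerDay(long requests, long metricsPerRequest, long logEntriesPerRequest) {
        return requests * (metricsPerRequest + logEntriesPerRequest);
    }

    public static void main(String[] args) {
        long monitoring = metricEventsPerDay(100, 100, 60);              // 864,000,000
        long application = requestEventsPerDay(1_000_000L, 50, 1_000);   // 1,050,000,000
        System.out.println(monitoring + application);                    // prints 1914000000
    }
}
```

The total of about 1.9 billion events per day agrees with the slide's "~2 billion" figure.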
Building a Complete Data Architecture
MapR Converged Data Platform:
•  MapR File System (MapR-FS)
•  MapR Database (MapR-DB)
•  MapR Streams
Sources/Apps, Stream Processing, Bulk Processing
Find my slides & other related materials to this talk here:
bit.ly/jjug-aug2016
MapR Blog
• https://www.mapr.com/blog/
community.mapr.com
…helping you put data technology to work
Connect with fellow Apache Hadoop and Spark professionals
●  Find answers
●  Ask technical questions
●  Join on-demand training course discussions
●  Follow release announcements
●  Share and vote on product ideas
●  Find Meetup and event listings

Mais conteúdo relacionado

Mais procurados

Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Carol McDonald
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes StrategicMapR Technologies
 
Introduction to Spark on Hadoop
Introduction to Spark on HadoopIntroduction to Spark on Hadoop
Introduction to Spark on HadoopCarol McDonald
 
Free Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkFree Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkMapR Technologies
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications MapR Technologies
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningMapR Technologies
 
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBCarol McDonald
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR Technologies
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLMapR Technologies
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...Debraj GuhaThakurta
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningCarol McDonald
 
Dealing with an Upside Down Internet
Dealing with an Upside Down InternetDealing with an Upside Down Internet
Dealing with an Upside Down InternetMapR Technologies
 
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop BigDataEverywhere
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleMapR Technologies
 

Mais procurados (19)

Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes Strategic
 
Introduction to Spark on Hadoop
Introduction to Spark on HadoopIntroduction to Spark on Hadoop
Introduction to Spark on Hadoop
 
Free Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache SparkFree Code Friday - Machine Learning with Apache Spark
Free Code Friday - Machine Learning with Apache Spark
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap Learning
 
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
 
MapR 5.2 Product Update
MapR 5.2 Product UpdateMapR 5.2 Product Update
MapR 5.2 Product Update
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data Platform
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQL
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
IoT Use Cases with MapR
IoT Use Cases with MapRIoT Use Cases with MapR
IoT Use Cases with MapR
 
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
 
Predicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine LearningPredicting Flight Delays with Spark Machine Learning
Predicting Flight Delays with Spark Machine Learning
 
Dealing with an Upside Down Internet
Dealing with an Upside Down InternetDealing with an Upside Down Internet
Dealing with an Upside Down Internet
 
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
 
MapR & Skytree:
MapR & Skytree: MapR & Skytree:
MapR & Skytree:
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at Scale
 

Right Money Management App For Your Financial Goals
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Streaming Patterns Revolutionary Architectures with the Kafka API

  • 1. © 2016 MapR Technologies. Streaming Patterns, Revolutionary Architectures. Carol McDonald
  • 2. Agenda. Streams Core Components: Topics, Partitions; Fault Tolerance; High Availability. Patterns: Event Sourcing; Duality of Streams and Databases; Command Query Responsibility Separation; Polyglot Persistence, Multiple Materialized Views; Turning the Database Upside Down. Real World Examples: Fraud Detection; Healthcare Exchange.
  • 3. Which products are we discussing?
  • 4. Streams Core Components
  • 5. What’s a Stream? A stream is an unbounded sequence of events carried from a set of producers to a set of consumers. (Diagram labels: Producers, Events, Events_Stream, Consumers.)
  • 6. What is Streaming Data? Got Some Examples? Data Collection Devices: Smart Machinery, Phones and Tablets, Home Automation, RFID Systems, Digital Signage, Security Systems, Medical Devices.
  • 7. Why Streams? Many Big Data sources are Event Oriented. Trigger Events: Stock Prices, User Activity, Sensor Data. (Diagram labels: Event Data, Stream, Topic, Real-Time Analytics.)
  • 8. Analyze Data: What if you need to analyze data as it arrives?
  • 9. Batch Processing with HDFS. Stored readings: 6:01 P.M.: 72°; 6:02 P.M.: 75°; 6:03 P.M.: 77°; 6:04 P.M.: 85°; 6:05 P.M.: 90°; 6:06 P.M.: 85°; 6:07 P.M.: 77°; 6:08 P.M.: 75°. Analyze: 90°. "It was hot at 6:05 yesterday!"
  • 10. Event Processing with Streams. Stream Topic: Temperature. 6:05 P.M.: 90°. "Turn on the air conditioning!"
  • 11. Organize Data: What if you need to organize data as it arrives?
  • 12. Integrating Many Data Sources and Applications. Sources (Producers), Applications (Consumers): Unorganized, Complicated, and Tightly Coupled.
  • 13. Organize Data into Topics with MapR Streams. Topics Organize Events into Categories and Decouple Producers from Consumers. (Diagram labels: MapR Cluster with Topic: Pressure, Topic: Temperature, Topic: Warnings; Consumers; Kafka API.)
  • 14. Process High Volume of Data: What if you need to process a high volume of data as it arrives?
  • 15. What if BP had detected problems before the oil hit the water? • 1M samples/sec • High performance at scale is necessary!
  • 16. Legacy Messaging. Millions of Sources, Hundreds of Destinations. Legacy Message Queue: message rate <100K/s. (Diagram labels: Publish, insert, Acks; Consume, delete, Acks.)
  • 17. Mechanisms for Decoupling: Traditional message queues? Huge performance hit for persistence: message acknowledgement per message per consumer; lots of non-sequential disk I/O when messages are added or removed.
  • 18. Scalable Messaging with MapR Streams. Topics are partitioned for throughput and scalability. Server 1: Partition1 of Topics Pressure, Temperature, Warning. Server 2: Partition2 of Topics Pressure, Temperature, Warning. Server 3: Partition3 of Topics Pressure, Temperature, Warning.
  • 19. Scalable Messaging with MapR Streams. Producers are load balanced between partitions (Kafka API). (Diagram: Partition1, Partition2, and Partition3 of Topics Pressure, Temperature, Warning.)
  • 20. Scalable Messaging with MapR Streams. Consumer groups can read in parallel (Kafka API). (Diagram: Partition1, Partition2, and Partition3 of Topics Pressure, Temperature, Warning, each read by Consumers.)
  • 21. Core Components: Partitions. Partitions: messages are appended in order. Offset: sequential id of a message in a partition. (Diagram: Producers write to Topic: Admission, with Partition 1 on Server 1 holding offsets 1 to 6, Partition 2 on Server 2 holding offsets 1 to 3, and Partition 3 on Server 3 holding offsets 1 to 5; Consumers read each partition. Legend: 6 5 4 3 2 1, from New Message to Old Message.)
  • 22. Read Cursors • Read cursor: offset ID of most recent read message • Producers append new messages to tail • Consumers read from head. (Diagram: Producers, MapR Cluster with messages 6 5 4 3 2 1, and two Consumer groups, each with its own read cursor.)
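The offset and read-cursor ideas on slides 21 and 22 can be sketched with a small in-memory stand-in. This is only an illustration, not the MapR Streams or Kafka client API: `PartitionLog`, `append`, and `poll` are hypothetical names, offsets start at 0 here, and the cursor stores the next offset to read rather than the last one read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a single partition; not a real client API. */
final class PartitionLog {
    // Messages are only ever appended, so a message's index is its offset.
    private final List<String> messages = new ArrayList<>();
    // One read cursor per consumer group: the offset of the next message to read.
    private final Map<String, Integer> cursors = new HashMap<>();

    /** Appends a message to the tail and returns its offset (0-based in this sketch). */
    int append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Returns the messages this group has not read yet and advances its cursor. */
    List<String> poll(String groupId) {
        int from = cursors.getOrDefault(groupId, 0);
        List<String> unread = new ArrayList<>(messages.subList(from, messages.size()));
        cursors.put(groupId, messages.size());
        return unread;
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        log.append("m1");
        log.append("m2");
        System.out.println(log.poll("groupA")); // groupA reads m1 and m2
        log.append("m3");
        System.out.println(log.poll("groupA")); // only m3 is unread for groupA
        System.out.println(log.poll("groupB")); // groupB has its own cursor and sees all three
    }
}
```

Because reading only moves a cursor and never removes anything, a second consumer group can read the same messages independently.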
  • 23. Events are delivered in the order they are received, like a queue. Partitioned, Sequential Access = High Performance. (Diagram: Producers write to Topic: Admission, with Partition 1 on Server 1, Partition 2 on Server 2, and Partition 3 on Server 3; Consumers read each partition. Legend: 6 5 4 3 2 1, from New Message to Old Message.)
  • 24. Unlike a queue, events are persisted even after they’re delivered. Messages remain on the partition, available to other consumers. Minimizes non-sequential disk reads and writes. (Diagram: MapR Cluster (1 Server), Topic: Warning, Partition 1 with events 3 2 1; the Consumer polls through the Client Library to get unread events.)
  • 25. Considering a Messaging Platform: Kafka-esque Logs? Sequential disk writes and reads: messages are persisted sequentially as produced and read sequentially when consumed. Performance plus persistence: up to a billion messages per second at millisecond-level delivery times. Kafka model is BLAZING fast: with the Kafka 0.9 API and message sizes of 200 bytes, MapR Streams on a 5 node cluster sustained 18 million events/sec, a throughput of 3.5 GB/s and over 1.5 trillion events/day.
  • 26. When Are Messages Deleted? Messages can be persisted forever, or older messages can be deleted automatically based on time to live. (Diagram: MapR Cluster (1 Server), Partition 1 with messages 6 5 4 3 2 1; 1 is the older message.)
  • 27. Parallelism When Reading. To read messages from the same Topic in parallel: create consumer groups (consumers with the same group.id); partitions are assigned dynamically, round-robin. (Diagram: Consumer group Oil Wells with Consumer A, Consumer B, and Consumer C reading Partitions 1 to 5 of Topic Warning on the MapR Cluster.)
  • 28. Fault Tolerance, Consumption: Partitions Re-Assigned Dynamically. If a consumer goes offline, its partitions are re-assigned. (Diagram: Consumer group.id Oil Wells with Consumer A and Consumer C reading Partitions 1 to 5 of Topic Warning on the MapR Cluster.)
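To make the round-robin assignment and re-assignment on slides 27 and 28 concrete, here is a minimal sketch. In a real deployment the assignment is done by the messaging system, not by application code; `RoundRobinAssignment` and `assign` are hypothetical names, and the exact partition-to-consumer mapping a real cluster picks may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative round-robin partition assignment; not the actual server-side algorithm. */
final class RoundRobinAssignment {

    /** Deals partitions 1..partitionCount to the consumers in turn. */
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 1; partition <= partitionCount; partition++) {
            String owner = consumers.get((partition - 1) % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three consumers in the group share five partitions.
        System.out.println(assign(Arrays.asList("A", "B", "C"), 5)); // {A=[1, 4], B=[2, 5], C=[3]}
        // Consumer B goes offline: all five partitions are re-dealt to A and C.
        System.out.println(assign(Arrays.asList("A", "C"), 5));      // {A=[1, 3, 5], C=[2, 4]}
    }
}
```

Every partition always has exactly one owner in the group, which is what lets the group read in parallel without two consumers processing the same message.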
  • 29. Processing Same Message for Different Views. Pub Sub: Multiple Consumers, Multiple Destinations. (Diagram labels: Producers, Kafka API, MapR-FS, Kafka API, Consumers.)
  • 30. Partition Fault Tolerance
  • 31. Message Recovery: What if you need to recover messages in case of server failure?
  • 32. Partitions are Replicated for Fault Tolerance. (Diagram: three Producers; Server 1 holds Partition1: Topic - Warning, replicated to Server 2 and Server 3; Server 2 holds Partition2: Topic - Warning, replicated to Server 1 and Server 3; Server 3 holds Partition3: Topic - Warning, replicated to Server 1 and Server 2.)
  • 33. Partitions are Replicated for Fault Tolerance. (Diagram: Server 1 holds Partition1: Warning plus replicas of Partition2 and Partition3; Server 2 holds Partition2: Warning plus replicas of Partition1 and Partition3; Server 3 holds Partition3: Warning plus replicas of Partition1 and Partition2. Three Producers; consumers are Security Investigation & Event Management, Operational Intelligence, and Real-time Analytics.)
  • 34. Partitions are Replicated for Fault Tolerance. (Diagram labels as on slide 33.)
  • 35. Partitions are Replicated for Fault Tolerance. (Diagram labels as on slide 33.)
  • 36. Streams and High Availability
  • 37. Core Components: Streams. Stream: collection of topics managed together. Manage stream: replication, security, time-to-live, number of partitions. (Diagram: two Streams, each with Topics Pressure, Temperature, Warning, Producers, and Consumers, linked by Replication.)
  • 38. Real-time Access: What if you need real-time access to live data distributed across multiple clusters and multiple data centers?
  • 39. Lack of Global Replication. (Diagram label: Topic: C.)
  • 40. Streams and Replication. Streams are a collection of topics and can be replicated worldwide. (Diagram: Topic: A, Topic: B, Topic: C replicating to another cluster.)
  • 41. Streams and Replication: Fail Over. Streams provide high availability and disaster recovery. (Diagram: Topic: A, Topic: B, Topic: C.)
  • 42. Replicating Streams: Master-Slave Replication. (Diagram: Venezuela Cluster (Master) and Venezuela_HA Cluster (Slave), each with a Metrics Stream carrying a Metrics topic; Producers and Consumers. Venezuela_HA is the High Availability Backup for Venezuela.)
  • 43. Replicating Streams: Many-to-One Replication. (Diagram: Houston, Venezuela, and Mexico each have a Metrics Stream carrying a Metrics topic, with Producers and Consumers; Many replicate to One.) Analyze all data from Houston.
  • 44. Replicating Streams: Multi-Master Replication. (Diagram: Seoul and San Francisco each have a Metrics Stream carrying a Metrics topic, with Producers and Consumers.) Both send and receive updates.
  • 45. Stream Replication (WAN). (Diagram: three Streams, each with Topics Pressure, Temperature, Warning.)
  • 46. Ship picks up containers… Singapore
  • 47. Arrives at destination… Tokyo
  • 48. While en route to next destination… Washington
  • 49. Where does the data live… Singapore, Washington, Tokyo
  • 50. What is important about this? Data is generated on the ship: there must be an easy (i.e. foolproof) way to move the data off the ship. Each port stores the data from the ship: moving data between locations; analytics could happen at any location. This is a multi-data center time series data use case: events from sensors = metrics; same concepts as data center monitoring.
  • 51. Patterns
  • 52. Event Sourcing. Imagine each event as a change to an entry in a database. Change log (credit, debit events): 1: WillO : Deposit : 100.00; 2: BradA : Deposit : 50.00; 3: BradA : Withdraw : 30.00; 4: WillO : Withdraw : 20.00. Updates give the current account balances (Account Id, Balance): WillO 80.00; BradA 20.00. https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 53. Duality of Streams and Tables. Database: captures data at rest. Stream: captures data change. Replication through the Change Log (3 2 1): Master: append writes; Slave: apply writes in order. https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 54. Which Makes a Better System of Record? Which of these can be used to reconstruct the other? Change Log: 1: WillO : Deposit : 100.00; 2: BradA : Deposit : 50.00; 3: BradA : Withdraw : 30.00; 4: WillO : Withdraw : 20.00. Account balances (Account Id, Balance): WillO 80.00; BradA 20.00.
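The question on slide 54 can be answered with a few lines of code: replaying the change log rebuilds the balance table, while the table alone cannot give back the individual events. A minimal sketch using the four events from the slide; the `ChangeLogReplay` and `Event` names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rebuilds the account-balance table by replaying the change log in order. */
final class ChangeLogReplay {

    /** One entry of the change log, e.g. "WillO : Deposit : 100.00". */
    static final class Event {
        final String accountId;
        final String type;
        final BigDecimal amount;

        Event(String accountId, String type, String amount) {
            this.accountId = accountId;
            this.type = type;
            this.amount = new BigDecimal(amount);
        }
    }

    /** Applies each event in log order and returns the resulting balances. */
    static Map<String, BigDecimal> replay(List<Event> changeLog) {
        Map<String, BigDecimal> balances = new LinkedHashMap<>();
        for (Event event : changeLog) {
            BigDecimal delta;
            if ("Deposit".equals(event.type)) {
                delta = event.amount;
            } else if ("Withdraw".equals(event.type)) {
                delta = event.amount.negate();
            } else {
                throw new IllegalArgumentException("Unknown event type: " + event.type);
            }
            balances.merge(event.accountId, delta, BigDecimal::add);
        }
        return balances;
    }

    public static void main(String[] args) {
        List<Event> changeLog = Arrays.asList(
                new Event("WillO", "Deposit", "100.00"),
                new Event("BradA", "Deposit", "50.00"),
                new Event("BradA", "Withdraw", "30.00"),
                new Event("WillO", "Withdraw", "20.00"));
        System.out.println(replay(changeLog)); // {WillO=80.00, BradA=20.00}
    }
}
```

Replaying only a prefix of the log gives the balances as of that point in time, which the table of current balances cannot provide.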
  • 55. Rewind: Reprocessing Events. The Consumer reprocesses from the oldest message to create a new view, index, or cache. (Diagram: Producers; MapR Cluster with messages 6 5 4 3 2 1.)
  • 56. Rewind: Reprocessing Events. The Consumer reprocesses up to the newest message to build the new view; reads are then served from the new view. (Diagram: Producers; MapR Cluster with messages 6 5 4 3 2 1.)
  • 57. Event Sourcing, Command Query Responsibility Separation: Turning the Database Upside Down. (Diagram: Events and Updates feeding Key-Val, Document, Graph, Wide Column, Time Series, Relational, and ??? stores.)
  • 58. What Else Do I Use My Stream For? Lineage: “how did BradA’s balance get so low?” Auditing: “who deposited/withdrew from BradA’s account?” History: to see the status of the accounts last year. Integrity: “can I trust this data hasn’t been tampered with?” Yup: Streams are immutable. 0: WillO : Deposit : 100.00; 1: BradA : Deposit : 50.00; 2: BradA : Withdraw : 30.00; 3: WillO : Withdraw : 20.00
  • 59. What Do I Need For This to Work? Infinitely persisted events; a way to query your persisted stream data; an integrated security model across the stream and databases.
  • 60. Fraud Detection. Point of Sale -> Data Center: is the transaction fraud? Lots of requests; need an answer within ~50-100 milliseconds. (Diagram: the Point of Sale sends location, time, card# to the Data Center, which answers fraud yes/no.)
  • 61. Traditional Solution. (Diagram: POS 1..n, Fraud detector, Last card use.) 1. Look up last card use. 2. Compute the card velocity: subtract last location, time from current location, time. 3. Update last card use.
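The card-velocity step on slide 61 can be sketched as distance divided by elapsed time. The slide does not say how locations are represented, so this sketch assumes latitude and longitude in degrees and uses the haversine great-circle distance; `CardVelocity`, `CardUse`, and the speed threshold are hypothetical.

```java
/** Card velocity between two uses of the same card; a sketch under the stated assumptions. */
final class CardVelocity {
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Where and when a card was used (assumed representation). */
    static final class CardUse {
        final double latitudeDeg;
        final double longitudeDeg;
        final long epochSeconds;

        CardUse(double latitudeDeg, double longitudeDeg, long epochSeconds) {
            this.latitudeDeg = latitudeDeg;
            this.longitudeDeg = longitudeDeg;
            this.epochSeconds = epochSeconds;
        }
    }

    /** Great-circle distance between two uses, by the haversine formula. */
    static double distanceKm(CardUse a, CardUse b) {
        double lat1 = Math.toRadians(a.latitudeDeg);
        double lat2 = Math.toRadians(b.latitudeDeg);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(b.longitudeDeg - a.longitudeDeg);
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(h), Math.sqrt(1 - h));
    }

    /** Speed the card would have had to travel between the last and the current use. */
    static double velocityKmPerHour(CardUse last, CardUse current) {
        double distance = distanceKm(last, current);
        double hours = (current.epochSeconds - last.epochSeconds) / 3600.0;
        if (hours <= 0) {
            // No elapsed time: any movement at all is physically impossible.
            return distance > 0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return distance / hours;
    }

    /** Flags the transaction when the implied speed exceeds a chosen limit. */
    static boolean looksFraudulent(CardUse last, CardUse current, double maxKmPerHour) {
        return velocityKmPerHour(last, current) > maxKmPerHour;
    }

    public static void main(String[] args) {
        CardUse last = new CardUse(0.0, 0.0, 0L);
        CardUse current = new CardUse(0.0, 1.0, 3600L); // one degree of longitude, one hour later
        System.out.println(velocityKmPerHour(last, current));       // roughly 111 km/h
        System.out.println(looksFraudulent(last, current, 1000.0)); // false
    }
}
```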
  • 62. What Happens Next? (Diagram: several POS 1..n and Fraud detector pairs share one Last card use store.) 1. Look up last card use. 2. Compute the card velocity. 3. Update last card use. Bottleneck!
  • 63. Service Isolation: Separate Read from Write. (Diagram labels: POS 1..n, Fraud detector, Last card use, Updater, card activity; Read last card use.)
  • 64. Separate Read Model from the Write Model: Command Query Responsibility Separation. (Diagram labels: POS 1..n, Fraud detector, Last card use, Updater, card activity; Read last card use, Event, Write last card use.)
  • 65. Event Sourcing: New Uses of Data. Processing Same Message for Multiple Views. (Diagram labels: POS 1..n, Fraud detector, card activity, Updater, Last card use, Other, Card location history.)
  • 66. Scaling Through Isolation allows Multiple Consumers. Multiple fraud detectors can use the same message queue. • De-coupling and isolation are key • Propagate events, not table updates. (Diagram: two sets of POS 1..n, Fraud detector, Updater, and Last card use share one card activity stream.)
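Slides 63 to 66 separate the write path (card activity events) from the read path (views such as last card use). The following in-memory sketch shows the idea with two independent updaters reading the same event list; it is not MapR or Kafka code, and `CardActivityViews`, `CardEvent`, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One append-only event list, two independently maintained read views. */
final class CardActivityViews {

    static final class CardEvent {
        final String cardId;
        final String location;
        final long epochSeconds;

        CardEvent(String cardId, String location, long epochSeconds) {
            this.cardId = cardId;
            this.location = location;
            this.epochSeconds = epochSeconds;
        }
    }

    // Write model: stands in for the card activity topic.
    private final List<CardEvent> cardActivity = new ArrayList<>();

    // Read models: each updater keeps its own cursor into the same events.
    private final Map<String, CardEvent> lastCardUse = new HashMap<>();
    private final Map<String, List<String>> locationHistory = new HashMap<>();
    private int lastUseCursor = 0;
    private int historyCursor = 0;

    /** Write side: the fraud detector only propagates an event, never a table update. */
    void publish(CardEvent event) {
        cardActivity.add(event);
    }

    /** Updater for the last-card-use view. */
    void runLastUseUpdater() {
        for (; lastUseCursor < cardActivity.size(); lastUseCursor++) {
            CardEvent event = cardActivity.get(lastUseCursor);
            lastCardUse.put(event.cardId, event);
        }
    }

    /** A second consumer builds a different view from the same events. */
    void runHistoryUpdater() {
        for (; historyCursor < cardActivity.size(); historyCursor++) {
            CardEvent event = cardActivity.get(historyCursor);
            locationHistory.computeIfAbsent(event.cardId, id -> new ArrayList<>()).add(event.location);
        }
    }

    /** Read side: queries touch only the views. */
    CardEvent readLastUse(String cardId) {
        return lastCardUse.get(cardId);
    }

    List<String> readHistory(String cardId) {
        return locationHistory.getOrDefault(cardId, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        CardActivityViews views = new CardActivityViews();
        views.publish(new CardEvent("card-1", "Houston", 100L));
        views.publish(new CardEvent("card-1", "Tokyo", 200L));
        views.runLastUseUpdater();
        views.runHistoryUpdater();
        System.out.println(views.readLastUse("card-1").location); // Tokyo
        System.out.println(views.readHistory("card-1"));          // [Houston, Tokyo]
    }
}
```

Adding the location-history view required no change to the publisher, which is the point of propagating events instead of table updates.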
  • 67. Decoupled Architecture. More than one component can make use of the same stream of messages for a variety of uses. (Diagram labels: Producers, Activity Handler, Historical, Interesting Data, Real-time Analysis, Results Dashboard, Anomaly Detection.)
  • 68. Lessons. De-coupling and isolation are key. Propagate events, not table updates.
  • 69. Building Enterprise Software vs Internet Companies. Enterprise Software: complexity of domain => business logic, business rules (Banking, Healthcare, Telecom); compliance => security. Internet Companies: volume of data => complex data infrastructure; large scale availability, recovery. Reference: Martin Kleppmann
  • 70. Building Enterprise Software vs Internet Companies. Enterprise Software: Event Sourcing. Internet Companies: Stream Processing. Reference: Martin Kleppmann
  • 71. Real World Solution
  • 72. Credit Card Fraud Model Building
  • 73. Fraud Stream Processing Architecture. (Diagram stages: Source, Data Ingest, Stream Processing, NoSQL Storage, Serve; Streams with Topic: A, Topic: B, Topic: C; MapR-FS; MapR-DB.)
  • 74. Fraud Processing. (Diagram labels: Streams Messaging with raw, enriched, and alerts topics; Stream Processing: derive features, Model, process; Batch Processing: build model, update model; MapR-FS; MapR-DB.)
  • 75. Fraud Event Processing. (Diagram labels: Streams Messaging with Raw, Enriched, and Fraud topics; Stream Processing; NoSQL Storage; MapR-FS; MapR-DB.) 1. Parse raw event 2. Read card holder profile from MapR-DB 3. Derive features 4. Get prediction from model with features 5. Publish not fraud to enriched topic 6. Publish fraud to fraud topic
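The six steps on slide 75 can be sketched end to end with plain Java collections standing in for the real components. Everything concrete here is an assumption made for illustration: the `cardId,amount` event format, the profile (an average spend per card in a `Map` instead of MapR-DB), the single derived feature, the 0.5 decision threshold, and lists in place of the enriched and fraud topics.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** In-memory walk through the six processing steps; all components are stand-ins. */
final class FraudEventProcessor {
    private final Map<String, Double> averageSpendByCard;  // stand-in for the card holder profile table
    private final ToDoubleFunction<double[]> model;        // features -> fraud probability
    final List<String> enrichedTopic = new ArrayList<>();  // stand-in for the enriched topic
    final List<String> fraudTopic = new ArrayList<>();     // stand-in for the fraud topic

    FraudEventProcessor(Map<String, Double> averageSpendByCard, ToDoubleFunction<double[]> model) {
        this.averageSpendByCard = averageSpendByCard;
        this.model = model;
    }

    void process(String rawEvent) {
        // 1. Parse raw event (assumed format: "cardId,amount").
        String[] fields = rawEvent.split(",");
        String cardId = fields[0].trim();
        double amount = Double.parseDouble(fields[1].trim());

        // 2. Read the card holder profile.
        double averageSpend = averageSpendByCard.getOrDefault(cardId, amount);

        // 3. Derive features: the amount and its ratio to the card's average spend.
        double ratio = averageSpend == 0 ? 1.0 : amount / averageSpend;
        double[] features = {amount, ratio};

        // 4. Get a prediction from the model with the features.
        double fraudProbability = model.applyAsDouble(features);

        // 5. and 6. Publish to the enriched topic or to the fraud topic.
        String enriched = cardId + "," + amount + "," + ratio + "," + fraudProbability;
        if (fraudProbability >= 0.5) {
            fraudTopic.add(enriched);
        } else {
            enrichedTopic.add(enriched);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> profiles = new HashMap<>();
        profiles.put("card-1", 50.0);
        // Toy model: suspicious when the amount is more than three times the average.
        FraudEventProcessor processor =
                new FraudEventProcessor(profiles, f -> f[1] > 3.0 ? 0.9 : 0.1);
        processor.process("card-1,40.0");
        processor.process("card-1,500.0");
        System.out.println("enriched: " + processor.enrichedTopic);
        System.out.println("fraud: " + processor.fraudTopic);
    }
}
```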
  • 76. Fraud Processing: Same Message for Different Views. (Diagram: Topics Raw Trans, Enriched, and Fraud Alert, each with Partition1, Partition2, and Partition3; Consumers of each partition write to MapR-FS and MapR-DB.)
  • 77. Real World Solution
  • 78. Transforming the Health Care Ecosystem. Electronic Medical Records; JSON DB (MapR-DB); Graph DB (Titan on MapR-DB); Search Engine (Elastic-Search). “The Stream is the System of Record” (Brad Anderson, VP Big Data Informatics)
  • 79. Liaison ALLOY™ Platform. Data Integration: ingest, transform, syndicate. Data Management: master, deduplicate, harmonize, relate, merge, tokenize, store / persist, analyze, summarize, report, distill, recommend, explore, query, sandbox, batch transform, learn, traverse.
  • 80. Use Case: Streaming System of Record for Healthcare. Objective: build a flexible, secure healthcare exchange. Challenges: many different data models; security and privacy issues; HIPAA compliance. (Diagram labels: Records, Analysis, Applications.)
  • 81. ALLOY Health: Exchange. State HIE; Clinical Data Viewer; Clinical Data; Financial Data; Provider Organizations. Analytics queries like: What are the outcomes in the entire state on diabetes? Are there doctors that are doing this better than others?
  • 82. OrdersAnywhere: 2000+ Practices, 200+ Labs, 30,000+ Clinicians. PORTAL (no EHR); EHR with HL7 ONLY; EHR with WORKFLOW INTEGRATION; RADIOLOGY; LAB
  • 83. This is a PAIN! COMPLIANCE; SECURITY CONTROLS; COMPLIANCE FEATURES; PRIVACY; PCI DSS 3.0; 21 CFR Part 11; SSAE16 / SOC2; HIPAA/HITECH
  • 84. WHY NOW? http://bit.ly/29aBatK
  • 85. WHY NOW? 2014 FQ4 profit: $ -440 M. Total Cost Estimate: $ -12 B
  • 86. Why Now? The Relational database is not the only tool: Context and Relationships. Patient 1234: patient_id 1234, Name Jon Smith, Age 50. Patient 999: patient_id 999, Name Jonathan Smith, DOB Jun 1965. Provider 86: provider_id 86, Name Dr. Nora Paige, Specialty Diabetes. Prescription 9876: rx_id 9876, Name Sitagliptin, Dosage 325mg. Relationships: Visited, Prescribed, WasPrescribed.
  • 87. WHY NOW? Mind the Gap
  • 88. Streaming System of Record for Healthcare. (Diagram: Records and Applications go through an API to the Stream Topic (6 5 4 3 2 1), the Streaming System of Record; Micro Services maintain the Materialized Views: Search, Graph DB, JSON, HBase.)
  • 89. The Promised Land. (Diagram: Apps use a log API in front of an Immutable Log of Raw Data, the Document Log (MapR-FS); a pre-processor, CEP, and workflows build materialized views in Key/Value (MapR-DB), Search Engine, Graph (ArangoDB), and Time Series (OpenTSDB) stores; micro services sit between the views and the Apps; Compliance Auditor.)
  • 90. The Promised Land: Compliance Auditor smiley faces. • Data Lineage • Audit Logging • Wire-level encryption • At Rest encryption. Replication: • Disaster Recovery • EU: data can’t leave. Non-Stream / Non-“Big Data”: • Software Development Lifecycle • System Hardening • Separation of Concerns (Dev vs Ops) • Patch Management
  • 91. Solution. Design/architecture solved some: • Streams • Data Lineage/System of Record • Kappa Architecture (Kreps/Kleppmann). MapR solved others: • Unified Security • Replication DC to DC • Converge Kafka/HBase/Hadoop to one cluster • Multi-tenancy (lots of topics, for lots of tenants)
  • 92. API
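  • 93. Sample Producer: All Together
      import java.util.Properties;
      import org.apache.kafka.clients.producer.KafkaProducer;
      import org.apache.kafka.clients.producer.ProducerRecord;

      public class SampleProducer {
        // Topic path in the form used on the slide: stream path, colon, topic name.
        public static String topic = "/streams/pump:warning";
        public static KafkaProducer<String, String> producer;

        public static void main(String[] args) {
          producer = setUpProducer();
          for (int i = 0; i < 3; i++) {
            String txt = "msg " + i;
            ProducerRecord<String, String> rec = new ProducerRecord<String, String>(topic, txt);
            producer.send(rec);
            System.out.println("Sent msg number " + i);
          }
          producer.close();
        }

        // Not shown on this slide; an assumed minimal configuration.
        // A plain Apache Kafka cluster would also need bootstrap.servers.
        static KafkaProducer<String, String> setUpProducer() {
          Properties props = new Properties();
          props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
          return new KafkaProducer<String, String>(props);
        }
      }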
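  • 94. Sample Consumer: All Together
      import java.util.Arrays;
      import java.util.Properties;
      import org.apache.kafka.clients.consumer.ConsumerRecord;
      import org.apache.kafka.clients.consumer.ConsumerRecords;
      import org.apache.kafka.clients.consumer.KafkaConsumer;

      public class MyConsumer {
        // Must be the same topic path the sample producer writes to.
        public static String topic = "/streams/pump:warning";
        public static KafkaConsumer<String, String> consumer;
        static long pollTimeOut = 1000; // milliseconds; assumed value, not given on the slide

        public static void main(String[] args) {
          configureConsumer(args);
          consumer.subscribe(Arrays.asList(topic));
          try {
            while (true) {
              ConsumerRecords<String, String> msg = consumer.poll(pollTimeOut);
              for (ConsumerRecord<String, String> record : msg) {
                System.out.println("read " + record.toString());
              }
            }
          } finally {
            consumer.close(); // runs when poll() throws; a plain statement after while (true) is unreachable
          }
        }

        // Not shown on this slide; an assumed minimal configuration with a hypothetical group.id.
        static void configureConsumer(String[] args) {
          Properties props = new Properties();
          props.put("group.id", "sample-group");
          props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
          props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
          consumer = new KafkaConsumer<String, String>(props);
        }
      }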
  • 95. Summary
  • 96. Can we get “Extreme”? Extreme: 1+ Trillion Events per day; Millions of Producers, Billions of events per second; Multiple Consumers, potentially for every event; Multiple Data Centers: plan for success, plan for drastic failure. Average Reality: Think that is crazy? Consider having 100 servers and performing monitoring and application logs: 100 metrics per server; 60 samples per minute; 50 metrics per request; 1,000 log entries per request (abnormally small, depends on level); 1 million requests per day. ~2 billion events per day, for one small(ish) use case.
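The "~2 billion events per day" figure on slide 96 follows from the listed numbers if each metric sample and each log entry counts as one event. A short worked calculation; the class and method names are hypothetical.

```java
/** Reproduces the back-of-the-envelope event count from the figures on the slide. */
final class EventVolumeEstimate {

    /** Server monitoring: every metric on every server is sampled all day long. */
    static long metricEventsPerDay(long servers, long metricsPerServer, long samplesPerMinute) {
        long minutesPerDay = 60L * 24L;
        return servers * metricsPerServer * samplesPerMinute * minutesPerDay;
    }

    /** Application traffic: each request emits its own metrics and log entries. */
    static long requestEventsPerDay(long requestsPerDay, long metricsPerRequest, long logEntriesPerRequest) {
        return requestsPerDay * (metricsPerRequest + logEntriesPerRequest);
    }

    public static void main(String[] args) {
        long monitoring = metricEventsPerDay(100, 100, 60);           // 864,000,000
        long requests = requestEventsPerDay(1_000_000L, 50, 1_000);   // 1,050,000,000
        System.out.println(monitoring + requests);                    // 1914000000, about 2 billion
    }
}
```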
  • 97. Building a Complete Data Architecture. MapR Converged Data Platform: MapR File System (MapR-FS), MapR Database (MapR-DB), MapR Streams. Sources/Apps; Stream Processing; Bulk Processing.
  • 100. Find my slides & other related materials to this talk here: bit.ly/jjug-aug2016 or search:
  • 101. MapR Blog • https://www.mapr.com/blog/
  • 102. community.mapr.com …helping you put data technology to work. Connect with fellow Apache Hadoop and Spark professionals ● Find answers ● Ask technical questions ● Join on-demand training course discussions ● Follow release announcements ● Share and vote on product ideas ● Find Meetup and event listings