© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Ryan Nienhuis, Sr. Technical Product Manager, Amazon Kinesis
Ram Kumar Rengaswamy, co-founder and CTO, Beeswax
November 29, 2016
BDM403
Beeswax
Building a Real Time Streaming Data Platform on AWS
What to Expect from the Session
• Introduction to Amazon Kinesis as a platform for real
time streaming data on AWS
• Key considerations for building an end to end streaming
platform using Amazon Kinesis Streams
• Introduction to Beeswax real time bidding platform built
on AWS using Amazon Kinesis, Amazon Redshift,
Amazon S3, and AWS Data Pipeline
• Deep dive into best practices for streaming data using
these services
What is streaming data?
An unbounded sequence of events that is
continuously captured and processed with
low latency.
Amazon Kinesis: Streaming Data Made Easy
Services that make it easy to capture, deliver, and process streams on AWS
Amazon Kinesis Streams
Amazon Kinesis Analytics
Amazon Kinesis Firehose
Amazon Kinesis Streams
• Easy administration
• Build real time applications with framework of choice
• Low cost
Amazon Kinesis Firehose
• Zero administration
• Direct-to-data store integration
• Seamless elasticity
Amazon Kinesis Analytics
• Apply SQL on streams
• Build real-time, stream processing applications
• Easy scalability
Key Concepts
for Amazon Kinesis Streams
Amazon Kinesis Streams Key Concepts
[Diagram: Data producers (data sources) send records through the AWS endpoint into an Amazon Kinesis stream (Shard 1, Shard 2, ... Shard N, spread across Availability Zones). Data consumers read the stream: App.1 [Aggregate & De-Duplicate], App.2 [Metric Extraction], App.3 [Sliding Window Analysis], App.4 [Machine Learning]. Downstream systems: Amazon S3, Amazon Redshift, AWS Lambda, Amazon Kinesis Analytics.]
An Amazon Kinesis stream
• Streams are made of
shards
• Each shard is a unit of
parallelism and throughput
• Serves as a durable
temporal buffer with data
stored 1 - 7 days
• Scale by splitting and
merging shards
Putting Data into an Amazon Kinesis stream
• Data producers call PutRecord(s) to send
data to an Amazon Kinesis stream
• Partition key determines which shard the
data is stored in
• Each shard supports 1 MB/sec in and 2 MB/sec out
• Each record gets a unique sequence
number
• Options for writing: AWS SDKs, Amazon
Kinesis Producer Library (KPL), Amazon
Kinesis agent, FluentD, Flume, and
more…
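A minimal sketch of how a partition key selects a shard. Kinesis hashes the partition key with MD5 into a 128-bit integer, and each shard owns a contiguous range of that hash space. The even split of the hash space and the helper names (`even_shard_ranges`, `shard_for_key`) are assumptions for illustration, not part of the talk.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def even_shard_ranges(num_shards):
    """Split the hash space into contiguous, equal-sized ranges (assumed layout)."""
    step = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * step
        end = HASH_SPACE - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose hash range contains the key's MD5 hash."""
    hash_key = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside every range")


ranges = even_shard_ranges(4)
counts = [0] * 4
for event_id in range(10_000):
    counts[shard_for_key(f"event-{event_id}", ranges)] += 1
print(counts)  # roughly even when keys are well distributed
```

A key that repeats for most records (for example a constant string) would land every record on one shard, which is why a well-distributed key matters for throughput.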
[Diagram: multiple producers write to a Kinesis stream made of Shard 1, Shard 2, Shard 3, Shard 4, ... Shard n]
Key considerations for data producers
• Connectivity - Lost connectivity and latency fluctuations
• Durability – Capture most or all records in event of
failure
• Efficiency – Producer’s primary job is often not
collection
• Distributed – Record ordering and retry strategies
Most customers choose to do some buffering and use a
random partition key; many strategies for failover
Getting Data from an Amazon Kinesis stream
• Consumer applications read each shard
continuously using GetRecords, determine
where to start using GetShardIterator
• Read model is per shard
• Increasing number of shards increases
scalability but reduces processing locality
• Options: Amazon Kinesis Client Library
(KCL) on Amazon EC2, Amazon Kinesis
Analytics, AWS Lambda, Spark Streaming
(Amazon EMR), Storm on EC2, and
more…
[Diagram: consumers read from a Kinesis stream made of Shard 1, Shard 2, Shard 3, Shard 4, ... Shard n]
Amazon Kinesis Client Library (KCL)
• Open source and available for Java, Ruby, Python, Node.js dev
• Deploy on your EC2 instances, scales easily with Elastic Beanstalk
• Two important components:
1. Record Processor – Processor unit that processes data from a
shard in Amazon Kinesis Streams
2. Worker – Processing unit that maps to each application instance
• Key features include load balancing, shard mapping, check pointing,
and CloudWatch monitoring
Key considerations for data consumer apps
• Scale - Have ready mechanisms for increasing
parallelism and adding compute
• Availability - Always be reading latest data and monitor
stream position
• Accuracy - Implement at least once processing logic,
exactly once at destination (if you need it)
• Speed - Scale test your logic to ensure linear scalability
• Replay - Have a retry strategy
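The accuracy point above (at-least-once processing, exactly once at the destination) can be sketched as an idempotent write keyed by sequence number. `IdempotentSink` and `process_batch` are hypothetical names, and the in-memory dictionary stands in for whatever keyed store the destination really uses.

```python
class IdempotentSink:
    """Destination that ignores records it has already applied."""

    def __init__(self):
        self.rows = {}

    def write(self, sequence_number, payload):
        # Keyed write: a redelivered record changes nothing.
        if sequence_number in self.rows:
            return False
        self.rows[sequence_number] = payload
        return True


def process_batch(records, sink):
    """Apply every record; safe to call again on the same batch after a failure."""
    applied = 0
    for sequence_number, payload in records:
        if sink.write(sequence_number, payload):
            applied += 1
    return applied


sink = IdempotentSink()
batch = [("seq-1", "bid"), ("seq-2", "win"), ("seq-3", "click")]
first = process_batch(batch, sink)
# Simulate a crash before checkpointing: the same batch is delivered again.
second = process_batch(batch, sink)
print(first, second, len(sink.rows))
```

The consumer may see a record more than once, but the destination ends up with one row per record.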
Key considerations for the end-to-end solution
• Use cases - Start with a simple one, progress to more
advanced
• Data variety – Must support different data formats and
schemas; centralized or decentralized management
• Integrations – Determine guarantees and where to
apply back pressure
• Fanning out or in – Determine whether to use multiple
consumers, multiple streams, or both
Beeswax
Powering the next generation
of real-time bidding
Who we are?
Startup based out of NYC, founded by ex-Googlers
We are hiring! https://www.beeswax.com/careers
We do RTB (Real-time bidding)
Publisher
Ad Exchange
Beeswax Bidder
Scale: O(M) QPS
Latency_99 : 20 ms
- Target campaigns
- Target user profiles
- Optimize for ROI
- Customize
< 200 ms
Step 1: Send ad request & userid
Step 2: Broadcast bid request
Step 3: Submit bid & ad markup
Step 4: Show ad to user
Auction
Building a bidder is very hard
Need scale to deliver campaigns
• To reach the desired audience, bidder needs to process at least 1M QPS
• Deployment has to be in multiple regions to guarantee reach
Performance
• The timeout from ad exchanges is 100 ms, including the round-trip time (RTT) over the internet
• 99th-percentile tail latency for processing a bid request is 20 ms
Complex ecosystem
• Manage integrations with ad exchanges, third-party data providers and vendors
• Requires a lot of domain expertise to optimize the bidder for maximizing
performance
A difficult trade-off
Build your own bidder: Risky investment of time and $ with no success guarantee
Use a DSP: Limited to no customization; platform lock-in
Our First Product: The Bidder-as-a-Service™
A full-stack solution
deployed for each customer
in a sandbox
Services
you control
Pre-built
ecosystem
and supply
relationships
Cookies,
Mobile ID’s, 3rd
Party
Data
Bidding
and Targeting
Engine
Campaign
Management UI/API
Reporting
UI/API
Custom
bidding
algos
Log-level
streaming
RESTful APIs
Direct
connections to
customer-hosted
services
Fully managed ad tech platform on
Outline of the talk
• System architecture
• Why we chose Amazon Kinesis
• Challenge 1: Collecting very high volume streams
• Challenge 2: Stream data transformation and fan out
• Challenge 3: Joining streams and aggregation
Beeswax System Architecture
Event Stream
Impression &
Click Data Producer
Bid Data
Producer
Streaming
Message Hub
Customer
Stream
HTTP
POST
S3 Bucket
Amazon
Redshift
Customer
API
Why we chose Amazon Kinesis?
Infrastructure requirements motivated by RTB use cases
• Ingestion at very large scale (> 1M QPS)
• Low latency delivery
• Reliable store of data
• Sequenced retrieval of events
Options available for consideration
1. Amazon Kinesis
2. Apache Kafka on EC2
Reasons to choose Amazon Kinesis
• Fully managed by AWS; a really important factor for small engineering teams
• Supports the scale necessary for RTB
• Pricing model provided opportunities to optimize cost
Problem 1: Collecting high volume streams
Listening Bidders
• Filter very high QPS bid stream using Boolean targeting expressions
• Sample filtered stream and deliver
Challenges
• Collection at very high scale (QPS > 1M)
• Minimize infrastructure cost
• Minimize delivery latency for stream output ( < 10s)
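A rough sketch of the filter-then-sample step described above. The tuple-based expression format, the field names (`country`, `device`), and the hash-bucket sampling are all assumptions made for illustration; the talk does not describe how Beeswax represents targeting expressions or how it samples.

```python
import zlib


def matches(bid, expression):
    """Evaluate a nested ("and" | "or" | "not" | "eq") targeting expression."""
    op = expression[0]
    if op == "eq":
        _, field, value = expression
        return bid.get(field) == value
    if op == "not":
        return not matches(bid, expression[1])
    if op == "and":
        return all(matches(bid, sub) for sub in expression[1:])
    if op == "or":
        return any(matches(bid, sub) for sub in expression[1:])
    raise ValueError(f"unknown operator: {op}")


def sampled(bid_id, rate):
    """Keep a stable fraction of bids by hashing the bid ID into 10,000 buckets."""
    bucket = zlib.crc32(bid_id.encode("utf-8")) % 10_000
    return bucket < int(rate * 10_000)


def filter_and_sample(bids, expression, rate):
    """Yield only bids that match the expression and fall inside the sample."""
    for bid in bids:
        if matches(bid, expression) and sampled(bid["id"], rate):
            yield bid


expression = ("and", ("eq", "country", "US"), ("not", ("eq", "device", "desktop")))
bids = [
    {"id": f"bid-{i}",
     "country": "US" if i % 2 == 0 else "CA",
     "device": "mobile" if i % 3 else "desktop"}
    for i in range(6_000)
]
kept = list(filter_and_sample(bids, expression, rate=0.10))
print(len(kept))
```

Hashing the ID rather than drawing a random number keeps the sample decision stable if the same bid is replayed.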
[Diagram: Bids: O(M) QPS → Filtering and Sampling → Filtered bid stream]
Solution 1: Optimized Data Producers
Cost vs Reliability Tradeoff
• Uploads are priced per PUT payload unit of 25 KB
• Buffer incoming records and pack them into single PUT payload
• Possible data loss if application crashes before buffer is flushed
• Be creative! We use ELB logs to replay requests to our collector
Consider overall system cost
• Compression can reduce data payload size but increase data producer
CPU usage
• Evaluate the compression vs cost tradeoff. For example, we chose Snappy
over gzip
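A back-of-the-envelope sketch of why packing records into one PUT payload lowers cost. It assumes each record is billed in 25 KB units rounded up, treats 25 KB as 25 × 1024 bytes, and uses a made-up workload of 500-byte events; none of these figures come from Beeswax.

```python
import math

PUT_PAYLOAD_UNIT_BYTES = 25 * 1024  # billing granularity named on the slide (assumed 1 KB = 1024 bytes)


def payload_units(record_bytes):
    """Each record is billed in 25 KB chunks, rounded up."""
    return math.ceil(record_bytes / PUT_PAYLOAD_UNIT_BYTES)


def units_per_second(events_per_second, event_bytes, events_per_record):
    """Billed units per second when events are packed into larger records."""
    records_per_second = math.ceil(events_per_second / events_per_record)
    return records_per_second * payload_units(event_bytes * events_per_record)


# Hypothetical workload: 1,000,000 events/sec of 500 bytes each.
unbatched = units_per_second(1_000_000, 500, events_per_record=1)
batched = units_per_second(1_000_000, 500, events_per_record=50)
print(unbatched, batched)  # 1000000 20000
```

In this hypothetical case, packing 50 events per record cuts billed units by a factor of 50, at the price of losing up to one buffer of events if the producer crashes before flushing.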
Solution 1: Optimized Data Producers
Throughput vs Latency
• Buffering increases throughput as more data is uploaded per API call
• Buffering also increases average latency; this is not a concern for very high QPS collectors
• Flush buffers periodically even if not full, to cap latency
Choose uniformly distributed partition keys
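The buffering advice above (pack records, flush when full, flush on a timer to cap latency) might look like the sketch below. `BufferedProducer`, the newline framing, and the injected `send` and `clock` callables are assumptions for illustration; a real producer would pass the payload to PutRecord(s) or use the KPL.

```python
import time


class BufferedProducer:
    """Packs small events into one payload; flushes on size or age."""

    def __init__(self, send, max_bytes=25 * 1024, max_age_seconds=1.0,
                 clock=time.monotonic):
        self._send = send              # callable taking one bytes payload
        self._max_bytes = max_bytes    # separator bytes are not counted
        self._max_age = max_age_seconds
        self._clock = clock
        self._buffer = []
        self._size = 0
        self._oldest = None

    def put(self, event):
        # Flush first if this event would push the payload over the size cap.
        if self._buffer and self._size + len(event) > self._max_bytes:
            self.flush()
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(event)
        self._size += len(event)
        self.flush_if_stale()

    def flush_if_stale(self):
        # Call this from a timer as well, so a quiet stream still drains.
        if self._buffer and self._clock() - self._oldest >= self._max_age:
            self.flush()

    def flush(self):
        if self._buffer:
            self._send(b"\n".join(self._buffer))
            self._buffer, self._size, self._oldest = [], 0, None


sent = []
now = [0.0]  # fake clock so the example is deterministic
producer = BufferedProducer(sent.append, max_bytes=10, max_age_seconds=5.0,
                            clock=lambda: now[0])
producer.put(b"aaaa")
producer.put(b"bbbb")
producer.put(b"cccc")      # would exceed 10 bytes, so the first two flush
now[0] = 6.0
producer.flush_if_stale()  # age cap reached, the remaining event flushes
print(sent)
```

Anything still in the buffer is lost if the process dies, which is the cost vs reliability tradeoff named on the previous slide.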
Problem 2: Data transformation and fan out
API driven, transparent and flexible platform
• Provide very detailed log level data to all our customers
• Support multiple delivery destinations and data formats
Challenges
• Config driven system to determine format, schema and destination of each record
• Maximize resource utilization by scaling elastically to stream volume
• Monitoring and operating the service
[Diagram: Event Stream → Transform and Fan Out]
Solution 2: API-driven Streaming Message Hub
• KCL application deployed to Auto Scaling group
• CloudWatch alarms on CPU utilization elastically resize fleet
• Adapters perform schema and data format transformations
• Emitters buffer data in-memory and flush periodically to destination
• Stream is checkpointed after records are flushed by emitters
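A simplified sketch of the adapter and emitter flow listed above, with the checkpoint taken only after every emitter has flushed. The class and function names, the routing table, and the record fields are hypothetical; the real hub is a KCL application whose interfaces are not shown in the talk.

```python
import json


class ListEmitter:
    """Stand-in for an S3/HTTP/Kinesis emitter: buffers, then flushes in bulk."""

    def __init__(self):
        self.buffer = []
        self.delivered = []

    def emit(self, item):
        self.buffer.append(item)

    def flush(self):
        self.delivered.extend(self.buffer)
        self.buffer.clear()


def bid_adapter(record):
    """Transform a bid record into a JSON string."""
    return json.dumps({"type": "bid", "event_id": record["event_id"]})


def win_adapter(record):
    """Transform a win record into a CSV-style row."""
    return f'{record["event_id"]},win'


def process_records(records, routes, emitters, checkpoint):
    """Route each record through its adapters, flush, then checkpoint."""
    last_sequence = None
    for record in records:
        for adapter, emitter_name in routes.get(record["type"], []):
            emitters[emitter_name].emit(adapter(record))
        last_sequence = record["seq"]
    for emitter in emitters.values():
        emitter.flush()
    # Checkpoint only after every emitter has flushed, so a crash replays
    # unflushed records instead of losing them (at-least-once delivery).
    if last_sequence is not None:
        checkpoint(last_sequence)


emitters = {"s3": ListEmitter(), "http": ListEmitter()}
routes = {"bid": [(bid_adapter, "http")], "win": [(win_adapter, "s3")]}
checkpoints = []
process_records(
    [{"seq": 1, "type": "bid", "event_id": "e1"},
     {"seq": 2, "type": "win", "event_id": "e1"}],
    routes, emitters, checkpoints.append,
)
print(emitters["http"].delivered, emitters["s3"].delivered, checkpoints)
```

Driving format and destination from a routing table is one way to get the config-driven behavior the slide asks for.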
[Diagram: Kinesis Record → Adapters (BidAdapters, WinAdapters, ClickAdapters, ...) → Emitters (S3Emitter, HTTPEmitter, KinesisEmitter, ...)]
Streaming message hub design tradeoffs
Single reader vs multiple readers
• Separate reader for every format & destination instead of a single reader
• Having separate readers improves fault tolerance
• However, CPU cost of parsing records is minimized with single reader
EC2 vs Lambda
• Use AWS Lambda instead of self-managed Auto Scaling
• Spot Instances deeply cut down the costs of self-managed solution
• Rich set of Amazon Kinesis stream metrics simplified monitoring and
management of service
Streaming message hub design tradeoffs
Amazon Kinesis Streams versus Amazon Kinesis Firehose
• Firehose does not support record level fan out or arbitrary data
transformations
• If Firehose gained these capabilities, it would be preferred over self-managed Auto
Scaling in EC2
Operating streaming message hub
Scale: ~300 shards, 250 MB/sec
Use CloudWatch metrics published by Amazon Kinesis Streams
Amazon Kinesis capacity alert
• Alert upon approaching 80% capacity
• Manually reshard Amazon Kinesis using KinesisScalingUtils (or new scaling
API)
Reader falling behind alert
• Alert if the average iterator age is greater than 20 sec.
• Ensure reader application is up, examine its custom metrics and triage
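The two alerts above reduce to simple threshold checks. The sketch below assumes the 1 MB/sec per-shard ingest limit stated earlier and iterator-age samples in milliseconds, as in the CloudWatch metric `GetRecords.IteratorAgeMilliseconds`. The function names and sample values are hypothetical, and in practice these checks would be CloudWatch alarms, not application code.

```python
SHARD_WRITE_LIMIT_MB_PER_SEC = 1.0  # per-shard ingest limit stated earlier


def capacity_alert(incoming_mb_per_sec, shard_count, threshold=0.80):
    """True when ingest reaches the given fraction of total write capacity."""
    capacity = shard_count * SHARD_WRITE_LIMIT_MB_PER_SEC
    return incoming_mb_per_sec / capacity >= threshold


def reader_lag_alert(iterator_age_ms_samples, limit_seconds=20.0):
    """True when the average iterator age over the window exceeds the limit."""
    average_ms = sum(iterator_age_ms_samples) / len(iterator_age_ms_samples)
    return average_ms / 1000.0 > limit_seconds


print(capacity_alert(70, shard_count=100))        # False: 70% of capacity
print(capacity_alert(85, shard_count=100))        # True: 85% of capacity
print(reader_lag_alert([5_000, 12_000, 9_000]))   # False: average under 9 sec
print(reader_lag_alert([30_000, 25_000]))         # True: average 27.5 sec
```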
Management overhead - We have roughly 2 “incidents” per month
Problem 3: Joining and aggregation
High level value added services
• Joined data directly feeds into model building pipelines for clicks, etc.
• Reporting API, powered by ETL pipeline, provides aggregated metrics.
Challenges
• Supporting exactly once semantics, i.e., eliminate all duplicates
• Minimize end-to-end latency from capture to joining & aggregation
• Be robust to delays between arrival times of correlated events
[Diagram: Bids, Impressions, Clicks, Conversions → Joining and Aggregation]
Solution 3: Stream joins using Amazon Redshift
• Message hub emits separate log files into S3 for each event type
• AWS Data Pipeline periodically loads the log files into Amazon Redshift
• Amazon Redshift tables of different event types are joined via primary
key
• FastPath: Joined events in 15min but can miss delayed events
• SlowPath: Fully joined events after 24 hours
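To illustrate the fast path vs slow path difference, the sketch below joins impressions to clicks by event ID and drops duplicate rows. Plain Python stands in for the Amazon Redshift SQL join, and the field names and sample events are hypothetical.

```python
def join_events(impressions, clicks):
    """Left-join impressions to clicks by event ID, dropping duplicate rows."""
    clicks_by_id = {}
    for click in clicks:
        clicks_by_id.setdefault(click["event_id"], click)  # keep first copy
    joined = {}
    for impression in impressions:
        event_id = impression["event_id"]
        if event_id in joined:
            continue  # duplicate delivery from at-least-once processing
        joined[event_id] = {
            "event_id": event_id,
            "price": impression["price"],
            "clicked": event_id in clicks_by_id,
        }
    return list(joined.values())


impressions = [
    {"event_id": "e1", "price": 1.2},
    {"event_id": "e2", "price": 0.8},
    {"event_id": "e1", "price": 1.2},  # duplicate
]
clicks_loaded_so_far = []                   # fast path: click for e2 not loaded yet
clicks_after_a_day = [{"event_id": "e2"}]   # slow path: delayed click has arrived

fast = join_events(impressions, clicks_loaded_so_far)
slow = join_events(impressions, clicks_after_a_day)
print(fast)
print(slow)
```

The fast path reports e2 as not clicked because the click had not been loaded yet; rerunning the same join later fixes the row.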
[Diagram: Streaming Message Hub → S3 Buckets → Data Pipeline → Amazon Redshift]
Stream join design trade offs
Joins are not truly streaming in current design
• Batch size of 15 min is dictated by the minimum scheduling interval of AWS Data Pipeline
• Lambda can be used instead of AWS Data Pipeline to lower schedule
intervals
• Data loaded into Amazon Redshift cannot be easily fed into Amazon Kinesis
streams
• However, it scales well, is fully AWS managed, and supports many of our
use cases
What are the alternatives?
• Spark streaming via EMR
• Amazon Kinesis Analytics
Early thoughts on comparing the alternatives
• Amazon Kinesis Analytics is fully managed; Spark Streaming is not
• Amazon Kinesis Analytics has usage-based pricing; Spark requires careful
capacity planning
• Need to evaluate Amazon Kinesis Analytics on scale and support for
arbitrary data formats
Summary
Building real time bidding (RTB) applications is very challenging
Beeswax provides a managed platform to build RTB apps on AWS
Beeswax uses Amazon Kinesis as infrastructure for streaming data
Beeswax platform solves key streaming data challenges
• Supports event collection at very large scale
• API driven platform for data transformation and fan out
• Supports joining of streams and aggregation of metrics
Tradeoffs are unique to application; Beeswax is optimized for RTB
Thank you!
Remember to complete
your evaluations!
Reference
We have many AWS Big Data Blog posts which cover more examples. Full list here. Some
good ones:
1. Amazon Kinesis Streams
1. Implement Efficient and Reliable Producers with the Amazon Kinesis Producer Library
2. Presto and Amazon Kinesis
3. Querying Amazon Kinesis Streams Directly with SQL and Spark Streaming
4. Optimize Spark-Streaming to Efficiently Process Amazon Kinesis Streams
2. Amazon Kinesis Firehose
1. Persist Streaming Data to Amazon S3 using Amazon Kinesis Firehose and AWS Lambda
2. Building a Near Real-Time Discovery Platform with AWS
3. Amazon Kinesis Analytics
1. Writing SQL on Streaming Data With Amazon Kinesis Analytics Part 1 | Part 2
2. Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics
• Technical documentation
• Amazon Kinesis Agent
• Amazon Kinesis Streams and Spark Streaming
• Amazon Kinesis Producer Library Best Practice
• Amazon Kinesis Firehose and AWS Lambda
• Building Near Real-Time Discovery Platform with Amazon Kinesis
• Public case studies
• Glu mobile – Real-Time Analytics
• Hearst Publishing – Clickstream Analytics
• How Sonos Leverages Amazon Kinesis
• Nordstrom Online Stylist
Reference
Detailed system architecture
Event Stream
PartitionKey = F(EventId)
Config Store
Event Producer
- Reliable
- Record level retries
Bid Producer
- High throughput
- Stream compression
- Batch records w/ flush timeout
Stream Msg Hub
- KCL Application
- Autoscales
- At-least once processing
- Record format transforms
- Route to custom sinks
- Stream window analytics
Customer Log Stream
Partition key = EventId
Customer Http Post
Protobuf/Json payload
S3 Storage
- CSV data
- Customer bucket
Amazon
Redshift
- Join by EventId
- Exactly once
- Fast path 30m
Data Pipeline
Streaming data in real-time bidding application
Filtering
and
Sampling
Joining
and
Aggregation
Analytics
and
Reporting
Data Sources
Bids: O(1M) TPS
Wins: O(10K) TPS
Clicks: O(1K) TPS
Consumers Formats

Mais conteúdo relacionado

Mais procurados

Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
 
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon KinesisDay 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon KinesisAmazon Web Services
 
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)Amazon Web Services
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)Amazon Web Services
 
Real-Time Processing Using AWS Lambda
Real-Time Processing Using AWS LambdaReal-Time Processing Using AWS Lambda
Real-Time Processing Using AWS LambdaAmazon Web Services
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivAmazon Web Services
 
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...Amazon Web Services
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAmazon Web Services
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...Amazon Web Services
 
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)Amazon Web Services
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...Amazon Web Services
 
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...Amazon Web Services
 
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...Amazon Web Services
 
Rackspace Best Practices for DevOps on AWS
Rackspace Best Practices for DevOps on AWSRackspace Best Practices for DevOps on AWS
Rackspace Best Practices for DevOps on AWSAmazon Web Services
 
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...Amazon Web Services
 
Getting started with Amazon Kinesis
Getting started with Amazon KinesisGetting started with Amazon Kinesis
Getting started with Amazon KinesisAmazon Web Services
 
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)Amazon Web Services
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Amazon Web Services
 
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...Amazon Web Services
 

Mais procurados (20)

Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon KinesisDay 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis
Day 5 - Real-time Data Processing/Internet of Things (IoT) with Amazon Kinesis
 
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)
AWS re:Invent 2016: Born in the Cloud; Built Like a Startup (ARC205)
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
 
Real-Time Processing Using AWS Lambda
Real-Time Processing Using AWS LambdaReal-Time Processing Using AWS Lambda
Real-Time Processing Using AWS Lambda
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
 
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...
AWS re:Invent 2016: 5 Security Automation Improvements You Can Make by Using ...
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
 
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
AWS re:Invent 2016: JustGiving: Serverless Data Pipelines, Event-Driven ETL, ...
 
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
 
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
AWS re:Invent 2016: Design Patterns for High Availability: Lessons from Amazo...
 
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
AWS May Webinar Series - Streaming Data Processing with Amazon Kinesis and AW...
 
Rackspace Best Practices for DevOps on AWS
Rackspace Best Practices for DevOps on AWSRackspace Best Practices for DevOps on AWS
Rackspace Best Practices for DevOps on AWS
 
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...
AWS re:Invent 2016: Case Study: How Startups like Mapbox, Ring, Hudl, and Oth...
 
Getting started with Amazon Kinesis
Getting started with Amazon KinesisGetting started with Amazon Kinesis
Getting started with Amazon Kinesis
 
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
AWS re:Invent 2016: Real-time Data Processing Using AWS Lambda (SVR301)
 
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...
AWS re:Invent 2016: Relational and NoSQL Databases on AWS: NBC, MarkLogic, an...
 

Destaque

AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...
AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...
AWS re:Invent 2016: Real-Time Data Exploration and Analytics with Amazon Elas...Amazon Web Services
 
AWS re:Invent 2016: Understanding IoT Data: How to Leverage Amazon Kinesis in...
AWS re:Invent 2016: Understanding IoT Data: How to Leverage Amazon Kinesis in...AWS re:Invent 2016: Understanding IoT Data: How to Leverage Amazon Kinesis in...
AWS re:Invent 2016: Understanding IoT Data: How to Leverage Amazon Kinesis in...Amazon Web Services
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...Amazon Web Services
 
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...Amazon Web Services
 
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...
AWS re:Invent 2016: Serverless Architectural Patterns and Best Practices (ARC...Amazon Web Services
 
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...Amazon Web Services
 
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM...Amazon Web Services
 
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...Amazon Web Services
 
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.
NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.Amazon Web Services
 
AWS re:Invent 2016: Billions of Rows Transformed in Record Time Using Matilli...
AWS re:Invent 2016: Billions of Rows Transformed in Record Time Using Matilli...AWS re:Invent 2016: Billions of Rows Transformed in Record Time Using Matilli...
AWS re:Invent 2016: Billions of Rows Transformed in Record Time Using Matilli...Amazon Web Services
 
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

AWS re:Invent 2016: Beeswax: Building a Real-Time Streaming Data Platform on AWS (BDM403)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Ryan Nienhuis, Sr. Technical Product Manager, Amazon Kinesis Ram Kumar Rengaswamy, co-founder and CTO, Beeswax November 29, 2016 BDM403 Beeswax Building a Real Time Streaming Data Platform on AWS
  • 2. What to Expect from the Session • Introduction to Amazon Kinesis as a platform for real time streaming data on AWS • Key considerations for building an end to end streaming platform using Amazon Kinesis Streams • Introduction to Beeswax real time bidding platform built on AWS using Amazon Kinesis, Amazon Redshift, Amazon S3, and AWS Data Pipeline • Deep dive into best practices for streaming data using these services
  • 3. An unbounded sequence of events that is continuously captured and processed with low latency. What is streaming data?
  • 4. Amazon Kinesis: Streaming Data Made Easy Services make it easy to capture, deliver, process streams on AWS Amazon Kinesis Streams Amazon Kinesis Analytics Amazon Kinesis Firehose
  • 5. Amazon Kinesis Streams • Easy administration • Build real time applications with framework of choice • Low cost
  • 6. Amazon Kinesis Firehose • Zero administration • Direct-to-data store integration • Seamless elasticity
  • 7. Amazon Kinesis Analytics • Apply SQL on streams • Build real-time, stream processing applications • Easy scalability
  • 8. Key Concepts for Amazon Kinesis Streams
  • 9. Amazon Kinesis Streams Key Concepts (Diagram labels: Data Producers: Data Sources, AWS Endpoint; Amazon Kinesis stream: Shard 1, Shard 2, Shard N, spread across three Availability Zones; Data Consumers: App.1 [Aggregate & De-Duplicate], App.2 [Metric Extraction], App.3 [Sliding Window Analysis], App.4 [Machine Learning]; Downstream systems: Amazon S3, Amazon Redshift, AWS Lambda, Amazon Kinesis Analytics)
  • 10. An Amazon Kinesis stream • Streams are made of shards • Each shard is a unit of parallelism and throughput • Serves as a durable temporal buffer with data stored 1 - 7 days • Scale by splitting and merging shards
  • 11. Putting Data into an Amazon Kinesis stream • Data producers call PutRecord(s) to send data to an Amazon Kinesis stream • Partition key determines which shard the data is stored in • Each shard supports 1 MB/sec in and 2 MB/sec out • Each record gets a unique sequence number • Options for writing: AWS SDKs, Amazon Kinesis Producer Library (KPL), Amazon Kinesis agent, FluentD, Flume, and more… (Diagram: multiple producers writing to a Kinesis stream with Shard 1 through Shard n)
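As a rough illustration of how a partition key picks a shard: Kinesis Streams hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below assumes the shards split the hash space evenly, which is only true for a stream that has not been resharded unevenly; `hash_key` and `shard_for` are hypothetical helper names, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit hash key


def hash_key(partition_key: str) -> int:
    """Return the 128-bit integer hash key for a partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Pick a shard index, ASSUMING shards split the hash space evenly."""
    width = HASH_SPACE // shard_count
    # Clamp so keys in the remainder at the top of the range land in the last shard.
    return min(hash_key(partition_key) // width, shard_count - 1)


if __name__ == "__main__":
    # Keys with no shared structure spread roughly evenly across shards.
    counts = [0] * 4
    for i in range(10_000):
        counts[shard_for(f"event-{i}", 4)] += 1
    print(counts)
```

Because the mapping depends only on the key, all records with the same partition key land on the same shard, which preserves their relative order but can create a hot shard if a few keys dominate.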
  • 12. Key considerations for data producers • Connectivity - Lost connectivity and latency fluctuations • Durability – Capture most or all records in event of failure • Efficiency – Producer’s primary job is often not collection • Distributed – Record ordering and retry strategies Most customers choose to do some buffering and use a random partition key; many strategies for failover
  • 13. Getting Data from an Amazon Kinesis stream • Consumer applications read each shard continuously using GetRecords, and determine where to start using GetShardIterator • Read model is per shard • Increasing the number of shards increases scalability but reduces processing locality • Options: Amazon Kinesis Client Library (KCL) on Amazon EC2, Amazon Kinesis Analytics, AWS Lambda, Spark Streaming (Amazon EMR), Storm on EC2, and more… (Diagram: a Kinesis stream with Shard 1 through Shard n read by multiple consumers)
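The per-shard read model can be sketched as a small polling loop. This is a minimal sketch, not how the KCL does it: it assumes `client` is a boto3 Kinesis client (or anything exposing the same `get_shard_iterator` and `get_records` calls), and `read_shard`, `handle`, `max_polls`, and `idle_sleep` are hypothetical names introduced for illustration. It does no checkpointing, retries, or throttling handling.

```python
import time


def read_shard(client, stream_name, shard_id, handle, max_polls=10, idle_sleep=1.0):
    """Poll one shard from its oldest record and pass each record to handle()."""
    # GetShardIterator decides where reading starts; TRIM_HORIZON is the oldest record.
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    for _ in range(max_polls):
        if iterator is None:
            # No next iterator: the shard was closed (split or merged) and is drained.
            break
        response = client.get_records(ShardIterator=iterator, Limit=1000)
        for record in response["Records"]:
            handle(record)
        iterator = response.get("NextShardIterator")
        if not response["Records"]:
            # Nothing new yet; back off before polling again.
            time.sleep(idle_sleep)
```

A real consumer would run one such loop per shard and persist the last processed sequence number, which is the bookkeeping the KCL takes over.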
  • 14. Amazon Kinesis Client Library (KCL) • Open source and available for Java, Ruby, Python, and Node.js developers • Deploy on your EC2 instances, scales easily with Elastic Beanstalk • Two important components: 1. Record Processor – Processing unit that processes data from a shard in Amazon Kinesis Streams 2. Worker – Processing unit that maps to each application instance • Key features include load balancing, shard mapping, checkpointing, and CloudWatch monitoring
  • 15. Key considerations for data consumer apps • Scale - Have ready mechanisms for increasing parallelism and adding compute • Availability - Always be reading the latest data and monitor stream position • Accuracy - Implement at-least-once processing logic, exactly-once at the destination (if you need it) • Speed - Scale test your logic to ensure linear scalability • Replay - Have a retry strategy
  • 16. Key considerations for the end-to-end solution • Use cases - Start with a simple one, progress to more advanced • Data variety – Must support different data formats and schemas; centralized or decentralized management • Integrations – Determine guarantees and where to apply back pressure • Fanning out or in – Determine whether to use multiple consumers, multiple streams, or both
  • 17. Beeswax Powering the next generation of real-time bidding
  • 18. Who are we? Startup based out of NYC, founded by ex-Googlers We are hiring! https://www.beeswax.com/careers
  • 19. We do RTB (Real-time bidding) Publisher Ad Exchange Beeswax Bidder Scale: O(M) QPS Latency_99 : 20 ms - Target campaigns - Target user profiles - Optimize for ROI - Customize < 200 ms Step 1: Send ad request & userid Step 2: Broadcast bid request Step 3: Submit bid & ad markup Step 4: Show ad to user Auction
  • 20. Building a bidder is very hard Need scale to deliver campaigns • To reach the desired audience, bidder needs to process at least 1M QPS • Deployment has to be in multiple regions to guarantee reach Performance • The timeout from ad exchanges is 100ms including the RTT over internet • 99%ile tail latency for processing a bid request is 20ms Complex ecosystem • Manage integrations with ad exchanges, third-party data providers and vendors • Requires a lot of domain expertise to optimize the bidder for maximizing performance
  • 21. A difficult trade-off • Build your own bidder: Risky investment of time and $ with no success guarantee • Use a DSP: Limited to no customization; platform lock in
  • 22. Our First Product: The Bidder-as-a-Service™ A full-stack solution deployed for each customer in a sandbox Services you control Pre-built ecosystem and supply relationships Cookies, Mobile ID’s, 3rd Party Data Bidding and Targeting Engine Campaign Management UI/API Reporting UI/API Custom bidding algos Log-level streaming RESTful APIs Direct connections to customer-hosted services Fully managed ad tech platform on
  • 23. Outline of the talk • System architecture • Why we chose Amazon Kinesis • Challenge 1: Collecting very high volume streams • Challenge 2: Stream data transformation and fan out • Challenge 3: Joining streams and aggregation
  • 24. Beeswax System Architecture Event Stream Impression & Click Data Producer Bid Data Producer Streaming Message Hub Customer Stream HTTP POST S3 Bucket Amazon Redshift Customer API
  • 25. Why we chose Amazon Kinesis? Infrastructure requirements motivated by RTB use cases Reason to choose Amazon Kinesis • Fully managed by AWS; Really important factor for small engineering teams • Support the scale necessary for RTB • Pricing model provided opportunities to optimize cost • Ingestion at very large scale (> 1M QPS) • Low latency delivery • Reliable store of data • Sequenced retrieval of events Options available for consideration 1. Amazon Kinesis 2. Apache Kafka on EC2
  • 26. Problem 1: Collecting high volume streams Listening Bidders • Filter very high QPS bid stream using Boolean targeting expressions • Sample filtered stream and deliver Challenges • Collection at very high scale (QPS > 1M) • Minimize infrastructure cost • Minimize delivery latency for stream output ( < 10s) Filtering and Sampling Bids: O(M) QPS Filtered bid stream
  • 27. Solution 1: Optimized Data Producers Cost vs Reliability Tradeoff • Uploads are priced per PUT payload unit of 25 KB • Buffer incoming records and pack them into a single PUT payload • Possible data loss if the application crashes before the buffer is flushed • Be creative! We use ELB logs to replay requests to our collector Consider overall system cost • Compression can reduce data payload size but increases data producer CPU usage • Evaluate the compression vs cost tradeoff. For example, we chose snappy over gzip
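To see why packing matters for cost, the arithmetic can be sketched as follows. This assumes each record is billed in 25 KB units rounded up, treats 25 KB as 25 × 1024 bytes, and uses a made-up 500-byte event size; the function names are hypothetical and real pricing should be checked against the current AWS price list.

```python
import math

PUT_PAYLOAD_UNIT_BYTES = 25 * 1024  # assumed size of one billing unit


def payload_units(record_bytes: int) -> int:
    """PUT payload units billed for one record of the given size (rounded up)."""
    return max(1, math.ceil(record_bytes / PUT_PAYLOAD_UNIT_BYTES))


def units_unbatched(event_bytes: int, events: int) -> int:
    """Units billed when every event is uploaded as its own record."""
    return events * payload_units(event_bytes)


def units_batched(event_bytes: int, events: int) -> int:
    """Units billed when events are packed into records of at most one unit."""
    per_record = max(1, PUT_PAYLOAD_UNIT_BYTES // event_bytes)
    full_records, leftover = divmod(events, per_record)
    units = full_records * payload_units(per_record * event_bytes)
    if leftover:
        units += payload_units(leftover * event_bytes)
    return units


if __name__ == "__main__":
    # Hypothetical workload: one million 500-byte events.
    print(units_unbatched(500, 1_000_000), units_batched(500, 1_000_000))
```

With these assumed numbers, 51 events fit into one unit, so batching cuts the billed units by roughly a factor of 51; compression would increase the number of events per unit further at the price of producer CPU.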
  • 28. Solution 1: Optimized Data Producers Throughput vs Latency • Buffering increases throughput as more data is uploaded per API call • Increases average latency; Not a concern for very high QPS collectors • Flush buffers periodically even if not full, to cap latency Choose uniformly distributed partition keys
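A buffer that flushes on size or on age, as described above, might look like the sketch below. It is an illustration under assumptions, not Beeswax's collector: `BufferedProducer`, `send`, `max_bytes`, and `max_age_s` are hypothetical names, `send` stands in for whatever actually uploads a batch (for example a PutRecord call), and the class is single-threaded.

```python
import time


class BufferedProducer:
    """Buffers events and flushes when the batch is full or too old."""

    def __init__(self, send, max_bytes=25 * 1024, max_age_s=1.0, clock=time.monotonic):
        self._send = send            # callable that uploads a list of events
        self._max_bytes = max_bytes  # size cap for one batch
        self._max_age_s = max_age_s  # latency cap for the oldest buffered event
        self._clock = clock
        self._buffer = []
        self._size = 0
        self._oldest = None

    def put(self, event: bytes) -> None:
        # Flush first if this event would push the batch over the size cap.
        if self._buffer and self._size + len(event) > self._max_bytes:
            self.flush()
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(event)
        self._size += len(event)
        self.tick()

    def tick(self) -> None:
        """Flush a non-empty buffer once its oldest event exceeds the age cap.

        Call this periodically (for example from a timer) so that a quiet
        stream still gets flushed.
        """
        if self._buffer and self._clock() - self._oldest >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer, self._size, self._oldest = [], 0, None
```

Anything still in the buffer is lost if the process crashes before `flush()` runs, which is the reliability side of the tradeoff and the reason a replay source such as load balancer logs is useful.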
  • 29. Problem 2: Data transformation and fan out API driven, transparent and flexible platform • Provide very detailed log level data to all our customers • Support multiple delivery destinations and data formats Challenges • Config driven system to determine format, schema and destination of each record • Maximize resource utilization by scaling elastically to stream volume • Monitoring and operating the service Transform and Fan Out Event Stream
  • 30. Solution 2: API-driven Streaming Message Hub • KCL application deployed to Auto Scaling group • CloudWatch alarms on CPU utilization elastically resize fleet • Adapters perform schema and data format transformations • Emitters buffer data in-memory and flush periodically to destination • Stream is checkpointed after records are flushed by emitters (Diagram: Kinesis Record; BidAdapters, WinAdapters, ClickAdapters, ...; S3Emitter, HTTPEmitter, KinesisEmitter, ...)
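The adapter and emitter pattern, with checkpointing only after a successful flush, can be sketched in a few lines. Everything here is hypothetical and simplified: `ListEmitter`, `to_json`, `to_csv`, `process_batch`, the `routes` config shape, and the record fields `type`, `seq`, and `data` are assumptions made for illustration, not the actual Beeswax code or the KCL interface.

```python
import json


class ListEmitter:
    """Hypothetical emitter: buffers in memory, flush() hands the batch to a sink."""

    def __init__(self, sink):
        self._sink = sink
        self._buffer = []

    def emit(self, payload):
        self._buffer.append(payload)

    def flush(self):
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()


def to_json(record):
    """Adapter: render the record payload as a JSON string."""
    return json.dumps(record["data"], sort_keys=True)


def to_csv(record):
    """Adapter: render the record payload as one CSV line, columns sorted by name."""
    return ",".join(str(record["data"][key]) for key in sorted(record["data"]))


def process_batch(records, routes, checkpoint):
    """Transform and fan out records, then checkpoint only after all flushes succeed.

    routes maps an event type to a list of (adapter, emitter) pairs.
    """
    used = {}
    last_seq = None
    for record in records:
        for adapter, emitter in routes.get(record["type"], []):
            emitter.emit(adapter(record))
            used[id(emitter)] = emitter
        last_seq = record["seq"]
    for emitter in used.values():
        emitter.flush()  # an exception here skips the checkpoint below
    if last_seq is not None:
        checkpoint(last_seq)
```

Because the checkpoint is written last, a failed flush means the same records are read again after a restart, which gives at-least-once delivery and leaves de-duplication to the destination.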
  • 31. Streaming message hub design tradeoffs Single reader vs multiple readers • Separate reader for every format & destination instead of a single reader • Having separate readers improves fault tolerance • However, CPU cost of parsing records is minimized with single reader EC2 vs Lambda • Use AWS Lambda instead of self-managed Auto Scaling • Spot Instances deeply cut down the costs of self-managed solution • Rich set of Amazon Kinesis stream metrics simplified monitoring and management of service
  • 32. Streaming message hub design tradeoffs Amazon Kinesis Streams versus Amazon Kinesis Firehose • Firehose does not support record level fan out or arbitrary data transformations • With above enhancements, it would be preferred over self-managed Auto Scaling in EC2
  • 33. Operating streaming message hub Scale: ~300 shards, 250 MB/sec Use CloudWatch metrics published by Amazon Kinesis Streams Amazon Kinesis capacity alert • Alert upon approaching 80% capacity • Manually reshard Amazon Kinesis using KinesisScalingUtils (or new scaling API) Reader falling behind alert • Alert if the average iterator age is greater than 20 sec. • Ensure reader application is up, examine its custom metrics and triage Management overhead - We have roughly 2 “incidents” per month
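The two alerts can be expressed as a small check. This sketch assumes the 1 MB/sec per-shard ingest limit quoted earlier in the deck and takes the 80% and 20 second thresholds from the slide; `stream_alerts` and its parameters are hypothetical, and in practice the inputs would come from CloudWatch metrics rather than function arguments.

```python
SHARD_INGEST_MB_PER_S = 1.0  # per-shard write limit quoted earlier in the deck


def stream_alerts(ingest_mb_per_s, shard_count, avg_iterator_age_s,
                  capacity_threshold=0.8, max_iterator_age_s=20.0):
    """Return a list of alert messages for one stream."""
    alerts = []
    utilization = ingest_mb_per_s / (shard_count * SHARD_INGEST_MB_PER_S)
    if utilization >= capacity_threshold:
        alerts.append(f"capacity at {utilization:.0%}: consider resharding")
    if avg_iterator_age_s > max_iterator_age_s:
        alerts.append(f"iterator age {avg_iterator_age_s:.0f}s: reader is falling behind")
    return alerts


if __name__ == "__main__":
    # Hypothetical readings, not figures from the talk.
    print(stream_alerts(ingest_mb_per_s=85, shard_count=100, avg_iterator_age_s=30))
```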
  • 34. Problem 3: Joining and aggregation High level value added services • Joined data directly feeds into model building pipelines for clicks, etc. • Reporting API, powered by ETL pipeline, provides aggregated metrics. Challenges • Supporting exactly once semantics, i.e., eliminate all duplicates • Minimize end-to-end latency from capture to joining & aggregation • Be robust to delays between arrival times of correlated events Bids Impressions Clicks, Conversions Joining and Aggregation
  • 35. Solution 3: Stream joins using Amazon Redshift • Message hub emits separate log files into S3 for each event type • AWS Data Pipeline periodically loads the log files into Amazon Redshift • Amazon Redshift tables of different event types are joined via primary key • FastPath: Joined events in 15 min, but can miss delayed events • SlowPath: Fully joined events after 24 hours (Diagram: Streaming Message Hub, S3 Buckets, Data Pipeline, Amazon Redshift)
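The join logic itself runs as SQL in Amazon Redshift; the Python sketch below only mimics its shape in memory to show why the fast path can miss delayed events. The field names (`event_id`, `price`), the helper names, and the sample data are all assumptions made for illustration.

```python
def dedupe(events):
    """Keep the first event seen per event_id (at-least-once input, exactly-once output)."""
    unique = {}
    for event in events:
        unique.setdefault(event["event_id"], event)
    return unique


def join_events(bids, impressions, clicks):
    """Join impressions to their bids by event_id and flag which ones were clicked."""
    bid_by_id = dedupe(bids)
    click_by_id = dedupe(clicks)
    rows = []
    for event_id in dedupe(impressions):
        if event_id in bid_by_id:
            rows.append({
                "event_id": event_id,
                "bid_price": bid_by_id[event_id]["price"],
                "clicked": event_id in click_by_id,
            })
    return rows


if __name__ == "__main__":
    bids = [{"event_id": "e1", "price": 1.5},
            {"event_id": "e1", "price": 1.5},   # duplicate delivery
            {"event_id": "e2", "price": 2.0}]
    impressions = [{"event_id": "e1"}, {"event_id": "e2"}]
    clicks_so_far = [{"event_id": "e1"}]                          # fast path view
    clicks_final = [{"event_id": "e1"}, {"event_id": "e2"}]       # slow path view
    print(join_events(bids, impressions, clicks_so_far))
    print(join_events(bids, impressions, clicks_final))
```

Running the join early marks `e2` as not clicked because its click has not arrived yet; rerunning it later over the complete data corrects the row, which is the fast path versus slow path distinction.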
  • 36. Stream join design trade offs Joins are not truly streaming in current design • Batch size of 15 min dictated by lowest interval for scheduling data pipeline • Lambda can be used instead of AWS Data Pipeline to lower schedule intervals • Data loaded into Amazon Redshift cannot be easily fed into Amazon Kinesis streams • However, it scales well, is fully AWS managed, and supports many of our use cases
  • 37. What are the alternatives? • Spark streaming via EMR • Amazon Kinesis Analytics Early thoughts on comparing the alternatives • Amazon Kinesis Analytics is fully managed; Spark Streaming is not • Amazon Kinesis Analytics has usage-based pricing; Spark requires careful capacity planning • Need to evaluate Amazon Kinesis Analytics on scale and support for arbitrary data formats
  • 38. Summary Building real time bidding (RTB) applications is very challenging Beeswax provides a managed platform to build RTB apps on AWS Beeswax uses Amazon Kinesis as infrastructure for streaming data Beeswax platform solves key streaming data challenges • Supports event collection at very large scale • API driven platform for data transformation and fan out • Supports joining of streams and aggregation of metrics Tradeoffs are unique to application; Beeswax is optimized for RTB
  • 41. Reference We have many AWS Big Data Blog posts which cover more examples. Full list here. Some good ones: 1. Amazon Kinesis Streams 1. Implement Efficient and Reliable Producers with the Amazon Kinesis Producer Library 2. Presto and Amazon Kinesis 3. Querying Amazon Kinesis Streams Directly with SQL and Spark Streaming 4. Optimize Spark-Streaming to Efficiently Process Amazon Kinesis Streams 2. Amazon Kinesis Firehose 1. Persist Streaming Data to Amazon S3 using Amazon Kinesis Firehose and AWS Lambda 2. Building a Near Real-Time Discovery Platform with AWS 3. Amazon Kinesis Analytics 1. Writing SQL on Streaming Data With Amazon Kinesis Analytics Part 1 | Part 2 2. Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics
  • 42. Reference • Technical documentation • Amazon Kinesis Agent • Amazon Kinesis Streams and Spark Streaming • Amazon Kinesis Producer Library Best Practice • Amazon Kinesis Firehose and AWS Lambda • Building Near Real-Time Discovery Platform with Amazon Kinesis • Public case studies • Glu mobile – Real-Time Analytics • Hearst Publishing – Clickstream Analytics • How Sonos Leverages Amazon Kinesis • Nordstrom Online Stylist
  • 43. Detailed system architecture Event Stream PartitionKey = F(EventId) Config Store Event Producer - Reliable - Record level retries Bid Producer - High throughput - Stream compression - Batch records w/ flush timeout Stream Msg Hub - KCL Application - Autoscales - At-least once processing - Record format transforms - Route to custom sinks - Stream window analytics Customer Log Stream Partition key = EventId Customer Http Post Protobuf/Json payload S3 Storage - CSV data - Customer bucket Amazon Redshift - Join by EventId - Exactly once - Fast path 30m Data Pipeline
  • 44. Streaming data in real-time bidding application Filtering and Sampling Joining and Aggregation Analytics and Reporting Data Sources Bids: O(1M) TPS Wins: O(10K) TPS Clicks: O(1K) TPS Consumers Formats