© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:INVENT
Analyzing Streaming Data in Real Time with Amazon Kinesis
Ryan Nienhuis, Senior Product Manager, Amazon Kinesis
November 2017
It’s All About the Pace
Batch Processing
• Hourly server logs
• Weekly or monthly bills
• Daily web-site clickstream
• Daily fraud reports
Stream Processing
• Real time metrics
• Real time spending alerts/caps
• Real time clickstream analysis
• Real time detection
Why? Data loses value over time
• Ingest data as it is generated
• Analyze data in real time to get insights immediately
• Deliver data in seconds instead of hours
Simple Pattern for Streaming Data
Data Producer (example: Mobile Client)
• Continuously creates data
• Continuously writes data to a stream
• Can be almost anything
Streaming Service (example: Amazon Kinesis)
• Durably stores data
• Provides temporary buffer that preps data
• Supports very high throughput
Data Consumer (example: Amazon Kinesis app)
• Continuously processes data
• Cleans, prepares, & aggregates
• Transforms data to information
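The producer, stream, consumer pattern can be sketched locally. This is a minimal illustration only: `InMemoryStream` is a hypothetical stand-in for the streaming service, and the click records are made-up sample data, not anything from the deck.

```python
from collections import deque


class InMemoryStream:
    """Hypothetical stand-in for a streaming service: an ordered buffer."""

    def __init__(self):
        self._records = deque()

    def put(self, record):
        self._records.append(record)

    def drain(self):
        # Hand records to the consumer in arrival order.
        while self._records:
            yield self._records.popleft()


def produce(stream, clicks):
    """Producer: continuously writes raw records to the stream."""
    for user, page in clicks:
        stream.put({"user": user, "page": page})


def consume(stream):
    """Consumer: cleans records, then aggregates page views per user."""
    views = {}
    for record in stream.drain():
        user = record["user"].strip().lower()
        if not user:
            continue  # drop records with no user
        views[user] = views.get(user, 0) + 1
    return views


stream = InMemoryStream()
produce(stream, [("Ana", "/home"), ("ana ", "/cart"), ("", "/home"), ("Bo", "/home")])
page_views = consume(stream)
print(page_views)
```

The consumer is where raw data becomes information: the cleaning step normalizes user names before counting.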
Amazon Kinesis
• Amazon Kinesis Data Streams: Build custom applications that process and analyze streaming data
• Amazon Kinesis Data Analytics: Easily process and analyze streaming data with standard SQL
• Amazon Kinesis Data Firehose: Easily load streaming data into AWS
Amazon Kinesis Data Streams
• Easy administration and low cost
• Build real time applications with framework of choice
• Secure, durable storage
Amazon Kinesis Data Analytics
• Powerful real time applications
• Easy to use, fully managed
• Automatic elasticity
Amazon Kinesis Data Firehose
• Zero administration and seamless elasticity
• Direct-to-data store integration
• Serverless, continuous data transformations
Destinations shown: Amazon S3, Amazon Redshift
Amazon Kinesis Data Analytics Applications
Easily write SQL code to process streaming data
Connect to streaming source
Continuously deliver SQL results
Common Use Cases
Three Common Scenarios
• Streaming Ingest-Transform-Load: Deliver data to analytics tools faster and cheaper
• Continuous Metric Generation: Compute analytics as the data is generated
• Actionable Insights: React to analytics based off of insights
Web Analytics and Leaderboards
• Ingest web app data: lightweight JS client code (Amazon Cognito) OR a web server on an Amazon EC2 instance -> Amazon Kinesis Data Streams
• Compute top 10 users: Amazon Kinesis Data Analytics
• Persist to feed live apps: AWS Lambda function -> Amazon DynamoDB table
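A local sketch of the "compute top 10 users" step, using a plain counter instead of Kinesis Data Analytics. The event shape (a `user` field) and the sample data are assumptions made for illustration.

```python
from collections import Counter


def top_users(events, n=10):
    """Return the n most active users as (user, event_count) pairs."""
    counts = Counter(event["user"] for event in events)
    return counts.most_common(n)


# Made-up sample events; a real feed would come from the stream.
sample_events = [{"user": u} for u in ["ana", "bo", "ana", "cy", "bo", "ana"]]
leaderboard = top_users(sample_events, n=2)
print(leaderboard)
```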
Monitoring IoT Devices
• Ingest sensor data: IoT sensors -> AWS IoT -> Amazon Kinesis Data Streams
• Compute avg temp every 10 sec: Amazon Kinesis Data Analytics
• Persist time series analytic to database: AWS Lambda function -> Amazon RDS MySQL DB instance
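The "average temperature every 10 seconds" computation amounts to a tumbling-window average. The sketch below shows the arithmetic in plain Python; the reading format `(epoch_seconds, temperature)` and the sample values are assumptions, and this is not how the managed service is implemented.

```python
from collections import defaultdict

WINDOW_SECONDS = 10


def average_by_window(readings, window=WINDOW_SECONDS):
    """Average temperature per fixed, non-overlapping window.

    readings: iterable of (epoch_seconds, temperature) pairs.
    Returns {window_start_seconds: average_temperature}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # window start -> [total, count]
    for ts, temp in readings:
        start = ts - (ts % window)  # floor the timestamp to its window start
        sums[start][0] += temp
        sums[start][1] += 1
    return {start: total / count for start, (total, count) in sorted(sums.items())}


readings = [(0, 20.0), (4, 22.0), (9, 24.0), (10, 30.0), (19, 32.0)]
averages = average_by_window(readings)
print(averages)
```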
Analyzing CloudTrail Event Logs
• Ingest and deliver raw log data: AWS CloudTrail -> Amazon CloudWatch events trigger -> Amazon Kinesis Data Firehose -> Amazon S3 bucket for raw data
• Compute operational metrics: Amazon Kinesis Data Analytics
• Deliver to real time dashboards and archival: Amazon Kinesis Data Firehose -> Amazon S3 bucket for processed data; AWS Lambda function -> Amazon DynamoDB table(s) -> Chart.JS dashboard
Deep Dive into
Analyzing CloudTrail Event Logs
Ingest and deliver CloudTrail events
• CloudTrail provides continuous account activity logging
• Events are delivered in real time or near real time to Kinesis Data Firehose or Kinesis Data Streams
• Each event includes a timestamp, IAM user, AWS service name, API call, response, and more
Diagram: AWS CloudTrail -> Amazon CloudWatch events trigger -> Kinesis Data Firehose -> Amazon S3 bucket for raw data
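To make the event contents concrete, here is a small sketch that pulls the fields listed above out of one record. The field names (`eventTime`, `eventSource`, `eventName`, `sourceIPAddress`, `userIdentity`, `errorCode`) follow the CloudTrail record format; the values are invented sample data, and `summarize` is a hypothetical helper.

```python
import json

# Invented sample record; only the field names follow the CloudTrail format.
SAMPLE_EVENT = """{
  "eventTime": "2017-11-28T17:03:12Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "DescribeInstances",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "userName": "example-user"},
  "errorCode": "Client.UnauthorizedOperation"
}"""


def summarize(raw):
    """Extract timestamp, user, service, API call, and failure flag."""
    event = json.loads(raw)
    return {
        "timestamp": event["eventTime"],
        "user": event.get("userIdentity", {}).get("userName"),
        "service": event["eventSource"],
        "api_call": event["eventName"],
        "source_ip": event.get("sourceIPAddress"),
        # errorCode is only present when the call failed.
        "failed": "errorCode" in event,
    }


summary = summarize(SAMPLE_EVENT)
print(summary)
```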
Stream Data to Amazon Kinesis
Options: automatic ingestion, easy setup, write your own
Sources shown: Amazon VPC Flow Logs, Elastic Load Balancing, Amazon RDS, Amazon CloudWatch Logs, AWS CloudTrail Event Logs, Amazon Pinpoint, Amazon API Gateway, AWS IoT events, AWS SDKs, Amazon DynamoDB, Amazon Kinesis Agent, Amazon Kinesis Producer Library
Callouts: as a proxy; for change data capture
Just a sample… many more ways to stream data to Amazon Kinesis
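For the "write your own" route with an AWS SDK, a record needs a stream name, a data blob, and a partition key. The sketch below only builds those arguments and makes no network call. `build_put_record_args`, the stream name, and the choice of `sourceIPAddress` as partition key are assumptions for illustration; `StreamName`, `Data`, and `PartitionKey` are the parameter names of the Kinesis PutRecord operation.

```python
import json


def build_put_record_args(stream_name, event, partition_field="sourceIPAddress"):
    """Build keyword arguments for a Kinesis PutRecord call."""
    return {
        "StreamName": stream_name,
        # Data must be bytes; JSON is one common encoding choice.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key land on the same shard.
        "PartitionKey": str(event[partition_field]),
    }


put_args = build_put_record_args(
    "example-stream",  # hypothetical stream name
    {"sourceIPAddress": "203.0.113.10", "eventName": "DescribeInstances"},
)
print(put_args["PartitionKey"])
```

With boto3 installed and credentials configured, these arguments could be passed as `boto3.client("kinesis").put_record(**put_args)`.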
Compute operational metrics in real time
Compute metrics in real time using SQL, such as:
• Total calls by IP, service, API call, IAM user
• Amazon EC2 API failures (or any other service)
• Anomalous behavior of Amazon EC2 API (or any other service)
• Top 10 API calls across all services
Diagram: Raw data -> Amazon Kinesis Data Analytics -> Real time analytics
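Two of the metrics listed above, total calls by IP and failures by service, can be sketched over a small batch of events. This is a local Python illustration, not the streaming SQL the deck uses; the sample events are invented and `operational_metrics` is a hypothetical helper.

```python
from collections import Counter


def operational_metrics(events):
    """Return (calls per source IP, failed calls per service)."""
    calls_by_ip = Counter(e["sourceIPAddress"] for e in events)
    # A record counts as a failure when it carries an errorCode field.
    failures_by_service = Counter(
        e["eventSource"] for e in events if "errorCode" in e
    )
    return calls_by_ip, failures_by_service


events = [
    {"sourceIPAddress": "203.0.113.10", "eventSource": "ec2.amazonaws.com",
     "eventName": "RunInstances", "errorCode": "Client.UnauthorizedOperation"},
    {"sourceIPAddress": "203.0.113.10", "eventSource": "ec2.amazonaws.com",
     "eventName": "DescribeInstances"},
    {"sourceIPAddress": "198.51.100.7", "eventSource": "s3.amazonaws.com",
     "eventName": "GetObject"},
]
calls_by_ip, failures_by_service = operational_metrics(events)
print(dict(calls_by_ip), dict(failures_by_service))
```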
How do I write streaming SQL? Easy!
Streams (in memory tables)
CREATE STREAM calls_per_ip_stream(
eventTimeStamp TIMESTAMP,
computationType VARCHAR(256),
category VARCHAR(1024),
subCategory VARCHAR(1024),
unit VARCHAR(256),
unitValue BIGINT
);
How do I write streaming SQL? Easy!
Pumps (continuous query)
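CREATE OR REPLACE PUMP calls_per_ip_pump AS
INSERT INTO calls_per_ip_stream
SELECT STREAM
/* eventTimeStamp: event time floored to the minute, matching the GROUP BY */
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE),
'CallsPerIP', /* computationType: illustrative label, not from the slide */
"sourceIPAddress", /* category */
'none', /* subCategory: illustrative placeholder */
'count', /* unit: illustrative label */
COUNT(*) /* unitValue */
FROM source_sql_stream_001 ctrail
GROUP BY "sourceIPAddress",
STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);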
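What the pump's grouping computes can be mimicked in Python: floor each event timestamp to its minute, then count per (minute, source IP). This sketch deliberately ignores the ROWTIME (processing time) grouping key, and the sample events are invented.

```python
from collections import Counter
from datetime import datetime


def minute_bucket(ts):
    """Floor a datetime to the start of its minute, like a 1-minute STEP."""
    return ts.replace(second=0, microsecond=0)


def calls_per_ip(events):
    """Count calls per (event-time minute, source IP)."""
    counts = Counter()
    for event in events:
        key = (minute_bucket(event["eventTimestamp"]), event["sourceIPAddress"])
        counts[key] += 1
    return counts


events = [
    {"eventTimestamp": datetime(2017, 11, 28, 17, 3, 12), "sourceIPAddress": "203.0.113.10"},
    {"eventTimestamp": datetime(2017, 11, 28, 17, 3, 48), "sourceIPAddress": "203.0.113.10"},
    {"eventTimestamp": datetime(2017, 11, 28, 17, 4, 5), "sourceIPAddress": "203.0.113.10"},
    {"eventTimestamp": datetime(2017, 11, 28, 17, 3, 30), "sourceIPAddress": "198.51.100.7"},
]
counts = calls_per_ip(events)
for (minute, ip), n in sorted(counts.items()):
    print(minute.isoformat(), ip, n)
```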
How do we aggregate streaming data?
• Aggregations (count, sum, min,…) take granular real time data and turn it into insights
• Data is continuously processed so you need to tell the application when you want results
Windows!
Window Types
• Sliding, tumbling, and custom windows
• Tumbling windows are fixed size and grouped keys do not overlap
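The difference between tumbling and sliding windows can be shown with plain timestamps. In this sketch a tumbling window assigns each event to exactly one fixed bucket, while the sliding count is recomputed for every event over the preceding interval. The boundary choice `(ts - size, ts]` for the sliding window is an assumption of this example, not a statement about the service.

```python
def tumbling_counts(timestamps, size):
    """Count events per fixed, non-overlapping window [start, start + size)."""
    counts = {}
    for ts in timestamps:
        start = ts - ts % size
        counts[start] = counts.get(start, 0) + 1
    return counts


def sliding_counts(timestamps, size):
    """For each event, count events in the window (ts - size, ts]."""
    ordered = sorted(timestamps)
    result = []
    left = 0  # index of the oldest event still inside the window
    for right, ts in enumerate(ordered):
        while ordered[left] <= ts - size:
            left += 1
        result.append((ts, right - left + 1))
    return result


timestamps = [1, 5, 12, 14, 21]
tumbling = tumbling_counts(timestamps, 10)
sliding = sliding_counts(timestamps, 10)
print(tumbling)
print(sliding)
```

Note that the event at 12 is counted once in the tumbling output but contributes to three sliding results (12, 14, and 21), which is the overlap the slide refers to.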
Event, ingest, and processing time
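• Event time is the timestamp assigned when the event occurred, also called client-side time.
• Ingest time is the timestamp assigned when the record is added to the stream, also called server-side time.
• Processing time is when your application reads and analyzes the data (ROWTIME).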
…
GROUP BY "sourceIPAddress",
/* Trigger for results */
STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
/* A timestamp grouping key */
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
Persist data for real time dashboards
• Use Kinesis Data Firehose to archive processed data in S3
• Use AWS Lambda to deliver data to DynamoDB (or another database)
• Use open source or other tools to visualize the data
Diagram: Real time analytics -> Amazon Kinesis Data Firehose -> Amazon S3 bucket for processed data; Real time analytics -> AWS Lambda function -> Amazon DynamoDB table(s) -> Chart.JS dashboard
Late results
• An event is late if it arrives after the computation to which it logically belongs has been completed
• Your Kinesis Analytics application will produce an amendment
…
GROUP BY "sourceIPAddress",
/* Trigger for results */
STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
/* A timestamp grouping key */
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
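Why grouping by both ROWTIME and event time yields an amendment can be seen in a small simulation. Each record carries a processing time and an event time in seconds; rows are grouped by (processing minute, event minute, IP). The record layout and sample values are assumptions made for this sketch.

```python
from collections import Counter


def emit_rows(records):
    """Group records the way the GROUP BY above does.

    records: (rowtime_seconds, event_time_seconds, ip) tuples.
    Returns one (processing_minute, event_minute, ip, count) row per group.
    """
    groups = Counter()
    for rowtime, event_time, ip in records:
        key = (rowtime // 60 * 60, event_time // 60 * 60, ip)
        groups[key] += 1
    return [(rt, et, ip, n) for (rt, et, ip), n in sorted(groups.items())]


records = [
    (65, 61, "203.0.113.10"),    # on time
    (70, 66, "203.0.113.10"),    # on time
    (185, 110, "203.0.113.10"),  # late: event minute 60, processed in minute 180
]
rows = emit_rows(records)
for row in rows:
    print(row)
```

The second output row has the same event-minute key as the first but a later processing minute: that extra row is the amendment, and it carries only the late record's count.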
Updating a database
• Perform inserts, but update on duplicate key (upsert)
• For DynamoDB, here is the AWS Lambda code:
…
GROUP BY "sourceIPAddress",
/* Trigger for results */
STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
/* A timestamp grouping key */
STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
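A minimal local sketch of the insert-or-update behavior, using a dict in place of the database. It is not the deck's Lambda code. It assumes amendment rows carry only the late records' count, so an existing value is incremented rather than overwritten; the row layout `(processing_minute, event_minute, ip, count)` and the sample rows are illustrative.

```python
def upsert(table, rows):
    """Insert a new key, or add to the stored count if the key exists.

    table: dict keyed by (event_minute, ip).
    rows: (processing_minute, event_minute, ip, count) tuples.
    """
    for _processing_minute, event_minute, ip, count in rows:
        key = (event_minute, ip)
        table[key] = table.get(key, 0) + count
    return table


rows = [
    (60, 60, "203.0.113.10", 2),    # first result for event minute 60
    (180, 60, "203.0.113.10", 1),   # amendment for the same key
    (180, 180, "198.51.100.7", 4),  # new key
]
table = upsert({}, rows)
print(table)
```

In DynamoDB, an UpdateItem call with an ADD update expression behaves similarly: it creates the item if it is missing and increments the number otherwise.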
What does all this cost?
• All services used in the solution are pay as you go
• All services used are serverless and have lower devops expense
• This solution will cost the “average” customer less than $100 per month
Where do you go next?
Try it out yourself
Go to aws.amazon.com/kinesis/
Some good examples:
• Get started in minutes with a clickthrough template for AWS
CloudTrail Event Log Analytics - <link> (friendly URL)
• Tinyurl.com/rt-dashboard
• Great blog posts with example use cases
Lots of customer examples
• 1 billion events/wk from connected devices | IoT
• 17 PB of game data per season | Entertainment
• 80 billion ad impressions/day, 30 ms response time | Ad Tech
• 100 GB/day click streams from 250+ sites | Enterprise
• 50 billion ad impressions/day, sub-50 ms responses | Ad Tech
• 10 million events/day | Retail
• Amazon Kinesis as Databus - Migrate from Kafka to Kinesis | Enterprise
• Funnel all production events through Amazon Kinesis
Integrate with your current solutions
Get help from partner systems integrators
THANK YOU!

ABD301-Analyzing Streaming Data in Real Time with Amazon Kinesis

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:Invent. Analyzing Streaming Data in Real Time with Amazon Kinesis. Ryan Nienhuis, Senior Product Manager, Amazon Kinesis. November 2017.
  • 2. It’s All About the Pace. Batch processing: hourly server logs; weekly or monthly bills; daily website clickstream; daily fraud reports. Stream processing: real-time metrics; real-time spending alerts/caps; real-time clickstream analysis; real-time detection.
  • 3. Why? Data loses value over time. Ingest data as it is generated. Analyze data in real time to get insights immediately. Deliver data in seconds instead of hours.
  • 4. Simple Pattern for Streaming Data. Data producer (for example, a mobile client): continuously creates data; continuously writes data to a stream; can be almost anything. Streaming service (for example, Amazon Kinesis): durably stores data; provides a temporary buffer that preps data; supports very high throughput. Data consumer (for example, an Amazon Kinesis app): continuously processes data; cleans, prepares, and aggregates; transforms data to information.
  • 5. Amazon Kinesis. Amazon Kinesis Data Streams: build custom applications that process and analyze streaming data. Amazon Kinesis Data Analytics: easily process and analyze streaming data with standard SQL. Amazon Kinesis Data Firehose: easily load streaming data into AWS.
  • 6. Amazon Kinesis Data Streams. Easy administration and low cost. Build real-time applications with the framework of your choice. Secure, durable storage.
  • 7. Amazon Kinesis Data Analytics. Powerful real-time applications. Easy to use, fully managed. Automatic elasticity.
  • 8. Amazon Kinesis Data Firehose. Zero administration and seamless elasticity. Direct-to-data-store integration. Serverless, continuous data transformations. Destinations shown: Amazon S3, Amazon Redshift.
  • 9. Amazon Kinesis Data Analytics Applications. Connect to a streaming source. Easily write SQL code to process streaming data. Continuously deliver SQL results.
  • 10. Common Use Cases
  • 11. Three Common Scenarios. Streaming ingest-transform-load: deliver data to analytics tools faster and cheaper. Continuous metric generation: compute analytics as the data is generated. Actionable insights: react to analytics based on insights.
  • 12. Web Analytics and Leaderboards. Ingest web app data: Amazon Cognito with lightweight JS client code, or a web server on an Amazon EC2 instance, writing to Amazon Kinesis Data Streams. Compute top 10 users: Amazon Kinesis Data Analytics. Persist to feed live apps: AWS Lambda function writing to an Amazon DynamoDB table.
  • 13. Monitoring IoT Devices. Ingest sensor data: IoT sensors, AWS IoT, Amazon Kinesis Data Streams. Compute average temperature every 10 seconds: Amazon Kinesis Data Analytics. Persist time series analytics to a database: AWS Lambda function writing to an Amazon RDS MySQL DB instance.
  • 14. Analyzing CloudTrail Event Logs. Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch Events trigger, Amazon Kinesis Data Firehose, Amazon S3 bucket for raw data. Compute operational metrics: Amazon Kinesis Data Analytics. Deliver to real-time dashboards and archival: Amazon Kinesis Data Firehose to an Amazon S3 bucket for processed data; AWS Lambda function to Amazon DynamoDB table(s) and a Chart.js dashboard.
  • 15. Deep Dive into Analyzing CloudTrail Event Logs
  • 16. Ingest and deliver CloudTrail events. CloudTrail provides continuous account activity logging. Events are sent in real time or near real time to Kinesis Data Firehose or Kinesis Data Streams. Each event includes a timestamp, IAM user, AWS service name, API call, response, and more. Pipeline shown: AWS CloudTrail, Amazon CloudWatch Events trigger, Kinesis Data Firehose, Amazon S3 bucket for raw data.
  • 17. Stream Data to Amazon Kinesis. Three routes: automatic ingestion, easy setup, write your own. Sources and tools shown: Amazon VPC Flow Logs, Elastic Load Balancing, Amazon RDS, Amazon CloudWatch Logs, AWS CloudTrail Event Logs, Amazon Pinpoint, Amazon API Gateway, AWS IoT events, AWS SDKs, Amazon DynamoDB, Amazon Kinesis Agent, Amazon Kinesis Producer Library. Annotations: as a proxy; for change data capture. Just a sample: there are many more ways to stream data to Amazon Kinesis.
  • 18. Compute operational metrics in real time. Amazon Kinesis Data Analytics turns raw data into real-time analytics. Compute metrics using SQL in real time, such as: total calls by IP, service, API call, or IAM user; Amazon EC2 API failures (or those of any other service); anomalous behavior of the Amazon EC2 API (or any other service); top 10 API calls across all services.
  • 19. How do I write streaming SQL? Easy! Streams (in-memory tables):
      CREATE STREAM calls_per_ip_stream(
          eventTimeStamp TIMESTAMP,
          computationType VARCHAR(256),
          category VARCHAR(1024),
          subCategory VARCHAR(1024),
          unit VARCHAR(256),
          unitValue BIGINT
      );
  • 20. How do I write streaming SQL? Easy! Pumps (continuous queries):
      CREATE OR REPLACE PUMP calls_per_ip_pump AS
      INSERT INTO calls_per_ip_stream
      SELECT STREAM "eventTimestamp", COUNT(*), "sourceIPAddress"
      FROM source_sql_stream_001 ctrail
      GROUP BY "sourceIPAddress",
          STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
          STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
  • 21. How do we aggregate streaming data? Aggregations (count, sum, min, …) take granular real-time data and turn it into insights. Data is continuously processed, so you need to tell the application when you want results. The answer: windows!
  • 22. Window Types. Sliding, tumbling, and custom windows. Tumbling windows have a fixed size, and their grouped keys do not overlap.
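
A tumbling window can be sketched outside of SQL as follows. This is a minimal Python illustration, not code from the talk: `step` and `tumbling_counts` are hypothetical helper names, and `step` assumes that the SQL `STEP` function rounds a timestamp down to the start of its interval.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def step(ts: datetime, interval: timedelta) -> datetime:
    """Round a timestamp down to the start of its fixed-size window."""
    return EPOCH + ((ts - EPOCH) // interval) * interval


def tumbling_counts(events, interval=timedelta(minutes=1)):
    """Count events per (window start, source IP).

    Every event falls into exactly one window, so windows never overlap.
    """
    counts = Counter()
    for ts, ip in events:
        counts[(step(ts, interval), ip)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        (datetime(2017, 11, 28, 10, 0, 5), "10.0.0.1"),
        (datetime(2017, 11, 28, 10, 0, 59), "10.0.0.1"),
        (datetime(2017, 11, 28, 10, 1, 0), "10.0.0.1"),
    ]
    for (window, ip), n in sorted(tumbling_counts(sample).items()):
        print(window.isoformat(), ip, n)
```

The first two sample events share the 10:00 window and the third starts the 10:01 window, which is the "fixed size, no overlap" property described on the slide.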
  • 23. Event, ingest, and processing time. Event time is the timestamp assigned when the event occurred, also called client-side time. Processing time is when your application reads and analyzes the data (ROWTIME).
      … GROUP BY "sourceIPAddress",
          /* Trigger for results */
          STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
          /* A timestamp grouping key */
          STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
  • 24. Persist data for real time dashboards. Use Kinesis Data Firehose to archive processed data in S3. Use AWS Lambda to deliver data to DynamoDB (or another database). Use open source or other tools to visualize the data. Pipeline shown: real-time analytics; Amazon Kinesis Data Firehose to an Amazon S3 bucket for processed data; AWS Lambda function to Amazon DynamoDB table(s) and a Chart.js dashboard.
  • 25. Late results. An event is late if it arrives after the computation to which it logically belongs has been completed. Your Kinesis Analytics application will produce an amendment.
      … GROUP BY "sourceIPAddress",
          /* Trigger for results */
          STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
          /* A timestamp grouping key */
          STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
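
The effect of grouping by both ROWTIME and event time can be simulated in a few lines. This is a hedged Python sketch, not the service's implementation: `emit_rows` and the row field names are hypothetical, and it assumes that each amendment row carries only the count of the late events, not a recomputed total.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MINUTE = timedelta(minutes=1)
EPOCH = datetime(1970, 1, 1)


def step(ts, interval=MINUTE):
    """Round a timestamp down to the start of its interval."""
    return EPOCH + ((ts - EPOCH) // interval) * interval


def emit_rows(arrivals):
    """Group (rowtime, event_time, ip) tuples the way the two-STEP GROUP BY does.

    One result row is produced per (processing-time window, event-time window, ip),
    so a late event yields an extra row for an event-time window already reported.
    """
    groups = defaultdict(int)
    for rowtime, event_time, ip in arrivals:
        groups[(step(rowtime), step(event_time), ip)] += 1
    return [
        {"emitted_for": rw, "event_window": ew, "ip": ip, "count": n}
        for (rw, ew, ip), n in sorted(groups.items())
    ]
```

With two on-time events and one event that arrives a minute late, this returns two rows for the same event-time window: the original result and the amendment.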
  • 26. Updating a database. Perform inserts, but update on duplicate key. For DynamoDB, here is the AWS Lambda code:
      … GROUP BY "sourceIPAddress",
          /* Trigger for results */
          STEP(ctrail.ROWTIME BY INTERVAL '1' MINUTE),
          /* A timestamp grouping key */
          STEP(ctrail."eventTimestamp" BY INTERVAL '1' MINUTE);
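
The "insert, but update on duplicate key" pattern can be illustrated with a small upsert. This is a sketch under stated assumptions, not the Lambda code from the talk: it uses Python's built-in `sqlite3` as a stand-in for DynamoDB or MySQL (it needs SQLite 3.24 or later for `ON CONFLICT ... DO UPDATE`), the table and column names are hypothetical, and it adds the incoming count to the stored one on the assumption that amendments carry partial counts. If the application emitted full recomputed totals, overwriting would be the right update instead.

```python
import sqlite3


def open_store():
    """Create an in-memory table keyed by (event-time window, source IP)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calls_per_ip ("
        " event_window TEXT NOT NULL,"
        " ip TEXT NOT NULL,"
        " calls INTEGER NOT NULL,"
        " PRIMARY KEY (event_window, ip))"
    )
    return conn


def upsert(conn, event_window, ip, calls):
    """Insert a result row; if the key already exists, fold the new count in."""
    conn.execute(
        "INSERT INTO calls_per_ip (event_window, ip, calls) VALUES (?, ?, ?) "
        "ON CONFLICT(event_window, ip) DO UPDATE SET calls = calls + excluded.calls",
        (event_window, ip, calls),
    )


if __name__ == "__main__":
    store = open_store()
    upsert(store, "2017-11-28T10:00", "10.0.0.1", 2)  # original result
    upsert(store, "2017-11-28T10:00", "10.0.0.1", 1)  # amendment for a late event
    print(store.execute("SELECT * FROM calls_per_ip").fetchall())
```

Because the write is keyed on the event-time window rather than on arrival order, the dashboard row stays correct however many amendments arrive.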
  • 27. What does all this cost? All services used in the solution are pay as you go. All services used are serverless and have lower DevOps expense. This solution will cost the “average” customer less than $100 per month.
  • 28. Where do we go next?
  • 29. Try it out yourself. Go to aws.amazon.com/kinesis/. Some good examples: get started in minutes with a clickthrough template for AWS CloudTrail Event Log Analytics - <link> (friendly URL); Tinyurl.com/rt-dashboard; great blog posts with example use cases.
  • 30. Lots of customer examples: 1 billion events/wk from connected devices (IoT); 17 PB of game data per season (Entertainment); 80 billion ad impressions/day, 30 ms response time (Ad Tech); 100 GB/day click streams from 250+ sites (Enterprise); 50 billion ad impressions/day, sub-50 ms responses (Ad Tech); 10 million events/day (Retail); Amazon Kinesis as databus, migrating from Kafka to Kinesis (Enterprise); funnel all production events through Amazon Kinesis.
  • 31. Integrate with your current solutions
  • 32. Get help from partner systems integrators
  • 33. THANK YOU!