© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Antonello Mantuano
Head of Software Engineering
Cerved
Dr Frank Munz
Senior Technical Evangelist
Amazon Web Services
Architecting for Real-Time Insights
with Streaming Data
Introductory - 200
“These sessions provide an overview of AWS services and
features, and they assume that attendees are new to the
topic. These sessions highlight basic use cases, features,
functions, and benefits.”
Agenda
- Streaming Architectures
- Amazon Kinesis
- Serverless Stream Processing
- Amazon Managed Streaming for Kafka (MSK)
- Customer Success Story: Antonello Mantuano from Cerved
Streaming Data
Web Clickstream, Application Logs, IoT Sensors
[Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
Continuously generated, small size events,
low latency requirements
Transform and Process Continuously
Streaming
Ingest video & data as it’s generated
Process data on the fly
Real-time analytics/ML, alerts, actions
From Batch to Streaming Analytics
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
Amazon Kinesis
Real-time data streaming and analytics
Easily collect, process, and analyze streams in real time
Kinesis Video Streams: Capture, process, and store video streams for analytics
Kinesis Data Streams: Build custom applications that analyze data streams
Kinesis Data Firehose: Load data streams into AWS data stores
Kinesis Data Analytics: Analyze data streams with SQL or Java
NEW!
Amazon Kinesis Data Streams Overview
Data Ingestion from a Variety of Sources
Sources: Transactions, ERP, Web logs/cookies, Connected devices → Kinesis Data Streams
AWS SDKs
• Publish directly from application code via APIs
• AWS Mobile SDK
• Managed AWS sources: CloudWatch Logs, AWS IoT, Kinesis Data
Analytics and more
• RDS Aurora via Lambda
Kinesis Agent
• Monitors log files and forwards lines as messages to Kinesis Data Streams
3rd party and open source
• Log4j appender
• Apache Kafka
• Flume, fluentd, and more …
Kinesis Producer Library (KPL)
• Background process aggregates and batches messages
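The batching idea behind the KPL bullet can be sketched in a few lines. This is a minimal illustration in plain Python, not the KPL itself: the function name `batch_records` is hypothetical, and the limits used are the per-request limits of the PutRecords API (500 records, 5 MiB per request, 1 MiB per record). The sketch ignores the partition key, which also counts toward record size in the real service.

```python
from typing import Iterable, Iterator, List

# Per-request limits of the PutRecords API.
MAX_RECORDS_PER_REQUEST = 500
MAX_BYTES_PER_REQUEST = 5 * 1024 * 1024
MAX_BYTES_PER_RECORD = 1024 * 1024


def batch_records(payloads: Iterable[bytes]) -> Iterator[List[bytes]]:
    """Greedily group payloads into batches that fit one PutRecords call."""
    batch: List[bytes] = []
    batch_bytes = 0
    for payload in payloads:
        if len(payload) > MAX_BYTES_PER_RECORD:
            raise ValueError("payload exceeds the 1 MiB per-record limit")
        too_many = len(batch) == MAX_RECORDS_PER_REQUEST
        too_big = batch_bytes + len(payload) > MAX_BYTES_PER_REQUEST
        # Close the current batch before it would break a limit.
        if batch and (too_many or too_big):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(payload)
        batch_bytes += len(payload)
    if batch:
        yield batch
```

Each yielded batch could then be handed to whatever client sends the request; sending fewer, larger requests is what reduces per-message overhead.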
Kinesis Data Streams: Standard consumers
Diagram: Data Producer → Stream (Shard 1, Shard 2, Shard 3, ... Shard n) → Consumer Application A via GetRecods()
Writes: up to 1 MB or 1000 records per second, per shard
GetRecords(): 5 transactions or 2 MB per second, per shard
With one consumer application, records can be retrieved every 200 ms.
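The per-shard limits on this slide translate directly into a sizing calculation. The sketch below is a rough estimate under stated assumptions: the function names are hypothetical, `read_mb_s` is the combined read rate of all standard consumers, and real sizing would also leave headroom for traffic spikes.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # MB per second, per shard
WRITE_RECORDS_PER_SHARD = 1000  # records per second, per shard
READ_MB_PER_SHARD = 2.0         # MB per second, per shard, shared by standard consumers
READ_CALLS_PER_SHARD = 5        # GetRecords transactions per second, per shard


def shards_needed(write_mb_s: float, write_records_s: float, read_mb_s: float) -> int:
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(write_records_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
    )


def polling_interval_ms(consumer_apps: int) -> float:
    """Shortest polling interval per app that stays within 5 calls per second."""
    return 1000.0 * consumer_apps / READ_CALLS_PER_SHARD
```

For example, `polling_interval_ms(1)` gives the 200 ms quoted on the slide, and adding a second standard consumer application doubles the interval each one has to respect.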
Kinesis Data Streams: Enhanced fan-out consumers
Every consumer gets a dedicated 2 MB per second, per shard. Latency is typically less than 70 ms.
Diagram: Data Producer → Stream (Shard 1) → one EFO Pipe per consumer → Consumer Application A and Consumer Application B
Each consumer calls RegisterStreamConsumer() and then SubscribeToShard(), and receives data at up to 2 MB per second.
HTTP/2: Consumers do not poll. Messages are pushed to the consumer as they arrive.
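The difference between the two consumer types comes down to simple arithmetic. The following sketch assumes, as an idealization, that standard consumers split the shared 2 MB per second evenly; the function name is hypothetical.

```python
def per_consumer_read_mb_s(consumers: int, enhanced_fan_out: bool) -> float:
    """Read throughput available to each consumer of a single shard."""
    if consumers < 1:
        raise ValueError("at least one consumer is required")
    shard_read_mb_s = 2.0
    if enhanced_fan_out:
        # Each registered consumer gets its own dedicated pipe.
        return shard_read_mb_s
    # Standard consumers share one 2 MB/s budget per shard.
    return shard_read_mb_s / consumers
```

With four consumers on one shard, each standard consumer gets about 0.5 MB per second, while each enhanced fan-out consumer keeps the full 2 MB per second.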
The Serverless Operational Model
No provisioning, no management
Pay for value
Automatic scaling
Highly available and secure
Processing a Data Stream with Lambda
Components: data producer, Kinesis Data Streams, Lambda service, Lambda function A, Lambda function B, Amazon SNS
Data producer: continuously streams data
Lambda service: continuously polls for new data (1 poll per second) and automatically invokes your function(s) when data is found
Lambda polls each shard once per second
Lambda’s maximum execution time is 15 minutes
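A function invoked this way receives a batch of records in its event. The handler below is a minimal sketch: it relies on the documented event shape, where each record carries a base64-encoded payload under `record["kinesis"]["data"]`, and it assumes that producers send JSON. The processing step is a placeholder that only counts records.

```python
import base64
import json
from typing import Any, Dict, List


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Decode each Kinesis record in the batch and count the ones processed."""
    decoded: List[Dict[str, Any]] = []
    for record in event.get("Records", []):
        # Kinesis delivers the payload base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        decoded.append(json.loads(raw))  # assumption: payloads are JSON
    # Hypothetical processing step: here we only count the records.
    return {"processed": len(decoded)}
```

Because the handler is a plain function, it can be exercised locally with a hand-built event before it is deployed.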
Kinesis Streaming Data Analytics:
SQL or Apache Flink (Java)
Kinesis Streaming Data Analytics / SQL
Kinesis Streaming Data Analytics / Apache Flink
Framework and engine for stateful processing of data streams.
Simple programming: Easy to use and flexible APIs make building apps fast
High performance: In-memory computing provides low latency & high throughput
Stateful Processing: Durable application state saves
Strong data integrity: Exactly-once processing and consistent state
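To make "stateful processing" concrete, here is a small sketch of a keyed tumbling-window count, the kind of aggregation a streaming SQL query or a Flink job typically expresses. It is plain Python, not the Flink or Kinesis Data Analytics API, and it keeps its state in memory without the durability or exactly-once guarantees listed above. The event shape `(timestamp, key)` is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[float, str]  # (event time in seconds, key)


def tumbling_window_counts(
    events: Iterable[Event], window_s: float
) -> Dict[Tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping time windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key in events:
        # Every timestamp falls into exactly one window.
        window_index = int(timestamp // window_s)
        counts[(window_index, key)] += 1
    return dict(counts)
```

The dictionary of counts is the "state" here; a stream processing engine would checkpoint it so that a restart does not lose or double-count events.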
Kinesis Data Firehose:
Ingest Transform Load (ITL)
Kinesis Data Firehose—How it Works
Ingest → Transform → Deliver
Sources: AWS IoT, Amazon Kinesis Agent, Amazon Kinesis Streams, Amazon CloudWatch Logs, Amazon CloudWatch Events, Apache Kafka
Destinations: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service
Kinesis Data Firehose: Record format Conversion
Diagram: Data Producer → JSON data → Kinesis Data Firehose → convert to columnar format (schema from Glue Data Catalog) → Amazon S3
Records that fail conversion go to /failed
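The row-to-column pivot at the heart of this conversion can be illustrated without any AWS service. The sketch below is only an analogy: Firehose writes real columnar file formats, whereas this code builds in-memory Python lists. The function name and the `schema` mapping of column name to Python type are hypothetical stand-ins for the Glue Data Catalog schema.

```python
import json
from typing import Any, Dict, List, Tuple

Schema = Dict[str, type]


def to_columns(lines: List[str], schema: Schema) -> Tuple[Dict[str, List[Any]], List[str]]:
    """Pivot JSON rows into per-column lists; collect rows that do not fit the schema."""
    columns: Dict[str, List[Any]] = {name: [] for name in schema}
    failed: List[str] = []
    for line in lines:
        try:
            row = json.loads(line)
            # Convert every field first so a bad row leaves the columns untouched.
            values = [schema[name](row[name]) for name in schema]
        except (ValueError, KeyError, TypeError):
            failed.append(line)
            continue
        for name, value in zip(schema, values):
            columns[name].append(value)
    return columns, failed
```

Rows that are not valid JSON, or that lack a field the schema requires, end up in the second return value, mirroring the /failed path on the slide.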
Amazon Kinesis Data – Streams vs. Firehose
Kinesis Data Streams: Scalable and durable real-time data streaming service with provisioned throughput and sub-second latency that can continuously capture gigabytes of data per second from hundreds of thousands of sources.
Kinesis Data Firehose: Capture, transform, convert and load data streams into AWS data stores for near real-time analytics. Data latency: 60 seconds.
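The latency difference follows from buffering: a delivery service that loads data into a store collects records and writes them in batches. The class below is a simplified sketch of size-or-age buffering, assuming a 60-second age threshold like the latency quoted above. The class name and parameters are hypothetical, and unlike a real service it only checks the thresholds when a record arrives rather than on a timer.

```python
from typing import List, Optional


class DeliveryBuffer:
    """Collect records and release them as one batch by size or by age."""

    def __init__(self, max_bytes: int, max_age_s: float) -> None:
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self._records: List[bytes] = []
        self._bytes = 0
        self._opened_at: Optional[float] = None

    def add(self, record: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer a record; return a batch to deliver if a threshold is reached."""
        if self._opened_at is None:
            self._opened_at = now  # the buffer's age starts with its first record
        self._records.append(record)
        self._bytes += len(record)
        if self._bytes >= self.max_bytes or now - self._opened_at >= self.max_age_s:
            batch = self._records
            self._records, self._bytes, self._opened_at = [], 0, None
            return batch
        return None
```

A record that arrives just after a flush may wait for the full age threshold before it is delivered, which is why this style of loading is described as near real time.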
Demo Architecture
Live Demo!
Use your phone & connect to:
XXX
1. Preparation
2. Mode
3. Mode
Comparing Amazon Kinesis Data Streams to MSK
Amazon Kinesis Data Streams: a stream with 3 shards (Shard 1, Shard 2, Shard 3). Producers write to the shards; each shard holds an ordered sequence of records, from oldest data to newest data.
Amazon MSK: a topic with 3 partitions (Partition 1, Partition 2, Partition 3). Producers write to the partitions; each partition holds an ordered sequence of records, from oldest data to newest data.
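How a record ends up in a particular shard can be sketched as follows. Kinesis Data Streams hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The helper names are hypothetical, and the evenly split ranges are an assumption that matches a freshly created stream but not necessarily one that has been resharded. Kafka applies the same idea with a different hash function and a modulo over the partition count.

```python
import hashlib
from typing import List, Tuple

HASH_SPACE = 2 ** 128  # MD5 produces 128-bit hash keys


def even_hash_ranges(shard_count: int) -> List[Tuple[int, int]]:
    """Split the hash key space into contiguous, roughly equal inclusive ranges."""
    step = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last shard absorbs any remainder of the division.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key: str, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the shard whose range contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside all ranges")
```

Records with the same partition key always land in the same shard, which is what preserves their relative order.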
Apache Kafka: Partitioned, Replicated Commit Log
Diagram: a Producer writes to a Cluster in which TopicA is split into Partition 1, Partition 2, and Partition 3, each with replicas. Three ZooKeeper nodes hold state & config.
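The "partitioned commit log" part of this title can be modelled in a few lines. This is a toy in-memory sketch, not Kafka's API: the class and method names are hypothetical, and replication across brokers is deliberately left out.

```python
from typing import List, Tuple


class PartitionedLog:
    """Append-only log split into partitions; each message gets an offset."""

    def __init__(self, partitions: int) -> None:
        self._partitions: List[List[bytes]] = [[] for _ in range(partitions)]

    def append(self, partition: int, message: bytes) -> int:
        """Append a message and return its offset within the partition."""
        log = self._partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition: int, offset: int, max_messages: int = 10) -> List[Tuple[int, bytes]]:
        """Read messages starting at an offset, without removing them."""
        log = self._partitions[partition]
        return list(enumerate(log[offset:offset + max_messages], start=offset))
```

Because reads never delete anything, each consumer only has to remember its own offset, and several consumers can replay the same partition independently.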
Challenges operating Apache Kafka
Difficult to set up, configure and operate
Hard to achieve high availability
Tricky to scale
AWS integrations = development
No console, no visible metrics
Getting started with Amazon MSK Preview is easy
Fast Data from Legacy to Cloud
The battle to overcome the gravity
March ’19
Antonello Mantuano
Head of Software Engineering
antonellomantuano
@manant74
Cerved – the Data Driven Company
Credit
Information
Credit
Management
Marketing
Solutions
LEAD GENERATION
CREDIT COLLECTION
DATA PROVIDING
& MARKETING
ANALYSIS
CREDIT INFORMATION
CREDIT SCORING
BAD CREDITS EVALUATION
We are deeply passionate about data. Our data enables
various financial services, from credit risk analysis to marketing
solutions to managing non-performing loans and bad debt.
Web Data: 1M company websites
Open Data: 4M items from open data sets
Cerved Data: 70M payments, 60M scores
Property: 70M real estate
Company Data: 20M companies, 16M shareholders
2,600 people
34,000 customers
€40M in data & technologies
30M decisions
1,400 TB of data
Why Cerved in Cloud?
Benefits of Cloud for Cerved
TIME-TO-MARKET: Rapid implementation for basic services.
PRIVACY & SEGREGATION: Manage customer data securely.
AVAILABILITY: Services available 24x7.
SCALABILITY: Infrastructure quickly adaptable to the load.
The Data Gravity
As data accumulates, it begins to have gravity. This Data Gravity pulls
services and applications closer to the data.
- Dave McCrory, 2010
Diagram labels: DATA, Services, Apps, Latency, Throughput
This attraction (gravitational force) is caused by the need for
services and applications to have higher bandwidth and/or lower
latency access to the data.
Data Ecosystem in Cerved
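Diagram of Cerved's data systems, as labeled on the slide: Sourcing Level 2, Sourcing Level 1, REPOS, SYNTH, Mondo Dati Lince (Lince data world), Dati clienti (customer data), NCA, ERG, EBS, HUB, DWH, MBD, Teradata, Tabula, Mongo4, DW, DB4You, XPCH 2, MATCHNEMO, Quaestio, LUDO, Tabula (on AWS), Aracne, G4U, MBD1, R3, Pragma, Splunk CDR, Mambo, CAS, Dedalo, ELK, CSS, CR-RIBA (Payline)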
How to overcome Gravity?
How to lift to the cloud with Data Gravity?
Cerved Data in Cloud Architecture
Diagram (elements as labeled on the slide):
Cerved DBs: Operational DBs, Online Data, OLTP Processes, Batch
CDC
Data stores: Hadoop Data Lake, NoSQL, Tabula
Cloud DB: DynamoDB, RDS, S3
Streaming is the new ETL
Stream Processing: CDC Producer, Raw Events, Aggregator, Basic Events, Aggregator, HL Events, Aggregator, NoSQL Ingestion, Hadoop Ingestion, Sync Connector for Cloud
Streaming is the Anti-Gravity
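The idea that a stream of change data capture (CDC) events can replace batch ETL can be illustrated with a small replay function. This is a generic sketch, not Cerved's implementation: the event shape (`table`, `op`, `key`, `row`) and the function name are assumptions made for the example.

```python
from typing import Any, Dict, Iterable

# Hypothetical CDC event shape: {"table": str, "op": str, "key": Any, "row": dict}
ChangeEvent = Dict[str, Any]


def apply_changes(events: Iterable[ChangeEvent]) -> Dict[str, Dict[Any, Dict[str, Any]]]:
    """Replay CDC events in order to rebuild the current rows of each table."""
    state: Dict[str, Dict[Any, Dict[str, Any]]] = {}
    for event in events:
        table = state.setdefault(event["table"], {})
        if event["op"] == "delete":
            table.pop(event["key"], None)
        else:
            # "insert" or "update": merge the changed columns into the row.
            table[event["key"]] = {**table.get(event["key"], {}), **event["row"]}
    return state
```

A downstream store that applies events like this stays in step with the source database continuously, instead of waiting for the next batch load.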
Cerved API: a Data In Cloud use case
Diagram (elements as labeled on the slide):
Cerved DBs: Operational DBs, Online Data, OLTP Processes, Batch
CDC
Data stores: Hadoop Data Lake, NoSQL, Tabula
Cloud DB: DynamoDB, RDS, S3
Stream Processing: CDC Producer, Raw Events, Aggregator, Basic Events, Aggregator, HL Events, Aggregator, NoSQL Ingestion, Hadoop Ingestion, Sync Connector to Cloud
Back End: AWS Lambda, Spring Boot, API Gateway
Front End: ReactJS, Redux, Swagger
The Results of API in Cloud
SLA: API available 24x7x365; 99.998% in January 2019
PERFORMANCE: High scalability, adapting quickly to load
COSTS: With AWS Lambda, DynamoDB, S3, etc., the cost of infrastructure grows with the load
DATA SYNC: Data is continuously updated in near real time
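To put the availability figure in perspective, 99.998% over the 31 days of January corresponds to under a minute of downtime. The helper below is a hypothetical convenience function that does the arithmetic.

```python
def allowed_downtime_seconds(availability_percent: float, days: int) -> float:
    """Downtime, in seconds, implied by an availability percentage over a period."""
    total_seconds = days * 24 * 60 * 60
    return total_seconds * (1 - availability_percent / 100)
```

For 31 days, the period contains 2,678,400 seconds, and 0.002% of that is roughly 53.6 seconds.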
Future use case of Data in Cloud
Diagram (elements as labeled on the slide):
Cerved DBs: Operational DBs, Online Data, OLTP Processes, Batch
CDC
Data stores: Hadoop Data Lake, NoSQL, Tabula
Cloud DB: DynamoDB, RDS, S3
Stream Processing: CDC Producer, Raw Events, Aggregator, Basic Events, Aggregator, HL Events, Aggregator, NoSQL Ingestion, Hadoop Ingestion, Sync Connector to Cloud
Back End: AWS Lambda, API Gateway, EMR, SageMaker
Streaming: AWS Kinesis or Managed Streaming for Kafka
Use Cases: Data Scientist in Cloud, Real Time Apps, DR & Backup
THANK YOU
Moving fast data to the cloud creates a new gravity for
new and innovative apps and services
Antonello Mantuano
Head of Software Engineering
antonellomantuano
@manant74
Thank you!
antonellomantuano
@manant74
frankmunz
@frankmunz

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Architectures for real-time data stream analysis

  • 1. Architecting for Real-Time Insights with Streaming Data. Antonello Mantuano, Head of Software Engineering, Cerved; Dr Frank Munz, Senior Technical Evangelist, Amazon Web Services. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. Introductory - 200: “These sessions provide an overview of AWS services and features, and they assume that attendees are new to the topic. These sessions highlight basic use cases, features, functions, and benefits.”
  • 3. Agenda: Streaming Architectures; Amazon Kinesis; Serverless Stream Processing; Amazon Managed Streaming for Kafka (MSK); Customer Success Story: Antonello Mantuano from Cerved.
  • 4. Streaming Data: Web Clickstream; Application Logs (e.g. [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test); IoT Sensors. Continuously generated, small size events, low latency requirements.
  • 5. Streaming: Transform and Process Continuously. Ingest video & data as it’s generated; process data on the fly; real-time analytics/ML, alerts, actions.
  • 6. From Batch to Streaming Analytics. https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
  • 8. Amazon Kinesis: real-time data streaming and analytics. Easily collect, process, and analyze streams in real time. Kinesis Video Streams: capture, process, and store video streams for analytics. Kinesis Data Streams: build custom applications that analyze data streams. Kinesis Data Firehose: load data streams into AWS data stores. Kinesis Data Analytics: analyze data streams with SQL or Java (NEW!).
  • 9. Amazon Kinesis Data Streams Overview
  • 10. Data Ingestion from a Variety of Sources into Kinesis Data Streams. Sources: transactions, ERP, web logs/cookies, connected devices. AWS SDKs: publish directly from application code via APIs; AWS Mobile SDK; managed AWS sources (CloudWatch Logs, AWS IoT, Kinesis Data Analytics and more); RDS Aurora via Lambda. Kinesis Agent: monitors log files and forwards lines as messages to Kinesis Data Streams. 3rd party and open source: Log4j appender; Apache Kafka; Flume, fluentd, and more. Kinesis Producer Library (KPL): background process aggregates and batches messages.
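The batching idea mentioned for the KPL on slide 10 can be illustrated with a small sketch. This is not the KPL's implementation (the KPL also aggregates several user records into one Kinesis record); it only groups payloads into batches that respect the PutRecords limits as AWS documented them around the time of the talk (500 records and 5 MiB per call, 1 MiB per record), which should be checked against current documentation. The function name `batch_records` and the default partition key are hypothetical.

```python
from typing import Iterable, Iterator, List

# PutRecords limits as documented by AWS around 2019 (assumption: verify
# against current documentation before relying on them).
MAX_RECORDS_PER_CALL = 500
MAX_BYTES_PER_CALL = 5 * 1024 * 1024   # includes partition keys
MAX_BYTES_PER_RECORD = 1024 * 1024


def batch_records(records: Iterable[bytes],
                  partition_key: str = "demo-key") -> Iterator[List[dict]]:
    """Group payloads into batches that fit into one PutRecords call each."""
    batch: List[dict] = []
    batch_bytes = 0
    key_size = len(partition_key.encode("utf-8"))
    for data in records:
        if len(data) > MAX_BYTES_PER_RECORD:
            raise ValueError("record exceeds the per-record size limit")
        size = len(data) + key_size
        # Close the current batch if adding this record would break a limit.
        if batch and (len(batch) == MAX_RECORDS_PER_CALL
                      or batch_bytes + size > MAX_BYTES_PER_CALL):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"Data": data, "PartitionKey": partition_key})
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    sizes = [len(b) for b in batch_records([b"x"] * 1200)]
    print(sizes)
```

Each yielded batch has the shape that the `Records` argument of the boto3 Kinesis client's `put_records` call expects, so a producer could send one batch per call instead of one call per message.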
  • 12. Kinesis Data Streams: Standard consumers
  • 13. Kinesis Data Streams: Standard consumers. A data producer writes up to 1 MB or 1000 records per second, per shard, into a stream (Shard 1, Shard 2, Shard 3, ... Shard n). Consumer Application A reads with GetRecords(): 5 transactions or 2 MB per second, per shard. With one consumer application, records can be retrieved every 200 ms.
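The per-shard limits on slide 13 translate directly into capacity arithmetic. The sketch below is a rough sizing aid under those stated limits only; the function names are hypothetical, and it assumes every standard consumer reads the full stream and shares the per-shard read limits evenly.

```python
import math

# Per-shard limits quoted on slide 13.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0
GET_RECORDS_CALLS_PER_SHARD = 5


def shards_needed(ingest_mb_per_s: float, records_per_s: float,
                  standard_consumers: int = 1) -> int:
    """Smallest shard count that satisfies the write and shared read limits."""
    by_bytes = math.ceil(ingest_mb_per_s / WRITE_MB_PER_SHARD)
    by_count = math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD)
    # Standard consumers share 2 MB/s per shard, and each reads everything.
    by_read = math.ceil(ingest_mb_per_s * standard_consumers / READ_MB_PER_SHARD)
    return max(1, by_bytes, by_count, by_read)


def polling_interval_ms(standard_consumers: int) -> float:
    """How often each consumer may call GetRecords on a shard."""
    return 1000 * standard_consumers / GET_RECORDS_CALLS_PER_SHARD


if __name__ == "__main__":
    print(shards_needed(3.5, 2500))   # write-bound example
    print(polling_interval_ms(1))     # the 200 ms figure from the slide
    print(polling_interval_ms(4))     # four consumers sharing the 5 calls/s
```

With one consumer the interval is the 200 ms quoted on the slide; every additional standard consumer stretches it, which is the motivation for enhanced fan-out.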
  • 14. Kinesis Data Streams: Enhanced fan-out consumers. Every consumer gets a dedicated 2 MB per second, per shard. Latency is typically less than 70 ms. Each consumer application (A, B) calls RegisterStreamConsumer() and then SubscribeToShard(), and receives data from the shard over its own EFO pipe at up to 2 MB per second. HTTP/2: consumers do not poll; messages are pushed to the consumer as they arrive.
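To make the difference between the two consumer types concrete, the following sketch compares the read bandwidth available to each consumer, using only the 2 MB per second, per shard figure from slides 13 and 14. The function name is hypothetical and even sharing among standard consumers is an assumption.

```python
def per_consumer_mb_per_s(consumers: int, shards: int,
                          enhanced_fan_out: bool) -> float:
    """Read bandwidth available to each consumer across the whole stream."""
    if consumers < 1 or shards < 1:
        raise ValueError("consumers and shards must be positive")
    # Enhanced fan-out: dedicated 2 MB/s per consumer, per shard.
    # Standard: all consumers share 2 MB/s per shard (assumed evenly).
    per_shard = 2.0 if enhanced_fan_out else 2.0 / consumers
    return per_shard * shards


if __name__ == "__main__":
    for efo in (False, True):
        label = "enhanced fan-out" if efo else "standard"
        print(label, per_consumer_mb_per_s(consumers=4, shards=3,
                                           enhanced_fan_out=efo))
```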
  • 16. The Serverless Operational Model: no provisioning, no management; pay for value; automatic scaling; highly available and secure.
  • 17. Processing a Data Stream with Lambda. A data producer continuously streams data into Kinesis Data Streams. The Lambda service continuously polls for new data (each shard once per second) and automatically invokes your function(s) when data is found. Lambda’s maximum execution time is 15 minutes. Diagram components: data producer, Kinesis Data Streams, Lambda service, Lambda function A, Lambda function B, Amazon SNS.
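A function invoked this way receives a batch of records whose payloads are base64-encoded under `Records[i].kinesis.data`. The handler below is a minimal sketch of that decoding step; the JSON payload format, the `type` field, and the `fake_event` helper (which only builds a local test event) are assumptions made for illustration and are not taken from the talk.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record in the batch and count messages per type."""
    counts = {}
    for record in event["Records"]:
        # Kinesis payloads arrive base64-encoded in the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        message = json.loads(payload)          # assumes JSON payloads
        kind = message.get("type", "unknown")  # "type" is a hypothetical field
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def fake_event(messages):
    """Build a minimal event with the same shape, for local testing only."""
    return {"Records": [
        {"kinesis": {
            "partitionKey": "demo",
            "data": base64.b64encode(json.dumps(m).encode("utf-8")).decode("ascii"),
        }}
        for m in messages
    ]}


if __name__ == "__main__":
    print(handler(fake_event([{"type": "click"}, {"value": 1}]), None))
```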
  • 18. Kinesis Streaming Data Analytics: SQL or Apache Flink (Java)
  • 19. Kinesis Streaming Data Analytics / SQL
  • 20. Kinesis Streaming Data Analytics / Apache Flink: framework and engine for stateful processing of data streams. Simple programming: easy to use and flexible APIs make building apps fast. High performance: in-memory computing provides low latency & high throughput. Stateful processing: durable application state saves. Strong data integrity: exactly-once processing and consistent state.
  • 21. Kinesis Data Firehose: Ingest Transform Load (ITL)
  • 22. Kinesis Data Firehose: How it Works. Ingest, transform, deliver. Sources: AWS IoT, Amazon Kinesis Agent, Amazon Kinesis Streams, Amazon CloudWatch Logs, Amazon CloudWatch Events, Apache Kafka. Destinations: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service.
  • 23. Kinesis Data Firehose: Record Format Conversion. A data producer sends JSON data to Kinesis Data Firehose, which uses a schema from the Glue Data Catalog to convert the data to a columnar format and delivers it to Amazon S3. The diagram also shows a /failed path.
  • 24. Amazon Kinesis Data – Streams vs. Firehose. Kinesis Data Streams: scalable and durable real-time data streaming service with provisioned throughput and sub-second latency that can continuously capture gigabytes of data per second from hundreds of thousands of sources. Kinesis Data Firehose: capture, transform, convert and load data streams into AWS data stores for near real-time analytics. Data latency 60 seconds.
  • 26. Demo Architecture
  • 27. Live Demo! Use your phone & connect to: XXX. 1. preparation; 2. mode; 3. mode.
  • 29. Comparing Amazon Kinesis Data Streams to MSK. Amazon Kinesis Data Streams: producers write to a stream with 3 shards (Shard 1, Shard 2, Shard 3); within each shard, records are numbered from 0 and ordered from oldest to newest data. Amazon MSK: producers write to a topic with 3 partitions (Partition 1, Partition 2, Partition 3); within each partition, records are numbered from 0 and ordered from oldest to newest data.
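The model shared by both services on slide 29, several independent append-only logs selected by a record key, can be sketched in a few lines. This toy class is illustrative only: Kinesis maps the MD5 hash of the partition key onto per-shard hash key ranges (assumed here to be split evenly), Kafka's default partitioner uses a different hash function, and real Kinesis sequence numbers are not dense 0-based integers like the offsets below. The class and method names are hypothetical.

```python
import hashlib


class PartitionedLog:
    """Toy model of a stream with N shards (or a topic with N partitions)."""

    def __init__(self, partitions: int):
        self.partitions = [[] for _ in range(partitions)]

    def partition_for(self, key: str) -> int:
        # Map the 128-bit MD5 hash of the key onto evenly split hash ranges.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        hash_value = int.from_bytes(digest, "big")
        return hash_value * len(self.partitions) // 2 ** 128

    def append(self, key: str, value: str):
        """Append a record; return (partition, offset) of the new record."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition: int, offset: int):
        """Return all records of one partition starting at the given offset."""
        return self.partitions[partition][offset:]


if __name__ == "__main__":
    log = PartitionedLog(3)
    for i, user in enumerate(["user-1", "user-2", "user-1", "user-3"]):
        print(user, log.append(user, f"event-{i}"))
```

Records with the same key always land in the same partition, so ordering is preserved per key but not across the whole stream.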
  • 30. Apache Kafka: Partitioned, Replicated Commit Log. Cluster diagram: a producer writes to TopicA Partition1, TopicA Partition2 and TopicA Partition3, each partition has replicas, and three ZooKeeper nodes hold state & config.
  • 31. Challenges operating Apache Kafka: difficult to set up, configure and operate; hard to achieve high availability; tricky to scale; AWS integrations = development; no console, no visible metrics.
  • 32. Getting started with Amazon MSK Preview is easy
  • 34. Fast Data from Legacy to Cloud: the battle to overcome the gravity. March ’19. Antonello Mantuano, Head of Software Engineering (antonellomantuano, @manant74)
  • 35. Cerved – the Data Driven Company. We are deeply passionate about data. Our data enables various financial services, from credit risk analysis to marketing solutions to managing non-performing loans and bad debt. Business areas: Credit Information, Credit Management, Marketing Solutions. Activities: lead generation, credit collection, data providing & marketing analysis, credit information, credit scoring, bad credits evaluation. Data assets: Web Data (1M companies’ sites); Open Data (4M info from open data sets); Cerved Data (70M payment, 60M scoring); Property (70M real estate); Company Data (20M companies, 16M shareholders). Key figures: 2,600 persons; 34,000 customers; €40M in data & technologies; 30M decisions; 1,400 TB of data.
  • 36. Why Cerved in Cloud? Benefits of Cloud for Cerved. Time-to-market: rapid implementation for basic services. Privacy & segregation: manage customers’ data in secure mode. Availability: services available 24x7. Scalability: infrastructure quickly adaptable to the load.
  • 37. The Data Gravity. As data accumulates, it begins to have gravity. This Data Gravity pulls services and applications closer to the data. (Dave McCrory, 2010) Diagram labels: Data, Services, Apps, Latency, Throughput. This attraction (gravitational force) is caused by the need for services and applications to have higher bandwidth and/or lower latency access to the data.
  • 38. Data Ecosystem in Cerved. Diagram labels: Sourcing Level 2, Sourcing Level 1, REPOS, SYNTH, Mondo Dati, Lince, customer data, NCA, ERG, EBS, HUB, DWH, MBD, Teradata, Tabula, Mongo4, DW, DB4You, XPCH 2 MATCHNEMO, Quaestio, LUDO, Tabula (on AWS), Aracne, G4U, MBD1, R3, Pragma, Splunk, CDR, Mambo, CAS, Dedalo, ELK, CSS, CR-RIBA (Payline)
  • 39. How to overcome Gravity? How to lift to the cloud with Data Gravity?
  • 40. Cerved Data in Cloud Architecture. Streaming is the new ETL; streaming is the anti-gravity. Diagram components: Cerved DBs, CDC DBs, Operational Online Data, OLTP Processes, Batch, Hadoop, DataLake, NoSql; CDC Producer, Raw Events Aggregator, Basic Events Aggregator, HL Events Aggregator, NoSql Ingestion, Sync Connector for Cloud, Hadoop Ingestion, Stream Processing; Tabula Cloud DB, DynamoDB, RDS, S3.
  • 41. Cerved API: a Data In Cloud use case. Diagram components: Cerved DBs, CDC DBs, Operational Online Data, OLTP Processes, Batch, Hadoop, DataLake, NoSql; CDC Producer, Raw Events Aggregator, Basic Events Aggregator, HL Events Aggregator, NoSql Ingestion, Sync Connector to Cloud, Hadoop Ingestion, Stream Processing; Tabula Cloud DB, DynamoDB, RDS, S3. Back End: AWS Lambda, Spring Boot, API Gateway. Front End: ReactJs, Redux, Swagger.
  • 42. The Results of API in Cloud. SLA: API available 24x7x365, 99.998% in January 2019. Performance: high scalability, quickly adaptable to load. Costs: with AWS Lambda, DynamoDB, S3, etc., the cost of infrastructure grows with the load. Data sync: data are continuously updated in near real time.
  • 43. Future use case of Data in Cloud. Diagram components: Cerved DBs, CDC DBs, Operational Online Data, OLTP Processes, Batch, Hadoop, DataLake, NoSql; CDC Producer, Raw Events Aggregator, Basic Events Aggregator, HL Events Aggregator, NoSql Ingestion, Sync Connector to Cloud, Hadoop Ingestion, Stream Processing; Tabula Cloud DB, DynamoDB, RDS, S3. Back End: AWS Lambda, API Gateway, EMR, SageMaker, AWS Kinesis or Managed Streaming for Kafka. Use cases: Data Scientist in Cloud, Real Time Apps, DR & Backup.
  • 44. THANK YOU. Moving Fast Data in cloud creates a new gravity for new and innovative apps and services. Antonello Mantuano, Head of Software Engineering (antonellomantuano, @manant74)
  • 46. Thank you! antonellomantuano @manant74; frankmunz @frankmunz