PUBLIC SECTOR SUMMIT
Public Sector Brussels
04.09.19
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building a Modern Data Platform in the Cloud
Javier Ramirez
AWS Tech Evangelist
@supercoco9
DAT1
Brussels
04.09.19
Related breakouts
Everything You Need to Know About Big Data: From
Architectural Principles to the Best Practices
Manos Samatas, Solutions Architect, Amazon Web Services
Tableau and AWS: Analytics in the Cloud
Agenda
Challenges of data engineering and analytics
Building a data lake with S3. Ingesting data into the cloud
Data catalog and ETL with AWS Glue
Data warehouse with Redshift, Spectrum, and Athena
Business dashboards with QuickSight
Customer presentation
Demo
A brief opinionated history of data analytics
Before 2009: The DBA years
Problem: My reports make my database server very slow
Solution: Overnight DB dump; read-only replica
2009-2011: The Hadoop epiphany
Problem: My data doesn’t fit in one machine, and it’s not only transactional
Solution: Hadoop; Map/Reduce all the things
2012-2014: The Message Broker and NoSQL Age
Problem: My data is very fast; Map/Reduce is hard to use
Solution: Kafka/RabbitMQ; Cassandra/HBase/Storm; basic ETL; Hive
2015-2017: The Spark kingdom and the spreadsheet wars
Problem: Duplicating batch/stream is inefficient; I need to cleanse my source data; the Hadoop ecosystem is hard to manage; my data scientists don’t like Java; I am not sure which data we are already processing
Solution: Kafka/Spark; complex ETL; create new departments for data governance; spreadsheet all the things
2017-2018: The myth of DataOps
Problem: Streaming is hard; my schemas have evolved; I cannot query old and new data together; my cluster is running old versions and upgrading is hard; I want to use ML
Solution: Kafka/Flink (Java or Scala required); complex ETL with a pinch of ML; Apache Atlas; commercial distributions
Some problems during all periods
• My team spends more time maintaining the cluster than adding functionality
• Security and monitoring are hard
• Most of the time my cluster is sitting idle; then it’s a bottleneck
• I don’t have the time to experiment
• Data preparation, cleansing, and basic transformations take a disproportionately high amount of my time. And it’s so frustrating
Some simple things that scare me (and eat my productivity)
• Text encodings
• Empty strings. Literal “NULL” strings. Uppercase and lowercase
• Date and time formats: which date would you say 1/4/19 is? And 1553589297?
• CSV, especially if uploaded by end users
• A big JSON file in which row 176,543 has a property never seen before
• The same JSON file when all the numbers are strings
• XML
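A minimal sketch of why these inputs are scary, using only the Python standard library. The helper names and the list of NULL markers are assumptions made for illustration; they are not part of the talk.

```python
from datetime import datetime, timezone

# Assumed set of "this really means missing" markers; adjust per data source.
NULL_MARKERS = {"", "null", "none", "n/a"}


def clean_string(value):
    """Return None for empty or literal-NULL strings, else the stripped text."""
    if value is None:
        return None
    text = value.strip()
    return None if text.lower() in NULL_MARKERS else text


def parse_date(text, day_first):
    """Parse a d/m/yy or m/d/yy date; the caller must say which one it is."""
    fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
    return datetime.strptime(text, fmt).date()


def parse_epoch_seconds(value):
    """Interpret a number (or numeric string) as Unix seconds in UTC."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


print(parse_date("1/4/19", day_first=True))   # 2019-04-01
print(parse_date("1/4/19", day_first=False))  # 2019-01-04
print(parse_epoch_seconds("1553589297"))      # 2019-03-26 08:34:57+00:00
print(clean_string(" NULL "))                 # None
```

The same string yields two different dates depending on a convention the file itself never states, which is why the format has to be an explicit parameter rather than a guess.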
Let’s make data engineering and analytics less scary
More data lakes & analytics on AWS than anywhere else
A data lake is a centralized repository that allows
you to store all your structured and unstructured
data at any scale
Data Lakes, Analytics, and ML Portfolio from AWS
Broadest, deepest set of analytic services
Machine Learning: Amazon SageMaker, AWS Deep Learning AMIs, Amazon Rekognition, Amazon Lex, AWS DeepLens, Amazon Comprehend, Amazon Translate, Amazon Transcribe, Amazon Polly
Analytics: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight
On-premises Data Movement: AWS Direct Connect, AWS Snowball, AWS Snowmobile, AWS Database Migration Service, AWS Storage Gateway
Real-time Data Movement: AWS IoT Core, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams
Data Lake on AWS: Storage | Archival Storage | Data Catalog
Data Movement From On-premises Datacenters
AWS Snowball, Snowball Edge, and Snowmobile
Petabyte- and exabyte-scale data transport solutions that use secure appliances to transfer large amounts of data into and out of the AWS cloud
AWS Direct Connect
Establishes a dedicated network connection from your premises to AWS; reduces your network costs, increases bandwidth throughput, and provides a more consistent network experience than Internet-based connections
AWS Storage Gateway
Lets your on-premises applications use AWS for storage; includes a highly optimized data transfer mechanism and bandwidth management, along with a local cache
AWS Database Migration Service
Migrates databases from the most widely used commercial and open-source offerings to AWS quickly and securely, with minimal downtime to applications
Data Movement From Real-time Sources
Amazon Kinesis Video Streams
Securely stream video from connected devices to AWS for analytics, machine learning (ML), and other processing
Amazon Kinesis Data Firehose
Capture, transform, and load data streams into AWS data stores for near real-time analytics with existing business intelligence tools
Amazon Kinesis Data Streams
Build custom, real-time applications that process data streams using popular stream processing frameworks
AWS IoT Core
Supports billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely
Managed Streaming for Kafka
Fully managed open-source platform for building real-time streaming data pipelines and applications
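As an illustration of feeding Kinesis Data Streams, here is a small sketch that prepares events for the PutRecords API, which accepts at most 500 records per request. The event shape, the `device_id` field, and the stream name in the comment are assumptions; the actual boto3 call is left commented out so the sketch runs without AWS credentials.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_kinesis_records(events, key_field):
    """Convert dict events to the Records entries that PutRecords expects."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            "PartitionKey": str(event[key_field]),
        }
        for event in events
    ]


def batches(records, size=MAX_RECORDS_PER_CALL):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Hypothetical telemetry events; "device_id" is an assumed field name.
events = [{"device_id": i % 10, "temperature": 20 + i % 5} for i in range(1200)]
chunks = list(batches(to_kinesis_records(events, "device_id")))
print([len(chunk) for chunk in chunks])  # [500, 500, 200]

# With boto3 installed and credentials configured, each chunk could be sent as:
#   import boto3
#   kinesis = boto3.client("kinesis")
#   kinesis.put_records(StreamName="example-stream", Records=chunk)
```

A production sender would also inspect `FailedRecordCount` in each response and retry the failed entries; that part is omitted here.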
Amazon S3: Object Storage
Security and Compliance
Three different forms of encryption; encrypts data in transit when replicating across regions; log and monitor with CloudTrail; use ML to discover and protect sensitive data with Macie
Flexible Management
Classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention
Durability, Availability & Scalability
Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; can be automatically replicated to any other AWS region
Query in Place
Run analytics & ML on the data lake without data movement; S3 Select can retrieve a subset of data, improving analytics performance by up to 400%
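The lifecycle policies mentioned above can be sketched as a plain configuration document. The prefix, day counts, and bucket name below are illustrative assumptions; the dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, whose call is shown only as a comment.

```python
def lifecycle_rule(prefix, ia_after_days, glacier_after_days, expire_after_days):
    """Build one lifecycle rule that tiers, then expires, objects under a prefix."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("expected ia_after_days < glacier_after_days < expire_after_days")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical policy: raw data moves to cheaper tiers and is deleted after a year.
config = {"Rules": [lifecycle_rule("raw/", 30, 90, 365)]}
print(config["Rules"][0]["ID"])  # tier-and-expire-raw

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```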
Amazon Glacier: Backup and Archive
Durability, Availability & Scalability
Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; can be automatically replicated to any other AWS region
Secure
Log and monitor with CloudTrail; Vault Lock enables WORM storage capabilities, helping satisfy compliance requirements
Retrieves data in minutes
Three retrieval options to fit your use case; expedited retrievals with Glacier Select can return data in minutes
Inexpensive
Lowest cost AWS object storage class, allowing you to archive large amounts of data at a very low cost
Data Preparation Accounts for ~80% of the Work
[Chart categories: building training sets; cleaning and organizing data; collecting data sets; mining data for patterns; refining algorithms; other]
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#6493d6c76f63
AWS Glue: Data Catalog
Make data discoverable
• Automatically discovers data and stores schema
• Catalog makes data searchable, and available for ETL
• Catalog contains table and job definitions
• Computes statistics to make queries efficient
• Run ad hoc or on a schedule; serverless, so you only pay when the crawler runs
[Diagram labels: Glue Data Catalog; discover data and extract schema; compliance]
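To make "discovers data and stores schema" concrete, here is a toy schema-inference pass over JSON lines. It is a simplified illustration written for this transcript, not the algorithm the Glue crawler uses; the type labels and sample records are assumptions.

```python
import json


def type_name(value):
    """Map a JSON value to a coarse type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "struct"


def infer_schema(json_lines):
    """Return {field: sorted list of observed types} across all records."""
    schema = {}
    for line in json_lines:
        record = json.loads(line)
        for field, value in record.items():
            schema.setdefault(field, set()).add(type_name(value))
    return {field: sorted(types) for field, types in schema.items()}


sample = [
    '{"id": 1, "amount": 9.5}',
    '{"id": "2", "amount": 3}',                 # a number arriving as a string
    '{"id": 3, "amount": 1.0, "promo": true}',  # a property never seen before
]
print(infer_schema(sample))
# {'id': ['bigint', 'string'], 'amount': ['bigint', 'double'], 'promo': ['boolean']}
```

Fields that report more than one type are the ones that need a cleansing rule before the data is queryable with a single schema.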
AWS Glue: ETL Service
Make ETL scripting and deployment easy
• Automatically generates ETL code: Spark (Scala/Python) or Python shell scripts
• Code is customizable (demo later on. Yay!)
• Endpoints provided to edit, debug, and test code
• Jobs are scheduled or event-based
• Serverless
Amazon EMR: Big Data Processing
Low cost
Flexible billing with per-second billing, EC2 Spot, Reserved Instances, and auto-scaling to reduce costs 50–80%
Easy
Launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, or cluster tuning
Latest versions
Updated with the latest open source frameworks within 30 days of release
Use S3 storage
Process data directly in the S3 data lake securely with high performance using the EMRFS connector
Amazon Redshift: Data Warehousing
Fast at scale
Columnar storage technology to improve I/O efficiency and scale query performance
Secure
Audit everything; encrypt data end-to-end; extensive certification and compliance
Open file formats
Analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3
Inexpensive
As low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour
Amazon Redshift Spectrum
Extend the data warehouse to exabytes of data in the S3 data lake
[Diagram labels: Redshift data; Redshift Spectrum query engine; S3 data lake]
• Exabyte-scale Redshift SQL queries against S3
• Join data across Redshift and S3
• Scale compute and storage separately
• Stable query performance and unlimited concurrency
• CSV, ORC, Avro, & Parquet data formats
• Pay only for the amount of data scanned
Let’s play a game: SQL on an exabyte of data
Werner Vogels, Amazon’s CTO, AWS Summit San Francisco 2017
https://youtu.be/RpPf38L0HHU?t=3963
Numbers are fun
Werner Vogels, Amazon’s CTO, AWS Summit San Francisco 2017
https://youtu.be/RpPf38L0HHU?t=3963
Amazon Athena: Interactive Analysis
Interactive query service to analyze data in Amazon S3 using standard SQL
No infrastructure to set up or manage and no data to load
Ability to run SQL queries on data archived in Amazon Glacier (coming soon)
Query Instantly
Zero setup cost; just point to S3 and start querying
Open
ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types
Easy
Serverless: zero infrastructure, zero administration; integrated with QuickSight
Pay per query
Pay only for queries run; save 30–90% on per-query costs through compression
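A worked example of the pay-per-query arithmetic: when billing is proportional to bytes scanned, compressing the data reduces the bill by the same ratio. The price and the 4:1 compression ratio below are placeholder assumptions, not quoted figures; check current pricing before relying on the numbers.

```python
TB = 1024 ** 4  # bytes in a tebibyte


def query_cost(bytes_scanned, price_per_tb):
    """Cost of one query when billing is proportional to bytes scanned."""
    return bytes_scanned / TB * price_per_tb


def savings_fraction(raw_bytes, compressed_bytes):
    """Fraction of the per-query cost saved by scanning the compressed copy."""
    return 1 - compressed_bytes / raw_bytes


PRICE_PER_TB = 5.0      # assumed placeholder price in USD per TB scanned
raw = 2 * TB            # hypothetical table size, uncompressed
compressed = raw // 4   # assumed 4:1 compression ratio

print(query_cost(raw, PRICE_PER_TB))         # 10.0
print(query_cost(compressed, PRICE_PER_TB))  # 2.5
print(savings_fraction(raw, compressed))     # 0.75
```

Under these assumptions a full scan drops from 10.0 to 2.5, a 75% saving, which sits inside the 30–90% range stated on the slide.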
Amazon QuickSight
Easy
Empower everyone
Seamless connectivity
Fast analysis
Serverless
Now with ML superpowers!
Data Lakes from AWS
Data Lake on AWS
Cost-effective
Scalable and durable
Secure
Open and comprehensive
Analytics
Machine Learning
Real-time Data Movement
On-premises Data Movement
AWS Provides Highest Levels of Security
Secure
Compliance: AWS Artifact, Amazon Inspector, AWS CloudHSM, Amazon Cognito, AWS CloudTrail
Security: Amazon GuardDuty, AWS Shield, AWS WAF, Amazon Macie, VPC
Encryption: AWS Certificate Manager, AWS Key Management Service, encryption at rest, encryption in transit, bring your own keys, HSM support
Identity: AWS IAM, AWS SSO, Amazon Cloud Directory, AWS Directory Service, AWS Organizations
Customers need multiple levels of security, identity and access management, encryption, and compliance to secure their data lake
Compliance: Virtually Every Regulatory Agency
Global
CSA: Cloud Security Alliance Controls
ISO 9001: Global Quality Standard
ISO 27001: Security Management Controls
ISO 27017: Cloud Specific Controls
ISO 27018: Personal Data Protection
PCI DSS Level 1: Payment Card Standards
SOC 1: Audit Controls Report
SOC 2: Security, Availability, & Confidentiality Report
SOC 3: General Controls Report
United States
CJIS: Criminal Justice Information Services
DoD SRG: DoD Data Processing
FedRAMP: Government Data Standards
FERPA: Educational Privacy Act
FIPS: Government Security Standards
FISMA: Federal Information Security Management
GxP: Quality Guidelines and Regulations
FFIEC: Financial Institutions Regulation
HIPAA: Protected Health Information
ITAR: International Arms Regulations
MPAA: Protected Media Content
NIST: National Institute of Standards and Technology
SEC Rule 17a-4(f): Financial Data Standards
VPAT/Section 508: Accountability Standards
Asia Pacific
FISC [Japan]: Financial Industry Information Systems
IRAP [Australia]: Australian Security Standards
K-ISMS [Korea]: Korean Information Security
MTCS Tier 3 [Singapore]: Multi-Tier Cloud Security Standard
My Number Act [Japan]: Personal Information Protection
Europe
C5 [Germany]: Operational Security Attestation
Cyber Essentials Plus [UK]: Cyber Threat Protection
G-Cloud [UK]: UK Government Standards
IT-Grundschutz [Germany]: Baseline Protection Methodology
AWS Runs the Largest Global Cloud Infrastructure
Scalable and durable
For example: Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second
[Chart: customer data over time]
“…the scale at which AWS operates its public cloud storage services dwarfs the other vendors in this Magic Quadrant.”
- Gartner Magic Quadrant for Public Cloud Storage Services, Worldwide. Raj Bala, Arun Chandrasekaran, John McArthur, July 24, 2017
Fortnite | 125+ million players
CHALLENGE
Need to create a constant feedback loop for designers
Gain an up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, thus resulting in the most popular game played in the world
Epic Games uses Data Lakes and analytics
Entire analytics platform running on AWS
S3 leveraged as a Data Lake
All telemetry data is collected with Kinesis
Real-time analytics done through Spark on EMR, DynamoDB to create scoreboards and real-time queries
Use Amazon EMR for large batch data processing
Game designers use data to inform their decisions
[Architecture diagram. Sources: game clients, game servers, launcher, game services, plus APIs, databases, S3, and other sources. Ingestion: Kinesis. Near-real-time pipelines: Spark on EMR and DynamoDB, with Grafana, scoreboards API, limited raw data (real-time ad hoc SQL), and user ETL (metric definition). Batch pipelines: ETL using EMR and S3 (data lake), with Tableau/BI and ad hoc SQL]
Data Lakes from AWS
Data Lake on AWS
Lowest cost
Scalable and durable
Secure
Open and comprehensive
Analytics
Machine Learning
Real-time Data Movement
On-premises Data Movement
Pay Only for the Resources You Use as You Scale
Lowest cost
• Pay-as-you-go for the resources you consume
• As low as $0.05/GB scanned with Athena
• EMR and Athena can automatically scale down resources after a job completes, saving you costs
• Commit to a set term and save up to 75% with Reserved Instances
• Run on spare compute capacity with EMR and save up to 90% with Spot
[Chart, Traditional: Rigid. Fixed server capacity plotted against demand. Unmet demand: upset players, missed revenue. Excess capacity: wasted $$$. Traditional approach leads to wasted capacity]
[Chart, AWS: Elastic. Capacity plotted against demand. AWS approach: pay for the capacity you use]
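The rigid-versus-elastic comparison can be reproduced with a few lines of arithmetic. The hourly demand figures and the fixed capacity of 8 servers are made-up numbers for illustration, and the elastic case is idealized as capacity that tracks demand exactly.

```python
def fixed_capacity_report(demand, capacity):
    """Compare hourly demand with a fixed number of servers."""
    unmet = sum(max(d - capacity, 0) for d in demand)
    idle = sum(max(capacity - d, 0) for d in demand)
    return {"paid": capacity * len(demand), "unmet": unmet, "idle": idle}


def elastic_capacity_report(demand):
    """Capacity tracks demand exactly (an idealization of elastic scaling)."""
    return {"paid": sum(demand), "unmet": 0, "idle": 0}


# Hypothetical demand, in servers needed during each of eight hours.
demand = [2, 2, 3, 5, 9, 12, 9, 4]

print(fixed_capacity_report(demand, capacity=8))
# {'paid': 64, 'unmet': 6, 'idle': 24}
print(elastic_capacity_report(demand))
# {'paid': 46, 'unmet': 0, 'idle': 0}
```

With fixed capacity, 24 of the 64 paid server-hours sit idle while 6 server-hours of demand still go unserved at the peak; both numbers drop to zero in the idealized elastic case.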
AWS databases and analytics
Broad and deep portfolio, built for builders
Analytics: Amazon Redshift (data warehousing), Amazon EMR (Hadoop + Spark), Athena (interactive analytics), Kinesis Analytics (real-time), Amazon Elasticsearch Service (operational analytics)
Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Aurora (MySQL, PostgreSQL), DynamoDB (key value, document), ElastiCache (Redis, Memcached), Neptune (graph), Timestream (time series), QLDB (ledger database), RDS on VMware
Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens
Data Lake: S3/Amazon Glacier, AWS Glue (ETL & Data Catalog), Lake Formation (data lakes)
Blockchain: Managed Blockchain, Blockchain Templates
Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect
AWS Marketplace: 250+ solutions; 730+ database solutions; 600+ analytics solutions; 25+ blockchain solutions; 20+ data lake solutions; 30+ solutions
ENTERPRISE INFORMATION MANAGEMENT @ VOO
Modern Data Platform on AWS: 13:15 – 14:05
AGENDA
• VOO & Micropole Belgium
• The problem of managing enterprise information
• Why public cloud? Why AWS?
• Architecture and AWS services used
• The results
• Next steps
VOO, PART OF NETHYS GROUP
Gas & Electricity
Energy investment
Telecoms/Media
Holdings
Quadruple Play
Renewable energy
Financial holdings
Management of electricity and gas distribution networks
B2B ICT services
Pay TV
Regional daily newspaper
News & TV magazine
VOO PRESENCE IN WALLONIA AND BRUSSELS
VOO – QUADRUPLE PLAY OFFERING
- ~20 analogue TV channels
- ~150 digital TV channels (SD, HD, 3D, 4K)
- VOOcorder & .évasion (PVR)
- Digital TV card
- VOD
- Be tv
- VOOmotion & Be tv Go, available for PC, tablet and smartphones
- Internet up to 400 Mbps
- “Unlimited” packs
- Wi-Fi modems compliant with the “ac” standard
- Wi-Fi homespots (Wifree)
- Fixed telephony
- VOOmobile via an MVNO agreement
Television and Internet, a simple offer, tailored to your needs
The no-frills essential experience!
A generous offer at attractive rates
The All-in Pack that makes your life easier
The Pack that is as mobile as you are
MICROPOLE BELGIUM, PART OF MICROPOLE GROUP
We will soon become an Advanced AWS partner
1250 business consultants & engineers around the world
30 years of expertise in advanced analytics and BI
A team of 12 certified AWS experts
3 years of AWS partnership
Data Intelligence & Performance
Data Governance & Architecture
Machine Learning
Blockchain
Advanced Analytics in the Cloud
HISTORY OF BI & BIG DATA AT VOO
[Diagram labels: OSS, BSS]
• Up to 2014
- Typical legacy environment with multiple dedicated, siloed DWH/reporting environments
- Source systems and data management environments all hosted on a private cloud
- Capacity upgrades and performance tuning slow, but more or less manageable
• During 2014, 2015, and 2016
- Launch of mobile services and exploitation of network and set-top box data to improve customer experience and usage-based campaign management
• Consequences
- Explosion of storage requirements
- Performance issues
- Lack of adequate tools to manage “big data”
- In addition: reporting/analytical environments fragmented and unsecured
GENESIS OF THE MEMENTO PROJECT
• April-May 2017, definition: detailed study with Micropole to define a business and technical architecture, a future organization and data governance model, and an implementation roadmap
• June 2017, implementation: kickoff of the Memento project with 2 project tracks
1. Implementation of an “ARCHITECTURE” allowing the “unification of DATA” coming from different sources (internal/external) to provide the company with reliable “INFORMATION” in real time (reporting/analysis)
2. Setup of an “IT and Business ORGANISATION” plus an “Operational GOVERNANCE”, strengthening a culture that supports the definition and implementation of a “data strategy”
Memento project objectives
Improve Customer Experience
Subscriber acquisition
Optimise operations
Increase the level of profitability
Increase customer retention
GDPR compliance
Provide decision support data for the 6 strategic axes of VOO
WHY CHOOSE A PUBLIC CLOUD?
• No upfront investments, no need for a detailed capacity plan, no long delays to order and install hardware
• Ultra-fast installation and configuration of the different solution components
• “Pay as you go” / “pay as you use” principle (capacity is extended only when required, and payments follow accordingly)
• Elasticity, resilience and high availability
• Use of managed services offered in the cloud drastically reduces license and maintenance costs
WHY AWS?
• Market leader in public cloud (source: Gartner)
• Broadest offering in PaaS
• AWS has the most complete offering in managed services compared to Azure or Google Cloud Platform
• Managed service definition (example for a database): it requires no installation, maintenance, patching, or licensing. Furthermore, backups are managed, and high availability is integrated and requires only minimal configuration.
• This is a tremendous accelerator and a considerable cost saving compared to traditional software. Examples: S3, Redshift, EMR, DynamoDB, etc.
• Infrastructure administration is reduced to a minimum and is done inside the EIM team
MEMENTO ARCHITECTURE
[Architecture diagram. Layers: SOURCES, STAGE, INTEGRATE, EXPOSE]
Data lake (Big Data Cluster): stage in raw/native format; process: batch & real-time; store & expose; archive
Sources: Telenet, social media, web logs, sensor data; CRM 4, CRM 7, ISU, ACBIS, Numbers, Effortel, FAST, SERAM, Effortel, Jira, ALLOT, …; every SAP transactional system; BW
Extract SAP data: simplified flows
Load modes: batch, real time, mini batches (NRT)
Optional: filter or aggregate
Enterprise Data Warehouse: primary access point
Reporting, analysis, visualisation
Predictive tools, e.g. churn prediction
Controlled sandbox: power users and data scientists, exploration (sandbox mode)
NoSQL: high-frequency queries on detailed data
Call center application
MEMENTO IMPLEMENTATION ON AWS
RESULTS
• A unified data platform
- That enables any analytical use case
- That goes beyond analytics usage
• Shortens development cycles and answers business requirements faster
• Scales without boundaries AND you pay only for what you use
• A well-defined architecture, structures and standards allow parallel work by a large(r) team and embrace agile
• Allows business teams to focus on data analysis (and not on data crunching)
• GDPR compliance (privacy by design)
NEXT STEPS
• Finalise the integration of all existing enterprise data sources
• Finalise remaining use cases and decommission the legacy environment
• Integrate all new data sources (from new projects and products)
• Extend the usage to other companies within the Nethys Group
• Respond in an agile way to the more challenging and complex business use cases, including the use of Machine Learning
THANKS FOR YOUR ATTENTION
Demo Overview
https://aws.amazon.com/blogs/big-data/harmonize-query-and-visualize-data-from-various-providers-using-aws-glue-amazon-athena-and-amazon-quicksight/
Typical steps of building a data lake
1. Set up storage
2. Move data
3. Cleanse, prep, and catalog data
4. Configure and enforce security and compliance policies
5. Make data available for analytics
Building data lakes can still take months
AWS Lake Formation (join the preview)
Build, secure, and manage a data lake in days
Build a data lake in days, not months
Build and deploy a fully managed data lake with a few clicks
Enforce security policies across multiple services
Centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications
Combine different analytics approaches
Empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog
How it works: AWS Lake Formation
[Diagram labels. Sources: OLTP, ERP, CRM, LOB, devices, web, sensors, social, Kinesis. Lake Formation: S3, IAM, KMS, Data Catalog. Consumers: Athena, Amazon Redshift, AI services, Amazon EMR, Amazon QuickSight]
Build data lakes quickly
• Identify, crawl, and catalog sources
• Ingest and clean data
• Transform into optimal formats
Simplify security management
• Enforce encryption
• Define access policies
• Implement audit logging
Enable self-service and combined analytics
• Analysts discover all data available for analysis from a single data catalog
• Use multiple analytics tools over the same data
Customer interest in AWS Lake Formation
“We are very excited about the launch of AWS Lake Formation, which provides a central point of control to easily load, clean, secure, and catalog data from thousands of clients to our AWS-based data lake, dramatically reducing our operational load. … Additionally, AWS Lake Formation will be HIPAA compliant from day one …”
- Aaron Symanski, CTO, Change Healthcare
“I can’t wait for my team to get our hands on AWS Lake Formation. With an enterprise-ready option like Lake Formation, we will be able to spend more time deriving value from our data rather than doing the heavy lifting involved in manually setting up and managing our data lake.”
- Joshua Couch, VP Engineering, Fender Digital
Thank you!
Javier Ramirez
@supercoco9
Select AWS Glue customers

Building a Modern Data Platform on AWS. Public Sector Summit Brussels 2019

  • 1. Public Sector Summit. Public Sector, Brussels, 04.09.19
  • 2. Building a Modern Data Platform in the Cloud. Javier Ramirez, AWS Tech Evangelist, @supercoco9. DAT1. Brussels, 04.09.19
  • 3. Related breakouts. Everything You Need to Know About Big Data: From Architectural Principles to the Best Practices (Manos Samatas, Solutions Architect, Amazon Web Services). Tableau and AWS: Analytics in the Cloud
  • 4. Agenda. Challenges of data engineering and analytics; Building a data lake with S3. Ingesting data into the cloud; Data catalog and ETL with AWS Glue; Data warehouse with Redshift, Spectrum, and Athena; Business dashboards with QuickSight; Customer presentation; Demo
  • 5. A brief opinionated history of data analytics
      Before 2009, The DBA years. Problem: My reports make my database server very slow. Solution: Overnight DB dump; Read-only replica
      2009-2011, The Hadoop epiphany. Problem: My data doesn’t fit in one machine; And it’s not only transactional. Solution: Hadoop; Map/Reduce all the things
      2012-2014, The Message Broker and NoSQL Age. Problem: My data is very fast; Map/Reduce is hard to use. Solution: Kafka/RabbitMQ; Cassandra/HBase/Storm; Basic ETL; Hive
      2015-2017, The Spark kingdom and the spreadsheet wars. Problem: Duplicating batch/stream is inefficient; I need to cleanse my source data; Hadoop ecosystem is hard to manage; My data scientists don’t like Java; I am not sure which data we are already processing. Solution: Kafka/Spark; Complex ETL; Create new departments for data governance; Spreadsheet all the things
      2017-2018, The myth of DataOps. Problem: Streaming is hard; My schemas have evolved; I cannot query old and new data together; My cluster is running old versions. Upgrading is hard; I want to use ML. Solution: Kafka/Flink (Java or Scala required); Complex ETL with a pinch of ML; Apache Atlas; Commercial distributions
  • 6. Some problems during all periods • My team spends more time maintaining the cluster than adding functionality • Security and monitoring are hard • Most of the time my cluster is sitting idle; then it’s a bottleneck • I don’t have the time to experiment • Data preparation, cleansing, and basic transformations take a disproportionately high amount of my time. And it’s so frustrating
  • 7. Some simple things that scare me (and eat my productivity) • Text encodings • Empty strings. Literal "NULL" strings. Uppercase and lowercase • Date and time formats: which date would you say this is: 1/4/19? And this: 1553589297? • CSV, especially if uploaded by end users • A big JSON file in which row 176.543 has a property never seen before • The same JSON file when all the numbers are strings • XML
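The date question on slide 7 can be made concrete with a minimal sketch. It only uses Python's standard library; the helper names `read_slash_date` and `read_epoch` are hypothetical and are not part of the talk. The point is that both inputs parse without error under more than one convention, so the format has to be agreed on rather than guessed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_slash_date(text):
    """Return both plausible readings of a slash-separated date string."""
    return {
        "day_first": datetime.strptime(text, "%d/%m/%y").date(),
        "month_first": datetime.strptime(text, "%m/%d/%y").date(),
    }


def read_epoch(value):
    """Read the same integer as seconds and as milliseconds since 1970-01-01 UTC."""
    return {
        "seconds": EPOCH + timedelta(seconds=value),
        "milliseconds": EPOCH + timedelta(milliseconds=value),
    }


slash = read_slash_date("1/4/19")
epoch = read_epoch(1553589297)

# Neither call raises, yet the two readings of each input disagree.
print(slash["day_first"], "vs", slash["month_first"])
print(epoch["seconds"].isoformat(), "vs", epoch["milliseconds"].isoformat())
```

Read as seconds, the integer lands in March 2019, a few days before the talk; read as milliseconds it lands in January 1970. A plausibility check on the resulting year is one common way to tell the two apart.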
  • 8. Let’s make data engineering and analytics less scary
  • 9. More data lakes & analytics on AWS than anywhere else
  • 10. A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale
  • 11. Data Lakes, Analytics, and ML Portfolio from AWS. Broadest, deepest set of analytic services. Machine Learning: Amazon SageMaker, AWS Deep Learning AMIs, Amazon Rekognition, Amazon Lex, AWS DeepLens, Amazon Comprehend, Amazon Translate, Amazon Transcribe, Amazon Polly. Analytics: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight. On-premises Data Movement: AWS Direct Connect, AWS Snowball, AWS Snowmobile, AWS Database Migration Service, AWS Storage Gateway. Real-time Data Movement: AWS IoT Core, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams. Data Lake on AWS: Storage | Archival Storage | Data Catalog
  • 12. Data Movement From On-premises Datacenters. AWS Snowball, Snowball Edge and Snowmobile: Petabyte and Exabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud. AWS Direct Connect: Establish a dedicated network connection from your premises to AWS; reduces your network costs, increases bandwidth throughput, and provides a more consistent network experience than Internet-based connections. AWS Storage Gateway: Lets your on-premises applications use AWS for storage; includes a highly-optimized data transfer mechanism and bandwidth management, along with a local cache. AWS Database Migration Service: Migrate databases from the most widely-used commercial and open-source offerings to AWS quickly and securely with minimal downtime to applications
  • 13. [slide without text]
  • 14. Data Movement From Real-time Sources. Amazon Kinesis Video Streams: Securely stream video from connected devices to AWS for analytics, machine learning (ML), and other processing. Amazon Kinesis Data Firehose: Capture, transform, and load data streams into AWS data stores for near real-time analytics with existing business intelligence tools. Amazon Kinesis Data Streams: Build custom, real-time applications that process data streams using popular stream processing frameworks. AWS IoT Core: Supports billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely. Managed Streaming for Kafka: Fully managed open-source platform for building real-time streaming data pipelines and applications
  • 15. Amazon S3: Object Storage. Security and Compliance: Three different forms of encryption; encrypts data in transit when replicating across regions; log and monitor with CloudTrail, use ML to discover and protect sensitive data with Macie. Flexible Management: Classify, report, and visualize data usage trends; objects can be tagged to see storage consumption, cost, and security; build lifecycle policies to automate tiering and retention. Durability, Availability & Scalability: Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; automatically replicated to any other AWS region. Query in Place: Run analytics & ML on the data lake without data movement; S3 Select can retrieve a subset of data, improving analytics performance by 400%
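The "eleven nines" figure on the S3 slide is easier to judge as arithmetic. The sketch below assumes the figure is an annual, per-object durability (the slide does not spell this out), and the object count is a hypothetical bucket size chosen only for illustration.

```python
from decimal import Decimal

# "Eleven nines" written out as a fraction: 99.999999999 %
durability = Decimal("0.99999999999")

# Assumed reading: probability of losing a given object within one year.
annual_loss_probability = 1 - durability

object_count = 10_000_000  # hypothetical number of stored objects

expected_losses_per_year = annual_loss_probability * object_count
years_per_expected_loss = 1 / expected_losses_per_year

print("expected objects lost per year:", float(expected_losses_per_year))
print("about one object lost every", int(years_per_expected_loss), "years")
```

`Decimal` is used instead of `float` so that the subtraction from 1 stays exact; with binary floats the eleventh decimal place is already subject to rounding.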
  • 16. Amazon Glacier: Backup and Archive. Durability, Availability & Scalability: Built for eleven nines of durability; data distributed across 3 physical facilities in an AWS region; automatically replicated to any other AWS region. Secure: Log and monitor with CloudTrail; Vault Lock enables WORM storage capabilities, helping satisfy compliance requirements. Retrieves data in minutes: Three retrieval options to fit your use case; expedited retrievals with Glacier Select can return data in minutes. Inexpensive: Lowest cost AWS object storage class, allowing you to archive large amounts of data at a very low cost
  • 17. Data Preparation Accounts for ~80% of the Work. Chart categories: Building training sets; Cleaning and organizing data; Collecting data sets; Mining data for patterns; Refining algorithms; Other. Source: https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#6493d6c76f63
  • 18. AWS Glue: Data Catalog. Make data discoverable • Automatically discovers data and stores schema • Catalog makes data searchable, and available for ETL • Catalog contains table and job definitions • Computes statistics to make queries efficient • Run ad hoc or on a schedule; serverless: only pay when the crawler runs (diagram labels: Glue Data Catalog; Discover data and extract schema; Compliance)
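To illustrate what "discovers data and stores schema" means in practice, here is a toy schema-inference sketch over JSON lines. It is not how a Glue crawler is implemented and it does not call any AWS API; the function `infer_schema` and the sample records are hypothetical. It only shows the idea of scanning records and recording which types appear for each field, which also surfaces the "numbers as strings" and "property never seen before" problems from slide 7.

```python
import json
from collections import defaultdict


def infer_schema(json_lines):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = defaultdict(set)
    for line in json_lines:
        record = json.loads(line)
        for field, value in record.items():
            seen[field].add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


# Hypothetical sample: one amount arrives as a string, one record has an extra field.
sample = [
    '{"id": 1, "amount": 10.5}',
    '{"id": 2, "amount": "11.0"}',
    '{"id": 3, "amount": 12.0, "coupon": "SPRING"}',
]

schema = infer_schema(sample)
for field, types in schema.items():
    flag = "  <- mixed types" if len(types) > 1 else ""
    print(f"{field}: {', '.join(types)}{flag}")
```

A field that reports more than one type is a candidate for a cleansing step before the data is queried.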
  • 19. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.P U B L I C S E C TO R S U M M I T AWSGlue—ETLService MakeETLscriptinganddeploymenteasy • Automatically generates ETL code. Spark (Scale/Python) or Python shell script. • Code is customizable (demo later on. Yay!) • Endpoints provided to edit, debug, test code • Jobs are scheduled or event-based • Serverless
  • 20. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.P U B L I C S E C TO R S U M M I T Data Lakes,Analytics,and MLPortfolio fromAWS Broadest,deepestsetofanalyticservices Amazon SageMaker AWS Deep Learning AMIs Amazon Rekognition Amazon Lex AWS DeepLens Amazon Comprehend Amazon Translate Amazon Transcribe Amazon Polly Amazon Athena Amazon EMR Amazon Redshift Amazon Elasticsearch service Amazon Kinesis Amazon QuickSight Analytics Machine Learning AWS Direct Connect AWS Snowball AWS Snowmobile AWS Database Migration Service AWS Storage Gateway AWS IoT Core Amazon Kinesis Data Firehose Amazon Kinesis Data Streams Amazon Kinesis Video Streams Real-time Data Movement On-premises Data Movement Data Lake on AWS Storage | Archival Storage | Data Catalog
  • 21. Amazon EMR: Big Data Processing. Low cost: flexible billing with per-second billing, EC2 Spot, Reserved Instances and auto-scaling to reduce costs 50–80%. Easy: launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, cluster tuning. Latest versions: updated with the latest open source frameworks within 30 days of release. Use S3 storage: process data directly in the S3 data lake securely with high performance using the EMRFS connector
  • 22. Amazon Redshift: Data Warehousing. Fast at scale: columnar storage technology to improve I/O efficiency and scale query performance. Secure: audit everything; encrypt data end-to-end; extensive certification and compliance. Open file formats: analyze optimized data formats on the latest SSD, and all open data formats in Amazon S3. Inexpensive: as low as $1,000 per terabyte per year, 1/10th the cost of traditional data warehouse solutions; start at $0.25 per hour
  • 23. Amazon Redshift Spectrum. Extend the data warehouse to exabytes of data in the S3 data lake. Diagram labels: S3 data lake; Redshift data; Redshift Spectrum query engine • Exabyte Redshift SQL queries against S3 • Join data across Redshift and S3 • Scale compute and storage separately • Stable query performance and unlimited concurrency • CSV, ORC, Avro, & Parquet data formats • Pay only for the amount of data scanned
  • 24. Let’s play a game: SQL on an exabyte of data. Werner Vogels, Amazon’s CTO, AWS Summit San Francisco 2017 https://youtu.be/RpPf38L0HHU?t=3963
  • 25. Numbers are fun. Werner Vogels, Amazon’s CTO, AWS Summit San Francisco 2017 https://youtu.be/RpPf38L0HHU?t=3963
  • 26. Numbers are fun. Werner Vogels, Amazon’s CTO, AWS Summit San Francisco 2017 https://youtu.be/RpPf38L0HHU?t=3963
  • 27. Amazon Athena: Interactive Analysis. Interactive query service to analyze data in Amazon S3 using standard SQL. No infrastructure to set up or manage and no data to load. Ability to run SQL queries on data archived in Amazon Glacier (coming soon). Query instantly: zero setup cost; just point to S3 and start querying. Open: ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types. Easy: serverless, zero infrastructure, zero administration; integrated with QuickSight. Pay per query: pay only for queries run; save 30–90% on per-query costs through compression
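The pay-per-query point can be worked through with simple arithmetic: when billing is proportional to bytes scanned, compressing the data shrinks the bill by the same factor. The sketch below assumes a hypothetical price of 5.0 currency units per terabyte scanned and a hypothetical 4:1 compression ratio; neither figure comes from the slides, and real pricing should be checked against the current price list.

```python
TB = 1024 ** 4  # bytes in one terabyte (binary convention, an assumption)

def scan_cost(bytes_scanned, price_per_tb):
    """Cost of a query billed purely on the amount of data scanned."""
    return bytes_scanned / TB * price_per_tb

def compressed_scan_cost(raw_bytes, compression_ratio, price_per_tb):
    """Cost of the same query when the data is stored compressed."""
    return scan_cost(raw_bytes / compression_ratio, price_per_tb)

# Hypothetical inputs, for illustration only.
PRICE_PER_TB = 5.0
raw_bytes = 2 * TB

raw_cost = scan_cost(raw_bytes, PRICE_PER_TB)
compressed_cost = compressed_scan_cost(raw_bytes, 4, PRICE_PER_TB)
saving = 1 - compressed_cost / raw_cost

print(f"uncompressed: {raw_cost:.2f}, compressed: {compressed_cost:.2f}, saving: {saving:.0%}")
```

With these assumed numbers the saving is 75%, which sits inside the 30–90% range quoted on the slide; columnar formats can reduce the scanned bytes further because only the referenced columns are read.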
  • 28. Amazon QuickSight: Easy; Empower everyone; Seamless connectivity; Fast analysis; Serverless; Now with ML superpowers!
  • 29. Data Lakes from AWS. Data Lake on AWS: Cost-effective; Scalable and durable; Secure; Open and comprehensive. Analytics; Machine Learning; Real-time Data Movement; On-premises Data Movement
  • 30. AWS Provides Highest Levels of Security. Secure. Compliance: AWS Artifact; Amazon Inspector; AWS CloudHSM; Amazon Cognito; AWS CloudTrail. Security: Amazon GuardDuty; AWS Shield; AWS WAF; Amazon Macie; VPC. Encryption: AWS Certificate Manager; AWS Key Management Service; Encryption at rest; Encryption in transit; Bring your own keys, HSM support. Identity: AWS IAM; AWS SSO; Amazon Cloud Directory; AWS Directory Service; AWS Organizations. Customers need multiple levels of security, identity and access management, encryption, and compliance to secure their data lake
  • 31. Compliance: Virtually Every Regulatory Agency. Global: CSA (Cloud Security Alliance Controls); ISO 9001 (Global Quality Standard); ISO 27001 (Security Management Controls); ISO 27017 (Cloud Specific Controls); ISO 27018 (Personal Data Protection); PCI DSS Level 1 (Payment Card Standards); SOC 1 (Audit Controls Report); SOC 2 (Security, Availability, & Confidentiality Report); SOC 3 (General Controls Report). United States: CJIS (Criminal Justice Information Services); DoD SRG (DoD Data Processing); FedRAMP (Government Data Standards); FERPA (Educational Privacy Act); FIPS (Government Security Standards); FISMA (Federal Information Security Management); GxP (Quality Guidelines and Regulations); FFIEC (Financial Institutions Regulation); HIPAA (Protected Health Information); ITAR (International Arms Regulations); MPAA (Protected Media Content); NIST (National Institute of Standards and Technology); SEC Rule 17a-4(f) (Financial Data Standards); VPAT/Section 508 (Accountability Standards). Asia Pacific: FISC [Japan] (Financial Industry Information Systems); IRAP [Australia] (Australian Security Standards); K-ISMS [Korea] (Korean Information Security); MTCS Tier 3 [Singapore] (Multi-Tier Cloud Security Standard); My Number Act [Japan] (Personal Information Protection). Europe: C5 [Germany] (Operational Security Attestation); Cyber Essentials Plus [UK] (Cyber Threat Protection); G-Cloud [UK] (UK Government Standards); IT-Grundschutz [Germany] (Baseline Protection Methodology)
  • 32. Data Lakes from AWS. Data Lake on AWS: Cost-effective; Scalable and durable; Secure; Open and comprehensive. Analytics; Machine Learning; Real-time Data Movement; On-premises Data Movement
  • 33. AWS Runs the Largest Global Cloud Infrastructure. Scalable and durable. For example: Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second. Chart axes: time; customer data. “…the scale at which AWS operates its public cloud storage services dwarfs the other vendors in this Magic Quadrant.” - Gartner Magic Quadrant for Public Cloud Storage Services, Worldwide; Raj Bala, Arun Chandrasekaran, John McArthur; July 24, 2017
  • 34. Fortnite | 125+ million players. CHALLENGE: Need to create a constant feedback loop for designers. Gain an up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, thus resulting in the most popular game played in the world
  • 35. Epic Games uses Data Lakes and analytics. Entire analytics platform running on AWS. S3 leveraged as a Data Lake. All telemetry data is collected with Kinesis. Real-time analytics done through Spark on EMR, DynamoDB to create scoreboards and real-time queries. Use Amazon EMR for large batch data processing. Game designers use data to inform their decisions. Diagram labels: Game clients; Game servers; Launcher; Game services; Near real-time pipelines; Batch pipelines; Grafana; Scoreboards API; Limited Raw Data (real time ad-hoc SQL); User ETL (metric definition); Spark on EMR; DynamoDB; ETL using EMR; Tableau/BI; Ad-hoc SQL; S3 (Data Lake); Kinesis; APIs; Databases; S3; Other sources
  • 36. Data Lakes from AWS. Data Lake on AWS: Lowest cost; Scalable and durable; Secure; Open and comprehensive. Analytics; Machine Learning; Real-time Data Movement; On-premises Data Movement
  • 37. Pay Only for the Resources You Use as you Scale. Lowest Cost • Pay-as-you-go for the resources you consume • As low as $0.05/GB scanned with Athena • EMR and Athena can automatically scale down resources after a job completes, saving you costs • Commit to a set term and save up to 75% with Reserved Instances • Run on spare compute capacity with EMR and save up to 90% with Spot. Traditional approach leads to wasted capacity. Chart labels: Traditional: Rigid; AWS: Elastic; Capacity; Demand; Servers; Unmet demand (upset players, missed revenue); Excess capacity (wasted $$$). AWS approach: pay for the capacity you use
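The rigid-versus-elastic chart can be reproduced with a few lines of arithmetic. The sketch below compares a fixed server capacity against a demand curve and reports excess capacity and unmet demand; the demand figures, the capacity of 60, and the function names are hypothetical values chosen for illustration, and the elastic case is idealised as capacity that tracks demand exactly.

```python
def rigid_capacity_outcome(demand, fixed_capacity):
    """Return (excess capacity, unmet demand) summed over all periods."""
    excess = sum(max(fixed_capacity - d, 0) for d in demand)
    unmet = sum(max(d - fixed_capacity, 0) for d in demand)
    return excess, unmet

def elastic_capacity_used(demand):
    """Idealised elastic case: capacity paid for equals demand in every period."""
    return sum(demand)

# Hypothetical demand per period, in arbitrary capacity units.
demand = [20, 40, 100, 60, 30]
fixed_capacity = 60

excess, unmet = rigid_capacity_outcome(demand, fixed_capacity)
rigid_paid = fixed_capacity * len(demand)
elastic_paid = elastic_capacity_used(demand)

print(f"rigid: paid for {rigid_paid} units, {excess} wasted, {unmet} unmet")
print(f"elastic: paid for {elastic_paid} units, 0 wasted, 0 unmet")
```

In this made-up scenario the rigid setup pays for more capacity than the elastic one and still fails to serve the peak, which is the point the chart makes.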
  • 38. AWS databases and analytics. Broad and deep portfolio, built for builders. Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server); Aurora (MySQL, PostgreSQL); DynamoDB (key value, document); ElastiCache (Redis, Memcached); Neptune (graph); Timestream (time series); QLDB (ledger database); RDS on VMware. Analytics: Amazon Redshift (data warehousing); Amazon EMR (Hadoop + Spark); Athena (interactive analytics); Kinesis Analytics (real-time); Amazon Elasticsearch Service (operational analytics). Business Intelligence & Machine Learning: Amazon QuickSight; Amazon SageMaker; Amazon Comprehend; Amazon Rekognition; Amazon Lex; Amazon Transcribe; AWS DeepLens. Data Lake: S3/Amazon Glacier; AWS Glue (ETL & Data Catalog); Lake Formation (data lakes). Blockchain: Managed Blockchain; Blockchain Templates. Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect. AWS Marketplace: 250+ solutions; 730+ Database solutions; 600+ Analytics solutions; 25+ Blockchain solutions; 20+ Data lake solutions; 30+ solutions
  • 40. ENTERPRISE INFORMATION MANAGEMENT @ VOO. Modern Data Platform on AWS: 13:15 – 14:05
  • 41. AGENDA • VOO & Micropole Belgium • The problem of managing the Enterprise Information • Why Public Cloud? Why AWS? • Architecture and used AWS services • The results • Next steps
  • 42. VOO, PART OF NETHYS GROUP. Gas & Electricity; Energy investment; Telecoms/Media; Holdings; Quadruple Play; Renewable energy; Financial holdings; Management of electricity and gas distribution networks; B2B ICT services; Pay TV; Regional daily newspaper; Magazine News & TV
  • 43. VOO PRESENCE IN WALLONIA AND BRUSSELS
  • 44. VOO – QUADRUPLE PLAY OFFERING. Television: ~20 analogue TV channels; ~150 digital TV channels (SD, HD, 3D, 4K); VOOcorder & .évasion (PVR); Digital TV card; VOD; Be tv; VOOmotion & Be tv Go available for PC, tablet and smartphones. Internet: up to 400 Mbps; “unlimited” packs; Wi-Fi modems compliant with the “ac” standard; Wi-Fi homespots (Wifree). Fixed telephony. VOOmobile via an MVNO agreement. Pack taglines: Television and Internet, a simple offer, tailored to your needs; The no-frills essential experience!; A generous offer at attractive rates; The All-in Pack that makes your life easier; The Pack that is as mobile as you are
  • 45. MICROPOLE BELGIUM, PART OF MICROPOLE GROUP. 1250 business consultants & engineers around the world; 30 years of expertise in advanced analytics and BI; a team of 12 certified AWS experts; 3 years of AWS partnership. We will soon become an advanced AWS partner. Areas: Data Intelligence & Performance; Data Governance & Architecture; Machine Learning; Blockchain; Advanced Analytics in the Cloud
  • 46. HISTORY OF BI & BIG DATA AT VOO. Diagram labels: OSS; BSS • Up to 2014: typical legacy environment with multiple dedicated silo DWH/reporting environments; source systems and data management environments all hosted on a private cloud; capacity upgrades and performance tuning slow, but more or less manageable • During 2014 - 2015 - 2016: launch of mobile services and exploitation of network and set-top box data to improve customer experience and usage-based campaign management • Consequences: explosion of storage requirements; performance issues; lack of adequate tools to manage “big data” • In addition: reporting/analytical environments fragmented and unsecured
  • 47. GENESIS OF THE MEMENTO PROJECT • April-May 2017, definition: detailed study with Micropole to define a business and technical architecture, a future organization and data governance model, and an implementation roadmap • June 2017, implementation: kickoff of the Memento project with 2 project tracks: 1. Implementation of an “ARCHITECTURE” allowing the “unification of DATA” coming from different sources (internal/external) to provide the company with reliable “INFORMATION” in real time (reporting/analysis) 2. Setup of an “IT and Business ORGANISATION” and an “Operational GOVERNANCE” strengthening a culture that supports the definition and implementation of a “data strategy”
  • 48. Memento project objectives. Provide decision support data for the 6 strategic axes of VOO: Improve customer experience; Subscriber acquisition; Optimise operations; Increase the level of profitability; Increase customer retention; GDPR compliance
  • 49. WHY CHOOSE A PUBLIC CLOUD? • No upfront investments, no need for a detailed capacity plan, no long delays to order and install hardware • Ultra-fast installation and configuration of the different solution components • “Pay as you go” / “pay as you use” principle (capacity extensions only when required and payments accordingly) • Elasticity, resilience and high availability • Use of managed services offered in the cloud drastically reduces license and maintenance costs
  • 50. WHY AWS? • Market leader in public cloud (source: Gartner) • Broadest offering in PaaS • AWS has the most complete offering in managed services compared to Azure or Google Cloud Platform • Managed service definition (example for a database): it does not require installation, maintenance, patching, or licensing. Furthermore, backups are managed and high availability is integrated and requires only minimal configuration. • This is a tremendous accelerator and a considerable cost saving compared to traditional software. Examples: S3, Redshift, EMR, DynamoDB etc. • Infrastructure administration is reduced to a minimum and is done inside the EIM team
  • 51. MEMENTO ARCHITECTURE. Layers: SOURCES; STAGE; INTEGRATE; EXPOSE. Diagram labels: Stage (raw/native format); Process: batch & real-time; Store & expose; Archive; Telenet; Social media; Web logs; Sensor data; Big Data Cluster; Optional: filter or aggregate; BW; Primary access point; Power users (Data Scientists); Exploration (sandbox mode); Batch; Real Time; NRT; Extract SAP data (simplified flows); CRM 4; CRM 7; ISU; ACBIS; Numbers; Effortel; FAST; Data Lake; SERAM; Effortel; Jira; ALLOT; …; Every SAP transactional system; Predictive tools; Enterprise Data Warehouse; Reporting; Analysis; Visualisation; e.g. Churn Prediction; Controlled Sandbox; NoSQL; High frequency queries (detailed data); Batch; Mini batches (NRT); Call center application
  • 52. MEMENTO IMPLEMENTATION ON AWS
  • 53. RESULTS • A unified data platform • That enables any analytical use case • Goes beyond analytics usage • Shortens development cycles and answers business requirements faster • Scales without boundaries AND you pay for what you are using • A well-defined architecture, structures and standards allow parallel work by a large(r) team and embrace agile • Allows business teams to focus on data analysis (and not on data crunching) • GDPR compliance (privacy by design)
  • 54. NEXT STEPS • Finalise the integration of all existing enterprise data sources • Finalise remaining use cases and decommission the legacy environment • Integrate all new data sources (from new projects and products) • Extend the usage to other companies within the Nethys Group • Respond in an agile way to the more challenging and complex business use cases, including the use of Machine Learning
  • 55. THANKS FOR YOUR ATTENTION
  • 57. Demo Overview https://aws.amazon.com/blogs/big-data/harmonize-query-and-visualize-data-from-various-providers-using-aws-glue-amazon-athena-and-amazon-quicksight/
  • 59. Typical steps of building a data lake: 1. Set up storage 2. Move data 3. Cleanse, prep, and catalog data 4. Configure and enforce security and compliance policies 5. Make data available for analytics
  • 60. Building data lakes can still take months
  • 61. AWS Lake Formation (join the preview). Build, secure, and manage a data lake in days. Build a data lake in days, not months: build and deploy a fully managed data lake with a few clicks. Enforce security policies across multiple services: centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications. Combine different analytics approaches: empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog
  • 62. How it works: AWS Lake Formation. Sources: OLTP; ERP; CRM; LOB; Devices; Web; Sensors; Social; Kinesis. Foundations: S3; IAM; KMS. Build data lakes quickly • Identify, crawl, and catalog sources • Ingest and clean data • Transform into optimal formats. Simplify security management • Enforce encryption • Define access policies • Implement audit logging. Enable self-service and combined analytics • Analysts discover all data available for analysis from a single data catalog • Use multiple analytics tools over the same data. Consumers: Athena; Amazon Redshift; AI Services; Amazon EMR; Amazon QuickSight; Data Catalog
  • 63. Customer interest in AWS Lake Formation. “We are very excited about the launch of AWS Lake Formation, which provides a central point of control to easily load, clean, secure, and catalog data from thousands of clients to our AWS-based data lake, dramatically reducing our operational load. … Additionally, AWS Lake Formation will be HIPAA compliant from day one …” - Aaron Symanski, CTO, Change Healthcare. “I can’t wait for my team to get our hands on AWS Lake Formation. With an enterprise-ready option like Lake Formation, we will be able to spend more time deriving value from our data rather than doing the heavy lifting involved in manually setting up and managing our data lake.” - Joshua Couch, VP Engineering, Fender Digital
  • 65. Thank you! Javier Ramirez @supercoco9
  • 66. Select AWS Glue customers
  • 68. Demo Overview https://aws.amazon.com/blogs/big-data/harmonize-query-and-visualize-data-from-various-providers-using-aws-glue-amazon-athena-and-amazon-quicksight/