Data without Limits
Dr. Werner Vogels
CTO, Amazon.com
I. Science
Observations – Theory – Models – Facts
Human Genome Project
Collaborative project to sequence every single letter of the human genetic code.
13 years and $billions to complete.
Gigabyte scale datasets (transferred between sites on iPods!)
Beyond the Human Genome
45+ species sequenced: mouse, rat, gorilla, rabbit, platypus, nematode, zebra fish...
Compare genomes between species to identify biologically interesting areas of the genome.
100Gb scale datasets. Increased computational requirements.
The Next Generation
New sequencing instruments lead to a dramatic drop in cost and time required to sequence a genome.
Sequence and compare genetic code of individuals to find areas of variation. Much more interesting.
Terabyte scale datasets. Significant computational requirements.
The 1000 Genomes Project
Public/private consortium to build the world’s largest collection of human genetic variation.
Hugely important dataset to drive new insight into known genetic traits, and the identification of new ones.
Vast, complex data and computational resources required, beyond reach of most research groups and hospitals.
1000 Genomes in the Cloud
The 1000 Genomes data made available to all on AWS.
Stored for free as part of the Public Datasets program.
Updated regularly.
200Tb. 1700 individual genomes. As much compute and storage as required available to all.
II. Consumer
UNCERTAINTY
UNDERSTAND YOUR CUSTOMER
Who is my customer really?
What do people really like?
What is happening socially with my products?
Where do people consume my product?
How do people really use your product?
PERSONALIZE
75% of users select movies based on recommendations
A/B TESTING
BIGGER IS BETTER
Wego
•  Search using flexible dates AND/OR locations and themes
–  FROM Singapore TO Beach FOR A Weekend Trip (theme location + flexible date)
–  FROM Singapore TO Paris FOR A Whole-week Vacation (specific destination + flexible date)
–  FROM Singapore TO Sydney IN Next Two Months (specific destination + flexible date)
–  FROM Singapore TO Family-friendly Destination ON 30-Apr to 05-May (theme location + fixed dates)
•  Need for robust caching mechanism with millions of flight searches with 10 million+ different flight routes
•  Use the AWS cloud to rapidly spin up machines to scale to the requirements
•  AWS allows them to do this in a scalable and cost-effective manner
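The caching requirement in the Wego bullets can be illustrated with a minimal sketch. It assumes that a flexible search is normalized into a hashable key (origin, destination or theme, date specification) and that cached results expire after a fixed time-to-live. `SearchKey`, `FlightSearchCache`, and the injected clock are hypothetical names for illustration, not Wego's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SearchKey:
    """Normalized flight search; all field names are hypothetical."""
    origin: str        # e.g. "Singapore"
    destination: str   # a city ("Paris") or a theme ("Beach")
    dates: str         # a fixed range or a flexible spec ("Weekend Trip")

    @classmethod
    def normalize(cls, origin: str, destination: str, dates: str) -> "SearchKey":
        # Case and whitespace differences should not cause cache misses.
        return cls(*(part.strip().lower() for part in (origin, destination, dates)))


class FlightSearchCache:
    """In-memory cache with a time-to-live, standing in for a real cache tier."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float]):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[SearchKey, Tuple[float, List[str]]] = {}

    def get_or_compute(self, key: SearchKey,
                       search: Callable[[SearchKey], List[str]]) -> List[str]:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # fresh entry: skip the expensive search
        results = search(key)          # miss or expired: run the search
        self._entries[key] = (now, results)
        return results
```

Passing the clock in as a parameter (for example `time.monotonic`) keeps expiry behaviour easy to test. A production system would more likely use a shared cache service than a per-process dictionary.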
  	
  
Wego – Search
Dropcam is the biggest inbound video service on the Web
•  More data uploaded per minute than YouTube
•  Petabytes of data processed every month
•  Billions of motion events detected
III. Industrial
IV. Sports
V. Startups
Experiment
Measure
Iterate or Pivot
The only Asian company which made it to the CODE_n finalist list for CeBIT 2014
Platform Architecture
Archival (Glacier)
Storage (S3)
Crawl Cluster (EC2)
File Server (EC2)
Processing Cluster (EC2)
Choice Engine Cluster (EC2)
Data Partners
End user interaction/Front End
On AWS
External to AWS
Integration Engine
Data Acquisition
Lenddo’s Journey
•  Process about 3.5TB of social data
•  Social data growing with more users
•  Started with a MongoDB cluster on CR1 instance types on AWS, spending 10K USD/month
•  Re-architected to move all their data to S3 and keep caches in smaller MongoDB and DynamoDB clusters. Use EMR to process data
•  Now spending 3K/month
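As a rough worked example of the spending figures in the Lenddo bullets, the saving can be computed as follows. This assumes both amounts are in USD per month and stay constant (the slide states a currency only for the first figure). The function name is illustrative.

```python
def monthly_saving(before_usd: float, after_usd: float) -> tuple:
    """Return (absolute saving per month, saving as a fraction of the old bill)."""
    saving = before_usd - after_usd
    return saving, saving / before_usd


saving, fraction = monthly_saving(10_000, 3_000)
print(f"${saving:,.0f} per month saved ({fraction:.0%} lower)")
print(f"${saving * 12:,.0f} per year at a constant rate")
```

With the slide's figures this is a saving of 7K USD per month, or 70% of the original bill.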
  	
  
VI. The Pipeline
The amount of information generated during the first day of a baby’s life today is equivalent to 70 times the information contained in the Library of Congress
MULTIPLE DOMAINS
Time
Properties
Locations
Sensors
COLLECT | STORE | ORGANIZE | ANALYZE | SHARE
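The five stages can be read as a chain in which each stage consumes the previous stage's output. The following is a minimal in-memory sketch of that idea. Every function, the sample readings, and the dictionary used as storage are hypothetical stand-ins for real services, not something shown in the talk.

```python
from collections import defaultdict
from statistics import mean


def collect():
    # Hypothetical sensor readings: (sensor id, value).
    return [("s1", 20.5), ("s2", 19.0), ("s1", 21.5)]


def store(records, storage):
    # Append raw records to storage (a plain dict of lists here).
    storage.setdefault("raw", []).extend(records)
    return storage["raw"]


def organize(records):
    # Group values by sensor so they can be analyzed per source.
    grouped = defaultdict(list)
    for sensor, value in records:
        grouped[sensor].append(value)
    return dict(grouped)


def analyze(grouped):
    # Reduce each group to a single summary statistic.
    return {sensor: mean(values) for sensor, values in grouped.items()}


def share(summary):
    # Render results in a form other people or systems can consume.
    return [f"{sensor}: {avg:.1f}" for sensor, avg in sorted(summary.items())]


def run_pipeline(storage):
    return share(analyze(organize(store(collect(), storage))))
```

Keeping each stage as a separate function means any one of them could be swapped for a managed service without changing the others.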
  
VII. Real-time
What was happening yesterday?
What trades are executing right now?
What is the exception rate right now?
What is the ad click-through right now?
What topics are trending right now?
What inventory remains right now?
What queries are slow right now?
What are the high scores right now?
Kinesis architecture
Amazon Web Services
AZ AZ AZ
Durable, highly consistent storage replicates data across three data centers (availability zones)
Aggregate and archive to S3
Millions of sources producing 100s of terabytes per hour
Front End
Authentication
Authorization
Ordered stream of events supports multiple readers
Real-time dashboards and alarms
Machine learning algorithms or sliding window analytics
Aggregate analysis in Hadoop or a data warehouse
Inexpensive: $0.028 per million puts
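To illustrate what sliding window analytics over an ordered stream of events might look like, here is a minimal sketch. It assumes events arrive in timestamp order and counts those that fall within the most recent `window_seconds`. The class name and event shape are hypothetical, and the sketch does not use the Kinesis API.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events seen in the most recent `window_seconds` of an ordered stream."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return the number of events still in the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Because the stream is ordered, expired events are always at the left.
        while self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window_seconds=60)
counts = [counter.add(t) for t in (0, 10, 30, 65, 100)]
```

The same structure could feed a dashboard or alarm by comparing each returned count against a threshold.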
AWS Internal Metering Service
Capture Submissions
Process in Realtime
Store in Redshift
Clients Submitting Data
Workload
•  Tens of millions of records/sec
•  Multiple TB per hour
•  100,000s of sources
New features
•  Scale with the business
•  Provide real-time alerting
•  Inexpensive
•  Improved auditing
Workload
•  Daily load of billions of records from millions of files from hundreds of sources
•  3 hour SLA to load and audit data
•  Hundreds of customers
•  Hundreds of queries per hour
New features
•  Our data is fresh, we ingest every 6 hours
•  Now processing triple the volume in less than 25% of the time
•  “Hammerstone” ETL solution
–  Built on AWS Data Pipeline
–  Build business specific marts
–  Build workload specific clusters
•  Supports a variety of analytics tools: Tableau, R, Toad, SQL Developer, etc.
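The bullet about processing triple the volume in less than 25% of the time implies a lower bound on the throughput gain. A short worked check, taking 25% as the upper bound on the time ratio (variable names are illustrative):

```python
from fractions import Fraction

volume_ratio = Fraction(3)        # "triple the volume"
time_ratio = Fraction(25, 100)    # "less than 25% of the time" (upper bound)

# Throughput = volume / time, so the ratio of throughputs is:
min_speedup = volume_ratio / time_ratio
print(f"Throughput improved by more than {min_speedup}x")
```

In other words, data moves through the warehouse more than twelve times faster than before.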
  
Internal AWS Data Warehouse
Over 200 internal data sources
Data staged in Amazon S3
"Hammerstone:" Custom ETL using AWS Data Pipeline
Data processing Redshift cluster
Batch reporting Redshift cluster
Ad hoc query Redshift cluster
Big Science & Big Data Verticals
Media/Advertising: Targeted Advertising; Image and Video Processing
Oil & Gas: Seismic Analysis
Retail: Recommendations; Transaction Analysis
Life Sciences: Genome Analysis
Financial Services: Monte Carlo Simulations; Risk Analysis
Security: Anti-virus; Fraud Detection; Image Recognition
Social Network/Gaming: User Demographics; Usage analysis; In-game metrics
BIG-DATA REQUIRES NO LIMITS
Cloud enables big data collection
Cloud enables big data processing
Cloud enables big data collaboration
werner@amazon.com

AWS Enterprise Day | Closing Keynote - Data Without Limits, Dr Werner Vogels

  • 1. Data without Limits! Dr. Werner Vogels! CTO, Amazon.com!
  • 3.
  • 4. Observations – Theory – Models - Facts!
  • 5. Human Genome Project! Collaborative project to sequence every single letter! of the human genetic code.! 13 years and $billions to complete.! Gigabyte scale datasets (transferred between sites on! iPods!)!
  • 6. Beyond the Human Genome! 45+ species sequenced: mouse, rat, gorilla, rabbit, ! platypus, nematode, zebra fish...! Compare genomes between species to identify! biologically interesting areas of the genome.! 100Gb scale datasets. Increased computational requirements.!
  • 7. The Next Generation! New sequencing instruments lead to a dramatic! drop in cost and time required to sequence a genome.! Sequence and compare genetic code of individuals to! find areas of variation. Much more interesting.! Terabyte scale datasets. Significant computational requirements.!
  • 8. The 1000 Genomes Projects! Public/private consortium to build world’s largest! collection of human genetic variation.! Hugely important dataset to drive new insight into! known genetic traits, and the identification of new ones.! Vast, complex data and computational resources required, beyond reach of most research groups and hospitals.!
  • 9. 1000 Genomes in the Cloud! The 1000 Genomes data made available to all on AWS.! Stored for free as part of the Public Datasets program.! Updated regularly.! 200Tb. 1700 individual genomes. As much compute and storage as required available to all.!
  • 10.
  • 11.
  • 12.
  • 13.
  • 17. Who  is  my  customer  really?       What  do  people  really  like?     What  is  happening  socially  with  my  products?     Where  do  people  consume  my  product?   How  do  people  really  use  your  product?    
  • 19. 75% of users select! movies based on! recommendations!
  • 22.
  • 23. Wego   •  Search  using  Flexible  dates  AND/OR  LocaBons  and  Themes   –  FROM  Singapore  TO  Beach  FOR  A  Weekend  Trip  (theme  locaBon  +  flexible  date)   –  FROM  Singapore  TO  Paris  FOR  A  Whole-­‐week  VacaBon  (specific  desBnaBon  +  flexible   date)   –  FROM  Singapore  TO  Sydney  IN  Next  Two  Months  (specific  desBnaBon  +  flexible  date)   –  FROM  Singapore  TO  Family-­‐friendly  DesBnaBon  ON  30-­‐Apr  to  05-­‐May  (theme  locaBon   +  fixed  dates)   •  Need  for  robust  caching  mechanism  with  millions  of  flight  searches  with   10Million  +  different  flight  routes     •  Use  the  AWS  cloud  to  rapidly  spin  up  machines  to  scale  to  the  requirements   •  AWS  allows  them  to  do  this  in  a  scalable  and  cost  effecBve  manner    
  • 29. Dropcam is the biggest inbound video service on the Web • More data uploaded per minute than YouTube • Petabytes of data processed every month • Billions of motion events detected
  • 52. The only Asian company which made it to the CODE_n finalist list for CeBIT 2014
  • 53. Platform Architecture: Archival (Glacier) | Storage (S3) | Crawl Cluster (EC2) | File Server (EC2) | Processing Cluster (EC2) | Choice Engine Cluster (EC2) | Data Partners | End user interaction/Front End | On AWS | External to AWS | Integration Engine | Data Acquisition
  • 55. Lenddo's Journey • Process about 3.5 TB of social data • Social data growing with more users • Started with a MongoDB cluster on CR1 instance types on AWS, spending 10K USD/month • Re-architected to move all their data to S3 and keep caches in smaller MongoDB and DynamoDB clusters. Use EMR to process data • Now spending 3K USD/month
  • 57. The amount of information generated during the first day of a baby's life today is equivalent to 70 times the information contained in the Library of Congress
  • 59. COLLECT | STORE | ORGANIZE | ANALYZE | SHARE
  • 66. What was happening yesterday?
  • 67. What ... right now? trades are executing | is the exception rate | is the ad click-through | topics are trending | inventory remains | queries are slow | are the high scores
  • 69. Kinesis architecture (Amazon Web Services; AZ, AZ, AZ) • Millions of sources producing 100s of terabytes per hour • Front End: Authentication, Authorization • Durable, highly consistent storage replicates data across three data centers (availability zones) • Ordered stream of events supports multiple readers • Aggregate and archive to S3 • Real-time dashboards and alarms • Machine learning algorithms or sliding window analytics • Aggregate analysis in Hadoop or a data warehouse • Inexpensive: $0.028 per million puts
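Two items on the Kinesis slide lend themselves to a short illustration: the quoted price of $0.028 per million puts, and sliding window analytics over an ordered stream of events. The sketch below is plain Python and does not call any Kinesis API; `put_cost_usd` and `SlidingWindowCounter` are hypothetical names, and the window logic assumes events arrive in timestamp order, as the slide's "ordered stream" suggests.

```python
from collections import deque

# Price quoted on the slide; current pricing may differ.
PRICE_PER_MILLION_PUTS_USD = 0.028


def put_cost_usd(num_puts: int) -> float:
    """Estimate the put cost for a number of records at the slide's price."""
    return num_puts / 1_000_000 * PRICE_PER_MILLION_PUTS_USD


class SlidingWindowCounter:
    """Count events seen in the last window_seconds of an ordered stream."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return the number of events in the window."""
        self._timestamps.append(timestamp)
        # Evict events that are window_seconds old or older.
        cutoff = timestamp - self.window
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)
```

At the slide's price, a billion puts would cost roughly 28 USD. A reader feeding event timestamps into `SlidingWindowCounter.add` could raise an alarm whenever the returned count crosses a threshold, which is the kind of real-time dashboard or alerting use the slide lists.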
  • 70. AWS Internal Metering Service • Flow: Clients Submitting Data | Capture Submissions | Process in Realtime | Store in Redshift • Workload: tens of millions of records/sec; multiple TB per hour; 100,000s of sources • New features: scale with the business; provide real-time alerting; inexpensive; improved auditing
  • 71. Internal AWS Data Warehouse • Workload: daily load of billions of records from millions of files from hundreds of sources; 3-hour SLA to load and audit data; hundreds of customers; hundreds of queries per hour • New features: our data is fresh, we ingest every 6 hours; now processing triple the volume in less than 25% of the time; "Hammerstone" ETL solution (built on AWS Data Pipeline; build business-specific marts; build workload-specific clusters); supports a variety of analytics tools: Tableau, R, Toad, SQL Developer, etc. • Diagram labels: over 200 internal data sources; data staged in Amazon S3; "Hammerstone" custom ETL using AWS Data Pipeline; data processing Redshift cluster; batch reporting Redshift cluster; ad hoc query Redshift cluster
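The claim "triple the volume in less than 25% of the time" implies a throughput gain that is easy to work out. The helper below is a minimal sketch of that arithmetic; `throughput_gain` and `loads_per_day` are hypothetical names, and the inputs are the round figures from the slide rather than measured values.

```python
def throughput_gain(volume_factor: float, time_fraction: float) -> float:
    """Return how many times faster data is processed per unit of time.

    volume_factor: new volume divided by old volume (3 for "triple").
    time_fraction: new elapsed time divided by old elapsed time (0.25 for 25%).
    """
    if time_fraction <= 0:
        raise ValueError("time_fraction must be positive")
    return volume_factor / time_fraction


def loads_per_day(ingest_interval_hours: float) -> float:
    """Number of ingests per day for a fixed ingest interval."""
    if ingest_interval_hours <= 0:
        raise ValueError("ingest_interval_hours must be positive")
    return 24 / ingest_interval_hours


# Slide figures: 3x the volume in (at most) 25% of the time, ingest every 6 hours.
gain = throughput_gain(3, 0.25)
daily_loads = loads_per_day(6)
```

With the slide's numbers the gain is 12, and since the slide says "less than 25%" that is a lower bound: the warehouse processes at least twelve times as much data per hour as before. Ingesting every 6 hours corresponds to 4 loads per day.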
  • 72. Big Science & Big Data Verticals • Media/Advertising: Targeted Advertising, Image and Video Processing • Oil & Gas: Seismic Analysis • Retail: Recommendations, Transaction Analysis • Life Sciences: Genome Analysis • Financial Services: Monte Carlo Simulations, Risk Analysis • Security: Anti-virus, Fraud Detection, Image Recognition • Social Network/Gaming: User Demographics, Usage Analysis, In-game Metrics
  • 74. Cloud enables big data collection
  • 75. Cloud enables big data processing
  • 76. Cloud enables big data collaboration