AWS Chicago User Group
Big Data Day

Have an idea for a meetup? Talk to me:
Margaret Walker, CohesiveFT
Tweet: @MargieWalker
#AWSChicago
Sponsors & Hosts
Agenda
6:00 pm Introductions
6:05 pm Short Talks
"AWS Storage Options" - Ben Blair, CTO at MarkITx, @stochastic_code
"APIs and Big Data in AWS" - Kin Lane, API Evangelist, @kinlane
"Democratizing Data Analysis with Amazon Redshift" - Bill Wanjohi (@billwanjohi) and Michelangelo D'Agostino (@MichelangeloDA), Civis Analytics
6:45 pm Q & A
7:00 pm Networking, drinks and pizza
Next Meetups:
October 15?
+ Nov 12
Let’s drink at re:Invent
Keep it Secret,
Keep it Safe
(and Fast and Available would be nice too)
Hi
Ben Blair
CTO @ MarkITx
We live on AWS
TL;DW
• Use IAM roles for access control
• Use DynamoDB for online storage &
transactions
• Use Redshift for offline storage & analysis
• Use S3 to keep *everything*
It’s hard to keep a
secret
Use IAM roles for EC2 instead
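A minimal sketch of what an EC2 role involves, so that no long-lived secret has to be stored on the instance. It only builds the two JSON policy documents: a trust policy that lets the EC2 service assume the role, and a permissions policy scoped to one bucket. The function names and the bucket name `example-bucket` are assumptions for illustration, not something from the talk.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy: allows the EC2 service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def read_only_bucket_policy(bucket: str) -> dict:
    """Permissions policy: read-only access to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",      # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket}/*",    # the objects inside it
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a hypothetical name.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(read_only_bucket_policy("example-bucket"), indent=2))
```

Code running on an instance launched with such a role receives temporary credentials automatically, so nothing secret needs to live in the code or its configuration.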
3rd normal form,
anyone?
Data duplication is OK
Optimize for each context
Interactive Data goes
in DynamoDB
If your users read or write it, and it’s not huge, it should
probably go into DynamoDB
Why DynamoDB
• Works with tests. Tests are good.
• Predictable Performance & Cost
• Low Maintenance
Why Not DynamoDB
• Vendor lock-in vs Cassandra
• Can’t add / change indexes (but that’s ok)
• Need to watch utilization
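One way to reason about "watching utilization" is to estimate capacity units and compare consumed against provisioned throughput. The sketch below assumes the commonly documented sizing rules (one read unit per strongly consistent read of an item up to 4 KB, one write unit per write of an item up to 1 KB); the 80% alert threshold and all function names are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers an item up to 4 KB
WRITE_UNIT_BYTES = 1024      # one write capacity unit covers an item up to 1 KB


def read_units(item_bytes: int, reads_per_second: int,
               strongly_consistent: bool = True) -> int:
    """Estimate read capacity units needed for a steady read rate."""
    per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_units(item_bytes: int, writes_per_second: int) -> int:
    """Estimate write capacity units needed for a steady write rate."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_second


def utilization(consumed_units: float, provisioned_units: float) -> float:
    """Fraction of provisioned throughput actually consumed."""
    if provisioned_units <= 0:
        raise ValueError("provisioned_units must be positive")
    return consumed_units / provisioned_units


def needs_attention(consumed_units: float, provisioned_units: float,
                    threshold: float = 0.8) -> bool:
    """True when utilization reaches the (hypothetical) alert threshold."""
    return utilization(consumed_units, provisioned_units) >= threshold
```

The consumed figures would come from monitoring; the point is only that utilization is a ratio worth tracking per table.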
SimpleDB
No, just no
ElastiCache
Good place to end, bad place to start
RDS
Hosted SQL Goodness
Redshift
Seriously wonderful
Redshift vs RDS
• Start with RDS
• Redshift is actually very cheap
• RDS for simple reporting on small data sets
• Redshift for all other analysis
S3
Store Everything.
You won’t, and you’ll regret it later.
EBS
Distributed Availability > Instance Recovery
Names Matter
Distributed systems care about your keyspace even
when you don’t
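A small sketch of one common way to make a keyspace friendlier to a distributed store: prepend a short hash so that sequential names (dates, counters) do not all land next to each other. Whether a particular service needs this depends on how it partitions keys; the function name and key layout here are assumptions for illustration.

```python
import hashlib


def spread_key(natural_key: str, prefix_len: int = 4) -> str:
    """Prefix a key with a short, deterministic hash of itself.

    Sequential natural keys end up with unrelated prefixes, which spreads
    them across the keyspace instead of clustering them together.
    """
    digest = hashlib.sha256(natural_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{natural_key}"


if __name__ == "__main__":
    for i in range(3):
        print(spread_key(f"2014-09-10/event-{i:06d}"))
```

The trade-off is that range scans over the natural order are no longer possible, so this fits keys that are looked up individually.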
Thanks
ben@markitx.com
@stochastic_code
github.com/markitx
"APIs and Big Data in AWS" 	

Kin Lane, API Evangelist
@kinlane
Slides available on GitHub
Democratizing Data
Analysis with Amazon
Redshift
Michelangelo D’Agostino - Civis Analytics Senior Data Scientist
Bill Wanjohi - Civis Analytics Senior Engineer
What you’ll learn
● advantages of Redshift
● some pitfalls
● workflows and recommendations on best practices
Why should you listen?
● 18 months of heavy Redshift use
● Two complementary perspectives:
The Scientist and The Engineer
Michelangelo @MichelangeloDA
Bill @billwanjohi
Life before Redshift
● collaborated on monolithic Vertica analytics database
● dozens of TB of data
● scaled from 4 to 20 server blades
● dozens of concurrent users across departments (hundreds total)
● arbitrary SQL allowed/encouraged
Our early requirements
● SQL language
● low starting cost
● easy to integrate with OSS, other DBs
● performant on large data sets
● minimal database administration
Choosing Redshift
● timing: first full release in Feb 2013
● drastically cheaper to start than other
commercial offerings
● very similar to our previous choice, HP
Vertica
● many fewer administration tasks
Basics
● RDBMS
● MPP/Columnar
○ Supports window functions
○ Few enforceable constraints
○ No concept of an index
● Redshift <= ParAccel <= PostgreSQL 8
○ Postgres drivers work
○ ORM requires mocking
● Most data I/O via S3 service
Things analytics DBs are good at
● Big aggregates
● Parallel I/O
● Merge joins between tables
Things they’re not good at
● Updates
● Retrieval of individual records
● Enforcing data quality
How’s it worked out?
Pretty good!
● adequate performance
○ big step up from traditional RDBMS
○ comparable to other analytics DBs
● easy to stand up new clusters
● cheaper clusters now available
● most workflows can live entirely in-database
● S3 is a good broker for the ones that can’t
Data Science Workflow
Our custom plumbing syncs tables from dozens
of source databases into Redshift at varying
refresh frequencies.
We’ve found that SQL just invites so many
more people to the analytics game.
Analysts and data scientists run exploratory
SQL and build up complex tables for statistical
modeling, using crazy joins, aggregates and
rollup features.
Data Science Workflow
Redshift supports powerful window functions
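As an illustration of the kind of window function mentioned on this slide, here is a running total per group. The sketch runs the query against an in-memory SQLite database purely as a local stand-in (it assumes a SQLite build with window-function support, 3.25 or later); the `sales` table and its columns are hypothetical, and Redshift itself is not involved.

```python
import sqlite3

# Running total of `amount` within each region, ordered by day.
RUNNING_TOTAL_SQL = """
SELECT region, day, amount,
       SUM(amount) OVER (
           PARTITION BY region
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM sales
ORDER BY region, day
"""


def running_totals(rows):
    """Load (region, day, amount) rows and return them with running totals."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(RUNNING_TOTAL_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("east", 1, 10), ("east", 2, 5), ("west", 1, 7)]
    for row in running_totals(sample):
        print(row)
```

The same shape of query (an aggregate with `OVER (PARTITION BY ... ORDER BY ...)` and an explicit frame) is what lets analysts compute per-group rankings and cumulative figures without self-joins.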
Predictive Modeling
Data is pulled directly from Redshift into
Python/R to train statistical models
Predictive Modeling
For simple linear models, scoring is done
directly in Redshift via SQL.
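A minimal sketch of how in-database scoring of a linear model could look: the fitted intercept and coefficients are turned into a plain SQL expression, so the database computes the score for every row. The table name, column names, and the helper `linear_score_sql` are assumptions for illustration, not the speakers' actual tooling.

```python
import re

# Only plain identifiers are accepted, so names cannot smuggle in SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name: str) -> str:
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a plain SQL identifier: {name!r}")
    return name


def linear_score_sql(table: str, intercept: float, coefficients: dict,
                     id_column: str = "id") -> str:
    """Build a SELECT that computes intercept + sum(weight * column)."""
    terms = [repr(float(intercept))]
    for column, weight in coefficients.items():
        terms.append(f"({float(weight)!r} * {_check_identifier(column)})")
    expression = " + ".join(terms)
    return (
        f"SELECT {_check_identifier(id_column)}, {expression} AS score "
        f"FROM {_check_identifier(table)}"
    )


if __name__ == "__main__":
    # Hypothetical model with two features.
    print(linear_score_sql("people", 0.5, {"age": 0.01, "visits": -0.2}))
```

Because the model is just a weighted sum, no data has to leave the database to score it.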
For more complicated models, data is pulled
from Redshift to S3 with an UNLOAD SQL
command, processed in EMR, and loaded back
into Redshift with a COPY command.
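The S3 round trip described above could be sketched as two SQL statements. In Redshift, `UNLOAD` writes query results out to S3 and `COPY` loads files from S3 into a table. The helpers below only build the statement text; the bucket paths, role ARN, table names, and the choice of `IAM_ROLE` authorization with `GZIP` are assumptions for illustration.

```python
def unload_sql(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build an UNLOAD statement that exports query results to S3.

    The query is passed as a quoted string, so single quotes inside it
    are doubled.
    """
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' GZIP"
    )


def copy_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a COPY statement that loads gzipped files from S3 into a table."""
    return f"COPY {table} FROM '{s3_prefix}' IAM_ROLE '{iam_role}' GZIP"


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRole"  # hypothetical ARN
    print(unload_sql("SELECT id, age FROM people WHERE state = 'IL'",
                     "s3://example-bucket/features/", role))
    print(copy_sql("scores", "s3://example-bucket/scores/", role))
```

Whatever runs in between (EMR in the speakers' case) reads the exported files from the first prefix and writes its results under the second.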
Hurdles we’ve faced along
the way
● inconsistent runtimes
● catalog contention
● bugs (databases are hard)
● resizing
● too easy to end up with uncompressed data
● “missing” PostgreSQL functionality
● complex workload management
Setup Recommendations
● at least two nodes
● send 35-day snapshots to other regions
● at-rest encryption
● enforce SSL
● provision with boto or AWS CLI
● cluster isolation to hide objects
● buy 3-year reservations
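Several of the recommendations above map onto provisioning parameters. The sketch below only builds the argument dictionaries; the key names follow the boto3 Redshift client's `create_cluster`, `enable_snapshot_copy`, and `modify_cluster_parameter_group` calls as I understand them, and boto3 (rather than the older boto library named on the slide) is an assumption. Cluster names, node type, and regions are hypothetical, and no AWS call is made.

```python
SNAPSHOT_RETENTION_DAYS = 35  # the retention period recommended on the slide


def recommended_cluster_settings(cluster_id: str, node_type: str,
                                 master_username: str, master_password: str,
                                 node_count: int = 2) -> dict:
    """Arguments for creating an encrypted, multi-node cluster."""
    if node_count < 2:
        raise ValueError("the recommendation is at least two nodes")
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "multi-node",
        "NodeType": node_type,
        "NumberOfNodes": node_count,
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "Encrypted": True,  # at-rest encryption
        "AutomatedSnapshotRetentionPeriod": SNAPSHOT_RETENTION_DAYS,
    }


def snapshot_copy_settings(cluster_id: str, destination_region: str) -> dict:
    """Arguments for copying automated snapshots to another region."""
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        "RetentionPeriod": SNAPSHOT_RETENTION_DAYS,
    }


# Parameter-group entry that makes the cluster refuse non-SSL connections.
REQUIRE_SSL_PARAMETER = {"ParameterName": "require_ssl", "ParameterValue": "true"}
```

With boto3 installed and credentials configured, these dictionaries would be passed as keyword arguments, for example `client.create_cluster(**settings)`.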
We’re Hiring!
Through research, experimentation, and iteration, we’re
transforming how organizations do analytics. Our clients
range in scale and focus from local to international, all
empowered by our individual-level, data-driven approach.
civisanalytics.com/apply

MARK GAMBLE_ASC For Really Remote Edge Computing - AWS Community Day Chicago ...AWS Chicago
 
MichaelSoule-UsingJupyterNotebooks.pptx
MichaelSoule-UsingJupyterNotebooks.pptxMichaelSoule-UsingJupyterNotebooks.pptx
MichaelSoule-UsingJupyterNotebooks.pptxAWS Chicago
 
Michal Brygidyn_CloudHackingScenarios.pdf
Michal Brygidyn_CloudHackingScenarios.pdfMichal Brygidyn_CloudHackingScenarios.pdf
Michal Brygidyn_CloudHackingScenarios.pdfAWS Chicago
 
Kamil Kolodziejski_Structura-AWS.pptx
Kamil Kolodziejski_Structura-AWS.pptxKamil Kolodziejski_Structura-AWS.pptx
Kamil Kolodziejski_Structura-AWS.pptxAWS Chicago
 
John Merline AWS Certification FAQ.pptx
John Merline AWS Certification FAQ.pptxJohn Merline AWS Certification FAQ.pptx
John Merline AWS Certification FAQ.pptxAWS Chicago
 
JuliaFMorgado_Breaking_bad_habits.pptx
JuliaFMorgado_Breaking_bad_habits.pptxJuliaFMorgado_Breaking_bad_habits.pptx
JuliaFMorgado_Breaking_bad_habits.pptxAWS Chicago
 

Mais de AWS Chicago (20)

AWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user group
 
Chicago AWS Solutions Architect Mehdy Haghy recaps the new AI/ML releases and...
Chicago AWS Solutions Architect Mehdy Haghy recaps the new AI/ML releases and...Chicago AWS Solutions Architect Mehdy Haghy recaps the new AI/ML releases and...
Chicago AWS Solutions Architect Mehdy Haghy recaps the new AI/ML releases and...
 
WilliamCollins_Road-to-Transit-Gateway.pptx
WilliamCollins_Road-to-Transit-Gateway.pptxWilliamCollins_Road-to-Transit-Gateway.pptx
WilliamCollins_Road-to-Transit-Gateway.pptx
 
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdfSuresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
 
Streamlined Entitlements with AWS Lake Formation - Anusha Dwivedula
Streamlined Entitlements with AWS Lake Formation - Anusha DwivedulaStreamlined Entitlements with AWS Lake Formation - Anusha Dwivedula
Streamlined Entitlements with AWS Lake Formation - Anusha Dwivedula
 
Steve Seaney_AWS Control Tower - 2023 Midwest Community Day - Final.pptx
Steve Seaney_AWS Control Tower - 2023 Midwest Community Day - Final.pptxSteve Seaney_AWS Control Tower - 2023 Midwest Community Day - Final.pptx
Steve Seaney_AWS Control Tower - 2023 Midwest Community Day - Final.pptx
 
Saurabh_Shanbhag - Building_SaaS_on_AWS.pptx
Saurabh_Shanbhag - Building_SaaS_on_AWS.pptxSaurabh_Shanbhag - Building_SaaS_on_AWS.pptx
Saurabh_Shanbhag - Building_SaaS_on_AWS.pptx
 
Sanket_Nasre_Simplify Modernization.pdf
Sanket_Nasre_Simplify Modernization.pdfSanket_Nasre_Simplify Modernization.pdf
Sanket_Nasre_Simplify Modernization.pdf
 
Ross Stuart_Using ML to Solve Lifes Problems.pptx
Ross Stuart_Using ML to Solve Lifes Problems.pptxRoss Stuart_Using ML to Solve Lifes Problems.pptx
Ross Stuart_Using ML to Solve Lifes Problems.pptx
 
robsable_Enhancing DevOps Practices with CloudWatch APM FINAL.pdf
robsable_Enhancing DevOps Practices with CloudWatch APM FINAL.pdfrobsable_Enhancing DevOps Practices with CloudWatch APM FINAL.pdf
robsable_Enhancing DevOps Practices with CloudWatch APM FINAL.pdf
 
Sanket_Nasre_Simplify Modernization.pdf
Sanket_Nasre_Simplify Modernization.pdfSanket_Nasre_Simplify Modernization.pdf
Sanket_Nasre_Simplify Modernization.pdf
 
Mohamed Wali_AWS Security Reference Architecture.pptx
Mohamed Wali_AWS Security Reference Architecture.pptxMohamed Wali_AWS Security Reference Architecture.pptx
Mohamed Wali_AWS Security Reference Architecture.pptx
 
Nick-Walter-HOB_Migrating_Dinosaurs.pptx
Nick-Walter-HOB_Migrating_Dinosaurs.pptxNick-Walter-HOB_Migrating_Dinosaurs.pptx
Nick-Walter-HOB_Migrating_Dinosaurs.pptx
 
Pat_Davies_AWSCostOptimization_Final.pdf
Pat_Davies_AWSCostOptimization_Final.pdfPat_Davies_AWSCostOptimization_Final.pdf
Pat_Davies_AWSCostOptimization_Final.pdf
 
MARK GAMBLE_ASC For Really Remote Edge Computing - AWS Community Day Chicago ...
MARK GAMBLE_ASC For Really Remote Edge Computing - AWS Community Day Chicago ...MARK GAMBLE_ASC For Really Remote Edge Computing - AWS Community Day Chicago ...
MARK GAMBLE_ASC For Really Remote Edge Computing - AWS Community Day Chicago ...
 
MichaelSoule-UsingJupyterNotebooks.pptx
MichaelSoule-UsingJupyterNotebooks.pptxMichaelSoule-UsingJupyterNotebooks.pptx
MichaelSoule-UsingJupyterNotebooks.pptx
 
Michal Brygidyn_CloudHackingScenarios.pdf
Michal Brygidyn_CloudHackingScenarios.pdfMichal Brygidyn_CloudHackingScenarios.pdf
Michal Brygidyn_CloudHackingScenarios.pdf
 
Kamil Kolodziejski_Structura-AWS.pptx
Kamil Kolodziejski_Structura-AWS.pptxKamil Kolodziejski_Structura-AWS.pptx
Kamil Kolodziejski_Structura-AWS.pptx
 
John Merline AWS Certification FAQ.pptx
John Merline AWS Certification FAQ.pptxJohn Merline AWS Certification FAQ.pptx
John Merline AWS Certification FAQ.pptx
 
JuliaFMorgado_Breaking_bad_habits.pptx
JuliaFMorgado_Breaking_bad_habits.pptxJuliaFMorgado_Breaking_bad_habits.pptx
JuliaFMorgado_Breaking_bad_habits.pptx
 

Último

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Último (20)

What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

AWSChicagoUserGroupBigDataDay

  • 1. AWS Chicago User Group: Big Data Day
  • 2. Have an idea for a meetup? Talk to me: Margaret Walker, CohesiveFT. Tweet: @MargieWalker #AWSChicago. Sponsors & Hosts
  • 3. Agenda: 6:00 pm Introductions. 6:05 pm Short Talks: "AWS Storage Options", Ben Blair, CTO at MarkITx (@stochastic_code); "APIs and Big Data in AWS", Kin Lane, API Evangelist (@kinlane); "Democratizing Data Analysis with Amazon Redshift", Bill Wanjohi (@billwanjohi) and Michelangelo D'Agostino (@MichelangeloDA), Civis Analytics. 6:45 pm Q & A. 7:00 pm Networking, drinks and pizza
  • 4. Next Meetups: October 15? + Nov 12. Let’s drink at re:Invent
  • 5. Keep it Secret, Keep it Safe (and Fast and Available would be nice too)
  • 6. Hi Ben Blair CTO @ MarkITx We live on AWS
  • 7. TL;DW • Use IAM roles for access control • Use DynamoDB for online storage & transactions • Use Redshift for offline storage & analysis • Use S3 to keep *everything*
  • 8. It’s hard to keep a secret. Use IAM EC2 roles instead
  • 9. 3rd normal form, anyone? Data duplication is OK Optimize for each context
  • 10. Interactive Data goes in DynamoDB If your users read or write it, and it’s not huge, it should probably go into DynamoDB
  • 11. Why DynamoDB • Works with tests. Tests are good. • Predictable Performance & Cost • Low Maintenance
  • 12. Why Not DynamoDB • Vendor lock-in vs Cassandra • Can’t add / change indexes (but that’s ok) • Need to watch utilization
  • 14. ElastiCache Good place to end, bad place to start
  • 17. Redshift vs RDS • Start with RDS • Redshift is actually very cheap • RDS for simple reporting on small data sets • Redshift for all other analysis
  • 18. S3 Store Everything. You won’t, and you’ll regret it later.
  • 19. EBS Distributed Availability > Instance Recovery
  • 20. Names Matter Distributed systems care about your keyspace even when you don’t
  • 22. "APIs and Big Data in AWS", Kin Lane, API Evangelist (@kinlane). Slides on GitHub
  • 23. Democratizing Data Analysis with Amazon Redshift Michelangelo D’Agostino - Civis Analytics Senior Data Scientist Bill Wanjohi - Civis Analytics Senior Engineer
  • 24. What you’ll learn ● advantages of Redshift ● some pitfalls ● workflows and recommendations on best practices
  • 25. Why should you listen? ● 18 months of heavy Redshift use ● Two complementary perspectives: The Scientist and The Engineer
  • 28. Life before Redshift ● collaborated on monolithic Vertica analytics database ● dozens of TB of data ● scaled from 4 to 20 server blades ● dozens of concurrent users across departments (hundreds total) ● arbitrary SQL allowed/encouraged
  • 29. Our early requirements ● SQL language ● low starting cost ● easy to integrate with OSS, other DBs ● performant on large data sets ● minimal database administration
  • 30. Choosing Redshift ● timing: first full release in Feb 2013 ● drastically cheaper to start than other commercial offerings ● very similar to our previous choice, HP Vertica ● many fewer administration tasks
  • 31. Basics ● RDBMS ● MPP/Columnar Supports window functions Few enforceable constraints No concept of an index ● Redshift <= ParAccel <= PostgreSQL 8 Postgres drivers work ORM requires mocking ● Most data I/O via S3 service
  • 32. Things analytics DBs are good at ● Big aggregates ● Parallel I/O ● Merge joins between tables
  • 33. Things they’re not good at ● Updates ● Retrieval of individual records ● Enforcing data quality
  • 34. How’s it worked out? Pretty good! ● adequate performance ○ big step up from traditional RDBMS ○ comparable to other analytics DBs ● easy to stand up new clusters ● cheaper clusters now available ● most workflows can live entirely in-database ● S3 is a good broker for what can’t
  • 35. Data Science Workflow Our custom plumbing syncs tables from dozens of source databases into Redshift at varying refresh frequencies.
  • 36. Data Science Workflow We’ve found that SQL invites so many more people to the analytics game. Analysts and data scientists run exploratory SQL and build up complex tables for statistical modeling, using crazy joins, aggregates, and rollup features. Redshift supports powerful window functions.
  • 37. Predictive Modeling Data is pulled directly from Redshift into Python/R to train statistical models
  • 38. Predictive Modeling For simple linear models, scoring is done directly in Redshift via SQL. For more complicated models, data is exported from Redshift to S3 with an UNLOAD SQL command, processed in EMR, and loaded back into Redshift with a COPY command.
  • 39. Hurdles we’ve faced along the way ● inconsistent runtimes ● catalog contention ● bugs (databases are hard) ● resizing ● too easy to end up with uncompressed data ● “missing” PostgreSQL functionality ● complex workload management
  • 40. Setup Recommendations ● at least two nodes ● send 35-day snapshots to other regions ● at-rest encryption ● enforce SSL ● provision with boto or AWS CLI ● cluster isolation to hide objects ● buy 3-year reservations
  • 41. We’re Hiring! Through research, experimentation, and iteration, we’re transforming how organizations do analytics. Our clients range in scale and focus from local to international, all empowered by our individual-level, data-driven approach. civisanalytics.com/apply
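
Slide 38 mentions scoring simple linear models directly in SQL. A minimal sketch of that idea follows, assuming an in-memory SQLite database as a local stand-in for Redshift. The table `people`, its columns, the coefficients, and the helper `linear_score_sql` are all hypothetical and chosen only for illustration; this is not the speakers' actual code.

```python
import sqlite3


def linear_score_sql(table, intercept, coefficients):
    """Build a SELECT that scores every row with a linear model.

    `coefficients` maps column names to weights. Table and column names
    are interpolated directly, so they must come from trusted code.
    """
    terms = [repr(float(intercept))]
    for column, weight in coefficients.items():
        terms.append(f"{float(weight)!r} * {column}")
    expression = " + ".join(terms)
    return f"SELECT id, {expression} AS score FROM {table} ORDER BY id"


# Hypothetical data: SQLite stands in for the analytics database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, age REAL, visits REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, 30.0, 2.0), (2, 50.0, 0.0)],
)

# Hypothetical model: score = 1.0 + 0.5 * age + 2.0 * visits
query = linear_score_sql("people", 1.0, {"age": 0.5, "visits": 2.0})
scores = conn.execute(query).fetchall()
print(query)
print(scores)
```

Because the model is only a weighted sum of columns, the scoring runs where the data lives and nothing has to leave the database. Models that do not reduce to a SQL expression would follow the export, process, and reload route the slide describes.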