© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Modernize your data warehouse with
Amazon Redshift
Harshida Patel
DW Specialist SA
AWS
ADB305
Agenda
Startup:
1. Log in to the AWS Management Console (using your account and credits, feel free to work
with a neighbor)
2. Switch to the Oregon region (us-west-2)
3. Create an IAM role for Amazon Redshift Spectrum
4. Create an Amazon Redshift cluster and associate the IAM role
5. Update the security group to allow Amazon Redshift
Refresher on Amazon Redshift
Workshop time
https://tinyurl.com/y33amykm
1. Log in to the console (using your account and credits, or the workshop account)
2. Switch to the Oregon Region (us-west-2)
3. Create an IAM role for Amazon Redshift Spectrum
4. Create an Amazon Redshift cluster and associate the IAM role
5. Update the security group to allow Amazon Redshift
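For step 3, the IAM role has to be one that the Amazon Redshift service is allowed to assume. A minimal sketch of the trust policy document for such a role follows. The function name `redshift_trust_policy` is illustrative. The permissions the role also needs for Redshift Spectrum (for example, read access to the S3 data) are not shown and depend on the workshop setup.

```python
import json


def redshift_trust_policy() -> dict:
    """Build a trust policy that lets the Amazon Redshift service assume the role.

    This covers only the trust relationship; permission policies for
    Amazon S3 or the data catalog would be attached separately.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the role's trust relationship.
    print(json.dumps(redshift_trust_policy(), indent=2))
```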
Amazon Redshift architecture
Massively parallel, shared nothing columnar architecture
Leader node
• SQL endpoint
• Stores metadata
• Coordinates parallel SQL processing
Compute nodes
• Local, columnar storage
• Executes queries in parallel
• Load, unload, backup, restore
Amazon Redshift Spectrum nodes
• Execute queries directly against Amazon Simple Storage Service (Amazon S3)
[Figure: Amazon Redshift architecture. Labels: SQL clients/BI tools; JDBC/ODBC; leader node; compute nodes; node specs shown as 128 GB RAM, 16 TB disk, 16 cores; Amazon Redshift Spectrum nodes 1, 2, 3, 4, ... N; Amazon S3; Load, Unload, Backup, Restore; Query.]
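The division of labor between the leader node and the compute nodes can be pictured with a toy scatter/gather model. This is only an illustration of the massively parallel idea, not how Amazon Redshift is implemented. The function names, the round-robin split, and the thread pool are all assumptions made for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(values, n_slices):
    """Spread one column's values round-robin across slices (stand-in for even distribution)."""
    return [values[i::n_slices] for i in range(n_slices)]


def slice_partial_sum(column_chunk):
    """Work done on a single slice: aggregate only the locally stored values."""
    return sum(column_chunk), len(column_chunk)


def leader_average(values, n_slices=4):
    """Leader role: fan the work out to the slices, then merge the partial results."""
    slices = split_into_slices(values, n_slices)
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        partials = list(pool.map(slice_partial_sum, slices))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count if count else None


if __name__ == "__main__":
    print(leader_average(list(range(1, 101))))  # 50.5
```

Each slice touches only its own share of the column, and the leader combines small partial results rather than raw rows.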
Amazon Redshift Advisor: Your DBA’s best friend
• Amazon Redshift expert system available in the console
• Identifies undesirable user behaviors for resolution by providing
high-impact recommendations to improve performance and
reduce cost
• >96% of clusters have tailored feedback
• Actionable WLM, COPY, storage, and system maintenance feedback
• Analyses have doubled since launch (July ‘18); will double again by
EOY
Five points of guidance for Amazon Redshift (SET DW)
Sort Key (to improve filter performance): Choose up to three columns (Compound Sort Key), ordered in increasing order of specificity, balanced with likelihood of use.
Encoding of Columns: Compress all columns except for the first sort key column.
Table Maintenance: VACUUM and ANALYZE tables weekly (use the Amazon Redshift Advisor and/or STL_ALERT_EVENT_LOG as a guide for frequency).
Distribution Key (to improve join performance): Choose a strategy that follows the common join pattern for the table and evenly distributes the data across the database slices on the cluster.
• DISTSTYLE AUTO is a great go-to for all tables < ~5 million rows.
• DISTSTYLE EVEN is a good fail-safe, but remember that joins may then require data redistribution.
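The sort key, encoding, and distribution key guidance above can be sketched as a small helper that emits Redshift-style `CREATE TABLE` DDL. The helper `build_create_table`, the `sales` table, and its column names are hypothetical. ZSTD is used only as an example encoding. The right choice depends on the data.

```python
def build_create_table(table, columns, dist_key, sort_keys, encoding="zstd"):
    """Return CREATE TABLE DDL that follows the guidance above.

    columns:   list of (name, sql_type) pairs
    dist_key:  column that matches the common join pattern
    sort_keys: one to three columns, least specific first
    """
    if not 1 <= len(sort_keys) <= 3:
        raise ValueError("choose between one and three sort key columns")
    names = [name for name, _ in columns]
    for key in [dist_key, *sort_keys]:
        if key not in names:
            raise ValueError(f"unknown column: {key}")

    first_sort_key = sort_keys[0]
    col_lines = []
    for name, sql_type in columns:
        # Leave the first sort key column uncompressed (RAW); compress the rest.
        enc = "RAW" if name == first_sort_key else encoding.upper()
        col_lines.append(f"    {name} {sql_type} ENCODE {enc}")

    return (
        f"CREATE TABLE {table} (\n"
        + ",\n".join(col_lines)
        + "\n)\n"
        + f"DISTSTYLE KEY DISTKEY ({dist_key})\n"
        + f"COMPOUND SORTKEY ({', '.join(sort_keys)});"
    )


if __name__ == "__main__":
    # Hypothetical fact table: joined on customer_id, filtered by date then region.
    print(build_create_table(
        "sales",
        [("sale_date", "DATE"), ("region", "VARCHAR(16)"),
         ("customer_id", "BIGINT"), ("amount", "DECIMAL(12,2)")],
        dist_key="customer_id",
        sort_keys=["sale_date", "region"],
    ))
```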
Workload Management (WLM) and Query Monitoring Rules (QMR)
• Start with defining up to ~3 queues.
• Split up the memory across the queues. Monitor the percent of each queue’s workload going to disk.
• Anticipate changing WLM settings to match the workload changes (day|night, weekday|weekend).
• Use QMR.
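As a rough sketch of the WLM points above, the snippet below builds two queue layouts (day and night), checks that there are at most three queues and that the memory split does not exceed 100%, and attaches a simple query monitoring rule. The key names are modeled on the JSON used by the `wlm_json_configuration` cluster parameter, but treat them as an assumption and check the current Amazon Redshift documentation before applying anything. The queue names and thresholds are hypothetical.

```python
import json

MAX_QUEUES = 3  # the guidance above suggests starting with up to ~3 queues


def make_queue(name, memory_percent, concurrency, log_after_seconds=None):
    """Describe one queue; optionally attach a QMR rule that logs long-running queries."""
    queue = {
        "query_group": [name],
        "query_concurrency": concurrency,
        "memory_percent_to_use": memory_percent,
    }
    if log_after_seconds is not None:
        queue["rules"] = [
            {
                "rule_name": f"{name}_long_running",
                "predicate": [
                    {
                        "metric_name": "query_execution_time",
                        "operator": ">",
                        "value": log_after_seconds,
                    }
                ],
                "action": "log",
            }
        ]
    return queue


def build_wlm_config(queues):
    """Validate the queue list and return it as a JSON string."""
    if len(queues) > MAX_QUEUES:
        raise ValueError(f"start with at most {MAX_QUEUES} queues")
    total = sum(q["memory_percent_to_use"] for q in queues)
    if total > 100:
        raise ValueError(f"memory split adds up to {total}%, more than 100%")
    return json.dumps(queues, indent=2)


# Hypothetical profiles: dashboards dominate by day, ETL dominates at night.
DAY = [
    make_queue("dashboards", 50, 10, log_after_seconds=60),
    make_queue("adhoc", 30, 5, log_after_seconds=300),
    make_queue("etl", 20, 2),
]
NIGHT = [
    make_queue("etl", 70, 5),
    make_queue("dashboards", 30, 5, log_after_seconds=60),
]

if __name__ == "__main__":
    print(build_wlm_config(DAY))
```

Keeping the day and night layouts as data makes it straightforward to switch between them on a schedule as the workload changes.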
Availability of intelligent administration and maintenance features
Distribution key: Recommendation for distribution key
Sort key: Recommendation for sort key
Concurrency setting: Automation for concurrent setting, making it dynamic
Vacuum: Auto vacuum in the background
Analyze: Auto analyze in the background
Concurrency Scaling
Amazon Redshift automatically adds transient clusters, in seconds, to serve sudden spikes in concurrent requests with consistently fast performance.
How it works:
1. All queries go to the leader node. The user experiences less wait for queries.
2. When queries in a designated WLM queue begin queuing, Amazon Redshift automatically routes them to the new clusters, automatically enabling Concurrency Scaling.
3. Amazon Redshift automatically spins up a new cluster, processes waiting queries, and automatically shuts down the concurrency scaling cluster.
For every 24 hours that your main cluster is in use, you accrue a one-hour credit for concurrency scaling. This means that concurrency scaling is free for >97% of customers.
[Diagram labels: Backup, Caching layer]
Launched
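The credit rule on this slide (one hour of concurrency scaling credit per 24 hours of main cluster use) comes down to simple arithmetic. The sketch below uses illustrative function names and deliberately ignores any caps, expiry, or billing granularity, which the slide does not describe.

```python
def concurrency_scaling_credit_hours(main_cluster_hours: float) -> float:
    """Credit accrued: one hour for every 24 hours the main cluster is in use."""
    return main_cluster_hours / 24.0


def billable_scaling_hours(main_cluster_hours: float, scaling_hours_used: float) -> float:
    """Concurrency scaling hours left over after the accrued credit is applied."""
    credit = concurrency_scaling_credit_hours(main_cluster_hours)
    return max(0.0, scaling_hours_used - credit)


if __name__ == "__main__":
    month_hours = 30 * 24  # main cluster running for a 30-day month
    print(concurrency_scaling_credit_hours(month_hours))  # 30.0
    print(billable_scaling_hours(month_hours, 12))        # 0.0
    print(billable_scaling_hours(month_hours, 35))        # 5.0
```

Under this simplified model, a cluster that runs all month and needs less than about an hour of extra capacity per day stays within its credit.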
https://tinyurl.com/y33amykm
Thank you!
Harshida Patel
DW Specialist SA
AWS

Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Summit
