SlideShare uma empresa Scribd logo
1 de 90
Baixar para ler offline
Masterclass
ianmas@amazon.com
@IanMmmm
Ian Massingham — Technical Evangelist
Amazon Redshift
Masterclass
Intended to educate you on how to get the best from AWS services
Show you how things work and how to get things done
A technical deep dive that goes beyond the basics
1
2
3
Amazon Redshift
A fast, fully managed, petabyte-scale data warehouse

Makes it simple & cost-effective to analyse all your data
You can continue to use your existing tools

Start small & scale to a petabyte or more
Costs less than $1,000/terabyte/year
Amazon Redshift
Optimised for
Data Warehousing
Scalable
No Up-Front Costs
Compatible
Secure
Simple
https://aws.amazon.com/blogs/aws/category/amazon-redshift/
https://aws.amazon.com/blogs/aws/category/amazon-redshift/
Agenda
Why Run Your Data Warehouse on AWS?

Getting Started

Table Design

Data Loading

Working with Data

Backup and Restoration

Upgrading & Scaling
WHY RUN YOUR DATA
WAREHOUSE ON AWS?
Database Management Challenges
Expensive Very difficult
to switch
Difficult to
administer
Challenging
to scale
Customers asked for data warehousing the AWS way
No upfront costs, pay as you go
Really fast performance at a really low price
Open and flexible with support for popular BI tools
Easy to provision & scale up massively
S
Amazon Redshift parallelises & distributes everything
Query
Load
Backup
Restore
Resize
Leader
Node
10GigE Mesh
JDBC/ODBC
Compute
Nodes
Common BI Tools
Amazon Redshift lets you start small & grow big
Dense Compute Nodes - dc1.large & dc1.8xlarge
dc1.large nodes
15GiB RAM

2 virtual cores, 10GigE
Single Node (160GB SSD)
Cluster 2-32 Nodes (320GB – 5.12TB SSD)
dc1.8xlarge nodes
244GiB RAM

32 virtual cores, 10GigE
Cluster 2-128 Nodes (up to 326TB SSD)
ds2.8xlarge nodes
244GiB RAM

36 virtual cores, 10GigE, 16TB magnetic HDD per node
Cluster 2-128 Nodes (up to 2PB magnetic HDD)
Amazon Redshift lets you start small & grow big
Dense Storage Nodes - ds1.xlarge, ds1.8xlarge, ds2.xlarge & ds2.8xlarge
Simplify Cluster Operations with Amazon Redshift
▶︎ Built-in security in transit, at rest, when backed up
▶︎ Backup to Amazon S3 is continuous, incremental & automatic
▶︎ Streaming restores let you resume querying faster
▶︎ Disk failures are transparent & nodes recover automatically
Clients Amazon Redshift Amazon S3
Automatic
Backups
Streaming

Restore
Amazon Redshift Dramatically Reduces IO
▶︎ Column storage
▶︎ Data compression
▶︎ Zone maps
▶︎ Direct-attached storage
▶︎ Large data block sizes
Row
storage
Column
storage
• Needed a way to increase speed, performance
and flexibility of data analysis at a low cost
• Using AWS enabled FT to run queries 98% faster
than previously—helping FT make business
decisions quickly
• Easier to track and analyze trends
• Reduced infrastructure costs by 80% over
traditional data center model
Amazon Redshift in action
GETTING STARTED
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Creating an
Amazon Redshift
Data Warehouse
via the AWS
Console
Getting Started with Amazon Redshift
http://aws.amazon.com/redshift/getting-started/
▶︎ 2 Month free trial
▶︎ Getting Started tutorial
▶︎ Amazon Redshift System Overview
▶︎ Guides for table design, loading data & query design
▶︎ Connect using industry standard OBDC/JDBC connections
▶︎ Many BI & ETL vendors have certified Amazon Redshift
TABLE DESIGN
Choosing the optimal table design
▶︎ Compression Encodings
▶︎ Choosing the right data types
▶︎ Distributing and sorting data
Choosing a Column Compression Type
http://docs.aws.amazon.com/redshift/latest/dg/t_Compressing_data_on_disk.html
▶︎ Compression is a key tool improving performance in modern
data warehouse systems to reduce the size of data that is
read from storage, minimise IO & improve query performance
▶︎ The COPY command used to load data in Amazon Redshift
will perform compression analysis
▶︎ We strongly recommend using the COPY command to apply
automatic compression
Data Types
Data Type Aliases Description
SMALLINT INT2 Signed two-byte integer
INTEGER INT, INT4 Signed four-byte integer
BIGINT INT8 Signed eight-byte integer
DECIMAL NUMERIC Exact numeric of selectable precision
REAL FLOAT4 Single precision floating-point number
DOUBLE PRECISION FLOAT8, FLOAT Double precision floating-point number
BOOLEAN BOOL Logical Boolean (true/false)
CHAR CHARACTER, NCHAR, BPCHAR Fixed-length character string
VARCHAR CHARACTER VARYING, NVARCHAR, TEXT Variable-length character string with a user-defined limit
DATE Calendar date (year, month, day)
TIMESTAMP TIMESTAMP WITHOUT TIME ZONE Date and time (without time zone)
http://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html
Your Schema
▶︎ Expressed in 3rd Normal Form
Airport
Airport Group
Airport Group MapDestination Enquiry
Enquiry Airport
Enquiry Airport GroupHoliday Type
Search Type
Distributing Data
▶ Redshift is a distributed system:
▶ A cluster contains a leader node & compute nodes
▶ A compute node contains slices (one per core)
▶ A slice contains data
▶ Slices are chosen based on two types of distribution:
▶ Round Robin (automated)
▶ Based on a distribution key (hash of a defined column)
▶ Queries run on all slices in parallel: optimal query throughput can be achieved
when data is evenly spread across slices
Unoptimised Distribution
ORDERS ITEMS
Order 1: Dave Smith, Total £195
Item 1.1: Order 1, Kindle Fire HD 7”, £159
Item 1.2: Order 1, Kindle Fire Case, £36
Slice 1 | Slice 2
Node 1
Slice 1 | Slice 2
Node 2
Slice 1 | Slice 2
Node 3
Default (No Distribution Key, Round Robin Order)
Order 1 Order 2 Order 3Item 2.1 Item 1.1 Item 1.2
Item 2.2Item 3.1
Optimised Distribution
ORDERS ITEMS
Order 1: Dave Smith, Total £195
Item 1.1: Order 1, Kindle Fire HD 7”, £159
Item 1.2: Order 1, Kindle Fire Case, £36
Slice 1 | Slice 2
Node 1
Slice 1 | Slice 2
Node 2
Slice 1 | Slice 2
Node 3
Customised (ORDERS.ORDER_ID DISTKEY, ITEMS.ORDER_ID DISTKEY)
Order 1 Order 2 Order 3
Item 2.1Item 1.1
Item 1.2 Item 2.2
Item 3.1
▶ Frequently Joined
▶ By most commonly run queries
▶ By queries which consume the most CPU
▶ High Cardinality
▶ Large number of discrete values
▶ Low Skew
▶ Uniform Distribution
▶ No Hotspots
▶ Query STV_BLOCKLIST for Skew Factor
Choosing a Distribution Key
Sorting Data
▶ In the slices (on disk), the data is sorted by a sort key (if none uses
the insertion order)
▶ Choose a sort key that is frequently used in your queries
▶ As a query predicate (date, identifier, …)
▶ As a join parameter (it can also be the hash key)
▶ The sort key allows Redshift to avoid reading entire blocks based on
predicates
▶ E.g.: a table containing a timestamp sort key, and where only recent
data is accessed, will skip blocks containing “old” data
Schema Design
▶ Optimizing a database for querying
▶ Analyse using Automatic Compression to ensure optimal IO
▶ Co-locate frequently joined tables: use distribution key wisely
(avoids data transfers between nodes)
▶ For joined tables, use sort keys on the joined columns, allowing
fast merge joins
▶ Compression allows you to de-normalize without penalizing
storage, simplifying queries and limiting joins
Possible Schema Optimisations
DISTKEY, SORTKEY = EnquiryID
▶ Denormalise commonly used join attributes onto large tables
Destination
Airport
Airport Group MapEnquiry
Enquiry Airport
Enquiry Airport
Group
Search Type
Airport GroupHoliday Type
Possible Schema Optimisations
DISTKEY, SORTKEY = EnquiryID
▶ Cache Small Tables with DISTSTYLE ALL
Destination
Airport
Airport Group MapEnquiry
Enquiry Airport
Enquiry Airport
Group
Airport GroupHoliday Type
Search Type
http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html
Vacuuming Tables
▶ Clean up tables after a bulk delete, a load, or a series of
incremental updates with the VACUUM command, either
against the entire database or against individual tables
http://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html
Learning more about Table Design
▶︎ Table Design Documentation
http://docs.aws.amazon.com/redshift/latest/dg/t_Creating_tables.html
▶︎ Table Design Tutorial
http://docs.aws.amazon.com/redshift/latest/dg/tutorial-tuning-tables.html
DATA LOADING
Amazon Redshift Loading Data Overview
AWS CloudCorporate Data center
DynamoDBAmazon S3
Data Volume
Amazon Elastic
MapReduce
Amazon RDS
Amazon Redshift
Amazon
Glacier
logs / files
Source DBs
VPN
Connection
AWS Direct
Connect
S3 Multipart
Upload
AWS Import/
Export
Sharing Your Data
▶ Data Dumped from DB
▶ CSV Files Encrypted
▶ Review Encryption Code
▶ Uploaded to S3
▶ Bucket <bucket>
▶ Data shared with AWS
{	
  
	
  	
  "Id":	
  "Policy1234567890",	
  
	
  	
  "Statement":	
  [	
  
	
  	
  	
  	
  {	
  
	
  	
  	
  	
  	
  	
  "Action":	
  [	
  
	
  	
  	
  	
  	
  	
  	
  	
  "s3:Get*",	
  
	
  	
  	
  	
  	
  	
  	
  	
  "s3:List*"	
  
	
  	
  	
  	
  	
  	
  ],	
  
	
  	
  	
  	
  	
  	
  "Effect":	
  "Allow",	
  
	
  	
  	
  	
  	
  	
  "Principal":	
  {	
  
	
  	
  	
  	
  	
  	
  	
  	
  "AWS":	
  ["arn:aws:iam::887210671223:root"]	
  
	
  	
  	
  	
  	
  	
  },	
  
	
  	
  	
  	
  	
  	
  "Resource":	
  [	
  
	
  	
  	
  	
  	
  	
  	
  	
  "arn:aws:s3:::<bucket-­‐name>",	
  
	
  	
  	
  	
  	
  	
  	
  	
  "arn:aws:s3:::<bucket-­‐name>/*"	
  
	
  	
  	
  	
  	
  	
  ],	
  
	
  	
  	
  	
  	
  	
  "Sid":	
  "Stmt0987654321"	
  
	
  	
  	
  	
  }	
  
	
  	
  ]	
  
}
Splitting Data Files
Slice 0
Slice 1
Slice 0
Slice 1
Node 0
Node 1
2 dc1.large nodes
copy	
  customer	
  	
  
from	
  ‘s3://mydata/client.txt’	
  
credentials	
  ‘aws_access_key_id=<your-­‐access-­‐key>;	
  aws_secret_access_key=<your_secret_key>’	
  
delimiter	
  ‘|’;
mydata
Client.txt.1
Client.txt.2
Client.txt.3
Client.txt.4
http://docs.aws.amazon.com/redshift/latest/dg/t_Loading_data.html
Typical Load Issues
▶ Mismatch between data types in table and values in input data fields
▶ Mismatch between number of columns in table and number of fields in input data
▶ Mismatched quotes
▶ Redshift supports both single and double quotes; however, these quotes must be balanced appropriately
▶ Incorrect format for date/time data in input files
▶ Use DATEFORMAT and TIMEFORMAT to control
▶ Out-of-range values in input files (for numeric columns)
▶ Number of distinct values for a column exceeds the limitation for its compression
encoding
Direct SQL
▶ Redshift supports standard DML commands
▶ INSERT, UPDATE, DELETE
▶ Redshift does not support single-command merge (upsert) statement
▶ Load data into a staging table
▶ Joining the staging table with the target table
▶ UPDATE data where row exists
▶ INSERT where no row exists
▶ All Direct SQL Commands go via Leader Node
Data Loading Best Practices
▶ Use a COPY Command to load data
▶ Use a single COPY command
▶ Split your data into multiple files
▶ Compress your data files with GZIP
▶ Use multi-row inserts if COPY is not possible
▶ Bulk insert operations (INSERT INTO…SELECT and CREATE
TABLE AS) provide high performance data insertion
Data Loading Best Practices
▶ Load your data in sort key order to avoid needing to vacuum
▶ Organize your data as a sequence of time-series tables, where each table is
identical but contains data for different time ranges
▶ Use staging tables to perform an upsert
▶ Run the VACUUM command whenever you add, delete, or modify a large
number of rows, unless you load your data in sort key order
▶ Increase the memory available to a COPY or VACUUM by increasing
wlm_query_slot_count
▶ Run the ANALYZE command whenever you’ve made a non-trivial number of
changes to your data to ensure your table statistics are current
WORKING WITH DATA IN
AMAZON REDSHIFT
RDBMS RedshiftOLTP ERP Reporting & BI
RedshiftOLTP ERP Reporting & BI
Data Integration
RDBMS
https://aws.amazon.com/redshift/partners/
JDBC/ODBC
Business Intelligence & Visualisation
https://aws.amazon.com/redshift/partners/
https://aws.amazon.com/marketplace/redshift
A M A Z O N Q U I C K S I G H T
FAST, EASY TO USE, CLOUD POWERED BUSINESS INTELLIGENCE
AVAILABLE IN PREVIEW NOW
B R I N G I N G B U S I N E S S I N T E L L I G E N C E T O A L L , W I T H A M A Z O N Q U I C K S I G H T
FIRST ANALYSIS IN
LESS THAN 60 SECONDS
BLAZING FAST QUERIES
WITH A NEW IN-MEMORY
QUERY ENGINE
(SPICE)
DYNAMIC, BEAUTIFUL
DATA VISUZALITIONS
SHARE LIVE AND
SNAPSHOT ANALYSES
WITH EVERYONE
INTEGRATE WITH
DATA SOURCES
ON AWS
1/10TH THE COST
OF OLD-GUARD BI
TOOLS
https://youtu.be/Tj0gW4XI6vU
Amazon QuickSight Launch & Demo
BACKUP AND RESTORATION
128GB RAM
16TB disk
16 cores
Redshift
Cluster
Node
128GB RAM
16TB disk
16 cores
Redshift
Cluster
Node
128GB RAM
16TB disk
16 cores
Redshift
Cluster
Node
Backups to Amazon S3 are
automatic, continuous &
incremental
Configurable system snapshot
retention period
User snapshots on-demand
Streaming restores enable you
to resume querying faster
Backup & Restoration
Amazon S3
CLUSTER DATA REPLICATION

+
AUTOMATED BACKUPS ON
AMAZON S3

+
NODE MONITORING
UPGRADING & SCALING
128GB RAM
16TB disk
16 coresRedshift
Leader Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
Resize while remaining online
Provision a new cluster in the background
Copy data in parallel from node to node
Only charged for source cluster
SQL Clients/BI Tools
128GB RAM
16TB disk
16 coresRedshift
Leader Node
128GB RAM
16TB disk
16 coresRedshift
Leader Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
128GB RAM
16TB disk
16 coresRedshift
Cluster Node
Automatic SQL endpoint switchover via DNS
Decommission the source cluster
SQL Clients/BI Tools
Resizing a Redshift
Data Warehouse
via the AWS
Console
Resizing a Redshift
Data Warehouse
via the AWS
Console
Resizing a Redshift
Data Warehouse
via the AWS
Console
Resizing a Redshift
Data Warehouse
via the AWS
Console
http://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#cluster-resize-intro
RECENT FEATURES
You can write UDFs using Python 2.7
Syntax is largely identical to
PostgreSQL UDF syntax
System & network calls within
UDFs are prohibited
Comes with Pandas, NumPy, and
SciPy pre-installed
You’ll also be able import your own
libraries for even more flexibility
Scalar User
Defined Functions
https://aws.amazon.com/blogs/aws/category/amazon-redshift/
DS2 instances have twice the
memory and compute power of
their Dense Storage predecessor,
DS1 (previously called DW1), but
the same storage.
DS2 also supports Enhanced
Networking and provides 50%
more disk throughput than DS1.
New DS2 Instances
https://aws.amazon.com/blogs/aws/category/amazon-redshift/
RESOURCES YOU CAN USE
TO LEARN MORE
aws.amazon.com/redshift
Getting Started with Amazon Redshift
aws.amazon.com/redshift/getting-started/

Customers running Analytics in the AWS Cloud
http://aws.amazon.com/solutions/case-studies/analytics/

AWS re:Invent 2015 | Amazon Redshift Deep Dive: Tuning & Best Practices
https://youtu.be/fmy3jCxUliM?list=PLhr1KZpdzukdsblOEVXrCYtvUsDakzYJI
Certification
aws.amazon.com/certification
Self-Paced Labs
aws.amazon.com/training/

self-paced-labs
Try products, gain new skills, and
get hands-on practice working
with AWS technologies
aws.amazon.com/training
Training
Validate your proven skills and
expertise with the AWS platform
Build technical expertise to
design and operate scalable,
efficient applications on AWS
AWS Training & Certification
Follow
us
for m
ore
events
&
w
ebinars
@AWScloud for Global AWS News & Announcements
@AWS_UKI for local AWS events & news
@IanMmmm
Ian Massingham — Technical Evangelist

Mais conteúdo relacionado

Mais procurados

ABCs of AWS: S3
ABCs of AWS: S3ABCs of AWS: S3
ABCs of AWS: S3Mark Cohen
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Architecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWSArchitecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWSAmazon Web Services
 
A complete guide to azure storage
A complete guide to azure storageA complete guide to azure storage
A complete guide to azure storageHimanshu Sahu
 
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...Edureka!
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - DatalakeLam Le
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Let’s get to know Snowflake
Let’s get to know SnowflakeLet’s get to know Snowflake
Let’s get to know SnowflakeKnoldus Inc.
 

Mais procurados (20)

Introduction to Amazon Aurora
Introduction to Amazon AuroraIntroduction to Amazon Aurora
Introduction to Amazon Aurora
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 
ABCs of AWS: S3
ABCs of AWS: S3ABCs of AWS: S3
ABCs of AWS: S3
 
Azure storage
Azure storageAzure storage
Azure storage
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Architecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWSArchitecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWS
 
A complete guide to azure storage
A complete guide to azure storageA complete guide to azure storage
A complete guide to azure storage
 
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...
Amazon Redshift Tutorial | AWS Tutorial for Beginners | AWS Certification Tra...
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - Datalake
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Let’s get to know Snowflake
Let’s get to know SnowflakeLet’s get to know Snowflake
Let’s get to know Snowflake
 

Destaque

(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best PracticesAmazon Web Services
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesAmazon Web Services
 
Advanced Security Best Practices Masterclass
Advanced Security Best Practices MasterclassAdvanced Security Best Practices Masterclass
Advanced Security Best Practices MasterclassAmazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
AWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAmazon Web Services
 
Journey Through the Cloud - Digital Media
Journey Through the Cloud - Digital MediaJourney Through the Cloud - Digital Media
Journey Through the Cloud - Digital MediaAmazon Web Services
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
 
Data Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on CloudData Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on Cloudtdwiindia
 
SQL Server to Redshift Data Load Using SSIS
SQL Server to Redshift Data Load Using SSISSQL Server to Redshift Data Load Using SSIS
SQL Server to Redshift Data Load Using SSISMarc Leinbach
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAmazon Web Services
 
Journey Through the Cloud - Social & Mobile Apps
Journey Through the Cloud - Social & Mobile Apps Journey Through the Cloud - Social & Mobile Apps
Journey Through the Cloud - Social & Mobile Apps Amazon Web Services
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuningCarlos del Cacho
 
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...Mia Yuan Cao
 

Destaque (20)

Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Amazon S3 Masterclass
Amazon S3 MasterclassAmazon S3 Masterclass
Amazon S3 Masterclass
 
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
 
Advanced Security Best Practices Masterclass
Advanced Security Best Practices MasterclassAdvanced Security Best Practices Masterclass
Advanced Security Best Practices Masterclass
 
Amazon EC2 Masterclass
Amazon EC2 MasterclassAmazon EC2 Masterclass
Amazon EC2 Masterclass
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
AWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon Redshift
 
Journey Through the Cloud - Digital Media
Journey Through the Cloud - Digital MediaJourney Through the Cloud - Digital Media
Journey Through the Cloud - Digital Media
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and Optimization
 
Amazon EMR Masterclass
Amazon EMR MasterclassAmazon EMR Masterclass
Amazon EMR Masterclass
 
AWS CloudFormation Masterclass
AWS CloudFormation MasterclassAWS CloudFormation Masterclass
AWS CloudFormation Masterclass
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Data Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on CloudData Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on Cloud
 
SQL Server to Redshift Data Load Using SSIS
SQL Server to Redshift Data Load Using SSISSQL Server to Redshift Data Load Using SSIS
SQL Server to Redshift Data Load Using SSIS
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
 
Journey Through the Cloud - Social & Mobile Apps
Journey Through the Cloud - Social & Mobile Apps Journey Through the Cloud - Social & Mobile Apps
Journey Through the Cloud - Social & Mobile Apps
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuning
 
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
 

Semelhante a Amazon Redshift Masterclass

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Best Practices for Building Robust Data Platform with Apache Spark and Delta
Best Practices for Building Robust Data Platform with Apache Spark and DeltaBest Practices for Building Robust Data Platform with Apache Spark and Delta
Best Practices for Building Robust Data Platform with Apache Spark and DeltaDatabricks
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftAmazon Web Services
 
My Database Skills Killed the Server
My Database Skills Killed the ServerMy Database Skills Killed the Server
My Database Skills Killed the ServerColdFusionConference
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseAmazon Web Services
 
Loading Data into Redshift with Lab
Loading Data into Redshift with LabLoading Data into Redshift with Lab
Loading Data into Redshift with LabAmazon Web Services
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseAmazon Web Services
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Web Services
 
Optimize Your Reporting In Less Than 10 Minutes
Optimize Your Reporting In Less Than 10 MinutesOptimize Your Reporting In Less Than 10 Minutes
Optimize Your Reporting In Less Than 10 MinutesAlexandra Sasha Blumenfeld
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni VamvadelisAmazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelishuguk
 

Semelhante a Amazon Redshift Masterclass (20)

Masterclass - Redshift
Masterclass - RedshiftMasterclass - Redshift
Masterclass - Redshift
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
 
11g R2
11g R211g R2
11g R2
 
Best Practices for Building Robust Data Platform with Apache Spark and Delta
Best Practices for Building Robust Data Platform with Apache Spark and DeltaBest Practices for Building Robust Data Platform with Apache Spark and Delta
Best Practices for Building Robust Data Platform with Apache Spark and Delta
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon RedshiftData Warehousing in the Era of Big Data: Intro to Amazon Redshift
Data Warehousing in the Era of Big Data: Intro to Amazon Redshift
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
My Database Skills Killed the Server
My Database Skills Killed the ServerMy Database Skills Killed the Server
My Database Skills Killed the Server
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
Loading Data into Redshift with Lab
Loading Data into Redshift with LabLoading Data into Redshift with Lab
Loading Data into Redshift with Lab
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
 
AWS Analytics
AWS AnalyticsAWS Analytics
AWS Analytics
 
Optimize Your Reporting In Less Than 10 Minutes
Optimize Your Reporting In Less Than 10 MinutesOptimize Your Reporting In Less Than 10 Minutes
Optimize Your Reporting In Less Than 10 Minutes
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Amazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni VamvadelisAmazon RedShift - Ianni Vamvadelis
Amazon RedShift - Ianni Vamvadelis
 

Mais de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Último (20)

Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

Amazon Redshift Masterclass

  • 1. Masterclass ianmas@amazon.com @IanMmmm Ian Massingham — Technical Evangelist Amazon Redshift
  • 2. Masterclass Intended to educate you on how to get the best from AWS services Show you how things work and how to get things done A technical deep dive that goes beyond the basics 1 2 3
  • 3. Amazon Redshift A fast, fully managed, petabyte-scale data warehouse
 Makes it simple & cost-effective to analyse all your data You can continue to use your existing tools
 Start small & scale to a petabyte or more Costs less than $1,000/terabyte/year
  • 4. Amazon Redshift Optimised for Data Warehousing Scalable No Up-Front Costs Compatible Secure Simple
  • 7. Agenda Why Run Your Data Warehouse on AWS?
 Getting Started
 Table Design
 Data Loading
 Working with Data
 Backup and Restoration
 Upgrading & Scaling
  • 8. WHY RUN YOUR DATA WAREHOUSE ON AWS?
  • 9. Database Management Challenges Expensive Very difficult to switch Difficult to administer Challenging to scale
  • 10. Customers asked for data warehousing the AWS way No upfront costs, pay as you go Really fast performance at a really low price Open and flexible with support for popular BI tools Easy to provision & scale up massively S
  • 11. Amazon Redshift parallelises & distributes everything Query Load Backup Restore Resize Leader Node 10GigE Mesh JDBC/ODBC Compute Nodes Common BI Tools
  • 12. Amazon Redshift lets you start small & grow big Dense Compute Nodes - dc1.large & dc1.8xlarge dc1.large nodes 15GiB RAM
 2 virtual cores, 10GigE Single Node (160GB SSD) Cluster 2-32 Nodes (320GB – 5.12TB SSD) dc1.8xlarge nodes 244GiB RAM
 32 virtual cores, 10GigE Cluster 2-128 Nodes (up to 326TB SSD)
  • 13. ds2.8xlarge nodes 244GiB RAM
 36 virtual cores, 10GigE, 16TB magnetic HDD per node Cluster 2-128 Nodes (up to 2PB magnetic HDD) Amazon Redshift lets you start small & grow big Dense Storage Nodes - ds1.xlarge, ds1.8xlarge, ds2.xlarge & ds2.8xlarge
  • 14. Simplify Cluster Operations with Amazon Redshift ▶︎ Built-in security in transit, at rest, when backed up ▶︎ Backup to Amazon S3 is continuous, incremental & automatic ▶︎ Streaming restores let you resume querying faster ▶︎ Disk failures are transparent & nodes recover automatically Clients Amazon Redshift Amazon S3 Automatic Backups Streaming
 Restore
  • 15. Amazon Redshift Dramatically Reduces IO ▶︎ Column storage ▶︎ Data compression ▶︎ Zone maps ▶︎ Direct-attached storage ▶︎ Large data block sizes Row storage Column storage
  • 16. • Needed a way to increase speed, performance and flexibility of data analysis at a low cost • Using AWS enabled FT to run queries 98% faster than previously—helping FT make business decisions quickly • Easier to track and analyze trends • Reduced infrastructure costs by 80% over traditional data center model Amazon Redshift in action
  • 18. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 19. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 20. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 21. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 22. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 23. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 24. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 25. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 26. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 27. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 28. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 29. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 30. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 31. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 32. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 33. Creating an Amazon Redshift Data Warehouse via the AWS Console
  • 34. Getting Started with Amazon Redshift http://aws.amazon.com/redshift/getting-started/ ▶︎ 2 Month free trial ▶︎ Getting Started tutorial ▶︎ Amazon Redshift System Overview ▶︎ Guides for table design, loading data & query design ▶︎ Connect using industry standard OBDC/JDBC connections ▶︎ Many BI & ETL vendors have certified Amazon Redshift
  • 36. Choosing the optimal table design ▶︎ Compression Encodings ▶︎ Choosing the right data types ▶︎ Distributing and sorting data
  • 37. Choosing a Column Compression Type http://docs.aws.amazon.com/redshift/latest/dg/t_Compressing_data_on_disk.html ▶︎ Compression is a key tool improving performance in modern data warehouse systems to reduce the size of data that is read from storage, minimise IO & improve query performance ▶︎ The COPY command used to load data in Amazon Redshift will perform compression analysis ▶︎ We strongly recommend using the COPY command to apply automatic compression
  • 38. Data Types Data Type Aliases Description SMALLINT INT2 Signed two-byte integer INTEGER INT, INT4 Signed four-byte integer BIGINT INT8 Signed eight-byte integer DECIMAL NUMERIC Exact numeric of selectable precision REAL FLOAT4 Single precision floating-point number DOUBLE PRECISION FLOAT8, FLOAT Double precision floating-point number BOOLEAN BOOL Logical Boolean (true/false) CHAR CHARACTER, NCHAR, BPCHAR Fixed-length character string VARCHAR CHARACTER VARYING, NVARCHAR, TEXT Variable-length character string with a user-defined limit DATE Calendar date (year, month, day) TIMESTAMP TIMESTAMP WITHOUT TIME ZONE Date and time (without time zone) http://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html
  • 39. Your Schema ▶︎ Expressed in 3rd Normal Form Airport Airport Group Airport Group MapDestination Enquiry Enquiry Airport Enquiry Airport GroupHoliday Type Search Type
  • 40. Distributing Data ▶ Redshift is a distributed system: ▶ A cluster contains a leader node & compute nodes ▶ A compute node contains slices (one per core) ▶ A slice contains data ▶ Slices are chosen based on two types of distribution: ▶ Round Robin (automated) ▶ Based on a distribution key (hash of a defined column) ▶ Queries run on all slices in parallel: optimal query throughput can be achieved when data is evenly spread across slices
  • 41. Unoptimised Distribution ORDERS ITEMS Order 1: Dave Smith, Total £195 Item 1.1: Order 1, Kindle Fire HD 7”, £159 Item 1.2: Order 1, Kindle Fire Case, £36 Slice 1 | Slice 2 Node 1 Slice 1 | Slice 2 Node 2 Slice 1 | Slice 2 Node 3 Default (No Distribution Key, Round Robin Order) Order 1 Order 2 Order 3Item 2.1 Item 1.1 Item 1.2 Item 2.2Item 3.1
  • 42. Optimised Distribution ORDERS ITEMS Order 1: Dave Smith, Total £195 Item 1.1: Order 1, Kindle Fire HD 7”, £159 Item 1.2: Order 1, Kindle Fire Case, £36 Slice 1 | Slice 2 Node 1 Slice 1 | Slice 2 Node 2 Slice 1 | Slice 2 Node 3 Customised (ORDERS.ORDER_ID DISTKEY, ITEMS.ORDER_ID DISTKEY) Order 1 Order 2 Order 3 Item 2.1Item 1.1 Item 1.2 Item 2.2 Item 3.1
  • 43. ▶ Frequently Joined ▶ By most commonly run queries ▶ By queries which consume the most CPU ▶ High Cardinality ▶ Large number of discrete values ▶ Low Skew ▶ Uniform Distribution ▶ No Hotspots ▶ Query STV_BLOCKLIST for Skew Factor Choosing a Distribution Key
  • 44. Sorting Data ▶ In the slices (on disk), the data is sorted by a sort key (if none uses the insertion order) ▶ Choose a sort key that is frequently used in your queries ▶ As a query predicate (date, identifier, …) ▶ As a join parameter (it can also be the hash key) ▶ The sort key allows Redshift to avoid reading entire blocks based on predicates ▶ E.g.: a table containing a timestamp sort key, and where only recent data is accessed, will skip blocks containing “old” data
  • 45. Schema Design ▶ Optimizing a database for querying ▶ Analyse using Automatic Compression to ensure optimal IO ▶ Co-locate frequently joined tables: use distribution key wisely (avoids data transfers between nodes) ▶ For joined tables, use sort keys on the joined columns, allowing fast merge joins ▶ Compression allows you to de-normalize without penalizing storage, simplifying queries and limiting joins
  • 46. Possible Schema Optimisations DISTKEY, SORTKEY = EnquiryID ▶ Denormalise commonly used join attributes onto large tables Destination Airport Airport Group MapEnquiry Enquiry Airport Enquiry Airport Group Search Type Airport GroupHoliday Type
  • 47. Possible Schema Optimisations DISTKEY, SORTKEY = EnquiryID ▶ Cache Small Tables with DISTSTYLE ALL Destination Airport Airport Group MapEnquiry Enquiry Airport Enquiry Airport Group Airport GroupHoliday Type Search Type http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html
  • 48. Vacuuming Tables ▶ Clean up tables after a bulk delete, a load, or a series of incremental updates with the VACUUM command, either against the entire database or against individual tables http://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html
  • 49. Learning more about Table Design ▶︎ Table Design Documentation http://docs.aws.amazon.com/redshift/latest/dg/t_Creating_tables.html ▶︎ Table Design Tutorial http://docs.aws.amazon.com/redshift/latest/dg/tutorial-tuning-tables.html
  • 51. Amazon Redshift Loading Data Overview AWS CloudCorporate Data center DynamoDBAmazon S3 Data Volume Amazon Elastic MapReduce Amazon RDS Amazon Redshift Amazon Glacier logs / files Source DBs VPN Connection AWS Direct Connect S3 Multipart Upload AWS Import/ Export
  • 52. Sharing Your Data ▶ Data Dumped from DB ▶ CSV Files Encrypted ▶ Review Encryption Code ▶ Uploaded to S3 ▶ Bucket <bucket> ▶ Data shared with AWS {      "Id":  "Policy1234567890",      "Statement":  [          {              "Action":  [                  "s3:Get*",                  "s3:List*"              ],              "Effect":  "Allow",              "Principal":  {                  "AWS":  ["arn:aws:iam::887210671223:root"]              },              "Resource":  [                  "arn:aws:s3:::<bucket-­‐name>",                  "arn:aws:s3:::<bucket-­‐name>/*"              ],              "Sid":  "Stmt0987654321"          }      ]   }
  • 53. Splitting Data Files Slice 0 Slice 1 Slice 0 Slice 1 Node 0 Node 1 2 dc1.large nodes copy  customer     from  ‘s3://mydata/client.txt’   credentials  ‘aws_access_key_id=<your-­‐access-­‐key>;  aws_secret_access_key=<your_secret_key>’   delimiter  ‘|’; mydata Client.txt.1 Client.txt.2 Client.txt.3 Client.txt.4 http://docs.aws.amazon.com/redshift/latest/dg/t_Loading_data.html
  • 54. Typical Load Issues ▶ Mismatch between data types in table and values in input data fields ▶ Mismatch between number of columns in table and number of fields in input data ▶ Mismatched quotes ▶ Redshift supports both single and double quotes; however, these quotes must be balanced appropriately ▶ Incorrect format for date/time data in input files ▶ Use DATEFORMAT and TIMEFORMAT to control ▶ Out-of-range values in input files (for numeric columns) ▶ Number of distinct values for a column exceeds the limitation for its compression encoding
  • 55. Direct SQL ▶ Redshift supports standard DML commands ▶ INSERT, UPDATE, DELETE ▶ Redshift does not support single-command merge (upsert) statement ▶ Load data into a staging table ▶ Joining the staging table with the target table ▶ UPDATE data where row exists ▶ INSERT where no row exists ▶ All Direct SQL Commands go via Leader Node
  • 56. Data Loading Best Practices ▶ Use a COPY Command to load data ▶ Use a single COPY command ▶ Split your data into multiple files ▶ Compress your data files with GZIP ▶ Use multi-row inserts if COPY is not possible ▶ Bulk insert operations (INSERT INTO…SELECT and CREATE TABLE AS) provide high performance data insertion
  • 57. Data Loading Best Practices ▶ Load your data in sort key order to avoid needing to vacuum ▶ Organize your data as a sequence of time-series tables, where each table is identical but contains data for different time ranges ▶ Use staging tables to perform an upsert ▶ Run the VACUUM command whenever you add, delete, or modify a large number of rows, unless you load your data in sort key order ▶ Increase the memory available to a COPY or VACUUM by increasing wlm_query_slot_count ▶ Run the ANALYZE command whenever you’ve made a non-trivial number of changes to your data to ensure your table statistics are current
  • 58. WORKING WITH DATA IN AMAZON REDSHIFT
  • 59. RDBMS RedshiftOLTP ERP Reporting & BI
  • 60. RedshiftOLTP ERP Reporting & BI Data Integration RDBMS https://aws.amazon.com/redshift/partners/
  • 61. JDBC/ODBC Business Intelligence & Visualisation https://aws.amazon.com/redshift/partners/
  • 63.
  • 64. A M A Z O N Q U I C K S I G H T FAST, EASY TO USE, CLOUD POWERED BUSINESS INTELLIGENCE AVAILABLE IN PREVIEW NOW
  • 65. B R I N G I N G B U S I N E S S I N T E L L I G E N C E T O A L L , W I T H A M A Z O N Q U I C K S I G H T FIRST ANALYSIS IN LESS THAN 60 SECONDS BLAZING FAST QUERIES WITH A NEW IN-MEMORY QUERY ENGINE (SPICE) DYNAMIC, BEAUTIFUL DATA VISUZALITIONS SHARE LIVE AND SNAPSHOT ANALYSES WITH EVERYONE INTEGRATE WITH DATA SOURCES ON AWS 1/10TH THE COST OF OLD-GUARD BI TOOLS
  • 68. 128GB RAM 16TB disk 16 cores Redshift Cluster Node 128GB RAM 16TB disk 16 cores Redshift Cluster Node 128GB RAM 16TB disk 16 cores Redshift Cluster Node Backups to Amazon S3 are automatic, continuous & incremental Configurable system snapshot retention period User snapshots on-demand Streaming restores enable you to resume querying faster Backup & Restoration Amazon S3
  • 69.
  • 70.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75. CLUSTER DATA REPLICATION
 + AUTOMATED BACKUPS ON AMAZON S3
 + NODE MONITORING
  • 77. 128GB RAM 16TB disk 16 coresRedshift Leader Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node Resize while remaining online Provision a new cluster in the background Copy data in parallel from node to node Only charged for source cluster SQL Clients/BI Tools 128GB RAM 16TB disk 16 coresRedshift Leader Node
  • 78. 128GB RAM 16TB disk 16 coresRedshift Leader Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node 128GB RAM 16TB disk 16 coresRedshift Cluster Node Automatic SQL endpoint switchover via DNS Decommission the source cluster SQL Clients/BI Tools
  • 79. Resizing a Redshift Data Warehouse via the AWS Console
  • 80. Resizing a Redshift Data Warehouse via the AWS Console
  • 81. Resizing a Redshift Data Warehouse via the AWS Console
  • 82. Resizing a Redshift Data Warehouse via the AWS Console http://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#cluster-resize-intro
  • 84. You can write UDFs using Python 2.7 Syntax is largely identical to PostgreSQL UDF syntax System & network calls within UDFs are prohibited Comes with Pandas, NumPy, and SciPy pre-installed You’ll also be able import your own libraries for even more flexibility Scalar User Defined Functions https://aws.amazon.com/blogs/aws/category/amazon-redshift/
  • 85. DS2 instances have twice the memory and compute power of their Dense Storage predecessor, DS1 (previously called DW1), but the same storage. DS2 also supports Enhanced Networking and provides 50% more disk throughput than DS1. New DS2 Instances https://aws.amazon.com/blogs/aws/category/amazon-redshift/
  • 86. RESOURCES YOU CAN USE TO LEARN MORE
  • 88. Getting Started with Amazon Redshift aws.amazon.com/redshift/getting-started/ Customers running Analytics in the AWS Cloud http://aws.amazon.com/solutions/case-studies/analytics/ AWS re:Invent 2015 | Amazon Redshift Deep Dive: Tuning & Best Practices https://youtu.be/fmy3jCxUliM?list=PLhr1KZpdzukdsblOEVXrCYtvUsDakzYJI
  • 89. Certification aws.amazon.com/certification Self-Paced Labs aws.amazon.com/training/
 self-paced-labs Try products, gain new skills, and get hands-on practice working with AWS technologies aws.amazon.com/training Training Validate your proven skills and expertise with the AWS platform Build technical expertise to design and operate scalable, efficient applications on AWS AWS Training & Certification
  • 90. Follow us for m ore events & w ebinars @AWScloud for Global AWS News & Announcements @AWS_UKI for local AWS events & news @IanMmmm Ian Massingham — Technical Evangelist