This session provides the attendee with an overview of Amazon RDS across different database types and then dives deep into the benefits and performance of Amazon Aurora.
2. What is Amazon RDS?
• Managed relational database in the cloud
• 6 familiar engines and multiple versions to choose from
• Managed for you:
• Amazon RDS handles routine database tasks such as
provisioning, patching, backup, recovery, failure detection,
and repair.
3. Why did AWS build Amazon RDS?
Managing relational databases is hard
• There’s a lot of repetitive labor that must be done, but doesn’t directly add value
• Backups and restores
• Software installs and patching
• Managing hardware
• Achieving many important capabilities requires lots of spend, lots of engineering, or both
• Scaling
• High availability
• Migration
4. How does RDS compare to traditional DB hosting?
With a traditional DB you:
• Acquire hardware (purchase, rack and stack)
• Load OS
• Load clustering software
• Load database software
• Create a database
• Optimize your query logic
• Design and implement a backup strategy
• Perform patching (OS, clustering software, database)
• Perform software and hardware upgrades
• Deal with hardware failures
With RDS you:
• Create a database (selecting options for backup and maintenance)
• Optimize your query logic and execution
5. Amazon RDS is simple and fast to deploy
• Get a production-ready
database instance in
minutes
• No need to acquire
servers, rack and stack,
install OS and
database software
7. A simple application architecture
RDS database instance
Application, in an
Amazon EC2 instance
Elastic Load Balancing
load balancer instance
DB snapshots in
Amazon S3
8. Choose Multi-AZ for greater availability, durability
• An Availability Zone is a physically distinct, independent
infrastructure
• With Multi-AZ operation, your database is synchronously
replicated to another AZ in the same AWS Region
• Failover occurs automatically in response to the most
important failure scenarios
• Planned maintenance is applied first to the standby
9. A resilient, durable, still simple application
architecture
RDS database instances:
master and Multi-AZ standby
Application, in Amazon
EC2 instances
Elastic Load Balancing
load balancer instance
DB snapshots in
Amazon S3
10. Amazon RDS offers fast, predictable storage
General Purpose
(SSD) for most
workloads
Provisioned IOPS
(SSD) for OLTP
workloads up to 30,000
IOPS
Magnetic for small
workloads with
infrequent access
11. Amazon RDS Read Replicas offer scale-out
• Offload read traffic to
automatically maintained
Read Replicas
• Load-share traffic across
multiple Read Replicas
• Easy to set up
12. Amazon RDS provides levels of security difficult to
achieve on-premises
• AWS has achieved major compliances
• Amazon RDS gives each database instance IP firewall
protection
• Amazon VPC lets you isolate and control network
configuration and connect securely to your IT infrastructure
• AWS Identity and Access Management provides resource-
level permission controls
• Amazon RDS offers encryption at rest and SSL protection for
data in transit
13. Amazon RDS is easy to monitor with Amazon CloudWatch
CloudWatch RDS Metrics
CPU utilization
Storage
Memory
Swap usage
DB connections
I/O (read and write)
Latency (read and write)
Throughput (read and write)
Replica lag
Many more
CloudWatch Alarms
Similar to on-premises custom
monitoring tools
14. Amazon RDS is cost-effective
• Pay only for what you use; no minimum charge
Example: db.m4.xlarge (4 vCPUs; 16 GiB RAM); MySQL; N. Virginia; Single-AZ; On-Demand; 100 GB General Purpose (SSD)
Monthly bill = instance hours + storage GB
= 720 hrs * $0.35 + 100 GB * $0.115
= $263.50
Assumes DB instance accessed only from Amazon EC2
Further details at http://aws.amazon.com/rds/pricing/
15. Save money with Amazon RDS Reserved Instances
• Pay a low up-front fee to get a lower hourly price on
database instances for a 1- or 3-year term
• Your lower Reserved Instance price applies to any
running instance matching the description you specified
at purchase time
[Chart: cumulative spend in USD, with a "Start saving here" marker at the break-even point]
Month       On-Demand    1-Yr RI
Month 1        263.50   1,777.50
Month 2        527.00   1,789.00
Month 3        790.50   1,800.50
Month 4      1,054.00   1,812.00
Month 5      1,317.50   1,823.50
Month 6      1,581.00   1,835.00
Month 7      1,844.50   1,846.50
Month 8      2,108.00   1,858.00
Month 9      2,371.50   1,869.50
Month 10     2,635.00   1,881.00
Month 11     2,898.50   1,892.50
Month 12     3,162.00   1,904.00
16. How Amazon RDS backups work
Automated backups
• Restore your database to a
point in time
• Enabled by default
• Choose a retention period, up
to 35 days
Manual snapshots
• Build a new database instance
from a snapshot when needed
• Initiated by you
• Persist until you delete them
• Stored in Amazon S3
17. Choose cross-region snapshot copy for even greater
durability, ease of migration
• Copy a database
snapshot to a
different AWS Region
• Warm standby for
disaster recovery
• Or use it as a base
for migration to a
different region
18. Easily migrate to Amazon RDS
• AWS Schema Conversion Tool
• Move schema to new DB
• Convert schema to new DB platform
• AWS Database Migration Service
• Homogeneous (A to A)
• Heterogeneous (A to B)
More info: http://aws.amazon.com/dms/
19. Why should I use RDS?
Let AWS handle these
So you can focus on these
Migration
Backup and recovery
Configuration
Patching
Software upgrades
Storage upgrades
Server upgrades
Hardware issues
Database schema
Query design
Query optimization
21. MySQL-compatible relational database
Performance and availability of
commercial databases
Simplicity and cost-effectiveness of
open source databases
Delivered as a managed service
What is Amazon Aurora?
22. Re-imagined for the cloud
Architected for the cloud—that is, we
moved the logging and storage layer into a
multitenant, scale-out database-optimized
storage service
Leverages existing AWS services: Amazon
EC2, Amazon VPC, Amazon DynamoDB,
Amazon SWF, and Amazon S3
Maintains compatibility with MySQL—
customers can migrate their MySQL
applications as-is and use all MySQL tools
[Diagram: Control Plane and Data Plane. Labels: Amazon DynamoDB, Amazon SWF, Amazon Route 53, Logging + Storage, SQL, Transactions, Caching, Amazon S3; callouts 1 to 3]
23. WRITE PERFORMANCE READ PERFORMANCE
MySQL SysBench results
R3.8XL: 32 cores / 244 GB RAM
5X faster than RDS MySQL 5.6 and 5.7
Five times higher throughput than stock MySQL
based on industry standard benchmarks.
[Charts: write performance and read performance; series: Aurora, MySQL 5.6, MySQL 5.7]
24. WRITE PERFORMANCE READ PERFORMANCE
Scaling with instance sizes
Aurora scales with instance size for both read and write.
Aurora MySQL 5.6 MySQL 5.7
26. How did we achieve this?
DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES
DO LESS WORK
Do fewer I/Os
Minimize network packets
Cache prior results
Offload the database engine
BE MORE EFFICIENT
Process asynchronously
Reduce latency path
Use lock-free data structures
Batch operations together
27. I/O traffic in MySQL
MYSQL WITH REPLICA
[Diagram labels: AZ 1, AZ 2, Primary Instance, Replica Instance, Amazon Elastic Block Store (EBS), two EBS mirrors, Amazon S3; writes numbered 1 to 5]
TYPE OF WRITE: BINLOG, DATA, DOUBLE-WRITE, LOG, FRM FILES
I/O FLOW
Issue write to EBS; EBS issues to mirror, ack when both done
Stage write to standby instance through DRBD
Issue write to EBS on standby instance
OBSERVATIONS
Steps 1, 3, 5 are sequential and synchronous
This amplifies both latency and jitter
Many types of writes for each user operation
Have to write data blocks twice to avoid torn writes
PERFORMANCE
780 K transactions
7,388 K I/Os per million txns (excludes mirroring, standby)
Average 7.4 I/Os per transaction
30 minute SysBench write-only workload, 100 GB dataset, RDS Multi-AZ, 30 K PIOPS
28. I/O traffic in Aurora
AMAZON AURORA
[Diagram labels: AZ 1, AZ 2, AZ 3, Primary Instance, two Replica Instances, Amazon S3; ASYNC 4/6 QUORUM, DISTRIBUTED WRITES]
TYPE OF WRITE: BINLOG, DATA, DOUBLE-WRITE, LOG, FRM FILES
I/O FLOW
Boxcar redo log records, fully ordered by LSN
Shuffle to appropriate segments, partially ordered
Boxcar to storage nodes and issue writes
OBSERVATIONS
Only write redo log records; all steps asynchronous
No data block writes (checkpoint, cache replacement)
6X more log writes, but 9X less network traffic
Tolerant of network and storage outlier latency
PERFORMANCE
27,378 K transactions (35X MORE)
950 K I/Os per 1M txns (6X amplification) (7.7X LESS)
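The performance figures on the two I/O traffic slides can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted on the slides; the constant and function names are illustrative.

```python
# Figures quoted on the MySQL and Aurora I/O traffic slides
# (30-minute SysBench write-only workload).
MYSQL_TXNS_K = 780                 # thousands of transactions
MYSQL_IOS_K_PER_M_TXNS = 7_388     # thousands of I/Os per million transactions
AURORA_TXNS_K = 27_378
AURORA_IOS_K_PER_M_TXNS = 950


def ios_per_txn(ios_k_per_million_txns):
    """Convert 'thousands of I/Os per million transactions' to I/Os per transaction."""
    return ios_k_per_million_txns * 1_000 / 1_000_000


mysql_ios_per_txn = ios_per_txn(MYSQL_IOS_K_PER_M_TXNS)    # about 7.4
aurora_ios_per_txn = ios_per_txn(AURORA_IOS_K_PER_M_TXNS)  # 0.95

# Ratio of transactions completed (slide: 35X more).
txn_ratio = AURORA_TXNS_K / MYSQL_TXNS_K

# Ratio of I/Os per transaction (slide: 7.7X less).
io_ratio = MYSQL_IOS_K_PER_M_TXNS / AURORA_IOS_K_PER_M_TXNS
```

The transaction ratio comes out just above 35, and the I/O ratio falls between 7.7 and 7.8, which matches the rounded values on the slide.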
29. Real-life data: read replica latency
“In RDS MySQL, we saw replica lag spike to almost 12 minutes, which
is almost absurd from an application’s perspective. The maximum
read replica lag across 4 replicas never exceeded beyond 20 ms.”
30. I/O traffic in Aurora Replicas
MYSQL READ SCALING
MySQL Master: 30% Read, 70% Write
MySQL Replica: 30% New Reads, 70% Write (SINGLE-THREADED BINLOG APPLY)
Data Volume (x2)
Logical: Ship SQL statements to replica
Write workload similar on both instances
Independent storage
Can result in data drift between master and replica
AMAZON AURORA READ SCALING
Aurora Master: 30% Read, 70% Write
Aurora Replica: 100% New Reads (PAGE CACHE UPDATE)
Shared Multi-AZ Storage
Physical: ship redo from master to replica
Replica shares storage; no writes performed
Cached pages have redo applied
Advance read view when all commits seen
32. Storage durability
Storage volume automatically grows up to 64 TB
Quorum system for read/write; latency tolerant
Peer to peer gossip replication to fill in holes
Continuous backup to S3 (built for 11 9s durability)
Continuous monitoring of nodes and disks for repair
10 GB segments as unit of repair or hotspot rebalance
Quorum membership changes do not stall writes
AZ 1 AZ 2 AZ 3
Amazon S3
33. Fault-tolerant storage
Six copies across three Availability Zones
4 out of 6 write quorum; 3 out of 6 read quorum
Peer-to-peer replication for repairs
Volume striped across hundreds of storage nodes
[Two diagrams of the SQL, transaction, and caching layers over AZ 1, AZ 2, and AZ 3, captioned "Read and write availability" and "Read availability"]
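The quorum sizes on this slide determine how many lost copies the volume can tolerate. The following is a rough sketch of that arithmetic, not a description of the storage service's implementation; it assumes the six copies are spread evenly, two per Availability Zone, and the names are illustrative.

```python
COPIES = 6          # six copies across three Availability Zones (from the slide)
WRITE_QUORUM = 4    # 4 out of 6 write quorum
READ_QUORUM = 3     # 3 out of 6 read quorum
COPIES_PER_AZ = 2   # assumption: copies are spread evenly over the three AZs

# A read quorum must overlap every write quorum, and two write quorums
# must overlap each other, so that the latest write is always visible.
assert READ_QUORUM + WRITE_QUORUM > COPIES
assert 2 * WRITE_QUORUM > COPIES


def availability(failed_copies):
    """Report whether reads and writes can still reach a quorum."""
    surviving = COPIES - failed_copies
    return {
        "read": surviving >= READ_QUORUM,
        "write": surviving >= WRITE_QUORUM,
    }


one_az_down = availability(COPIES_PER_AZ)               # reads and writes continue
one_az_plus_one_down = availability(COPIES_PER_AZ + 1)  # reads only
```

Under these assumptions, losing a whole Availability Zone leaves four copies, enough for both quorums, and losing one more copy still leaves a read quorum.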
34. Survivable caches
We moved the cache out of the
database process
Cache remains warm in the event of
database restart
Lets you resume fully loaded
operations much faster
Instant crash recovery + survivable
cache = quick and easy recovery from
DB failures
[Diagram: SQL, transactions, and caching layers. The caching process is outside the DB process and remains warm across a database restart.]
35. Aurora Replicas are failover targets
Aurora cluster contains primary node
and up to 15 secondary nodes
Failing database nodes are
automatically detected and replaced
Failing database processes are
automatically detected and recycled
Secondary nodes automatically
promoted on persistent outage, no
single point of failure
Customer application can scale out
read traffic across secondary nodes
[Diagrams: primary and secondary nodes spread across AZ 1, AZ 2, and AZ 3]
Customer-specifiable failover order
Read balancing across Aurora Replicas
36. Simulate failures using SQL
To cause the failure of a component at the database node:
ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]
To simulate the failure of disks:
ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN
[DISK index | NODE index] FOR INTERVAL interval
To simulate the failure of networking:
ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type
[TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval
38. Well-established MySQL ecosystem
Business Intelligence Data Integration Query and Monitoring SI and Consulting
Source: Amazon
“We ran our compatibility test suites against Amazon Aurora and everything
just worked.” - Dan Jewett, Vice President of Product Management at Tableau
41. Why Aurora?
• Architected for 99.99% availability
• Enterprise performance (5x) at 1/10 the cost
• Compatible with MySQL 5.6
• Automatically grows storage as needed, up to 64 TB
• Easy migration from MySQL
• Up to 15 Aurora Replicas in a region
• Cross-region replication
• Encryption in transit and at rest
• Continuous backup to S3 (11 9’s data durability)
• Fully managed
42. Recent feature releases for Amazon RDS
May 18, 2016 Amazon Aurora now supports sharing database snapshots across accounts
May 6, 2016 Deploy Siebel CRM applications on Amazon RDS for Oracle
May 4, 2016 RDS Enhanced Monitoring is now available in South America (Sao Paulo) and China (Beijing)
April 27, 2016 MariaDB audit plug-in now available for RDS MySQL and MariaDB
April 26, 2016 Amazon RDS MySQL now supports point-and-click upgrade from MySQL 5.6 to 5.7
April 22, 2016 Enhanced Monitoring is now available for Amazon RDS for SQL Server
April 8, 2016 Amazon RDS for PostgreSQL now supports version 9.5 with minor version 9.5.2, and minor versions 9.4.7 and 9.3.12
April 1, 2016 Cluster view for Amazon Aurora in RDS console
April 1, 2016 Amazon RDS now supports January PSU patches, improved custom Oracle directories and read privileges support
Detailed listing available here: http://aws.amazon.com/rds/whats-new/
43. Try Amazon RDS for free
• For your first year, at no charge…
• 750 free instance-hours allow you to run a micro database
instance continuously
• 20 GB of database instance storage
• 20 GB for automated backups
• 10,000 I/Os
• Learn more about the AWS free
tier: http://aws.amazon.com/free/
Show of hands:
How many people are familiar with Amazon RDS?
How many people have used one of the RDS database platforms?
A database instance is a virtual database server in the cloud, with the compute and storage resources you specify. You can create and delete DB Instances, define and refine the infrastructure attributes of your DB Instances, and control access and security via the AWS Management Console, Amazon RDS APIs, and Command Line Tools. You can run one or more DB Instances, and each DB Instance can support one or more databases or database schemas, depending on engine type.
DB Instances are simple to create, using either the AWS Management Console, Amazon RDS APIs, or Command Line Tools. To launch a DB Instance using the AWS Management Console, click "RDS," then the "Launch a DB Instance" button on the "Amazon RDS" tab. From there, you can specify the fundamental parameters for your DB instance:
DB engine: MySQL, Oracle, Microsoft SQL Server, PostgreSQL (and, now in preview, Amazon Aurora)
DB engine version (optional)
License Model (optional)
DB Instance type
Amount of allocated storage (in GB)
Whether your DB Instance should run as a Multi-AZ deployment
Storage type
DB Instance identifier
Master user name
Master user password
You also have the ability to change your DB Instance’s backup retention policy, preferred backup window, and scheduled maintenance window. Alternatively, you can create your DB Instance using the CreateDBInstance API or rds-create-db-instance command.
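As an illustration of how the parameters listed above map onto a CreateDBInstance request, here is a minimal sketch that only assembles the request parameters. The helper name and its defaults are hypothetical; the dictionary keys follow the CreateDBInstance parameter names. No call to AWS is made; an SDK such as boto3 could take the result as keyword arguments to its `create_db_instance` method.

```python
def build_create_db_instance_params(
    identifier,
    engine,
    instance_class,
    allocated_storage_gb,
    master_username,
    master_password,
    multi_az=False,
    storage_type="gp2",
    engine_version=None,
    license_model=None,
    backup_retention_days=None,
):
    """Assemble CreateDBInstance parameters (hypothetical helper; no API call)."""
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": instance_class,
        "AllocatedStorage": allocated_storage_gb,  # in GB
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "MultiAZ": multi_az,
        "StorageType": storage_type,
    }
    # Optional settings are sent only when the caller specifies them.
    if engine_version is not None:
        params["EngineVersion"] = engine_version
    if license_model is not None:
        params["LicenseModel"] = license_model
    if backup_retention_days is not None:
        params["BackupRetentionPeriod"] = backup_retention_days
    return params


example_params = build_create_db_instance_params(
    identifier="example-db",          # placeholder values for illustration
    engine="mysql",
    instance_class="db.m4.xlarge",
    allocated_storage_gb=100,
    master_username="admin",
    master_password="change-me",
    multi_az=True,
    backup_retention_days=7,
)
```

Keeping parameter assembly separate from the API call makes the choices (engine, class, storage, Multi-AZ) easy to review before anything is provisioned.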
The automated backup feature of Amazon RDS enables point-in-time recovery of your DB Instance. You can initiate a point-in-time restore and specify any second during your retention period, up to the Latest Restorable Time.
Amazon RDS provides backup storage up to 100% of your provisioned database storage at no additional charge. For example, if you have 10GB-months of provisioned database storage, we will provide up to 10GB-months of backup storage at no additional charge.
Amazon RDS allows you to control if and when the relational database software powering your DB Instance is upgraded to new versions supported by Amazon RDS. This provides you with the flexibility to maintain compatibility with specific engine versions, test new versions with your application before deploying in production, and perform version upgrades on your own terms and timelines.
We’ll explain Multi-AZ on the next slide.
In this simple application stack, an application running in an Amazon EC2 instance is supported by a master database running in an Amazon RDS database instance. It is a best practice to present an application out to its consumers behind an Elastic Load Balancer, so that compute resiliency and scaling features such as Auto Scaling and ELB groups can be adopted in the future.
Amazon RDS Multi-AZ deployments provide enhanced availability and durability for Database (DB) Instances, making them a natural fit for production database workloads. When you provision a Multi-AZ DB Instance, Amazon RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. In case of an infrastructure failure (for example, instance hardware failure, storage failure, or network disruption), Amazon RDS performs an automatic failover to the standby, so that you can resume database operations as soon as the failover is complete. Since the endpoint for your DB Instance remains the same after a failover, your application can resume database operation without the need for manual administrative intervention.
Multi-AZ is available for all RDS engines.
Because Multi-AZ minimizes the downtime impact of scheduled maintenance, it gives value even to deployments in which the app servers are in a single AZ. But it’s still best to have the instances spread across multiple AZs.
This application stack employs AWS reliability and durability features. An ELB group of Amazon EC2 instances supports the application logic. The instances use a Multi-AZ Amazon RDS deployment. In the event of infrastructure failure, the database fails over to a standby instance. The application logic retries its database connections, to the same endpoint as before, and service resumes using the new master. Meanwhile, a new standby is instantiated.
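The retry behavior described above can be sketched as follows. This is an assumption-laden outline rather than a recommended client: `connect` stands in for whatever your database driver provides, `TransientDBError` stands in for the driver's connection error, and the backoff values are arbitrary.

```python
import time


class TransientDBError(Exception):
    """Stand-in for the connection error a real database driver would raise."""


def connect_with_retry(connect, endpoint, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry connecting to the same endpoint, backing off between attempts.

    After a Multi-AZ failover the endpoint is unchanged, so the caller
    keeps retrying the same name until the new master accepts connections.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect(endpoint)
        except TransientDBError as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise last_error
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real delays.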
In addition to Amazon RDS’s automatic backups, the database snapshot feature is employed to ensure that backups are durably retained. You can create a new database instance from a database snapshot whenever you desire.
Amazon RDS General Purpose (SSD) Storage is suitable for a broad range of database workloads that have moderate I/O requirements. With the baseline of 3 IOPS/GB and ability to burst up to 3,000 IOPS, this storage option provides predictable performance to meet the needs of most applications.
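As a worked example of the baseline figure, the sketch below computes the baseline IOPS for a given allocation from the 3 IOPS/GB rate and the 3,000 IOPS burst level quoted above. It deliberately ignores any minimum or maximum the service may enforce; the function name is illustrative.

```python
BASELINE_IOPS_PER_GB = 3   # from the text above
BURST_IOPS = 3_000         # from the text above


def general_purpose_iops(allocated_gb):
    """Return baseline IOPS and the headroom available when bursting."""
    baseline = BASELINE_IOPS_PER_GB * allocated_gb
    return {
        "baseline": baseline,
        "burst_headroom": max(BURST_IOPS - baseline, 0),
    }


example = general_purpose_iops(100)  # 100 GB volume: 300 baseline IOPS
```

For a 100 GB volume this gives a 300 IOPS baseline with 2,700 IOPS of burst headroom.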
Amazon RDS Provisioned IOPS (SSD) Storage is an SSD-backed storage option designed to deliver fast, predictable, and consistent I/O performance. With Amazon RDS Provisioned IOPS (SSD) Storage, you specify an IOPS rate when creating a DB Instance, and Amazon RDS provisions that IOPS rate for the lifetime of the DB Instance. Amazon RDS Provisioned IOPS (SSD) Storage is optimized for I/O-intensive, transactional (OLTP) database workloads.
Formerly known as Standard storage, Amazon RDS Magnetic Storage is useful for small database workloads where data is accessed less frequently.
Choose the storage type most suited for your workload.
High-performance OLTP workloads: Amazon RDS Provisioned IOPS (SSD) Storage
Database workloads with moderate I/O requirements: Amazon RDS General Purpose (SSD) Storage
Small database workloads with infrequent I/O: Amazon RDS Magnetic Storage
The computation and memory capacity of a DB instance is determined by its DB instance class. You can change the CPU and memory available to a DB instance by changing its DB instance class; to change the DB instance class, you must modify the DB instance.
Here are the DB instance classes available through Amazon RDS:
Micro instances (db.t1.micro): An instance sufficient for testing but should not be used for production applications.
Standard - Current Generation (m3): Second generation instances that provide more computing capacity than the first generation db.m1 instance classes at a lower price.
Memory Optimized - Current Generation (db.r3): Second generation instances that provide memory optimization and more computing capacity than the first generation db.m2 instance classes at a lower price.
Burst Capable - Current Generation (db.t2): Instances that provide baseline performance level with the ability to burst to full CPU usage.
You can change from one database instance type to another. There will be a brief availability event during the changeover.
You can increase the amount of storage available to your database instance on demand for the MySQL, Oracle, and PostgreSQL database engines. This change is performed online, without an availability impact. Amazon Aurora automatically grows the database size on demand.
Encryption at rest is available with certain engines. SSL support for database connections is available with Amazon RDS for MySQL, PostgreSQL, SQL Server, and Amazon Aurora.
Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, and set alarms. The system-wide visibility into resource utilization, application performance, and operational health that Amazon CloudWatch provides can help you keep your applications running smoothly.
This bill illustrates an example monthly bill for an Amazon RDS instance. (For the sake of simplicity, we’ll treat a month as 720 hours). The bill has two major components: the price for the hours during which the RDS instance ran, and the storage for that instance. An On-Demand m4.xlarge instance, running MySQL in the US East (N. Virginia) region, costs $0.35 per hour. General Purpose (SSD) storage costs $0.115 per gigabyte per month. The total monthly bill works out to $263.50. Further savings are available with Reserved Instances.
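The bill arithmetic can be reproduced in a few lines. This sketch uses the example prices quoted above and `Decimal` to avoid floating-point rounding in currency; the function and constant names are illustrative.

```python
from decimal import Decimal

HOURS_PER_MONTH = 720                              # simplification used in the example
INSTANCE_PRICE_PER_HOUR = Decimal("0.35")          # On-Demand db.m4.xlarge, MySQL, N. Virginia
STORAGE_PRICE_PER_GB_MONTH = Decimal("0.115")      # General Purpose (SSD)


def monthly_bill(hours, storage_gb):
    """Instance-hours charge plus storage charge, rounded to cents."""
    total = hours * INSTANCE_PRICE_PER_HOUR + storage_gb * STORAGE_PRICE_PER_GB_MONTH
    return total.quantize(Decimal("0.01"))


example_bill = monthly_bill(HOURS_PER_MONTH, 100)
```

With 720 hours and 100 GB, the instance charge is $252.00 and the storage charge is $11.50, for the $263.50 total shown on the slide.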
Amazon RDS Reserved Instances give you the option to make a low, one-time payment for each DB instance you want to reserve and in turn receive a significant discount on the hourly charge for that instance. The example, which compares the cumulative expenditure for an RDS instance purchased On-Demand versus through a 1-Year Heavy Utilization Reserved Instance, demonstrates that the break-even point is well in advance of the expiration of the term.
Reserved instances are keyed to DB instance class, DB engine, and the choice of Single-AZ vs. Multi-AZ. When you have purchased a reserved instance in an AWS account, the lower hourly rate applies to a database instance running under that account that matches that description.
Heavy RIs are appropriate for production database workloads.
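The break-even point can be read directly off the cumulative spend figures from the Reserved Instances slide. The sketch below copies those figures and finds the first month in which the Reserved Instance total falls below the On-Demand total; the names are illustrative.

```python
# Cumulative spend in USD for months 1 to 12, copied from the slide.
ON_DEMAND = [263.50, 527.00, 790.50, 1054.00, 1317.50, 1581.00,
             1844.50, 2108.00, 2371.50, 2635.00, 2898.50, 3162.00]
ONE_YEAR_RI = [1777.50, 1789.00, 1800.50, 1812.00, 1823.50, 1835.00,
               1846.50, 1858.00, 1869.50, 1881.00, 1892.50, 1904.00]


def break_even_month(on_demand, reserved):
    """Return the first month (1-based) where reserved spend is lower, or None."""
    for month, (od, ri) in enumerate(zip(on_demand, reserved), start=1):
        if ri < od:
            return month
    return None


first_saving_month = break_even_month(ON_DEMAND, ONE_YEAR_RI)
savings_after_term = ON_DEMAND[-1] - ONE_YEAR_RI[-1]
```

With these figures, the Reserved Instance becomes cheaper in month 8, and the difference at the end of the 12-month term is $1,258.00.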
When automated backups are turned on for your DB Instance, Amazon RDS automatically performs a full daily snapshot of your data (during your preferred backup window) and captures transaction logs (as updates to your DB Instance are made). When you initiate a point-in-time recovery, transaction logs are applied to the most appropriate daily backup in order to restore your DB Instance to the specific time you requested. Amazon RDS retains backups of a DB Instance for a limited, user-specified period of time called the retention period, which by default is one day but can be set to up to 35 days.
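To make the point-in-time recovery idea concrete, the sketch below picks the daily snapshot from which transaction logs would be replayed. It assumes that "most appropriate" means the latest snapshot taken at or before the requested time; that is an illustrative reading, not a statement of how Amazon RDS chooses internally, and the function name is hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def pick_base_snapshot(snapshot_times, restore_time):
    """Return the latest snapshot taken at or before restore_time.

    Transaction logs would then be replayed from that snapshot up to
    restore_time. Raises ValueError if no snapshot is old enough.
    """
    ordered = sorted(snapshot_times)
    index = bisect_right(ordered, restore_time)
    if index == 0:
        raise ValueError("restore time precedes the oldest retained snapshot")
    return ordered[index - 1]


daily_snapshots = [
    datetime(2016, 5, 1, 3, 0),
    datetime(2016, 5, 2, 3, 0),
    datetime(2016, 5, 3, 3, 0),
]
base = pick_base_snapshot(daily_snapshots, datetime(2016, 5, 2, 15, 30))
```

Here a restore to 15:30 on May 2 would start from the 03:00 snapshot of the same day and replay roughly twelve and a half hours of logs.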
Manual database snapshots are user-initiated and enable you to back up your DB Instance in a known state as frequently as you wish, and then restore to that specific state at any time. DB Snapshots can be created with the AWS Management Console or CreateDBSnapshot API and are kept until you explicitly delete them with the Console or DeleteDBSnapshot API.
Manual database snapshots are kept in Amazon Simple Storage Service (Amazon S3). Amazon S3 is designed for 99.999999999% durability.
Cross-region snapshot copy is available for all Amazon RDS engines. You can copy snapshots of any size. Copies can be moved between any of the public AWS Regions, and you can copy the same snapshot to multiple Regions simultaneously by initiating more than one transfer. There is no charge for the copy operation itself; you pay only for the data transfer out of the source Region and for the data storage in the destination Region.