2. AWS Data Services to Accelerate Your Move to the Cloud
Databases to Elevate your Apps
• Relational: RDS Open Source, RDS Commercial, Aurora
• Non-Relational & In-Memory: DynamoDB & DAX, ElastiCache
Analytics to Engage your Data
• Categories: Inline, Data Warehousing, Reporting, Data Lake
• Services: EMR, Amazon Redshift, Redshift Spectrum, Athena, Elasticsearch Service, QuickSight, Glue
Amazon AI to Drive the Future
• Lex, Polly, Rekognition, Machine Learning, Deep Learning (MXNet)
Migration for DB Freedom
• Database Migration, Schema Conversion
4. Why Start With SQL?
• Established, well-worn technology
• Lots of existing code, communities, books, background, tools, etc.
• You aren’t going to break SQL DBs in your first 10 million
users. Probably.
• Clear patterns to scalability
5. Why Start With NoSQL?
• Super low latency applications
• Metadata driven datasets
• Highly non-relational data
• Need schema-less data constructs*
• Massive amounts of data (again, in the TB range)
• Rapid ingest of data (thousands of records/sec)
• Small datasets with low latency and high scalability
*Need != “it’s easier to do dev without schemas”
10. Thinking About the Questions
• Should I use SQL or NoSQL?
• Should I use MySQL or PostgreSQL?
• Should I use Redis, Memcached, or ElastiCache?
• Should I use MongoDB, Cassandra, or DynamoDB?
11. Actually, Thinking About the Right Questions
• What are my scale and latency needs?
• What are my transactional and consistency needs?
• What are my read/write, storage and IOPS needs?
• What are my time to market and server control needs?
12. Factors to Consider
Factors | SQL | NoSQL
Application | App with complex business logic? | Web app with lots of users?
Transactions | Complex txns, joins, updates? | Simple data model, updates, queries?
Scale | Developer managed | Automatic, on-demand scaling
Performance | Developer architected | Consistent, high performance at scale
Availability | Architected for fail-over | Seamless and transparent
Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP
Best of both worlds: Possible to Use SQL and NoSQL models in one App
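A minimal sketch of using both models in one app: transactional order data goes to a relational store, session state goes to a key-value store. Everything here is a stand-in and an assumption: sqlite3 plays the role of a managed SQL database, a plain dict plays the role of a NoSQL table, and the table and function names are hypothetical.

```python
import sqlite3

# Relational side: sqlite3 (standard library) stands in for a managed SQL engine.
def open_orders_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total_cents INTEGER NOT NULL)"
    )
    return db

def place_order(db, customer_id, total_cents):
    # 'with db' runs the insert in a transaction: commit on success, rollback on error.
    with db:
        cur = db.execute(
            "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
            (customer_id, total_cents),
        )
    return cur.lastrowid

def customer_totals(db):
    # A join plus an aggregate: the kind of query the SQL column is suited to.
    rows = db.execute(
        "SELECT c.name, SUM(o.total_cents) FROM customers c"
        " JOIN orders o ON o.customer_id = c.id"
        " GROUP BY c.name ORDER BY c.name"
    )
    return rows.fetchall()

# Key-value side: a dict stands in for a NoSQL table keyed by session id.
sessions = {}

def save_session(session_id, attributes):
    sessions[session_id] = dict(attributes)

def load_session(session_id):
    return sessions.get(session_id)

db = open_orders_db()
with db:
    db.execute("INSERT INTO customers (id, name) VALUES (1, 'ada'), (2, 'bob')")
place_order(db, 1, 1200)
place_order(db, 1, 800)
place_order(db, 2, 500)
save_session("s-1", {"customer_id": 1, "cart_items": 2})
```

The split follows the table: joins and multi-row transactions stay on the SQL side, while simple lookups by a single key go to the key-value side.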
13. Why Managed Databases?
[Pie chart of database administration tasks; percentage labels shown: 25%, 40%, 5%, 5%]
• backup & recovery, data load & unload
• performance tuning
• scripting & coding
• security
• planning
• install, upgrade, patch and migrate
• documentation, licensing & training
14. If You Host Your Databases On-premises
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
15. If You Host Your Databases in EC2
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization
16. If You Choose a Managed Database Service
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
App optimization
High availability
DB s/w installs
OS installation
you
Scaling
19. Key Features
• Manageability
§ Rapid deployment with pre-configured parameters
§ Patch Management
§ Monitoring and Metrics
• Availability and Data Durability
• Scalability
• Fast
• Secure
§ Encryption in transit and at rest
§ TDE with Oracle Database and SQL Server
• Inexpensive
20. Backups and Recovery
• DB Snapshots
§ User-driven snapshots of database
§ Kept until explicitly deleted
• Automated Backups
§ Nightly system snapshots + transaction backup
§ Enables point-in-time restore to any point in the retention period, up to the last 5 minutes
§ Max retention period = 35 days
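The point-in-time restore window can be written as simple date arithmetic. This is a sketch under stated assumptions, not an RDS API call: it assumes the window starts exactly `retention_days` before now and ends 5 minutes before now, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35              # maximum retention period from the slide
RESTORE_LAG = timedelta(minutes=5)   # "up to the last 5 minutes"

def restore_window(now, retention_days):
    """Return (earliest, latest) restorable times for automated backups."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    # Assumption: backups have been running for the whole retention period.
    earliest = now - timedelta(days=retention_days)
    latest = now - RESTORE_LAG
    return earliest, latest

def can_restore_to(target, now, retention_days):
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest

now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
earliest, latest = restore_window(now, retention_days=7)
```

DB snapshots fall outside this window logic, since they are kept until explicitly deleted.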
22. Push Button Scaling
• Scale nodes vertically up or down
§ t2.small (1 virtual core, 2 GiB)
§ m3.2xlarge (8 virtual cores, 30 GiB)
§ r3.8xlarge (32 virtual cores, 244 GiB)
• Convert storage to Provisioned IOPS (PIOPS)
§ Consistent throughput + low I/O latencies
• Scale storage vertically without downtime
§ Increase throughput by spreading data across additional volumes, with no impact
§ Independently scale provisioned IOPS
23. Horizontal Scaling with Read Replicas
• Add Read Replicas
§ Horizontal scaling of read-heavy workloads
§ Offload reporting
• Currently available for MySQL, PostgreSQL
§ Asynchronous, native tech
• Overcoming Challenges
§ RDS makes it easy to re-create a replica that has fallen behind
§ Deploy a proxy to round-robin requests
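The "proxy to round-robin requests" idea can be sketched in a few lines. This is an illustration only: the endpoint names are hypothetical, and classifying a statement as a read by its leading `SELECT` is a deliberate simplification.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the primary and spread reads over replicas round-robin."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = list(replicas)
        # With no replicas configured, reads fall back to the primary.
        self._cycle = itertools.cycle(self._replicas or [primary])

    def endpoint_for(self, sql):
        # Crude classification for illustration: only plain SELECTs count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._cycle) if is_read else self.primary

router = ReadWriteRouter(
    primary="primary.example.internal",          # hypothetical endpoint names
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
```

Because replication is asynchronous, a read routed to a replica may not yet see a write that just went to the primary; reads that must be current can be sent to `router.primary` directly.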
24. RDS for Production Workloads
Amazon RDS configuration goals: improve availability, increase throughput, reduce latency
Configuration options: Push-Button Scaling, Multi-AZ, Read Replicas, Provisioned IOPS
[Diagram: a Region with a Multi-AZ deployment across two Availability Zones]
25. Amazon RDS for MariaDB
• Same features and pricing as RDS MySQL
• Available in the free tier
• Differences from RDS MySQL
– XtraDB and Aria storage engines only
– Version 10.x and 11.x MariaDB
– Current generation instances (not t1, m1, cr1)
27. What is Amazon Aurora?
• MySQL-compatible relational database
• Performance and availability of commercial databases
• Simplicity and cost-effectiveness of open source databases
• Delivered as a managed service
29. Relational databases were not designed for the cloud
Multiple layers of functionality all in a monolithic stack:
SQL, Transactions, Caching, Logging
30. Not much has changed in last 30 years
Even when you scale it out, you’re still replicating the same stack
[Diagram: three scale-out variants, each with an Application on top of two copies of the full SQL / Transactions / Caching / Logging stack; one variant shares a Storage layer]
31. This is a problem.
For cost. For flexibility. And for availability.
32. Reimagining the relational database
What if you were inventing the database today?
You wouldn’t design it the way we did in 1970.
You’d build something
✓ that can scale out …
✓ that is self-healing …
✓ that leverages existing AWS services …
33. A service-oriented architecture applied to the database
• Moved the logging and storage layer into a multitenant, scale-out, database-optimized storage service
• Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations
• Integrated with Amazon S3 for continuous backup with 99.999999999% durability
[Diagram with callouts 1 to 3: data plane (SQL, Transactions, Caching over Logging + Storage, backed by Amazon S3) and control plane (Amazon DynamoDB, Amazon SWF, Amazon Route 53)]
34. Amazon Aurora is fast
“When we ran Alfresco’s workload on Aurora, we were blown away to find that Aurora was 10X faster than our MySQL environment,” said John Newton, Founder and CTO of Alfresco. “Speed matters in our business and Aurora has been faster, cheaper, and considerably easier to use than MySQL.”
35. SQL benchmark results
• MySQL Sysbench
• R3.8XL with 32 cores and 244 GB RAM
WRITE PERFORMANCE
• 4 client machines with 1,000 threads each
READ PERFORMANCE
• Single client with 1,000 threads
36. Scaling table count
Tables | Amazon Aurora | MySQL I2.8XL local SSD | MySQL I2.8XL RAM disk | RDS MySQL 30 K IOPS (single AZ)
10 | 60,000 | 18,000 | 22,000 | 25,000
100 | 66,000 | 19,000 | 24,000 | 23,000
1,000 | 64,000 | 7,000 | 18,000 | 8,000
10,000 | 54,000 | 4,000 | 8,000 | 5,000
• Write-only workload
• 1,000 connections
• Query cache (default on for Amazon Aurora, off for MySQL)
Up to 11x faster
37. Scaling user connections
• OLTP workload
• Variable connection count
• 250 tables
• Query cache (default on for Amazon Aurora, off for MySQL)
Connections | Amazon Aurora | RDS MySQL 30 K IOPS (single AZ)
50 | 40,000 | 10,000
500 | 71,000 | 21,000
5,000 | 110,000 | 13,000
Up to 8x faster
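The "up to 8x" claim can be checked against the figures on this slide with a few lines of arithmetic. The numbers are copied from the connections table; the function name is illustrative.

```python
# Figures copied from the "Scaling user connections" slide.
# Keys are connection counts; values are (Amazon Aurora, RDS MySQL 30 K IOPS).
throughput = {
    50: (40_000, 10_000),
    500: (71_000, 21_000),
    5_000: (110_000, 13_000),
}

def speedups(table):
    """Return {connections: Aurora figure / RDS MySQL figure}."""
    return {conns: aurora / mysql for conns, (aurora, mysql) in table.items()}

ratios = speedups(throughput)
best = max(ratios, key=ratios.get)   # connection count with the largest ratio
for conns, ratio in ratios.items():
    print(f"{conns:>5} connections: {ratio:.1f}x")
```

The largest ratio is at 5,000 connections (110,000 / 13,000, a little under 8.5), which matches the rounded "up to 8x" label; at lower connection counts the gap is closer to 3x to 4x.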
38. How do we achieve these results?
DO LESS WORK
• Do fewer I/Os
• Minimize network packets
• Cache prior results
• Offload the database engine
BE MORE EFFICIENT
• Process asynchronously
• Reduce latency path
• Use lock-free data structures
• Batch operations together
39. I/O traffic patterns: MySQL vs. Aurora
Types of writes: binlog, data, double-write buffer, log records, FRM files and metadata
MYSQL WITH STANDBY
[Diagram: primary instance in AZ 1 and standby instance in AZ 2, each doing sequential writes to Amazon Elastic Block Store (EBS) and an EBS mirror, with backup to Amazon S3]
AMAZON AURORA
[Diagram: primary instance and replica instance over storage in AZ 1, AZ 2, and AZ 3; asynchronous, distributed writes with a 4/6 quorum; backup to Amazon S3]
40. I/O traffic patterns: MySQL vs. Aurora
Types of writes: WAL, data, commit log & files
POSTGRESQL WITH STANDBY
[Diagram: primary database node in AZ 1 and standby database node in AZ 2, each writing to Amazon Elastic Block Store (EBS) and an EBS mirror, with backup to Amazon S3]
AMAZON AURORA
[Diagram: primary instance and replica instance over storage in AZ 1, AZ 2, and AZ 3; asynchronous, distributed writes with a 4/6 quorum; backup to Amazon S3]
41. I/O volume: MySQL vs. Aurora
Workload | MySQL w/ 30 K PIOPS | Aurora | Aurora / MySQL
Read Only | 24,814 | 0 | 0.00%
Write Only | 7,387,798 | 158,323 | 2.21%
OLTP | 7,722,684 | 201,292 | 2.61%
R/W: 50/50 | 23,753,366 | 364,032 | 1.55%
100 GB database / 1 M Sysbench transactions
Up to 50x lower I/O volume
42. Amazon Aurora is highly available
“Using Amazon Aurora, we can run many replicas with millisecond latency. This
means during a power event we can handle large surges in traffic and still give our
customers timely, up-to-date information. In addition, spreading these replicas across
multiple AWS Availability Zones with automatic failover gives us confidence that our
databases will be there when we need them.” – Edward Wong, Solutions Architect
at PG&E
43. Highly available storage
• Six copies of data; quorum system for read/write; latency tolerant
• Background scrubbing; CRC on the wire and on disk
• Peer-to-peer gossip replication for catch up and recovery
• Continuous backup to Amazon S3 as a quorum set member
• 10 GB segments as unit of repair or hot spot rebalance
[Diagram: storage copies spread across AZ 1, AZ 2, and AZ 3, backed by Amazon S3]
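A small sketch of why a 4-of-6 write quorum survives the loss of a whole Availability Zone. It assumes two copies per AZ across three AZs (the slide only says six copies across AZ 1 to AZ 3) and uses the 4/6 write quorum shown on the I/O traffic slides. The function names are illustrative.

```python
COPIES_PER_AZ = 2                    # assumption: six copies spread evenly over three AZs
AZS = ("AZ 1", "AZ 2", "AZ 3")
WRITE_QUORUM = 4                     # the "4/6 quorum" shown for Aurora writes

def surviving_copies(failed_azs=(), extra_failed_copies=0):
    """Count copies still reachable after AZ outages plus individual copy failures."""
    healthy_azs = [az for az in AZS if az not in failed_azs]
    return max(0, len(healthy_azs) * COPIES_PER_AZ - extra_failed_copies)

def can_write(failed_azs=(), extra_failed_copies=0):
    """A write succeeds when at least WRITE_QUORUM copies can acknowledge it."""
    return surviving_copies(failed_azs, extra_failed_copies) >= WRITE_QUORUM
```

Under these assumptions, `can_write(("AZ 2",))` is true because four copies remain, while losing an AZ plus one more copy drops below the write quorum.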
44. Instant crash recovery
Traditional databases
• Have to replay logs since the last checkpoint
• Single-threaded in MySQL; requires a large number of disk accesses
• Crash at T0 requires a reapplication of the SQL in the redo log since last checkpoint
Amazon Aurora
• Underlying storage replays redo records on demand as part of a disk read
• Parallel, distributed, asynchronous
• Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
[Diagram: checkpointed data and redo log, with a crash at T0]
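The contrast between replaying the whole log up front and applying redo on demand can be shown with a toy model. This is a sketch of the idea only, not Aurora's implementation: redo records are modeled as full replacement values for a page, and all names are hypothetical.

```python
class Segment:
    """A storage segment holding pages plus redo records not yet applied."""

    def __init__(self, pages):
        self.pages = dict(pages)   # page_id -> last materialized value
        self.pending = {}          # page_id -> list of redo records (new values)

    def append_redo(self, page_id, new_value):
        self.pending.setdefault(page_id, []).append(new_value)

    def read(self, page_id):
        # Redo is applied on demand, and only for the page being read.
        for new_value in self.pending.pop(page_id, []):
            self.pages[page_id] = new_value
        return self.pages[page_id]

def traditional_recovery(pages, redo_log):
    """Replay every redo record since the checkpoint before serving any read."""
    pages = dict(pages)
    for page_id, new_value in redo_log:
        pages[page_id] = new_value
    return pages
```

In the on-demand model nothing has to finish before the first read is served, and separate segments can apply their own pending records independently, which is where the parallelism comes from.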
45. Survivable caches
• We moved the cache out of the database process
• Cache remains warm in the event of a database restart
• Lets you resume fully loaded operations much faster
• Instant crash recovery + survivable cache = quick and easy recovery from DB failures
[Diagram: SQL, Transactions, and Caching layers; the caching process is outside the DB process and remains warm across a database restart]
46. Faster, more predictable failover
MYSQL: DB failure → failure detection → DNS propagation → recovery → app running (15–30 sec)
AURORA WITH MARIADB DRIVER: DB failure → failure detection → DNS propagation → recovery → app running (5–20 sec)
47. Simulate failures using SQL
• To cause the failure of a component at the database node:
ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]
• To simulate the failure of disks:
ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN
[DISK index | NODE index] FOR INTERVAL interval
• To simulate the failure of networking:
ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type
[TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval
48. Amazon Aurora is easy to use
“Amazon Aurora’s new user-friendly monitoring interface made it
easy to diagnose and address issues. Its performance, reliability and
monitoring really shows Amazon Aurora is an enterprise-grade AWS
database.” – Mohamad Reza, Information Systems Officer at United
Nations
49. Simplify database management
You
• Schema design
• Query construction
• Query optimization
RDS
• Automatic failover
• Backup and recovery
• Isolation and security
• Industry compliance
• Push-button scaling
• Automated patching
• Advanced monitoring
• Routine maintenance
Amazon RDS takes care of your time-consuming database management tasks, freeing you to focus on your applications and business
50. Simplify storage management
§ Continuous, incremental backups to Amazon S3
§ Instantly create user snapshots—no performance impact
§ Automatic storage scaling up to 64 TB—no performance impact
§ Automatic restriping, mirror repair, hot spot management, encryption
Up to 64 TB of storage – auto-incremented in 10 GB units
51. Simplify data security
✓ Encryption to secure data at rest
– AES-256; hardware accelerated
– All blocks on disk and in Amazon S3 are encrypted
– Key management via AWS KMS
✓ SSL to secure data in transit
✓ Network isolation via Amazon VPC by default
✓ No direct access to nodes
✓ Supports industry-standard security and data protection certifications
[Diagram: Application over SQL, Transactions, Caching, and Storage, backed by Amazon S3; one element is marked “coming soon”]
52. Simplify monitoring with AWS console
Amazon CloudWatch metrics for Amazon RDS
• CPU utilization
• Storage
• Memory
• Swap usage
• DB connections
• I/O (read and write)
• Latency (read and write)
• Throughput (read and write)
• Replica lag
• Many more
Amazon CloudWatch Alarms
• Similar to on-premises custom monitoring tools
53. Advanced monitoring
50+ system/OS metrics | sorted process list view | 1–60 sec granularity
alarms on specific metrics | egress to Amazon CloudWatch Logs | integration with third-party tools
54. Well established ecosystem
Business Intelligence Data Integration Query and Monitoring SI and Consulting
Source: Amazon
“We ran our compatibility test suites against Amazon Aurora and everything just worked.” – Dan Jewett, Vice President of Product Management at Tableau
57. ElastiCache: Fully Managed Cache Service
• Easy to Deploy: Deploy master-slave(s) configuration with a few button clicks or API calls
• Easy to Migrate: Compatible with memcached or Redis. Existing code will work when you update node endpoints
• Easy to Administer: ElastiCache automatically replaces failed nodes and patches software as needed. CloudWatch enables you to monitor cache performance metrics
• Easy to Secure: Supports VPC and Security Group configurations
• Easy to Scale: Provides assisted scale-up and scale-out capability
59. Why in-memory?
• Everything is connected - phones, tablets, cars,
air conditioners, toasters
• Demand for real-time performance – online
games, ad tech, eCommerce, social apps, etc.
• Load is spiky and unpredictable
• Database performance often the bottleneck
60. [Diagram: Application Server, Hot Items]
Small, frequently accessed items are ideal candidates for read caching
• Reduce server-side latency to <1ms
• Eliminate “hot spot” performance barriers
• Offload heavy read activity from database
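Read caching of hot items is commonly done with a cache-aside pattern: check the cache first, fall back to the database on a miss, and store the result with a time-to-live. The sketch below uses a dict in place of cache nodes and an injected loader in place of a database query; the class name, TTL value, and loader are assumptions for illustration.

```python
import time

class HotItemCache:
    """Cache-aside reads with a time-to-live; a dict stands in for the cache nodes."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db    # called only on a miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (expires_at, value)
        self.db_reads = 0            # how many reads reached the database

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                      # hit: no database work
        value = self._load(key)                  # miss or expired: read the database
        self.db_reads += 1
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after writing the item so the next read reloads it.
        self._entries.pop(key, None)
```

With a hot key, every read after the first within the TTL is served from memory, which is how the pattern takes repeated reads off the database.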
61. Redis HA on ElastiCache
[Diagram: a node group spanning Availability Zone #1 and Availability Zone #2, with asynchronous replication between nodes]
• Writes use the “Primary Endpoint” from the Node Group
• Reads use ‘replica’ endpoints from the Node Group (*can use ‘primary’ also)
• Auto-Failover
§ Goes to replica with lowest replication lag
§ No changes in DNS
64. Amazon DynamoDB
• Managed NoSQL database service
• Highly scalable
• Consistent, single-digit millisecond latency at any scale
• Highly durable and available, with 3x replication
• Accessible via simple and powerful APIs
• Supports both document and key-value data models
• No table size or throughput limits
66. Automatic replication for rock-solid durability and availability
Writes
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
Reads
• Strongly or eventually consistent
• No latency trade-off
67. Table
• A table holds items; items hold attributes
• Hash Key (mandatory): key-value access pattern; determines data distribution
• Range Key (optional): models 1:N relationships; enables rich query capabilities
Queries over all items for a hash key:
• ==, <, >, >=, <=
• “begins with”
• “between”
• sorted results
• counts
• top/bottom N values
• paged responses
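The hash key plus range key model can be illustrated with a toy in-memory table: the hash key picks a group of items, and the range key keeps that group sorted so conditions such as “begins with” and “between” apply to it. This is a sketch of the data model only, not the DynamoDB API; `ToyTable` and its methods are hypothetical.

```python
import bisect
from collections import defaultdict

class ToyTable:
    """In-memory model: items grouped by hash key, kept sorted by range key."""

    def __init__(self):
        self._partitions = defaultdict(list)   # hash_key -> sorted [(range_key, item)]

    def put(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        keys = [rk for rk, _ in rows]
        i = bisect.bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)        # same key pair: overwrite
        else:
            rows.insert(i, (range_key, item))  # keep range keys sorted

    def query(self, hash_key, condition=lambda rk: True, limit=None, descending=False):
        """Return items for one hash key, filtered by a range-key condition."""
        rows = self._partitions.get(hash_key, [])
        matches = [item for rk, item in rows if condition(rk)]
        if descending:
            matches.reverse()
        return matches if limit is None else matches[:limit]

def begins_with(prefix):
    return lambda rk: rk.startswith(prefix)

def between(low, high):
    return lambda rk: low <= rk <= high
```

A “top/bottom N” query is then `limit=N` with `descending` set or unset, and a count is the length of the result.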
68. Data types
• String (S)
• Number (N)
• Binary (B)
• String Set (SS)
• Number Set (NS)
• Binary Set (BS)
• Boolean (BOOL)
• Null (NULL)
• List (L)
• Map (M)
Used for storing nested JSON documents
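To show how these type codes combine for a nested document, the sketch below wraps Python values in a type-tagged form keyed by the codes listed above (S, N, BOOL, NULL, L, M, and the set types). It reflects my understanding of the low-level attribute-value layout and simplifies binary handling (bytes are passed through unchanged); `to_attribute_value` is a hypothetical helper, not an SDK function.

```python
def to_attribute_value(value):
    """Wrap a Python value in a type-tagged dict using the type codes above."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):              # check bool before int: bool subclasses int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}             # numbers are carried as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}                  # simplification: no base64 encoding here
    if isinstance(value, (list, tuple)):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"unsupported value: {value!r}")
```

A nested JSON document maps onto a Map (M) whose values are themselves tagged, with Lists (L) and further Maps nesting to any depth the document needs.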
72. Durable Low Latency – At Scale
Consistent low latency whether scaling up/down or operating at your provisioned limits
73. Popular use cases
• Ad Tech: Ad serving, retargeting, ID lookup, user profile management, session-tracking, RTB
• IoT: Tracking state, metadata and readings from millions of devices, real-time notifications
• Gaming: Recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: Storing user profiles, session details, personalization settings, entity specific metadata
74. Fast Development
Customer Experiences
• Weatherbug mobile app: Lightning detection & alerting for 40M users/month. Developed and tested in weeks, at “1/20th of the cost of the traditional DB approach”
• Super Bowl promotion: Millions of interactions over a relatively short period of time. Built the app in 3 days, from design to production-ready
77. Databases on EC2
• Any database that runs on Windows or Linux!
• Many AMIs available from technology partners
– Oracle Database, MS SQL Server, MongoDB, Vertica, …
• White papers available on best practices
– Oracle Database, MS SQL Server, MongoDB, Cassandra, …
• Why?
– No managed service
– Full control
– Exceed limits of managed service, e.g. > 6TB of storage on RDS
79. Disclaimer
• This session must not be used as guidance for
licensing purchases or compliance; it is merely
informational and non-binding. All licensing decisions
must be agreed with Microsoft and Oracle.
• You must review your Microsoft PUR and Oracle license
agreement to understand your specific usage rights.
Your Microsoft PUR and Oracle license agreement may
be customized and therefore different than the
information in this presentation.
80. Licensing Terms
• BYOL – Bring your own license based on license
portability rules of your vendor
• LI – License included, AWS provides the license as part
of the hourly instance fee
• Dedicated Instances – AWS instances where the
underlying physical hardware is not shared
• Virtual Cores – Directly mapped to physical CPU cores
• vCPUs – Hyper-threaded virtual cores
82. SQL Server Support on AWS
• Microsoft workloads are supported on AWS
• Our customers have successfully deployed in the AWS cloud
virtually every Microsoft application available, including Microsoft
Exchange, SharePoint, Lync, Dynamics, and Remote Desktop
Services
• If you have support related issues you should contact AWS Support
• If you have an existing Microsoft support agreement you can contact
Microsoft Support
• Support for Microsoft workloads on AWS can be a collaborative
effort between you, AWS Support, and Microsoft Support.
83. SQL Server License Mobility on AWS
You are responsible for obtaining the licenses required for eligible Microsoft
applications running in the AWS cloud using the License Mobility through Software
Assurance benefit, and for complying with all applicable Microsoft licensing
requirements. Under the PUR, the number of licenses required varies based on the
instance type, version of SQL Server, and the Microsoft licensing model you
choose.
For “Licensing by Individual Virtual OSE” of Microsoft SQL Server 2014 (and
permitted instances of Microsoft SQL Server 2012), the July 2014 version of the
PUR states, “The number of licenses required equals the number of Virtual Cores in
each Virtual OSE in which you will run the server software, subject to a minimum of
four licenses per Virtual OSE.” The July 2014 version of the PUR defines a “Virtual
Core” as “the unit of processing power in a virtual hardware system. A Virtual Core
is the virtual representation of one or more hardware threads.”
http://aws.amazon.com/windows/resources/licensemobility/sql/
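The quoted PUR sentence reduces to a one-line calculation: one license per Virtual Core, with a floor of four per Virtual OSE. The sketch below only illustrates that sentence as quoted; the function name is hypothetical, and actual requirements depend on your own agreement.

```python
MIN_LICENSES_PER_OSE = 4   # "subject to a minimum of four licenses per Virtual OSE"

def sql_server_core_licenses(virtual_cores_per_ose):
    """Total core licenses for a list of Virtual OSEs, each given by its Virtual Core count."""
    return sum(max(MIN_LICENSES_PER_OSE, cores) for cores in virtual_cores_per_ose)
```

For example, a 2-core OSE still needs four licenses, so one 2-core OSE plus one 16-core OSE comes to 4 + 16 = 20 under this reading.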
84. SQL Server Licensing Cloud vs On-Prem
• SQL Server is twice as expensive on both AWS and
Azure for a single server with the same number of cores
• It can be four times as expensive if a passive mirror is
included
• These are standard Microsoft terms under the PUR
• Counteract by:
• Optimizing licenses to use SE or other editions instead of EE
• Reduce vCPUs to right size the instance (new hardware)
• Add a caching tier, move components to NoSQL or migrate to
MySQL/PostgreSQL
87. Oracle Support on AWS
• All Oracle Technology products (Database, Fusion
Middleware and others) are supported on EC2
• No Oracle Applications (E-Business Suite, Siebel, PeopleSoft,
etc.) are supported on AWS, but run without problems
• Oracle has not refused support calls
• Oracle reserves the right to ask the customer to reproduce a
problem on a certified environment
• AWS will provide a certified environment at no cost to the
customer if it looks like a virtualization problem
• AWS has never had a virtualization problem associated with
Oracle software
88. Oracle License Portability to AWS
All Oracle Software licenses are fully portable to Amazon Web Services EC2
• Enterprise License Agreement (ELA)
• Unlimited License Agreement (ULA)*
• Business Process Outsourcing (BPO)
• Oracle Partner Network (OPN)
Processor & Socket Licensing:
• Standard Edition Licenses
• 0.25 core multiplier = 1 license for 4 virtual cores (8 vCPUs) on EC2
• Enterprise Edition Licenses
• 0.5 core multiplier = 1 license for 2 virtual cores (4 vCPUs) on EC2
• Standard named user plus licensing applies, including counting
the minimums where applicable
Oracle Cloud Licensing Policy
http://www.oracle.com/us/corporate/pricing/cloud-licensing-070579.pdf
AWS Virtual Core Table
http://aws.amazon.com/ec2/virtualcores/
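The core multipliers above can be turned into a small worked calculation. This sketch uses only the figures on the slide (0.25 for Standard Edition, 0.5 for Enterprise Edition, two vCPUs per virtual core); rounding fractional results up to a whole license is my assumption, and the function name is hypothetical.

```python
import math

CORE_MULTIPLIER = {"standard": 0.25, "enterprise": 0.5}   # multipliers from the slide
VCPUS_PER_VIRTUAL_CORE = 2                                # per the slide's "4 virtual cores (8 vCPUs)"

def oracle_processor_licenses(vcpus, edition):
    """Processor licenses for an instance with the given vCPU count."""
    virtual_cores = vcpus / VCPUS_PER_VIRTUAL_CORE
    # Assumption: a fractional result rounds up to a whole license.
    return math.ceil(virtual_cores * CORE_MULTIPLIER[edition])
```

The slide's own examples fall out directly: 8 vCPUs on Standard Edition and 4 vCPUs on Enterprise Edition each come to one license, while 32 vCPUs on Enterprise Edition comes to 16 × 0.5 = 8.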
89. Old-World Vendors and Old-World Policies…
• Very Expensive
• Proprietary
• Lock-In
• Punitive Licensing
• “You’ve Got Mail!” AUDIT
Unshackle From Hostile Database Vendors
90. Freedom Begins with Choice; Migrating Data and Schema
AWS Schema Conversion Tool
• Automatically move tables, views, stored procedures, metadata
• Highlights and recommends custom actions as needed
AWS Database Migration Service
• Start a migration in literally a few minutes
• Keep apps running during the migration
• Replicate from, within, or to Amazon EC2 or managed database services or on-premises
AWS Workload Qualification Framework
• Assess workloads by complexity, technology, effort, and other factors
• Recommends strategy and plans for migration
[Chart: scale from 0 to 5]