Learn tips and techniques that will improve the performance of your applications and databases running on Amazon EC2 instance storage and Amazon Elastic Block Store (EBS). This advanced session discusses when to use HI1, HS1, and Amazon EBS, shares an "under the hood" view of tuning Amazon EBS performance, and covers best practices for running workloads on Amazon EBS, such as relational databases (MySQL, Oracle, SQL Server, PostgreSQL) and NoSQL data stores, such as MongoDB and Riak.
2. What We’ll Cover
- Maximizing EC2 and Elastic Block Store Performance – Best Practices
- As Measured by…
- Configuration Options
- Deployment Patterns
- Tips and Best Practices
8. What is Amazon EBS?
• Very flexible service with lots of choice
– Used with Amazon EC2 instances
– Attach/detach/copy/delete volumes
– Point-in-time snapshots of volumes to Amazon S3
– Automatically replicated within its Availability Zone to protect from component failure
– Pay a low price for only what you provision
12. Amazon EBS Standard
• IOPS: ~100 IOPS steady state, with best-effort bursts to hundreds
• Throughput: varies by workload; best effort to tens of MB/s
• Latency: varies; reads typically <20 ms, writes typically <10 ms
• Capacity: as provisioned, up to 1 TB
13. Amazon EBS PIOPS
• IOPS: within 10% of the provisioned rate (up to 4,000 IOPS per volume), 99.9% of a given year
• Throughput: 16 KB per I/O, i.e. up to 64 MB/s, as provisioned
• Latency: low and consistent, at the recommended queue depth
• Capacity: as provisioned, up to 1 TB
14. EC2 Instance: Architecting for Performance
• IOPS consistency requires EBS-optimized instances
• Maximum throughput delivered by Amazon EBS is limited by Amazon EC2 bandwidth
• EBS throughput = EBS IOPS × block size
– Ex: 64 MB/s = 4,000 IOPS × 16 KB
Instance      vCPU  EBS-Optimized  Max MB/s  Max 16 KB IOPS
t1.micro        1   No               32       2,000
m1.small        1   No               64       4,000
m1.medium       1   No               64       4,000
m1.large        2   Yes              64       4,000
m1.xlarge       4   Yes             128       8,000
m3.xlarge       4   Yes              64       4,000
m3.2xlarge      8   Yes             128       8,000
c1.medium       2   No               32       2,000
c1.xlarge       8   Yes             128       8,000
cc2.8xlarge    32   N/A             800      50,000
m2.xlarge       2   No               64       4,000
m2.2xlarge      4   Yes              64       4,000
m2.4xlarge      8   Yes             128       8,000
cr1.8xlarge    32   N/A             800      50,000
hi1.4xlarge    16   N/A             800      50,000
cg1.4xlarge    16   N/A             800      50,000
Max 8 KB IOPS = 2×; max 4 KB = 4×*; max 2 KB = 8×*
*Maximum IOPS is also limited to ~100,000 per 32 vCPUs, irrespective of block size/throughput.
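The Max MB/s column is just the throughput formula from this slide applied at a 16 KB block size; a quick sanity check of the slide's own example (a calculation, not a measurement):

```shell
# throughput (MB/s) = IOPS x block size (KB) / 1000
awk 'BEGIN { iops = 4000; block_kb = 16; printf "%d MB/s\n", iops * block_kb / 1000 }'
```

Reversing it gives the IOPS a given instance cap supports: 128 MB/s at 16 KB is 8,000 IOPS, matching the table.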
15. EBS-Optimized
• EBS-optimized offers a “SAN-like” experience
• Network interference test results: no impact on IOPS or Amazon EBS throughput

m3.2xlarge (EBS-optimized)       Avg BW (KB/s)  Avg IOPS
No network load
  Random      read               57,542         3,596
              write              61,713         3,857
              rw (70/30)         66,997         4,186
  Sequential  read               61,708         3,856
              write              61,651         3,853
              rw (70/30)         66,996         4,187
With network load (test 1)
  Random      read               59,835         3,739
              write              63,407         3,962
              rw (70/30)         68,859         4,303
  Sequential  read               61,736         3,858
              write              63,360         3,959
              rw (70/30)         68,859         4,302
18. Review: Provisioned IOPS Volumes
❶ Select the Provisioned IOPS volume type
❷ Specify the volume capacity
❸ Specify the number of I/O operations per second your application needs, up to 4,000 IOPS per volume; the volume will deliver the specified rate
Minimum ratio of capacity to IOPS = 1:30 (i.e., up to 30 IOPS per GB provisioned)
$ aws ec2 create-volume --availability-zone us-east-1a --size 134 --volume-type io1 --iops 4000
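Working backward from that ratio, the smallest volume that can carry a target IOPS figure is a ceiling division; a sketch using the slide's 4,000-IOPS example:

```shell
# smallest capacity (GB) allowed at 30 IOPS per GB, rounded up
iops=4000
size_gb=$(( (iops + 29) / 30 ))
echo "$size_gb"
```

That yields 134 GB, which is why the create-volume call above passes --size 134.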
21. I/O Characteristics
• I/O size: 4 KB to 64 MB
• I/O pattern: sequential and random
• I/O type: read and write
• PIOPS always measures I/O in terms of 16 KB or smaller
• PIOPS delivers the same number of IOPS for sequential and random I/O
• PIOPS delivers the same number of IOPS for reads and writes
PIOPS is optimized for database workloads
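One way to read the 16 KB rule: I/Os of 16 KB or less each count as one IOP against the provisioned rate, and (consistent with the throughput = IOPS × 16 KB formula earlier in the deck) a larger request consumes proportionally more units. A back-of-envelope sketch, assuming that proportional accounting:

```shell
# provisioned-IOPS units consumed by a single request, metered in 16 KB chunks
io_kb=64
echo $(( (io_kb + 15) / 16 ))
```

A 64 KB request consumes 4 units, while a 4 KB request still consumes a full unit, so shrinking I/Os below 16 KB buys no extra headroom.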
32. Architecting for Performance: RAID
• Customers stripe a number of volumes to drive higher IOPS and throughput
– RAID 0 or RAID 10
• How should customers think about taking snapshots of a striped volume?
– Quiesce file systems and take the snapshot
– Unmount the file system and take the snapshot
– Use OS-specific tools

IO Pattern  Block Size  Threads  Write IOPS  Write BW (MB/s)  Read IOPS  Read BW (MB/s)
Sequential  4 KB        8        33,500      134              48,250     193
            16 KB       8        13,875      222              48,063     769
            1 MB        1        247         247              823        823
Random      4 KB        8        35,250      141              48,250     193
            16 KB       8        13,875      222              42,125     674
            1 MB        1        496         496              795        795

12 × 400 GB PIOPS volumes, pre-warmed, RAID 0 LVM, 64 KB stripe size, attached to a CR1 instance
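The test rig in the table was an LVM RAID 0 set; a minimal provisioning sketch of that layout (device names, volume-group name, and filesystem choice are hypothetical; it assumes the 12 PIOPS volumes are already attached, and it destroys any data on them):

```shell
# 12 attached PIOPS volumes -> one striped logical volume, 64 KB stripe size
pvcreate /dev/xvd[f-q]
vgcreate data_vg /dev/xvd[f-q]
lvcreate --stripes 12 --stripesize 64 --extents 100%FREE --name data_lv data_vg
mkfs.xfs /dev/data_vg/data_lv
```

Remember the snapshot caveat above: a snapshot of a single member of a stripe set is useless on its own, so quiesce or unmount before snapshotting all 12 volumes together.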
33. Performance – Extra-Large Production Scale
• Leverage SSD instance types (hi1.4xlarge)
– 2 × 1 TB SSD (ephemeral) storage
– Perfect for replicas
• If replicas run on SSD instance types, disable integrity features such as fsync and full_page_writes (PostgreSQL) on those hosts to improve performance
35. Performance / Stability Tips
• Ext4 or XFS (understand journal impact!)
• nobarrier, noatime, noexec, nodiratime
• Raise file descriptor limits
• Set read-aheads low
• AWS business-level support – Trusted Advisor
• Amazon CloudWatch metrics in general
• SNAPSHOT SNAPSHOT SNAPSHOT
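The mount flags above live in /etc/fstab; a fragment with a hypothetical device and mount point. Note that nobarrier trades crash safety for speed, which is one more reason for the snapshot mantra on this slide:

```
# /etc/fstab -- XFS data volume with the flags from this slide
/dev/xvdf  /data  xfs  noatime,nodiratime,nobarrier,noexec  0  0
```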
36. Pre-warming Amazon EBS Volumes
• Without pre-warming, expect a typical ~5% (extreme worst case 50%) reduction in IOPS and latency performance
– Performance is as provisioned once all the chunks have been accessed
• Recommendation, if testing or you have spare setup time:
– Write to every 4 MB block before using new volumes
• Linux: dd
• Windows: full NTFS format
– Takes roughly an hour to pre-warm a 1 TB 4K PIOPS volume
– Be warned: can take up to a day for a 1 TB standard Amazon EBS volume
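The Linux dd pass can be sketched as follows. To keep the snippet safe to run, a 64 MiB scratch file stands in for the raw device; on a real volume you would point of= at the device itself (e.g. a hypothetical /dev/xvdf), which overwrites all existing data:

```shell
# write zeros across every 4 MB block of the target (here: a scratch file)
dd if=/dev/zero of=/tmp/ebs_prewarm_demo.img bs=4M count=16 conv=notrunc 2>/dev/null
# confirm every block was touched: 16 x 4 MiB = 67108864 bytes
stat -c %s /tmp/ebs_prewarm_demo.img
```

For a volume restored from a snapshot, a read pass (dd if=/dev/xvdf of=/dev/null bs=4M) also touches every block, without destroying data.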
37. Architecting for Performance: Latency
• Performance requirements may be driven by IOPS, latency, or both
• Recommendation: start with a queue depth of 4 and tune based on IOPS and latency requirements
– Some customers need the lowest possible latency; this can be achieved at a queue depth of 1 or 2
• Very high queue depths (>24) may decrease the IOPS count as well as increase latency
38. Write Latency
• Database applications care about latency as much as IOPS delivered
• There is an interdependency among IOPS, queue depth, and latency
• Current guidance is a queue depth of 1 for every 200 IOPS, but if latency-bound and write-heavy, 1:500 to 1:1000 is better
QD             1     4     8     12    16    20    24    28    32
Avg IOPS       845   4152  4153  4177  4152  4176  4177  4177  4151
Avg TP90 (ms)  3.13  1.47  2.03  3.56  3.62  5.54  6.18  7.48  7.71
(16 KB random write, m3.2xlarge, EBS-optimized)
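The 1-per-200 rule above gives a starting queue depth straight from the provisioned rate (4,000 IOPS is the deck's running example; tune from there against measured latency):

```shell
# starting queue depth ~= provisioned IOPS / 200; write-heavy, latency-bound: /500 to /1000
piops=4000
echo $(( piops / 200 ))
```

That lands at QD 20, roughly where the write IOPS in the table have plateaued while TP90 latency is still in the single digits.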
39. Read Latency
• Reads can take advantage of a deeper queue
• Current guidance is a queue depth of 1 for every 250 IOPS
• EBS-optimized provides predictable latency
QD             1     4     8     12    16    20    24     28     32
Avg IOPS       1864  4153  4153  4177  4120  2800  1965   1213   1089
Avg TP90 (ms)  0.68  1.46  2.15  3.43  3.88  5.18  91.14  93.18  93.70
(16 KB random read, m3.2xlarge, EBS-optimized)
40. What About Performance Cost?
cc2.8xlarge with 24 × 4K PIOPS volumes vs. two hi1.4xlarge
• cc2.8xlarge + PIOPS: $11,773 on-demand; $10,589 effective with a 3-year reservation
• hi1.4xlarge pair: $4,538 on-demand; $1,539 effective with a 3-year reservation
• If >20K read IOPS, choose hi1
• If 3-year reserved and >8K IOPS, choose hi1
• If >10K write IOPS, TEST, but probably choose PIOPS
• On-demand, if <20K read IOPS, choose PIOPS
41. What About Capacity Cost?
cc2.8xlarge with 48 × 1 TB EBS volumes vs. two hs1.8xlarge
• cc2.8xlarge + EBS: $7,312 on-demand; $6,128 effective with a 3-year reservation
• hs1.8xlarge pair: $6,734 on-demand; $2,408 effective with a 3-year reservation
• If >43 TB, or >800 MB/s, choose hs1
• If 3-year reserved and >18 TB, choose hs1