Understanding Amazon EBS Availability and Performance
1. AWS Summit 2013
Navigating the Cloud
Understanding Amazon EBS Availability and Performance
Eric Anderson
CopperEgg
April 18, 2013
2. CopperEgg: EBS Use Case
• How CopperEgg uses EBS
• EBS vs Provisioned IOPS EBS
• EBS and RAID
• Backup/Snapshot best practices
• Filesystem selection and tuning
• Monitoring/Migrations/Planning
3. How CopperEgg uses EBS
• Real-time monitoring (every 5s)
– System information
– Processes
– Synthetic HTTP/TCP/etc
– Application metrics
– Tons more..
• Requirements:
– Store many terabytes of data
– Persist the data over long periods of time
– Backups (use snapshots)
– High IO: 50-60k+ ops/s per node
• SSD + Provisioned IOPS EBS
– Consistent IO behavior (non-spikey)
4. EBS vs Provisioned IOPS EBS
• Standard EBS
– Good for low IO volume
– Bursty workloads may be a good fit: do the math
• Provisioned IOPS EBS
– Great for steady IO patterns that need consistency
– Not always more expensive than standard!
– Be sure to use the IOPS you provision!
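The "do the math" point can be sketched as a back-of-envelope comparison of the I/O component of the bill. The prices below are assumed example rates for illustration only, not actual AWS pricing – substitute real numbers from the current pricing page:

```shell
# Back-of-envelope monthly cost of the I/O component only, with ASSUMED
# example prices -- substitute real rates from the AWS pricing page.
awk 'BEGIN {
  iops         = 2000              # sustained IOPS the workload needs
  secs_month   = 30 * 24 * 3600    # seconds in a 30-day month
  std_per_io   = 0.10 / 1000000    # assumed $ per I/O request (standard EBS)
  piops_per_mo = 0.10              # assumed $ per provisioned IOPS-month
  printf "standard: $%.2f/mo  provisioned: $%.2f/mo\n",
         iops * secs_month * std_per_io, iops * piops_per_mo
}'
```

With these assumed rates, a steady 2,000 IOPS workload costs more on standard EBS's per-request billing than on Provisioned IOPS – which is the slide's point that PIOPS is "not always more expensive."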
5. EBS and RAID
• Which RAID?
– Depends on your use case, but:
• We use stripes (RAID 0) for most things
– Good performance, we build our fault tolerance at a different level
• RAID 10 (stripe of mirrors)
– Good RAID 0 performance, with increased fault tolerance from the mirrors
– Twice the cost of RAID 0
• RAID 0+1 (mirror of stripes)
– Don’t do this – same performance, worse fault tolerance
• RAID 5 (stripe with parity)
– Risky: software RAID 5 with any write caching enabled can lose parity consistency on a crash (the RAID 5 "write hole")
– RAID 6 (dual parity) may be an option..
• Block size
– Use an appropriate stripe size for best results
• We use 64 KB – but test various configurations to find the best fit for your application
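Creating a striped array with a specific chunk size is one mdadm command. This is a sketch; the device names and volume count are placeholders for your own attached EBS volumes:

```shell
# Sketch: build a 4-volume RAID 0 with a 64 KB chunk (stripe) size.
# /dev/xvdf..xvdi and the volume count are placeholders, not a recommendation.
mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 \
      /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
```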
6. Backup/Snapshot best practices
• Snapshot regularly
– At least once per day, more if you can
– First snapshots take a while, subsequent are faster
– Schedule for when your IO load is lowest to reduce impact
• We do it at around 9pm CST
• Use consistent naming for snapshots
– {hostname}-{raid device}-{device}-{timestamp}
• Use the API for creation
– Faster kickoff, more likely to be consistent (script it!)
– ec2-create-snapshot -d "{hostname}-{raid device}-{device}-{timestamp}" vol-d726382
• Move older snapshots to S3/Glacier for long-term storage
• RAID makes this a bit more complex:
– Make sure you unmount/snapshot/remount your file system, or use fsfreeze, to keep consistent snapshots!
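Scripted, the freeze/snapshot flow above might look like this sketch. The mount point, RAID device name, and volume IDs are placeholders you must fill in for your own array:

```shell
#!/bin/sh
# Sketch of a consistent-snapshot script for a RAID-backed file system.
# MOUNT, RAID_DEV, and VOLUMES are placeholders for your own setup.
MOUNT=/data
RAID_DEV=md0
VOLUMES="vol-aaaaaaaa vol-bbbbbbbb"   # hypothetical EBS volume IDs
STAMP=$(date +%Y%m%d-%H%M%S)

fsfreeze -f "$MOUNT"                  # quiesce writes for a consistent image
for vol in $VOLUMES; do
  ec2-create-snapshot -d "$(hostname)-$RAID_DEV-$vol-$STAMP" "$vol"
done
fsfreeze -u "$MOUNT"                  # thaw once snapshots are initiated
```

The thaw happens as soon as the snapshots are kicked off – the point-in-time image is captured at initiation, so the file system does not need to stay frozen while the snapshot data uploads.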
7. Choosing a good file system
• We like ext3/4, but we love XFS
– High performance, consistent
– Robust and lots of options for tweaking/adjusting as needed
• Our favorite mount options: (your mileage may vary)
– inode64, noatime, nodiratime, attr2, nobarrier, logbufs=8, logbsize=256k, osyncisdsync, nobootwait, noauto
– Yields great performance, reduces unnecessary writes, stable
• We like ZFS a lot too, but we want to see more runtime on Linux first
– But FreeBSD/ZFS would be a fine choice
• However: test your workload!
– File systems behave differently under different workloads
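Putting the XFS choice and the mount options together might look like this sketch. The device and mount point are placeholders, and `nobootwait`/`noauto` are omitted because they are fstab-level options, not runtime mount options:

```shell
# Sketch: make an XFS file system on the array and mount it with the
# runtime options from the slide.  /dev/md0 and /data are placeholders.
mkfs.xfs /dev/md0
mount -t xfs \
      -o inode64,noatime,nodiratime,attr2,nobarrier,logbufs=8,logbsize=256k \
      /dev/md0 /data
```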
8. EBS/File system performance tuning
• Tuning file systems:
– Set the scheduler to use 'deadline' (for each disk in RAID array/EBS):
• [as root] echo deadline > /sys/block/[disk device]/queue/scheduler
– Adjust how aggressively the cache is written to disk. Tune these back if you are
bursty in write IO:
• vm.dirty_ratio=30
• vm.dirty_background_ratio=20
• Track what you change!
– Before changing anything, monitor it
– After you make the change, monitor it
– Then: KEEP monitoring it – things can change over time in unexpected ways
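Applied to a RAID array, the tuning above has to hit every member disk, not just the md device. This sketch discovers the members via `/sys/block/md0/slaves` (an assumption about your array name) and must run as root:

```shell
# Sketch: set the deadline scheduler on every member of the md0 array,
# then apply the dirty-cache sysctls.  md0 is a placeholder array name.
for dev in /sys/block/md0/slaves/*; do
  echo deadline > "/sys/block/$(basename "$dev")/queue/scheduler"
done
sysctl -w vm.dirty_ratio=30
sysctl -w vm.dirty_background_ratio=20
# add the vm.dirty_* lines to /etc/sysctl.conf to persist across reboots
```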
9. Monitoring
• Observing:
– iostat -xcd -t 1
• Watch the sum of r/s and w/s – this is your IOPS metric. For PIOPS, you want it close to the provisioned amount. We monitor this using CopperEgg custom metrics, and alert if it goes low or high.
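One way to pull that r/s + w/s sum out of iostat is a small awk filter. The field positions ($4 and $5 for r/s and w/s) are an assumption – they vary across sysstat versions, so check them against your own iostat header line first:

```shell
# Sum r/s + w/s per device from extended iostat output.
# Field positions ($4, $5) are an assumption -- verify against your
# iostat -x header, which differs between sysstat versions.
iostat -xd 1 | awk '$1 ~ /^(xvd|md)/ { printf "%s %.0f IOPS\n", $1, $4 + $5 }'
```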
– grep -A 1 dirty /proc/vmstat
• If nr_dirty approaches nr_dirty_threshold, you need to tune down vm.dirty to flush writes more often.
• Reference: http://docs.neo4j.org/chunked/stable/linux-performance-guide.html
• Useful stats to capture:
– In /proc/fs/xfs/stat
• xs_trans* -> transactions
• xs_read/write* -> read/write operations stats
• xb_* -> buffer stats
• Ignore SMART - does not work for EBS
• Watch the console log
– Use the AWS API to look for warning signs of EBS issues
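A minimal sketch of that console-log check, using the EC2 API tools the deck uses elsewhere. The instance ID is a placeholder, and the grep patterns are assumptions – common kernel block-I/O complaints, not an exhaustive list of EBS failure signatures:

```shell
# Sketch: scan the instance console log for block-device warning signs.
# $INSTANCE_ID is a placeholder; the patterns are assumed examples of
# kernel I/O distress, not a definitive EBS failure signature list.
ec2-get-console-output "$INSTANCE_ID" \
  | grep -i -E 'i/o error|blocked for more than|end_request'
```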
10. Migrations and Capacity Planning
• Using PIOPS?
– Plan on a data migration path if you need to increase PIOPS
• You can't (yet) increase IOPS on the fly
• Migration steps from an EBS backed RAID:
1. Snapshot 1hr before, then again, and again – each time it takes less time
2. Stop all services
3. Unmount the filesystem
4. Stop the RAID (mdadm --stop /dev/md0)
5. Take final snapshot
6. Create new volumes based on last snapshot
7. Attach the new volumes – mdadm should detect the array metadata and assemble it (or run mdadm --assemble --scan)
8. Mount the filesystem
9. Restart services
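The steps above can be sketched as a shell sequence. Service name, mount point, and devices are placeholders, and step 6 (creating and attaching new volumes from the final snapshot) happens through the EC2 API or console, so it is only marked here:

```shell
# Sketch of migration steps 2-9; myservice, /data, and /dev/md0 are
# placeholders.  Snapshots and new-volume creation (steps 1, 5, 6)
# happen via the EC2 API and are elided.
service myservice stop            # 2. stop all services
umount /data                      # 3. unmount the filesystem
mdadm --stop /dev/md0             # 4. stop the RAID
# ... take final snapshot, create and attach new PIOPS volumes ...
mdadm --assemble --scan           # 7. mdadm detects the array metadata
mount /dev/md0 /data              # 8. mount the filesystem
service myservice start           # 9. restart services
```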