The beautiful thing about software engineering is that it gives you the warm and fuzzy illusion of total understanding: I control this machine because I know how it operates. This is the result of layers upon layers of successful abstractions, which hide immense sophistication and complexity. As with any abstraction, though, these sometimes leak, and that's when a good grounding in what's under the hood pays off.
This talk, the first in what will hopefully be a series, covers the fundamentals of storage, providing an overview of the three storage tiers commonly found on modern platforms (hard drives, RAM and CPU cache). You'll come away knowing a little bit about a lot of different moving parts under the hood; after all, isn't understanding how the machine operates what this is all about?
-- A talk given at GeeCON Kraków 2016.
5. Axioms
• Not a trick question
– Servers are properly configured
– System architecture makes sense
– No obvious bugs
– No scheduled jobs
• So what else goes bump in the night?
13. Everybody knows…
“Disk seeks are a huge performance bottleneck… When the amount of data starts to grow so large that effective caching becomes impossible… you need at least one disk seek to read and a couple of disk seeks to write things.”
-- MySQL Reference Manual (8.12.3)
22. Interlude: Math
• Rotation is fixed
– Constant angular velocity (CAV)
• Newton tells us that…
v = ω ∙ r
• Throughput increases with radius!
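To put numbers on v = ω ∙ r, here is a minimal sketch. The 7200 RPM figure is the common drive speed quoted in this talk; the inner and outer track radii are illustrative assumptions for a 3.5-inch platter, not measurements of any real drive.

```python
import math

def linear_velocity(rpm: float, radius_m: float) -> float:
    """Linear velocity v = omega * r, in metres per second."""
    omega = rpm / 60 * 2 * math.pi  # angular velocity in rad/s
    return omega * radius_m

RPM = 7200
INNER_RADIUS_M = 0.020  # assumed innermost track radius (hypothetical)
OUTER_RADIUS_M = 0.045  # assumed outermost track radius (hypothetical)

v_inner = linear_velocity(RPM, INNER_RADIUS_M)
v_outer = linear_velocity(RPM, OUTER_RADIUS_M)
ratio = v_outer / v_inner

print(f"inner: {v_inner:.1f} m/s, outer: {v_outer:.1f} m/s, ratio: {ratio:.2f}x")
```

Under these assumed radii the outer track passes under the head 2.25 times faster than the inner one. If the bits are packed at roughly the same density along every track, that is also roughly the difference in sequential throughput.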
23. Interlude: Math
• Commodity drives are available at:
– 5400-15000 RPM
– Usually 7200 RPM
• What does it mean for latency?
7200 / 60 = 120 revolutions/second
1 / 120 ≈ 0.00833 s ≈ 8.33 ms per revolution!
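The same arithmetic can be sketched for the whole range of commodity spindle speeds. The function name is made up for illustration. The "average wait" column assumes the requested sector is equally likely to be anywhere on the track, so on average the head waits half a revolution.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000 / rpm

for rpm in (5400, 7200, 10_000, 15_000):
    full = rotation_time_ms(rpm)
    # Average rotational latency is half a revolution under the
    # uniform-position assumption stated above.
    print(f"{rpm:>6} RPM: full rotation {full:.2f} ms, average wait {full / 2:.2f} ms")
```

This covers rotational delay only; the time needed to move the head to the right track comes on top of it.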
24. In practice?
• Modern drives give you:
– 200+ MB/s
– 300 IOPS
• Pure random access nets only 1.2 MB/s!
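One way to arrive at the 1.2 MB/s figure is to multiply the IOPS budget by the size of each operation. The slide does not state the I/O size; the 4 KB used below is an assumption that happens to reproduce the number.

```python
def random_throughput_mb_s(iops: int, io_size_bytes: int) -> float:
    """Throughput in MB/s when every operation pays for a full random access."""
    return iops * io_size_bytes / 1_000_000

IOPS = 300                 # from the slide
IO_SIZE_BYTES = 4096       # assumed size of each random read (not on the slide)
SEQUENTIAL_MB_S = 200      # from the slide

random_mb_s = random_throughput_mb_s(IOPS, IO_SIZE_BYTES)
slowdown = SEQUENTIAL_MB_S / random_mb_s

print(f"random: {random_mb_s:.2f} MB/s, about {slowdown:.0f}x slower than sequential")
```

With these inputs the same drive is more than two orders of magnitude slower on purely random access than on sequential reads.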
26. Fine-tuning
• Provision more RAM
• Careful index structure
– Represent IPs as UNSIGNED INT for 75% reduction
– Implement better UUIDs¹ for 30% reduction
¹ Store UUID in an optimized way, Percona blog
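The IP address trick can be sketched with Python's standard `ipaddress` module. The helper names and the sample address are made up for illustration. The size comparison is one plausible reading of the 75% figure: it assumes the textual form is stored in up to 15 characters plus a one-byte length prefix, against 4 bytes for an unsigned integer.

```python
import ipaddress

def ip_to_int(ip: str) -> int:
    """Pack a dotted-quad IPv4 address into a 32-bit unsigned integer."""
    return int(ipaddress.IPv4Address(ip))

def int_to_ip(value: int) -> str:
    """Unpack a 32-bit unsigned integer back into dotted-quad form."""
    return str(ipaddress.IPv4Address(value))

as_int = ip_to_int("192.168.1.10")
round_trip = int_to_ip(as_int)

TEXT_BYTES = 15 + 1  # assumed: longest dotted quad plus a length byte
INT_BYTES = 4        # a 32-bit unsigned integer
reduction = 1 - INT_BYTES / TEXT_BYTES

print(as_int, round_trip, f"{reduction:.0%}")
```

MySQL offers the same conversion in SQL through its built-in `INET_ATON()` and `INET_NTOA()` functions, so the packing does not have to happen in application code.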
27. … or use a sledgehammer!
• RAID 0 (and variants) employ striping
• Data is distributed to multiple spindles
• If it sounds familiar…
– It is!
– We call it “sharding”
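The resemblance between striping and sharding shows up in the address arithmetic. Below is a minimal sketch of a round-robin stripe mapping; the function name and parameters are hypothetical, and real RAID controllers add details (chunk sizes, metadata, parity in other RAID levels) that are left out here.

```python
def locate(logical_block: int, num_disks: int, stripe_blocks: int = 1) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    # Which stripe unit the block falls in, and its position inside that unit.
    stripe_index, within = divmod(logical_block, stripe_blocks)
    # Stripe units are dealt out to the disks round-robin.
    disk = stripe_index % num_disks
    offset = (stripe_index // num_disks) * stripe_blocks + within
    return disk, offset

layout = [locate(block, num_disks=3) for block in range(6)]
print(layout)
```

Replace "disk" with "database node" and "logical block" with "hashed key" and the modulo step is the core of a simple sharding scheme, with the same consequence: consecutive requests land on different devices and can be served in parallel.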
28. It’s turtles all the way down
• Don’t jump to conclusions!
– RAID 0 is impractical
– RAID 5 may be slow
– RAID 10 is expensive
– etc.
• Do your homework
• Benchmark!
30. Let’s talk SSDs
• Non-volatile RAM
• Lots of IOPS
• Expensive :-)
• Same caveats apply…
31. Let’s talk SSDs
• Value starts at “1”
• Electrons accrue in the floating gate
• After programming, value becomes “0”
• Electrons are drained to reset value to “1”
32. Surprise and Terror
• “Draining” is destructive!
• Limited erases
• Limited lifespan!
37. Caveats, remember?
• Addressing
– Cells (1 bit) – not addressable
– Pages (0.5-8 KB)
– Blocks (32-64 pages)
• Why do you care?
– Reads/writes on a page
– But erasure on a block
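The page/block asymmetry is easier to see in a toy model. The class below is an illustration only, not how any real SSD firmware works: it models one erase block whose pages can be programmed individually but can only be reset together. The class name and the 32-page default are assumptions (the latter taken from the low end of the range on the slide).

```python
class FlashBlock:
    """Toy model of one erase block: program per page, erase per block."""

    def __init__(self, pages_per_block: int = 32):
        # None stands for an erased page (all bits "1").
        self.pages = [None] * pages_per_block
        self.erase_count = 0

    def read(self, page: int):
        return self.pages[page]

    def program(self, page: int, data: bytes) -> None:
        # A programmed page cannot be overwritten in place.
        if self.pages[page] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page] = data

    def erase(self) -> None:
        # Erasure wipes every page in the block and wears the block out a little.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1

block = FlashBlock()
block.program(0, b"v1")
block.program(1, b"other")

try:
    block.program(0, b"v2")
    overwritten = True
except ValueError:
    overwritten = False

block.erase()            # the only way to make page 0 writable again
block.program(0, b"v2")

print(overwritten, block.read(0), block.read(1), block.erase_count)
```

Updating page 0 cost an erase of the whole block and wiped page 1 along with it. In this model, any data that is still needed has to be copied elsewhere before the erase.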
41. Surprising Results
• What happens when you delete a file?
– Not much
– Bit flip on file table
– Space is not reclaimed
• Result?
– SATA TRIM command
[Diagram: numbered pages laid out across Block A, Block B, Block C and Block D]
42. SSD Takeaways
• A moving target
– File systems
– Data structures
– Longevity
• As usual:
– Benchmark
– Monitor