SSDs, IMDGs and All the Rest - Jax London

  1. SSDs, IMDGs and All the Rest: A short intro to how SSDs are powering the data revolution. Uri Cohen, Head of Product @ GigaSpaces, @uri1803, #jaxlondon 2014
  2. The Data Processing Hierarchy
  3. But Data Amounts Just Keep Growing
  4. But We Have a Performance Gap
  5. In-Memory Computing to the Rescue? Not enough anymore…
     • Average GigaSpaces XAP cluster size grew 5-10 fold since 2008
     • We’re in the realm of terabytes, not gigabytes
  6. SSD to Save the Day! https://www.mimoco.com
  7. (It Actually Looks More Like This)
  8. Some Numbers

     | Level | Access time | Typical size |
     |---|---|---|
     | Registers | instantaneous | under 1 KB |
     | Level 1 Cache | 1-3 ns | 64 KB per core |
     | Level 2 Cache | 3-10 ns | 256 KB per core |
     | Level 3 Cache | 10-20 ns | 2-20 MB per chip |
     | Main Memory | 30-60 ns | 4-32 GB per system |
     | Hard Disk | 3,000,000-10,000,000 ns | over 1 TB |

  9. Some Numbers

     | Level | Random access time | Typical size |
     |---|---|---|
     | Registers | instantaneous | under 1 KB |
     | Level 1 Cache | 1-3 ns | 64 KB per core |
     | Level 2 Cache | 3-10 ns | 256 KB per core |
     | Level 3 Cache | 10-20 ns | 2-20 MB per chip |
     | Main Memory | 30-60 ns | 4-32 GB per system |
     | SSD | < 1,000,000 ns | 128 GB to 2 TB |
     | Hard Disk | 3,000,000-10,000,000 ns | over 1 TB |

  10. Performance Is All the Rage http://arstechnica.com/information-technology/2012/06/inside-the-ssd-revolution-how-solid-state-disks-really-work/
  11. Is It All Roses and Daisies?
  12. Step Back – How SSDs Work
  13. The Foundation - NAND Chips
  14. NAND Traits: Space-efficient (60% less than NOR) → Effectively only NAND is used commercially
  15. NAND Traits: Can only write and read whole pages, 4096 or 8192 bytes at a time → Modern FSs work this way anyway (but keep that in mind for later)
  16. NAND Traits: Limited life span (5K-10K write/erase cycles) → Need to evenly distribute load across all blocks
  17. NAND Traits: You cannot update a page “in place” → So why not delete it and write a new one instead?
  18. Duh, you can only delete whole blocks
  19. Typical Update Cycle
  20. Typical Update Cycle
      • Updating 4096 (or less) bytes of data can result in 2MB of data moving around on the SSD
      • It’s called Write Amplification
  21. Controllers to the Rescue
  22. Write Caching
  23. Garbage Collection (Grrrrrr….) Compacts fragmented disk blocks → but has a performance cost
      • Modern SSDs try to do this in the background...
      • When no empty blocks are available, GC must be done before ANY write can go through
  24. Striping
  25. Wear Leveling: A bag of techniques the controller uses to keep all of the flash cells at roughly the same level of use
  26. Dedupe & Compression
  27. Databases, Charge Ahead! http://cdn.pcworld.idg.com.au/article/images/740x500/dimg/larry-mario_500.jpg
  28. The Naive - MySQL (or PostgreSQL, Oracle, Mongo, …): Let’s just use it! (and write data in place FTW)
  29. The Naive - MySQL (or PostgreSQL, Oracle, Mongo, …)
      • They all perform buffering of writes before flushing to disk
      • ... but flushes are still RANDOM writes
  30. Source: Anandtech
  31. Source: Anandtech
  32. Cassandra: Already Optimized (But for what?)
  33. Cassandra Write Path http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives
  34. Cassandra Write Path http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives
  35. Cassandra Write Path http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives
  36. Cassandra Write Path http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives
  37. C* Observations (for SSDs)
      • All disk writes are sequential and append only
      • Compaction is applied when merging SSTables
      • SSTables are immutable once written → No write amplification
  38. But Still…
      • Read path is complex
      • Compaction can cause performance variations
  39. Why Do We Treat SSDs the Same as HDDs?
  40. Software Optimizations. Direct access:
      • No kernel space overhead
      • TRIM
      • Multithreading
      • Caching in DRAM
      • On Disk and DRAM Indexing
  41. Flash Optimized APIs
  42. How We Did It
  43. Raw Performance Numbers
      • RAM only: ~1M read txns/sec
      • RAM + SSD: 242K read txns/sec
  44. Looking at It from a Cost Perspective: Provides 2x – 3.6x Better TPS/$ While Reducing Servers by 50%
      • 1KB object size and uniform distribution
      • 2 sockets 2.8GHz CPU with total 24 cores, CentOS 5.8, 2 FusionIO SLC PCIe cards RAID
      • YCSB measurements performed by SanDisk
      • Assumptions: 1TB Flash = $2K; 1TB RAM = $20K
  45. Resources
      • http://arstechnica.com/information-technology/2012/06/inside-the-ssd-revolution-how-solid-state-disks-really-work/
      • http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives
      • http://www.sandisk.com/enterprise/zetascale/
      • http://www.gigaspaces.com/xap-memoryxtend-flash-performance-big-data
  46. Thank You!
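
The write amplification figure on slide 20 can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 2MB quoted on the slide is the size of one erase block and that the controller has to rewrite the whole block for a single update (no write cache, no spare pages). `write_amplification` and the constant names are made up for illustration.

```python
PAGE_SIZE = 4096               # bytes; smallest unit NAND can write (slide 15)
BLOCK_SIZE = 2 * 1024 * 1024   # bytes; ASSUMED erase-block size, taken from the 2MB on slide 20


def write_amplification(updated_bytes, block_size=BLOCK_SIZE):
    """Worst-case ratio of bytes moved on the device to bytes the host changed.

    Assumes the update lands in one erase block and that the whole block
    is copied and rewritten to apply it.
    """
    if not 0 < updated_bytes <= block_size:
        raise ValueError("update must be between 1 byte and one erase block")
    return block_size / updated_bytes


if __name__ == "__main__":
    # One 4096-byte page updated, one 2MB block rewritten.
    print(write_amplification(PAGE_SIZE))
```

Under these assumptions a one-page update moves 512 times more data than the host asked to change, which is why the controller techniques on slides 21-26 matter.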

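The append-only pattern described on slide 37 can be illustrated with a toy log-structured store. This is an in-memory model for intuition only, not Cassandra's implementation: `TinyLSM`, `flush_threshold` and the method names are hypothetical, and a real SSTable lives on disk with indexes and bloom filters.

```python
class TinyLSM:
    """Toy log-structured store: writes go to a memtable, flushes are append-only."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold
        self.memtable = {}    # mutable, in-memory buffer of recent writes
        self.sstables = []    # oldest first; each entry is an immutable sorted tuple

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # A flush only appends a new sorted table; existing tables are never modified.
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable, then every table from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:   # linear scan keeps the sketch short
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all tables into one; newer values overwrite older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []


if __name__ == "__main__":
    db = TinyLSM(flush_threshold=2)
    for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
        db.put(key, value)
    print(len(db.sstables), db.get("a"))
    db.compact()
    print(db.sstables)
```

The sketch also shows the trade-offs from slide 38: `get` may have to look through several tables, and `compact` rewrites data in bulk at a time the writer does not control.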
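For a rough feel of the TPS/$ comparison on slide 44, the throughput figures from slide 43 can be divided by the media prices listed under the slide's assumptions. This is a back-of-the-envelope sketch and not the calculation behind the 2x – 3.6x claim: it assumes a 1TB data set, counts only the price of the storage media, and ignores server cost and the DRAM cache in the RAM + SSD setup. All names are made up for illustration.

```python
RAM_ONLY_TPS = 1_000_000    # ~1M read txns/sec (slide 43)
RAM_SSD_TPS = 242_000       # 242K read txns/sec (slide 43)
RAM_COST_PER_TB = 20_000    # dollars (slide 44 assumption)
FLASH_COST_PER_TB = 2_000   # dollars (slide 44 assumption)


def tps_per_dollar(tps, cost_per_tb, dataset_tb=1.0):
    """Throughput per dollar of storage media for a data set of dataset_tb terabytes."""
    return tps / (cost_per_tb * dataset_tb)


if __name__ == "__main__":
    ram = tps_per_dollar(RAM_ONLY_TPS, RAM_COST_PER_TB)
    flash = tps_per_dollar(RAM_SSD_TPS, FLASH_COST_PER_TB)
    print(f"RAM only:  {ram:.0f} TPS/$")
    print(f"RAM + SSD: {flash:.0f} TPS/$")
    print(f"ratio:     {flash / ram:.2f}x")
```

Even though the RAM + SSD setup serves roughly a quarter of the reads per second, the tenfold lower price per terabyte leaves it ahead on throughput per dollar in this simplified model.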