©2014 DataStax Confidential. Do not distribute without consent.
@AlTobey
Open Source Mechanic @ Datastax
Designing Commodity Storage
What is commodity storage?
•software-defined storage
•e.g. Cassandra, S3, GCE Persistent Disks
•Intel/AMD x86_64 architecture
Open Standards:
•PCI-Express
•Near-line SAS, Enterprise SATA, SATA SSD
•1G/10G Ethernet
Definitely NOT this
Designed to solve
different problems
from a different era.
Not this either
Besides SSDs, most “desktop”
gear is to be avoided for
production deployment.
Enterprise
Rack & Stack
•Blades & 1U for high CPU with low storage density
•2U for plenty of CPU & storage & air flow
•3U-4U for high-latency / high-density storage
•“racks” don’t have to be literal
•blade chassis
•separate network/power is key
Vendors
Choosing Server Components
•CPU
•Memory
•Motherboards
•Host Bus Adapters
•Hard Drives
•Network Interface Cards
CPU Pricing
[Bar chart: price in dollars (axis 0 to 2200) for the Xeon E5-2620, E5-2630, E5-2650, E5-2670, E5-2687W, and E5-2690. Bar annotations as extracted: 6 cores 2.6 GHz 80 W; 6 cores 2.1 GHz 80 W; 8 cores 2.6 GHz 95 W; 10 cores 2.5 GHz 115 W (3.3 GHz turbo); 8 cores 3.4 GHz 150 W; 8 cores 2.9 GHz 135 W (3.8 GHz turbo). L3 cache labels: 15 MB, 15 MB, 20 MB, 20 MB, 25 MB, 25 MB.]
Processors
Source: http://en.wikipedia.org/wiki/Sandy_Bridge-E
Memory
•always get ECC!
•~5 single bit errors in 8 GB RAM per hour (top-end error rate)
•unexplainable crashes
•data corruption
•8GB DIMMs are still the sweet spot
•Registered Memory: match to your CPU/motherboard
•Pretty much all server memory is ECC and Registered
•Speed: match to fastest rating of CPU/motherboard
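The "~5 errors per hour" figure can be reproduced with a little arithmetic. The sketch below assumes an upper-bound rate of 70,000 errors per billion device-hours per Mbit; that input is not stated on the slide and is used here only as an illustrative top-end value.

```python
# Assumed top-end DRAM error rate (not from the slide): 70,000 errors
# per 10^9 device-hours per Mbit.
FIT_PER_MBIT = 70_000


def errors_per_hour(ram_gib: float, fit_per_mbit: float = FIT_PER_MBIT) -> float:
    """Expected single-bit errors per hour for a given amount of RAM."""
    mbit = ram_gib * 1024 * 8          # GiB -> MiB -> Mbit
    return mbit * fit_per_mbit / 1e9   # the rate is per billion hours


if __name__ == "__main__":
    for gib in (8, 64, 256):
        print(f"{gib:>4} GiB: ~{errors_per_hour(gib):.1f} errors/hour")
```

Under that assumption, 8 GiB works out to roughly 4.6 errors per hour, and the count scales linearly with installed RAM, which is why non-ECC memory gets riskier as nodes get bigger.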
Motherboards
•Largely out of your control
•Dell / HP / etc.: you’re choosing a server model, e.g. DL380
•Supermicro: be very careful when picking your VAR
•Features to watch for:
•Socket count (NUMA)
•IPMI
•onboard SAS or SATA port speed/count
•PCIe speed & layout
•RAM capacity
Storage Adapters
•Serial Attached SCSI
•Bit Error Rate: 1 in 10^16 bits, or 1 bit in 1,250 TB
•Supports SATA drives over STP
•Near-line SAS drives are SATA chassis with SAS boards
•Always use SAS if you need an expander
•Check out enclosure services in Linux
•Serial ATA
•Bit Error Rate: 1 in 10^15 bits, or 1 bit in 125 TB
•Avoid expanders
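The data volumes quoted for the SAS and SATA bit error rates follow from dividing the bit count by 8. A minimal sketch of that conversion, showing both decimal terabytes and binary tebibytes since the two are easy to confuse:

```python
TB = 10 ** 12    # decimal terabyte
TIB = 2 ** 40    # binary tebibyte


def bytes_per_bit_error(exponent: int) -> float:
    """Average bytes transferred per bit error at a rate of 1 in 10**exponent bits."""
    return 10 ** exponent / 8


if __name__ == "__main__":
    for name, exp in (("SAS", 16), ("SATA", 15)):
        b = bytes_per_bit_error(exp)
        print(f"{name}: 1 bit error per {b / TB:,.0f} TB ({b / TIB:,.1f} TiB)")
```

The round numbers (1,250 and 125) are decimal terabytes; in tebibytes the same volumes come out about 9% smaller.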
Storage Adapters
•JBOD
•cheap
•OS manages drives
•drivers usually shipped with OS
•CPU overhead is negligible
•HW RAID is sometimes faster, usually comes with cache
•writethrough vs. writeback
•writeback + BBU provides interesting performance options
•driver + utilities management
Parity RAID
RAID
•JBOD
•mount every drive with individual filesystems
•cheap
•RAID0
•single drive failure means node rebuild
•cheap
•RAID10
•fast, protects against single disk failure
•expensive
RAID
•RAID 5 / 6 (and beyond)
•parity data protection
•performance heavily dependent on implementation
•cheapest option for drive failure protection
•RAID 50 / 60
•stripe across multiple RAID[56] volumes
•mostly useful with large number of drives
•can provide decent performance esp. on HW RAID
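The "cheap" versus "expensive" labels on the RAID slides come down to how many drives' worth of capacity each layout gives up. A rough sketch, assuming equal-size drives; the function name and the 12 x 4 TB example are hypothetical and not from the talk:

```python
def usable_drives(level: str, n: int, groups: int = 1) -> int:
    """Drives' worth of usable capacity for n equal-size drives.

    `groups` is the number of parity sub-arrays striped together
    in RAID 50 / 60; it is ignored for the other levels.
    """
    if level in ("jbod", "raid0"):
        return n                  # no redundancy, nothing given up
    if level == "raid10":
        return n // 2             # every drive is mirrored
    if level == "raid5":
        return n - 1              # one drive of parity
    if level == "raid6":
        return n - 2              # two drives of parity
    if level == "raid50":
        return n - groups         # one parity drive per sub-array
    if level == "raid60":
        return n - 2 * groups     # two parity drives per sub-array
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    drives, size_tb = 12, 4       # hypothetical chassis
    for level, groups in (("jbod", 1), ("raid10", 1), ("raid5", 1),
                          ("raid6", 1), ("raid50", 2), ("raid60", 2)):
        tb = usable_drives(level, drives, groups) * size_tb
        print(f"{level:>6}: {tb} TB usable of {drives * size_tb} TB raw")
```

This only models capacity. It says nothing about rebuild time or write performance, which the slides note depend heavily on the implementation.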
Hard Drives
•SATA HDD
•there’s only one head carriage
•seeks kill
•decent performance on sequential IO
•bit errors
•cheap!
Hard Drives
•SAS HDD
•there’s only one head carriage
•seeks kill
•bit errors
•expensive!
•faster RPMs may help a little with seek latency
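Why faster spindles only help "a little" can be seen from the rotational part of a random read: on average the platter has to turn half a revolution before the data is under the head. The sketch below computes that and a rough random-IOPS ceiling; the average seek times are illustrative assumptions, not figures from the talk.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency in milliseconds (half a revolution)."""
    return 0.5 * 60_000 / rpm


def random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough upper bound on random IOPS for a single head carriage."""
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


if __name__ == "__main__":
    # Seek times are assumed values for illustration only.
    for label, rpm, seek_ms in (("7.2k SATA", 7200, 8.5), ("15k SAS", 15000, 3.5)):
        print(f"{label}: {avg_rotational_latency_ms(rpm):.2f} ms rotational, "
              f"~{random_iops(rpm, seek_ms):.0f} random IOPS")
```

Even with generous assumptions, a single spinning drive stays in the low hundreds of random IOPS, because there is only one head carriage to move.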
Hard Drives
•SATA SSD
•very low latency seeks
•slightly lower sequential IO throughput
•more expensive than SATA HDD
•vendors might not want to sell them to you!
•sometimes called “value series” or similar
•Cassandra runs fine on consumer-grade SSDs
•make sure your SATA/SAS bus and HBA are up to the task
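Whether the bus and HBA are "up to the task" can be checked with back-of-the-envelope numbers. The sketch below uses the standard line rates and encodings for SATA 6 Gb/s and PCIe 2.0/3.0; the drive count and the 500 MB/s per-SSD throughput are assumptions for illustration.

```python
# Usable bandwidth after line encoding, in decimal MB/s.
SATA3_MBPS = 6_000 * 8 / 10 / 8            # 6 Gb/s, 8b/10b encoding
PCIE2_LANE_MBPS = 5_000 * 8 / 10 / 8       # 5 GT/s per lane, 8b/10b encoding
PCIE3_LANE_MBPS = 8_000 * 128 / 130 / 8    # 8 GT/s per lane, 128b/130b encoding


def hba_headroom(drives: int, per_drive_mbps: float,
                 lanes: int, lane_mbps: float) -> float:
    """Ratio of HBA slot bandwidth to aggregate drive throughput.

    A value below 1.0 means the slot, not the drives, is the bottleneck.
    """
    return (lanes * lane_mbps) / (drives * per_drive_mbps)


if __name__ == "__main__":
    ssd_mbps = 500  # assumed sequential throughput per SATA SSD
    for drives in (4, 8, 16):
        ratio = hba_headroom(drives, ssd_mbps, lanes=8, lane_mbps=PCIE2_LANE_MBPS)
        print(f"{drives:>2} SSDs on a PCIe 2.0 x8 HBA: headroom {ratio:.2f}x")
```

With these assumptions, eight SSDs already saturate a PCIe 2.0 x8 slot, so adding more drives behind the same HBA buys capacity but not throughput.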
Hard Drives
•Enterprise SSD
•quite expensive
•vendor supported
•more reliable
•often faster as well
Hard Drives
•PCIe SSD
•e.g. FusionIO, ioSwitch
•highest performance potential
•not as expensive as you think
•lots of new products entering the market
•generally not hot-swappable
Networking
•you don’t need 10gig
•but it’s awesome
•Broadcom cards are common and commonly buggy
•Intel cards are expensive but a good bet
•Consider lesser-known add-in cards, e.g. Myricom
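One place where 10G stops being a luxury is bulk data movement, such as streaming a replacement node. A minimal estimate, assuming the link is the only bottleneck; the 2 TB data size and the 90% link efficiency are hypothetical values:

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 0.9) -> float:
    """Hours to move data_tb (decimal TB) over a link of link_gbps.

    `efficiency` is an assumed fraction of line rate left after
    protocol overhead.
    """
    bytes_per_sec = link_gbps * 1e9 / 8 * efficiency
    return data_tb * 1e12 / bytes_per_sec / 3600


if __name__ == "__main__":
    node_tb = 2  # hypothetical amount of data on one node
    for gbps in (1, 10):
        print(f"{gbps:>2} Gb/s: ~{transfer_hours(node_tb, gbps):.1f} hours for {node_tb} TB")
```

At line rate, 1 Gb/s is only 125 MB/s, which a single drive doing sequential IO can come close to filling on its own.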
To the Cloud!
•Amazon, Google, etc. all use similar gear under the VM
•same constraints apply, but you only get a fraction of the box
•pass-through PCIe devices for the best performance
•Avoid EBS in EC2, go with ephemerals
•GCE PDs may need additional read/write threads
@AlTobey
Q & A
Everybody is hiring, including Datastax!
Open Source Mechanic, Datastax

Cassandra Day SV 2014: Designing Commodity Storage in Apache Cassandra
