Ceph at DreamHost
A Storage Journey
About Me
• One of the original four founders of DreamHost
• Still active daily at DreamHost
• Have spent a lot of time working on the Ops side
About DreamHost
• Hosting company founded in 1997
• Sage’s other company
• Shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing
• 375k customers, 1.3MM websites
Storage Journey
A long strange trip
His name was Destro
... and then there were more.
The First NetApp
Remote Failover
Remote Failover
Meanwhile...
... and still more.
Lots of NetApps
• Peak of around 125 individual NetApps
• Smallish capacity on each (8TB)
• Internal software continuously moving data between NetApps
• Lots of time spent managing nearly full filers
Ideal
Reality
Hosting Landscape
• Included storage had grown from 50MB to gigabytes, then terabytes.
• Prices stayed the same.
• Eventually went to unlimited storage.
• Usage per customer skyrocketed.
Failed Experiments
Failed Experiments
• ATA over Ethernet (ATAoE) and XFS-based systems
• Performance and stability issues
• 2006-era gear
Failed Experiments
• High capacity
• Nice features
• Expensive
• 85% full and it failed
Some Success
• First on Sun hardware, then Supermicro
• Great stability
• Not enough IO for front-line network storage
Back to Basics
Local RAID
• SATA drives had grown in capacity and were very cheap
• 4-6TB per hosting server
• Less dependence on congested network
• Smaller failure domains
The Good
Local RAID
• No more quotas: scanning the filesystem is too slow
• No more fast failovers
• Multi-hour filesystem checks with ext3
• More failure domains
The Bad
Local RAID
• Complete RAID loss more common than anticipated
• Multiple days to fully restore from backup
The Ugly
Storage Today
Light at the end of the tunnel
Hybrid Mix
• We learned something from every step of the way
• No one size fits all when it comes to storage
• Use whatever is best for the job
• Be ready to change
Best Tool For The Job
A Bit of Everything
• Clustered NetApps and NFS for email
• Local RAID in hosting servers
• ZFS and OpenSolaris backup servers
• Ceph for DreamObjects and DreamCompute
Best Tool For The Job
DreamObjects
• Object storage, S3/Swift compatible
• 2+ petabytes raw storage
• 3x replication, 900+ OSDs
• RGW behind HAProxy
• Row, rack, node and disk fault tolerant
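The "row, rack, node and disk fault tolerant" bullet can be illustrated with a toy placement routine. This is only a sketch under stated assumptions: it is not Ceph's CRUSH algorithm and not DreamHost's actual CRUSH map. The `OSD` tuple, `place_replicas`, and the 16-OSD layout are hypothetical names and values made up for illustration. The idea it shows is that each of the 3 replicas lands in a different failure domain, so losing one whole rack (or row, node, disk) takes out at most one copy.

```python
import hashlib
from collections import namedtuple

# Hypothetical description of where an OSD (one disk) physically lives.
OSD = namedtuple("OSD", "row rack node disk")
LEVELS = ("row", "rack", "node", "disk")


def domain_key(osd, domain):
    """Return the part of the location that identifies the failure domain."""
    depth = LEVELS.index(domain) + 1
    return tuple(osd[:depth])


def place_replicas(object_name, osds, replicas=3, domain="rack"):
    """Pick `replicas` OSDs so that no two share the same failure domain.

    Assumption: a simple deterministic hash ranking stands in for CRUSH.
    """
    def score(osd):
        key = f"{object_name}:{osd.row}:{osd.rack}:{osd.node}:{osd.disk}"
        return hashlib.sha256(key.encode()).hexdigest()

    chosen, used = [], set()
    for osd in sorted(osds, key=score):
        bucket = domain_key(osd, domain)
        if bucket in used:
            continue  # this failure domain already holds a replica
        chosen.append(osd)
        used.add(bucket)
        if len(chosen) == replicas:
            return chosen
    raise ValueError(f"not enough distinct {domain}s for {replicas} replicas")


# Made-up cluster: 2 rows x 2 racks x 2 nodes x 2 disks = 16 OSDs.
cluster = [OSD(r, k, n, d)
           for r in range(2) for k in range(2)
           for n in range(2) for d in range(2)]

placement = place_replicas("bucket/photo.jpg", cluster, replicas=3, domain="rack")
for osd in placement:
    print(osd)
```

With `domain="row"` the same made-up cluster raises `ValueError`, since two rows cannot hold three replicas separately; a real deployment needs at least as many failure domains at the chosen level as it has replicas.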
DreamCompute
• OpenStack-based public cloud
• 3+ petabytes raw storage
• All storage is on Ceph RBD
• Boot and attachable volumes
• Nicira SDN + Ceph, live migration
Cockpit Pod
• HA Load Balancer
• MySQL / PostgreSQL
• Horizon, Glance, Keystone, Nova, Quantum, Cinder
• Nicira NVP
• Glance Store (Ceph)
• OS Mirrors (apt)
• Ceph Monitors
• Opscode Chef
• Logstash + Graphite
• Networking Gear
Compute Pod
• 8x Hypervisor Nodes: 192 GB RAM, 64 AMD cores each
• 14x Storage Nodes: 12x 3TB disks each
• Networking Gear
Pods
• 512 cores
• 1.5TB of RAM
• 504TB raw storage
• 168TB redundant storage
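The pod totals follow directly from the per-node figures on the Compute Pod slide, and a few lines of arithmetic make that explicit. This is a minimal sketch: `PodSpec` and `pod_totals` are hypothetical names, and the replication factor of 3 is inferred from the slide's 504TB raw versus 168TB redundant figures rather than stated for this cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodSpec:
    # Per-pod hardware counts taken from the Compute Pod slide.
    hypervisors: int = 8
    cores_per_hypervisor: int = 64
    ram_gb_per_hypervisor: int = 192
    storage_nodes: int = 14
    disks_per_storage_node: int = 12
    disk_tb: int = 3
    # Assumption: 3 copies of every object, inferred from 504TB / 168TB.
    replicas: int = 3


def pod_totals(spec):
    """Aggregate per-node figures into per-pod totals."""
    raw_tb = spec.storage_nodes * spec.disks_per_storage_node * spec.disk_tb
    return {
        "cores": spec.hypervisors * spec.cores_per_hypervisor,
        "ram_tb": spec.hypervisors * spec.ram_gb_per_hypervisor / 1024,
        "raw_tb": raw_tb,
        "redundant_tb": raw_tb / spec.replicas,
    }


totals = pod_totals(PodSpec())
print(totals)
# {'cores': 512, 'ram_tb': 1.5, 'raw_tb': 504, 'redundant_tb': 168.0}
```

Changing `replicas` or the disk size in `PodSpec` shows how quickly usable capacity moves relative to raw capacity under replication.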
Networking
• ODM switches w/ Linux
• 10Gbps everywhere
• IPv6 from the ground up
• Spine and leaf topology
• 120 Gbps between pods (!)
The Internets
Thar be dragons here!
Nicira NVP
CephFS & The Future
• The return of Failovers
• No more backup servers
• No more major disk-related outages
• Fault tolerant low cost hosting
Storage Panacea?
Thanks!
@dallas
dallas@dreamhost.com

Ceph Day Santa Clara: Ceph at DreamHost

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the CloudInes Sombra
 
DrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilityDrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilitycherryhillco
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProRedis Labs
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowEd Balduf
 
High Performance Drupal
High Performance DrupalHigh Performance Drupal
High Performance DrupalChapter Three
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in JavaRuben Badaró
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudAnshum Gupta
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Tim Lossen
 
High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2ScribbleLive
 
AWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksAWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksDirk Harms-Merbitz
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationCeph Community
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraJon Haddad
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio, Inc.
 
MySQL in the Hosted Cloud
MySQL in the Hosted CloudMySQL in the Hosted Cloud
MySQL in the Hosted CloudColin Charles
 
V mware2012 20121221_final
V mware2012 20121221_finalV mware2012 20121221_final
V mware2012 20121221_finalWeb2Present
 
Life After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudLife After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudOSCON Byrum
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
 

Similar to Ceph Day Santa Clara: Ceph at DreamHost (20)

Getting started with Riak in the Cloud
Getting started with Riak in the CloudGetting started with Riak in the Cloud
Getting started with Riak in the Cloud
 
DrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilityDrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalability
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for TomorrowOpenStack Cinder, Implementation Today and New Trends for Tomorrow
OpenStack Cinder, Implementation Today and New Trends for Tomorrow
 
High Performance Drupal
High Performance DrupalHigh Performance Drupal
High Performance Drupal
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?Key-Value-Stores -- The Key to Scaling?
Key-Value-Stores -- The Key to Scaling?
 
High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2High Scalability Toronto: Meetup #2
High Scalability Toronto: Meetup #2
 
AWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricksAWS Cloud experience concepts tips and tricks
AWS Cloud experience concepts tips and tricks
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata Services
 
MySQL in the Hosted Cloud
MySQL in the Hosted CloudMySQL in the Hosted Cloud
MySQL in the Hosted Cloud
 
V mware2012 20121221_final
V mware2012 20121221_finalV mware2012 20121221_final
V mware2012 20121221_final
 
Life After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data CloudLife After Sharding: Monitoring and Management of a Complex Data Cloud
Life After Sharding: Monitoring and Management of a Complex Data Cloud
 
Redis
RedisRedis
Redis
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 

Recently uploaded

unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
What is DBT - The Ultimate Data Build Tool.pdf
Ceph Day Santa Clara: Ceph at DreamHost

  • 1. Ceph at DreamHost A Storage Journey
  • 2. About Me • One of the original four of DreamHost • Still active daily at DreamHost • Have spent a lot of time working on the Ops side.
  • 3. • Hosting company founded in 1997 • Sage’s other company • shared hosting, virtual servers, dedicated servers, cloud storage, cloud computing • 375k customers, 1.3MM websites
  • 4. Storage Journey A long strange trip
  • 5. His name was Destro
  • 6. ... and then there were more.
  • 11. ... and still more.
  • 12. Lots of NetApps • Peak of around 125 individual NetApps • Smallish capacity on each (8TB) • Internal software continuously moving data between NetApps • Lots of time spent managing nearly full filers
  • 13. Ideal
  • 15. Hosting Landscape • Included storage had grown from 50MB to gigabytes, then terabytes. • Prices stayed the same. • Eventually went to unlimited Storage • Usage per customer skyrocketed.
  • 17. Failed Experiments • ATAoE and XFS-based systems • Performance & Stability issues • 2006 era gear
  • 18. Failed Experiments • High capacity • Nice features • Expensive • 85% full and it failed
  • 19. Some Success • First on Sun hardware then Supermicro • Great stability • Not enough IO for front-line network storage
  • 21. Local RAID: The Good • SATA drives had grown in capacity and were very cheap • 4-6TB per hosting server • Less dependence on congested network • Smaller failure domains
  • 22. Local RAID: The Bad • No more quota, too slow to scan filesystem • No more fast failovers • Multiple hour filesystem check with ext3 • More failure domains
  • 23. Local RAID: The Ugly • Complete RAID loss more common than anticipated • Multiple days to fully restore from backup
  • 24. Storage Today Light at the end of the tunnel
  • 25. Hybrid Mix: Best Tool For The Job • We learned something from every step of the way • No one size fits all when it comes to storage • Use whatever is best for the job • Be ready to change
  • 26. A Bit of Everything: Best Tool For The Job • Clustered NetApps and NFS for email • Local RAID in hosting servers • ZFS and OpenSolaris backup servers • Ceph for DreamObjects and DreamCompute
  • 27. • Object Storage, S3/Swift compatible • 2+ Petabytes raw storage • 3x replication, 900+ OSDs • RGW behind HAProxy • Row, rack, node and disk fault tolerant
  • 28. • OpenStack-based Public Cloud • 3+ Petabytes raw storage • All storage is on Ceph RBD • Boot and Attachable Volumes • Nicira SDN + Ceph, Live Migration
  • 29. Cockpit Pod: HA Load Balancer, MySQL / PostgreSQL, Horizon, Glance, Keystone, Nova, Quantum, Cinder, Nicira NVP, Glance Store (Ceph), OS Mirrors (apt), Ceph Monitors, Opscode Chef, Logstash + Graphite, Networking Gear. Compute Pod (three shown, each with Nicira NVP): 8x Hypervisor Node (192 GB RAM, 64 AMD cores), 14x Storage Node (12x 3TB disks), Networking Gear. Pods • 512 cores • 1.5TB of RAM • 504TB raw storage • 168TB redundant storage. Networking • ODM switches w/ Linux • 10Gbps everywhere • IPv6 from the ground up • Spine and leaf topology • 120 Gbps between pods (!). The Internets: Thar be dragons here!
  • 30. CephFS & The Future: Storage Panacea? • The return of Failovers • No more backup servers • No more major disk-related outages • Fault tolerant low cost hosting
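
The per-pod totals on slide 29 follow from the node counts given on the same slide. The short Python sketch below reproduces that arithmetic. The node, core, RAM and disk figures are taken from the slide; the replication factor of 3 is an assumption, inferred from the 504TB raw to 168TB redundant ratio and from the 3x replication stated for the object store on slide 27. The function and variable names are illustrative only.

```python
# Per-pod hardware figures as listed on slide 29.
HYPERVISOR_NODES = 8
CORES_PER_HYPERVISOR = 64
RAM_GB_PER_HYPERVISOR = 192
STORAGE_NODES = 14
DISKS_PER_STORAGE_NODE = 12
DISK_TB = 3

# Assumption: 3 replicas, inferred from 504TB raw vs 168TB redundant.
REPLICATION_FACTOR = 3


def pod_totals():
    """Return aggregate compute and storage capacity for one pod."""
    raw_tb = STORAGE_NODES * DISKS_PER_STORAGE_NODE * DISK_TB
    return {
        "cores": HYPERVISOR_NODES * CORES_PER_HYPERVISOR,
        "ram_tb": HYPERVISOR_NODES * RAM_GB_PER_HYPERVISOR / 1024,
        "raw_tb": raw_tb,
        # Usable capacity is raw capacity divided by the replica count.
        "redundant_tb": raw_tb / REPLICATION_FACTOR,
    }


totals = pod_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

This ignores filesystem overhead and any free-space headroom a real cluster would reserve, so the usable figure is an upper bound rather than a sizing recommendation.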