Ceph and Penguin Computing On Demand
Travis Rhoden
Copyright © 2013 Penguin Computing, Inc. All rights reserved
Who is Penguin Computing?
●
Founded in 1997, with a focus on custom Linux systems
●
Core markets: HPC, enterprise/data-center, HPC cloud services
– We see about a 50/50 mix between HPC and enterprise orders
– Offer turn-key clusters and a full range of Linux servers
●
Now the largest private System Integrator in North America
●
Stable, profitable, growing...
What is Penguin Computing On Demand (POD)?
●
POD launched in 2009 as an HPC-as-a-Service offering
●
Purpose-built HPC cluster for on-demand customers
– Offers low-latency interconnects, high core counts, plentiful RAM for
processing
– Non-virtualized compute resources, focused on absolute compute
performance
– Tuned MPI/cluster stack available “out of the box”
●
“Pay as you go” – only for what you use, charge per core-hour
●
Customizable, persistent user environment
●
Over 50 million commercial jobs run
Original POD designs
●
Original clusters used standalone NFS servers with direct-attached storage (DAS)
●
Login nodes ran as VMs on VMware, then KVM, with VM disks stored locally on the host
Original POD limitations
●
Disparate NFS servers led to non-global namespace
– Users unable to take advantage of all installed storage
– Not all disks able to contribute to performance (no scale-out effect)
– A full NFS server affected co-resident users
– NFS server RAID card a SPoF
●
Never lost data, but did have times where data was inaccessible
●
VM login nodes were handled by a standalone set of hardware
– Storage servers not leveraged for hosting VM disks
POD New Architecture
●
Time for something different
– More expandable
– More fault tolerant
– More flexible
●
OpenStack & Ceph
POD Ceph Usage – OpenStack
●
Ceph OpenStack integration is a big plus
– Store Disk images in Ceph (Glance)
– Store Volumes in Ceph (Cinder)
– Boot VMs straight from Ceph (boot from volume)
– Leverage COW semantics for boot volume creation
– Live migration
●
No immediate need for RADOSGW
– Nice to know it's there if we need it
POD Ceph Usage - RBD
●
The same storage system hosts RBDs for us
●
Each POD user has their $HOME in an RBD
– To make visible to all compute nodes and customer-accessible login
nodes, we mount the RBD on one of several NFS servers and export
from there
– Aren't quite ready to throw full weight into CephFS, but early testing
has started
– We know this creates a performance bottleneck, but the pros outweigh
the cons
POD Ceph Usage – RBD Pros and Cons
●
Pros
– Thin provisioning
– User specific backups or snapshots
– Nice block device to export (1:1 mapping of user to RBD)
●
Cons
– NFS server SPoF and bottleneck
– Loss of parallel access to OSDs
– Slow-ish resize
POD Storage Hardware
●
Started with 5x Penguin Computing IB2712 chassis
– Dual Xeon 5600-series
– 48GB RAM
– Dual 10GbE
– 12x hot-swap 3.5” SATA drives
– 2x internal SSDs for OS and OSD journals
●
6 journals on each SSD
●
60x 2TB → 120TB raw storage
– 109TB available in Ceph
●
XFS on OSDs
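The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant names are made up here, and the idea that the 109TB figure is the 120TB raw capacity expressed in binary units (TiB) is an assumption, not something the slides state.

```python
# Capacity arithmetic for the hardware listed above (names are illustrative).
CHASSIS = 5
DRIVES_PER_CHASSIS = 12   # one OSD per SATA drive
DRIVE_TB = 2              # decimal terabytes per drive
SSDS_PER_CHASSIS = 2

osd_count = CHASSIS * DRIVES_PER_CHASSIS
raw_tb = osd_count * DRIVE_TB
journals_per_ssd = DRIVES_PER_CHASSIS // SSDS_PER_CHASSIS

# Assumption: "109TB available" is the raw capacity reported in TiB (2**40 bytes).
raw_tib = raw_tb * 10**12 / 2**40

print(f"OSDs: {osd_count}")
print(f"Raw capacity: {raw_tb} TB (about {raw_tib:.1f} TiB)")
print(f"Journals per SSD: {journals_per_ssd}")
```

Under that assumption the numbers line up: 60 OSDs, 120 TB raw, roughly 109 TiB, and 6 journals on each SSD.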
POD Ceph Storage Config
●
Running 3 monitors
– On same chassis as OSDs (not recommended by Inktank)
●
Running 2 MDS processes
– On same chassis as OSDs
– 1 active, 1 backup
●
Each chassis has a 2-port 10GbE LAG to ToR switch
●
2 replicas
●
Separate pools for Glance, Cinder, user $HOMEs
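A rough sketch of what this configuration implies, using only the numbers from the slides. It ignores filesystem overhead, Ceph's full-ratio headroom, and protocol overhead on the network, so the results are upper bounds rather than measured values; the variable names are illustrative.

```python
# Back-of-the-envelope figures for the config above (upper bounds only).
AVAILABLE_TB = 109   # "available in Ceph" figure from the hardware slide
REPLICAS = 2
MONITORS = 3
LAG_PORTS = 2
PORT_GBIT = 10

# With 2 replicas every object is stored twice, so usable space is halved.
usable_tb = AVAILABLE_TB / REPLICAS

# Monitors need a strict majority to form a quorum.
quorum = MONITORS // 2 + 1
tolerated_monitor_failures = MONITORS - quorum

# Line rate of the 2-port 10GbE LAG per chassis, in Gbit/s and GB/s.
lag_gbit = LAG_PORTS * PORT_GBIT
lag_gbyte = lag_gbit / 8

print(f"Usable capacity at {REPLICAS} replicas: {usable_tb} TB")
print(f"Monitor quorum: {quorum} of {MONITORS} "
      f"(tolerates {tolerated_monitor_failures} failure)")
print(f"Per-chassis LAG line rate: {lag_gbit} Gbit/s ({lag_gbyte} GB/s)")
```

Since the monitors share chassis with the OSDs, a single chassis failure can take out a monitor and 12 OSDs at the same time, which is one reason co-location is discouraged.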
CephFS on POD
●
Primary use case for storage on POD is users reading and writing data to
their $HOME directory
●
On our HPC clusters, I/O tends to be primarily sequential writes, but we also see
sequential reads and some random I/O
●
Running VMs also produce random I/O
●
Since users can run jobs comprising dozens of compute nodes, potentially
all hitting the same folder(s), would be nice to use CephFS rather than
NFS
●
Testing a scratch space a good way to start
●
Using ceph-fuse, as the cluster runs CentOS 6.3
CephFS initial benchmarks
●
Simple dd, 1GB file, 4MB blocks
– (dd if=/dev/zero of=[dir] bs=4M count=256 conv=fdatasync)
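For readers who want to reproduce this kind of test without dd, here is a minimal Python sketch of the same idea: write zero-filled 4MB blocks sequentially, flush the data to stable storage before stopping the clock, and report throughput. The function name and the temporary-directory target are assumptions made for illustration; on POD the target would be a directory on the filesystem under test. The demo writes only 8 blocks so it runs quickly; count=256 matches the 1GB test on the slide.

```python
import os
import tempfile
import time


def timed_sequential_write(path, block_size=4 * 1024 * 1024, count=256):
    """Write `count` zero-filled blocks to `path`, sync, and time the whole run."""
    block = bytes(block_size)
    # fdatasync mirrors dd's conv=fdatasync; fall back to fsync where unavailable.
    sync = getattr(os, "fdatasync", os.fsync)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()            # push Python's buffer to the OS
        sync(f.fileno())     # force the data to storage before timing stops
    elapsed = time.perf_counter() - start
    total = block_size * count
    return total, elapsed, total / elapsed / 1e6


# Small demo run against a temporary directory (illustrative target only).
with tempfile.TemporaryDirectory() as scratch:
    total, elapsed, mb_per_s = timed_sequential_write(
        os.path.join(scratch, "ddtest.bin"), count=8
    )
print(f"{total} bytes in {elapsed:.3f} s ({mb_per_s:.1f} MB/s)")
```

Including the sync inside the timed region matters: without it the result mostly measures the page cache rather than the storage system.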
Ceph Lessons Learned
●
Our 3rd production Ceph cluster
– 1st has been decommissioned, ran Argonaut and Bobtail, used IPoIB
– 2nd being decommissioned, still running Bobtail
– 3rd is the primary workhorse for a production POD cluster, launched on Bobtail,
now running the latest Cuttlefish
●
For RBD, a very recent Linux kernel is a must if using the kernel client
– Kernels before 3.10 had kernel panic issues when using cephx
●
SSDs nice, but may not be best bang for buck
– 3-4 OSD journals per SSD is ideal, but does add significant cost
– We've seen promising results using higher end RAID controllers in lieu of
SSDs, due to write-back cache, at an overall lower cost
– We still need to test more to determine how this behavior carries over to
sequential vs random, and small vs large I/O.
●
Need to work hard to balance density versus manageable failure domains
– Density very popular, but leads to a lot of recovery traffic if server fails
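The density trade-off can be made concrete with a rough estimate of how much data must be re-replicated when a whole server fails. Everything in this sketch other than the 12x 2TB chassis layout is a hypothetical input chosen for illustration: the 50% fill level, the 10 Gbit/s of sustained recovery bandwidth, and the 36-drive "dense" chassis used for comparison.

```python
def recovery_estimate(drives_per_server, drive_tb, fill_fraction, recovery_gbit_per_s):
    """Return (data to re-replicate in TB, hours at the given sustained rate)."""
    data_tb = drives_per_server * drive_tb * fill_fraction
    bytes_per_s = recovery_gbit_per_s * 1e9 / 8
    hours = data_tb * 1e12 / bytes_per_s / 3600
    return data_tb, hours


# 12 drives matches the chassis in the slides; 36 is a hypothetical denser box.
for drives in (12, 36):
    data_tb, hours = recovery_estimate(
        drives_per_server=drives,
        drive_tb=2,
        fill_fraction=0.5,        # assumed: OSDs half full
        recovery_gbit_per_s=10,   # assumed: sustained aggregate recovery rate
    )
    print(f"{drives} drives/server: {data_tb:.0f} TB to recover, ~{hours:.1f} h")
```

The amount of data to recover scales linearly with drives per server, so tripling density triples the recovery window unless the network and remaining OSDs can absorb proportionally more traffic.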
Thanks!
@off_rhoden
trhoden@penguincomputing.com
@PenguinHPC
