Multi-Cell OpenStack:
How to Evolve your Cloud
to Scale
● Belmiro Moreira - CERN
● Matt Van Winkle - Rackspace
● Sam Morrison - NeCTAR, University of
Melbourne
Cells: How we use them
at NeCTAR
Sam Morrison
sam.morrison@unimelb.edu.au
NeCTAR Research Cloud
● Started in 2011
● Funded by the Australian Government
● 8 institutions around the country
● In production since early 2012 - OpenStack Diablo
● All federated to appear as 1 cloud from the user's point of view
● Put the compute near the data and tools
● 5000+ users
NeCTAR Sites
● University of Melbourne
● National Computational Infrastructure
● Monash University
● Queensland CyberInfrastructure Foundation
● eResearch SA
● University of Tasmania
● Intersect, NSW
● iVEC, WA
Cells to build a Federation
● Use cells to federate geographically
separated sites
● Different hardware/networks/people
● Parent cell runs centrally at unimelb, along with keystone/cinder/glance etc. (no neutron)
● Each site has 1 or more compute cells
● These roughly match up to availability zones from a user's perspective (cells are behind the scenes)
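As a rough illustration of the layout on this slide, the sketch below models a two-level tree: one parent (API) cell and several compute cells, each tagged with the availability zone that users see. The class names, cell names, and zone names are hypothetical. This is not Nova's API and not NeCTAR's real configuration.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeCell:
    """A child cell at one site (hypothetical model, not a Nova class)."""
    name: str
    site: str
    availability_zone: str


@dataclass
class ParentCell:
    """The central API cell that users talk to."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, cell: ComputeCell) -> None:
        self.children.append(cell)

    def cells_for_zone(self, zone: str) -> list:
        """Return the names of the compute cells backing a user-visible zone."""
        return [c.name for c in self.children if c.availability_zone == zone]

    def zones(self) -> list:
        """User-visible zones, sorted; the cells themselves stay hidden."""
        return sorted({c.availability_zone for c in self.children})


# Illustrative names only: one site with two datacenters, one with a single one.
parent = ParentCell("api")
parent.add_child(ComputeCell("site-a-dc1", "site-a", "zone-a"))
parent.add_child(ComputeCell("site-a-dc2", "site-a", "zone-a"))
parent.add_child(ComputeCell("site-b-dc1", "site-b", "zone-b"))

print(parent.zones())
print(parent.cells_for_zone("zone-a"))
```

A site with two datacenters shows up here as two cells behind a single zone, which is one way to read "roughly match up to availability zones".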
How big?
● Each site ~4000 cores, ~150 hypervisors
● 6 sites in production, 4600+ instances
● Last 2 sites in prod by end of year
● ~1000 hypervisors, 40k cores
● ~10 compute cells
● Some sites have multiple datacenters so
have multiple cells
Pain points
● Cell scheduling isn’t smart
● Broadcast calls rely on all cells to be alive
● Not many people to share experiences with
● Upgrades, although Havana → Icehouse could be done in stages, which was much easier
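The broadcast pain point can be pictured with a small sketch. The strict variant fails the whole request as soon as one cell is unreachable, which is the behaviour the slide complains about. The tolerant variant is only one possible alternative design and is not something the slides describe. All function and cell names are hypothetical, and this is not how nova-cells is implemented.

```python
def broadcast_strict(cells, call):
    """Fail the whole request if any cell is unreachable."""
    return {name: call(name) for name in cells}


def broadcast_tolerant(cells, call):
    """Collect answers from live cells and report the ones that failed."""
    results, failed = {}, []
    for name in cells:
        try:
            results[name] = call(name)
        except ConnectionError:
            failed.append(name)
    return results, failed


def list_instances(cell_name):
    # Hypothetical per-cell call; "cell-b" simulates a cell that is down.
    if cell_name == "cell-b":
        raise ConnectionError(f"{cell_name} is unreachable")
    return [f"{cell_name}-vm1"]


cells = ["cell-a", "cell-b", "cell-c"]

try:
    broadcast_strict(cells, list_instances)
    strict_ok = True
except ConnectionError:
    strict_ok = False

results, failed = broadcast_tolerant(cells, list_instances)
print(strict_ok, results, failed)
```

The trade-off in the tolerant variant is that callers receive partial results and have to decide what to do about the cells listed in `failed`.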
Things we’ve added, not in
trunk (yet)
● Security group syncing
● EC2 ID mappings (needed for metadata)
● Availability zone / aggregate support
● Flavour management
*We assume a cell only has 1 parent
Cells: How we use them
at CERN
Belmiro Moreira
email: belmiro.moreira @ cern.ch
@belmiromoreira
CERN
● Conseil Européen pour la Recherche Nucléaire – aka
European Organization for Nuclear Research
● Founded in 1954 with an international treaty
○ 21 member states; other countries contribute to experiments
○ Situated between Geneva and the Jura Mountains, straddling the Swiss-French border
● CERN's mission is to do fundamental research
● CERN provides particle accelerators and other infrastructure
for high-energy physics research
CERN - Cloud Infrastructure
● In production since July 2013
● Performed two upgrades: Grizzly -> Havana -> Icehouse
○ Currently running: nova; glance; keystone; horizon; cinder w/ Ceph;
ceilometer
● RDO distribution on SLC6; pip with Windows Server 2012 R2
● 2 geographically separated data centres
○ Geneva (Switzerland) and Budapest (Hungary)
● Numbers
○ ~3000 compute nodes (75k cores; 140TB RAM)
■ ~2900 KVM; ~100 Hyper-V
○ ~8000 virtual machines
CERN - Cloud Infrastructure - Cells
● Why do we use cells?
○ Scale transparently between different data centres
○ Availability and resilience
○ Isolate different use cases
● Today: 1 API cell and 8 compute cells
○ 2-level tree
○ Sizes range from 100 to ~1600 compute nodes
○ 6 compute cells in Switzerland; 2 compute cells in Hungary
● “Shared” and “Private” Cells
○ 3 availability zones available in “Shared” Cells
CERN - Cells Limitations
● Missing functionality
○ Security Groups
○ Flavor propagation (api -> compute)
○ Manage aggregates on api Cell
○ Server groups
● Cell scheduler
● Ceilometer integration
CERN - Cells Challenges
● ~74,000 more cores by the beginning of 2015
○ How to organize and distribute nodes between different cells?
● Split current large cells into cells with a small number (~200) of compute nodes
○ Expect to have 30+ cells by the end of 2015
○ How to manage a large number of Cells?
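The split described above is simple arithmetic, sketched here under the assumption that no cell should exceed the ~200-node target. The function name is hypothetical. The inputs are the ~1600-node largest cell and the ~3000 compute nodes in total quoted on the earlier CERN slides.

```python
import math


def cells_needed(compute_nodes: int, nodes_per_cell: int = 200) -> int:
    """Number of cells required if no cell exceeds nodes_per_cell nodes."""
    if nodes_per_cell <= 0:
        raise ValueError("nodes_per_cell must be positive")
    return math.ceil(compute_nodes / nodes_per_cell)


print(cells_needed(1600))  # largest current cell alone -> 8
print(cells_needed(3000))  # whole current fleet -> 15
```

At ~200 nodes per cell the current fleet alone would need about 15 cells, and the planned growth pushes that higher, which is in line with the 30+ cells expected on the slide.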
Cells at Rackspace
Cells: How to Evolve Your Cloud to Scale
Created by: Matt Van Winkle @mvanwink
Modified Date: 10/29/2014
Rackspace
www.rackspace.com
• Managed Cloud company offering a suite of dedicated and cloud hosting products
• Founded in 1998 in San Antonio, TX
• Home of Fanatical Support
• More than 200,000 customers in 120 countries
Rackspace – Cloud Infrastructure
• In production since August 2012
– Currently running: Nova; Glance; Neutron; Ironic; Swift; Cinder
• Regular upgrades from trunk
– Package built on trunk pull from 10/21 in testing now
• Compute nodes are Debian based
– Run as VMs on hypervisors and manage via XAPI
• 6 geographic regions around the globe
– DFW; ORD; IAD; LON; SYD; HKG
• Numbers
– Tens of thousands of hypervisors (over 330K cores, 1+ petabyte of RAM)
– All XenServer
– Over 150,000 virtual machines
Rackspace – Cloud Infrastructure - Cells
• Why do we use cells?
– Manage multiple flavor classes
– Network resources (public IPs, private IPs, aggregation routers, etc.)
– Network constraints
– Continual supply chain
• 1 global API cell per region with multiple compute cells (3 – 35+)
– 2-level tree
– Size between ~100 and ~600 hosts per cell
• Control infrastructure exists as instances in a small OpenStack deployment
• All cells available to all tenants
– Tested “dedicated” cells for potential large customers
Rackspace – Cells Limitations
• Missing functionality
– Security groups
– Host aggregates
• Scheduler
– No “disable”
– Incomplete host statuses
• Other services are not cell aware
– Neutron is a prime example
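To make the scheduler items concrete, the sketch below shows a filter-then-weigh cell picker with a hypothetical `disabled` flag, the kind of switch the "No disable" item says is missing. The field names and the "most free RAM" weighting are assumptions for illustration. This is not Nova's cell scheduler.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    """Capacity snapshot for one compute cell (hypothetical fields)."""
    name: str
    free_ram_mb: int
    disabled: bool = False


def pick_cell(cells, required_ram_mb):
    """Drop disabled or full cells, then pick the one with the most free RAM."""
    candidates = [
        c for c in cells
        if not c.disabled and c.free_ram_mb >= required_ram_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.free_ram_mb).name


cells = [
    CellState("cell-1", free_ram_mb=64_000),
    CellState("cell-2", free_ram_mb=256_000, disabled=True),  # drained by an operator
    CellState("cell-3", free_ram_mb=128_000),
]

print(pick_cell(cells, required_ram_mb=8_192))    # cell-3
print(pick_cell(cells, required_ram_mb=512_000))  # None
```

Without the flag, `cell-2` would win every placement because it reports the most free RAM, even while an operator is trying to take it out of rotation.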
Rackspace – Cells Challenges
• Increasing number of flavor classes
– Different hardware specs per class
– Sizing varies by average VM density
• Multiple vendor sources
– Subtle hardware differences in same specs across different vendors
• Scaling global services with cell growth
– Still don’t have the perfect ratios
Cells Feature Completion
• Nova Dev team met this morning to discuss cells in a few sessions:
– Cells – Wednesday, November 5, 09:00
– Cells continued – Wednesday, November 5, 09:50
• Areas of discussion
– Feature completion
– No-op/single cell as default
– Cell awareness in APIs
• Recap from sessions
Thank You!
● Belmiro Moreira - CERN - belmiro.moreira@cern.ch
● Matt Van Winkle - Rackspace - @mvanwink
● Sam Morrison - NeCTAR, University of Melbourne - sam.morrison@unimelb.edu.au
Questions?

Multi-Cell OpenStack: How to Evolve Your Cloud to Scale - November, 2014
