What key capabilities are important to enterprises when building a cloud?
This presentation will discuss these 5 Keys:
1) Self-Service Portal
2) Automated Provisioning
3) Chargeback/Showback
4) Capacity Management & Planning
5) Leverage Existing Infrastructure
Presented by Joe Fitzgerald
GM, Cloud Management Products BU, Red Hat
http://www.redhat.com
Building Enterprise Clouds - Key Considerations and Strategies - RED HAT
1. Toronto
June 18, 2014
Building Enterprise Clouds - Key
Considerations and Strategies
Joe Fitzgerald
GM, Cloud Management Products BU
Red Hat
2. 2
“I want to build a cloud....”
Self-Service Portal
Automated Provisioning
Chargeback/Showback
Capacity Management & Planning
Leverage Existing Infrastructure
What KEY Capabilities are Important?
3. 3
1. How do I provide my users with self-service yet still control what they
can see and do?
2. How do I tie self-service to what's going on in my infrastructure?
3. How do I provide adequate support and service levels when I give
users control?
4. How do I ensure compliance in a cloud?
5. How can I integrate this “cloud” into my existing infrastructure tools
and processes?
Challenges
What we hear from customers in initial cloud meetings
4. 4
6. How do I manage my capacity to maximize utilization while still
delivering adequate/good performance and availability?
7. How can I leverage OpenStack for my cloud workloads? What about
my traditional workloads?
8. How do I avoid lock-in so I can leverage whatever platform(s) is best
for my business?
9. How do I utilize public cloud resources in a controlled way?
10. How can I connect my cloud to my service catalog and cmdb?
Challenges....continued
What we hear from customers in initial cloud meetings
6. 6
Compliance, Chargeback, Quota Enforcement, Approval Workflow, Self-Service
Clouds Simplify User Experience...
But Create Management Complexity
Service isn't working: did anything change? What?
Where is the best place to provision?
Which workloads are over-provisioned?
Why is a service slow?
Why am I out of resources?
When will I run out of capacity?
How is my cloud running?
Private Cloud
7. 7
Compliance, Chargeback, Quota Enforcement, Approval Workflow, Self-Service
Hybrid Clouds Create Even More Complexity
Service isn't working: did anything change? What?
Where is the best place to provision?
Which workloads are over-provisioned?
Why is a service slow?
Why am I out of resources?
When will I run out of capacity?
How is my cloud running?
Private Cloud Public Cloud
8. 8
Cloud Enablement: Self-Service, Approval Workflow, Quota Enforcement, Chargeback, Compliance
Cloud Management: Root Cause Analysis, Config Mgmt, Optimize, Resource Mgmt, Capacity Planning
Need Cloud Management - “Back Office”
Private Cloud Public Cloud
16. 16
CLOUD WORKLOAD MANAGEMENT/OPTIMIZATION
1. Workload Balancing
Normal Operating Range
2. Capacity Management
3. Operations and Support
Where is the best place to run a new workload?
How can I optimize existing resources?
I didn't create the workload, but I need to determine what the problem is and how to fix it
What is causing spikes?
Can cluster “X” handle workload?
Where do I have performance issues?
Where do I have waste?
Identify root cause and reduce mean time to resolution (MTTR) by viewing workloads in 4 dimensions:
Timeline
VM Inspection
Consumption
Drift Analysis
18. 18
AUTOMATING IT PROCESS
Protect Environment – Stop VM if it Breaks Policy
Sample Rule: Every Workload must have correct SW, Patches installed
● Users only see conforming VMs/Workloads
● Non-conforming VMs prevented from running
● Policy breach notifications sent automatically
● Tagging certain items allows one to apply policies to only tagged items
CLOUDFORMS
YES NO
Help Desk
Security Team
IT Management
Converged
Infrastructure
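The sample rule on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (check tagged workloads against a required-software rule, stop the ones that fail, send a notification); it is not CloudForms code, and the names `VM`, `REQUIRED_PACKAGES` and `enforce_policy` are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical rule: software every tagged workload must have installed.
REQUIRED_PACKAGES = {"antivirus", "security-patches"}


@dataclass
class VM:
    name: str
    tags: set = field(default_factory=set)
    installed: set = field(default_factory=set)
    running: bool = True


def enforce_policy(vms, policy_tag, notify):
    """Stop tagged VMs that miss required software; return their names."""
    stopped = []
    for vm in vms:
        if policy_tag not in vm.tags:
            continue  # the policy applies only to tagged items
        missing = REQUIRED_PACKAGES - vm.installed
        if missing:
            vm.running = False  # non-conforming VM is prevented from running
            notify(f"{vm.name} stopped: missing {sorted(missing)}")
            stopped.append(vm.name)
    return stopped


messages = []  # stands in for mail to help desk / security team
fleet = [
    VM("web01", {"production"}, {"antivirus", "security-patches"}),
    VM("db01", {"production"}, {"antivirus"}),
    VM("scratch01", {"sandbox"}, set()),
]
stopped = enforce_policy(fleet, "production", messages.append)
```

Here only `db01` is stopped: `web01` conforms, and `scratch01` does not carry the `production` tag, so the rule never looks at it.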
19. 19
WASTE DETECTION
Optimize the Environment
● VM sprawl
● Incorrectly configured workloads
● Datastore wastage
● Over-allocated resource pools
Region          Resource   Allocated   Actual
North America   CPU        50 GHz      34 GHz
North America   Memory     60 GB       42 GB
North America   Storage    400 GB      187 GB
Europe          CPU        30 GHz      26 GHz
Europe          Memory     45 GB       39 GB
Europe          Storage    250 GB      237 GB
Asia            CPU        20 GHz      17 GHz
Asia            Memory     30 GB       26 GB
Asia            Storage    150 GB      142 GB
Legend: VM; VM sprawl; incorrectly configured workload
Over-allocated: CPU by 16 GHz, memory by 18 GB, storage by 213 GB
Understand resource consumption today and trending over time:
Storage consumption: 78% used, 22% free
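The over-allocation callout on this slide is simply allocated minus actual. A minimal Python sketch of that arithmetic follows; the figures are read from the slide, but mapping each set of numbers to a region is an assumption based on the order in which they appear, and the function names are hypothetical.

```python
# (allocated, actual) per resource; region assignment is inferred from
# the order of the figures on the slide.
USAGE = {
    "North America": {"cpu_ghz": (50, 34), "memory_gb": (60, 42), "storage_gb": (400, 187)},
    "Europe": {"cpu_ghz": (30, 26), "memory_gb": (45, 39), "storage_gb": (250, 237)},
    "Asia": {"cpu_ghz": (20, 17), "memory_gb": (30, 26), "storage_gb": (150, 142)},
}


def over_allocation(region):
    """Return allocated minus actual for every resource in a region."""
    return {res: alloc - actual for res, (alloc, actual) in USAGE[region].items()}


def most_wasteful(resource):
    """Return the region with the largest over-allocation of one resource."""
    return max(USAGE, key=lambda region: over_allocation(region)[resource])


na_waste = over_allocation("North America")
worst_storage = most_wasteful("storage_gb")
```

For the first set of figures this reproduces the slide's callout: 16 GHz of CPU, 18 GB of memory and 213 GB of storage allocated but not used.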
27. Recommendations
● Design private cloud with management in mind, assume
multiple platforms, avoid lock-in
● Consider cloud management platforms that are
integrated, open, multi-cloud
● Allocate and track resource usage and efficiency using
quotas and optimization tools
● Enforce policies regarding workload placement, cloud
usage, configuration management and licensing
● Plan for hybrid clouds with traditional and cloud
workloads
30. 30
CLOUD WORKLOAD MANAGEMENT/OPTIMIZATION
1. Workload Balancing
Normal Operating Range
2. Capacity Management
Where is the best place to run a new workload?
How can I optimize existing resources?
What is causing spikes?
Can cluster “X” handle workload?
Where do I have performance issues?
Where do I have waste?
31. 31
N-TIER APPLICATION SERVICES
Orchestrate Deployment and Management
VM Templates:
● Web server
● Application server
● Database Server
I need resources for application service “X”: 2 Web servers, 1 JBoss App server, and 1 Oracle DB server
How long do you need it?
How big do you need it?
What is its purpose?
90 day project
Medium size
App Development
1
2
Post-Provisioning
Configuration:
● Satellite
● Puppet
● Chef
5
6
Intelligent Service Delivery:
● Where can I place this workload?
● What policies may affect placement?
Chargeback:
- Whole Unit
- Allocated
- Actual Usage
- Tagged
N-Tier Application
Service Request
Management
Approval Workflow
3
4
Converged
Infrastructure
● SMS
● BladeLogic
● etc....
32. 32
Converged
Infrastructure
What policies affect placement?
Which options offer least cost?
Where do I have available capacity?
Requests: Dev, QA, Prod
CLOUD BROKERING
Controlling Where Requests Get Met
This is another expression of our ideal—the core message.
It’s the same topic, but written in a way that is more conversational. It’s easier to use as inspiration for marketing material.
These are just two of what will be many, many variations and iterations of this messaging. We have a lot of different resources, from a copy and a BU perspective.
And that’s important—addressing each customer or potential customer with the message (or portion of the message) that’s appropriate to them.
Some prospects will know what they want from their cloud deployment and have thought through their needs and wants; others will be less mature in their cloud approach.
Either way, it's important for us to probe in key areas we've learned are important to our existing customers. This will help us target our capabilities and may help some prospects recognize needed capabilities they have not thought of (we become a trusted advisor).
Self-Service Provisioning – Prospects all want this! But they need controls and analytics to provide it operationally. Intelligent workload placement is important as orgs look to automate this process.
Cloud Workload Mgmt – This is about optimizing my environment. Cloud doesn't really save time and money if orgs simply throw resources (compute, storage, networking) at the problem as consumption increases. Being able to quickly identify and rectify problems is also key to cloud operations.
Chargeback – Most orgs want/need to be able to charge back (or at least show back) to the business for infrastructure usage, especially in a self-service model. Detailed tracking and monitoring of CPU, storage, memory, and network is key.
Capacity Management & Planning – These are actually two different disciplines. Capacity mgmt lets IT know when resources are trending toward limits set by IT. It lets them see resource availability across the environment and make best-fit recommendations for new workloads based on availability, IT policies, and cost. Capacity planning with CloudForms lets IT model future scenarios to see the impact before actually making a change. (If I add more CPU to cluster “x”, will it cause a problem with networking, storage, and memory resources?) Being able to model future additions and see the projected impact helps orgs make informed, cost-effective decisions about additions to their cloud environment.
Cloud Brokering – Use intelligence about not only available resource pools, but also policies that affect placement (production workloads cannot run in public cloud) and which options offer the least cost (placing this workload on RHEV saves me licensing cost on VMware).
Deploy N-Tier Apps – Orgs need to offer/deliver “workloads” in addition to simple virtual machines (VMs). By workloads I mean n-tier apps: this could be a combination of web server, app server, and database server. This is a step in the direction of PaaS, but not as extensive and deep as OpenShift Enterprise.
Public Cloud Flexing and Bursting – Many orgs want to take advantage of readily available public cloud resources, but want to control usage and dictate conditions. Orgs may want to use Public Cloud for DR purposes, seasonal spikes in business demand, and/or more permanent use for dev and test reasons.
Use Existing Infrastructure – Orgs want to use existing platforms (e.g. VMware) and have the option to add new ones when and if they want to (OpenStack, RHEV). Integrate with config mgmt, service catalog, and other systems monitoring tools.
Manage Converged Infrastructure – Converged infrastructure is pre-configured hardware stacks like vBlock, FlexPod, and PureFlex. Typically these commercially available “stacks” come with cloud management capabilities, but some orgs want better capabilities than what's provided, hence an opportunity for CloudForms. Some orgs may also go to their local systems integrators and task them with building a “stack” for them, and this may be done without consideration for management. CloudForms has an opportunity to win business here as well.
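The capacity management note above talks about knowing when resources are trending toward a limit. One simple way to illustrate that idea is a straight-line fit through recent usage samples, extrapolated to the limit. The sketch below is an assumption about how such a projection could be done, not a description of how CloudForms does it; the sample data and the function name `days_until_limit` are made up.

```python
def days_until_limit(samples, limit):
    """Estimate days after the last sample until usage reaches `limit`.

    `samples` is a list of (day, used) pairs. A least-squares line is
    fitted through them. Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_at_limit = (limit - intercept) / slope
    return max(0.0, day_at_limit - samples[-1][0])


# Hypothetical weekly samples of datastore usage in GB.
history = [(0, 700), (7, 720), (14, 740), (21, 760)]
remaining = days_until_limit(history, limit=1000)
```

With these samples usage grows by 20 GB a week, so the 1000 GB limit is about 84 days past the last sample. Real consumption is rarely this linear, which is why the notes also mention peak events and scenario modeling.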
1. How do I provide my users with self-service yet still control what they can see and do?
Need RBAC to parse what services are available per role
Lack quota enforcement to cap resource availability
No workflow approval capability for sensitive requests
Lack control of managed resources (start/stop/pause, snapshots, cloning, add/remove resources throughout workload lifecycle...)
Without the above controls, a self-service provisioning process can get IT into trouble quickly (Infrastructure becomes a free-for-all)
2. How do I tie self-service to what's going on in my infrastructure?
Inability to know in near real time where to best place requests
No view into available memory, CPU, network, storage resources without manual intervention (no automation)
Lack consolidated view of workloads across platforms (single pane of glass)
3. How do I provide adequate support and service levels when I give users control?
Root cause analysis, VM inspection, Drift analysis, Alerting, and Eventing (if these 3 things happen together then let me know)
4. How do I ensure compliance in a self-service cloud?
Lack policies to govern the environment (e.g. production workloads never run in the public cloud; all Windows VMs must have anti-virus protection installed), and if someone tries to do something in violation of a policy, automatically shut it down. (E.g. all Windows workloads must have virus protection installed; if a user later tries to remove it, we can automatically shut down the VM, notify the applicable people, and rectify the policy violation.)
Lack ability to keep users from mis-allocating systems and apps (User “X” doesn't have the ability to self-provision Oracle DB VMs)
5. How can I integrate this “cloud” into my existing infrastructure tools and processes?
Can I use my existing service catalog?
No apparent path to integrate CMDB, change mgmt systems, IP address mgmt, financial mgmt systems (This may require consulting to custom build integration)
6. How can I plan for capacity requirements in a self-service cloud?
Lack insight into current consumption and utilization (where do I have room for workload?); Lack insight into trends (when am I going to run out of a resource) and peak events; Don't know if I'm under/over allocating resources
7. How do I handle N-Tier application stacks and automate delivery to users?
No ability to orchestrate a sequenced delivery and installation as a single service without doing manually; No verification and secondary configuration of multiple application components
8. How do I manage my capacity to maximize utilization while still delivering adequate/ good performance and availability?
Portability: moving workloads around for best utilization. Without visibility across the landscape and a clear understanding of what's allocated/consumed, it is difficult to optimize.
9. How do I utilize public cloud resources in a controlled way?
Want to use public cloud based on capacity, security, cost...need RBAC, Policies, Approval workflow
10. How do I chargeback in a self-service model across multiple clouds/platforms?
Without a Cloud Mgmt Platform, you'd have to aggregate information manually, and even then you'd need mgmt products in place for each cloud or platform. You still wouldn't be able to easily track or manage consumption based on various factors (BU, location, project). (E.g. a BU will make varying IT requests; some may get fulfilled on VMware, some on RHEV, some on Amazon, and IT has to manually aggregate that info.)
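The aggregation problem in the last answer can be made concrete with a small showback sketch: usage records from several platforms are priced with one rate card and rolled up by business unit. Everything here (the rates, the record layout, the `showback` function) is hypothetical and only illustrates the roll-up, not any product's chargeback model.

```python
from collections import defaultdict

# Hypothetical rate card; real rates would be set by IT and finance.
RATES = {"cpu_ghz_hours": 0.02, "memory_gb_hours": 0.01, "storage_gb_hours": 0.001}

# Hypothetical usage records, one per BU and platform.
usage_records = [
    {"bu": "Finance", "platform": "VMware",
     "cpu_ghz_hours": 1000, "memory_gb_hours": 2000, "storage_gb_hours": 50000},
    {"bu": "Finance", "platform": "Amazon",
     "cpu_ghz_hours": 500, "memory_gb_hours": 1000, "storage_gb_hours": 10000},
    {"bu": "Marketing", "platform": "RHEV",
     "cpu_ghz_hours": 200, "memory_gb_hours": 400, "storage_gb_hours": 2000},
]


def showback(records, key="bu"):
    """Total the cost of all records, grouped by `key` (e.g. BU or platform)."""
    totals = defaultdict(float)
    for rec in records:
        cost = sum(rec[metric] * rate for metric, rate in RATES.items())
        totals[rec[key]] += cost
    return {group: round(total, 2) for group, total in totals.items()}


by_bu = showback(usage_records)
by_platform = showback(usage_records, key="platform")
```

Grouping by `bu` combines Finance's VMware and Amazon usage into one figure, which is the manual step the note says IT otherwise has to do by hand.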
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform, CloudForms delivers the key capabilities required to enable private and hybrid clouds and to manage your infrastructure as a service. First and foremost, it provides your users with “self-service” that fully automates the provisioning of new workloads and applications within your business rules, and lets them manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, it offers a fully integrated set of cloud-optimized infrastructure management capabilities that are valuable on their own but, in combination with self-service cloud enablement, provide the highest levels of automation. The key differentiator is tying the user-supplied data from the self-service request together with your business rules AND the real-time data from the infrastructure to fully automate the end-to-end lifecycle of that service. So what does that mean?
Let me give you an example. A user comes to the portal, or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about provisioning the service. It determines that a business rule exists that says Oracle database servers can only be provisioned to certain clusters to comply with licensing limitations, so it identifies those clusters, determines the one with the most capacity available based on real-time capacity and utilization information, and provisions the DB tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but also the correct network paths, and joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user is notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly, but the key takeaway here is the integration of the user data and the business rules and policies with the real-time infrastructure data and analytics to fully automate cloud service delivery and management. How is that different? On numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application, which kicks off an email to the engineering team, who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the VMs to fulfill the request. They then update the ticket, which kicks off an email to the requestor that their servers are ready to go.
What they generally find is that this is not a scalable model, and it doesn't substantially decrease service request fulfillment times. That leads to user dissatisfaction and, in some cases, lines of business going directly to Amazon for their workloads, cutting out IT completely.
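The placement step in the example above (apply the Oracle licensing rule, then pick the eligible cluster with the most free capacity) can be sketched as follows. This is a simplified illustration under assumed data; the cluster names, the `oracle_licensed` flag and the `place` function are hypothetical, and a real placement engine would weigh more factors such as network paths and cost.

```python
# Hypothetical real-time capacity snapshot of three clusters.
clusters = [
    {"name": "cluster-a", "oracle_licensed": True, "free_cpu_ghz": 12, "free_memory_gb": 64},
    {"name": "cluster-b", "oracle_licensed": True, "free_cpu_ghz": 30, "free_memory_gb": 128},
    {"name": "cluster-c", "oracle_licensed": False, "free_cpu_ghz": 80, "free_memory_gb": 512},
]


def place(tier, need_cpu_ghz, need_memory_gb, clusters):
    """Return the name of the best cluster for a tier, or None if none fits."""
    # Capacity filter: the cluster must have room for the request.
    candidates = [
        c for c in clusters
        if c["free_cpu_ghz"] >= need_cpu_ghz and c["free_memory_gb"] >= need_memory_gb
    ]
    # Business rule: Oracle DB tiers may only run on licensed clusters.
    if tier == "oracle-db":
        candidates = [c for c in candidates if c["oracle_licensed"]]
    if not candidates:
        return None
    # Among eligible clusters, prefer the one with the most free capacity.
    best = max(candidates, key=lambda c: (c["free_cpu_ghz"], c["free_memory_gb"]))
    return best["name"]


db_target = place("oracle-db", 8, 32, clusters)
web_target = place("web", 4, 8, clusters)
```

The DB tier lands on `cluster-b` even though `cluster-c` has more free capacity, because the licensing rule removes `cluster-c` from consideration first; the web tier has no such restriction and goes to `cluster-c`.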
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform CloudForms delivers the key capabilities required to enable private and hybrid clouds, and to manage your infrastructure as a service. First and foremost providing your users with “self-service” to fully automate the provisioning of new workloads and applications within your business rules as well as to manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, are a fully integrated set of cloud optimized infrastructure management capabilities that alone are super valuable but in combination with self-service cloud enablement provide the highest levels of automation. The key differentiation here is tying the user supplied data from the self service request together with your business rules AND the real-time data from the infrastructure to fully automate the end to end lifecycle of that service. So what does that mean?
Let me give you an example…a user comes to the portal or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about the provisioning of the service. It determines that a business rule exists that says oracle database servers can only be provisioned to certain clusters to comply with licensing limitations so it identifies those clusters and then determines the one w/ the most capacity available based on real-time capacity and utilization information and provisions the db tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but with the correct network paths and then joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly but the key takeaway here is the integration of the user data, the business rules and policies, with the real time infrastructure data and analytics to fully automate cloud service delivery and management. How is that different? on numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application which kicks off an email to the engineering team who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the vm’s to fulfill the request. They then go update the ticket which kicks off an email to the requestor that their servers are ready to go. 
What they generally find is this not a scalable model and it really doesn't substantially decrease service request fulfillment times which leads to user dissatisfaction and in some cases lines of business going directly to Amazon for their workloads cutting out IT completely.
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform CloudForms delivers the key capabilities required to enable private and hybrid clouds, and to manage your infrastructure as a service. First and foremost providing your users with “self-service” to fully automate the provisioning of new workloads and applications within your business rules as well as to manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, are a fully integrated set of cloud optimized infrastructure management capabilities that alone are super valuable but in combination with self-service cloud enablement provide the highest levels of automation. The key differentiation here is tying the user supplied data from the self service request together with your business rules AND the real-time data from the infrastructure to fully automate the end to end lifecycle of that service. So what does that mean?
Let me give you an example…a user comes to the portal or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about the provisioning of the service. It determines that a business rule exists that says oracle database servers can only be provisioned to certain clusters to comply with licensing limitations so it identifies those clusters and then determines the one w/ the most capacity available based on real-time capacity and utilization information and provisions the db tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but with the correct network paths and then joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly but the key takeaway here is the integration of the user data, the business rules and policies, with the real time infrastructure data and analytics to fully automate cloud service delivery and management. How is that different? on numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application which kicks off an email to the engineering team who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the vm’s to fulfill the request. They then go update the ticket which kicks off an email to the requestor that their servers are ready to go. 
What they generally find is this not a scalable model and it really doesn't substantially decrease service request fulfillment times which leads to user dissatisfaction and in some cases lines of business going directly to Amazon for their workloads cutting out IT completely.
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform CloudForms delivers the key capabilities required to enable private and hybrid clouds, and to manage your infrastructure as a service. First and foremost providing your users with “self-service” to fully automate the provisioning of new workloads and applications within your business rules as well as to manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, are a fully integrated set of cloud optimized infrastructure management capabilities that alone are super valuable but in combination with self-service cloud enablement provide the highest levels of automation. The key differentiation here is tying the user supplied data from the self service request together with your business rules AND the real-time data from the infrastructure to fully automate the end to end lifecycle of that service. So what does that mean?
Let me give you an example…a user comes to the portal or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about the provisioning of the service. It determines that a business rule exists that says oracle database servers can only be provisioned to certain clusters to comply with licensing limitations so it identifies those clusters and then determines the one w/ the most capacity available based on real-time capacity and utilization information and provisions the db tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but with the correct network paths and then joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly but the key takeaway here is the integration of the user data, the business rules and policies, with the real time infrastructure data and analytics to fully automate cloud service delivery and management. How is that different? on numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application which kicks off an email to the engineering team who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the vm’s to fulfill the request. They then go update the ticket which kicks off an email to the requestor that their servers are ready to go. 
What they generally find is this not a scalable model and it really doesn't substantially decrease service request fulfillment times which leads to user dissatisfaction and in some cases lines of business going directly to Amazon for their workloads cutting out IT completely.
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform CloudForms delivers the key capabilities required to enable private and hybrid clouds, and to manage your infrastructure as a service. First and foremost providing your users with “self-service” to fully automate the provisioning of new workloads and applications within your business rules as well as to manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, are a fully integrated set of cloud optimized infrastructure management capabilities that alone are super valuable but in combination with self-service cloud enablement provide the highest levels of automation. The key differentiation here is tying the user supplied data from the self service request together with your business rules AND the real-time data from the infrastructure to fully automate the end to end lifecycle of that service. So what does that mean?
Let me give you an example…a user comes to the portal or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about the provisioning of the service. It determines that a business rule exists that says oracle database servers can only be provisioned to certain clusters to comply with licensing limitations so it identifies those clusters and then determines the one w/ the most capacity available based on real-time capacity and utilization information and provisions the db tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but with the correct network paths and then joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly but the key takeaway here is the integration of the user data, the business rules and policies, with the real time infrastructure data and analytics to fully automate cloud service delivery and management. How is that different? on numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application which kicks off an email to the engineering team who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the vm’s to fulfill the request. They then go update the ticket which kicks off an email to the requestor that their servers are ready to go. 
What they generally find is this not a scalable model and it really doesn't substantially decrease service request fulfillment times which leads to user dissatisfaction and in some cases lines of business going directly to Amazon for their workloads cutting out IT completely.
So let’s dig into the CloudForms solution by taking a look at it from a high level. As a cloud management platform CloudForms delivers the key capabilities required to enable private and hybrid clouds, and to manage your infrastructure as a service. First and foremost providing your users with “self-service” to fully automate the provisioning of new workloads and applications within your business rules as well as to manage and operate those workloads throughout their lifecycle. In addition to enabling self-service, and this is unique to CloudForms, are a fully integrated set of cloud optimized infrastructure management capabilities that alone are super valuable but in combination with self-service cloud enablement provide the highest levels of automation. The key differentiation here is tying the user supplied data from the self service request together with your business rules AND the real-time data from the infrastructure to fully automate the end to end lifecycle of that service. So what does that mean?
Let me give you an example… a user comes to the portal, or your own enterprise service catalog for that matter, and requests an application environment made up of 3 tiers of servers. CF2 takes that request, validates it and then goes about provisioning the service. It determines that a business rule exists that says Oracle database servers can only be provisioned to certain clusters to comply with licensing limitations, so it identifies those clusters, determines the one with the most capacity available based on real-time capacity and utilization information, and provisions the DB tier to that cluster. It then provisions the web tier onto the cloud area with not only the right capacity but also the correct network paths, and joins the web servers to a load balancer pool. Ultimately the entire service is provisioned, the user is notified and the ongoing management of that service's lifecycle begins. We’ll dig into this in more detailed use cases shortly, but the key takeaway here is the integration of the user data and the business rules and policies with the real-time infrastructure data and analytics to fully automate cloud service delivery and management.
How is that different? On numerous occasions we’ve engaged with enterprise accounts that have put a service catalog on top of their virtualized infrastructure. The user requests a server or application, which kicks off an email to the engineering team, who take the request and determine the right location and configuration for the workload by looking at their vCenter UI, checking their capacity planning spreadsheets, requesting IP addresses from the network team and then manually creating and configuring the VMs to fulfill the request. They then update the ticket, which kicks off an email to the requestor that their servers are ready to go. What they generally find is that this is not a scalable model and it doesn't substantially decrease service request fulfillment times, which leads to user dissatisfaction and, in some cases, lines of business going directly to Amazon for their workloads, cutting out IT completely.
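The placement decision in the automated example above (an Oracle licensing rule combined with real-time capacity data) can be sketched in a few lines. This is only an illustrative Python sketch, not CloudForms code: the `Cluster` type, the `oracle_licensed` flag and the free-memory figures are hypothetical stand-ins for the business rule and the capacity data described.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    oracle_licensed: bool   # hypothetical flag standing in for the licensing rule
    free_memory_gb: float   # hypothetical stand-in for real-time capacity data

def place_db_tier(clusters):
    """Pick the licensed cluster with the most free capacity."""
    eligible = [c for c in clusters if c.oracle_licensed]
    if not eligible:
        raise ValueError("no cluster satisfies the Oracle licensing rule")
    return max(eligible, key=lambda c: c.free_memory_gb)

# Made-up inventory: cluster-b has the most capacity but is not licensed.
clusters = [
    Cluster("cluster-a", oracle_licensed=True, free_memory_gb=64),
    Cluster("cluster-b", oracle_licensed=False, free_memory_gb=512),
    Cluster("cluster-c", oracle_licensed=True, free_memory_gb=128),
]
chosen = place_db_tier(clusters)
print(chosen.name)
```

The business rule filters first and capacity ranks second, so the largest cluster is never chosen if it would break the licensing constraint.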
Note: By OpenStack we mean Red Hat's distribution – Red Hat Enterprise Linux OpenStack Platform (CloudForms may be able to manage other distributions of OpenStack, but we've only tested our Red Hat version)
From a user perspective, getting to “self-service” poses some challenges.
There are often islands or silos of VMs for use in services
Multiple portals with inconsistent look and feel and different capabilities
Lack of any kind of quota or chargeback construct or display
CloudForms gives IT a policy-controlled, role-delegated portal that is web-accessible from any location. Services can be delivered across VMware, Red Hat, Microsoft and Amazon environments with a consistent look, feel and behavior.
Through CloudForms' own Service Catalog or an integrated internal service catalog, you can provision and retire services for authorized users across the organization.
Users have access to dashboards that show them what they have allocated, what is available and what is consumed, along with details on their quota levels and chargeback/showback amounts.
Service Delivery – could be VMs, Instances/Templates, n-tier applications/workloads; CF could be integrated into an existing Service Catalog
Role-Based Access Controls (RBAC) – Ensure only authorized workloads are accessible by authorized users and admins
Quota Enforcement – Ensure users/groups do not exceed their allocated infrastructure; Can Tag objects, users, groups to make quota enforcement as detailed or broad as org wants
Approval Workflow – Force an optional automated approval process on any IT request
Intelligent Workload Placement – Use logic and policies to determine best infrastructure for the job
Chargeback – Important for most orgs to be able to accurately track and charge for consumption, even if they simply want to “showback” consumption to the business vs actually charging them for it.
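How quota enforcement and an approval workflow combine on a single request can be sketched as follows. This is a hypothetical Python illustration, not the CloudForms API: the `Request` type, the per-group vCPU quota and the `auto_approve_limit` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Request:
    group: str
    vcpus: int

def evaluate(request, quota_vcpus, used_vcpus, auto_approve_limit=4):
    """Return 'denied', 'needs_approval' or 'approved' for a request."""
    used = used_vcpus.get(request.group, 0)
    limit = quota_vcpus.get(request.group, 0)
    if used + request.vcpus > limit:
        return "denied"            # would exceed the group's quota
    if request.vcpus > auto_approve_limit:
        return "needs_approval"    # large request: route to an approver
    return "approved"

# Made-up quota data for one group.
quota = {"qa": 16}
used = {"qa": 12}
print(evaluate(Request("qa", 2), quota, used))
print(evaluate(Request("qa", 8), quota, used))
```

Checking quota before approval means an approver is never asked to sign off on a request that could not be fulfilled anyway.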
(This is an actual saving reported to us by Media Company... you could set up a peer-to-peer call with them and have them tell the prospect the same thing!)
Example: Media Company saved 25 person hours per VM provisioned
So far in 2013, they have provisioned a little over 1,000 VMs with CloudForms.
Assuming a cost of $100 per person hour
Comcast saves $2,500 per VM (25 x $100)
So far this year, they have saved $2.5M! ($2,500 x 1000 VMs)
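The arithmetic above is easy to rerun with a prospect's own figures. A minimal Python sketch follows; the function name is just for illustration, and the inputs are the numbers quoted in the example (with the VM count rounded down to 1,000).

```python
def provisioning_savings(hours_saved_per_vm, cost_per_hour, vms_provisioned):
    """Return (savings per VM, total savings) in dollars."""
    per_vm = hours_saved_per_vm * cost_per_hour
    return per_vm, per_vm * vms_provisioned

# Figures from the example: 25 hours saved, $100 per hour, 1,000 VMs.
per_vm, total = provisioning_savings(25, 100, 1000)
print(f"${per_vm:,} per VM, ${total:,} total")
```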
Benefits
Business:
- Reduce time in meeting customer request/accelerate projects & innovation
- Elevate the value and perception of IT internally
Technical:
- Fewer people needed in the provisioning process (reallocate heads elsewhere)
- Control access, usage and request fulfillment – Lifecycle control (Request thru retirement)
From an operations administrator's perspective, managing and controlling cloud environments pushes into new management areas.
Admins need a consolidated view of their cloud and hypervisor environments in order to holistically assess consumption and capacity.
Utilization, drift and capacity assessments need to be run across hypervisor and cloud platform environments to correctly determine headroom and workload allocation options. These kinds of assessments are made more complex, or impossible, when attempted through siloed control consoles.
Often the admin cannot get dashboards and timelines that aggregate the information needed for valid operational assessments.
Cloud management brings a growing amount of tool sprawl, and CloudForms reduces management complexity by consolidating the management functions of a number of different tools.
Using CloudForms, administrators have the capabilities to ensure and deliver consistent service levels to their users.
Operations – Quickly identify root cause and reduce Mean Time To Resolution (MTTR). Compare on these dimensions:
- Timeline (When was this workload last working properly?)
- Analysis (Look into VM...configured properly? Was something recently installed?)
- Drift (How has this workload changed over time from parent template?)
- Resource consumption (3 wks ago Memory started ticking upward...now we've hit allocated capacity)....could result in best-fit placement move of workload
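The drift comparison in the list above amounts to diffing a workload's current configuration against its parent template. A minimal Python sketch, assuming configurations are flat dictionaries; the keys and values shown are invented for illustration and do not reflect how CloudForms stores this data.

```python
def drift(template, current):
    """Return {key: (template_value, current_value)} for every difference."""
    keys = template.keys() | current.keys()
    return {
        k: (template.get(k), current.get(k))
        for k in sorted(keys)
        if template.get(k) != current.get(k)
    }

# Hypothetical template and current state of one VM.
template = {"memory_gb": 4, "vcpus": 2, "packages": ("httpd",)}
current = {"memory_gb": 8, "vcpus": 2, "packages": ("httpd", "nmap")}

changes = drift(template, current)
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

Settings that match the template are omitted, so the result only shows what has changed and is a natural starting point for the "did anything change?" question.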
Benefits
Business:
- Reducing the time it takes to resolve problems means more productive users (less time waiting for a fix), better up-time/utilization of existing resources (utilization rates increase – getting more without purchasing more)
Technical:
- Faster time to problem resolution means happier users, and less time/money spent on determining the cause and fixing the issue. Redeploy the IT admin time saved to other projects – again, doing more with the same labor pool
From an executive perspective, obtaining visibility to the IT environment at the business planning level is crucial.
Senior management needs to ensure that proper policy thresholds, compliance gates and limits, and process governance are present and working in their operations. CloudForms' adaptive management platform can respond to input conditions and apply automated responses to those conditions.
Without real-time, aggregated consumption, utilization, capacity and chargeback information, it is difficult for executive managers to make correct decisions about their operating units. CloudForms provides active, hot-link dashboards for just those uses.
Additionally, close monitoring of service levels and capacity guard-band limits may be imperative in certain IT service models, and CloudForms can give execs access to dashboard trending and analysis data to support overall health and capacity assessments for their operations.
We've touched on the benefit of policies to control the environment...but even more powerful is CloudForms' ability to continually monitor the environment post-delivery to ensure those policies continue to be adhered to. This becomes more and more critical as an org's environment grows to thousands, if not tens of thousands, of VMs/workloads.
Another example: If Oracle workload, make sure it runs “here”/cluster because of licensing issues (Oracle requires owner to pay depending on what it's running on/underlying infrastructure)
Users only see VMs that conform to IT policy
If a user tries to remove or update items in a way that makes the VM non-conformant to policy, CloudForms will not allow the VM to run
CloudForms can notify help desk, security personnel, IT management, etc... of the policy breach
Don't want entire environment to be constrained by policy....Tag small group and apply
Benefits
Business:
- This capability provides an insurance policy to IT that infrastructure usage will always conform to how IT wants it used.
- Better help IT meet SLAs and promises made to the business. Can even “show” results
Technical:
- A big fear of IT shops is losing control of their environment. One could argue they have already lost some control. Automating processes with the assurance of adhering to policy means IT can get back control and remain in control.
Waste detection within the environment is key to optimizing the resources on hand. Arguably, many virtual environments today suffer from waste, but the time and effort required for IT to find and remove/reclaim the waste is often prohibitive. CloudForms provides a mechanism to easily find and reclaim waste, ensuring an org's environment runs as efficiently as possible.
Example:
Let's assume a new deployment of CloudForms. Let's next assume CloudForms can identify 10% waste across the environment (some VM sprawl, some over-allocated VMs and pools, some incorrectly configured workloads). Ask your prospect how much that could save them vs purchasing new resources. Not only are we talking about saving pure infrastructure resources, but also think about software licensing savings. Even without exact numbers, it's powerful to get your prospect thinking about potential savings. Those savings could be enough to pay for the purchase of CloudForms for years to come (essentially no new or unanticipated cost to the org).
Without tools to provide IT with accurate views into consumption, resource trending, capacity and quota understanding...it's very difficult to optimize resources on hand and properly plan for future infrastructure investments.
Workload Balancing – need to understand normal operating ranges. Need visibility into when and why workloads fall outside normal ranges.
Capacity Management – What platform should this workload run on? How many workloads can cluster “X” handle? Where do I have waste? Where do I have performance issues?
(e.g.) Too many high I/O workloads on cluster “X”, so fulfill a high I/O request on another cluster (need visibility into the performance over time)
Tag workloads by resource consumption type (e.g. high storage, low CPU, high memory, etc.)
CloudForms can offer “right-sizing” recommendations, helping orgs reclaim unused resources or identify where more resources are needed. This helps orgs optimize the resources they have
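A right-sizing recommendation of the kind described can be approximated by comparing what a workload was allocated with what it actually used at peak. The Python sketch below is illustrative only: the 50% and 90% thresholds and the "reclaim"/"grow"/"ok" labels are assumptions, not CloudForms defaults.

```python
def right_size(allocated_gb, peak_used_gb, low=0.5, high=0.9):
    """Classify a VM's memory allocation from its observed peak usage."""
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    ratio = peak_used_gb / allocated_gb
    if ratio < low:
        return "reclaim"   # over-allocated: memory can be taken back
    if ratio > high:
        return "grow"      # close to its limit: more memory is needed
    return "ok"

print(right_size(16, 4))    # peak usage is 25% of the allocation
print(right_size(8, 7.6))   # peak usage is 95% of the allocation
```

Using peak rather than average usage keeps the recommendation conservative, so memory is not reclaimed from a workload that only needs it occasionally.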
Benefits
Business:
- Visibility into capacity and performance across the landscape, coupled with best-fit placement and right-sizing recommendations, ensures orgs gain the most from existing resources and eliminate needless spending on new resources
Technical:
- Optimize infrastructure by easily finding and reclaiming waste (workloads no longer in use) as well as seeing where/what clusters have performance issues – avoid adding new workloads to those clusters which could affect other workloads.
- Easily move workloads to evenly distribute resources and normalize performance across entire infrastructure landscape
FYI CloudForms can automate the request process by asking the same questions an IT admin asks when an N-tier service request is made. Some organizations may want more automation, while others want to have IT personally involved.
It's important that prospects understand CloudForms can deliver more than simple VM resources: it can wrap those VMs together and deliver them as an application, vs delivering resources and expecting the requestor to put them all together.
This is a step toward PaaS, but it offers nowhere near the capabilities provided by a product like Red Hat OpenShift Enterprise.
Benefits
Business:
- Automating a process that's typically done manually today saves time and labor – meet demand faster, IT no longer an inhibitor to the speed of business
- Detailed tracking and chargeback of resource use, means IT can show cost of business support and not be viewed as money pit by upper mgmt.
Technical:
- Fewer people need to be involved in deploying N-tier applications/workloads. Re-use that labor on other projects
- Automation controlled by approvals, policies, and best-fit placement recommendations not only means time savings, but also means fewer human errors that cost time/money later.
Examples of how/why Cloud Brokering is important:
1 - Request comes in for QA resources; the requestor's monthly quota is exhausted and, to avoid affecting others' quota allotment, you send them to public cloud for the needed resources
2 - Request comes in for web application resources that will run in production; We have capacity available on all three clouds – VMware, RHEV, Amazon. But we have a policy that states “no production workloads can run in public cloud”. VMware and RHEV are still viable, but RHEV will cost less in licensing fees. RHEV becomes the automated place for THIS request to be met.
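The second brokering example can be sketched as a filter followed by a ranking: drop the providers that policy or capacity rules out, then pick the cheapest of what remains. This is a hypothetical Python illustration; the `Provider` type and the relative cost numbers are invented, and only the ordering "RHEV costs less in licensing than VMware" comes from the example.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    public: bool
    has_capacity: bool
    relative_license_cost: int  # hypothetical ranking, lower is cheaper

def broker(providers, production):
    """Choose the cheapest provider that policy allows and that has capacity."""
    candidates = [
        p for p in providers
        # Policy from the example: no production workloads in public cloud.
        if p.has_capacity and not (production and p.public)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.relative_license_cost)

providers = [
    Provider("VMware", public=False, has_capacity=True, relative_license_cost=2),
    Provider("RHEV", public=False, has_capacity=True, relative_license_cost=1),
    Provider("Amazon", public=True, has_capacity=True, relative_license_cost=2),
]
target = broker(providers, production=True)
print(target.name)
```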
As orgs seek to automate the provisioning process and provide self-service, cloud brokering becomes critical. Being able to marry infrastructure knowledge with policies and business logic means customers have an adaptable management platform that can be relied on to allow a true self-service model and future-proof an org's management investment.
Benefits
Business:
- Adaptive and extensible platform to manage the entire infrastructure landscape (allows an evolution if needed) and also ensures rules/policies and logic control infrastructure usage
Technical:
- Ensures automation does NOT get IT in trouble. Automation frees labor resources that used to be involved in delivering resources (re-allocate to other projects)
CloudForms covers the service provisioning and deployment aspects shown earlier. CloudForms provides the ability to manage VMs across their lifecycle, from provisioning or conversion (P2V/V2V) through operations and eventually to retirement. CloudForms automatically discovers, assesses, classifies, monitors and tracks VMs in any state (powered on, off or suspended) and provides a spectrum of lifecycle management and automation including:
VM Lifecycle Management- including automatic discovery, tracking, inventory, analysis, assessment, aging and retirement.
Self-Service Provisioning and Self-Management - through a rich, web-based portal with fine-grained access control and support for request management, tracking and approval.
Configuration Management - including automatic, agent-free deep VM discovery, analysis, assessment and tracking of software, accounts, users, groups, patches, services, packages, registry keys, MD5s and configuration files.
Comprehensive Baselining and Drift - including the virtual hardware, settings, guest configuration, network settings as well as relationships and classifications.
Real-Time Policy-Based Standards Enforcement – assessment, analysis and policy-based enforcement of VM configuration, operational, network, resource and security standards.
Resource Monitoring and Optimization – performance monitoring, identification of over-allocated resources, current and future bottlenecks, automatic VM aging and retirement.
Quota Enforcement, Usage, Chargeback and Cost Allocation – detailed usage tracking by configurable classifications with support for multiple rate tables and for fixed-cost, allocation-based, usage-based and reservation-based chargeback.
Advanced Capacity Planning, Trending, and Best-Fit Placement – factors in resource availability, policies and business classifications across time periods optimizing planning and VM placement.
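To make the chargeback item in the list above concrete, here is a minimal Python sketch that combines a fixed fee, an allocation-based charge and a usage-based charge, with a separate rate table per classification. The rate values, the classification names and the dictionary layout are all invented for illustration; they are not CloudForms rates or its data model.

```python
def monthly_charge(rate, allocated_vcpus, cpu_hours_used):
    """Fixed fee + allocation-based charge + usage-based charge."""
    return (
        rate["fixed"]
        + rate["per_vcpu_allocated"] * allocated_vcpus
        + rate["per_cpu_hour"] * cpu_hours_used
    )

# Hypothetical rate tables, one per workload classification.
RATE_TABLES = {
    "production": {"fixed": 50.0, "per_vcpu_allocated": 10.0, "per_cpu_hour": 0.25},
    "development": {"fixed": 10.0, "per_vcpu_allocated": 5.0, "per_cpu_hour": 0.125},
}

charge = monthly_charge(RATE_TABLES["production"], allocated_vcpus=4, cpu_hours_used=100)
print(f"${charge:.2f}")
```

For showback, the same figure is simply reported to the business rather than billed.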
Example Use Scenarios:
1. Administrator decides to move workloads from on-premise private resources to public cloud, in order to free those resources for a new project (One-Click)
2. Acme Co. wants to use Amazon for disaster recovery purposes and writes a rule stating if a critical cluster goes down, immediately move it to Amazon. (Either One-Click or Automated)
3. For certain tagged workloads, Smith Tech Co. has written a policy stating that if a particular threshold is met (cluster CPU is nearly exhausted), auto-move the workloads to the public cloud (Either One-Click or Automated)
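Scenario 3 above is essentially a threshold rule scoped by tag. A minimal Python sketch, assuming workloads are simple dictionaries; the `burstable` tag name and the 90% CPU threshold are assumptions chosen for the example rather than values from the scenario.

```python
def workloads_to_move(workloads, cluster_cpu_pct, threshold_pct=90, tag="burstable"):
    """Return names of tagged workloads to move once CPU reaches the threshold."""
    if cluster_cpu_pct < threshold_pct:
        return []   # cluster still has headroom: nothing moves
    return [w["name"] for w in workloads if tag in w["tags"]]

# Hypothetical workloads: only one carries the tag the policy applies to.
workloads = [
    {"name": "batch-1", "tags": {"burstable"}},
    {"name": "erp-db", "tags": {"critical"}},
]
print(workloads_to_move(workloads, cluster_cpu_pct=95))
print(workloads_to_move(workloads, cluster_cpu_pct=50))
```

Because the rule only ever returns tagged workloads, anything IT has not explicitly marked stays on-premise regardless of load.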
Although not all of your prospects may be interested in utilizing public cloud resources at this time, most will want the option sometime in the future. Many orgs fear using public cloud for security and possible performance reasons. CloudForms allows orgs to place policies around usage, ensuring public cloud usage adheres to how IT wants it to be used.
Benefits
Business:
- Utilize public cloud resources in a controlled/approved manner
- Rent public cloud resources (particularly for short periods of time) vs procuring more on-premise resources (save on time required to buy, receive, set-up and deploy those resources) and save by not having excess resources on-premise after short spikes in resource needs
Technical:
- Control public cloud usage (no more surprise AWS monthly bill); you dictate usage terms
- Public cloud can be a Disaster Recovery (DR) option, allowing you to use on-premise resources that were “saved” for DR purposes. You only pay for DR resources if/when needed vs having sunk costs on idle resources.