4. Motivation - Why?
OpenStack currently lacks a set of features with respect to
portable hardware acceleration*:
• Accelerator life-cycle management
• Accelerator resource discovery
• Reconfigurable FPGAs, GPUs and other accelerators:
migration support, ease of use, etc.
* These features have been highlighted in the OPNFV OpenStack GAP Analysis document.
5. Motivation - What?
Very Close - in the CPU chipset or on the board (e.g., Intel's Skylake staging)
• Suitable for the offload model, and for inline if the associated interface is also in place
• With optimal sharing of resources, can provide excellent processing gains
• Limited in horizontal scale, but can be leveraged as a unit of management like the
associated CPU
Nearby - attached via a bus or similar (e.g., PCIe or within a chassis assembly)
• Suitable for both offload and inline models
• Susceptible to negative impact if the interface across the bus is chatty
• Larger scale possible, particularly in chassis configurations
Far - reachable by TCP/IP or another communication protocol
• Suitable for offload, and for inline if latency is not a concern
• Largest horizontal-scale flexibility
• Much better suited to a standalone function model
6. Motivation - How?
Two candidate approaches are compared: a Nova extension versus a solely
dedicated Acceleration Management Function (AMF).

Nova Extension
• Target: best performance, no portability.
• Accelerator access: direct management.
• Pros: direct interaction between the compute node and the accelerators
could provide slightly better performance.
• Cons: portability/migration hard to support; code complexity - specific
code needed for each accelerator type, with impact on the project's
performance, security, maintainability, etc.

Solely Dedicated Acceleration Management Function (AMF)
• Target: best performance/portability trade-off.
• Accelerator access: management through a portability layer.
• Pros: resource discovery, scheduling, setup, etc.; support for automatic,
integrated acceleration management for accelerated VM migration;
hardware portability and independence.
• Cons: the accelerator allocation phase might take time, as a handshake
procedure has to be put in place; scalability issues.
8. Interesting Use Case - SmallCellGW
Lookaside Encryption Acceleration used in CMCC’s commercial deployment
9. Interesting Use Cases - NFVIaaS
● For people familiar with the ETSI NFV
standard, NFVIaaS was among the NFV use
cases published in the Phase 1 documents.
● However, few of us grasped what this use case
actually meant for business, until now.
● Many operators are now beginning to build their own
Public Clouds. If an operator has a Public Cloud
spanning multiple sites, it would then be able to
offer NFVIaaS to NFV app companies that have their
own content, VNFs, or MANO, but no NFVI.
These NFV apps could then be deployed on the
Public Cloud.
● There is still a problem for the operator who
owns the Public Cloud: further service
classification on NFVIaaS is hard to offer without acceleration.
13. OPNFV Requirements – Resource Discovery
• Basic Requirements
• Requirement RD-1 Acceleration agent SHOULD be able to collaborate with AC to discover locally
available acceleration resources, and notify the VIM accordingly.
• Requirement RD-2 VIM SHOULD maintain a catalog of recognizable acceleration resources
and their features.
• Requirement RD-3 VIM SHOULD support notification of acceleration resource discovery to
MANO when MANO has requested notification of this type of event.
• Requirement RD-4 VIM SHOULD maintain the dependency mapping between abstract
acceleration resources and physical acceleration resources.
• Discovery for Re-programmable accelerators
• Requirement RD-5 VIM SHOULD support storing the configuration features of
re-programmable accelerators.
• QoS control
• Requirement RD-6 Acceleration agent MUST collaborate with AC in exposing its capability of
virtualizing/slicing the physical accelerator via resource discovery.
• Requirement RD-7 VIM MUST support storing the virtualization and slicing features of
acceleration resources.
• Requirement RD-8 Acceleration agent MUST collaborate with AC in collecting local accelerator
performance metrics.
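To make RD-1/RD-2 concrete, a discovery notification from an acceleration agent to the VIM could carry a catalog record like the sketch below. Every field name here is an illustrative assumption, not part of any defined OPNFV or OpenStack schema.

```python
# Illustrative catalog entry an acceleration agent might report to the VIM
# after local discovery (RD-1). All field names are hypothetical.

def make_catalog_entry(host, dev_id, dev_type, features, reprogrammable, slices):
    """Build one acceleration-resource record for the VIM catalog (RD-2)."""
    return {
        "host": host,                      # compute node hosting the device
        "device_id": dev_id,               # e.g. a PCI address
        "type": dev_type,                  # "fpga", "gpu", "crypto", ...
        "features": features,              # capabilities usable for scheduling
        "reprogrammable": reprogrammable,  # supports bitstream reload (RD-5)
        "slices_total": slices,            # virtualization/slicing capacity (RD-6/RD-7)
        "slices_free": slices,
    }

entry = make_catalog_entry(
    host="compute-01",
    dev_id="0000:03:00.0",
    dev_type="fpga",
    features=["ipsec-offload", "dpi"],
    reprogrammable=True,
    slices=4,
)
```

A real catalog would also record the abstract-to-physical dependency mapping called for by RD-4; the flat record above only shows the per-device portion.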
15. OPNFV Requirements – Resource Selection
• Basic Requirements:
• Requirement RS-1 Acceleration agent MUST support the capability to report
an acceleration resource's running status information.
• Requirement RS-2 VIM MUST maintain an inventory of up-to-date running status for
all available acceleration resources, based on aggregated information from local
acceleration agents.
• Requirement RS-3 VIM MUST support the capability to choose an appropriate
accelerator, based on the acceleration capability requirement in the virtualization
resource allocation request from MANO and on the acceleration instance inventory.
• Configuration for Re-programmable accelerators
• Requirement RS-4 VIM MUST support the selection of a re-programmable accelerator
based on its registered configuration feature.
• Fine-grained QoS control
• Requirement RS-5 VIM MUST support the capability of selecting part of a physical
accelerator to fulfill a request, provided that the corresponding physical accelerator
supports virtualization and slicing.
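The selection logic of RS-3 and RS-5 can be sketched as a simple scan of the inventory maintained under RS-2. This is a minimal illustration of the decision, not real VIM code; the inventory shape and function name are assumptions.

```python
# Minimal sketch of accelerator selection (RS-3): pick a device from the
# up-to-date inventory (RS-2) that is running, offers the requested
# capability and, if slicing is supported, has enough free slices (RS-5).

def select_accelerator(inventory, required_feature, slices_needed=1):
    """Return the first device satisfying the request, or None."""
    for dev in inventory:
        if dev["status"] != "up":
            continue                        # RS-1/RS-2: skip unhealthy devices
        if required_feature not in dev["features"]:
            continue                        # RS-3: capability must match
        if dev.get("slices_free", 1) >= slices_needed:
            return dev                      # RS-5: partial allocation possible
    return None

inventory = [
    {"device_id": "0000:03:00.0", "status": "down",
     "features": ["ipsec-offload"], "slices_free": 4},
    {"device_id": "0000:04:00.0", "status": "up",
     "features": ["ipsec-offload", "dpi"], "slices_free": 2},
]
chosen = select_accelerator(inventory, "ipsec-offload")
```

A production scheduler would of course weigh candidates (locality, load, cost) rather than take the first match.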
16. OPNFV Requirements – Resource Allocation
• Basic Requirements:
• Requirement RA-1 Acceleration agent MUST collaborate with AC in supporting
the capability to trigger attaching/detaching an accelerator to/from a
virtualized container (e.g., a VM).
• Requirement RA-2 VIM MUST support the capability to interact with the local
acceleration agent on the selected compute node to trigger resource allocation on
selected accelerators.
• Configuration for Re-programmable accelerators:
• Requirement RA-3 Acceleration agent MUST support the capability to collaborate
with AC to configure an accelerator.
• Fine-grained QoS control
• Requirement RA-4 Acceleration agent MUST support the capability to collaborate
with AC to slice an accelerator and attach/detach only part of it to a virtualized
container.
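The attach/detach trigger of RA-1/RA-2, including partial (sliced) attachment per RA-4, could look like the toy agent below. The class and its API are assumptions for illustration; a real agent would drive the AC and the hypervisor rather than an in-memory table.

```python
# Sketch of the attach/detach operations a per-node acceleration agent
# exposes to the VIM (RA-1/RA-2), with sliced attachment (RA-4).
# The agent API shown here is an assumption, not an existing interface.

class AccelerationAgent:
    """Toy stand-in for a per-node agent; real agents would drive the AC."""

    def __init__(self, devices):
        self.devices = dict(devices)   # device_id -> free slice count
        self.attachments = {}          # (vm_id, device_id) -> slices held

    def attach(self, vm_id, device_id, slices=1):
        if self.devices.get(device_id, 0) < slices:
            raise RuntimeError("insufficient accelerator capacity")
        self.devices[device_id] -= slices
        self.attachments[(vm_id, device_id)] = slices
        return True

    def detach(self, vm_id, device_id):
        slices = self.attachments.pop((vm_id, device_id))
        self.devices[device_id] += slices

agent = AccelerationAgent({"0000:04:00.0": 2})
agent.attach("vm-42", "0000:04:00.0", slices=1)  # RA-4: attach only one slice
```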
17. OPNFV Requirements – Resource Update
• Basic requirements
• All the basic requirements are covered by the requirements stated above.
• Scaling requirements (optional)
• Requirement RU-1 VIM MAY support the capability of modifying (increasing,
decreasing) NFVI acceleration resources upon request from MANO.
• Requirement RU-2 VIM MAY support notification of NFVI acceleration
resource modifications per MANO's request or earlier subscription.
18. OPNFV Requirements – Resource Release
• Basic requirements
• Requirement RR-1 VIM SHOULD support the capability to terminate the association
of a given set of acceleration resources with a given VNF, and update data stores
accordingly, upon request from MANO or as part of the termination process of a
virtualized container.
• Requirement RR-2 VIM SHOULD support notification of acceleration resource
termination as per MANO's request or earlier subscription.
• Fault Management requirements
• Requirement RR-3 VIM MAY support reclaiming the associated acceleration resources of a
crashed or faulty virtualized container.
• Requirement RR-4 Acceleration agent MUST collaborate with AC in detecting faulty
accelerators.
• Requirement RR-5 VIM MUST support the capability to collect fault information related
to both physical and virtual accelerators from acceleration agents and report to MANO if
needed.
• Requirement RR-6 VIM MAY support detection and auto recovery of faulty acceleration
agents.
• Requirement RR-7 VIM MAY support fault analysis or problem diagnosis for
acceleration resources.
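The fault-management path of RR-3 amounts to walking the attachment records for a crashed container, returning its slices to the inventory, and updating the data stores per RR-1. A minimal sketch, with hypothetical data shapes:

```python
# Sketch of reclaiming accelerator resources from a crashed virtualized
# container (RR-3) and updating the data store (RR-1). All names are
# illustrative assumptions.

def reclaim_for_container(attachments, inventory, vm_id):
    """Release every accelerator slice held by vm_id; return slices freed."""
    reclaimed = 0
    for (holder, device_id), slices in list(attachments.items()):
        if holder != vm_id:
            continue
        inventory[device_id] += slices        # give capacity back (RR-1)
        del attachments[(holder, device_id)]  # drop the stale association
        reclaimed += slices
    return reclaimed

attachments = {("vm-42", "0000:04:00.0"): 1, ("vm-7", "0000:04:00.0"): 1}
inventory = {"0000:04:00.0": 0}
freed = reclaim_for_container(attachments, inventory, "vm-42")
```

Detection of the crash itself (RR-4/RR-6) would come from the agents and the VIM's health monitoring; this sketch covers only the clean-up step.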
28. The missing piece and Ironic
❑ When it comes to NFV and the use of VNFs that have been created
and are managed in this way, they are not visible to Neutron and not
captured in Nova as consumable functions.
❑ Ironic has the role of discovering and initializing "bare metal" devices
and exposing them to the rest of the OpenStack system. However,
there is no requirement that all resources used by Nova, for example,
need be bare-metal based.
❑ We need a public API that allows the dynamic registration of resources
that happen to be hosted on acceleration hardware.
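To show the shape such a registration API might take, here is an in-memory stand-in. The class, method names, and the REST path mentioned in the comment are all hypothetical; no such OpenStack endpoint exists today, which is exactly the gap being described.

```python
# Sketch of the proposed public registration API: functions hosted on
# acceleration hardware register dynamically so the rest of OpenStack
# can consume them. Endpoint shape and fields are assumptions.

class AcceleratedResourceRegistry:
    """In-memory stand-in for the hypothetical registration service."""

    def __init__(self):
        self._resources = {}

    def register(self, name, kind, location):
        # A real service might expose this as, say,
        # POST /v1/accelerated-resources (hypothetical path).
        self._resources[name] = {"kind": kind, "location": location}
        return name

    def list(self, kind=None):
        """List registered resource names, optionally filtered by kind."""
        return [n for n, r in self._resources.items()
                if kind is None or r["kind"] == kind]

registry = AcceleratedResourceRegistry()
registry.register("fw-offload-0", kind="vnf-function", location="0000:05:00.0")
```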
29. What is Application Acceleration?
Typically, when we run something on alternate hardware as a subcomponent of the
application, we call it an accelerated application, and the specific function is often called
an accelerated function. This is commonly called the "offload" model. It is sometimes
referred to as co-processing, and analytics is an example of this type of
workload.
The "inline" model frequently puts a specialized accelerator between an application and
an interface to other systems. Graphics "acceleration" is a familiar example, as are
wetware interface pre-processors used in genomics. An inline function
may be standalone and have no external processing dependencies.
A few platforms have emerged to support this, namely GPUs and FPGAs; along with
even more specific hardware, they are commonly connected to a more traditional CPU via
PCI or similar technology.
As these patterns and specific workloads have become highly popular, general
CPU vendors have added acceleration platforms to the chipset. Graphics,
communications, and encryption are all examples.
So how do we deploy and manage acceleration hardware?
30. OpenStack by principle - Nova
Nova by definition manages the allocation of compute
resources.
• Through metadata it is dynamically aware of a
compute node and its characteristics. This may include
some close or nearby resources.
• Nova can be taught, with alternate metadata, about
what look like standalone compute nodes but are in
fact acceleration devices.
• By providing additional filters and automation scripts,
Nova can manage a standalone acceleration device
just like a general CPU device.
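The "additional filters" above could follow the shape of a Nova scheduler filter. Real Nova filters subclass `nova.scheduler.filters.BaseHostFilter` and implement `host_passes()`; the sketch below keeps that shape but stays self-contained, with dicts standing in for Nova's host state and request spec objects.

```python
# Self-contained sketch of a scheduler filter that would let Nova treat a
# standalone acceleration device like a compute node. Real Nova filters
# subclass BaseHostFilter; this stand-in mirrors the host_passes() shape
# without the Nova dependency, and the metadata keys are assumptions.

class AcceleratorTypeFilter:
    """Pass only hosts whose metadata advertises the requested accelerator."""

    def host_passes(self, host_state, spec):
        wanted = spec.get("accelerator_type")
        if wanted is None:
            return True  # request has no accelerator need; every host passes
        return wanted in host_state.get("accelerator_types", [])

f = AcceleratorTypeFilter()
hosts = [
    {"name": "node-1", "accelerator_types": ["gpu"]},
    {"name": "node-2", "accelerator_types": ["fpga", "gpu"]},
]
matching = [h["name"] for h in hosts
            if f.host_passes(h, {"accelerator_type": "fpga"})]
```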
31. OpenStack by principle - Glance
Glance is used to manage the life cycle of artifacts used for
provisioning.
• Glance understands artifacts through metadata
associated with a resource.
• Just as VMs need images, accelerated devices need
to be loaded with bitstreams.
• By providing additional artifact types and metadata,
Glance can be utilized to manage the artifacts needed
for acceleration-device life-cycle management.
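As an illustration of the metadata involved, an FPGA bitstream stored in Glance could carry custom properties alongside the standard image fields. Glance does allow arbitrary custom properties on images, but the specific property names and the device model below are assumptions, not a defined convention.

```python
# Illustrative Glance-style metadata for an FPGA bitstream artifact.
# The custom property names and the target device model are hypothetical.

bitstream_artifact = {
    "name": "ipsec-offload-bitstream",
    "disk_format": "raw",            # a bitstream has no VM disk format
    "container_format": "bare",
    "properties": {                  # custom properties drive the
        "artifact_type": "fpga_bitstream",   # accelerator life cycle
        "target_device": "example-fpga",     # hypothetical device model
        "function": "ipsec-offload",
        "version": "1.0",
    },
}
```

A scheduler or agent could then match `target_device` against the discovered catalog before loading the bitstream, just as Nova matches image metadata against host capabilities.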
32. OpenStack by principle - Neutron/Cinder
Neutron and Cinder manage traditional data center devices
and do not understand how they are implemented. The
device is just something to be configured, and separately
monitored and managed.
• The devices are consumed much like a PaaS- or SaaS-level
service - an appliance model. Whether the appliance is
implemented with a CPU, GPU, FPGA, discrete
hardware or aliens does not matter.
• The manager holds operational state and configuration
data about the devices, just as Nova understands the
number of vCPUs and how much has been allocated.
33. Conclusions
• This completes the puzzle and separates the
concerns of creating and consuming VNFs to build
NFV and NFVI.
• This approach provides a more generalized way to
manage acceleration hardware while still separating
life cycle from specific function.
• By supporting all three types of configuration, as well as
considering the inline and offload models, any
performance optimization specific to each can still be applied,
without affecting the alternatives.
While work is under way in Nova to support acceleration, and in
Nomad to support the domain-specific notions of NFV…
34. OpenStack to the rescue
By leveraging Glance and Nova to manage the
provisioning of acceleration hardware in all models, the
accelerated application/function can be adopted directly
into any automation a customer needs.
Functions provided by these accelerated systems can be
combined and consumed like any PaaS or SaaS service.
35. Not sure how to mix this in the flow
The following set of slides provides a bit of background and
then the reasoning that leads to the proposed
approaches. Since Nomad already has a path forward, I
leave it to the Nomad veterans to determine whether this is of
interest and how to weave the ideas in.