Ulas Kozat, Huawei, Yaoguang Wang, Huawei
In the first phase of the telco-cloud vision, physical network functions were targeted for virtualization and became Virtual Network Functions (VNFs) decoupled from specific hardware platforms. As we dive into the second phase of the cloud era, the core need is to provide VNF implementations that can take advantage of what the cloud has to offer in terms of utility-based computing (i.e., elastic scaling), availability, data durability, etc. To this end, we have been developing a VNF performance modeling framework that automatically characterizes a particular VNF implementation in terms of its cloud-readiness and its bottlenecks on the way to cloud-readiness. We will present the details of our performance modeling framework and show its utility on existing open-source VNF implementations. The next frontier of the telco-cloud vision is to develop cloud-native network functions and services. Thus, in the last part of our talk, we will cover the future evolution of the framework and discuss the needs, requirements, and potential metrics for evaluating the cloud-nativeness of network functions.
2. My Network Functions are Virtualized;
But are they Cloud-Ready?
Ulas C. Kozat & Yaoguang Wang
Wireless Cloud and Open Source Research Center
Huawei
3. Author Bios
Ulas C. Kozat, Ph.D.
• PTL in OPNFV Domino
• Main technology focus is on the design
of the next generation wireless
network architectures,
network/infrastructure modeling and
optimization.
Yaoguang Wang, Ph.D.
• Committer in OPNFV Bottlenecks
• Contributor in OPNFV Yardstick
• Focuses on NFV performance
evaluation, modeling and optimization.
5. 3 Phases of Cloudification
Phase-1: Network Function Virtualization
• Functional decoupling from hardware
• Portability to different clouds
• Automated on-boarding
• Quasi-static service chains
Phase-2: Cloud Ready Network Functions
+ Scalable performance
+ Dynamic service chains
Phase-3: Cloud Native Network Functions
+ Small footprint
+ Fast start-up
+ Fast failure recovery
+ Very dynamic micro-chains
6. 3 Phases of Cloudification
Phase-1: Physical Network Functions
Phase-2: Virtualized Network Functions (IaaS)
• Targets brown field: requires minimal changes to the software architecture
Phase-3: Cloud Native Network Functions (PaaS)
• Targets green field: write functions to maximally utilize platform features
7. Cloud Platform and VNF Evaluation
Automation framework, tools, metrics
The operator receives a cloud platform from Vendor X and a VNF from Vendor Y, and asks:
• Does this platform satisfy my cost, performance, & operational requirements?
• Will this VNF perform well over my cloud?
9. NFVI Evaluation Categories
Capacity
− Node: # of nodes
− CPU: # of cores, cache size, clock speed
− Memory: total DRAM
− Storage: disk size
− NIC: # of NICs, NIC bandwidth
Performance
− CPU speed: integer/floating-point computation
− Memory speed: latency/throughput
− Disk performance: IOPS, latency (random/sequential)
− Packet transmission: latency (ICMP/SCTP/UDP/PDCP), throughput (TCP/UDP/SCTP), packet delay variation, packet loss rate (UDP/PDCP)
− Virtualization overhead: VM vs. PM, container vs. PM
10. NFVI Evaluation Categories
Reliability
− HW attributes: MTTF (CPU/disk)
− Self-healing: data replication factor, failover time, detection time
− Service assurance: user-plane latency assurance (UDP/PDCP)
− Performance variation in time (e.g., peak vs. non-busy hours/days)
− Performance variation over different instances (same flavor and time)
Agility
− Instance O&M: time to boot/destroy/migrate a VM/container
− Scaling speed: time to scale out/in/up/down
Balance
− Instance distribution: VM/container
− Resource allocation distribution: vCPU, memory, disk
11. Test Methodology
Single node (host + guest VM/container), e.g.
− Performance: CPU speed, disk IOPS, virtualization overhead
− Reliability: performance variation
Multiple nodes (hosts + guest VMs/containers), e.g.
− Performance: packet transmission, virtualization overhead
− Reliability: service assurance
All nodes, e.g.
− Capacity: resource statistics
− Agility: time to boot/migrate
− Balance: instance distribution
12. Ranking Platforms with Multiple Objectives
1. Capacity: the amount of HW resources
2. Performance: characteristics of fundamental operations
3. Reliability: consistency as conditions (time/instance) vary
4. Agility: speed of resource allocation
5. Balance: distribution of resource allocation
Pipeline: modeling → normalization → weighting → category score
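The modeling, normalization, and weighting stages can be sketched as below. This is an illustrative assumption about how the stages compose (min-max normalization across candidate platforms, operator-chosen weights), not the framework's actual implementation:

```python
# Illustrative sketch of the ranking pipeline: raw category measurements are
# min-max normalized across the candidate platforms, then combined with
# operator-chosen weights into one score per platform. Weights are examples.

CATEGORIES = ["capacity", "performance", "reliability", "agility", "balance"]

def normalize(raw_by_platform):
    """Min-max normalize each category across platforms into [0, 1]."""
    norm = {p: {} for p in raw_by_platform}
    for cat in CATEGORIES:
        vals = [raw_by_platform[p][cat] for p in raw_by_platform]
        lo, hi = min(vals), max(vals)
        for p in raw_by_platform:
            norm[p][cat] = 1.0 if hi == lo else (raw_by_platform[p][cat] - lo) / (hi - lo)
    return norm

def score(norm_metrics, weights):
    """Weighted sum of normalized category scores."""
    return sum(weights[cat] * norm_metrics[cat] for cat in CATEGORIES)

def rank_platforms(raw_by_platform, weights):
    """Return platform names, best first, under the operator's weights."""
    norm = normalize(raw_by_platform)
    return sorted(raw_by_platform, key=lambda p: score(norm[p], weights), reverse=True)

# Hypothetical measurements for two platforms and one operator weighting.
platforms = {
    "A": {"capacity": 64, "performance": 0.9, "reliability": 0.99, "agility": 0.7, "balance": 0.8},
    "B": {"capacity": 32, "performance": 0.5, "reliability": 0.95, "agility": 0.9, "balance": 0.6},
}
weights = {"capacity": 0.2, "performance": 0.3, "reliability": 0.3, "agility": 0.1, "balance": 0.1}
```

Because different operators weigh the five categories differently, the same two platforms can rank in opposite orders under different weight vectors.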
13. VNF Evaluation Framework
Target: evaluate VNF performance under different configurations
User input:
− SUT: VNFD, parameter value files (resource vector), bootstrap mode
− Evaluation: metrics, method, search direction
− Workload: deployment, generation, parameters
Traffic generator:
− Built-in (iperf, pktgen, MoonGen)
− External (provided by the VNF owner)
14. VNF Performance Metrics
Granularity
− Independent VNFs, e.g. vSwitch, vDPI, vFW, vIDS, etc.
− VNF-FG, e.g. vIMS, vEPC, etc.
Considerations:
− Functional metrics
− Case-by-case:

VNF            | Metrics                                                                        | Reference
Virtual Switch | Maximum forwarding rate (bps, pps); latency (us)                               | RFC 2285; bmwg-vswitch-opnfv
vFirewall      | Maximum offered load (pps); forwarding rate (pps)                              | RFC 3511
vIMS           | Mean session setup time; successful call rate (calls/sec, w/ input call rate)  | 3GPP TS 32.454
vEPC           | EPS attach success rate; mean dedicated bearer set-up time                     | 3GPP TS 32.455
15. When is your VNF Cloud Ready?
Cloud == utility-based computing == unit-price efficiency:
double the resources → double the throughput → double the price.
[Figure: scale out / scale in — one (T, $1) instance grows to two (T, $1) instances giving (2T, $2), then to four giving (4T, $4); scale up / scale down — a single instance grows from (T, $1) to (2T, $2) to (4T, $4)]
16. Evaluating a VNF
[Figure: maximum throughput (e.g., API-calls/sec, packets/sec, flows/sec), measured at a specified latency and packet-loss target, growing y, 2y, …, 5y as total VNF resource allocation grows from 1 to 5 units]
ds = scaling degree of the VNF
ds is measured at the knee point of the curve, where the linear growth phase changes to sub-linear growth.
The total resource allocation is a vectoral value with CPU, memory, disk, and network bandwidth resources, and it maps to a $ value.
The minimum resource requirement depends on the VNF implementation.
ds is a scalar performance value that differs across VNF implementations.
17. Evaluating & Ranking VNFs
Each VNF is summarized by the tuple (T, $, ds); T and ds are measured by the VNF Evaluation Framework.
The $ amount is computed from the minimum resource requirement, i.e.,
$ = n_CPU·$_CPU + n_GPU·$_GPU + n_m·$_m + n_s·$_s + n_net·$_net + …
[Figure: VNFs X, Y, and Z plotted on the (ds, T/$) plane; higher on both axes is a better VNF, lower on both is worse, and points that win on one axis but lose on the other are ambiguous]
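The $ computation above is a dot product between the minimum resource requirement vector and per-unit prices. A minimal sketch, where the resource names and unit prices are made-up examples rather than real cloud pricing:

```python
# Sketch of the $ computation: the minimum resource requirement is a vector of
# resource counts, and the price is its dot product with per-unit prices.
# Resource names and prices below are hypothetical examples.

def vnf_price(resources, unit_prices):
    """$ = n_CPU*$_CPU + n_m*$_m + ... over every resource dimension."""
    return sum(n * unit_prices[res] for res, n in resources.items())

unit_prices = {"cpu": 0.05, "gpu": 0.90, "mem_gb": 0.01, "disk_gb": 0.001, "net_gbps": 0.02}
min_req = {"cpu": 4, "gpu": 0, "mem_gb": 8, "disk_gb": 40, "net_gbps": 1}
```

With the $ value in hand, two VNFs are compared on the (ds, T/$) plane as in the figure; a VNF that wins on one axis but loses on the other cannot be ranked without further weighting.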
18. Finding the Knee Point…
Possible algorithmic methods for automatic determination of the knee point:
• Check the slope at each point; stop if the slope gets "significantly" small relative to the first slope
• Add points (0, 0), (1, T[1]), (2, T[2]), …; find the best non-linear fit T_NL = a0 + a1·d + a2·d^2 + …; find the best linear fit T_L = g1·d; stop when |T_NL − T_L| grows too large
• Use ML techniques for knee-point classification
All of these methods have a parameter-selection problem, and the knee is not straightforward to locate on smooth curves.
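The first method above (stop when the slope becomes "significantly" small relative to the first slope) can be sketched as follows; the 0.5 slope ratio is an arbitrary choice, which illustrates exactly the parameter-selection problem the slide points out:

```python
# Sketch of the slope-check method for knee detection. throughput[i] is the
# measured max throughput at d = i + 1 resource units; the curve is assumed
# to start at (0, 0). The `ratio` threshold is the method's free parameter.

def find_knee(throughput, ratio=0.5):
    """Return the resource-unit count d at the knee (the last point of
    linear growth), or None if growth never turns sub-linear."""
    first_slope = throughput[0]            # slope of the segment (0, 0) -> (1, T[1])
    for d in range(1, len(throughput)):
        slope = throughput[d] - throughput[d - 1]
        if slope < ratio * first_slope:    # growth turned "significantly" sub-linear
            return d                       # knee at the last linearly-growing point
    return None
```

For the curve T = [10, 20, 30, 33, 34] the growth rate collapses after 3 units, so the knee (and hence ds) would be reported at d = 3; on a smoothly saturating curve the answer shifts with `ratio`, which is why the slide calls the knee hard to pin down.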
19. Scaling Score (σ)
Can we instead use an indirect method for comparing scaling performance?
Proposal: compare the maximum flavor vs. the minimum flavor in terms of price efficiency:
σ = 100 · exp( (T_maxflavor / $_maxflavor) / (T_minflavor / $_minflavor) − 1 )
[Figure: throughput/$ vs. VNF flavor (1 through 5); the ideal curve (1) stays flat and scores σ = 100, while the degrading curve (2) scores σ = 37]
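Reading the formula back: a VNF whose price efficiency T/$ is the same at the maximum and minimum flavor gets σ = 100, and a VNF whose efficiency collapses toward zero at the maximum flavor approaches 100·exp(−1) ≈ 37, consistent with the two example curves on the slide. A minimal sketch:

```python
import math

# Sketch of the proposed scaling score: compare the price efficiency (T/$)
# of the maximum flavor against the minimum flavor. Efficiency preserved
# (ratio = 1) gives 100; efficiency collapsing toward 0 gives ~37.

def scaling_score(t_max, cost_max, t_min, cost_min):
    """sigma = 100 * exp(eff_max / eff_min - 1), with eff = T / $."""
    efficiency_ratio = (t_max / cost_max) / (t_min / cost_min)
    return 100.0 * math.exp(efficiency_ratio - 1.0)
```

Unlike knee-point detection, this score needs only two measurement points (the smallest and largest flavor), at the cost of ignoring the shape of the curve in between.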
20. Cloud Native Network Functions
Qualitative features (from ETSI):
• Level of separation of logic and state
• Degree of scale-out
• Resource footprint (scheduling density)
• Use of accelerators
• VNF resiliency model
• Monitoring capabilities
• Abstraction and APIs exposed
• Testing code shipped with the VNF
We plan to address these in the automated VNF Evaluation Framework.
21. Cloud Native Network Functions
[Figure: separation of logic and state — a monolithic instance loses its state when it fails (X), while stateless logic instances (1, 2, 3) keep state externalized and survive relaunch]
[Figure: resiliency test — plot throughput over time, introduce a failure, relaunch the instance, and measure the recovery time]
Scheduling freedom vs. dynamic service-chain creation models
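The resiliency test above (inject a failure, relaunch the instance, read the recovery time off the throughput-vs-time curve) could be automated roughly as follows; the even sampling interval and the 90%-of-baseline recovery criterion are our illustrative assumptions, not part of the framework:

```python
# Sketch of recovery-time measurement from a sampled throughput-vs-time trace.
# Assumptions (ours): samples are evenly spaced in time, and the VNF counts as
# "recovered" once throughput returns to 90% of its pre-failure baseline.

def recovery_time(samples, failure_idx, interval_s=1.0, threshold=0.9):
    """samples: throughput measurements; failure_idx: index where the failure
    was injected. Returns seconds until throughput re-crosses
    threshold * pre-failure baseline, or None if it never recovers."""
    baseline = sum(samples[:failure_idx]) / failure_idx
    for i in range(failure_idx, len(samples)):
        if samples[i] >= threshold * baseline:
            return (i - failure_idx) * interval_s
    return None
```

For example, a trace that runs at 100 units/s, drops to zero at the failure, and climbs back past 90 units/s three samples later would report a 3-second recovery time; a cloud-native VNF with externalized state should score markedly lower here than a stateful monolith.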
22. Conclusions & Next Steps
• Presented NFVI and VNF evaluation frameworks
• Discussed (and proposed) evaluation categories and metrics for cloud readiness and cloud nativeness
• The VNF evaluation framework is currently under development:
  (1) Cloud-readiness evaluation
  (2) Cloud-nativeness evaluation