3. DevConf 2017: Realistic Container Platform Simulations
OpenShift Performance and Scalability Team
Who are we?
Performance Engineering
Requestor
Upstream Development
QE Handoff
Test Design/Exec @ Scale
Engineering
Product Management
Marketing
Field/Customer
http://j.mp/ose-perf-scale
5. DevConf 2017: Realistic Container Platform Simulations
It’s all about the workloads...
Some don’t care where they run
● Batch workloads
● Web servers
Some care greatly
● Security, Isolation
● Uptime
● Performance
● Proximity/Locality to data
8. DevConf 2017: Realistic Container Platform Simulations
What is a workload? Business Requirements
9. DevConf 2017: Realistic Container Platform Simulations
Workload → Infrastructure Mapping: Build Farm

Attribute                 Build Farm
CPU Intensive             High
Memory Intensive          High
Disk I/O Latency          Low
Disk I/O Throughput       High
Network Latency           Low
Network Throughput        High
Security                  Low
Uptime (Live Migration)   N/A
Deployment Speed          High
Alternative OS            N/A

Icon Meaning
● Mature and/or No Perf Concerns
● Immature and/or Limited Perf Concerns
● Mixed Concerns
● Not Applicable
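The Build Farm column can be treated as a simple workload profile. The following is a minimal sketch, assuming a plain dictionary is an acceptable representation; the names `BUILD_FARM` and `attributes_rated` are hypothetical and do not come from the talk.

```python
# Hypothetical encoding of the "Build Farm" column from the slide's table.
# Ratings are copied from the slide; the structure itself is an assumption.
BUILD_FARM = {
    "CPU Intensive": "High",
    "Memory Intensive": "High",
    "Disk I/O Latency": "Low",
    "Disk I/O Throughput": "High",
    "Network Latency": "Low",
    "Network Throughput": "High",
    "Security": "Low",
    "Uptime (Live Migration)": "N/A",
    "Deployment Speed": "High",
    "Alternative OS": "N/A",
}


def attributes_rated(profile, rating):
    """Return the attribute names with the given rating, in table order."""
    return [name for name, value in profile.items() if value == rating]


if __name__ == "__main__":
    # Group the attributes by rating to see what the workload demands most.
    for rating in ("High", "Low", "N/A"):
        print(rating, "->", ", ".join(attributes_rated(BUILD_FARM, rating)))
```

Grouping by rating makes it easy to see which infrastructure properties a given workload leans on hardest before choosing where to run it.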
10. DevConf 2017: Realistic Container Platform Simulations
Container Infrastructure Components
● Atomic Host - a container-optimized, minimal footprint OS powered by Red Hat Enterprise Linux
● Telemetry - logging and metrics for pods/containers, services and underlying infrastructure to make informed decisions
● Runtime and Packaging Format - standardized container packaging format and runtime, powered by Docker (and OCI)
● Automation and host configuration management via Cockpit to dynamically provision and configure container host clusters
● Orchestration - for complex multi-container services, powered by Kubernetes
● Networking - scalable, multi-host container networking, powered by Open vSwitch, that runs anywhere Red Hat Enterprise Linux runs
● Cluster Services - scheduling for services across a container host cluster, powered by Kubernetes
● Storage, with persistent storage plugins to enable running of stateful services in containers
● Atomic Registry - integrated storage and management for sharing container images
● Security to prevent tenants from compromising other occupants
13. DevConf 2017: Realistic Container Platform Simulations
System Verification Test Suite (SVT)
● Red Hat OpenShift Performance and Scalability team’s test suites
● https://github.com/openshift/svt
● cluster-loader
● Networking/synthetic
● Workload Generator
● Reliability/Longevity
● https://github.com/kubernetes/perf-tests
14. DevConf 2017: Realistic Container Platform Simulations
Image Provisioner
Ansible automation to build a pre-configured RHEL image
Runs on any RHEL instance (currently supports EC2 and qcow2/KVM)
● filesystem juggling to allow for a thinpool in the base RHEL cloud image
● RHEL OS setup (install latest packages, ssh keys)
● aos-ansible (pull down supporting container images)
● openshift-rpm-install (install OpenShift RPMs, but do not configure)
● clone necessary git repos
● collectd-install (install and configure collectd)
● docker-config (set up storage)
● repo-install (set up custom yum repos)
● pbench-install (install and configure pbench)
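The named steps on this slide read like Ansible role names. As a rough sketch, assuming they are roles and that the slide order is the execution order (neither is confirmed by the talk), a playbook-shaped structure could be assembled as follows. The host group `image_builder` and the helper `build_play` are hypothetical.

```python
import json

# Step names copied from the slide; treating them as Ansible roles, and
# running them in slide order, are both assumptions.
ROLES_FROM_SLIDE = [
    "aos-ansible",
    "openshift-rpm-install",
    "collectd-install",
    "docker-config",
    "repo-install",
    "pbench-install",
]


def build_play(hosts, roles):
    """Return a one-play, playbook-shaped list for the given hosts and roles."""
    return [{"hosts": hosts, "become": True, "roles": list(roles)}]


if __name__ == "__main__":
    # "image_builder" is a hypothetical inventory group name.
    print(json.dumps(build_play("image_builder", ROLES_FROM_SLIDE), indent=2))
```

Keeping the role list as data makes it straightforward to reorder or drop steps when building images for a different target.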
15. DevConf 2017: Realistic Container Platform Simulations
What is cluster-loader?
I’d like an environment with thousands of deployment configs
(which include services and replication controllers), thousands more
routes, pods (each with a persistent storage volume automatically
attached), secrets, image streams, buildconfigs, etc.
16. DevConf 2017: Realistic Container Platform Simulations
[Flowchart: cluster-loader control flow. Nodes: Start; Parse args & config; New Config Obj; Create Namespace; "X exists?"; "Items < N"; Create X; Iterate Item Count; End. Decision branches are labeled True/False.]
X can be: Quota, Template, Service, User, Pod, RC
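The flowchart's nodes (parse config, create namespace, check whether X exists, loop while items < N) suggest a create-until-count loop. The sketch below is one plausible reading of that flow, not the actual cluster-loader code from the SVT repository: the order of the decisions is an assumption, and `load_cluster`, the config layout, and the `create`/`exists` callbacks are all hypothetical.

```python
from collections import Counter


def load_cluster(config, create, exists):
    """Create a namespace and N objects of each kind, skipping existing ones.

    config: {"namespace": str, "objects": [{"kind": str, "count": int}, ...]}
    create(kind, name) and exists(kind, name) are caller-supplied callbacks.
    Returns a Counter of how many objects of each kind were created.
    """
    created = Counter()
    namespace = config["namespace"]
    if not exists("Namespace", namespace):
        create("Namespace", namespace)
        created["Namespace"] += 1
    for spec in config["objects"]:
        items = 0
        while items < spec["count"]:  # the "Items < N" decision
            name = f"{spec['kind'].lower()}-{items}"
            if not exists(spec["kind"], name):  # the "X exists?" decision
                create(spec["kind"], name)  # the "Create X" step
                created[spec["kind"]] += 1
            items += 1  # the "Iterate Item Count" step
    return created


if __name__ == "__main__":
    # An in-memory set stands in for the cluster API in this sketch.
    store = set()
    config = {
        "namespace": "demo",
        "objects": [{"kind": "Pod", "count": 3}, {"kind": "Service", "count": 2}],
    }
    result = load_cluster(
        config,
        create=lambda kind, name: store.add((kind, name)),
        exists=lambda kind, name: (kind, name) in store,
    )
    print(dict(result))
```

Passing `create` and `exists` as callbacks keeps the loop testable without a cluster; a real tool would back them with API or CLI calls.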
22. DevConf 2017: Realistic Container Platform Simulations
HAProxy
● Raise default HAProxy maxconn to 20000: BZ1406327
● HAProxy is forcefully restarted when it fails to respond to the /healthz probe under high load: BZ1405440
● ARP tuning documentation
● nbproc
● tcp_fastopen
● busy_poll
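Before tuning, it helps to record the current kernel settings. The sketch below assumes that `tcp_fastopen` and `busy_poll` on this slide refer to the Linux sysctls `net.ipv4.tcp_fastopen` and `net.core.busy_poll`; the slide does not say so explicitly. The helper `read_sysctls` is hypothetical and simply reads the corresponding files under `/proc/sys`.

```python
from pathlib import Path

# Assumed mapping from the slide's bullet names to Linux sysctl keys.
TUNABLES = ("net.ipv4.tcp_fastopen", "net.core.busy_poll")


def read_sysctls(names, root="/proc/sys"):
    """Return {sysctl name: current value as a string, or None if unreadable}."""
    values = {}
    for name in names:
        # A dotted sysctl key maps to a path, e.g. net/ipv4/tcp_fastopen.
        path = Path(root).joinpath(*name.split("."))
        try:
            values[name] = path.read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for key, value in read_sysctls(TUNABLES).items():
        print(f"{key} = {value}")
```

The `root` parameter exists only so the function can be pointed at a test directory; on a real host the default applies.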
25. DevConf 2017: Realistic Container Platform Simulations
OCP-on-OSP: etcd
● Team reminded of etcd’s tight performance requirements
● OCP has poor day-2 operations capabilities for etcd
○ Difficult to replace failed nodes (SSL certs, data migration)
26. DevConf 2017: Realistic Container Platform Simulations
Lessons Learned/Next Steps
● Ensure fast disk for etcd (SSD is a must).
● Automate real hostnames and DNS (required for metrics and route tests)
● If using boot-from-volume, ensure you use snapshots when nova booting
● Ceph pools should have the correct number of placement groups (pg_num/pgp_num). Use pgcalc.
● Use dedicated builder nodes in OpenShift.
● Incorporate CFME into the environment, deployed as a pod.
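For the Ceph placement-group bullet, a commonly cited rule of thumb for a single pool is (OSDs × 100) / replica count, rounded up to the next power of two. The sketch below implements only that approximation; it is not the pgcalc tool, which also weighs multiple pools and their expected share of data. The function name `suggested_pg_count` and its default target of 100 PGs per OSD are assumptions for illustration.

```python
def suggested_pg_count(osd_count, replica_size, target_pgs_per_osd=100):
    """Approximate a pool's PG count: (OSDs * target) / replicas, rounded up
    to the next power of two. A rule of thumb, not a substitute for pgcalc."""
    if osd_count <= 0 or replica_size <= 0 or target_pgs_per_osd <= 0:
        raise ValueError("all arguments must be positive")
    raw = osd_count * target_pgs_per_osd / replica_size
    power = 1
    while power < raw:
        power *= 2
    return power


if __name__ == "__main__":
    # Example inputs are illustrative only.
    for osds, replicas in ((10, 3), (40, 3)):
        print(osds, "OSDs,", replicas, "replicas ->", suggested_pg_count(osds, replicas))
```

The result would be used for both `pg_num` and `pgp_num` on the pool; for clusters with several pools, pgcalc remains the better guide.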