RHCT, RHCSAv5, RHCSAv7, RHCEv5, RHCEv7, RHCVA, RHCI, RHCX, RHCSA-RHOS, CEI, CEH, CHFI, CND, EDRP, CCNA, MCTCNA, Sec+, Net+, VCA, vExpert 2017 at Inovasi Informatika Indonesia, PT
Container Storage Landscape
The container storage landscape has grown very fast and comes with many
options and features, so we need to take care when choosing container storage.
CNCF Landscape for Storage
Source:
https://landscape.cncf.io/card-mode?category=cloud-native-storage&grouping=category&license=open-source
CNCF Landscape for Storage
Source:
https://landscape.cncf.io/card-mode?category=cloud-native-storage&company-type=for-profit&grouping=category&license=not-open-source
● Open-source Design
● In-kernel Data Replication
● Low CPU Requirements
● Automated Resizing and Provisioning
● Supported Storage Types
● Data Protection and Replication
● Dynamic Provisioning
● Container-Native Storage or Container-Attached Storage
Container Storage Keypoints
Choosing open-source container storage
can eliminate the license cost component
compared to commercial products.
Open source also ushers in higher quality,
greater reliability, more flexibility, lower
costs, and an end to proprietary lock-in.
So ultimately, open source is good for
business.
Open-source by Design
Rook turns distributed storage systems into
self-managing, self-scaling, self-healing
storage services.
Some of the Container Storage Platforms
Some container storage platforms support
in-kernel data replication; this technology
gives better performance and efficiency
compared to user-space replication.
In-kernel Data Replication
Erasure Coding vs. Replication
One concern with replication is its
consumption of storage, generally 3x the raw
capacity. The other option is erasure coding,
a parity-based protection scheme that
stripes data across the cluster's nodes. Erasure
coding only requires about 1.5x to 1.75x the raw
capacity to protect the data (see the quick
calculation below).
However, erasure coding consumes more CPU
than replication.
Low CPU Consumption
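A quick back-of-the-envelope comparison of the two overheads (a minimal sketch; the 3-replica and 4+2 / 8+6 erasure-coding layouts are common defaults, e.g. in Ceph, but are assumptions here, not something this deck prescribes):

```python
def replication_overhead(replicas: int) -> float:
    """Total raw capacity needed per byte of usable data with N full replicas."""
    return float(replicas)

def erasure_coding_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Total raw capacity per byte of usable data with k data + m parity chunks."""
    return (data_chunks + parity_chunks) / data_chunks

print(replication_overhead(3))        # 3.0  -> 3x raw capacity
print(erasure_coding_overhead(4, 2))  # 1.5  -> 4+2 EC, 1.5x raw capacity
print(erasure_coding_overhead(8, 6))  # 1.75 -> 8+6 EC, 1.75x raw capacity
```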
[Diagram: three nodes, each stacking Apps, Networking, and an SDS Engine, illustrating how the SDS engine shares each node's CPU with the workload]
A good storage platform has the capability
to simplify provisioning in the first place.
However, scaling the storage platform is
another challenge for storage
administrators.
Today, a storage platform must offer
automatic rebalancing, i.e. automatic data
distribution, as part of its offering (a PVC
resize example is sketched below).
Automated Resizing and Provisioning
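As a concrete example, resizing a volume on a CSI driver that supports expansion is just a patch to the claim (a minimal sketch using the official `kubernetes` Python client; the claim name `data-pvc`, the namespace, and the new size are assumptions, and the backing StorageClass must have `allowVolumeExpansion: true`):

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
core = client.CoreV1Api()

# Grow the claim; the CSI driver and kubelet handle the filesystem resize.
core.patch_namespaced_persistent_volume_claim(
    name="data-pvc",
    namespace="default",
    body={"spec": {"resources": {"requests": {"storage": "20Gi"}}}},
)
```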
CSI-based vs. general storage interfaces vs. local (a pod-volume example follows the list below)
- iSCSI
- NFS
- Local Volume
- Local FileSystem
- CephFS
- Cinder
- Object Storage
- Glusterfs
- AzureFile
- AWSElasticBlockStore
Supported Storage Types
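Many of these types can be consumed either in-line in a pod spec or, preferably, through a PVC. A minimal sketch of the in-line style with the `kubernetes` Python client; the NFS server address and export path are placeholders:

```python
from kubernetes import client

# In-line NFS volume, declared directly in the pod spec.
nfs_volume = client.V1Volume(
    name="shared-data",
    nfs=client.V1NFSVolumeSource(server="nfs.example.com", path="/export/data"),
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="nfs-demo"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="app",
                image="busybox",
                command=["sleep", "3600"],
                volume_mounts=[
                    client.V1VolumeMount(name="shared-data", mount_path="/data")
                ],
            )
        ],
        volumes=[nfs_volume],
    ),
)
# client.CoreV1Api().create_namespaced_pod("default", pod) would submit it.
```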
● Snapshot Features
● Replication Features
● Erasure Coding
● Backup and Recovery
● DC-DRC Strategy
● Compression
● Thin Provisioning
Data Protection and Replication
[Diagram: volume copies A1 and A2 distributed across Node1-Node3, with snapshots, backups, and replication to another DC/region; a snapshot example follows below]
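With a CSI driver that supports snapshots, taking one is just creating a VolumeSnapshot custom resource (a minimal sketch; the snapshot class `csi-snapclass` and the source claim `data-pvc` are assumptions):

```python
from kubernetes import client, config

config.load_kube_config()
crd = client.CustomObjectsApi()

snapshot = {
    "apiVersion": "snapshot.storage.k8s.io/v1",
    "kind": "VolumeSnapshot",
    "metadata": {"name": "data-pvc-snap-1"},
    "spec": {
        "volumeSnapshotClassName": "csi-snapclass",
        "source": {"persistentVolumeClaimName": "data-pvc"},
    },
}

# VolumeSnapshot is a CRD, so it goes through the custom objects API.
crd.create_namespaced_custom_object(
    group="snapshot.storage.k8s.io",
    version="v1",
    namespace="default",
    plural="volumesnapshots",
    body=snapshot,
)
```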
For many small apps, a slow persistent
volume automatically selected by the
cloud provider is enough.
However, for heterogeneous workloads,
being able to pick between different
storage models, and being able to
implement policies, or better yet
"operators", around persistent volume
claim fulfillment becomes increasingly
important (see the StorageClass sketch below).
Dynamic Provisioning
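Dynamic provisioning hinges on StorageClass objects that name a provisioner (a minimal sketch; the class name and the Ceph RBD CSI driver are assumptions, substitute whichever driver you run):

```python
from kubernetes import client, config

config.load_kube_config()
storage = client.StorageV1Api()

fast_class = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="fast-rbd"),
    provisioner="rbd.csi.ceph.com",            # assumed CSI driver
    reclaim_policy="Delete",
    allow_volume_expansion=True,
    volume_binding_mode="WaitForFirstConsumer",  # bind where the pod lands
)
storage.create_storage_class(fast_class)
```

Any PVC that names `storageClassName: fast-rbd` then gets a volume provisioned on demand instead of drawing from a pre-created pool.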
Anytime we hear about CSI, we
hear about how it's modular, how it's
vendor-neutral, and how it can be
upgraded out-of-band. These are
nice to have, but do you really care?
CNS or CAS?
If you've used the standard storage volume type in your environment, you've
likely seen:
● Slow startup times
● Databases that failed due to disk I/O or size limits
● Difficulty moving between storage models
● Permissions issues caused by storage, which needed init pods,
fsGroups, supplementalGroups, or SELinux exceptions to be
rectified (an fsGroup sketch follows below)
● Lack of multitenancy tools for transferring data volumes between
pods in different namespaces
CNS or CAS?
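The permissions issues in particular are usually worked around with a pod-level `fsGroup` (a minimal sketch; the group ID 2000 and the image are arbitrary assumptions):

```python
from kubernetes import client

pod_spec = client.V1PodSpec(
    security_context=client.V1PodSecurityContext(
        fs_group=2000,  # mounted volume files are made group-accessible to this GID
    ),
    containers=[
        client.V1Container(name="db", image="postgres:16"),
    ],
)
```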
● CAS must support pass-through mode (what we call LocalPV in
the Kubernetes ecosystem; see the sketch after this list)
● CAS must support multi-node HA at LocalPV speed
● CAS software should be cloud-native in architecture, supporting
multiple data engines depending on the workload
● CAS should be open source to avoid introducing vendor
dependencies
CNS or CAS?
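A LocalPV in the sense used above is a PersistentVolume pinned to one node's disk (a minimal sketch; the node name, device path, capacity, and class name are assumptions):

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

local_pv = client.V1PersistentVolume(
    metadata=client.V1ObjectMeta(name="local-pv-node1"),
    spec=client.V1PersistentVolumeSpec(
        capacity={"storage": "100Gi"},
        access_modes=["ReadWriteOnce"],
        storage_class_name="local-storage",
        local=client.V1LocalVolumeSource(path="/mnt/disks/ssd0"),
        # Node affinity is mandatory for local volumes: pods using this PV
        # can only schedule onto node1, where the disk physically lives.
        node_affinity=client.V1VolumeNodeAffinity(
            required=client.V1NodeSelector(
                node_selector_terms=[
                    client.V1NodeSelectorTerm(
                        match_expressions=[
                            client.V1NodeSelectorRequirement(
                                key="kubernetes.io/hostname",
                                operator="In",
                                values=["node1"],
                            )
                        ]
                    )
                ]
            )
        ),
    ),
)
core.create_persistent_volume(local_pv)
```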
Container Attached Storage enables agile
storage for stateful containerized
applications. This is because it follows a
microservice-based pattern that allows
the storage controller and target replicas
to be upgraded seamlessly.
CAS is also appropriate for organizations
looking to orchestrate their storage across
multiple clouds, because CAS can be
deployed on any Kubernetes platform.
CAS or SDS?
Popular CAS solutions providers for Kubernetes
include:
● OpenEBS
● StorageOS
● Portworx
● Longhorn
Software-Defined Storage architecture
relies on a software layer to decouple
running applications from the storage
hardware. This simplifies the
management of storage devices by
abstracting them into virtual partitions.
With Software-Defined Storage, the
data/service management interface is
hosted on a master server that controls
storage layers consisting of shared storage
pools. This makes provisioning and
allocation of storage easy and flexible.
CAS or SDS?
So what do you think about the Ceph storage
platform? Is it SDS? Is it CAS? Or is it CNS?
It Depends on Your Workload
Every container storage platform performs better on specific workloads, so choosing
the right storage for each purpose is the important thing.
Consider the following table of storage requirements for different workloads.
Determine your Workload
Workload | Needs durability? | Needs to be fast? | Avg size per container | Dedicated disk?
AI workloads with model | no | no | 1 to 100 GB | no
Postgres workloads with millions of rows | yes | yes | 1 GB to 5 TB | yes
HDFS datanode backend folders | no | yes | 10 TB | yes
Cold storage | yes | no | Infinite | no
Determine your Workload
[Diagram: NEO storage portfolio positioned along high-capacity, high-throughput, and high-IOPS axes: NEO Block Storage (NBS), NEO Object Storage (NOS), NEO High Performance Storage tiers (NHP-5k, NHP-10k, NHP-15k), SAN Storage HDD (up to 100k), and SAN Storage SSD (up to 200k); example workloads include unstructured data, website data, data warehouse, database services, high-transaction, and core/HPC. Each tier maps to a Kubernetes StorageClass, as sketched below.]
StorageClass
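Tiers like these are typically exposed to users as distinct StorageClass names (a minimal sketch; the class names mirror the NHP tiers above, but the provisioner `csi.example.com` and the `iops` parameter are hypothetical and depend on the actual CSI driver):

```python
from kubernetes import client, config

config.load_kube_config()
storage = client.StorageV1Api()

# One StorageClass per performance tier; users pick a tier by naming
# the class in their PVC's storageClassName field.
for name, iops in [("nhp-5k", "5000"), ("nhp-10k", "10000"), ("nhp-15k", "15000")]:
    storage.create_storage_class(
        client.V1StorageClass(
            metadata=client.V1ObjectMeta(name=name),
            provisioner="csi.example.com",  # hypothetical CSI driver
            parameters={"iops": iops},      # hypothetical driver parameter
            allow_volume_expansion=True,
        )
    )
```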