SlideShare uma empresa Scribd logo
1 de 56
Baixar para ler offline
What CloudStackers Need to Know About LINSTOR/DRBD
Philipp Reisner, CEO LINBIT
1
Leading Open Source OS based SDS
COMPANY OVERVIEW
REFERENCES
• Developer of DRBD and LINSTOR
• 100% founder owned
• Offices in Europe and US
• Team of highly experienced Linux
experts
• Exclusivity Japan: SIOS
• Leading Open Source Block Storage
(included in Linux Kernel (v2.6.33)
• Open Source DRBD supported by
proprietary LINBIT products / services
• OpenStack with DRBD Cinder driver
• Kubernetes Driver
• Install base of >2 million
PRODUCT OVERVIEW
SOLUTIONS
LINBIT SDS
Since 2016
Perfectly suited for SSD/NVMe high performance storage
DRBD HA, DRBD DR
Market leading solutions since 2001, over 600 customers
Ideally suited to power HA and DR in OEM appliances (Cisco, IBM, Oracle)
3
LINBIT SDS
When? Why? What?
4
When is LINBIT SDS a fit?
Transaction Processing
●
Oracle DB
●
PostgreSQL
●
MariaDB
●
Message queuing systems
Analytic Processing
●
DB2 Warehouse
●
And similar read intensive workloads
PersistentVolumes
...for Containers
●
Kubernetes
●
Nomad
●
Docker
Virtualization
●
CloudStack
●
Others...
5
Why is LINBIT SDS so fast?
Layout at volume allocation
• All participating machines have full replicas, which
machines participate determined when creating a
volume.
• Be faster at IO submission time
• Saving on CPU/memory
In Kernel data-path
• Reduce number of context switches
• Saving on CPU/memory resources
• Minimal latency for block-IO operations
• Optional load-balancing for READs
Hyper-Converged
Very well suitable for hyper-converged deployment
• Reduced network load for reads
• Reduces latency
• LINBIT SDS’ Low resource consumption leaves most
of CPU and memory for workload. About 0.5% of a
single core are consumed by DRBD under heavier IO
load (measured with an analytics DB)
Build on existing components
• DRBD, LVM, ZFS, LUKS, VDO, ...
• Help day2 operations by leveraging on the
opera tion teams prior knowledge
• Build on the shoulders of giants
6
What is LINBIT SDS doing?
Storage Allocation
• 3 to 1000s of nodes
• Multiple tiers
• Multi tenancy
• Complex policies
Chassis - rack - room
Network
• Multiple NICs per server
• Multiple networks
• RDMA
• TCP
Business continuity
• Continuous data protection
• Multiple sites
• Backups
SSD - disk - cloud
Data Replication
• Persistence & availability
• Sync / async
• 2,3 or more replicas
• Consistency groups
• Quorum
7
Linux Storage Gems
LVM, RAID, SSD cache tiers, deduplication, targets & initiators
8
Linux's LVM
physical
volume
physical
volume
logical
volume
logical
volume
physical
volume
snapshot
Volume Group
9
Linux's LVM
• based on device mapper
• original objects
• PVs, VGs, LVs, snapshots
• LVs can scatter over PVs in multiple segments
• thinlv
• thinpools = LVs
• thin LVs live in thinpools
• multiple snapshots became efficient!
10
Linux's LVM
physical
volume
physical
volume
logical
volume
thinpool
physical
volume
snapshot
thin-LV thin-LV
thin-sLV
Volume Group
11
Linux's RAID
• original MD code
• mdadm command
• Raid Levels: 0,1,4,5,6,10
• now available in LVM as well
• device mapper interface for MD code
• do not call it ‘dmraid’; that is software for hardware fake-raid
• lvcreate --type raid6 --size 100G VG_name
RAID1
A4
A3
A2
A1
A4
A3
A2
A1
12
Linux’s DeDupe
• Virtual Data Optimizer (VDO) since RHEL 7.5
• Red hat acquired Permabit and is GPLing VDO
• Linux upstreaming is in preparation
• in-line data deduplication
• kernel part is a device mapper module
• indexing service runs in user-space
• async or synchronous writeback
• recommended to be used below LVM
13
SSD cache for HDD
• dm-cache
• device mapper module
• accessible via LVM tools
• bcache
• generic Linux block device
• slightly ahead in the performance game
• dm-write-cache
• for combinding PMEM & NVMe drives
14
Linux’s targets & initiators
• Open-ISCSI initiator
• Ietd, STGT, SCST
• mostly historical
• LIO
• iSCSI, iSER, SRP, FC, FCoE
• SCSI pass through, block IO, file IO, user-specific-IO
• NVMe-OF & NVMe/TCP
• target & initiator
Initiator Target
IO-requests
data/completion
15
ZFS on Linux
• Ubuntu eco-system only
• has its own
• logic volume manager (zVols)
• thin provisioning
• RAID (RAIDz)
• caching for SSDs (ZIL, SLOG)
• and a file system!
16
Put in simplest form
17
DRBD – think of it as ...
Target
Initiator
IO-requests
data/completion
RAID1
A4
A3
A2
A1
A4
A3
A2
A1
18
DRBD – another way to look at it
19
DRBD Roles: Primary & Secondary
Primary Secondary
replication
20
DRBD – multiple Volumes
• consistency group
Primary Secondary
replication
21
DRBD – up to 32 replicas
• each may be synchronous or async
Primary
Secondary
Secondary
22
DRBD – Diskless nodes
• intentional diskless (no change tracking bitmap)
• disks can fail
Primary
Secondary
Secondary
23
DRBD - more about
• a node knows the version of the data is exposes
• automatic partial resync after connection outage
• checksum-based verify & resync
• split brain detection & resolution policies
• fencing
• quorum
• multiple resouces per node possible (1000s)
• dual Primary for live migration of VMs only!
24
DRBD Recent Features & ROADMAP
• Recent
• meta-data on PMEM/NVDIMMS
• improved, fine-grained locking for parallel workloads
• support discard granularity up 128MiB (GiBs w. resync)
• release DRBD-9.2
• ROADMAP
• performance optimizations
• replace „stacking”
• NATed partitions
25
The combination is more than the sum of its parts
26
LINSTOR - goals
• storage build from generic Linux nodes
• for SDS consumers (CloudStack, K8s, OpenStack, ...)
• building on existing Linux storage components
• multiple tenants possible
• deployment architectures
• distinct storage nodes
• hyperconverged with hypervisors / container hosts
• LVM, thin LVM or ZFS for volume management (stratis later)
• Open Source, GPL
27
Example
28
LINSTOR - Hyperconverged
storage
devices
hypervisor
& storage
VM
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
VM
29
LINSTOR - VM migrated
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
VM
VM
30
LINSTOR - add local replica
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
VM
VM
31
LINSTOR - remove 3rd
copy
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
storage
devices
hypervisor
& storage
VM
VM
32
Architecture, objects and functions
33
LINSTOR
Controller
LINSTOR
Satellite
LINSTOR
Satellite
LINSTOR
Satellite
LINSTOR
Controller
LINSTOR
Satellite
LINSTOR
Controller
LINSTOR
Satellite
LINSTOR
Protocol
TCP/IP
SSL/TLS
API
Library
CLI
XCP-NG
34
LINSTOR Objects
• Nodes
• Resources

Volumes
• Snapshots
• Storage Pools

shared
• Resource groups
• Properties

Aux properties
node A
LVM VG
node C
shared
LVM VG
DRBD res.
2 replicas
encryped
volume
node B
ZFS pool
plain
volume
Resource Groups
Resource Groups
35
LINSTOR Storage Layers
LVM VG, ZFS zPool, Exos, OpenFlex, SPDK, shared LVM VG
Mid (opt & multiple)
Bottom (required)
LUKS (encryption), caches, NVMe target & initiator
Top (opt) DRBD
Below (opt)
Bottom (required)
VDO (deduplication), software RAID
36
LINSTOR data placement
Example policy
3 way redundant, where two
copies are in the same rack but in
diffeent fire compartments
(synchronous) and a 3rd
replica in
a different site (asynchronous)
Example tags
rack = number
room = number
site = city
• arbitrary tags on nodes
• require placement on equal/different/named tag values
• prohibit placements with named existing volumes
• different failure domains for related volumes
37
LINSTOR network path selection
• a storage pool may preferred a NIC
• express NUMA relation of NVMe devices and NICs
• DRBD’s multi pathing supported
• load balancing with the RDMA transport
• fail-over only with the TCP transport
38
in the Software Ecosystem
39
LINSTOR connectors
Apache CloudStack
since 4.16
update with 4.18 (already merged)
with Qemu/KVM
40
LINSTOR SDS vs LINSTOR/DRBD
LINBIT SDS From source code
yum / apt repo Red hat UBI Github → make → package
Support ✓ Enterprise, incl 24/7 Community only
GUI ✓ n.a.
41
“Naming is hard” – Phil Karlton
Translation Matrix
LINSTOR Resource Group Resource/Volume
Kubernetes storageClass + file system → persistentVolume
Nomad
OpenNebula Datastore Image
OpenStack Volume Type Volume
Proxmox Storage Pool Volume
XCP-ng Storage Repository (SR) Virtual disk Image (VDI)
CloudStack Primary storage Volume
42
Summary
DRBD Diskless NVMe - oF
ISCSI
Block transport systems
Orchestrators
Block Storage Features
DRBD LUKS DM - Cache
VDO
Hardware
HDD
Node-level volume management
LVM ZFS
SSD NVMe PMEM
43
Thank you
https://www.linbit.com
44
Appendix Slides: Example Disaggregated Architecture
45
LINSTOR – disaggregated stack
storage
node
storage
node
hypervisor
VM
VM
storage
node
storage
node
hypervisor
VM
VM
hypervisor
46
LINSTOR / failed Hypervisor
storage
node
storage
node
storage
node
storage
node
hypervisor
VM
VM
hypervisor
VM
VM
47
LINSTOR / failed storage node
storage
node
hypervisor
VM
VM
storage
node
storage
node
hypervisor
VM
VM
hypervisor
48
Appendix Slides: Possible Storage Stacks
49
LINSTOR Storage Stacks
• Disaggregated Storage
• Classic enterprise workloads
• Data bases
• Message queues
• Typical Orchestrators
• OpenStack, OpenNebula
• Kubernetes
• Flexibly redundancy (1-n)
• HDDs, SSDs, NVMe SSDs
DRBD
App
DRBD
DRBD
50
LINSTOR Storage Stacks
• Hyperconverged
• Classic enterprise workloads
• Data bases
• Message queues
• Typical Orchestrators
• OpenStack, OpenNebula
• Kubernetes
• Flexibly redundancy (1-n)
• HDDs, SSDs, NVMe SSDs
DRBD
DRBD
App
DRBD
51
LINSTOR Storage Stacks
• Disaggregated
• Classic enterprise workloads
• Data bases
• Message queues
• Typical Orchestrators
• OpenStack, OpenNebula
• Kubernetes
• NVMe SSDs, SSDs
App
NVMe-oF
Initiator
NVMe-oF
Target
NVMe-oF
Initiator
NVMe-oF
Target
Raid1
52
LINSTOR Storage Stacks
• Disaggregated
• Cloud native workload
• Ephemeral storage
• Typical Orchestrator
• Kubernetes
• Application handles redundancy
• Best suited for NVMe SSDs
App
NVMe-oF
Initiator
NVMe-oF
Target
53
LINSTOR Storage Stacks
• Hyperconverged
• Cloud native workload
• Ephemeral storage
• PMEM optimized data base
• Typical Orchestrator
• Kubernetes
• Application handles redundancy
• PMEM, NVDIMMs
App
54
LINSTOR Slicing Storage
• LVM or ZFS
• Thick – pre allocated
• Best performance
• Less features
• Thin – allocated on demand
• Overprovisioning possible
• Many snapshots possible
• Optional
• Encryption on top
• Deduplication below
55
WinDRBD
56
WinDRBD
• in public beta
• https://www.linbit.com/en/drbd-community/drbd-download/
• Windows 7sp1, Windows 10, Windows Server 2016
• wire protocol compatible to Linux version
• driver tracks Linux version with one day release offset
• WinDRBD user level tools are merged into upstream

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

CloudStack and cloud-init
CloudStack and cloud-initCloudStack and cloud-init
CloudStack and cloud-init
 
Ceph Introduction 2017
Ceph Introduction 2017  Ceph Introduction 2017
Ceph Introduction 2017
 
What's New In Apache CloudStack 4.17
What's New In Apache CloudStack 4.17What's New In Apache CloudStack 4.17
What's New In Apache CloudStack 4.17
 
Using the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStackUsing the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStack
 
Lisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-onLisa 2015-gluster fs-hands-on
Lisa 2015-gluster fs-hands-on
 
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxConAnatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
 
Terraform
TerraformTerraform
Terraform
 
Accelerating Hyper-Converged Enterprise Virtualization using Proxmox and Ceph
Accelerating Hyper-Converged Enterprise Virtualization using Proxmox and CephAccelerating Hyper-Converged Enterprise Virtualization using Proxmox and Ceph
Accelerating Hyper-Converged Enterprise Virtualization using Proxmox and Ceph
 
OpenShift Virtualization - VM and OS Image Lifecycle
OpenShift Virtualization - VM and OS Image LifecycleOpenShift Virtualization - VM and OS Image Lifecycle
OpenShift Virtualization - VM and OS Image Lifecycle
 
Room 3 - 7 - Nguyễn Như Phúc Huy - Vitastor: a fast and simple Ceph-like bloc...
Room 3 - 7 - Nguyễn Như Phúc Huy - Vitastor: a fast and simple Ceph-like bloc...Room 3 - 7 - Nguyễn Như Phúc Huy - Vitastor: a fast and simple Ceph-like bloc...
Room 3 - 7 - Nguyễn Như Phúc Huy - Vitastor: a fast and simple Ceph-like bloc...
 
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
KVM High Availability Regardless of Storage - Gabriel Brascher, VP of Apache ...
 
Minio Cloud Storage
Minio Cloud StorageMinio Cloud Storage
Minio Cloud Storage
 
Deploying CloudStack and Ceph with flexible VXLAN and BGP networking
Deploying CloudStack and Ceph with flexible VXLAN and BGP networking Deploying CloudStack and Ceph with flexible VXLAN and BGP networking
Deploying CloudStack and Ceph with flexible VXLAN and BGP networking
 
VM Job Queues in CloudStack
VM Job Queues in CloudStackVM Job Queues in CloudStack
VM Job Queues in CloudStack
 
Terraform
TerraformTerraform
Terraform
 
OVN - Basics and deep dive
OVN - Basics and deep diveOVN - Basics and deep dive
OVN - Basics and deep dive
 
Ceph Block Devices: A Deep Dive
Ceph Block Devices:  A Deep DiveCeph Block Devices:  A Deep Dive
Ceph Block Devices: A Deep Dive
 
Technical Introduction to RHEL8
Technical Introduction to RHEL8Technical Introduction to RHEL8
Technical Introduction to RHEL8
 
NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?NATS Streaming - an alternative to Apache Kafka?
NATS Streaming - an alternative to Apache Kafka?
 
VictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - PreviewVictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - Preview
 

Semelhante a What CloudStackers Need To Know About LINSTOR/DRBD

drbd9_and_drbdmanage_may_2015
drbd9_and_drbdmanage_may_2015drbd9_and_drbdmanage_may_2015
drbd9_and_drbdmanage_may_2015
Alexandre Huynh
 

Semelhante a What CloudStackers Need To Know About LINSTOR/DRBD (20)

LINSTOR - Linux Block storage management tool (march 2019)
LINSTOR - Linux Block storage management tool (march 2019)LINSTOR - Linux Block storage management tool (march 2019)
LINSTOR - Linux Block storage management tool (march 2019)
 
Performant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux WayPerformant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux Way
 
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
 
Reliable Storage for High Availability, Disaster Recovery, Clouds and Contain...
Reliable Storage for High Availability, Disaster Recovery, Clouds and Contain...Reliable Storage for High Availability, Disaster Recovery, Clouds and Contain...
Reliable Storage for High Availability, Disaster Recovery, Clouds and Contain...
 
OpenNebulaConf2018 - LINSTOR - Philipp Reisner - LINBIT
OpenNebulaConf2018 - LINSTOR  - Philipp Reisner - LINBIT OpenNebulaConf2018 - LINSTOR  - Philipp Reisner - LINBIT
OpenNebulaConf2018 - LINSTOR - Philipp Reisner - LINBIT
 
Block Storage For VMs With Ceph
Block Storage For VMs With CephBlock Storage For VMs With Ceph
Block Storage For VMs With Ceph
 
XenSummit - 08/28/2012
XenSummit - 08/28/2012XenSummit - 08/28/2012
XenSummit - 08/28/2012
 
LINSTOR - Resilient OSS Storage for OpenNebula - September 2018
LINSTOR - Resilient OSS Storage for OpenNebula - September 2018LINSTOR - Resilient OSS Storage for OpenNebula - September 2018
LINSTOR - Resilient OSS Storage for OpenNebula - September 2018
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage Cluster
 
drbd9_and_drbdmanage_may_2015
drbd9_and_drbdmanage_may_2015drbd9_and_drbdmanage_may_2015
drbd9_and_drbdmanage_may_2015
 
Cache Tiering and Erasure Coding
Cache Tiering and Erasure CodingCache Tiering and Erasure Coding
Cache Tiering and Erasure Coding
 
Cache Tiering and Erasure Coding
Cache Tiering and Erasure CodingCache Tiering and Erasure Coding
Cache Tiering and Erasure Coding
 
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-DeviceSUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
SUSE Expert Days Paris 2018 - SUSE HA Cluster Multi-Device
 
DRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBIT
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015 Linux High Availability Overview - openSUSE.Asia Summit 2015
Linux High Availability Overview - openSUSE.Asia Summit 2015
 
Redis by-hari
Redis by-hariRedis by-hari
Redis by-hari
 
OpenNebulaConf 2016 - The DRBD SDS for OpenNebula by Philipp Reisner, LINBIT
OpenNebulaConf 2016 - The DRBD SDS for OpenNebula by Philipp Reisner, LINBITOpenNebulaConf 2016 - The DRBD SDS for OpenNebula by Philipp Reisner, LINBIT
OpenNebulaConf 2016 - The DRBD SDS for OpenNebula by Philipp Reisner, LINBIT
 
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
 

Mais de ShapeBlue

Mais de ShapeBlue (20)

CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlueCloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
CloudStack Authentication Methods – Harikrishna Patnala, ShapeBlue
 
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlueCloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
CloudStack Tooling Ecosystem – Kiran Chavala, ShapeBlue
 
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
 
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlueVM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
VM Migration from VMware to CloudStack and KVM – Suresh Anaparti, ShapeBlue
 
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHubHow We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
How We Grew Up with CloudStack and its Journey – Dilip Singh, DataHub
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
 
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
CloudStack 101: The Best Way to Build Your Private Cloud – Rohit Yadav, VP Ap...
 
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIOHow We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
How We Use CloudStack to Provide Managed Hosting - Swen Brüseke - proIO
 
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
 
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
 
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
 
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
 
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
Use Existing Assets to Build a Powerful In-house Cloud Solution - Magali Perv...
 
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
 
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
 
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
 
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
 
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
Transitioning from VMware vCloud to Apache CloudStack: A Path to Profitabilit...
 
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
 
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

What CloudStackers Need To Know About LINSTOR/DRBD

  • 1. What CloudStackers Need to Know About LINSTOR/DRBD Philipp Reisner, CEO LINBIT 1
  • 2. Leading Open Source OS based SDS COMPANY OVERVIEW REFERENCES • Developer of DRBD and LINSTOR • 100% founder owned • Offices in Europe and US • Team of highly experienced Linux experts • Exclusivity Japan: SIOS • Leading Open Source Block Storage (included in Linux Kernel (v2.6.33) • Open Source DRBD supported by proprietary LINBIT products / services • OpenStack with DRBD Cinder driver • Kubernetes Driver • Install base of >2 million PRODUCT OVERVIEW SOLUTIONS LINBIT SDS Since 2016 Perfectly suited for SSD/NVMe high performance storage DRBD HA, DRBD DR Market leading solutions since 2001, over 600 customers Ideally suited to power HA and DR in OEM appliances (Cisco, IBM, Oracle)
  • 4. 4 When is LINBIT SDS a fit? Transaction Processing ● Oracle DB ● PostgreSQL ● MariaDB ● Message queuing systems Analytic Processing ● DB2 Warehouse ● And similar read intensive workloads PersistentVolumes ...for Containers ● Kubernetes ● Nomad ● Docker Virtualization ● CloudStack ● Others...
  • 5. 5 Why is LINBIT SDS so fast? Layout at volume allocation • All participating machines have full replicas, which machines participate determined when creating a volume. • Be faster at IO submission time • Saving on CPU/memory In Kernel data-path • Reduce number of context switches • Saving on CPU/memory resources • Minimal latency for block-IO operations • Optional load-balancing for READs Hyper-Converged Very well suitable for hyper-converged deployment • Reduced network load for reads • Reduces latency • LINBIT SDS’ Low resource consumption leaves most of CPU and memory for workload. About 0.5% of a single core are consumed by DRBD under heavier IO load (measured with an analytics DB) Build on existing components • DRBD, LVM, ZFS, LUKS, VDO, ... • Help day2 operations by leveraging on the opera tion teams prior knowledge • Build on the shoulders of giants
  • 6. 6 What is LINBIT SDS doing? Storage Allocation • 3 to 1000s of nodes • Multiple tiers • Multi tenancy • Complex policies Chassis - rack - room Network • Multiple NICs per server • Multiple networks • RDMA • TCP Business continuity • Continuous data protection • Multiple sites • Backups SSD - disk - cloud Data Replication • Persistence & availability • Sync / async • 2,3 or more replicas • Consistency groups • Quorum
  • 7. 7 Linux Storage Gems LVM, RAID, SSD cache tiers, deduplication, targets & initiators
  • 9. 9 Linux's LVM • based on device mapper • original objects • PVs, VGs, LVs, snapshots • LVs can scatter over PVs in multiple segments • thinlv • thinpools = LVs • thin LVs live in thinpools • multiple snapshots became efficient!
  • 11. 11 Linux's RAID • original MD code • mdadm command • Raid Levels: 0,1,4,5,6,10 • now available in LVM as well • device mapper interface for MD code • do not call it ‘dmraid’; that is software for hardware fake-raid • lvcreate --type raid6 --size 100G VG_name RAID1 A4 A3 A2 A1 A4 A3 A2 A1
  • 12. 12 Linux’s DeDupe • Virtual Data Optimizer (VDO) since RHEL 7.5 • Red hat acquired Permabit and is GPLing VDO • Linux upstreaming is in preparation • in-line data deduplication • kernel part is a device mapper module • indexing service runs in user-space • async or synchronous writeback • recommended to be used below LVM
  • 13. 13 SSD cache for HDD • dm-cache • device mapper module • accessible via LVM tools • bcache • generic Linux block device • slightly ahead in the performance game • dm-write-cache • for combinding PMEM & NVMe drives
  • 14. 14 Linux’s targets & initiators • Open-ISCSI initiator • Ietd, STGT, SCST • mostly historical • LIO • iSCSI, iSER, SRP, FC, FCoE • SCSI pass through, block IO, file IO, user-specific-IO • NVMe-OF & NVMe/TCP • target & initiator Initiator Target IO-requests data/completion
  • 15. 15 ZFS on Linux • Ubuntu eco-system only • has its own • logic volume manager (zVols) • thin provisioning • RAID (RAIDz) • caching for SSDs (ZIL, SLOG) • and a file system!
  • 17. 17 DRBD – think of it as ... Target Initiator IO-requests data/completion RAID1 A4 A3 A2 A1 A4 A3 A2 A1
  • 18. 18 DRBD – another way to look at it
  • 19. 19 DRBD Roles: Primary & Secondary Primary Secondary replication
  • 20. 20 DRBD – multiple Volumes • consistency group Primary Secondary replication
  • 21. 21 DRBD – up to 32 replicas • each may be synchronous or async Primary Secondary Secondary
  • 22. 22 DRBD – Diskless nodes • intentional diskless (no change tracking bitmap) • disks can fail Primary Secondary Secondary
  • 23. 23 DRBD - more about • a node knows the version of the data is exposes • automatic partial resync after connection outage • checksum-based verify & resync • split brain detection & resolution policies • fencing • quorum • multiple resouces per node possible (1000s) • dual Primary for live migration of VMs only!
  • 24. 24 DRBD Recent Features & ROADMAP • Recent • meta-data on PMEM/NVDIMMS • improved, fine-grained locking for parallel workloads • support discard granularity up 128MiB (GiBs w. resync) • release DRBD-9.2 • ROADMAP • performance optimizations • replace „stacking” • NATed partitions
  • 25. 25 The combination is more than the sum of its parts
  • 26. 26 LINSTOR - goals • storage build from generic Linux nodes • for SDS consumers (CloudStack, K8s, OpenStack, ...) • building on existing Linux storage components • multiple tenants possible • deployment architectures • distinct storage nodes • hyperconverged with hypervisors / container hosts • LVM, thin LVM or ZFS for volume management (stratis later) • Open Source, GPL
  • 28. 28 LINSTOR - Hyperconverged storage devices hypervisor & storage VM storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage VM
  • 29. 29 LINSTOR - VM migrated storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage VM VM
  • 30. 30 LINSTOR - add local replica storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage VM VM
  • 31. 31 LINSTOR - remove 3rd copy storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage storage devices hypervisor & storage VM VM
  • 34. 34 LINSTOR Objects • Nodes • Resources  Volumes • Snapshots • Storage Pools  shared • Resource groups • Properties  Aux properties node A LVM VG node C shared LVM VG DRBD res. 2 replicas encryped volume node B ZFS pool plain volume Resource Groups Resource Groups
  • 35. 35 LINSTOR Storage Layers LVM VG, ZFS zPool, Exos, OpenFlex, SPDK, shared LVM VG Mid (opt & multiple) Bottom (required) LUKS (encryption), caches, NVMe target & initiator Top (opt) DRBD Below (opt) Bottom (required) VDO (deduplication), software RAID
  • 36. 36 LINSTOR data placement Example policy 3 way redundant, where two copies are in the same rack but in diffeent fire compartments (synchronous) and a 3rd replica in a different site (asynchronous) Example tags rack = number room = number site = city • arbitrary tags on nodes • require placement on equal/different/named tag values • prohibit placements with named existing volumes • different failure domains for related volumes
  • 37. 37 LINSTOR network path selection • a storage pool may preferred a NIC • express NUMA relation of NVMe devices and NICs • DRBD’s multi pathing supported • load balancing with the RDMA transport • fail-over only with the TCP transport
  • 38. 38 in the Software Ecosystem
  • 39. 39 LINSTOR connectors Apache CloudStack since 4.16 update with 4.18 (already merged) with Qemu/KVM
  • 40. 40 LINSTOR SDS vs LINSTOR/DRBD LINBIT SDS From source code yum / apt repo Red hat UBI Github → make → package Support ✓ Enterprise, incl 24/7 Community only GUI ✓ n.a.
  • 41. 41 “Naming is hard” – Phil Karlton Translation Matrix LINSTOR Resource Group Resource/Volume Kubernetes storageClass + file system → persistentVolume Nomad OpenNebula Datastore Image OpenStack Volume Type Volume Proxmox Storage Pool Volume XCP-ng Storage Repository (SR) Virtual disk Image (VDI) CloudStack Primary storage Volume
  • 42. 42 Summary DRBD Diskless NVMe - oF ISCSI Block transport systems Orchestrators Block Storage Features DRBD LUKS DM - Cache VDO Hardware HDD Node-level volume management LVM ZFS SSD NVMe PMEM
  • 44. 44 Appendix Slides: Example Disaggregated Architecture
  • 45. 45 LINSTOR – disaggregated stack storage node storage node hypervisor VM VM storage node storage node hypervisor VM VM hypervisor
  • 46. 46 LINSTOR / failed Hypervisor storage node storage node storage node storage node hypervisor VM VM hypervisor VM VM
  • 47. 47 LINSTOR / failed storage node storage node hypervisor VM VM storage node storage node hypervisor VM VM hypervisor
  • 49. 49 LINSTOR Storage Stacks • Disaggregated Storage • Classic enterprise workloads • Data bases • Message queues • Typical Orchestrators • OpenStack, OpenNebula • Kubernetes • Flexibly redundancy (1-n) • HDDs, SSDs, NVMe SSDs DRBD App DRBD DRBD
  • 50. 50 LINSTOR Storage Stacks • Hyperconverged • Classic enterprise workloads • Data bases • Message queues • Typical Orchestrators • OpenStack, OpenNebula • Kubernetes • Flexibly redundancy (1-n) • HDDs, SSDs, NVMe SSDs DRBD DRBD App DRBD
  • 51. 51 LINSTOR Storage Stacks • Disaggregated • Classic enterprise workloads • Data bases • Message queues • Typical Orchestrators • OpenStack, OpenNebula • Kubernetes • NVMe SSDs, SSDs App NVMe-oF Initiator NVMe-oF Target NVMe-oF Initiator NVMe-oF Target Raid1
  • 52. 52 LINSTOR Storage Stacks • Disaggregated • Cloud native workload • Ephemeral storage • Typical Orchestrator • Kubernetes • Application handles redundancy • Best suited for NVMe SSDs App NVMe-oF Initiator NVMe-oF Target
  • 53. 53 LINSTOR Storage Stacks • Hyperconverged • Cloud native workload • Ephemeral storage • PMEM optimized data base • Typical Orchestrator • Kubernetes • Application handles redundancy • PMEM, NVDIMMs App
  • 54. 54 LINSTOR Slicing Storage • LVM or ZFS • Thick – pre allocated • Best performance • Less features • Thin – allocated on demand • Overprovisioning possible • Many snapshots possible • Optional • Encryption on top • Deduplication below
  • 56. 56 WinDRBD • in public beta • https://www.linbit.com/en/drbd-community/drbd-download/ • Windows 7sp1, Windows 10, Windows Server 2016 • wire protocol compatible to Linux version • driver tracks Linux version with one day release offset • WinDRBD user level tools are merged into upstream