Deep diving into the dynamic
provisioning of GlusterFS
volumes in k8s with Heketi
Artem Romanchik
Key notes
Persistent Volume Claim (PVC)
GlusterFS
Heketi
Known issues
Good advice
Have you worked with GlusterFS and Heketi?
Most popular answers
What is that?
A little
BG spared me! (a pun on the Russian phrase "God spared me")
Boris Grebenshchikov
I use it in production
Cool thing!
GlusterFS
GlusterFS is a scalable network filesystem suitable for
data-intensive tasks such as cloud storage and media
streaming.
- Distributed Glusterfs Volume
- Replicated Glusterfs Volume
- Distributed Replicated Glusterfs Volume
- Striped Glusterfs Volume
- Distributed Striped Glusterfs Volume
Heketi
Heketi provides a RESTful management interface which can be used to
manage the life cycle of GlusterFS volumes. With Heketi, cloud services
like OpenStack Manila, Kubernetes, and OpenShift can dynamically
provision GlusterFS volumes.
[Diagram: Kubernetes sends management requests to Heketi, which manages GlusterFS; Kubernetes mounts volumes from GlusterFS]
Persistent Volume Claim (PVC): what is it?
PVC from user side
[Diagram: K8s Pod, volume mount, PVC]
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mystorage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: slow
PVC from server side
[Diagram: K8s Pod, PVC, StorageClass, PV, Service, Endpoints, Heketi]
PVC:
  resources:
    requests:
      storage: 1Gi
  storageClassName: slow
StorageClass:
  …
  parameters:
    resturl: http://127.0.0.1:8081
    …
    volumetype: replicate:3
  provisioner: kubernetes.io/glusterfs
PV:
  …
  glusterfs:
    endpoints: glusterfs-dynamic-service-storage-service-0
    path: vol_c2fe5ec0d33aff8bc91893d9fedf84f7
  …
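The StorageClass parameter volumetype: replicate:3 ends up as the "durability" object in the request that Heketi receives. The following is a minimal Go sketch of that mapping, not the provisioner's own code. It assumes the documented forms replicate:<n>, disperse:<data>:<redundancy> and none; the type and the helper name parseVolumeType are hypothetical, and the helper is stricter than the real provisioner may be.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// durability mirrors the "durability" object seen in the Heketi request body.
type durability struct {
	Type      string `json:"type"`
	Replicate struct {
		Replica int `json:"replica,omitempty"`
	} `json:"replicate"`
	Disperse struct {
		Data       int `json:"data,omitempty"`
		Redundancy int `json:"redundancy,omitempty"`
	} `json:"disperse"`
}

// parseVolumeType (hypothetical helper) converts a StorageClass volumetype
// string such as "replicate:3" into a durability value.
func parseVolumeType(s string) (durability, error) {
	var d durability
	parts := strings.Split(s, ":")
	nums := make([]int, 0, 2)
	for _, p := range parts[1:] {
		n, err := strconv.Atoi(p)
		if err != nil || n <= 0 {
			return d, fmt.Errorf("invalid count %q in volumetype %q", p, s)
		}
		nums = append(nums, n)
	}
	switch {
	case parts[0] == "replicate" && len(nums) == 1:
		d.Type = "replicate"
		d.Replicate.Replica = nums[0]
	case parts[0] == "disperse" && len(nums) == 2:
		d.Type = "disperse"
		d.Disperse.Data, d.Disperse.Redundancy = nums[0], nums[1]
	case parts[0] == "none" && len(nums) == 0:
		d.Type = "none"
	default:
		return d, fmt.Errorf("unsupported volumetype %q", s)
	}
	return d, nil
}

func main() {
	d, err := parseVolumeType("replicate:3")
	if err != nil {
		panic(err)
	}
	out, _ := json.Marshal(d)
	// Prints {"type":"replicate","replicate":{"replica":3},"disperse":{}}
	fmt.Println(string(out))
}
```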
PVC from Heketi side
Kubernetes (provisioner) → Heketi
1. POST /volumes HTTP/1.1
   Request body: {"size":1,"name":"","durability":{"type":"replicate","replicate":{"replica":3},"disperse":{}},"gid":2008,"snapshot":{"enable":true,"factor":1}}
   Response: Location: /queue/fb82…adc3
2. GET /queue/fb82…adc3 (repeated every 2s)
   Response: Location: /volumes/2927…66b02
3. GET /volumes/2927…66b02
   Response body (volume): {"size":1,"name":"vol_29279734e412cac413e2baf5deb66b02","durability":{"type":"replicate","replicate":{"replica":3},"disperse":{}},"gid":2008,"glustervolumeoptions":["",""],"snapshot":{"enable":true,"factor":1} …
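The exchange above is an asynchronous pattern: create, poll the queue URL, then follow the final Location. Below is a minimal Go sketch of the client side, run against a stub server so that it is self-contained. The status codes (202 for accepted, 200 with an X-Pending: true header while pending, 303 when done) are assumptions based on Heketi's documented async behaviour and are not shown on the slide. Authentication is omitted, and waitForVolume and newStubHeketi are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// waitForVolume posts a volume request, then polls the returned queue URL
// until the server redirects to the finished volume. It returns that path.
func waitForVolume(base, body string, interval time.Duration, maxPolls int) (string, error) {
	// Do not follow redirects automatically: the 303 Location is the result.
	client := &http.Client{CheckRedirect: func(*http.Request, []*http.Request) error {
		return http.ErrUseLastResponse
	}}
	resp, err := client.Post(base+"/volumes", "application/json", strings.NewReader(body))
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusAccepted {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	queue := resp.Header.Get("Location")
	for i := 0; i < maxPolls; i++ {
		resp, err := client.Get(base + queue)
		if err != nil {
			return "", err
		}
		resp.Body.Close()
		switch {
		case resp.StatusCode == http.StatusSeeOther:
			return resp.Header.Get("Location"), nil
		case resp.StatusCode == http.StatusOK && resp.Header.Get("X-Pending") == "true":
			time.Sleep(interval) // still pending, try again later
		default:
			return "", fmt.Errorf("operation failed with status %d", resp.StatusCode)
		}
	}
	return "", fmt.Errorf("gave up after %d polls", maxPolls)
}

// newStubHeketi (hypothetical) imitates the async flow: the operation stays
// pending for pendingPolls queue requests and then completes.
func newStubHeketi(pendingPolls int) *httptest.Server {
	polls := 0
	mux := http.NewServeMux()
	mux.HandleFunc("/volumes", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Location", "/queue/op1")
		w.WriteHeader(http.StatusAccepted)
	})
	mux.HandleFunc("/queue/op1", func(w http.ResponseWriter, r *http.Request) {
		polls++
		if polls <= pendingPolls {
			w.Header().Set("X-Pending", "true")
			w.WriteHeader(http.StatusOK)
			return
		}
		w.Header().Set("Location", "/volumes/vol1")
		w.WriteHeader(http.StatusSeeOther)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubHeketi(2)
	defer srv.Close()
	loc, err := waitForVolume(srv.URL, `{"size":1}`, 10*time.Millisecond, 10)
	fmt.Println(loc, err)
}
```

The slide shows a fixed 2s polling interval; the sketch takes the interval as a parameter so that it can be shortened when run against the stub.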
PVC from GlusterFS side
Heketi → GlusterFS
1. mkdir -p /var/lib/heketi/mounts/vg_name/brick_name
2. Create a thin logical volume:
   lvcreate -qq --autobackup=y --poolmetadatasize 8192K --chunksize 256K --size 1048576K --thin vg_name/tp_name --virtualsize 1048576K --name brick_name
3. mkfs.xfs -i size=512 -n size=8192 /dev/mapper/vg_name-brick_name
4. Add the volume to /var/lib/heketi/fstab:
   awk 'BEGIN {print "/dev/mapper/vg_name-brick_name /var/lib/heketi/mounts/vg_name/brick_name xfs rw,inode64,noatime,nouuid 1 2" >> "/var/lib/heketi/fstab"}'
5. Mount the volume
6. mkdir /var/lib/heketi/mounts/vg_name/brick_name/brick
7. gluster --mode=script --timeout=600 volume create vol_name replica 3 brick1 brick2 brick3
8. gluster --mode=script --timeout=600 volume set vol_name user.heketi.id ID
9. gluster --mode=script --timeout=600 volume start vol_name
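The per-brick steps above differ only in the volume group, thin pool, brick name and size. The Go sketch below assembles the command strings in order; it only builds and prints them, it does not run anything. brickCommands is a hypothetical helper, and the mount command is an assumption, since the slide only says "Mount volume".

```go
package main

import (
	"fmt"
	"path"
)

const mountRoot = "/var/lib/heketi/mounts"

// brickCommands (hypothetical helper) lists, in order, the commands from the
// slide that prepare one brick. Sizes are in KiB, as in the lvcreate example.
func brickCommands(vg, tp, brick string, sizeKiB, metaKiB int) []string {
	mnt := path.Join(mountRoot, vg, brick)
	dev := fmt.Sprintf("/dev/mapper/%s-%s", vg, brick)
	fstabLine := fmt.Sprintf("%s %s xfs rw,inode64,noatime,nouuid 1 2", dev, mnt)
	return []string{
		"mkdir -p " + mnt,
		fmt.Sprintf("lvcreate -qq --autobackup=y --poolmetadatasize %dK --chunksize 256K --size %dK --thin %s/%s --virtualsize %dK --name %s",
			metaKiB, sizeKiB, vg, tp, sizeKiB, brick),
		"mkfs.xfs -i size=512 -n size=8192 " + dev,
		fmt.Sprintf(`awk 'BEGIN {print "%s" >> "/var/lib/heketi/fstab"}'`, fstabLine),
		// Assumption: the slide does not show the exact mount invocation.
		fmt.Sprintf("mount -o rw,inode64,noatime,nouuid %s %s", dev, mnt),
		"mkdir " + path.Join(mnt, "brick"),
	}
}

func main() {
	for i, c := range brickCommands("vg_name", "tp_name", "brick_name", 1048576, 8192) {
		fmt.Printf("%d. %s\n", i+1, c)
	}
}
```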
Map of the Heketi world
[Diagram: PVC, StorageClass, PV, Heketi, Service, GlusterFS, Volume, Brick1, Brick2, Brick3, Heketi DB]
PV:
…
metadata:
  annotations:
    gluster.kubernetes.io/heketi-volume-id: a9d8b1ae636258c09af7378946ceac76
  name: pvc-90d794f3-41c8-11ea-8353-06d8b3ea3b88
glusterfs:
  endpoints: glusterfs-dynamic-test
  path: vol_90da4e99-41c8-11ea-8e54-06029613cf28
…
Endpoints:
…
subsets:
- addresses:
  - ip: 10.2.4.144
  - ip: 10.2.4.231
  - ip: 10.2.4.58
  ports:
  - port: 1
    protocol: TCP
Provisioner: kubernetes.io/glusterfs
Heketi executor: pkg/remoteexec/*executor (ssh / k8s API)
Heketi config:
"port": 8081,
"glusterfs": {
  "executor": "kubernetes",
  "db": "/var/lib/heketi/heketi.db",
  "kubeexec": {
    "host": "https://kubernetes.default.svc.cluster.local",
    "fstab": "/var/lib/heketi/fstab",
    "backup_lvm_metadata": true
kubernetes.io/glusterfs
import ( …
	gcli "github.com/heketi/heketi/client/api/go-client"
	gapi "github.com/heketi/heketi/pkg/glusterfs/api" …)
…
func (p *glusterfsVolumeProvisioner) CreateVolume(gid int) (r *v1.GlusterfsPersistentVolumeSource, size int, volID string, err error) {
	…
	cli := gcli.NewClient(p.url, p.user, p.secretValue)
	…
	volumeReq := &gapi.VolumeCreateRequest{Size: sz, Name: customVolumeName, Clusters: clusterIDs, Gid: gid64, Durability: p.volumeType, GlusterVolumeOptions: p.volumeOptions, Snapshot: snaps}
	volume, err := cli.VolumeCreate(volumeReq)
	…
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/glusterfs/glusterfs.go
Not all the information can be found in the docs
type provisionerConfig struct {
	…
	url string
	user string
	volumeType gapi.VolumeDurabilityInfo
	volumeNamePrefix string
	thinPoolSnapFactor float32
	customEpNamePrefix string
	… }
There is no information here: https://kubernetes.io/docs/concepts/storage/storage-classes/#glusterfs
And here: https://docs.openshift.com/container-platform/3.11/install_config/persistent_storage/dynamically_provisioning_pvs.html
Fortunately, there is a pretty good description here:
https://github.com/kubernetes/examples/blob/master/staging/persistent-volume-provisioning/README.md
customepnameprefix: By default, dynamically provisioned volumes have an endpoint and service created with the naming schema glusterfs-dynamic-<PVC UUID>. With this option present in the StorageClass, an admin can prefix the desired endpoint from the StorageClass. If the customepnameprefix StorageClass parameter is set, the dynamically provisioned volumes will have an endpoint and service created in the following format, where - is the field separator/delimiter: customepnameprefix-<PVC UUID>
Task: custom Gluster volume names
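The naming rule quoted from the README can be written as a small function. This is a sketch only: endpointName is a hypothetical helper, not the provisioner's code, and the actual default may differ between Kubernetes versions (an earlier slide shows an endpoint named glusterfs-dynamic-test).

```go
package main

import "fmt"

// endpointName (hypothetical helper) applies the rule quoted from the README:
// "glusterfs-dynamic-<PVC UUID>" by default, "<prefix>-<PVC UUID>" when the
// customepnameprefix StorageClass parameter is set.
func endpointName(customPrefix, pvcUID string) string {
	prefix := "glusterfs-dynamic"
	if customPrefix != "" {
		prefix = customPrefix
	}
	return prefix + "-" + pvcUID
}

func main() {
	// The UID is taken from the PV name shown on an earlier slide; "myteam"
	// is an illustrative prefix.
	uid := "90d794f3-41c8-11ea-8353-06d8b3ea3b88"
	fmt.Println(endpointName("", uid))
	fmt.Println(endpointName("myteam", uid))
}
```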
External projects as part of Heketi
gorilla/mux: a powerful URL router and dispatcher
Cobra: a library for creating powerful modern CLI applications, as well as a program to generate applications and command files
BoltDB: an embedded key/value database for Go
apps/glusterfs/dbcommon.go
heketi-cli volume list
GET http://localhost:8080/volumes
JSON
/var/lib/heketi/heketi.db
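On the client side, the JSON returned by the volume list call only needs to be decoded. The Go sketch below assumes the body has the shape {"volumes": [ids…]}; that shape is not shown on the slide. parseVolumeList is a hypothetical helper and the sample body is made up from the volume ID used in the database slides.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// volumeList models the JSON body assumed to come back from the list call.
type volumeList struct {
	Volumes []string `json:"volumes"`
}

// parseVolumeList (hypothetical helper) extracts the volume IDs from a
// response body.
func parseVolumeList(body []byte) ([]string, error) {
	var vl volumeList
	if err := json.Unmarshal(body, &vl); err != nil {
		return nil, err
	}
	return vl.Volumes, nil
}

func main() {
	body := []byte(`{"volumes":["08f43b51ac546868c0d1a29e5f3921fb"]}`)
	ids, err := parseVolumeList(body)
	if err != nil {
		panic(err)
	}
	for _, id := range ids {
		fmt.Println("Id:", id)
	}
}
```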
Heketi database
[Diagram: tree of database buckets: clusterentries, nodeentries, deviceentries, volumeentries, brickentries]
Heketi Database
# cat db_before.json | jq '. | map_values(keys)'
{
  "clusterentries": [
    "2e16d5adfb5eababeceb6719e5e808cd"
  ],
  "volumeentries": [
    "08f43b51ac546868c0d1a29e5f3921fb"
  ],
  "brickentries": [
    "077dfbe0c7f0a8949960ef21333a2c11",
    "1…b0",
    "1c6…5"
  ],
  "nodeentries": [
    "520bf6e83b4bba8fb5a992a6da6ef041",
    "d3…a0",
    "fdb…13"
  ],
  "deviceentries": [
    "0888196c04e9e5f7ad346ba7ec173c01",
    "38…94",
    "fb…a6"
  ],
  "blockvolumeentries": [],
  "dbattributeentries": [],
  "pendingoperations": []
}
"08f43b51ac546868c0d1a29e5f3921fb": {
  "Info": {
    "size": 2,
    "name": "vol_08f43b51ac546868c0d1a29e5f3921fb",
    "durability": {
      "type": "replicate",
      "replicate": { "replica": 3 },
      "disperse": {}
    },
    "gid": 2016,
    "snapshot": { "enable": false, "factor": 1 },
    "id": "08f43b51ac546868c0d1a29e5f3921fb",
    "cluster": "2e16d5adfb5eababeceb6719e5e808cd",
    "mount": {
      "glusterfs": {
        "hosts": [ "10.2.4.58", "10.2.4.144", "10.2.4.231" ],
        "device": "10.2.4.58:vol_08f43b51ac546868c0d1a29e5f3921fb",
        "options": { "backup-volfile-servers": "10.2.4.144,10.2.4.231" }
      }
    },
    "blockinfo": {}
  },
  "Bricks": [
    "077dfbe0c7f0a8949960ef21333a2c11",
    "1…b0",
    "1c6…5"
  ],
  "GlusterVolumeOptions": [ ],
  "Pending": { "Id": "" }
}
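A volume entry refers to its bricks by ID, so a JSON export of the database can be checked for references that do not resolve. The Go sketch below reports brick IDs that a volume lists but that are missing from brickentries. It assumes the export uses the top-level keys shown in the jq output above; missingBricks is a hypothetical helper and the sample data in main is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// dbExport keeps only the parts of the export needed for the check.
type dbExport struct {
	Volumes map[string]struct {
		Bricks []string `json:"Bricks"`
	} `json:"volumeentries"`
	Bricks map[string]json.RawMessage `json:"brickentries"`
}

// missingBricks (hypothetical helper) returns "volumeID/brickID" pairs for
// brick IDs that a volume lists but that have no entry under brickentries.
func missingBricks(raw []byte) ([]string, error) {
	var db dbExport
	if err := json.Unmarshal(raw, &db); err != nil {
		return nil, err
	}
	var missing []string
	for volID, vol := range db.Volumes {
		for _, brickID := range vol.Bricks {
			if _, ok := db.Bricks[brickID]; !ok {
				missing = append(missing, volID+"/"+brickID)
			}
		}
	}
	sort.Strings(missing) // map iteration order is random, so sort the report
	return missing, nil
}

func main() {
	raw := []byte(`{
	  "volumeentries": {"vol1": {"Bricks": ["b1", "b2"]}},
	  "brickentries": {"b1": {}}
	}`)
	missing, err := missingBricks(raw)
	fmt.Println(missing, err)
}
```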
…
"bricks": {
  "077dfbe0c7f0a8949960ef21333a2c11": {
    "Info": {
      "id": "077dfbe0c7f0a8949960ef21333a2c11",
      "path": "$brick_path_1",
      "device": "vg_z12…13",
      "node": "520bf6e83b4bba8fb5a992a6da6ef041",
      "volume": "08f43b51ac546868c0d1a29e5f3921fb",
      "size": $volume_size_bytes
    },
    "TpSize": $volume_size_bytes,
    "PoolMetadataSize": 12288,
    "Pending": {
      "Id": ""
    },
    "LvmThinPool": "tp_077dfbe0c7f0a8949960ef21333a2c11",
    "LvmLv": "",
    "SubType": 1
…
"State": "online",
"Info": {
  "zone": 1,
  "hostnames": {
    "manage": [ "10.2.4.231" ],
    "storage": [ "10.2.4.231" ]
  },
  "cluster": "2e16d5adfb5eababeceb6719e5e808cd",
  "id": "520bf6e83b4bba8fb5a992a6da6ef041"
},
"Devices": [
  "0888196c04e9e5f7ad346ba7ec173c01"
How Heketi manages Gluster
Executor | mock | ssh | kubernetes
How it manages | Does not send any commands out to servers | Sends commands to real systems over SSH | Sends commands to the k8s API
Purpose | Development and tests | GlusterFS on separate servers | Managing GlusterFS in k8s pods
/etc/heketi/heketi.json
… "glusterfs": {
    "executor": "kubernetes",
    "db": "/var/lib/heketi/heketi.db",
    "kubeexec": {
      "host": "https://kubernetes.default.svc.cluster.local",
      "fstab": "/var/lib/heketi/fstab" …
Note: "fstab" sets a custom fstab for the GlusterFS pods
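The executor choice is plain configuration, so it can be read back from the "glusterfs" section with a few lines of Go. This sketch models only the keys visible on the slide. describeExecutor is a hypothetical helper and the summaries it returns paraphrase the executor table.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// heketiConfig models only the keys of heketi.json that appear on the slide.
type heketiConfig struct {
	GlusterFS struct {
		Executor string `json:"executor"`
		DB       string `json:"db"`
		KubeExec struct {
			Host  string `json:"host"`
			Fstab string `json:"fstab"`
		} `json:"kubeexec"`
	} `json:"glusterfs"`
}

// describeExecutor (hypothetical helper) reports how Heketi will reach
// GlusterFS according to the configured executor.
func describeExecutor(raw []byte) (string, error) {
	var cfg heketiConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", err
	}
	g := cfg.GlusterFS
	switch g.Executor {
	case "mock":
		return "mock: no commands are sent to servers", nil
	case "ssh":
		return "ssh: commands are sent to real systems over SSH", nil
	case "kubernetes":
		return fmt.Sprintf("kubernetes: commands go through the API at %s, fstab %s",
			g.KubeExec.Host, g.KubeExec.Fstab), nil
	default:
		return "", fmt.Errorf("unknown executor %q", g.Executor)
	}
}

func main() {
	raw := []byte(`{"glusterfs": {
		"executor": "kubernetes",
		"db": "/var/lib/heketi/heketi.db",
		"kubeexec": {
			"host": "https://kubernetes.default.svc.cluster.local",
			"fstab": "/var/lib/heketi/fstab"
		}
	}}`)
	fmt.Println(describeExecutor(raw))
}
```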
Possible Heketi database problems
Problem | Impact | Solution
A volume exists in GlusterFS but not in Heketi | Lost control | Add information about the lost volume to the DB
A volume exists in Heketi but not in GlusterFS | Inconsistent DB | Delete the volume from the Heketi DB using heketi-cli or the API
Volume settings differ from those in the Heketi DB | Inconsistent DB | Fix the values in the Heketi DB
We replaced the physical device holding GlusterFS bricks | Storage isn't healthy | Recreate all volumes
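The first two rows of the table come down to a set difference between the volume names Heketi knows and the names GlusterFS reports. Below is a minimal Go sketch of that comparison. Collecting the two lists (for example from a database export and from gluster volume list) is left out, diffVolumes is a hypothetical helper, and the names in main are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// diffVolumes (hypothetical helper) compares volume names known to Heketi
// with those reported by GlusterFS and returns the names found on one side only.
func diffVolumes(heketi, gluster []string) (onlyGluster, onlyHeketi []string) {
	inHeketi := make(map[string]bool, len(heketi))
	for _, v := range heketi {
		inHeketi[v] = true
	}
	inGluster := make(map[string]bool, len(gluster))
	for _, v := range gluster {
		inGluster[v] = true
		if !inHeketi[v] {
			onlyGluster = append(onlyGluster, v) // "lost control" case
		}
	}
	for _, v := range heketi {
		if !inGluster[v] {
			onlyHeketi = append(onlyHeketi, v) // "inconsistent DB" case
		}
	}
	sort.Strings(onlyGluster)
	sort.Strings(onlyHeketi)
	return onlyGluster, onlyHeketi
}

func main() {
	heketi := []string{"vol_a", "vol_b"}
	gluster := []string{"vol_b", "vol_c"}
	onlyGluster, onlyHeketi := diffVolumes(heketi, gluster)
	fmt.Println("only in GlusterFS:", onlyGluster)
	fmt.Println("only in Heketi:", onlyHeketi)
}
```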
If we replace a physical device
Replacing a broken GlusterFS pod is really easy! Example:
heketi-cli node add --zone=1 --cluster=$heketiClusterID --management-host-name=$newNodeAddress --storage-host-name=$newNodeAddress
heketi-cli device add --name $deviceName --node $newNodeID
heketi-cli node disable $oldNodeID
heketi-cli node remove $oldNodeID
heketi-cli device delete $oldDeviceID
heketi-cli node delete $oldNodeID
But replacing only a broken device isn't possible from Heketi. We need to do this manually:
Restore the LVs from the LVM archive, e.g. /etc/lvm/archive/vg_0888196c04e9e5f7ad346ba7ec173c01_00141-488962832.vg:
device=$(grep /dev/ $backup | sed -e 's/\t//g' -e 's/#//g' | cut -d " " -f3)
device_id=$(grep /dev/ $backup -B1 | head -n1 | cut -d " " -f3)
pvcreate --uuid $device_id $device --norestorefile
vgname=$(grep -E 'vg_.*{' $backup | cut -d " " -f1)
vgcreate $vgname $device
bricks=$(grep brick_ $backup | sed 's/\t//g' | cut -d " " -f1)
for brick in $bricks; do
  chunksize=256K; tp_name=…; poolmetadatasize=…; size=…
  lvcreate -qq --autobackup=y --poolmetadatasize $poolmetadatasize --chunksize $chunksize --size $size --thin $vgname/$tp_name --virtualsize $size --name $brick
  mkfs.xfs -i size=512 -n size=8192 /dev/mapper/$vgname-$brick
done
Reset all bricks:
gluster v status $volume | grep ${brickpath: -25} | grep " N " && \
gluster volume reset-brick $volume $brickpath start && \
gluster volume reset-brick $volume $brickpath $brickpath commit force
Good advice
1. Feel free to use Red Hat resources:
- OpenShift docs (Gluster storage)
- Red Hat Knowledgebase (KB)
No-Cost RHEL Developer Subscription:
https://developers.redhat.com/blog/2016/03/31/no-cost-rhel-developer-subscription-now-available/
2. Implementation Guide for IBM Blockchain Platform for Multicloud
https://play.google.com/books/reader?id=D5q1DwAAQBAJ&authuser=2&hl=ru
3. You can use our scripts for checking and fixing Heketi and Gluster
https://github.com/TargetProcess/heketi_tools
4. RTFM: https://github.com/heketi/heketi
Q&A
Answer: 42
Thank you!
Artem Romanchik
Targetprocess, Inc

Delex 2020: Deep diving into the dynamic provisioning of GlusterFS volumes in k8s with Heketi
