Accelerating Ceph with iWARP RDMA over Ethernet
Haodong Tang, haodong.tang@intel.com, Intel
Brien Porter, brien.porter@intel.com, Intel
March, 2018
Agenda
§ Background and motivations
§ Ceph with RDMA Messenger
§ Ceph with NVMe-oF
§ Summary & Next-step
Background
§ Past work (Boston OpenStack summit)
§ Optane-based all-flash array was capable of delivering over 2.8M 4K random read
IOPS with very low latency.
§ Optane as the BlueStore DB drive dramatically improved the tail latency of 4K
random write.
§ In this session, we’re talking about…
§ With the emergence of faster storage devices (Optane/AEP), we need a faster
network stack (high bandwidth, low CPU cost, low latency) to keep performance
growth linear.
Uneven CPU distribution
§ Unbalanced CPU utilization.
§ With a single OSD per NVMe, Ceph can’t take full advantage of NVMe performance → the goal is to
minimize 4K RW latency.
§ With multiple OSDs per NVMe, 4K RW performance improves greatly, but CPU tends to
become the bottleneck → the goal is to reduce CPU utilization.
* This figure is from the Boston OpenStack Summit session.
CPU overhead – 4K random read
§ AsyncMsg: ~22% - ~24%
CPU overhead – 4K random write
§ WAL remove thread: ~1.85%
§ RocksDB: ~6% - ~7%
§ AsyncMsg: ~14% - ~15%
Motivations
§ RDMA is direct access from the memory of one computer into that of
another without involving either one’s operating system.
§ RDMA supports zero-copy networking (kernel bypass); see the sketch below.
• Eliminates CPU, memory, and context-switch overhead.
• Reduces latency and enables fast message transfer.
§ Potential benefits for Ceph.
• Better resource allocation – bring additional disks to servers with spare
CPU.
• Reduce the latency added by the Ceph network stack.
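To make the zero-copy / kernel-bypass point concrete, here is a minimal libibverbs sketch (illustrative only, not Ceph code): the application registers its buffer with the NIC once and posts a send directly from it, so no kernel socket-buffer copy sits on the data path. The protection domain `pd`, queue pair `qp`, and completion queue `cq` are assumed to be already created and connected.

```cpp
// Minimal libibverbs sketch (illustrative, not Ceph code): register an
// application buffer and post a zero-copy send on an already-connected
// queue pair. QP creation and address exchange are omitted.
#include <infiniband/verbs.h>
#include <cstdint>
#include <vector>

void post_zero_copy_send(ibv_pd* pd, ibv_qp* qp, ibv_cq* cq) {
  std::vector<char> buf(4096, 0);               // application-owned buffer

  // Register the buffer once; the RNIC can then DMA it directly,
  // with no kernel socket-buffer copy on the data path.
  ibv_mr* mr = ibv_reg_mr(pd, buf.data(), buf.size(), IBV_ACCESS_LOCAL_WRITE);
  if (!mr) return;

  ibv_sge sge{};
  sge.addr   = reinterpret_cast<uintptr_t>(buf.data());
  sge.length = static_cast<uint32_t>(buf.size());
  sge.lkey   = mr->lkey;

  ibv_send_wr wr{};
  ibv_send_wr* bad = nullptr;
  wr.opcode     = IBV_WR_SEND;                  // two-sided send
  wr.sg_list    = &sge;
  wr.num_sge    = 1;
  wr.send_flags = IBV_SEND_SIGNALED;            // request a completion

  if (ibv_post_send(qp, &wr, &bad) == 0) {
    ibv_wc wc{};
    int n = 0;
    do { n = ibv_poll_cq(cq, 1, &wc); } while (n == 0);  // busy-poll until done
    // n == 1 && wc.status == IBV_WC_SUCCESS: the NIC sent straight from buf.
  }
  ibv_dereg_mr(mr);
}
```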
RDMA overview
[Figure: RDMA transport stacks – InfiniBand (IB transport / IB network / IB link), RoCE v1 (IB transport over the Ethernet link layer), RoCE v2 (IB transport over UDP/IP/Ethernet), and iWARP (iWARP protocol over TCP/IP/Ethernet), all exposed to RDMA applications/ULPs through the common RDMA API (verbs) and RDMA software stack. Red content is defined by the IBTA; green content by the IEEE/IETF.]
RoCE portion adopted from “Supplement to InfiniBand Architecture Specification Volume 1 Release 1.2.1, Annex A17: RoCEv2”, September 2, 2014.
Ceph network layer
§ Currently, the Async Messenger is the default network stack; it handles the
message transfer between the different dispatchers (client, monitor, OSD daemon).
A sketch of its epoll-based event-driver pattern follows the figure note below.
§ In our tests, the Async Messenger brought a ~8% CPU benefit (lower than the Simple Messenger), with no
throughput gain observed.
[Figure: Simple Messenger (per-connection reader/writer threads, in_q/out_q, accepter, dispatcher) vs. Async Messenger (poller pool, event driver, event callback, dispatch_queue, dispatcher).]
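The “event driver + event callback” boxes in the Async Messenger diagram are, conceptually, an epoll loop owned by each worker thread that dispatches callbacks when a connection’s file descriptor becomes ready. The following is a minimal, hypothetical sketch of that pattern; the class name `EventDriver` and its methods are invented for illustration and this is not Ceph’s actual EventCenter code.

```cpp
// Minimal epoll event-driver sketch in the spirit of the Async Messenger
// worker loop: each worker owns an epoll instance, registers connection fds,
// and invokes a callback when a fd becomes readable. Hypothetical, not Ceph code.
#include <sys/epoll.h>
#include <unistd.h>
#include <functional>
#include <unordered_map>

class EventDriver {
 public:
  EventDriver() : epfd_(epoll_create1(0)) {}
  ~EventDriver() { ::close(epfd_); }

  // Register a fd with a callback fired when the fd is readable.
  void add_readable(int fd, std::function<void()> cb) {
    callbacks_[fd] = std::move(cb);
    epoll_event ev{};
    ev.events  = EPOLLIN;
    ev.data.fd = fd;
    epoll_ctl(epfd_, EPOLL_CTL_ADD, fd, &ev);
  }

  // One iteration of the worker loop: wait for events, dispatch callbacks.
  void run_once(int timeout_ms) {
    epoll_event events[64];
    int n = epoll_wait(epfd_, events, 64, timeout_ms);
    for (int i = 0; i < n; ++i) {
      auto it = callbacks_.find(events[i].data.fd);
      if (it != callbacks_.end()) it->second();   // the "event callback"
    }
  }

 private:
  int epfd_;
  std::unordered_map<int, std::function<void()>> callbacks_;
};
```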
Ceph with RDMA
§ XIO Messenger is based on Accelio, which
seamlessly supports RDMA. XIO Messenger was
implemented in the Ceph Hammer release as beta;
it is no longer supported.
§ Async Messenger.
• Async Messenger is compatible with
different network protocols, such as POSIX,
RDMA and DPDK.
• The current Async Messenger RDMA support is
based on the IB protocol.
• How about integrating the iWARP protocol?
[Figure: Async Messenger RDMA architecture – Dispatcher, dispatch queue, worker pools with event drivers, IO Library, RDMA stack + OFA verbs API (kernel bypass through IB transport), NIC driver, and hardware Ethernet NIC (RNIC).]
Ceph IWARP support
§ Motivation
§ Leverage RDMA to improve performance
(low CPU utilization, low latency) and
improve drive scalability.
§ Leverage Intel network technology (NICs
with iWARP support) to speed up Ceph.
§ Prerequisite
§ Ceph AsyncMessenger provides
asynchronous semantics for RDMA.
§ To-do
§ Need the rdma-cm library (see the sketch below).
[Figure: Async Messenger over iWARP – the same Async Messenger architecture, with the kernel-bypass path going through RDMAP/DDP/MPA over TCP/IP on the RNIC.]
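The rdma-cm to-do item refers to librdmacm, which iWARP needs for connection establishment, since iWARP connections ride on TCP and the MPA handshake rather than an IB-style exchange. Below is a minimal, hedged sketch of a client-side rdma-cm connect using calls from librdmacm; error handling and cleanup are trimmed, the QP attributes are assumed to be prepared by the caller, and this illustrates only the handshake, not Ceph’s actual RDMAConnectedSocket code.

```cpp
// Minimal librdmacm client-side sketch: resolve the peer, create a QP on the
// cm_id, and connect. Illustration of the rdma-cm handshake only.
#include <rdma/rdma_cma.h>
#include <netdb.h>

static bool expect_event(rdma_event_channel* ec, rdma_cm_event_type type) {
  rdma_cm_event* ev = nullptr;
  if (rdma_get_cm_event(ec, &ev)) return false;   // blocks until an event arrives
  bool ok = (ev->event == type);
  rdma_ack_cm_event(ev);
  return ok;
}

// qpa: caller-prepared QP capabilities; its CQ fields may be left null,
// in which case rdma-cm allocates CQs for the QP.
rdma_cm_id* connect_client(const char* host, const char* port, ibv_qp_init_attr* qpa) {
  rdma_event_channel* ec = rdma_create_event_channel();
  rdma_cm_id* id = nullptr;
  rdma_create_id(ec, &id, nullptr, RDMA_PS_TCP);

  addrinfo* ai = nullptr;
  getaddrinfo(host, port, nullptr, &ai);
  rdma_resolve_addr(id, nullptr, ai->ai_addr, 2000 /* ms */);
  freeaddrinfo(ai);
  if (!expect_event(ec, RDMA_CM_EVENT_ADDR_RESOLVED)) return nullptr;

  rdma_resolve_route(id, 2000 /* ms */);
  if (!expect_event(ec, RDMA_CM_EVENT_ROUTE_RESOLVED)) return nullptr;

  rdma_create_qp(id, nullptr, qpa);               // null pd: use the cm_id default PD

  rdma_conn_param param{};
  rdma_connect(id, &param);
  if (!expect_event(ec, RDMA_CM_EVENT_ESTABLISHED)) return nullptr;
  return id;                                      // ready for ibv_post_send/recv
}
```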
Ceph IWARP integration
§ Code link: iwarp enabling code
§ Implementation (a sketch of the shared-CQ polling pattern follows the figure note below)
• Every RDMA connection owns a dedicated QP
and recv/send queue.
• All RDMA connections share a common CQ and
memory pool.
• One CQ polling thread drains the completion
queue.
• epoll is used to notify waiting events.
[Figure: Async Messenger with RDMA – RDMAConnection with a dedicated QP and r/s_queue, RDMAMgr common resources (shared CQ, memory pool), a poll_cq thread on libibverbs, RDMA-CM for connection setup, and event callbacks into the poller pool and dispatch_queue.]
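The implementation bullets map onto a shared-completion-queue pattern: each connection owns its QP, while a single thread polls the common CQ and routes each completion back to the owning connection, which the epoll-based event loop then services. A minimal, hypothetical sketch of that routing (invented names; not the actual Ceph RDMAStack classes) follows.

```cpp
// Hypothetical sketch of the shared-CQ polling pattern described above:
// one thread polls the common completion queue and wakes the owning
// connection via an eventfd that the epoll-based event loop watches.
#include <infiniband/verbs.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <atomic>
#include <cstdint>
#include <unordered_map>

struct Connection {                                  // per-connection state
  int notify_fd = eventfd(0, EFD_NONBLOCK);          // registered with epoll elsewhere
  // ... the dedicated QP and recv/send queues would live here
};

void poll_cq_loop(ibv_cq* shared_cq,
                  std::unordered_map<uint32_t, Connection*>& conns,  // keyed by qp_num
                  std::atomic<bool>& stop) {
  ibv_wc wc[16];
  while (!stop.load()) {
    int n = ibv_poll_cq(shared_cq, 16, wc);          // poll the common CQ
    for (int i = 0; i < n; ++i) {
      auto it = conns.find(wc[i].qp_num);            // route completion to its owner
      if (it == conns.end()) continue;
      uint64_t one = 1;
      // Wake the connection: its notify_fd becomes readable in the epoll loop.
      (void)::write(it->second->notify_fd, &one, sizeof(one));
    }
  }
}
```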
Ceph w/ IWARP performance
§ Test configuration
CPU: SKX platform
Memory: 128 GB
NIC: 10 Gb X722 NIC
Disk distribution: 4x P3700 as OSD drives, 1x Optane as DB drive
Software configuration: CentOS 7, Ceph Luminous (dev)
[Figure: Test topology – one client node running 6x FIO processes against one OSD node hosting a MON and 8x OSD daemons.]
§ Test Methodology
§ QD scaling: 1->128
Ceph w/ IWARP performance
§ Ceph w/ IWARP delivers higher 4K random write performance than TCP/IP.
§ Ceph w/ IWARP generates higher CPU utilization.
§ Ceph w/ IWARP consumes more user-level CPU, while Ceph w/ TCP/IP consumes more system-level
CPU.
[Chart: Ceph Performance Comparison - RDMA vs TCP/IP - 1x OSD per SSD, 4K Random Write – IOPS and CPU utilization vs queue depth (QD=1 to QD=64).]
[Chart: Ceph CPU Comparison - RDMA vs TCP/IP - 4x OSDs per Node, QD=64, 4K Random Write – CPU utilization broken down into usr/sys/iowait/soft.]
Ceph w/ IWARP performance
§ With QD scaling up, the 4K random write IOPS per CPU utilization of Ceph w/ IWARP catches up with
Ceph w/ TCP/IP.
[Chart: Ceph Performance Comparison - RDMA vs TCP/IP - 4x OSDs per node – IOPS per CPU utilization vs queue depth (QD=1 to QD=64).]
Ceph w/ RoCE performance
§ Test configuration
CPU: Broadwell platform, 88x Linux cores
Memory: 128 GB
NIC: 40 Gb Mellanox NIC
Disk distribution: 4x P3700 as OSD drives, 1x Optane as DB drive
Software configuration: Ubuntu 14.04, Ceph Luminous (dev)
[Figure: Test topology – one client node running 6x FIO processes against one OSD node hosting a MON and 8x OSD daemons.]
§ Test methodology
§ QD scaling: 1->128
Ceph w/ RoCE performance on 8x OSDs cluster
§ The performance of Ceph w/ RoCE is ~11% to ~86% higher than TCP/IP.
§ The total CPU utilization of the Ceph w/ RoCE cluster is ~14% higher than TCP/IP.
§ The user-level CPU utilization of the Ceph w/ RoCE cluster is ~13% higher than TCP/IP.
[Chart: Ceph Performance Comparison - RDMA vs TCP/IP - 8x OSDs per node, 4K Random Write – IOPS and CPU utilization vs queue depth (QD=1 to QD=128).]
[Chart: Ceph CPU Comparison - RDMA vs TCP/IP - 8x OSDs per Node, QD=64, 4K Random Write – CPU utilization broken down into usr/sys/iowait/soft.]
Ceph w/ RoCE performance (after tunings) on 16x OSDs cluster
§ With Ceph tunings, the performance of Ceph w/ RoCE is higher than TCP/IP in high-QD workloads.
§ The IOPS per CPU of the Ceph w/ RoCE cluster is higher than the TCP/IP cluster.
§ But it is still lower in low-QD workloads.
§ Tunings:
§ Increase the RDMA completion queue depth.
§ Decrease the Ceph RDMA polling time.
[Chart: Ceph Performance Comparison - RDMA vs TCP/IP - 16x OSDs per node, 4K Random Write – IOPS and CPU utilization vs queue depth (QD=1 to QD=128), with and without tunings.]
[Chart: Ceph Performance Comparison - RDMA vs TCP/IP - 16x OSDs per node – IOPS per CPU utilization vs queue depth, with and without tunings.]
CPU profiling
§ ~10% of the total CPU used by Ceph is consumed by the RDMA polling thread.
§ Both the Ceph RDMA and TCP/IP code paths are based on epoll, but the RDMA polling thread
requires extra CPU cycles (see the sketch below).
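The extra cycles largely come from a dedicated thread spinning in ibv_poll_cq even when no completions are pending. For contrast, the sketch below shows the standard libibverbs completion-channel mechanism, in which the CQ is armed for notification and the channel’s file descriptor can be waited on with epoll like any socket. It is offered only to illustrate the polling-vs-event trade-off, not as a description of how Ceph’s RDMA backend is configured.

```cpp
// Sketch of the event-driven alternative to busy polling: arm the CQ for
// notification and block on the completion channel, whose fd could sit in
// the same epoll set the messenger already uses. Illustrative only.
#include <infiniband/verbs.h>

// Returns the number of completions drained after one CQ notification.
int wait_and_drain_cq(ibv_comp_channel* ch, ibv_cq* cq) {
  ibv_req_notify_cq(cq, 0);                    // arm: next completion raises an event

  ibv_cq* ev_cq = nullptr;
  void* ev_ctx = nullptr;
  if (ibv_get_cq_event(ch, &ev_cq, &ev_ctx))   // blocks; ch->fd is also epoll-able
    return -1;
  ibv_ack_cq_events(ev_cq, 1);

  ibv_req_notify_cq(ev_cq, 0);                 // re-arm before draining to avoid races
  ibv_wc wc[16];
  int total = 0, n = 0;
  while ((n = ibv_poll_cq(ev_cq, 16, wc)) > 0) // drain whatever is ready
    total += n;
  return total;
}
```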
NVMe over Fabrics (NVMe-oF)
§ NVMe is a new specification optimized for NAND
flash and next-generation solid-state storage
technologies.
§ NVMe over Fabrics enables access to remote NVMe
devices over multiple network fabrics.
§ Supported fabrics
§ RDMA – InfiniBand, iWARP, RoCE
§ Fibre Channel
§ TCP/IP
§ NVMe-oF benefits
§ NVMe disaggregation.
§ Delivers performance of remote NVMe on-par with
local NVMe.
FIO performance
§ NVMe-oF added negligible performance overhead for write IO (< 1%).
§ NVMe-oF introduced up to a ~8.7% performance gap for read IO.
[Chart: NVMe-oF Write Performance Evaluation - QD=32, volume size = 40GB – IOPS and average latency (us) for 4K/8K random write and 16K/128K sequential write, local NVMe vs NVMe-oF.]
[Chart: NVMe-oF Read Performance Evaluation - QD=32, volume size = 40GB – IOPS and average latency (us) for 4K/8K random read and 16K/128K sequential read, local NVMe vs NVMe-oF.]
Ceph over NVMf
§ Expectations and questions before the POC.
§ Expectations: according to the benchmarks from the first part, we’re
expecting
§ on-par 4K random write performance with NVMe-oF for Ceph.
§ on-par CPU utilization on the NVMe-oF host node.
§ Questions:
§ How much CPU will be used on the NVMe-oF target node?
§ How does the tail latency (99.0%) behave with NVMe-oF?
§ Does NVMe-oF influence the scale-up and scale-out ability of Ceph?
Benchmark methodology
§ Hardware configuration
§ 2x Storage nodes, 3x OSD nodes, 3x Client nodes.
§ 6x P3700 (800 GB U.2), 3x Optane (375 GB)
§ 30x FIO processes worked on 30x RBD volumes.
§ All 8x servers are Broadwell (BRW) with 128 GB memory and
Mellanox ConnectX-4 NICs.
[Figure: Benchmark topologies – baseline: 3x Ceph OSD nodes, each with local P3700 data drives and an Optane DB drive, serving 3x Ceph client nodes (FIO on RBD volumes) over TCP/IP; comparison: the same client and OSD nodes, but each OSD node attaches its P3700 data drives from 2x Ceph target (storage) nodes over an NVMe-oF RDMA fabric.]
§ Baseline and comparison
§ The baseline setup used local NVMe drives.
§ The comparison setup attaches remote NVMe as the OSD data drives.
§ 6x 2T P3700 are spread across the 2x storage nodes.
§ The OSD nodes attach the 6x P3700 over a RoCE v2 fabric.
§ NVMe-oF CPU offload is set on the target node.
Ceph over NVMe-oF – 4K random write
§ Compared with the traditional setup, running Ceph over NVMf didn’t degrade
4K random write IOPS.
[Chart: 4K Random Write - Ceph over NVMf vs Ceph over local NVMe – IOPS vs queue depth (QD=1 to QD=128).]
Ceph over NVMe-oF – CPU overheads
§ Running Ceph over NVMe-oF adds < 1% CPU overhead on the target node.
§ Running Ceph over NVMe-oF didn’t add extra CPU overhead on the host (OSD) node.
[Charts: CPU utilization on the OSD node and CPU utilization on the target node.]
Ceph over NVMe-oF – tail latency
§ When QD is higher than 16, Ceph over NVMf shows higher tail latency (99%).
§ When QD is lower than 16, Ceph over NVMf is on par with Ceph over local NVMe.
[Chart: Tail Latency Comparison - Ceph over NVMf vs Ceph over local NVMe – 99% latency (ms) vs queue depth (QD=1 to QD=128) for 4K random write.]
Ceph over NVMe-oF – OSD node scaling out
§ Running Ceph over NVMe-oF didn’t limit Ceph OSD node scale-out.
§ For 4K random write/read, the maximum ratio of 3x nodes to 2x nodes is 1.47, close to the ideal
value of 1.5.
[Chart: Scaling Out Testing - Ceph over NVMf, 4K Random Write – IOPS vs queue depth for 2x nodes vs 3x nodes.]
[Chart: Scaling Out Testing - Ceph over NVMf, 4K Random Read – IOPS vs queue depth for 2x nodes vs 3x nodes.]
[Charts: Performance Comparison - Ceph over NVMf – 3x nodes / 2x nodes IOPS ratio vs queue depth.]
Ceph over NVMe-oF – OSD scaling up
§ The OSD scalability per OSD node depends on the Ceph architecture.
§ Running Ceph over NVMe-oF didn’t improve OSD scalability.
[Chart: Scaling Up Testing - Ceph over NVMf, 4K Random Write – IOPS vs queue depth for 24x, 36x and 48x OSDs.]
Summary & Next-step
§ Summary
§ RDMA is critical for future Ceph AFA solutions.
§ Ceph with the RDMA messenger provides up to a ~86% performance advantage over TCP/IP in
low-queue-depth workloads.
§ As a network fabric, RDMA performs well in Ceph NVMe-oF solutions.
§ Running Ceph on NVMe-oF does not appreciably degrade Ceph write performance.
§ Ceph with NVMe-oF brings more flexible provisioning and lower TCO.
§ Next-step
§ Expand the Ceph iWARP cluster scale to 5 or 10 OSD nodes with 5 client nodes.
§ Leverage NVMe-oF with high-density storage nodes for lower TCO.
Legal Disclaimer & Optimization Notice
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to
Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not
guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not
specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference
Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software,
operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and
performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products. For more complete information visit www.intel.com/benchmarks.
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR
OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO
LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS
INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE,
MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Copyright © 2018, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are
trademarks of Intel Corporation in the U.S. and other countries.