© 2018 Mellanox Technologies
Machine Learning on OpenStack and K8s Done Right!
2018
Brain In The Cloud
Erez Cohen, VP CloudX & Artificial Intelligence
Data is Growing Faster Than Ever
An autonomous vehicle generates about 4,000 GB of data per day:
• Sonar: ~10-100 KB/s
• Radar: ~10-100 KB/s
• GPS: ~50 KB/s
• Camera: ~20-40 MB/s
• LIDAR (Light Detection & Ranging): ~10-70 MB/s
Data will grow by a factor of 10 over the next decade, to 160 zettabytes in 2025 (source: IDC)
Faster data processing requires faster interconnect speeds
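As a rough sanity check on the per-vehicle figure, the sketch below converts 4,000 GB per day into an average sustained rate and sums the upper ends of the quoted sensor ranges. It assumes decimal units (1 GB = 1000 MB), a full 24 hours of recording, and exactly one sensor of each type; none of these assumptions comes from the slide.

```python
# Back-of-the-envelope conversion of the per-day figure into a sustained rate.
# Assumptions (not from the slide): decimal units and 24 hours of recording.

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mb_per_s(gb_per_day: float) -> float:
    """Average rate in MB/s needed to produce gb_per_day in one day."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


# Upper end of each per-sensor range quoted on the slide, in MB/s.
# One sensor of each type is assumed for illustration only.
PEAK_SENSOR_RATES_MB_S = {
    "sonar": 0.1,
    "radar": 0.1,
    "gps": 0.05,
    "camera": 40.0,
    "lidar": 70.0,
}

avg = average_rate_mb_per_s(4000)
peak = sum(PEAK_SENSOR_RATES_MB_S.values())
print(f"average: {avg:.1f} MB/s")  # about 46.3 MB/s
print(f"sum of peak sensor rates: {peak:.2f} MB/s")
```

The daily total corresponds to tens of megabytes per second sustained, which is the same order of magnitude as a single camera or LIDAR stream.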
Machine Learning Is Everywhere!
Fraud Detection
What Is Machine Learning
Machine Learning
Machine learning is the subfield of computer science that, according
to Arthur Samuel in 1959, gives "computers the ability to learn without
being explicitly programmed."
Source: https://en.wikipedia.org/wiki/Machine_learning
Deep Learning
• Also known as Deep Neural Network (DNN)
• Subset of Artificial Neural Network (ANN)
Deep Learning
Deep Learning is a subfield of machine learning concerned with
algorithms inspired by the structure and function of the brain called
artificial neural networks
Source: http://machinelearningmastery.com/what-is-deep-learning/
Why Deep Learning And Why Now?
• Deep learning makes it possible to solve difficult problems
  • In some cases, problems that can’t be solved in other ways
• Deep learning is not new
  • 1943: “A logical calculus of the ideas immanent in nervous activity”, McCulloch & Pitts
So why now?
• Infrastructure
  • Recent developments in GPU and network technology make machine learning practical
• Data
  • More data is generated than ever, and it is critical for the training process
• Software
  • A wave of open source machine learning frameworks
Cognitive Toolkit
Deep Learning Demands Highest Performance
TRAINING DATASET
NEW DATA
TRAINING
• Intensive computing (billions of TFLOPS): GPU!
• Ultra-fast networking for scalability: RDMA, GPUDirect, collective acceleration
• Fast, distributed storage
INFERENCING
Images, video, text, speech
Neural Networks Complexity Growth
[Chart: growth in neural network complexity.
Image recognition, 2013-2016 (AlexNet, GoogleNet, ResNet, Inception-V2, Inception-V4, PolyNet): 350X.
Speech recognition, 2014-2017 (DeepSpeech, DeepSpeech-2, DeepSpeech-3): 30X.]
Training Challenges
Training with large data sets and ever-larger networks can take a long time
• In some cases, even weeks
In many cases, training needs to happen frequently
• Model development and tuning
• Real-life use cases may require regular retraining
Accelerate training with a scale-out architecture
• Add workers (nodes) to reduce training time
Two types of parallelism are now popular
• Data parallelism
• Model parallelism
The network is a critical element in accelerating distributed training!
Model and Data Parallelism
[Diagram: model parallelism and data parallelism side by side. Labels: Main Model/Parameter Server/Allreduce; Local Model (one per worker); Mini Batch (one per worker); Data.]
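The data-parallel half of the diagram can be illustrated with a toy example: each worker holds a copy of the model, computes a gradient on its own mini-batch, and the gradients are averaged before every worker applies the same update. This is a minimal single-process sketch, assuming a one-parameter linear model and equally sized shards; `allreduce_mean` is a hypothetical stand-in for a real allreduce or parameter-server exchange, not an API of any framework.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair


def gradient(w: float, batch: List[Sample]) -> float:
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def allreduce_mean(values: List[float]) -> float:
    """Stand-in for an allreduce: every worker ends up with the average."""
    return sum(values) / len(values)


def data_parallel_step(w: float, shards: List[List[Sample]], lr: float) -> float:
    """One synchronous SGD step across all workers."""
    # On a real cluster these gradients are computed concurrently, one per worker.
    local_grads = [gradient(w, shard) for shard in shards]
    # The averaged gradient is what crosses the network on every step.
    return w - lr * allreduce_mean(local_grads)


# Toy data following y = 2x, split evenly across two workers.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(50):
    w = data_parallel_step(w, shards, lr=0.05)
print(f"learned w = {w:.6f}")  # converges toward 2.0
```

With equally sized shards, the averaged gradient equals the gradient over the full batch, which is why adding workers shortens each step without changing the result, provided the exchange itself is fast.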
Accelerates Distributed Training
• Data parallelism communication pattern
  • Gradient updates to parameter servers or among workers
  • Model parameter distribution among workers
• Frequent: every training step, due to the sequential nature of SGD
• High bandwidth is needed as models become larger and larger
  • The number of parameters keeps increasing
• Usually characterized by bursts on the network, because workers are synchronized
RDMA and GPUDirect Accelerate Distributed Training
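To get a feel for why bandwidth matters, the sketch below estimates per-step gradient traffic. It assumes a ring allreduce, in which each worker sends roughly 2(N-1)/N times the model size per step; the slides mention allreduce and parameter servers but do not name a specific algorithm, so this choice is an assumption. The 25-million-parameter model, 8 workers, and 32-bit gradients are hypothetical, and latency and protocol overhead are ignored.

```python
def ring_allreduce_bytes_per_worker(
    num_params: int, num_workers: int, bytes_per_param: int = 4
) -> float:
    """Bytes each worker sends per step in a ring allreduce.

    Uses the standard 2 * (N - 1) / N * model_size estimate.
    """
    if num_workers < 2:
        return 0.0  # a single worker has nothing to exchange
    model_bytes = num_params * bytes_per_param
    return 2 * (num_workers - 1) / num_workers * model_bytes


def transfer_seconds(num_bytes: float, link_gbit_per_s: float) -> float:
    """Ideal wire time, ignoring latency and protocol overhead."""
    return num_bytes * 8 / (link_gbit_per_s * 1e9)


# Hypothetical workload: 25M parameters, 8 workers, fp32 gradients.
sent = ring_allreduce_bytes_per_worker(25_000_000, 8)
for gbit in (10, 100):
    ms = transfer_seconds(sent, gbit) * 1000
    print(f"{gbit:>3} Gbit/s: {sent / 1e6:.0f} MB per step, about {ms:.0f} ms")
```

Under these assumptions each worker moves 175 MB per step, which takes about 140 ms at 10 Gbit/s and about 14 ms at 100 Gbit/s. Because the exchange repeats on every step, the difference accumulates over a training run.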
Machine Learning on the Cloud
• GPU provisioning to VMs
• Advanced networking
• Advanced storage
GPU Provisioning with OpenStack
• Today: PCI passthrough or Ironic
  • PCI passthrough requires hardware support and has some caveats…
  • https://wiki.openstack.org/wiki/GPUs
  • Good performance also requires pinning and NUMA topology support to be configured
• Tomorrow: vGPU
  • mdev framework introduced in Linux 4.10 by Red Hat, NVIDIA, Intel
~$ openstack flavor show 56cd053c-b6a2-4103-b870-a83dd5d27ec1
+----------------------------+--------------------------------------------+
| Field                      | Value                                      |
+----------------------------+--------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                      |
| OS-FLV-EXT-DATA:ephemeral  | 1000                                       |
| disk                       | 30                                         |
| id                         | 56cd053c-b6a2-4103-b870-a83dd5d27ec1       |
| name                       | mon.m3.c24r120.2gpu-p100.mlx               |
| os-flavor-access:is_public | False                                      |
| properties                 | pci_passthrough:alias='P100:2,MlxCX4-VF:1' |
| ram                        | 122880                                     |
| rxtx_factor                | 1.0                                        |
| swap                       |                                            |
| vcpus                      | 24                                         |
+----------------------------+--------------------------------------------+
~$ openstack server list --all-projects --project d99… --flavor 56c…
+--------------------------------------+------------+--------+----------------------------------+
| ID                                   | Name       | Status | Networks                         |
+--------------------------------------+------------+--------+----------------------------------+
| 1d77bf12-0099-4580-bf6f-36c42225f2c0 | massive003 | ACTIVE | monash-03-internal=10.16.201.20  |
+--------------------------------------+------------+--------+----------------------------------+
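The `properties` row is where the flavor asks for passthrough devices: `pci_passthrough:alias` holds a comma-separated list of `ALIAS:COUNT` pairs, here two devices under the alias `P100` and one under `MlxCX4-VF`. The alias names are site-specific labels chosen by the operator. The helper below is a small illustrative parser for that string, not part of the OpenStack client or Nova.

```python
from typing import Dict


def parse_pci_alias_spec(spec: str) -> Dict[str, int]:
    """Parse 'ALIAS:COUNT,ALIAS:COUNT' into a {alias: count} mapping."""
    requests: Dict[str, int] = {}
    for item in spec.split(","):
        # Split on the last colon so the count is always the final field.
        alias, _, count = item.strip().rpartition(":")
        if not alias or not count.isdigit():
            raise ValueError(f"malformed alias request: {item!r}")
        requests[alias] = int(count)
    return requests


print(parse_pci_alias_spec("P100:2,MlxCX4-VF:1"))
# {'P100': 2, 'MlxCX4-VF': 1}
```

Read this way, an instance of this flavor gets two GPUs and one ConnectX-4 virtual function passed through alongside its 24 vCPUs and 120 GB of RAM.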
What Is RDMA?
• Remote Direct Memory Access (RDMA)
  • Advanced transport protocol (same layer as TCP and UDP)
• Main features
  • Remote memory read/write semantics in addition to send/receive
  • Kernel bypass / direct user space access
  • Full hardware offload
  • Secure, channel-based I/O
• Application advantages
  • Low latency
  • High bandwidth
  • Low CPU consumption
• RoCE: RDMA over Converged Ethernet
  • Available for all Ethernet speeds, 10-100G
• Verbs: RDMA software interface (equivalent to sockets)
GPUDirect™ RDMA Technology
Enable Advanced Networking for VMs & Containers
Single Root I/O Virtualization (SR-IOV)
• PCIe device presents multiple instances to the OS/hypervisor
• Enables application direct access
• Bare-metal performance for VMs
• Reduces CPU overhead
• Enables many advanced NIC features (e.g. DPDK, RDMA, ASAP2)
[Diagram: para-virtualized vs. SR-IOV. Para-virtualized: VMs reach the NIC through the hypervisor vSwitch. SR-IOV: VMs attach to Virtual Functions (VFs) on the SR-IOV NIC, which also exposes a Physical Function (PF) and an eSwitch.]
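On a Linux host, the number of virtual functions a NIC can expose and the number currently enabled are visible through the `sriov_totalvfs` and `sriov_numvfs` sysfs attributes of the PCI device. The sketch below reads both for a given interface. The interface name and the `sysfs_root` parameter (included so the function can be pointed at a test directory) are illustrative assumptions.

```python
from pathlib import Path
from typing import Dict


def sriov_vf_counts(iface: str, sysfs_root: str = "/sys") -> Dict[str, int]:
    """Return the total and currently enabled VF counts for an interface.

    Raises FileNotFoundError if the device does not expose SR-IOV attributes.
    """
    device_dir = Path(sysfs_root) / "class" / "net" / iface / "device"
    return {
        "total": int((device_dir / "sriov_totalvfs").read_text().strip()),
        "enabled": int((device_dir / "sriov_numvfs").read_text().strip()),
    }
```

For example, `sriov_vf_counts("eth0")` on a host whose `eth0` is an SR-IOV capable port would report how many VFs are available to hand to VMs or containers; the name `eth0` is only a placeholder.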
ASAP2 Direct: Full OVS Offload
• Enable the SR-IOV data path with the OVS control plane
  • In other words, enable support for most SDN controllers with an SR-IOV data plane
• Use Open vSwitch as the management interface and offload the OVS data plane to the Mellanox embedded switch (eSwitch) using ASAP2 Direct
• Allow RDMA, GPUDirect and other advanced network services directly from a VM or container
[Diagram: VMs attach through SR-IOV VFs to the ConnectX-5 eSwitch, which carries the data path; OVS runs in the hypervisor and connects through the PF.]
Comprehensive OpenStack Integration
• Integrated with major OpenStack distributions
• In-box Neutron ML2 support for mixed environments (VXLAN, PV, SR-IOV)
• Ethernet Neutron: data plane acceleration and isolation
• iSER and NVMf: accelerating storage access
OpenStack plugins create seamless integration, control, and management
Container Networking Acceleration
Enable RoCE and DPDK networking technologies to accelerate
cloud-native apps and workloads
Containers and Kubernetes Integration
[Diagram: a Mellanox ConnectX adapter card with SR-IOV enabled exposes a PF and VF-1, VF-2, VF-3. Kubernetes/Docker uses the SR-IOV CNI and the SR-IOV/RDMA device plugin to give each container its own VF, and each container runs an RDMA application over verbs:
• Container1: ibdev=mlx5_1, netdev=eth0, net_ns=1
• Container2: ibdev=mlx5_2, netdev=eth1, net_ns=2
• Container3: ibdev=mlx5_3, netdev=eth2, net_ns=3]
• Every container/pod has an IB device (mlx5_1, mlx5_2, mlx5_3)
• Isolation is at the driver level
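From the workload side, a device plugin advertises its devices as an extended resource that a pod requests under `resources.limits`. The sketch below builds such a pod manifest as a plain dictionary. The resource name, pod name, and image are hypothetical placeholders, since the actual resource name depends on how the device plugin is configured. Adding the `IPC_LOCK` capability is a common requirement for RDMA applications that pin memory, included here as an assumption rather than something the slides state.

```python
import json
from typing import Any, Dict


def rdma_pod_manifest(
    name: str, image: str, resource_name: str, count: int = 1
) -> Dict[str, Any]:
    """Build a Pod manifest that requests `count` devices of an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested through limits.
                    "resources": {"limits": {resource_name: count}},
                    # Assumption: the RDMA application needs to lock memory.
                    "securityContext": {"capabilities": {"add": ["IPC_LOCK"]}},
                }
            ]
        },
    }


# Hypothetical names; substitute whatever the cluster's device plugin advertises.
manifest = rdma_pod_manifest("trainer", "example/trainer:latest", "example.com/rdma_vf")
print(json.dumps(manifest, indent=2))
```

The JSON output can be passed to `kubectl apply -f -`; the scheduler then places the pod only on a node that has a free device of that resource type.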
All Major Machine Learning Frameworks Support RDMA
TensorFlow: several implementations upstream
• Native (verbs): https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/verbs
• MPI, Horovod: donated by Uber, among others
Caffe2 / PyTorch: over MPI or the Gloo library
Microsoft Cognitive Toolkit: native support
NVIDIA NCCL2: native support in NCCL
TensorFlow with Mellanox RDMA Test Report
• System configuration
  • 8 x x86 servers
  • 4 x NVIDIA P100 per server
  • Mellanox 100G RDMA network
  • NVMe drive per server
  • TensorFlow v1.4
RDMA vs. TCP: up to 50% better performance
Advanced RDMA vs. TCP: up to 173% better performance
Reference Deployment Guide
NVIDIA® DGX-1™ Deep Learning Server
8 x NVIDIA® Tesla® P/V100 GPUs
5.3 TFLOPS
16nm FinFET
NVLink
4 x ConnectX®-4 EDR 100G InfiniBand adapters
Mellanox Enables Most Efficient Machine Learning Platforms
Highest Performance, Scalability and Productivity for Deep Learning
Thank You

A stateful application walks into a Kubernetes bar - Arthur Berezin, JovianX ...
 
The story of how KubeMQ was born - Oz Golan, KubeMQ - Cloud Native Day Tel Av...
The story of how KubeMQ was born - Oz Golan, KubeMQ - Cloud Native Day Tel Av...The story of how KubeMQ was born - Oz Golan, KubeMQ - Cloud Native Day Tel Av...
The story of how KubeMQ was born - Oz Golan, KubeMQ - Cloud Native Day Tel Av...
 
I want it all: go hybrid - Orit Yaron, Outbrain - Cloud Native Day Tel Aviv 2018
I want it all: go hybrid - Orit Yaron, Outbrain - Cloud Native Day Tel Aviv 2018I want it all: go hybrid - Orit Yaron, Outbrain - Cloud Native Day Tel Aviv 2018
I want it all: go hybrid - Orit Yaron, Outbrain - Cloud Native Day Tel Aviv 2018
 
Keeping I.T. Real - Aaron Wolf, Mathematics and computer programming teacher,...
Keeping I.T. Real - Aaron Wolf, Mathematics and computer programming teacher,...Keeping I.T. Real - Aaron Wolf, Mathematics and computer programming teacher,...
Keeping I.T. Real - Aaron Wolf, Mathematics and computer programming teacher,...
 

Brain in the Cloud: Machine Learning on OpenStack & Kubernetes Done Right - Erez Cohen, Mellanox - Cloud Native Day Tel Aviv 2018

  • 1. Brain In The Cloud: Machine Learning On OpenStack and Kubernetes Done Right! Erez Cohen, VP CloudX & Artificial Intelligence, 2018. © 2018 Mellanox Technologies.
  • 2. Data Is Growing Faster Than Ever. An autonomous vehicle generates 4,000 GB per day: sonar ~10-100 KB/s; radar ~10-100 KB/s; GPS ~50 KB/s; camera ~20-40 MB/s; lidar (light detection and ranging) ~10-70 MB/s. Data will grow by a factor of 10 over the next decade, to 160 zettabytes in 2025 (source: IDC). Faster data processing requires faster interconnect speeds.
  • 3. Machine Learning Is Everywhere! Fraud detection.
  • 4. What Is Machine Learning? Machine learning is the subfield of computer science that, according to Arthur Samuel in 1959, gives "computers the ability to learn without being explicitly programmed." Source: https://en.wikipedia.org/wiki/Machine_learning
  • 5. Deep Learning. Also known as deep neural networks (DNN); a subset of artificial neural networks (ANN). Deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks. Source: http://machinelearningmastery.com/what-is-deep-learning/
  • 6. Why Deep Learning, and Why Now? Deep learning makes it possible to solve difficult problems, in some cases problems that cannot be solved in other ways. Deep learning is not new: 1943, "A logical calculus of the ideas immanent in nervous activity", McCulloch & Pitts. So why now? Infrastructure: recent developments in GPU and network technology make machine learning practical. Data: more data is generated than ever, which is critical for the training process. Software: a wave of open source machine learning frameworks (e.g. Cognitive Toolkit).
  • 7. Deep Learning Demands Highest Performance. Training (on the training dataset): intensive computing (billions of TFLOPS), hence GPUs; ultra-fast networking for scalability (RDMA, GPUDirect, collective acceleration); fast, distributed storage. Inferencing (on new data): images, video, text, speech.
  • 8. Neural Network Complexity Growth. Image recognition, 2013-2016: AlexNet, GoogLeNet, ResNet, Inception-V2, Inception-V4, PolyNet; 350X. Speech recognition, 2014-2017: DeepSpeech, DeepSpeech-2, DeepSpeech-3; 30X.
  • 9. Training Challenges. Training with large data sets and growing networks can take a long time, in some cases even weeks. In many cases training needs to happen frequently: during model development and tuning, and because real-life use cases may require regular retraining. Accelerate training with a scale-out architecture: add workers (nodes) to reduce training time. Types of parallelism that are now popular: data parallelism and model parallelism. The network is a critical element in accelerating distributed training!
  • 10. Model and Data Parallelism. [Diagram; labels: Data; Main Model / Parameter Server / Allreduce; Local Model; Mini Batch; Model Parallelism; Data Parallelism.]
  • 11. RDMA and GPUDirect Accelerate Distributed Training. Data parallelism communication pattern: gradient updates to parameter servers or among workers; distribution of model parameters among workers. Frequent: every training step, due to the sequential nature of SGD. High bandwidth is needed as models become larger and the number of parameters increases. Usually characterized by bursts on the network, because workers are synchronized.
  • 12. Machine Learning on the Cloud. GPU provisioning to VMs; advanced networking; advanced storage.
  • 13. GPU Provisioning with OpenStack. Today: PCI passthrough or Ironic. PCI passthrough requires hardware support and has some caveats (https://wiki.openstack.org/wiki/GPUs); good performance also requires pinning and NUMA topology support to be configured. Tomorrow: vGPU, using the mdev framework introduced in Linux 4.10 by Red Hat, NVIDIA, Intel. Example flavor and server listing:

      ~$ openstack flavor show 56cd053c-b6a2-4103-b870-a83dd5d27ec1
      +----------------------------+--------------------------------------------+
      | Field                      | Value                                      |
      +----------------------------+--------------------------------------------+
      | OS-FLV-DISABLED:disabled   | False                                      |
      | OS-FLV-EXT-DATA:ephemeral  | 1000                                       |
      | disk                       | 30                                         |
      | id                         | 56cd053c-b6a2-4103-b870-a83dd5d27ec1       |
      | name                       | mon.m3.c24r120.2gpu-p100.mlx               |
      | os-flavor-access:is_public | False                                      |
      | properties                 | pci_passthrough:alias='P100:2,MlxCX4-VF:1' |
      | ram                        | 122880                                     |
      | rxtx_factor                | 1.0                                        |
      | swap                       |                                            |
      | vcpus                      | 24                                         |
      +----------------------------+--------------------------------------------+

      ~$ openstack server list --all-projects --project d99… --flavor 56c…
      +--------------------------------------+------------+--------+----------------------------------+
      | ID                                   | Name       | Status | Networks                         |
      +--------------------------------------+------------+--------+----------------------------------+
      | 1d77bf12-0099-4580-bf6f-36c42225f2c0 | massive003 | ACTIVE | monash-03-internal=10.16.201.20  |
      +--------------------------------------+------------+--------+----------------------------------+
  • 14. What Is RDMA? Remote Direct Memory Access (RDMA): an advanced transport protocol (same layer as TCP and UDP). Main features: remote memory read/write semantics in addition to send/receive; kernel bypass / direct user-space access; full hardware offload; secure, channel-based I/O. Application advantages: low latency; high bandwidth; low CPU consumption. RoCE (RDMA over Converged Ethernet): available for all Ethernet speeds, 10-100G. Verbs: the RDMA software interface (equivalent to sockets).
  • 15. GPUDirect™ RDMA Technology.
  • 16. Enable Advanced Networking for VMs & Containers. Single Root I/O Virtualization (SR-IOV): a PCIe device presents multiple instances to the OS/hypervisor; enables direct application access; bare-metal performance for VMs; reduces CPU overhead; enables many advanced NIC features (e.g. DPDK, RDMA, ASAP2). [Diagram; labels: Para-Virtualized (NIC, Hypervisor, vSwitch, VM, VM) versus SR-IOV (NIC, eSwitch, Hypervisor, VM, VM, Physical Function (PF), Virtual Function (VF)).]
  • 17. ASAP2 Direct: Full OVS Offload. Enables an SR-IOV data path with an OVS control plane; in other words, enables support for most SDN controllers with an SR-IOV data plane. Uses Open vSwitch as the management interface and offloads the OVS data plane to the Mellanox embedded switch (eSwitch) using ASAP2 Direct. Allows RDMA, GPUDirect and other advanced network services directly from a VM or container. [Diagram; labels: VM, VM, Hypervisor, OVS, SR-IOV VF, SR-IOV VF, PF, Data Path, ConnectX-5 eSwitch.]
  • 18. Comprehensive OpenStack Integration. Integrated with major OpenStack distributions. In-box Neutron ML2 support for mixed environments (VXLAN, PV, SR-IOV) on Ethernet. Neutron: data plane acceleration and isolation. iSER and NVMf: accelerating storage access. OpenStack plugins create seamless integration, control, and management.
  • 19. Container Networking Acceleration. Enable RoCE and DPDK networking technologies to accelerate cloud-native apps and workloads.
  • 20. Containers and Kubernetes Integration. Every container/pod has its own IB device (mlx5_1, mlx5_2, mlx5_3); isolation is at the driver level. [Diagram: a Mellanox ConnectX adapter card with SR-IOV enabled exposes a PF and VF-1, VF-2, VF-3; the SR-IOV CNI and the SR-IOV/RDMA device plugin sit under Kubernetes/Docker; Container1 (ibdev=mlx5_1, netdev=eth0, net_ns=1), Container2 (ibdev=mlx5_2, netdev=eth1, net_ns=2), Container3 (ibdev=mlx5_3, netdev=eth2, net_ns=3); each container runs an RDMA application over verbs.]
  • 21. All Major Machine Learning Frameworks Support RDMA. TensorFlow: several implementations upstream: native (verbs), https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/verbs; MPI and Horovod, donated by Uber among others. Caffe2 / PyTorch: over MPI or the Gloo library. Microsoft Cognitive Toolkit: native support. NVIDIA NCCL2: native support in NCCL.
  • 22. TensorFlow with Mellanox RDMA: Test Report. System configuration: 8 x x86 servers; 4 x NVIDIA P100 per server; Mellanox 100G RDMA network; NVMe driver per server; TensorFlow v1.4. Results: RDMA vs. TCP, up to 50% better performance; advanced RDMA vs. TCP, up to 173% better performance. Reference Deployment Guide.
  • 23. NVIDIA® DGX-1™ Deep Learning Server. 8 x NVIDIA® Tesla® P/V100 GPUs; 5.3 TFLOPS; 16nm FinFET; NVLink; 4 x ConnectX®-4 EDR 100G InfiniBand adapters.
  • 24. Mellanox Enables Most Efficient Machine Learning Platforms. Highest performance, scalability and productivity for deep learning.
  • 25. Thank You.
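
As a rough illustration of the data-parallel pattern described in slides 9 to 11 (each worker computes gradients on its own mini-batch, the gradients are averaged across workers at every step, and every worker then applies the same update), here is a minimal pure-Python sketch. It is not taken from the talk: the toy linear model, the function names `local_gradient`, `allreduce_mean` and `train`, and all hyperparameters are assumptions chosen for illustration, and the "allreduce" is an in-process average rather than a real RDMA, MPI or NCCL collective.

```python
# Sketch of synchronous data-parallel SGD with an allreduce-style average.
# Hypothetical toy example: "workers" are list entries, no real networking.

def local_gradient(w, b, batch):
    """Gradient of the mean squared error for y ~ w*x + b on one mini-batch."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return [gw, gb]


def allreduce_mean(grads_per_worker):
    """Element-wise average across workers; all workers see the same result."""
    k = len(grads_per_worker)
    size = len(grads_per_worker[0])
    return [sum(g[i] for g in grads_per_worker) / k for i in range(size)]


def train(data, num_workers=4, steps=2000, lr=0.1):
    # Shard the data so each worker holds an equally sized mini-batch.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w, b = 0.0, 0.0  # every worker keeps an identical copy of the model
    for _ in range(steps):
        # In a real cluster these gradients are computed in parallel.
        grads = [local_gradient(w, b, shard) for shard in shards]
        # This exchange happens every step and is the bursty network traffic.
        gw, gb = allreduce_mean(grads)
        w -= lr * gw
        b -= lr * gb
    return w, b


# Synthetic data drawn from y = 3x + 1 (an assumption for the demo).
data = [(i / 10.0, 3.0 * (i / 10.0) + 1.0) for i in range(20)]
w, b = train(data)
print(f"w={w:.4f} b={b:.4f}")
```

In a real deployment the `allreduce_mean` step is the per-step network exchange, which is why the slides stress bandwidth and synchronization; a framework collective (for example Horovod or NCCL, both named in slide 21) would take its place.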