Rich Graham
February 2016, HPCAC Stanford Conference
Interconnect Your Future
© 2015 Mellanox Technologies 2
The Ever Growing Demand for Higher Performance
[Figure: Performance Development timeline, 2000 / 2005 / 2010 / 2015 / 2020, spanning Terascale, Petascale, and Exascale. Milestone: “Roadrunner”, 1st. Transitions: SMP to Clusters, Single-Core to Many-Core, Co-Design of Hardware (HW), Software (SW), and Application (APP).]
The Interconnect is the Enabling Technology
Co-Design Architecture to Enable Exascale Performance
CPU-Centric
§ Limited to Main CPU Usage
§ Results in Performance Limitation
Co-Design
§ Creating Synergies
§ Enables Higher Performance and Scale
[Figure labels: Software; In-CPU Computing; In-Network Computing; In-Storage Computing]
The Intelligence is Moving to the Interconnect
[Figure: CPU and Interconnect, Past versus Future]
Breaking the Application Latency Wall
§ Today: Network device latencies are on the order of 100 nanoseconds
§ Challenge: Enabling the next order of magnitude improvement in application performance
§ Solution: Creating synergies between software and hardware – intelligent interconnect
Intelligent Interconnect Paves the Road to Exascale Performance
Era                  Network              Communication Framework
10 years ago         ~10 microseconds     ~100 microseconds
Today                ~0.1 microsecond     ~10 microseconds
Future (Co-Design)   ~0.05 microsecond    ~1 microsecond
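The latency budget above can be tallied in a few lines of Python. This is only a sketch: it assumes end-to-end latency is the plain sum of the network and communication-framework figures read off the slide, and the names `LATENCY_US`, `total_us`, and `framework_share` are made up for illustration.

```python
# Approximate latencies in microseconds, as read off the slide.
LATENCY_US = {
    "10 years ago": {"network": 10.0, "framework": 100.0},
    "today": {"network": 0.1, "framework": 10.0},
    "future": {"network": 0.05, "framework": 1.0},
}


def total_us(era):
    """Sum of network and framework latency for one era (simplifying assumption)."""
    parts = LATENCY_US[era]
    return parts["network"] + parts["framework"]


def framework_share(era):
    """Fraction of the summed latency spent in the communication framework."""
    return LATENCY_US[era]["framework"] / total_us(era)


for era in LATENCY_US:
    print(f"{era}: total ~{total_us(era):.2f} us, "
          f"framework share {framework_share(era):.0%}")
```

Under these assumptions the communication framework, not the network device, accounts for nearly all of the latency in every era, which is the point the slide makes about moving work into the interconnect.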
Co-Design: Offloaded Technologies Target Application Characteristics
Programmability
RDMA GPUDirect Virtualization
Backward and Future Compatibility
Direct Communication
Applications (Innovations, Scalability, Performance)
Software-Defined Network (SDN)
Co-Design Requires Intelligent Interconnect
Offloaded Technologies: Intelligent Interconnect
The Road to Exascale – Co-Design System Architecture
[Figure: co-design across the system. Components: CPU, GPU, FPGA, HCA, Switch. Labels: In-CPU Computing, In-GPU Computing, In-FPGA Computing, In-Network Computing (shown twice).]
Introducing Switch-IB 2 World’s First Smart Switch
§ The world's fastest switch, with <90 nanosecond latency
§ 36 ports, 100Gb/s per port, 7.2Tb/s throughput, 7.02 billion messages/sec
§ Adaptive Routing, Congestion control, support for multiple topologies
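The headline figures can be cross-checked with simple arithmetic. The sketch below assumes the 7.2Tb/s throughput counts both directions of every port, which the slide does not state; the function name `aggregate_tbps` and the constants are illustrative only.

```python
PORTS = 36
PORT_RATE_GBPS = 100          # per port, per direction
MESSAGES_PER_SEC = 7.02e9     # whole-switch message rate from the slide


def aggregate_tbps(ports, rate_gbps, directions=2):
    """Aggregate switch throughput in Tb/s, counting `directions` per port."""
    return ports * rate_gbps * directions / 1000


# Message rate averaged over the ports (assumes an even spread across ports).
messages_per_port = MESSAGES_PER_SEC / PORTS

print(f"aggregate: {aggregate_tbps(PORTS, PORT_RATE_GBPS)} Tb/s")
print(f"per port: {messages_per_port / 1e6:.0f} million messages/sec")
```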
World’s First Smart Switch
Built for Scalable Compute and Storage Infrastructures
10X Higher Performance with The New Switch SHArP Technology
SHArP (Scalable Hierarchical Aggregation Protocol) Technology
Delivering 10X Performance Improvement for MPI and SHMEM/PGAS Applications
Switch-IB 2 Enables the Switch Network to Operate as a Co-Processor
SHArP Enables Switch-IB 2 to Manage and Execute MPI Operations in the Network
Scalable Hierarchical Aggregation Protocol
§ Reliable Scalable General Purpose Primitive, Applicable to Multiple Use-cases
•  In-network Tree based aggregation mechanism
•  Large number of groups
•  Multiple simultaneous outstanding operations
Accelerating HPC applications
§ Scalable High Performance Collective Offload
•  Barrier, Reduce, All-Reduce, Broadcast
•  Sum, Min, Max, Min-loc, max-loc, OR, XOR, AND
•  Integer and Floating-Point, 32 / 64 bit
§ Significantly reduce MPI collective runtime
§ Increase CPU availability and efficiency
§ Enable communication and computation overlap
Accelerating MapReduce Applications
§ Prevent the Incast Traffic Pattern
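The idea of tree-based aggregation can be illustrated with a toy model. This is not the SHArP protocol or any Mellanox API: `tree_allreduce` is a hypothetical helper that simply combines values level by level, as if each switch merged the partial results of up to `radix` children, and then hands the root's result back to every host.

```python
import operator
from functools import reduce


def tree_allreduce(values, op=operator.add, radix=2):
    """Reduce `values` up a tree of the given radix, then broadcast the result.

    Returns (per-host results, number of aggregation levels).
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    values = list(values)
    if not values:
        raise ValueError("values must be non-empty")
    level = values
    levels = 0
    while len(level) > 1:
        # Each "switch" combines the partial results of up to `radix` children.
        level = [reduce(op, level[i:i + radix])
                 for i in range(0, len(level), radix)]
        levels += 1
    # The root holds the final result; every host receives a copy.
    return [level[0]] * len(values), levels


results, levels = tree_allreduce([1, 2, 3, 4, 5, 6, 7, 8])
print(results, levels)
```

In this model the number of aggregation levels grows with the logarithm of the host count, and the hosts themselves do no combining work; any of the listed operations (`max`, `min`, `operator.xor`, and so on) can be passed as `op`.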
SHArP Performance Advantage – MiniFE Details
§  MiniFE is a Finite Element mini-application
•  Implements kernels that represent implicit finite-element applications
10X to 25X Performance Improvement
AllReduce MPI Collective
Number of Nodes   CPU-Based Latency (usec)   SHArP Latency (usec)   Ratio
32                41.7                       4.24                   9.9
64                49.08                      4.63                   10.6
128               57.67                      4.76                   12.1
256               67.76                      4.87                   13.9
512               79.62                      5.09                   15.6
1024              93.55                      5.58                   16.8
2048              109.92                     5.63                   19.5
4096              129.16                     5.73                   22.5
8192              151.76                     5.94                   25.5
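The scaling trend in the table can be recomputed directly from the two latency columns. The sketch below copies the table values as-is; the helper names `speedups` and `growth` are illustrative, and the recomputed ratios may differ from the printed Ratio column in the last digit, presumably because the printed latencies are rounded.

```python
# (nodes, CPU-based latency in usec, SHArP latency in usec), copied from the table.
ROWS = [
    (32, 41.7, 4.24),
    (64, 49.08, 4.63),
    (128, 57.67, 4.76),
    (256, 67.76, 4.87),
    (512, 79.62, 5.09),
    (1024, 93.55, 5.58),
    (2048, 109.92, 5.63),
    (4096, 129.16, 5.73),
    (8192, 151.76, 5.94),
]


def speedups(rows):
    """Ratio of CPU-based to SHArP latency for each node count."""
    return [(nodes, cpu / sharp) for nodes, cpu, sharp in rows]


def growth(rows, column):
    """Latency increase from the smallest to the largest node count."""
    return rows[-1][column] / rows[0][column]


for nodes, ratio in speedups(ROWS):
    print(f"{nodes:>5} nodes: {ratio:.1f}x")
print(f"CPU-based latency grows {growth(ROWS, 1):.2f}x from 32 to 8192 nodes")
print(f"SHArP latency grows {growth(ROWS, 2):.2f}x over the same range")
```

The widening ratio comes from the CPU-based latency growing several times over as nodes are added while the SHArP latency stays within a narrow band.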
SHArP Performance – First Results (Partial Implementation)
3.5X Performance Improvement on 64 Nodes
The Intelligence is Moving to the Interconnect
Communication Frameworks (MPI, SHMEM/PGAS)
The Only Approach to Deliver 10X Performance Improvements
[Figure: Applications, Communication Frameworks, and Transport, with interconnect offloads: RDMA, SR-IOV, Collectives, Peer-Direct, GPUDirect, More…; MPI / SHMEM Offloads marked Q1’16 and Q3’16]
Introducing ConnectX-4 Lx Programmable Adapter
Scalable, Efficient, High-Performance and Flexible Solution
Security
Cloud/Virtualization
Storage
High Performance Computing
Precision Time Synchronization
Networking + FPGA
Mellanox Acceleration Engines and FPGA Programmability on One Adapter
InfiniBand Router – In Progress
§ Isolation between InfiniBand subnets
§ Simple connectivity between different topologies
•  Enable sharing a common storage network by multiple disconnected subnets
§ Support 2^128 nodes (unlimited system size)
SB7780
InfiniBand Router Details
§ Router implements GID to LID mapping
§ SM allocates Alias GID to HCA
§ Address resolution
•  IP based applications
-  Name to IP (standard), IP to GID using new API
•  Pure IB applications
-  Upon LID assignment change, GID DNS is updated
[Figure: InfiniBand router connecting three IB subnets. Labels: GID DNS; RMA 1; router with RTM and one RPA per subnet; in each subnet, an HCA with a GID DNA Agent and an SM with SRPM and SRTM.]
RTM: Routing Table Manager
SRTM: Subnet Routing Table Manager
RPA: Router Port Agent
SRPM: Subnet Router Port Manager
GID DNS: IP to GID resolution
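The resolution chain described on this slide (IP to GID, then GID to LID, with the directory updated when a LID changes) can be sketched as a pair of lookup tables. Everything here is hypothetical: `GidDirectory`, its methods, and `split_gid` are illustrative names, not the "new API" the slide mentions, and the addresses are made-up examples. The only facts relied on are that a GID is a 128-bit value written in IPv6 notation and that it splits into a 64-bit subnet prefix and a 64-bit GUID.

```python
import ipaddress


class GidDirectory:
    """Toy stand-in for a GID DNS plus a router's GID-to-LID table (hypothetical)."""

    def __init__(self):
        self._ip_to_gid = {}
        self._gid_to_lid = {}

    def register(self, ip, gid, lid):
        """Record an endpoint's IP address, GID, and currently assigned LID."""
        gid = ipaddress.IPv6Address(gid)
        self._ip_to_gid[ipaddress.ip_address(ip)] = gid
        self._gid_to_lid[gid] = lid

    def update_lid(self, gid, lid):
        """Record a LID reassignment; the GID itself stays stable."""
        gid = ipaddress.IPv6Address(gid)
        if gid not in self._gid_to_lid:
            raise KeyError(f"unknown GID {gid}")
        self._gid_to_lid[gid] = lid

    def resolve(self, ip):
        """IP to GID, then GID to LID."""
        gid = self._ip_to_gid[ipaddress.ip_address(ip)]
        return gid, self._gid_to_lid[gid]


def split_gid(gid):
    """Split a 128-bit GID into its 64-bit subnet prefix and 64-bit GUID."""
    value = int(ipaddress.IPv6Address(gid))
    return value >> 64, value & ((1 << 64) - 1)


directory = GidDirectory()
directory.register("10.0.0.5", "fe80::2:c903:0:1", lid=7)
directory.update_lid("fe80::2:c903:0:1", lid=12)   # LID assignment changed
print(directory.resolve("10.0.0.5"))
```

Keeping the GID as the stable key means an application that resolved an IP address once is unaffected by later LID reassignments inside the destination subnet.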
Multi-Host Socket Direct – Low Latency Socket Communication
§ Each CPU with direct network access
§  QPI avoidance for I/O – improve performance
§  Enables GPU / peer direct on both sockets
§ Solution is transparent to software
[Figure: two pairs of CPUs, with a QPI link shown between sockets]
Multi-Host Socket Direct Performance
50% Lower CPU Utilization
20% Lower Latency
Multi Host Evaluation Kit
Lower Application Latency, Free-up CPU
Mellanox InfiniBand Leadership Over Future Competition
§ Switch Latency: 20% Lower
§ Message Rate: 44% Higher
§ Power Consumption Per Switch Port: 25% Lower
§ Scalability / CPU Efficiency: 2X Higher
§ 2014: 100 Gb/s Link Speed (Gain Competitive Advantage Today)
§ 2017: 200 Gb/s Link Speed (Protect Your Future)
Smart Network For Smart Systems
RDMA, Acceleration Engines, Programmability
Higher Performance
Unlimited Scalability
Higher Resiliency
Proven!
Technology Roadmap – One-Generation Lead over the Competition
[Figure: roadmap timeline, 2000 / 2005 / 2010 / 2015 / 2020, spanning Terascale, Petascale, and Exascale. Link speeds: 20G, 40G, 56G, 100G, 200G, Mellanox 400G. Milestones (Mellanox Connected): Virginia Tech (Apple), 3rd, TOP500 2003; “Roadrunner”, 1st.]
Thank You

Mais conteúdo relacionado

Mais procurados

LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...inside-BigData.com
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsGanesan Narayanasamy
 
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...44CON
 
InfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and RoadmapInfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and Roadmapinside-BigData.com
 
The HPE Machine and Gen-Z - BUD17-503
The HPE Machine and Gen-Z - BUD17-503The HPE Machine and Gen-Z - BUD17-503
The HPE Machine and Gen-Z - BUD17-503Linaro
 
Programming Models for Exascale Systems
Programming Models for Exascale SystemsProgramming Models for Exascale Systems
Programming Models for Exascale Systemsinside-BigData.com
 
DDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your HardwareDDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your Hardwareinside-BigData.com
 
Overview of the MVAPICH Project and Future Roadmap
Overview of the MVAPICH Project and Future RoadmapOverview of the MVAPICH Project and Future Roadmap
Overview of the MVAPICH Project and Future Roadmapinside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Challenges and Opportunities for HPC Interconnects and MPI
Challenges and Opportunities for HPC Interconnects and MPIChallenges and Opportunities for HPC Interconnects and MPI
Challenges and Opportunities for HPC Interconnects and MPIinside-BigData.com
 
Development, test, and characterization of MEC platforms with Teranium and Dr...
Development, test, and characterization of MEC platforms with Teranium and Dr...Development, test, and characterization of MEC platforms with Teranium and Dr...
Development, test, and characterization of MEC platforms with Teranium and Dr...Michelle Holley
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
High Performance Interconnects: Landscape, Assessments & Rankings
High Performance Interconnects: Landscape, Assessments & RankingsHigh Performance Interconnects: Landscape, Assessments & Rankings
High Performance Interconnects: Landscape, Assessments & Rankingsinside-BigData.com
 
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA Camp
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA CampState Of FPGA: Current & Future - A Panel discussion @ 4th FPGA Camp
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA CampFPGA Central
 

Mais procurados (20)

LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
LF_DPDK17_Serverless DPDK - How SmartNIC resident DPDK Accelerates Packet Pro...
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
Building Efficient HPC Clouds with MCAPICH2 and RDMA-Hadoop over SR-IOV Infin...
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systems
 
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...
Using SmartNICs to Provide Better Data Center Security - Jack Matheson - 44CO...
 
InfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and RoadmapInfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and Roadmap
 
Apache Pulsar @Splunk
Apache Pulsar @SplunkApache Pulsar @Splunk
Apache Pulsar @Splunk
 
The HPE Machine and Gen-Z - BUD17-503
The HPE Machine and Gen-Z - BUD17-503The HPE Machine and Gen-Z - BUD17-503
The HPE Machine and Gen-Z - BUD17-503
 
Programming Models for Exascale Systems
Programming Models for Exascale SystemsProgramming Models for Exascale Systems
Programming Models for Exascale Systems
 
OpenPOWER Webinar
OpenPOWER Webinar OpenPOWER Webinar
OpenPOWER Webinar
 
DDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your HardwareDDN: Protecting Your Data, Protecting Your Hardware
DDN: Protecting Your Data, Protecting Your Hardware
 
Overview of the MVAPICH Project and Future Roadmap
Overview of the MVAPICH Project and Future RoadmapOverview of the MVAPICH Project and Future Roadmap
Overview of the MVAPICH Project and Future Roadmap
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Challenges and Opportunities for HPC Interconnects and MPI
Challenges and Opportunities for HPC Interconnects and MPIChallenges and Opportunities for HPC Interconnects and MPI
Challenges and Opportunities for HPC Interconnects and MPI
 
Development, test, and characterization of MEC platforms with Teranium and Dr...
Development, test, and characterization of MEC platforms with Teranium and Dr...Development, test, and characterization of MEC platforms with Teranium and Dr...
Development, test, and characterization of MEC platforms with Teranium and Dr...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
High Performance Interconnects: Landscape, Assessments & Rankings
High Performance Interconnects: Landscape, Assessments & RankingsHigh Performance Interconnects: Landscape, Assessments & Rankings
High Performance Interconnects: Landscape, Assessments & Rankings
 
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA Camp
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA CampState Of FPGA: Current & Future - A Panel discussion @ 4th FPGA Camp
State Of FPGA: Current & Future - A Panel discussion @ 4th FPGA Camp
 

Destaque (7)

Mercedes gomez tawe
Mercedes gomez taweMercedes gomez tawe
Mercedes gomez tawe
 
Diario cantero
Diario canteroDiario cantero
Diario cantero
 
Video de mi grupo: Yo,Lydia, Marta
Video de mi grupo: Yo,Lydia, Marta Video de mi grupo: Yo,Lydia, Marta
Video de mi grupo: Yo,Lydia, Marta
 
Reunión
Reunión   Reunión
Reunión
 
RÚBRIICA ORAL; Ismael
RÚBRIICA ORAL; IsmaelRÚBRIICA ORAL; Ismael
RÚBRIICA ORAL; Ismael
 
Shakespeare 2015 16
Shakespeare 2015 16Shakespeare 2015 16
Shakespeare 2015 16
 
Informe de la valoracion de Estudios Academicos
Informe de la valoracion de Estudios AcademicosInforme de la valoracion de Estudios Academicos
Informe de la valoracion de Estudios Academicos
 

Semelhante a Interconnect your future

Co-Design Architecture for Exascale
Co-Design Architecture for ExascaleCo-Design Architecture for Exascale
Co-Design Architecture for Exascaleinside-BigData.com
 
Mellanox Announcements at SC15
Mellanox Announcements at SC15Mellanox Announcements at SC15
Mellanox Announcements at SC15inside-BigData.com
 
Announcing the Mellanox ConnectX-5 100G InfiniBand Adapter
Announcing the Mellanox ConnectX-5 100G InfiniBand AdapterAnnouncing the Mellanox ConnectX-5 100G InfiniBand Adapter
Announcing the Mellanox ConnectX-5 100G InfiniBand Adapterinside-BigData.com
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreinside-BigData.com
 
Banv meetup-contrail
Banv meetup-contrailBanv meetup-contrail
Banv meetup-contrailnvirters
 
Open vSwitch Implementation Options
Open vSwitch Implementation Options Open vSwitch Implementation Options
Open vSwitch Implementation Options Netronome
 
InfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and RoadmapInfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and Roadmapinside-BigData.com
 
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopMellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopGanesan Narayanasamy
 
Storage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthStorage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthMellanox Technologies
 
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...OpenStack Korea Community
 
How to use SDN to Innovate, Expand and Deliver for your business
How to use SDN to Innovate, Expand and Deliver for your businessHow to use SDN to Innovate, Expand and Deliver for your business
How to use SDN to Innovate, Expand and Deliver for your businessNapier University
 
Software Stacks to enable SDN and NFV
Software Stacks to enable SDN and NFVSoftware Stacks to enable SDN and NFV
Software Stacks to enable SDN and NFVYoshihiro Nakajima
 
Open coud networking at full speed - Avi Alkobi
Open coud networking at full speed - Avi AlkobiOpen coud networking at full speed - Avi Alkobi
Open coud networking at full speed - Avi AlkobiOpenInfra Days Poland 2019
 
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안NAIM Networks, Inc.
 
ProductX2014 Tom thirer. mellanox
ProductX2014 Tom thirer. mellanoxProductX2014 Tom thirer. mellanox
ProductX2014 Tom thirer. mellanoxProduct Excellence
 
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...Netronome
 
6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization6WIND
 
High Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing CommunityHigh Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing Community6WIND
 
A New Approach to Continuous Monitoring in the Cloud
A New Approach to Continuous Monitoring in the CloudA New Approach to Continuous Monitoring in the Cloud
A New Approach to Continuous Monitoring in the CloudNETSCOUT
 

Semelhante a Interconnect your future (20)

Co-Design Architecture for Exascale
Co-Design Architecture for ExascaleCo-Design Architecture for Exascale
Co-Design Architecture for Exascale
 
Mellanox Announcements at SC15
Mellanox Announcements at SC15Mellanox Announcements at SC15
Mellanox Announcements at SC15
 
Announcing the Mellanox ConnectX-5 100G InfiniBand Adapter
Announcing the Mellanox ConnectX-5 100G InfiniBand AdapterAnnouncing the Mellanox ConnectX-5 100G InfiniBand Adapter
Announcing the Mellanox ConnectX-5 100G InfiniBand Adapter
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
 
Banv meetup-contrail
Banv meetup-contrailBanv meetup-contrail
Banv meetup-contrail
 
Open vSwitch Implementation Options
Open vSwitch Implementation Options Open vSwitch Implementation Options
Open vSwitch Implementation Options
 
InfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and RoadmapInfiniBand In-Network Computing Technology and Roadmap
InfiniBand In-Network Computing Technology and Roadmap
 
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopMellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
 
Storage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving GrowthStorage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving Growth
 
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
 
How to use SDN to Innovate, Expand and Deliver for your business
How to use SDN to Innovate, Expand and Deliver for your businessHow to use SDN to Innovate, Expand and Deliver for your business
How to use SDN to Innovate, Expand and Deliver for your business
 
Software Stacks to enable SDN and NFV
Software Stacks to enable SDN and NFVSoftware Stacks to enable SDN and NFV
Software Stacks to enable SDN and NFV
 
Open coud networking at full speed - Avi Alkobi
Open coud networking at full speed - Avi AlkobiOpen coud networking at full speed - Avi Alkobi
Open coud networking at full speed - Avi Alkobi
 
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
 
ProductX2014 Tom thirer. mellanox
ProductX2014 Tom thirer. mellanoxProductX2014 Tom thirer. mellanox
ProductX2014 Tom thirer. mellanox
 
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
 
6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization6WINDGate™ - Enabling Cloud RAN Virtualization
6WINDGate™ - Enabling Cloud RAN Virtualization
 
High Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing CommunityHigh Performance Networking Leveraging the DPDK and Growing Community
High Performance Networking Leveraging the DPDK and Growing Community
 
A New Approach to Continuous Monitoring in the Cloud
A New Approach to Continuous Monitoring in the CloudA New Approach to Continuous Monitoring in the Cloud
A New Approach to Continuous Monitoring in the Cloud
 
Mellanox IBM
Mellanox IBMMellanox IBM
Mellanox IBM
 

Mais de inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...inside-BigData.com
 
Adaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and EigensolversAdaptive Linear Solvers and Eigensolvers
Adaptive Linear Solvers and Eigensolversinside-BigData.com
 
Scientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous ArchitecturesScientific Applications and Heterogeneous Architectures
Scientific Applications and Heterogeneous Architecturesinside-BigData.com
 
SW/HW co-design for near-term quantum computing
SW/HW co-design for near-term quantum computingSW/HW co-design for near-term quantum computing
SW/HW co-design for near-term quantum computinginside-BigData.com
 

Mais de inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Interconnect your future

  • 1. Rich Graham February 2016, HPCAC Stanford Conference Interconnect Your Future
  • 2. © 2015 Mellanox Technologies. The Ever Growing Demand for Higher Performance. Performance development timeline (2000, 2005, 2010, 2015, 2020): Terascale, Petascale, Exascale; “Roadrunner” (1st); Single-Core to Many-Core; SMP to Clusters. Co-Design: Hardware (HW), Software (SW), Application (APP). The Interconnect is the Enabling Technology
  • 3. Co-Design Architecture to Enable Exascale Performance. CPU-Centric: Limited to Main CPU Usage; Results in Performance Limitation. Co-Design: Creating Synergies; Enables Higher Performance and Scale. Software: In-CPU Computing, In-Network Computing, In-Storage Computing
  • 4. The Intelligence is Moving to the Interconnect. Diagram labels: CPU, Interconnect; Past, Future
  • 5. Breaking the Application Latency Wall. Today: network device latencies are on the order of 100 nanoseconds. Challenge: enabling the next order-of-magnitude improvement in application performance. Solution: creating synergies between software and hardware (intelligent interconnect). Latency breakdown: 10 years ago, Network ~10 microseconds and Communication Framework ~100 microseconds; today, Network ~0.1 microseconds and Communication Framework ~10 microseconds; future (Co-Design), Network ~0.05 microseconds and Communication Framework ~1 microsecond. Intelligent Interconnect Paves the Road to Exascale Performance
  • 6. Co-Design: Offloaded Technologies Target Application Characteristics. Applications (Innovations, Scalability, Performance). Offloaded Technologies (Intelligent Interconnect): Programmability; RDMA; GPUDirect; Virtualization; Direct Communication; Software-Defined Network (SDN); Backward and Future Compatibility. Co-Design Requires Intelligent Interconnect
  • 7. The Road to Exascale – Co-Design System Architecture. Co-design across components: CPU (In-CPU Computing), GPU (In-GPU Computing), FPGA (In-FPGA Computing), HCA (In-Network Computing), Switch (In-Network Computing)
  • 8. Introducing Switch-IB 2: World’s First Smart Switch
  • 9. Introducing Switch-IB 2: World’s First Smart Switch. The world’s fastest switch, with <90 nanosecond latency. 36 ports, 100Gb/s per port, 7.2Tb/s throughput, 7.02 billion messages/sec. Adaptive routing, congestion control, support for multiple topologies. Built for Scalable Compute and Storage Infrastructures. 10X Higher Performance with the New Switch SHArP Technology
  • 10. SHArP (Scalable Hierarchical Aggregation Protocol) Technology. Delivering 10X Performance Improvement for MPI and SHMEM/PGAS Applications. Switch-IB 2 Enables the Switch Network to Operate as a Co-Processor. SHArP Enables Switch-IB 2 to Manage and Execute MPI Operations in the Network
  • 11. Scalable Hierarchical Aggregation Protocol. Reliable, scalable, general-purpose primitive, applicable to multiple use cases: in-network tree-based aggregation mechanism; large number of groups; multiple simultaneous outstanding operations. Accelerating HPC applications: scalable high-performance collective offload (Barrier, Reduce, All-Reduce, Broadcast; Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND; integer and floating-point, 32 / 64 bit); significantly reduces MPI collective runtime; increases CPU availability and efficiency; enables communication and computation overlap. Accelerating MapReduce applications: prevents the incast traffic pattern
  • 12. SHArP Performance Advantage – MiniFE Details. MiniFE is a finite-element mini-application that implements kernels representative of implicit finite-element applications. 10X to 25X performance improvement for the Allreduce MPI collective:

        Number of Nodes | CPU-Based Latency (usec) | SHArP Latency (usec) | Ratio
        32              | 41.7                     | 4.24                 | 9.9
        64              | 49.08                    | 4.63                 | 10.6
        128             | 57.67                    | 4.76                 | 12.1
        256             | 67.76                    | 4.87                 | 13.9
        512             | 79.62                    | 5.09                 | 15.6
        1024            | 93.55                    | 5.58                 | 16.8
        2048            | 109.92                   | 5.63                 | 19.5
        4096            | 129.16                   | 5.73                 | 22.5
        8192            | 151.76                   | 5.94                 | 25.5
  • 13. SHArP Performance – First Results (Partial Implementation). 3.5X Performance Improvement on 64 Nodes
  • 14. The Intelligence is Moving to the Interconnect. Layers: Applications; Communication Frameworks (MPI, SHMEM/PGAS); MPI / SHMEM Offloads. Offloads: Transport, RDMA, SR-IOV, Collectives, Peer-Direct, GPUDirect, More… Timeline labels: Q1’16, Q3’16. The Only Approach to Deliver 10X Performance Improvements
  • 15. Introducing ConnectX-4 Lx Programmable Adapter. Scalable, Efficient, High-Performance and Flexible Solution. Security; Cloud/Virtualization; Storage; High Performance Computing; Precision Time Synchronization; Networking + FPGA. Mellanox Acceleration Engines and FPGA Programmability on One Adapter
  • 16. InfiniBand Router – In Progress (SB7780). Isolation between InfiniBand subnets. Simple connectivity between different topologies: enables sharing a common storage network by multiple disconnected subnets. Supports 2^128 nodes (unlimited system size)
  • 17. InfiniBand Router Details. Router implements GID-to-LID mapping. SM allocates an alias GID to each HCA. Address resolution: IP-based applications resolve name to IP (standard), then IP to GID using a new API; for pure IB applications, GID DNS is updated upon LID assignment change. Diagram labels: IB subnet (three); GID DNS; RMA 1; RPA (three); RTM; and, per subnet, HCA, GID DNA Agent, SM, SRPM, SRTM. Legend: RTM: Routing Table Manager; SRTM: Subnet Routing Table Manager; RPA: Router Port Agent; SRPM: Subnet Router Port Manager; GID DNS: IP to GID resolution
  • 18. Multi-Host Socket Direct – Low Latency Socket Communication. Each CPU has direct network access. QPI avoidance for I/O improves performance. Enables GPU / peer direct on both sockets. Solution is transparent to software. Multi-Host Socket Direct performance (Multi Host Evaluation Kit): 50% lower CPU utilization, 20% lower latency. Lower Application Latency, Free-up CPU
  • 19. Mellanox InfiniBand Leadership Over Future Competition. Switch Latency: 20% Lower. Message Rate: 44% Higher. Power Consumption Per Switch Port: 25% Lower. Scalability / CPU efficiency: 2X Higher. Link speed: 100 Gb/s (2014), 200 Gb/s (2017). Gain Competitive Advantage Today, Protect Your Future. Smart Network For Smart Systems: RDMA, Acceleration Engines, Programmability. Higher Performance, Unlimited Scalability, Higher Resiliency. Proven!
  • 20. Technology Roadmap – One-Generation Lead over the Competition. Timeline (2000, 2005, 2010, 2015, 2020): Terascale, Petascale, Exascale. Link speeds: 20G, 40G, 56G, 100G, 200G, Mellanox 400G. Mellanox Connected: Virginia Tech (Apple), 3rd on the TOP500 in 2003; “Roadrunner”, 1st
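
Slide 11 describes SHArP as an in-network, tree-based aggregation mechanism that offloads collectives such as All-Reduce. The general idea of a hierarchical reduction can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the SHArP protocol or any Mellanox API: the function name `tree_allreduce`, the `radix` parameter (children per aggregation node), and the list-based "network" are all hypothetical.

```python
from functools import reduce
import operator


def tree_allreduce(values, op=operator.add, radix=36):
    """Toy model of a tree-based all-reduce (hypothetical, not the SHArP API).

    `values` holds one contribution per participant. Each pass of the loop
    models one level of aggregation nodes, each combining up to `radix`
    child contributions with `op`. Returns the per-participant results and
    the number of aggregation levels that were needed.
    """
    if not values:
        raise ValueError("need at least one contribution")
    if radix < 2:
        raise ValueError("radix must be at least 2")

    level = list(values)
    levels = 0
    while len(level) > 1:
        # One aggregation node per group of up to `radix` children.
        level = [reduce(op, level[i:i + radix])
                 for i in range(0, len(level), radix)]
        levels += 1

    result = level[0]
    # Distribution step: every participant receives the same reduced value.
    return [result] * len(values), levels


if __name__ == "__main__":
    results, levels = tree_allreduce(list(range(1, 129)), radix=36)
    print(results[0], levels)
```

In this model the number of aggregation levels grows with the logarithm of the participant count (base `radix`), which is one way to read why the SHArP latencies on slide 12 rise only slowly as the node count increases. The radix of 36 is an assumption borrowed from the port count on slide 9; a real fabric reserves some ports for uplinks.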