SlideShare uma empresa Scribd logo
1 de 61
Open MPI State of the Union
Community Meeting SC‘15
Jeff Squyres
November 18, 2015
Nathan HjelmGeorge Bosilca
First, two quick
public service announcements
MPI Forum BOF
Wednesday, 3:30pm
Room 15
Come hear about:
MPI-3.1
The road to MPI-4 (ongoing activities)
www.EuroMPI2016.ed.ac.uk
• Call for papers open by end of November 2015
• Full paper submission deadline: 1st May 2016
• Associated events: tutorials, workshops, training
• Focusing on: benchmarks, tools, applications,
parallel I/O, fault tolerance, hybrid MPI+X, and
alternatives to MPI and reasons for not using MPI
Reminder:
Open MPI now hosted on GitHub
• In late 2014, Open MPI
code and bug tracking
moved to GitHub
• Please file at GitHub:
 Issues (bugs, feature
requests, etc.)
 Pull requests (code!)
https://github.com/open-mpi/ompi
Development
master
Transition to super stable
v1.7
v1.7.1
v1.7.2
New features,
enhancements
Time
v1.8
v1.8.1
Bug fixes only
v1.8.4
…
Branch to create
Feature series
v1.9.0
…
v1.9 / v2.0
branch
Open MPI versioning (OLD)
Development
master
Transition to super stable
v1.7
v1.7.1
v1.7.2
New features,
enhancements
Time
v1.8
v1.8.1
Bug fixes only
v1.8.4
…
Branch to create
Feature series
v1.9.0
…
v1.9 / v2.0
branch
Open MPI versioning (OLD)
Open MPI versioning
No more
“odd / even” versioning!
New version numbering scheme
• Open MPI will (continue to) use a “A.B.C”
version number triple
• Each number now has a specific meaning:
This number changes when backwards
compatibility breaks
This number changes when new features are
added
This number changes for all other releases
A
B
C 9
Definition
• Open MPI vY is backwards compatible with
Open MPI vX (where Y>X) if:
 Users can compile a correct MPI / OSHMEM
program with vX
 Run it with the same CLI options and MCA
parameters using vX or vY
 The job executes correctly
10
What does that encompass?
• “Backwards compatibility” covers several
areas:
 Binary compatibility, specifically the MPI /
OSHMEM API ABI
 MPI / OSHMEM run time system
 mpirun / oshrun CLI options
 MCA parameter names / values / meanings
11
What does that not encompass?
• Open MPI only supports running exactly
the same version of the runtime and MPI /
OSHMEM libraries in a single job
 If you mix-n-match vX and vY in a single job…
12
ERROR
v1.8.x Roadmap
v1.8.x / v1.10.x release manager
• Ralph Castain, Intel
v1.8 roadmap
Git master development
No further v1.8.x
releases planned
v1.7.x /
v1.8.x branch
v1.8 v1.8.8……
Old style
odd / even
v1.10.x Roadmap
v1.10.x roadmap
Git master development
v1.7.x /
v1.8.x branch
v1.8.5 v1.8.8…
v1.10.x branch
v1.10 v1.10.1
Small number of
new features
v1.10 series
• Small number of new features
 libfabric support for usNIC and PSM
 Mellanox Yalla PML
 Intel PSM2 for OmniPath
• Built on top of stable v1.8 series
• “Mostly” the new numbering scheme
 Example: v1.10.1 could have been called v1.11
(new features, but still ABI compatible)
User question
(submitted via web form before the BOF)
Does OmniPath work with Open MPI v1.10.x?
Yes, OmniPath should work out-of-the-box with Open
MPI v1.10.1 via the PSM2 MTL.
v1.10 series
• Current status:
 v1.10.0: released August 25, 2015
 v1.10.1: released October 4, 2015
• Bug fixes compared to v1.10.0
• Conformance: MPI-3.1 without non-blocking I/O
collectives
 v1.10.2: estimated January 2016
• There will likely be future bug-fix releases
as needed
v2.x Roadmap
v2.x release managers
• Howard Pritchard, Los Alamos National Lab
• Jeff Squyres, Cisco Systems, Inc.
v2.x roadmap
Git master development
v2.x branch
v1.7.x /
v1.8.x branch
v1.8.8…
v1.10.x branch
v1.10.1
Lots of new
features
…
v2.x timeline
• Branched in June, 2015
• v2.0.0 has taken longer than expected
 v1.10.x releases have consumed resources
 Many new features in v2.0.0
• Estimate v2.0.0 release Q1 2016
 Currently focusing on stabilizing new features
 Will be MPI-3.1 compliant
IBM Platform MPI
• IBM Platform MPI to be
based on Open MPI v2.x
 Many proprietary features to be
committed back to Open MPI
 First release planned Q3, 2016
• Integration with IBM/Platform
software
 Customer support available
 See the IBM booth for more
details
Miscellaneous v2.x features
• Compliance level:
 MPI-3.0 + errata
 MPI-3.1
• Performance and scalability improvements
 Lots of internal plumbing enhancements
• Improved Cray XE/XC support
• Unified Communication X (UCX) support
• Process Management Interface – Exascale
(PMIx)
PMIx
• Goal: Work with the HPC community to
define and implement new APIs that
support evolving programming model
requirements for application/resource
manager interactions.
• Support the Instant On initiative
 For rapid startup of applications at exascale+
• Standalone library (non-copy-left licensed)
BoF: Thurs @ 12:15-1:15pm, Room 15
Usability issues
• Open MPI has some cryptic plugin names
 OB1, R2, Vader, Yoda, Ikrit, …
• New libraries provide multiple ways to get
to the same underlying network
 Libfabric
 UCX
 Need a simple, clear interface for users
• This is an ongoing discussion in the
development community…
Java bindings updates
• Primarily contributed by Oscar Vega-Gispert
 Community (LANL) updating the bindings/fixing bugs
• Now includes (almost) all MPI 3.1 functionality
 MPI_WIN_shared_query, etc. not implemented because they
don’t make sense in Java (no explicit virtual address concept)
 MPI_Neighbor_Alltoall(w/v) will be coming after v2.0.0
• Various bug fixes, including:
 Better interaction with JVM GC
 Fixes for PSM/JVM signal handler conflicts (PSM fix)
• Java MPI tests / examples now available
 https://github.com/open-mpi/ompi-java-test
Java performance – OSU latency
0
5
10
15
20
25
30
35
40
45
50
C
java array
java ByteBuffer direct
Message length (bytes)
Java 7
Cray XC
(edison.nersc.gov)
Latency(usec)
George Bosilca
2.x goodies
Features dropped from v2.x
• Myrinet / MX support
• Cray XT support
• VampirTrace support
• Checkpoint / restart support
• Legacy collective module: ml
User question
How do I performance profile my code?
There’s a lot of good PMPI-based tools available.
VampirTrace has been removed from Open MPI, but
ScoreP is it’s direct replacement (not yet bundled in
Open MPI).
https://doc.zih.tu-dresden.de/hpc-
wiki/bin/view/Compendium/ScoreP
Network
support
Name Owner Status Thread
BTL
OpenIB Chelsio maintenance Done
Portals4 SNL maintenance Not done
Scif LANL maintenance In progress
self UTK active Done
sm UTK active Done
smcuda NVIDIA active Done
tcp UTK active Done
uGNI LANL active Done
usnic CISCO active Done
vader LANL active Done
PML
Yalla Mellanox active In progress
UCX Mellanox/UTK active In progress
MTL
MXM Mellanox active In progress
OFI Intel active In progress
Portals4 SNL active In progress
PSM Intel active In progress
PSM2 Intel active In progress
It’s complicated!
(only listing those
that are in release
branches)
MPI_Init / MPI_Finalize
extension(s)
• Also called the #1007 fiasco
• Thanks to the MPI standard (3.1) we know that some functions must
be thread safe even for libraries that do not support thread safety
(Section 12.4):
 MPI_INITIALIZED, MPI_FINALIZED, MPI_QUERY_THREAD,
MPI_IS_THREAD_MAIN, MPI_GET_VERSION and
MPI_GET_LIBRARY_VERSION
• Clearly there is a need for a better (more precise) definition.
Meanwhile:
• If MPI_INITIALIZED is invoked and MPI is only partially initialized,
wait until MPI is fully initialized before returning.
• If MPI_FINALIZED is invoked and MPI is only partially finalized, wait
until MPI is fully finalized before returning.
MPI_[Wait|Test]* (#176)
• We had supra-linear complexity
 Due to error case scenarios
 Thread unfriendly
Threading
• Audit of the communication support
 Fine grain locks for request management
 Atomic operations for everything else
Mellanox MT27520
OpenIB BTL
From now
until the
release the
focus is on
performance
Async progress
• Helper thread(s) to
make progress
 Support for more
communication
infrastructures to
come
Shared Memory
TCP
Trunk
Adapt
Trunk
Adapt
CUDA Support
V CT
CUDA 7.5
Mellanox MT27520
• Multi-level coordination
protocol based on the
location of the source
and destination memory
• Delocalize part of the
datatype engine into the
GPU
 Major part still driven by
the CPU
 Provide a different
datatype representation
(avoid branching in the
code)
Open MPI 2.x
MVAPICH 2.1 GDR
CUDA Support
V CT
Open MPI 2.x
MVAPICH 2.1 GDR
• Support for GPUDirect
• Right now integrated with
OpenIB and shared
memory
 BTL independent support
in the design stage
Monitoring PML (#724)
• Intercept the component interface
 through careful manipulations of the init/enable/fini
interfaces
 Expose more [low-level] details about communication
patterns without PMPI
• Bytes for point-to-point, groups for collectives, control
messages, collective communications patterns, no one-sided
 Controlled by MPI_T to generate an extended
communication map
• Topology components (TreeMatch) reorder the processes
I/O
• Supports 2 modules:
 ROMIO 3.1.4 (#518), 3.2b (#504)
 OMPIO (UH) – default in 2.x
• Highly modular architecture for parallel I/O
• Supports Multiple Collective I/O algorithms
 File view based automatic selection of collective I/O component
• Automatic adjustments of no. of aggregators depending on
 Application characteristics (e.g. data volume, access pattern)
 File system characteristics (e.g. stripe size, stripe depth)
• Asynchronous individual and collective I/O operations
 Integrated with the progress engine of Open MPI
UCX integration
• UCX is the next generation open-source production-grade communication middleware.
• Collaboration between industry, laboratories, and academia.
• Exposes near-bare-metal performance by reducing CPU overhead.
• Provides both a thin low-level API, and a general purpose high-level API.
UCX Integration
UCX PML
ConnectX-4 EDR, back-to-back, no switch
Other minor improvements
• Dynamic process management (#989)
 Gracefully disabled if the RTE or the underlying communication
does not supports this feature
• C99 component support (#541)
• Dismantle the tuned collective support
 All algorithms are now available to all collective modules
 Tuned became a tiny decision layer
 Driver specific configuration become now possible
• Extend MPI_COMM_SPLIT_TYPE to include all HWLOC
types (CU, HOST, BOARD, SOCKET,
LxCACHE…)(#320)
• Did I mention Fortran yet?
Memory Scaling and RMA Update
Nathan Hjelm
Los Alamos National Laboratory
v1.10.x memory scaling
• Process structure and network endpoint
allocated for all processes in
MPI_COMM_WORLD
• Structures allocated even if when there is
no communication with a peer
v2.x memory scaling
• Process structure and network endpoint
allocated on first communication
• Small delay in 1st communication, but…
 Big reduction in memory footprint
 Configurable cutoff for new behavior
 MCA variable: mpi_add_procs_cutoff
• Default:1024 processes
Memory scaling
• mpimemu VmRSS on 320 nodes of a Cray XE-6. 16 ppn aprun
• Reduction of 6MB/process @ 5120 processes!
6 MB
8 MB
10 MB
12 MB
14 MB
16 MB
18 MB
20 MB
0 1000 2000 3000 4000 5000
MemoryUsage/Process
# Processes
mpimemu VmRSS Post MPI_Init
Static add_procs
Dynamic add_procs
6MB!
v1.10 RMA
• Fully supports MPI-3 RMA
• Full support for MPI datatypes
 Stress tested with ARMCI
• Emulates one-sided operation using point-to-
point components (PML) for communication
• Caveat: no asynchronous progress support
 Targets must enter MPI to progress any one-
sided data movement (!)
 Defeats the purpose of passive target RMA
v2.x RMA
• Uses network RMA and atomic operation
support through Byte Transport Layer (BTL)
 Fully supports passive target RMA operations
• Initially supports:
 IB*, Infinipath/Omnipath**, Gemini/Aries
• More networks can be supported if they have
 put, get, fetch-and-add, compare-and-swap
• Otherwise, fall back to emulated RMA
v2.x RMA performance
• IMB truly passive put on Cray XC-30
0
20
40
60
80
100
20 25 210 215 220
Overlap%
Message Size (Bytes)
IMB Truly Passive Put Overlap
OSC/pt2pt
OSC/rdma
Beyond the v2.x series
What’s the plan over time?
• Plan for a new release series once a year
 v2.x: release in mid / late 2015
 v3.x: release in mid / late 2016
 v4.x: release in mid / late 2017
 …etc.
54
NOTE: Scheduled releases is a new concept
for the Open MPI developer community. We’ll
continue to evaluate this plan over time.
3.x series
• Branching for v3.x in June 2016 is… unlikely
 As noted previously, v2.0.0 is taking longer than
expected
• More likely:
 The v2.x series will have a good life (~1 year?)
 We’ll postpone v3.x for a while
 Stay tuned
MPI v4.0
• Reminder: go to the MPI Forum BOF
 3:30pm on Wednesday
• MPI v4.0 is at least 2 years away
 Content far from certain
 Too far off to make predictions
 Will likely include portions of what will become
MPI-4.0 over time
User question
What are the major implementation issues involved with the
proposed MPI-4 endpoints? Any progress to report?
We (developers) have talked about implementing MPI-4
endpoints. But…
• We have identified a minimal set of changes in Open
MPI for compliance with the current proposal
• The endpoints proposal is not final (yet)
• Not enough user demand (yet)
User question
What wire protocols are used? How do you tune them?
How does one tune FDR/EDR/OPA fabrics?
This is a bit more than we can cover in a 1-hour BOF.
Please come see us afterwards.
Where do we need help?
• Code
 Any bug that bothers you
 Any feature that you can add
• User documentation
• Testing
• Usability
• Release engineering
Researchers: how can we help you?
• Fork OMPI on GitHub
• Ask questions on the devel list
• Come to Open MPI developer meetings
 Next: February 22-25, 2016, Dallas, TX, USA
• Generally: be part of the open source
community
Come Join Us!
http://www.open-mpi.org/

Mais conteúdo relacionado

Mais procurados

(Open) MPI, Parallel Computing, Life, the Universe, and Everything
(Open) MPI, Parallel Computing, Life, the Universe, and Everything(Open) MPI, Parallel Computing, Life, the Universe, and Everything
(Open) MPI, Parallel Computing, Life, the Universe, and EverythingJeff Squyres
 
.NET Core Blimey! (dotnetsheff Jan 2016)
.NET Core Blimey! (dotnetsheff Jan 2016).NET Core Blimey! (dotnetsheff Jan 2016)
.NET Core Blimey! (dotnetsheff Jan 2016)citizenmatt
 
OFI Overview 2019 Webinar
OFI Overview 2019 WebinarOFI Overview 2019 Webinar
OFI Overview 2019 Webinarseanhefty
 
平行化你的工作 part1
平行化你的工作 part1平行化你的工作 part1
平行化你的工作 part1Shuen-Huei Guan
 
Clang Analyzer Tool Review
Clang Analyzer Tool ReviewClang Analyzer Tool Review
Clang Analyzer Tool ReviewDoug Schuster
 
Building a Language Server for Eclipse MicroProfile
Building a Language Server for Eclipse MicroProfileBuilding a Language Server for Eclipse MicroProfile
Building a Language Server for Eclipse MicroProfileYK Chang
 
.NET Standard - Under the Hood
.NET Standard - Under the Hood.NET Standard - Under the Hood
.NET Standard - Under the HoodImmo Landwerth
 
Clang compiler `
Clang compiler `Clang compiler `
Clang compiler `Rabin BK
 
Introduction to the LLVM Compiler System
Introduction to the LLVM  Compiler SystemIntroduction to the LLVM  Compiler System
Introduction to the LLVM Compiler Systemzionsaint
 
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"..."Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...Edge AI and Vision Alliance
 
Python for IoT, A return of experience
Python for IoT, A return of experiencePython for IoT, A return of experience
Python for IoT, A return of experienceAlexandre Abadie
 
Introduction to OFI
Introduction to OFIIntroduction to OFI
Introduction to OFIseanhefty
 
Debugging of (C)Python applications
Debugging of (C)Python applicationsDebugging of (C)Python applications
Debugging of (C)Python applicationsRoman Podoliaka
 
Netty Cookbook - Table of contents
Netty Cookbook - Table of contentsNetty Cookbook - Table of contents
Netty Cookbook - Table of contentsTrieu Nguyen
 

Mais procurados (20)

(Open) MPI, Parallel Computing, Life, the Universe, and Everything
(Open) MPI, Parallel Computing, Life, the Universe, and Everything(Open) MPI, Parallel Computing, Life, the Universe, and Everything
(Open) MPI, Parallel Computing, Life, the Universe, and Everything
 
LLVM Compiler
LLVM CompilerLLVM Compiler
LLVM Compiler
 
Where is LLVM Being Used Today?
Where is LLVM Being Used Today? Where is LLVM Being Used Today?
Where is LLVM Being Used Today?
 
.NET Core Blimey! (dotnetsheff Jan 2016)
.NET Core Blimey! (dotnetsheff Jan 2016).NET Core Blimey! (dotnetsheff Jan 2016)
.NET Core Blimey! (dotnetsheff Jan 2016)
 
OFI Overview 2019 Webinar
OFI Overview 2019 WebinarOFI Overview 2019 Webinar
OFI Overview 2019 Webinar
 
平行化你的工作 part1
平行化你的工作 part1平行化你的工作 part1
平行化你的工作 part1
 
Clang Analyzer Tool Review
Clang Analyzer Tool ReviewClang Analyzer Tool Review
Clang Analyzer Tool Review
 
Building a Language Server for Eclipse MicroProfile
Building a Language Server for Eclipse MicroProfileBuilding a Language Server for Eclipse MicroProfile
Building a Language Server for Eclipse MicroProfile
 
.NET Standard - Under the Hood
.NET Standard - Under the Hood.NET Standard - Under the Hood
.NET Standard - Under the Hood
 
Composer namespacing
Composer namespacingComposer namespacing
Composer namespacing
 
Clang compiler `
Clang compiler `Clang compiler `
Clang compiler `
 
Introduction to the LLVM Compiler System
Introduction to the LLVM  Compiler SystemIntroduction to the LLVM  Compiler System
Introduction to the LLVM Compiler System
 
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"..."Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...
"Using TensorFlow Lite to Deploy Deep Learning on Cortex-M Microcontrollers,"...
 
Python for IoT, A return of experience
Python for IoT, A return of experiencePython for IoT, A return of experience
Python for IoT, A return of experience
 
PySide
PySidePySide
PySide
 
Introduction to OFI
Introduction to OFIIntroduction to OFI
Introduction to OFI
 
Meetup 2009
Meetup 2009Meetup 2009
Meetup 2009
 
Caffe2 on Android
Caffe2 on AndroidCaffe2 on Android
Caffe2 on Android
 
Debugging of (C)Python applications
Debugging of (C)Python applicationsDebugging of (C)Python applications
Debugging of (C)Python applications
 
Netty Cookbook - Table of contents
Netty Cookbook - Table of contentsNetty Cookbook - Table of contents
Netty Cookbook - Table of contents
 

Semelhante a Open MPI SC'15 State of the Union BOF

Lenovo system management solutions
Lenovo system management solutionsLenovo system management solutions
Lenovo system management solutionsinside-BigData.com
 
Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!All Things Open
 
The new WPE API
The new WPE APIThe new WPE API
The new WPE APIIgalia
 
The Open eHealth Integration Platform
The Open eHealth Integration PlatformThe Open eHealth Integration Platform
The Open eHealth Integration Platformkrasserm
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps_Fest
 
Embedded-Linux-Community-Update-2022-02-JJ78.pdf
Embedded-Linux-Community-Update-2022-02-JJ78.pdfEmbedded-Linux-Community-Update-2022-02-JJ78.pdf
Embedded-Linux-Community-Update-2022-02-JJ78.pdfxiso
 
Python quick guide1
Python quick guide1Python quick guide1
Python quick guide1Kanchilug
 
Managing Plone Projects with Perl and Subversion
Managing Plone Projects with Perl and SubversionManaging Plone Projects with Perl and Subversion
Managing Plone Projects with Perl and SubversionLuciano Rocha
 
OpenNTF - UKLUG 2009 Edinburgh
OpenNTF - UKLUG 2009 EdinburghOpenNTF - UKLUG 2009 Edinburgh
OpenNTF - UKLUG 2009 EdinburghOpenNTF
 
Modern VoIP in modern infrastructures
Modern VoIP in modern infrastructuresModern VoIP in modern infrastructures
Modern VoIP in modern infrastructuresGiacomo Vacca
 
Advanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCAdvanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCIJSRD
 
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream Projects
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream ProjectsITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream Projects
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream ProjectsITCamp
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureBruno Cornec
 
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...Neil Armstrong
 
Infinitas cakefest 2010 presentation
Infinitas cakefest 2010 presentationInfinitas cakefest 2010 presentation
Infinitas cakefest 2010 presentationWalther Lalk
 

Semelhante a Open MPI SC'15 State of the Union BOF (20)

Lenovo system management solutions
Lenovo system management solutionsLenovo system management solutions
Lenovo system management solutions
 
Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!Linux Distribution Collaboration …on a Mainframe!
Linux Distribution Collaboration …on a Mainframe!
 
The new WPE API
The new WPE APIThe new WPE API
The new WPE API
 
The Open eHealth Integration Platform
The Open eHealth Integration PlatformThe Open eHealth Integration Platform
The Open eHealth Integration Platform
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
 
Websocket 101 in Python
Websocket 101 in PythonWebsocket 101 in Python
Websocket 101 in Python
 
Embedded-Linux-Community-Update-2022-02-JJ78.pdf
Embedded-Linux-Community-Update-2022-02-JJ78.pdfEmbedded-Linux-Community-Update-2022-02-JJ78.pdf
Embedded-Linux-Community-Update-2022-02-JJ78.pdf
 
Python quick guide1
Python quick guide1Python quick guide1
Python quick guide1
 
Managing Plone Projects with Perl and Subversion
Managing Plone Projects with Perl and SubversionManaging Plone Projects with Perl and Subversion
Managing Plone Projects with Perl and Subversion
 
OpenNTF - UKLUG 2009 Edinburgh
OpenNTF - UKLUG 2009 EdinburghOpenNTF - UKLUG 2009 Edinburgh
OpenNTF - UKLUG 2009 Edinburgh
 
Modern VoIP in modern infrastructures
Modern VoIP in modern infrastructuresModern VoIP in modern infrastructures
Modern VoIP in modern infrastructures
 
Advanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPCAdvanced Scalable Decomposition Method with MPICH Environment for HPC
Advanced Scalable Decomposition Method with MPICH Environment for HPC
 
SRE NL MeetUp - eBPF.pdf
SRE NL MeetUp - eBPF.pdfSRE NL MeetUp - eBPF.pdf
SRE NL MeetUp - eBPF.pdf
 
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream Projects
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream ProjectsITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream Projects
ITCamp 2017 - Raffaele Rialdi - Adopting .NET Core in Mainstream Projects
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined Infrastructure
 
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...
ELC-E 2016 Neil Armstrong - No, it's never too late to upstream your legacy l...
 
VocBench 2.0: A Web Application for Collaborative Development of Multilingual...
VocBench 2.0: A Web Application for Collaborative Development of Multilingual...VocBench 2.0: A Web Application for Collaborative Development of Multilingual...
VocBench 2.0: A Web Application for Collaborative Development of Multilingual...
 
Infinitas cakefest 2010 presentation
Infinitas cakefest 2010 presentationInfinitas cakefest 2010 presentation
Infinitas cakefest 2010 presentation
 
Bringing Tizen to a Raspberry Pi 2 Near You
Bringing Tizen to a Raspberry Pi 2 Near YouBringing Tizen to a Raspberry Pi 2 Near You
Bringing Tizen to a Raspberry Pi 2 Near You
 

Mais de Jeff Squyres

MPI Fourm SC'15 BOF
MPI Fourm SC'15 BOFMPI Fourm SC'15 BOF
MPI Fourm SC'15 BOFJeff Squyres
 
2014 01-21-mpi-community-feedback
2014 01-21-mpi-community-feedback2014 01-21-mpi-community-feedback
2014 01-21-mpi-community-feedbackJeff Squyres
 
Cisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPICisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPIJeff Squyres
 
Cisco EuroMPI'13 vendor session presentation
Cisco EuroMPI'13 vendor session presentationCisco EuroMPI'13 vendor session presentation
Cisco EuroMPI'13 vendor session presentationJeff Squyres
 
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)Jeff Squyres
 
MOSSCon 2013, Cisco Open Source talk
MOSSCon 2013, Cisco Open Source talkMOSSCon 2013, Cisco Open Source talk
MOSSCon 2013, Cisco Open Source talkJeff Squyres
 
Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizationsJeff Squyres
 
Friends don't let friends leak MPI_Requests
Friends don't let friends leak MPI_RequestsFriends don't let friends leak MPI_Requests
Friends don't let friends leak MPI_RequestsJeff Squyres
 
MPI-3 Timer requests proposal
MPI-3 Timer requests proposalMPI-3 Timer requests proposal
MPI-3 Timer requests proposalJeff Squyres
 
MPI_Mprobe is good for you
MPI_Mprobe is good for youMPI_Mprobe is good for you
MPI_Mprobe is good for youJeff Squyres
 
The Message Passing Interface (MPI) in Layman's Terms
The Message Passing Interface (MPI) in Layman's TermsThe Message Passing Interface (MPI) in Layman's Terms
The Message Passing Interface (MPI) in Layman's TermsJeff Squyres
 
What is [Open] MPI?
What is [Open] MPI?What is [Open] MPI?
What is [Open] MPI?Jeff Squyres
 

Mais de Jeff Squyres (13)

MPI Fourm SC'15 BOF
MPI Fourm SC'15 BOFMPI Fourm SC'15 BOF
MPI Fourm SC'15 BOF
 
2014 01-21-mpi-community-feedback
2014 01-21-mpi-community-feedback2014 01-21-mpi-community-feedback
2014 01-21-mpi-community-feedback
 
Cisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPICisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPI
 
Cisco EuroMPI'13 vendor session presentation
Cisco EuroMPI'13 vendor session presentationCisco EuroMPI'13 vendor session presentation
Cisco EuroMPI'13 vendor session presentation
 
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)
Open MPI Explorations in Process Affinity (EuroMPI'13 presentation)
 
MPI History
MPI HistoryMPI History
MPI History
 
MOSSCon 2013, Cisco Open Source talk
MOSSCon 2013, Cisco Open Source talkMOSSCon 2013, Cisco Open Source talk
MOSSCon 2013, Cisco Open Source talk
 
Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizations
 
Friends don't let friends leak MPI_Requests
Friends don't let friends leak MPI_RequestsFriends don't let friends leak MPI_Requests
Friends don't let friends leak MPI_Requests
 
MPI-3 Timer requests proposal
MPI-3 Timer requests proposalMPI-3 Timer requests proposal
MPI-3 Timer requests proposal
 
MPI_Mprobe is good for you
MPI_Mprobe is good for youMPI_Mprobe is good for you
MPI_Mprobe is good for you
 
The Message Passing Interface (MPI) in Layman's Terms
The Message Passing Interface (MPI) in Layman's TermsThe Message Passing Interface (MPI) in Layman's Terms
The Message Passing Interface (MPI) in Layman's Terms
 
What is [Open] MPI?
What is [Open] MPI?What is [Open] MPI?
What is [Open] MPI?
 

Último

Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Open MPI SC'15 State of the Union BOF

  • 1. Open MPI State of the Union Community Meeting SC‘15 Jeff Squyres November 18, 2015 Nathan HjelmGeorge Bosilca
  • 2. First, two quick public service announcements
  • 3. MPI Forum BOF Wednesday, 3:30pm Room 15 Come hear about: MPI-3.1 The road to MPI-4 (ongoing activities)
  • 4. www.EuroMPI2016.ed.ac.uk • Call for papers open by end of November 2015 • Full paper submission deadline: 1st May 2016 • Associated events: tutorials, workshops, training • Focusing on: benchmarks, tools, applications, parallel I/O, fault tolerance, hybrid MPI+X, and alternatives to MPI and reasons for not using MPI
  • 5. Reminder: Open MPI now hosted on GitHub • In late 2014, Open MPI code and bug tracking moved to GitHub • Please file at GitHub:  Issues (bugs, feature requests, etc.)  Pull requests (code!) https://github.com/open-mpi/ompi
  • 6. Development master Transition to super stable v1.7 v1.7.1 v1.7.2 New features, enhancements Time v1.8 v1.8.1 Bug fixes only v1.8.4 … Branch to create Feature series v1.9.0 … v1.9 / v2.0 branch Open MPI versioning (OLD)
  • 7. Development master Transition to super stable v1.7 v1.7.1 v1.7.2 New features, enhancements Time v1.8 v1.8.1 Bug fixes only v1.8.4 … Branch to create Feature series v1.9.0 … v1.9 / v2.0 branch Open MPI versioning (OLD)
  • 8. Open MPI versioning No more “odd / even” versioning!
  • 9. New version numbering scheme • Open MPI will (continue to) use a “A.B.C” version number triple • Each number now has a specific meaning: This number changes when backwards compatibility breaks This number changes when new features are added This number changes for all other releases A B C 9
  • 10. Definition • Open MPI vY is backwards compatible with Open MPI vX (where Y>X) if:  Users can compile a correct MPI / OSHMEM program with vX  Run it with the same CLI options and MCA parameters using vX or vY  The job executes correctly 10
  • 11. What does that encompass? • “Backwards compatibility” covers several areas:  Binary compatibility, specifically the MPI / OSHMEM API ABI  MPI / OSHMEM run time system  mpirun / oshrun CLI options  MCA parameter names / values / meanings 11
  • 12. What does that not encompass? • Open MPI only supports running exactly the same version of the runtime and MPI / OSHMEM libraries in a single job  If you mix-n-match vX and vY in a single job… 12 ERROR
  • 14. v1.8.x / v1.10.x release manager • Ralph Castain, Intel
  • 15. v1.8 roadmap Git master development No further v1.8.x releases planned v1.7.x / v1.8.x branch v1.8 v1.8.8…… Old style odd / even
  • 17. v1.10.x roadmap Git master development v1.7.x / v1.8.x branch v1.8.5 v1.8.8… v1.10.x branch v1.10 v1.10.1 Small number of new features
  • 18. v1.10 series • Small number of new features  libfabric support for usNIC and PSM  Mellanox Yalla PML  Intel PSM2 for OmniPath • Built on top of stable v1.8 series • “Mostly” the new numbering scheme  Example: v1.10.1 could have been called v1.11 (new features, but still ABI compatible)
  • 19. User question (submitted via web form before the BOF) Does OmniPath work with Open MPI v1.10.x? Yes, OmniPath should work out-of-the-box with Open MPI v1.10.1 via the PSM2 MTL.
  • 20. v1.10 series • Current status:  v1.10.0: released August 25, 2015  v1.10.1: released October 4, 2015 • Bug fixes compared to v1.10.0 • Conformance: MPI-3.1 without non-blocking I/O collectives  v1.10.2: estimated January 2016 • There will likely be future bug-fix releases as needed
  • 22. v2.x release managers • Howard Pritchard, Los Alamos National Lab • Jeff Squyres, Cisco Systems, Inc.
  • 23. v2.x roadmap Git master development v2.x branch v1.7.x / v1.8.x branch v1.8.8… v1.10.x branch v1.10.1 Lots of new features …
  • 24. v2.x timeline • Branched in June, 2015 • v2.0.0 has taken longer than expected  v1.10.x releases have consumed resources  Many new features in v2.0.0 • Estimate v2.0.0 release Q1 2016  Currently focusing on stabilizing new features  Will be MPI-3.1 compliant
  • 25. IBM Platform MPI • IBM Platform MPI to be based on Open MPI v2.x  Many proprietary features to be committed back to Open MPI  First release planned Q3, 2016 • Integration with IBM/Platform software  Customer support available  See the IBM booth for more details
  • 26. Miscellaneous v2.x features • Compliance level:  MPI-3.0 + errata  MPI-3.1 • Performance and scalability improvements  Lots of internal plumbing enhancements • Improved Cray XE/XC support • Unified Communication X (UCX) support • Process Management Interface – Exascale (PMIx)
  • 27. PMIx • Goal: Work with the HPC community to define and implement new APIs that support evolving programming model requirements for application/resource manager interactions. • Support the Instant On initiative  For rapid startup of applications at exascale+ • Standalone library (non-copy-left licensed) BoF: Thurs @ 12:15-1:15pm, Room 15
  • 28. Usability issues • Open MPI has some cryptic plugin names  OB1, R2, Vader, Yoda, Ikrit, … • New libraries provide multiple ways to get to the same underlying network  Libfabric  UCX  Need a simple, clear interface for users • This is an ongoing discussion in the development community…
  • 29. Java bindings updates • Primarily contributed by Oscar Vega-Gispert  Community (LANL) updating the bindings/fixing bugs • Now includes (almost) all MPI 3.1 functionality  MPI_WIN_shared_query, etc. not implemented because they don’t make sense in Java (no explicit virtual address concept)  MPI_Neighbor_Alltoall(w/v) will be coming after v2.0.0 • Various bug fixes, including:  Better interaction with JVM GC  Fixes for PSM/JVM signal handler conflicts (PSM fix) • Java MPI tests / examples now available  https://github.com/open-mpi/ompi-java-test
  • 30. Java performance – OSU latency 0 5 10 15 20 25 30 35 40 45 50 C java array java ByteBuffer direct Message length (bytes) Java 7 Cray XC (edison.nersc.gov) Latency(usec)
  • 32. Features dropped from v2.x • Myrinet / MX support • Cray XT support • VampirTrace support • Checkpoint / restart support • Legacy collective module: ml
  • 33. User question How do I performance profile my code? There’s a lot of good PMPI-based tools available. VampirTrace has been removed from Open MPI, but ScoreP is it’s direct replacement (not yet bundled in Open MPI). https://doc.zih.tu-dresden.de/hpc- wiki/bin/view/Compendium/ScoreP
  • 34. Network support Name Owner Status Thread BTL OpenIB Chelsio maintenance Done Portals4 SNL maintenance Not done Scif LANL maintenance In progress self UTK active Done sm UTK active Done smcuda NVIDIA active Done tcp UTK active Done uGNI LANL active Done usnic CISCO active Done vader LANL active Done PML Yalla Mellanox active In progress UCX Mellanox/UTK active In progress MTL MXM Mellanox active In progress OFI Intel active In progress Portals4 SNL active In progress PSM Intel active In progress PSM2 Intel active In progress It’s complicated! (only listing those that are in release branches)
  • 35. MPI_Init / MPI_Finalize extension(s) • Also called the #1007 fiasco • Thanks to the MPI standard (3.1) we know that some functions must be thread safe even for libraries that do not support thread safety (Section 12.4):  MPI_INITIALIZED, MPI_FINALIZED, MPI_QUERY_THREAD, MPI_IS_THREAD_MAIN, MPI_GET_VERSION and MPI_GET_LIBRARY_VERSION • Clearly there is a need for a better (more precise) definition. Meanwhile: • If MPI_INITIALIZED is invoked and MPI is only partially initialized, wait until MPI is fully initialized before returning. • If MPI_FINALIZED is invoked and MPI is only partially finalized, wait until MPI is fully finalized before returning.
  • 36. MPI_[Wait|Test]* (#176) • We had supra-linear complexity  Due to error case scenarios  Thread unfriendly
  • 37. Threading • Audit of the communication support  Fine grain locks for request management  Atomic operations for everything else Mellanox MT27520 OpenIB BTL From now until the release the focus is on performance
  • 38. Async progress • Helper thread(s) to make progress  Support for more communication infrastructures to come Shared Memory TCP Trunk Adapt Trunk Adapt
  • 39. CUDA Support V CT CUDA 7.5 Mellanox MT27520 • Multi-level coordination protocol based on the location of the source and destination memory • Delocalize part of the datatype engine into the GPU  Major part still driven by the CPU  Provide a different datatype representation (avoid branching in the code) Open MPI 2.x MVAPICH 2.1 GDR
  • 40. CUDA Support V CT Open MPI 2.x MVAPICH 2.1 GDR • Support for GPUDirect • Right now integrated with OpenIB and shared memory  BTL independent support in the design stage
  • 41. Monitoring PML (#724) • Intercept the component interface  through careful manipulations of the init/enable/fini interfaces  Expose more [low-level] details about communication patterns without PMPI • Bytes for point-to-point, groups for collectives, control messages, collective communications patterns, no one-sided  Controlled by MPI_T to generate an extended communication map • Topology components (TreeMatch) reorder the processes
  • 42. I/O • Supports 2 modules:  ROMIO 3.1.4 (#518), 3.2b (#504)  OMPIO (UH) – default in 2.x • Highly modular architecture for parallel I/O • Supports Multiple Collective I/O algorithms  File view based automatic selection of collective I/O component • Automatic adjustments of no. of aggregators depending on  Application characteristics (e.g. data volume, access pattern)  File system characteristics (e.g. stripe size, stripe depth) • Asynchronous individual and collective I/O operations  Integrated with the progress engine of Open MPI
  • 43. UCX integration • UCX is the next generation open-source production-grade communication middleware. • Collaboration between industry, laboratories, and academia. • Exposes near-bare-metal performance by reducing CPU overhead. • Provides both a thin low-level API, and a general purpose high-level API.
  • 44. UCX Integration UCX PML ConnectX-4 EDR, back-to-back, no switch
  • 45. Other minor improvements • Dynamic process management (#989)  Gracefully disabled if the RTE or the underlying communication does not supports this feature • C99 component support (#541) • Dismantle the tuned collective support  All algorithms are now available to all collective modules  Tuned became a tiny decision layer  Driver specific configuration become now possible • Extend MPI_COMM_SPLIT_TYPE to include all HWLOC types (CU, HOST, BOARD, SOCKET, LxCACHE…)(#320) • Did I mention Fortran yet?
  • 46. Memory Scaling and RMA Update Nathan Hjelm Los Alamos National Laboratory
  • 47. v1.10.x memory scaling • Process structure and network endpoint allocated for all processes in MPI_COMM_WORLD • Structures allocated even if when there is no communication with a peer
  • 48. v2.x memory scaling • Process structure and network endpoint allocated on first communication • Small delay in 1st communication, but…  Big reduction in memory footprint  Configurable cutoff for new behavior  MCA variable: mpi_add_procs_cutoff • Default:1024 processes
  • 49. Memory scaling • mpimemu VmRSS on 320 nodes of a Cray XE-6. 16 ppn aprun • Reduction of 6MB/process @ 5120 processes! 6 MB 8 MB 10 MB 12 MB 14 MB 16 MB 18 MB 20 MB 0 1000 2000 3000 4000 5000 MemoryUsage/Process # Processes mpimemu VmRSS Post MPI_Init Static add_procs Dynamic add_procs 6MB!
  • 50. v1.10 RMA • Fully supports MPI-3 RMA • Full support for MPI datatypes  Stress tested with ARMCI • Emulates one-sided operation using point-to- point components (PML) for communication • Caveat: no asynchronous progress support  Targets must enter MPI to progress any one- sided data movement (!)  Defeats the purpose of passive target RMA
  • 51. v2.x RMA • Uses network RMA and atomic operation support through Byte Transport Layer (BTL)  Fully supports passive target RMA operations • Initially supports:  IB*, Infinipath/Omnipath**, Gemini/Aries • More networks can be supported if they have  put, get, fetch-and-add, compare-and-swap • Otherwise, fall back to emulated RMA
  • 52. v2.x RMA performance • IMB truly passive put on Cray XC-30 0 20 40 60 80 100 20 25 210 215 220 Overlap% Message Size (Bytes) IMB Truly Passive Put Overlap OSC/pt2pt OSC/rdma
  • 53. Beyond the v2.x series
  • 54. What’s the plan over time? • Plan for a new release series once a year  v2.x: release in mid / late 2015  v3.x: release in mid / late 2016  v4.x: release in mid / late 2017  …etc. 54 NOTE: Scheduled releases is a new concept for the Open MPI developer community. We’ll continue to evaluate this plan over time.
  • 55. 3.x series • Branching for v3.x in June 2016 is… unlikely  As noted previously, v2.0.0 is taking longer than expected • More likely:  The v2.x series will have a good life (~1 year?)  We’ll postpone v3.x for a while  Stay tuned
  • 56. MPI v4.0 • Reminder: go to the MPI Forum BOF  3:30pm on Wednesday • MPI v4.0 is at least 2 years away  Content far from certain  Too far off to make predictions  Will likely include portions of what will become MPI-4.0 over time
  • 57. User question What are the major implementation issues involved with the proposed MPI-4 endpoints? Any progress to report? We (developers) have talked about implementing MPI-4 endpoints. But… • We have identified a minimal set of changes in Open MPI for compliance with the current proposal • The endpoints proposal is not final (yet) • Not enough user demand (yet)
  • 58. User question What wire protocols are used? How do you tune them? How does one tune FDR/EDR/OPA fabrics? This is a bit more than we can cover in a 1-hour BOF. Please come see us afterwards.
  • 59. Where do we need help? • Code  Any bug that bothers you  Any feature that you can add • User documentation • Testing • Usability • Release engineering
  • 60. Researchers: how can we help you? • Fork OMPI on GitHub • Ask questions on the devel list • Come to Open MPI developer meetings  Next: February 22-25, 2016, Dallas, TX, USA • Generally: be part of the open source community