SlideShare uma empresa Scribd logo
1 de 26
Baixar para ler offline
© Copyright Khronos® Group 2019
1
APIs for Accelerating Vision and
Inferencing: An Overview of
Options and Trade-offs
Neil Trevett
Khronos Group
May 2019
© Copyright Khronos® Group 2019
2
Sensor Data
Training Data
Trained
Networks
Neural Network
Training
Application
Code
Embedded Vision and Inferencing Acceleration
Compilation Ingestion
GPU
FPGA
DSP
Dedicated
Hardware
Vision
Library
Inferencing
Engine
Compiled
Code
Hardware Acceleration APIs
Diverse Embedded
Hardware
(GPUs, DSPs, FPGAs)
Apps link to
compiled code or
inferencing library
Networks trained on
high-end desktop
and cloud systems
© Copyright Khronos® Group 2019
3
Neural Network Training Acceleration
Desktop and
Cloud Hardware
cuDNN MIOpen clDNN
Neural Net Training
Frameworks
Neural Net Training
Frameworks
Neural Net Training
Frameworks
TPU
Authoring
interchange
Training Data
Neural Net Training
Frameworks
GPU APIs and libraries for
compute acceleration
Network training needs 32-
bit floating-point precision
© Copyright Khronos® Group 2019
4
Sensor Data
Training Data
Trained
Networks
Neural Network
Training
Application
Code
Embedded Vision and Inferencing Acceleration
Compilation Ingestion
GPU
FPGA
DSP
Dedicated
Hardware
Vision
Library
Inferencing
Engine
Compiled
Code
Hardware Acceleration APIs
Diverse Embedded
Hardware
(GPUs, DSPs, FPGAs)
Apps link to
compiled code or
inferencing library
Networks trained on
high-end desktop
and cloud systems
© Copyright Khronos® Group 2019
5
Neural Network Exchange Industry Initiatives
Embedded Inferencing Import Training Interchange
Defined Specification Open Source Project
Multi-company Governance at Khronos Initiated by Facebook & Microsoft
Stability for hardware deployment Software stack flexibility
ONNX and NNEF are Complementary
ONNX moves quickly to track authoring framework updates
NNEF provides a stable bridge from training into edge inferencing engines
© Copyright Khronos® Group 2019
6
NNEF and ONNX Support
NNEF V1.0 released in August 2018
After positive industry feedback on Provisional
Specification – maintenance update issued in March 2019
Extensions to V1.0 in Ratification for imminent release
NNEF Working Group Participants
ONNX 1.5 Released in April 2019
Introduced support for Quantization
ONNX Runtime being integrated with GPU inferencing
engines such as NVIDIA TensorRT
ONNX Supporters
NNEF adopts a rigorous approach to design life cycles - especially needed for safety-critical
or mission-critical applications in automotive, industrial and infrastructure markets
© Copyright Khronos® Group 2019
7
NNEF Tools Ecosystem
Files
Syntax
Parser/
Validator
Caffe and
Caffe2
Convertor
NNEF open source projects hosted on Khronos
NNEF GitHub repository under Apache 2.0
https://github.com/KhronosGroup/NNEF-Tools
ONNX
Convertor
TensorFlow
and TF Lite
Convertor
OpenVX
Ingestion &
Execution
Live
Imminent
Other NNEF Updates
NNEF Model Zoo is now available on GitHub
System-level conformance testing program
is being created
© Copyright Khronos® Group 2019
8
Sensor Data
Training Data
Trained
Networks
Neural Network
Training
Application
Code
Embedded Vision and Inferencing Acceleration
Compilation Ingestion
GPU
FPGA
DSP
Dedicated
Hardware
Vision
Library
Inferencing
Engine
Compiled
Code
Hardware Acceleration APIs
Diverse Embedded
Hardware
(GPUs, DSPs, FPGAs)
Apps link to
compiled code or
inferencing library
Networks trained on
high-end desktop
and cloud systems
© Copyright Khronos® Group 2019
9
Primary Machine Learning Compilers
Import Formats
Caffe, Keras, MXNet,
ONNX
TensorFlow Graph,
MXNet, PaddlePaddle,
Keras, ONNX
PyTorch, ONNX
TensorFlow Graph,
PyTorch
Front-end / IR NNVM / Relay IR nGraph / Stripe IR Glow Core / Glow IR XLA HLO
Output
OpenCL, LLVM, CUDA,
Metal
OpenCL, LLVM, CUDA OpenCL, LLVM
LLVM, TPU IR, XLA IR
TensorFlow Lite / NNAPI
(inc. HW accel)
© Copyright Khronos® Group 2019
10
ML Compiler Steps
1.Import Trained Network
Description
2. Apply graph-level optimizations
using node fusion and memory tiling
3. Decompose to primitive
instructions and emit programs
for accelerated run-times
Consistent Steps
Fast progress but still area of intense research
If compiler optimizations are effective - hardware accelerator APIs can stay ‘simple’ and won’t
need complex metacommands (combined primitive commands) like DirectML
Embedded NN Compilers
CEVA Deep Neural Network (CDNN)
Cadence Xtensa Neural Network Compiler (XNNC)
© Copyright Khronos® Group 2019
11
Sensor Data
Training Data
Trained
Networks
Neural Network
Training
Application
Code
Embedded Vision and Inferencing Acceleration
Compilation Ingestion
GPU
FPGA
DSP
Dedicated
Hardware
Vision
Library
Inferencing
Engine
Compiled
Code
Hardware Acceleration APIs
Diverse Embedded
Hardware
(GPUs, DSPs, FPGAs)
Apps link to
compiled code or
inferencing library
Networks trained on
high-end desktop
and cloud systems
© Copyright Khronos® Group 2019
12
PC/Mobile Platform Inferencing Stacks
Microsoft Windows
Windows Machine Learning (WinML)
Google Android
Neural Network API (NNAPI)
Apple MacOS and iOS
CoreML
https://docs.microsoft.com/en-us/windows/uwp/machine-learning/ https://developer.android.com/ndk/guides/neuralnetworks/ https://developer.apple.com/documentation/coreml
Core ML Model
Consistent Steps
1. Import and optimize trained NN model file
2. Ingest graph into ML runtime
3. Accelerate inferencing operations using
underlying low-level API
TensorFlow Lite
© Copyright Khronos® Group 2019
13
Inferencing Engines
Platform Engines
Android NNAPI
Microsoft WinML
Apple CoreML
Desktop IHVs
AMD MIOpen
Intel clDNN/OpenVINO
NVIDIA cuDNN/TensorRT
VeriSilicon
Huawei
Mobile / Embedded
Arm Compute Library
Huawei MACE
Qualcomm Neural Processing SDK
Synopsis - MetaWare EV Dev Toolkit
TI Deep Learning Library (TIDL)
VeriSilicon Acuity
Acceleration APIs
Cross-platform
Inferencing Engines
Almost all Embedded
Inferencing Engines use
OpenCL to access
accelerator silicon
Both provide
Inferencing AND Vision
acceleration
© Copyright Khronos® Group 2019
14
Khronos OpenVX and NNEF for Inferencing
Vision
Node
Vision
Node
Vision
Node
Downstream
Application
Processing
Native
Camera
Control CNN Nodes
An OpenVX graph mixing CNN
nodes with traditional vision nodes
NNEF Translator converts NNEF
representation into OpenVX Node Graphs
OpenVX
A high-level graph-based abstraction for Portable, Efficient Vision Processing
Optimized OpenVX drivers created and shipped by processor vendors
Can be implemented on almost any hardware or processor
Graph can contain vision processing and NN nodes – enables global optimizations
Hardware Implementations
Performance comparable to hand-optimized, non-portable code
Real, complex applications on real, complex hardware
Much lower development effort than hand-optimized code
© Copyright Khronos® Group 2019
15
OpenVX and OpenCV are Complementary
Implementation Community driven open source library
Callable library implemented
and shipped by hardware vendors
Conformance
Extensive OpenCV Test Suite but
no formal Adopters program
Implementations pass defined conformance
test suite to use trademark
Scope
Very wide.
1000s of imaging and vision functions
Multiple camera APIs/interfaces
Tight focus on core hardware accelerated
functions for mobile vision and inferencing.
Uses external camera drivers
Acceleration
OpenCV 3.0 Transparent API (or T-API) enables
function offload to OpenCL devices
Implementation free to use any underlying API.
Uses OpenCL for Custom Nodes
Efficiency
OpenCV 4.0 G-API graph model for some filters,
arithmetic/binary operations, and well-defined
geometrical transformations
Graph-based execution of all Nodes.
Optimizable computation and data transfer
Inferencing
Deep Neural Network module. API to construct
neural networks from layers for forward pass
computations only. Import from ONNX,
TensorFlow, Torch, Caffe
Neural Network layers and operations
represented directly in the OpenVX Graph.
NNEF direct import
© Copyright Khronos® Group 2019
16
Sensor Data
Training Data
Trained
Networks
Neural Network
Training
Application
Code
Embedded Vision and Inferencing Acceleration
Compilation Ingestion
GPU
FPGA
DSP
Dedicated
Hardware
Vision
Library
Inferencing
Engine
Compiled
Code
Hardware Acceleration APIs
Diverse Embedded
Hardware
(GPUs, DSPs, FPGAs)
Apps link to
compiled code or
inferencing library
Networks trained on
high-end desktop
and cloud systems
© Copyright Khronos® Group 2019
17
GPU
GPU and Heterogeneous Acceleration APIs
FPGA DSP
Custom Hardware
GPU
CPUCPUCPU
GPU
Application
Fragmented GPU
API Landscape
Application
Heterogeneous
Compute
Resources
OpenCL provides a programming and runtime
framework for heterogeneous compute resources
Low-level control over memory allocation and parallel
task execution - simpler and relatively lightweight
compared to GPU APIs
Many mobile SOCs and Embedded Systems
becoming increasingly heterogeneous
Autonomous vehicles
Vision and inferencing
© Copyright Khronos® Group 2019
18
OpenGL ES 1.0 - 2003
Fixed function
graphics
OpenGL ES 2.0 - 2007
Programmable Shaders
OpenGL SC 1.0 - 2005
Fixed function graphics
safety critical subset
OpenGL SC 2.0 - April 2016
Programmable Shaders
safety critical subset
Safety Critical GPU API Evolution
Vulkan 1.0 - 2016
Explicit Graphics and Compute
New Generation Safety
Critical API for Graphics,
Compute and Display
© Copyright Khronos® Group 2019
19
Vulkan SC Working Group Goals
Industry Need
for GPU Acceleration APIs
designed to ease system
safety certification is
increasing
ISO 26262 / ASIL-D
Clearly Definable Design
Goals to Adapt Vulkan for SC
Reduce driver size and complexity
-> Offline pipeline creation, no
dynamic display resolutions
Deterministic Behavior
-> No ignored parameters, static
memory management, eliminate
undefined behaviors
Robust Error Handling
-> Error callbacks so app can respond,
Fatal error callbacks for fast recovery
initiation
C API - MISRA C Compliance
Rendering Compute Display
Vulkan is Compelling Starting
Point for SC GPU API Design
• Widely adopted, royalty-free
open standard
• Low-level explicit API – smaller
surface area than OpenGL
• Not burdened by debug functionality
• Very little internal state
• Well-defined thread behavior
Vulkan SC Working Group announced February 2019
Any company welcome to join Khronos and participate
© Copyright Khronos® Group 2019
20
OpenCL is Pervasive!
Modo
Desktop Creative Apps
Math and Physics Libraries
Vision and Imaging Libraries
CLBlast SYCL-BLAS
Linear Algebra Libraries
Parallel Computation Languages
Arm Compute Library
clDNN
Machine Learning Inferencing Compilers
SYCL-DNN
Machine Learning Libraries
Android NNAPI
TI Deep Learning Library (TIDL)
VeriSilicon
Huawei
Intel
Intel
Synopsis MetaWare EV
Hardware Implementations
© Copyright Khronos® Group 2019
21
SYCL Single Source C++ Parallel Programming
• SYCL 1.2.1 Adopters Program released in July 2018 with open source conformance tests soon
- https://www.khronos.org/news/press/khronos-releases-conformance-test-suite-for-sycl-1.2.1
• Multiple SYCL libraries for vision and inferencing
- SYCL-BLAS, SYCL-DNN, SYCL-Eigen
• Multiple Implémentations shipping: triSYCL, ComputeCpp, HipSYCL
- http://sycl.tech
C++ Kernel Fusion can gives better performance
on complex apps and libs than hand-coding
Single application source
file using STANDARD C++C++ templates and lambda
functions separate host &
device code
Accelerated code passed into
device OpenCL compilers
Ideal for accelerating
larger C++-based engines
and applications
© Copyright Khronos® Group 2019
22
OpenCL Evolution
May 2017
OpenCL 2.2
Spec Maintenance Updates
Regular updates for
spec clarifications and bug fixes
Target 2020
‘OpenCL Next’
Integration of Extensions plus
New Core functionality
Vulkan-like loader and layers
‘Flexible Profile’ for Deployment Flexibility
OpenCL Extension Specs
Vulkan / OpenCL Interop
Scratch-Pad Memory Management
Extended Subgroups
SPIR-V 1.4 ingestion for compiler efficiency
SPIR-V Extended debug info
Repeat Cycle for next
Core Specification
© Copyright Khronos® Group 2019
23
Flexible Profile and Feature Sets
• In OpenCL Next Flexible Profile features become optional for enhanced deployment flexibility
- API and language features e.g. floating point precisions
• Feature Sets reduce danger of fragmentation
- Defined to suit specific markets – e.g. desktop, embedded vision and inferencing
• Implementations are conformant if fully support feature set functionality
- Supporting Feature Sets will help drive sales – encouraging consistent functionality per market
- An implementation may support multiple Feature Sets
OpenCL 2.2 Functionality = queryable, optional feature
Khronos-defined
OpenCL 2.2 Full Profile
Feature Set
Khronos-defined
OpenCL 1.2 Full Profile
Feature Set
Industry-defined Feature
Set E.g. Embedded Vision
and Inferencing
© Copyright Khronos® Group 2019
24
Apps and Libs
Apps and Libs
OpenCL Next Feature Set Discussions
• What could be useful Feature Sets?
- Feature sets for previous spec versions e.g. OpenCL 1.2?
- ‘Desktop’ Feature Set to raise the universally available baseline above OpenCL 1.2?
- Set of OpenCL functionality that runs efficiently over Vulkan?
- Vertical market focused – e.g. inferencing, vision processing
• Some vertically integrated markets don’t care about cross-vendor app portability
- But still value in industry standard programming framework – reduces costs for engineering, tooling, training etc.
- Allow conformance for any combination of features – no Feature Sets
- Enables minimal footprint OpenCL per system – ideal for Safety Certification
OS
OpenCL
Drivers
Apps and Libs
Some markets, e.g.
desktop, value software
portability across multiple
vendors’ acceleratorsOS
OpenCL
Drivers
OS
OpenCL
Drivers
Apps and Libs
OS
OpenCL
Drivers
Apps and Libs
Some vertically integrated embedded
markets do not need application portability
– but rather small surface area esp. for
safety certification
© Copyright Khronos® Group 2019
25
Extending OpenVX with Custom Nodes
OpenVX/OpenCL Interop
• Provisional Extension
• Enables custom OpenCL acceleration to be
invoked from OpenVX User Kernels
• Memory objects can be mapped or copied
Kernel/Graph Import
• Provisional Extension
• Defines container for executable or IR code
• Enables arbitrary code to be inserted as a
OpenVX Node in a graph
OpenCL Command Queue
Application
cl_mem buffers
Fully asynchronous host-device
operations during data exchange
OpenVX data objects
Runtime
Runtime Map or copy OpenVX data
objects into cl_mem buffers
Copy or export
cl_mem buffers into OpenVX
data objects
OpenVX user-kernels can access command
queue and cl_mem objects to asynchronously
schedule OpenCL kernel execution
OpenVX/OpenCL Interop
© Copyright Khronos® Group 2019
26
Resources
• Compilers for Machine Learning Conference Proceedings
- www.c4ml.org
• Khronos Standards: OpenCL, OpenVX, NNEF, SYCL
- Any company is welcome to join Khronos
- www.khronos.org
• OpenCL application and library resources
- https://www.khronos.org/opencl/resources/opencl-applications-using-opencl
- https://www.iwocl.org/resources/opencl-libraries-and-toolkits/
• In-Depth Presentations on OpenCV and OpenVX
- This room today at 1PM and 1:35PM
• Khronos EVS Workshop
Hardware Acceleration for Machine Learning and Computer Vision through Khronos Open Standard APIs
- Many more details on OpenVX, SYCL, OpenCL, NNEF
- Hands-on exercises with OpenVX and NNEF
- Thursday May 23, 9am–5pm
- Rooms 203 and 204, Santa Clara Convention Center
- https://www.khronos.org/events/2019-embedded-vision-summit

Mais conteúdo relacionado

Mais procurados

"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...Edge AI and Vision Alliance
 
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.orgEdge AI and Vision Alliance
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...Edge AI and Vision Alliance
 
Deploy PyTorch models in Production on AWS with TorchServe
Deploy PyTorch models in Production on AWS with TorchServeDeploy PyTorch models in Production on AWS with TorchServe
Deploy PyTorch models in Production on AWS with TorchServeSuman Debnath
 
Qt World Summit 2017: Qt vs. Web - Total Cost of Ownership
Qt World Summit 2017: Qt vs. Web - Total Cost of OwnershipQt World Summit 2017: Qt vs. Web - Total Cost of Ownership
Qt World Summit 2017: Qt vs. Web - Total Cost of OwnershipBurkhard Stubert
 
Performance Evaluation using TAU Performance System and E4S
Performance Evaluation using TAU Performance System and E4SPerformance Evaluation using TAU Performance System and E4S
Performance Evaluation using TAU Performance System and E4SGanesan Narayanasamy
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15 mixARConference
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningHagay Lupesko
 
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...AMD Developer Central
 
A Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersA Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersIntel® Software
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
An Open Hardware CPU written in VHDL, synthesized with Open Source tools
An Open Hardware CPU written in VHDL, synthesized with Open Source toolsAn Open Hardware CPU written in VHDL, synthesized with Open Source tools
An Open Hardware CPU written in VHDL, synthesized with Open Source toolsGanesan Narayanasamy
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...AMD Developer Central
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLinaro
 

Mais procurados (20)

"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org
“OpenCV: Past, Present and Future,” a Presentation from OpenCV.org
 
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati..."The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
"The Vision Acceleration API Landscape: Options and Trade-offs," a Presentati...
 
CFD on Power
CFD on Power CFD on Power
CFD on Power
 
ARM and Machine Learning
ARM and Machine LearningARM and Machine Learning
ARM and Machine Learning
 
Deploy PyTorch models in Production on AWS with TorchServe
Deploy PyTorch models in Production on AWS with TorchServeDeploy PyTorch models in Production on AWS with TorchServe
Deploy PyTorch models in Production on AWS with TorchServe
 
Introduction to GPUs in HPC
Introduction to GPUs in HPCIntroduction to GPUs in HPC
Introduction to GPUs in HPC
 
Qt World Summit 2017: Qt vs. Web - Total Cost of Ownership
Qt World Summit 2017: Qt vs. Web - Total Cost of OwnershipQt World Summit 2017: Qt vs. Web - Total Cost of Ownership
Qt World Summit 2017: Qt vs. Web - Total Cost of Ownership
 
Performance Evaluation using TAU Performance System and E4S
Performance Evaluation using TAU Performance System and E4SPerformance Evaluation using TAU Performance System and E4S
Performance Evaluation using TAU Performance System and E4S
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
Виктор Ерухимов Open VX mixar moscow sept'15
Виктор Ерухимов Open VX  mixar moscow sept'15 Виктор Ерухимов Open VX  mixar moscow sept'15
Виктор Ерухимов Open VX mixar moscow sept'15
 
ONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep LearningONNX - The Lingua Franca of Deep Learning
ONNX - The Lingua Franca of Deep Learning
 
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
PT-4058, Measuring and Optimizing Performance of Cluster and Private Cloud Ap...
 
A Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersA Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing Clusters
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
An Open Hardware CPU written in VHDL, synthesized with Open Source tools
An Open Hardware CPU written in VHDL, synthesized with Open Source toolsAn Open Hardware CPU written in VHDL, synthesized with Open Source tools
An Open Hardware CPU written in VHDL, synthesized with Open Source tools
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compil...
 
OpenPOWER Latest Updates
OpenPOWER Latest UpdatesOpenPOWER Latest Updates
OpenPOWER Latest Updates
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience Report
 

Semelhante a "APIs for Accelerating Vision and Inferencing: An Industry Overview of Options and Trade-offs," a Presentation from the Khronos Group

"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...Edge AI and Vision Alliance
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...Edge AI and Vision Alliance
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductTyrone Systems
 
OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a Service
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a ServiceHP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a Service
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a ServiceThomas Francis
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...Edge AI and Vision Alliance
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...Edge AI and Vision Alliance
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...tdc-globalcode
 
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ..."Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...Edge AI and Vision Alliance
 
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...Embarcados
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0Ganesan Narayanasamy
 
Faster deep learning solutions from training to inference - Michele Tameni - ...
Faster deep learning solutions from training to inference - Michele Tameni - ...Faster deep learning solutions from training to inference - Michele Tameni - ...
Faster deep learning solutions from training to inference - Michele Tameni - ...Codemotion
 
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...Edge AI and Vision Alliance
 
Built Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfBuilt Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfI-Verve Inc
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYEnterprise Management Associates
 
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemHow APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemCisco DevNet
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureBruno Cornec
 

Semelhante a "APIs for Accelerating Vision and Inferencing: An Industry Overview of Options and Trade-offs," a Presentation from the Khronos Group (20)

"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
 
SYCL 2020 Specification
SYCL 2020 SpecificationSYCL 2020 Specification
SYCL 2020 Specification
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
 
OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021
 
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a Service
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a ServiceHP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a Service
HP CAST 2017 Frankfurt : HPE UberCloud boosting HPC as a Service
 
Enabling NFV features in kubernetes
Enabling NFV features in kubernetesEnabling NFV features in kubernetes
Enabling NFV features in kubernetes
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
 
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation..."Update on Khronos Standards for Vision and Machine Learning," a Presentation...
"Update on Khronos Standards for Vision and Machine Learning," a Presentation...
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
 
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ..."Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
"Portable Performance via the OpenVX Computer Vision Library: Case Studies," ...
 
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...
Webinar: Synergy turbinado com o SSP1.4: criptografia elíptica, vídeo pela US...
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0
 
Faster deep learning solutions from training to inference - Michele Tameni - ...
Faster deep learning solutions from training to inference - Michele Tameni - ...Faster deep learning solutions from training to inference - Michele Tameni - ...
Faster deep learning solutions from training to inference - Michele Tameni - ...
 
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...
“Intel Video AI Box—Converging AI, Media and Computing in a Compact and Open ...
 
Built Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdfBuilt Cross-Platform Application with .NET Core Development.pdf
Built Cross-Platform Application with .NET Core Development.pdf
 
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIYWhy Pay for Open Source Linux? Avoid the Hidden Cost of DIY
Why Pay for Open Source Linux? Avoid the Hidden Cost of DIY
 
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation EcosystemHow APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
How APIs are Transforming Cisco Solutions and Catalyzing an Innovation Ecosystem
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined Infrastructure
 

Mais de Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

Mais de Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Último

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Último (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

"APIs for Accelerating Vision and Inferencing: An Industry Overview of Options and Trade-offs," a Presentation from the Khronos Group

  • 1. © Copyright Khronos® Group 2019 1 APIs for Accelerating Vision and Inferencing: An Overview of Options and Trade-offs Neil Trevett Khronos Group May 2019
  • 2. © Copyright Khronos® Group 2019 2 Sensor Data Training Data Trained Networks Neural Network Training Application Code Embedded Vision and Inferencing Acceleration Compilation Ingestion GPU FPGA DSP Dedicated Hardware Vision Library Inferencing Engine Compiled Code Hardware Acceleration APIs Diverse Embedded Hardware (GPUs, DSPs, FPGAs) Apps link to compiled code or inferencing library Networks trained on high-end desktop and cloud systems
  • 3. © Copyright Khronos® Group 2019 3 Neural Network Training Acceleration Desktop and Cloud Hardware cuDNN MIOpen clDNN Neural Net Training Frameworks Neural Net Training Frameworks Neural Net Training Frameworks TPU Authoring interchange Training Data Neural Net Training Frameworks GPU APIs and libraries for compute acceleration Network training needs 32- bit floating-point precision
  • 4. © Copyright Khronos® Group 2019 4 Sensor Data Training Data Trained Networks Neural Network Training Application Code Embedded Vision and Inferencing Acceleration Compilation Ingestion GPU FPGA DSP Dedicated Hardware Vision Library Inferencing Engine Compiled Code Hardware Acceleration APIs Diverse Embedded Hardware (GPUs, DSPs, FPGAs) Apps link to compiled code or inferencing library Networks trained on high-end desktop and cloud systems
  • 5. © Copyright Khronos® Group 2019 5 Neural Network Exchange Industry Initiatives Embedded Inferencing Import Training Interchange Defined Specification Open Source Project Multi-company Governance at Khronos Initiated by Facebook & Microsoft Stability for hardware deployment Software stack flexibility ONNX and NNEF are Complementary ONNX moves quickly to track authoring framework updates NNEF provides a stable bridge from training into edge inferencing engines
  • 6. © Copyright Khronos® Group 2019 6 NNEF and ONNX Support NNEF V1.0 released in August 2018 After positive industry feedback on Provisional Specification – maintenance update issued in March 2019 Extensions to V1.0 in Ratification for imminent release NNEF Working Group Participants ONNX 1.5 Released in April 2019 Introduced support for Quantization ONNX Runtime being integrated with GPU inferencing engines such as NVIDIA TensorRT ONNX Supporters NNEF adopts a rigorous approach to design life cycles - especially needed for safety-critical or mission-critical applications in automotive, industrial and infrastructure markets
  • 7. © Copyright Khronos® Group 2019 7 NNEF Tools Ecosystem Files Syntax Parser/ Validator Caffe and Caffe2 Convertor NNEF open source projects hosted on Khronos NNEF GitHub repository under Apache 2.0 https://github.com/KhronosGroup/NNEF-Tools ONNX Convertor TensorFlow and TF Lite Convertor OpenVX Ingestion & Execution Live Imminent Other NNEF Updates NNEF Model Zoo is now available on GitHub System-level conformance testing program is being created
  • 8. © Copyright Khronos® Group 2019 8 Sensor Data Training Data Trained Networks Neural Network Training Application Code Embedded Vision and Inferencing Acceleration Compilation Ingestion GPU FPGA DSP Dedicated Hardware Vision Library Inferencing Engine Compiled Code Hardware Acceleration APIs Diverse Embedded Hardware (GPUs, DSPs, FPGAs) Apps link to compiled code or inferencing library Networks trained on high-end desktop and cloud systems
  • 9. © Copyright Khronos® Group 2019 9 Primary Machine Learning Compilers Import Formats Caffe, Keras, MXNet, ONNX TensorFlow Graph, MXNet, PaddlePaddle, Keras, ONNX PyTorch, ONNX TensorFlow Graph, PyTorch Front-end / IR NNVM / Relay IR nGraph / Stripe IR Glow Core / Glow IR XLA HLO Output OpenCL, LLVM, CUDA, Metal OpenCL, LLVM, CUDA OpenCL, LLVM LLVM, TPU IR, XLA IR TensorFlow Lite / NNAPI (inc. HW accel)
  • 10. © Copyright Khronos® Group 2019 10 ML Compiler Steps 1.Import Trained Network Description 2. Apply graph-level optimizations using node fusion and memory tiling 3. Decompose to primitive instructions and emit programs for accelerated run-times Consistent Steps Fast progress but still area of intense research If compiler optimizations are effective - hardware accelerator APIs can stay ‘simple’ and won’t need complex metacommands (combined primitive commands) like DirectML Embedded NN Compilers CEVA Deep Neural Network (CDNN) Cadence Xtensa Neural Network Compiler (XNNC)
  • 11. © Copyright Khronos® Group 2019 11 Sensor Data Training Data Trained Networks Neural Network Training Application Code Embedded Vision and Inferencing Acceleration Compilation Ingestion GPU FPGA DSP Dedicated Hardware Vision Library Inferencing Engine Compiled Code Hardware Acceleration APIs Diverse Embedded Hardware (GPUs, DSPs, FPGAs) Apps link to compiled code or inferencing library Networks trained on high-end desktop and cloud systems
  • 12. © Copyright Khronos® Group 2019 12 PC/Mobile Platform Inferencing Stacks Microsoft Windows Windows Machine Learning (WinML) Google Android Neural Network API (NNAPI) Apple MacOS and iOS CoreML https://docs.microsoft.com/en-us/windows/uwp/machine-learning/ https://developer.android.com/ndk/guides/neuralnetworks/ https://developer.apple.com/documentation/coreml Core ML Model Consistent Steps 1. Import and optimize trained NN model file 2. Ingest graph into ML runtime 3. Accelerate inferencing operations using underlying low-level API TensorFlow Lite
  • 13. © Copyright Khronos® Group 2019 13 Inferencing Engines Platform Engines Android NNAPI Microsoft WinML Apple CoreML Desktop IHVs AMD MIOpen Intel clDNN/OpenVINO NVIDIA cuDNN/TensorRT VeriSilicon Huawei Mobile / Embedded Arm Compute Library Huawei MACE Qualcomm Neural Processing SDK Synopsis - MetaWare EV Dev Toolkit TI Deep Learning Library (TIDL) VeriSilicon Acuity Acceleration APIs Cross-platform Inferencing Engines Almost all Embedded Inferencing Engines use OpenCL to access accelerator silicon Both provide Inferencing AND Vision acceleration
  • 14. © Copyright Khronos® Group 2019 14 Khronos OpenVX and NNEF for Inferencing Vision Node Vision Node Vision Node Downstream Application Processing Native Camera Control CNN Nodes An OpenVX graph mixing CNN nodes with traditional vision nodes NNEF Translator converts NNEF representation into OpenVX Node Graphs OpenVX A high-level graph-based abstraction for Portable, Efficient Vision Processing Optimized OpenVX drivers created and shipped by processor vendors Can be implemented on almost any hardware or processor Graph can contain vision processing and NN nodes – enables global optimizations Hardware Implementations Performance comparable to hand-optimized, non-portable code Real, complex applications on real, complex hardware Much lower development effort than hand-optimized code
  • 15. © Copyright Khronos® Group 2019 15 OpenVX and OpenCV are Complementary Implementation Community driven open source library Callable library implemented and shipped by hardware vendors Conformance Extensive OpenCV Test Suite but no formal Adopters program Implementations pass defined conformance test suite to use trademark Scope Very wide. 1000s of imaging and vision functions Multiple camera APIs/interfaces Tight focus on core hardware accelerated functions for mobile vision and inferencing. Uses external camera drivers Acceleration OpenCV 3.0 Transparent API (or T-API) enables function offload to OpenCL devices Implementation free to use any underlying API. Uses OpenCL for Custom Nodes Efficiency OpenCV 4.0 G-API graph model for some filters, arithmetic/binary operations, and well-defined geometrical transformations Graph-based execution of all Nodes. Optimizable computation and data transfer Inferencing Deep Neural Network module. API to construct neural networks from layers for forward pass computations only. Import from ONNX, TensorFlow, Torch, Caffe Neural Network layers and operations represented directly in the OpenVX Graph. NNEF direct import
  • 16. © Copyright Khronos® Group 2019 16 Sensor Data Training Data Trained Networks Neural Network Training Application Code Embedded Vision and Inferencing Acceleration Compilation Ingestion GPU FPGA DSP Dedicated Hardware Vision Library Inferencing Engine Compiled Code Hardware Acceleration APIs Diverse Embedded Hardware (GPUs, DSPs, FPGAs) Apps link to compiled code or inferencing library Networks trained on high-end desktop and cloud systems
  • 17. © Copyright Khronos® Group 2019 17 GPU GPU and Heterogeneous Acceleration APIs FPGA DSP Custom Hardware GPU CPUCPUCPU GPU Application Fragmented GPU API Landscape Application Heterogeneous Compute Resources OpenCL provides a programming and runtime framework for heterogeneous compute resources Low-level control over memory allocation and parallel task execution - simpler and relatively lightweight compared to GPU APIs Many mobile SOCs and Embedded Systems becoming increasingly heterogeneous Autonomous vehicles Vision and inferencing
  • 18. © Copyright Khronos® Group 2019 18 OpenGL ES 1.0 - 2003 Fixed function graphics OpenGL ES 2.0 - 2007 Programmable Shaders OpenGL SC 1.0 - 2005 Fixed function graphics safety critical subset OpenGL SC 2.0 - April 2016 Programmable Shaders safety critical subset Safety Critical GPU API Evolution Vulkan 1.0 - 2016 Explicit Graphics and Compute New Generation Safety Critical API for Graphics, Compute and Display
  • 19. © Copyright Khronos® Group 2019 19 Vulkan SC Working Group Goals Industry Need for GPU Acceleration APIs designed to ease system safety certification is increasing ISO 26262 / ASIL-D Clearly Definable Design Goals to Adapt Vulkan for SC Reduce driver size and complexity -> Offline pipeline creation, no dynamic display resolutions Deterministic Behavior -> No ignored parameters, static memory management, eliminate undefined behaviors Robust Error Handling -> Error callbacks so app can respond, Fatal error callbacks for fast recovery initiation C API - MISRA C Compliance Rendering Compute Display Vulkan is Compelling Starting Point for SC GPU API Design • Widely adopted, royalty-free open standard • Low-level explicit API – smaller surface area than OpenGL • Not burdened by debug functionality • Very little internal state • Well-defined thread behavior Vulkan SC Working Group announced February 2019 Any company welcome to join Khronos and participate
  • 20. © Copyright Khronos® Group 2019 20 OpenCL is Pervasive! Modo Desktop Creative Apps Math and Physics Libraries Vision and Imaging Libraries CLBlast SYCL-BLAS Linear Algebra Libraries Parallel Computation Languages Arm Compute Library clDNN Machine Learning Inferencing Compilers SYCL-DNN Machine Learning Libraries Android NNAPI TI Deep Learning Library (TIDL) VeriSilicon Huawei Intel Intel Synopsis MetaWare EV Hardware Implementations
  • 21. © Copyright Khronos® Group 2019 21 SYCL Single Source C++ Parallel Programming • SYCL 1.2.1 Adopters Program released in July 2018 with open source conformance tests soon - https://www.khronos.org/news/press/khronos-releases-conformance-test-suite-for-sycl-1.2.1 • Multiple SYCL libraries for vision and inferencing - SYCL-BLAS, SYCL-DNN, SYCL-Eigen • Multiple Implémentations shipping: triSYCL, ComputeCpp, HipSYCL - http://sycl.tech C++ Kernel Fusion can gives better performance on complex apps and libs than hand-coding Single application source file using STANDARD C++C++ templates and lambda functions separate host & device code Accelerated code passed into device OpenCL compilers Ideal for accelerating larger C++-based engines and applications
  • 22. © Copyright Khronos® Group 2019 22 OpenCL Evolution May 2017 OpenCL 2.2 Spec Maintenance Updates Regular updates for spec clarifications and bug fixes Target 2020 ‘OpenCL Next’ Integration of Extensions plus New Core functionality Vulkan-like loader and layers ‘Flexible Profile’ for Deployment Flexibility OpenCL Extension Specs Vulkan / OpenCL Interop Scratch-Pad Memory Management Extended Subgroups SPIR-V 1.4 ingestion for compiler efficiency SPIR-V Extended debug info Repeat Cycle for next Core Specification
  • 23. © Copyright Khronos® Group 2019 23 Flexible Profile and Feature Sets • In OpenCL Next Flexible Profile features become optional for enhanced deployment flexibility - API and language features e.g. floating point precisions • Feature Sets reduce danger of fragmentation - Defined to suit specific markets – e.g. desktop, embedded vision and inferencing • Implementations are conformant if fully support feature set functionality - Supporting Feature Sets will help drive sales – encouraging consistent functionality per market - An implementation may support multiple Feature Sets OpenCL 2.2 Functionality = queryable, optional feature Khronos-defined OpenCL 2.2 Full Profile Feature Set Khronos-defined OpenCL 1.2 Full Profile Feature Set Industry-defined Feature Set E.g. Embedded Vision and Inferencing
  • 24. © Copyright Khronos® Group 2019 24 Apps and Libs Apps and Libs OpenCL Next Feature Set Discussions • What could be useful Feature Sets? - Feature sets for previous spec versions e.g. OpenCL 1.2? - ‘Desktop’ Feature Set to raise the universally available baseline above OpenCL 1.2? - Set of OpenCL functionality that runs efficiently over Vulkan? - Vertical market focused – e.g. inferencing, vision processing • Some vertically integrated markets don’t care about cross-vendor app portability - But still value in industry standard programming framework – reduces costs for engineering, tooling, training etc. - Allow conformance for any combination of features – no Feature Sets - Enables minimal footprint OpenCL per system – ideal for Safety Certification OS OpenCL Drivers Apps and Libs Some markets, e.g. desktop, value software portability across multiple vendors’ acceleratorsOS OpenCL Drivers OS OpenCL Drivers Apps and Libs OS OpenCL Drivers Apps and Libs Some vertically integrated embedded markets do not need application portability – but rather small surface area esp. for safety certification
  • 25. © Copyright Khronos® Group 2019 25 Extending OpenVX with Custom Nodes OpenVX/OpenCL Interop • Provisional Extension • Enables custom OpenCL acceleration to be invoked from OpenVX User Kernels • Memory objects can be mapped or copied Kernel/Graph Import • Provisional Extension • Defines container for executable or IR code • Enables arbitrary code to be inserted as a OpenVX Node in a graph OpenCL Command Queue Application cl_mem buffers Fully asynchronous host-device operations during data exchange OpenVX data objects Runtime Runtime Map or copy OpenVX data objects into cl_mem buffers Copy or export cl_mem buffers into OpenVX data objects OpenVX user-kernels can access command queue and cl_mem objects to asynchronously schedule OpenCL kernel execution OpenVX/OpenCL Interop
  • 26. © Copyright Khronos® Group 2019 26 Resources • Compilers for Machine Learning Conference Proceedings - www.c4ml.org • Khronos Standards: OpenCL, OpenVX, NNEF, SYCL - Any company is welcome to join Khronos - www.khronos.org • OpenCL application and library resources - https://www.khronos.org/opencl/resources/opencl-applications-using-opencl - https://www.iwocl.org/resources/opencl-libraries-and-toolkits/ • In-Depth Presentations on OpenCV and OpenVX - This room today at 1PM and 1:35PM • Khronos EVS Workshop Hardware Acceleration for Machine Learning and Computer Vision through Khronos Open Standard APIs - Many more details on OpenVX, SYCL, OpenCL, NNEF - Hands-on exercises with OpenVX and NNEF - Thursday May 23, 9am–5pm - Rooms 203 and 204, Santa Clara Convention Center - https://www.khronos.org/events/2019-embedded-vision-summit