INTRODUCTION
PHIL ROGERS, AMD CORPORATE FELLOW &
PRESIDENT OF HSA FOUNDATION
HSA FOUNDATION
 Founded in June 2012
 Developing a new platform for heterogeneous
systems
 www.hsafoundation.com
 Specifications under development in working
groups to define the platform
 Membership consists of 43 companies and 16
universities
 Adding 1-2 new members each month
© Copyright 2014 HSA Foundation. All Rights Reserved
DIVERSE PARTNERS DRIVING FUTURE OF
HETEROGENEOUS COMPUTING
[Figure: member logos grouped as Founders, Promoters, Supporters, Contributors and Academic. Slide note: needs updating to add the Toshiba logo.]
MEMBERSHIP TABLE
Membership Level | Number | List
Founder | 6 | AMD, ARM, Imagination Technologies, MediaTek Inc., Qualcomm Inc., Samsung Electronics Co Ltd
Promoter | 1 | LG Electronics
Contributor | 25 | Analog Devices Inc., Apical, Broadcom, Canonical Limited, CEVA Inc., Digital Media Professionals, Electronics and Telecommunications Research Institute (ETRI), General Processor, Huawei, Industrial Technology Res. Institute, Marvell International Ltd., Mobica, Oracle, Sonics, Inc, Sony Mobile Communications, Swarm 64 GmbH, Synopsys, Tensilica, Inc., Texas Instruments Inc., Toshiba, VIA Technologies, Vivante Corporation
Supporter | 13 | Allinea Software Ltd, Arteris Inc., Codeplay Software, Fabric Engine, Kishonti, Lawrence Livermore National Laboratory, Linaro, MultiCoreWare, Oak Ridge National Laboratory, Sandia Corporation, StreamComputing, SUSE LLC, UChicago Argonne LLC, Operator of Argonne National Laboratory
Academic | 17 | Institute for Computing Systems Architecture, Missouri University of Science & Technology, National Tsing Hua University, NMAM Institute of Technology, Northeastern University, Rice University, Seoul National University, System Software Lab National Tsing Hua University, Tampere University of Technology, TEI of Crete, The University of Mississippi, University of North Texas, University of Bologna, University of Bristol Microelectronic Research Group, University of Edinburgh, University of Illinois at Urbana-Champaign Department of Computer Science
HETEROGENEOUS PROCESSORS HAVE
PROLIFERATED — MAKE THEM BETTER
 Heterogeneous SOCs have arrived and are a
tremendous advance over previous platforms
 SOCs combine CPU cores, GPU cores and
other accelerators, with high bandwidth access
to memory
 How do we make them even better?
 Easier to program
 Easier to optimize
 Higher performance
 Lower power
 HSA unites accelerators architecturally
 Early focus on the GPU compute accelerator,
but HSA will go well beyond the GPU
INFLECTIONS IN PROCESSOR DESIGN
[Figure: three performance-versus-time curves, one per era, each marked "we are here".]
Single-Core Era: single-thread performance over time. Enabled by: Moore’s Law, voltage scaling. Constrained by: power, complexity. Languages: Assembly, C/C++, Java …
Multi-Core Era: throughput performance over time (# of processors). Enabled by: Moore’s Law, SMP architecture. Constrained by: power, parallel SW, scalability. Languages: pthreads, OpenMP / TBB …
Heterogeneous Systems Era: modern application performance over time (data-parallel exploitation). Enabled by: abundant data parallelism, power efficient GPUs. Temporarily constrained by: programming models, comm. overhead. Languages: Shader, CUDA, OpenCL, C++ and Java
LEGACY GPU COMPUTE
[Figure: CPUs attached to coherent system memory, connected over PCIe™ to a GPU whose compute units (CUs) use a separate, non-coherent GPU memory.]
 Multiple memory pools
 Multiple address spaces
 High overhead dispatch
 Data copies across PCIe
 New languages for
programming
 Dual source development
 Proprietary environments
 Expert programmers only
 Need to fix all of this to
unleash our programmers
The limiters
EXISTING APUS AND SOCS
[Figure: CPU 1 … CPU N and GPU compute units CU 1 … CU M physically integrated on one chip, with coherent system memory and a separate, non-coherent GPU memory.]
 Physical Integration
 Good first step
 Some copies gone
 Two memory pools remain
 Still queue through the OS
 Still requires expert
programmers
 Need to finish the job
AN HSA ENABLED SOC
 Unified Coherent
Memory enables
data sharing across
all processors
 Processors
architected to
operate
cooperatively
 Designed to enable
the application to
run on different
processors at
different times
[Figure: CPU 1 … CPU N and compute units CU 1 … CU M all attached to a single Unified Coherent Memory.]
PILLARS OF HSA*
 Unified addressing across all processors
 Operation into pageable system memory
 Full memory coherency
 User mode dispatch
 Architected queuing language
 Scheduling and context switching
 HSA Intermediate Language (HSAIL)
 High level language support for GPU compute processors
* All features of HSA are subject to change, pending ratification of 1.0 Final specifications by the HSA Board of Directors
HSA SPECIFICATIONS
 HSA System Architecture Specification
 Version 1.0 Provisional, Released April 2014
 Defines discovery, memory model, queue management, atomics, etc
 HSA Programmers Reference Specification
 Version 1.0 Provisional, Released June 2014
 Defines the HSAIL language and object format
 HSA Runtime Software Specification
 Version 1.0 Provisional, expected to be released in July 2014
 Defines the APIs through which an HSA application uses the platform
 All released specifications can be found at the HSA Foundation web site:
 www.hsafoundation.com/standards
HSA - AN OPEN PLATFORM
 Open Architecture, membership open to all
 HSA Programmers Reference Manual
 HSA System Architecture
 HSA Runtime
 Delivered via royalty free standards
 Royalty Free IP, Specifications and APIs
 ISA agnostic for both CPU and GPU
 Membership from all areas of computing
 Hardware companies
 Operating Systems
 Tools and Middleware
 Applications
 Universities
HSA INTERMEDIATE LAYER — HSAIL
 HSAIL is a virtual ISA for parallel programs
 Finalized to ISA by a JIT compiler or “Finalizer”
 ISA independent by design for CPU & GPU
 Explicitly parallel
 Designed for data parallel programming
 Support for exceptions, virtual functions,
and other high level language features
 Lower level than OpenCL SPIR
 Fits naturally in the OpenCL compilation stack
 Suitable to support additional high level languages and programming models:
 Java, C++, OpenMP, Python, etc.
HSA MEMORY MODEL
 Defines visibility ordering between all
threads in the HSA System
 Designed to be compatible with
C++11, Java, OpenCL and .NET
Memory Models
 Relaxed consistency memory model
for parallel compute performance
 Visibility controlled by:
 Load.Acquire
 Store.Release
 Fences
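The acquire/release pairing listed above can be illustrated with C++11 atomics, one of the memory models the slide names as a compatibility target. This is a minimal sketch on ordinary CPU threads, not HSA code: the function name run_message_passing and the payload value are made up for illustration, and the two std::thread objects merely stand in for two agents in an HSA system.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: a producer writes a payload and publishes it with a
// store-release; a consumer spins on a load-acquire and then reads the payload.
int run_message_passing() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // "Store.Release"
    });

    std::thread consumer([&] {
        // "Load.Acquire": once this observes true, the write to payload
        // that preceded the release store is visible to this thread.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

Without the release/acquire pair (for example, with relaxed ordering on both sides) the read of payload would be a data race, which is the kind of visibility question a relaxed consistency model leaves to the programmer.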
HSA QUEUING MODEL
 User mode queuing for low latency dispatch
 Application dispatches directly
 No OS or driver required in the dispatch path
 Architected Queuing Layer
 Single compute dispatch path for all hardware
 No driver translation, direct to hardware
 Allows for dispatch to queue from any agent
 CPU or GPU
 GPU self enqueue enables lots of solutions
 Recursion
 Tree traversal
 Wavefront reforming
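The idea of dispatching without the OS or a driver in the path can be sketched as a ring buffer in shared memory that the application writes directly. The code below is only an illustration on the CPU under stated assumptions: DispatchPacket, UserModeQueue, enqueue, drain and double_element are hypothetical names, the packet layout is far simpler than a real architected packet, and drain() stands in for the hardware packet processor. It assumes one producer and one consumer.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical packet layout; a real dispatch packet carries more fields.
struct DispatchPacket {
    void (*kernel)(std::uint32_t work_item, void* args);
    void* args;
    std::uint32_t grid_size;  // number of work items to run
};

template <std::size_t Capacity>
class UserModeQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side: write the packet into the ring, then publish it by
    // advancing the write index (playing the role of a doorbell).
    bool enqueue(const DispatchPacket& packet) {
        const std::uint64_t write = write_index_.load(std::memory_order_relaxed);
        const std::uint64_t read = read_index_.load(std::memory_order_acquire);
        if (write - read == Capacity) {
            return false;  // queue is full
        }
        ring_[write & (Capacity - 1)] = packet;
        write_index_.store(write + 1, std::memory_order_release);
        return true;
    }

    // Consumer side: processes every published packet and returns the count.
    std::size_t drain() {
        std::size_t processed = 0;
        std::uint64_t read = read_index_.load(std::memory_order_relaxed);
        while (read != write_index_.load(std::memory_order_acquire)) {
            const DispatchPacket packet = ring_[read & (Capacity - 1)];
            for (std::uint32_t i = 0; i < packet.grid_size; ++i) {
                packet.kernel(i, packet.args);
            }
            read_index_.store(++read, std::memory_order_release);
            ++processed;
        }
        return processed;
    }

private:
    std::array<DispatchPacket, Capacity> ring_{};
    std::atomic<std::uint64_t> write_index_{0};
    std::atomic<std::uint64_t> read_index_{0};
};

// Example kernel: doubles one element of an int array.
inline void double_element(std::uint32_t work_item, void* args) {
    static_cast<int*>(args)[work_item] *= 2;
}
```

A caller would build a packet such as {double_element, data, 4}, call enqueue, and let the consumer drain it. Because drain() re-reads the write index on every iteration, a kernel that enqueues further packets onto the same queue would have them picked up in the same pass, which loosely mirrors the self-enqueue case on the slide.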
HSA SOFTWARE
EVOLUTION OF THE SOFTWARE STACK
[Figure: two software stacks side by side, both running on the same hardware (APUs, CPUs, GPUs).
Driver Stack: Apps; Domain Libraries; OpenCL™, DX Runtimes, User Mode Drivers; Graphics Kernel Mode Driver.
HSA Software Stack: Apps; Task Queuing Libraries; HSA Domain Libraries, OpenCL™ 2.x Runtime; HSA Runtime; HSA JIT; HSA Kernel Mode Driver.
Legend: user mode component, kernel mode component, components contributed by third parties.]
OPENCL™ AND HSA
 HSA is an optimized platform architecture
for OpenCL
 Not an alternative to OpenCL
 OpenCL on HSA will benefit from
 Avoidance of wasteful copies
 Low latency dispatch
 Improved memory model
 Pointers shared between CPU and GPU
 OpenCL 2.0 leverages HSA Features
 Shared Virtual Memory
 Platform Atomics
ADDITIONAL LANGUAGES ON HSA
 In development
Language | Body | More Information
Java | Sumatra OpenJDK | http://openjdk.java.net/projects/sumatra/
LLVM | LLVM | Code generator for HSAIL
C++ AMP | Multicoreware | https://bitbucket.org/multicoreware/cppamp-driver-ng/wiki/Home
OpenMP, GCC | AMD, SUSE | https://gcc.gnu.org/viewcvs/gcc/branches/hsa/gcc/README.hsa?view=markup&pathrev=207425
SUMATRA PROJECT OVERVIEW
 AMD/Oracle sponsored Open Source (OpenJDK) project
 Targeted at Java 9 (2015 release)
 Allows developers to efficiently represent data parallel algorithms in
Java
 Sumatra ‘repurposes’ Java 8’s multi-core Stream/Lambda APIs to
enable either CPU or GPU computing
 At runtime, Sumatra enabled Java Virtual Machine (JVM) will dispatch
‘selected’ constructs to available HSA enabled devices
 Developers of Java libraries are already refactoring their library code to
use these same constructs
 So developers using existing libraries should see GPU acceleration
without any code changes
 http://openjdk.java.net/projects/sumatra/
 https://wikis.oracle.com/display/HotSpotInternals/Sumatra
 http://mail.openjdk.java.net/pipermail/sumatra-dev/
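[Figure: Sumatra flow. Development: Application.java is compiled by the Java Compiler into Application.class. Runtime: the Sumatra Enabled JVM runs the application; Lambda/Stream API code is emitted as CPU ISA for the CPU or, through the HSA Finalizer, as GPU ISA for the GPU.]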
HSA OPEN SOURCE SOFTWARE
 HSA will feature an open source Linux execution and compilation stack
 Allows a single shared implementation for many components
 Enables university research and collaboration in all areas
 Because it’s the right thing to do
Component Name | IHV or Common | Rationale
HSA Bolt Library | Common | Enable understanding and debug
HSAIL Code Generator | Common | Enable research
LLVM Contributions | Common | Industry and academic collaboration
HSAIL Assembler | Common | Enable understanding and debug
HSA Runtime | Common | Standardize on a single runtime
HSA Finalizer | IHV | Enable research and debug
HSA Kernel Driver | IHV | For inclusion in Linux distros
WORKLOAD EXAMPLE
SUFFIX ARRAY CONSTRUCTION
CLOUD SERVER WORKLOAD
SUFFIX ARRAYS
 Suffix Arrays are a fundamental data structure
 Designed for efficient searching of a large text
 Quickly locate every occurrence of a substring S in a text T
 Suffix Arrays are used to accelerate in-memory cloud workloads
 Full text index search
 Lossless data compression
 Bio-informatics
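To make "quickly locate every occurrence of a substring S in a text T" concrete, here is a minimal CPU-only sketch. It builds the suffix array by naively sorting suffix start positions, which is not the skew algorithm or the GPU-accelerated construction discussed in this deck, and then answers queries with two binary searches. The function names build_suffix_array and find_all are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Naive construction: sort suffix start positions lexicographically.
// Simple but slow for large texts; shown only to define the data structure.
std::vector<std::size_t> build_suffix_array(const std::string& text) {
    std::vector<std::size_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), std::size_t{0});
    std::sort(sa.begin(), sa.end(), [&](std::size_t a, std::size_t b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Returns the start positions (ascending) of every occurrence of pattern.
// All suffixes that begin with pattern are adjacent in the suffix array,
// so two binary searches find the whole range.
std::vector<std::size_t> find_all(const std::string& text,
                                  const std::vector<std::size_t>& sa,
                                  const std::string& pattern) {
    // Compare only the first pattern.size() characters of each suffix.
    auto first = std::lower_bound(
        sa.begin(), sa.end(), pattern,
        [&](std::size_t suffix, const std::string& p) {
            return text.compare(suffix, p.size(), p) < 0;
        });
    auto last = std::upper_bound(
        first, sa.end(), pattern,
        [&](const std::string& p, std::size_t suffix) {
            return text.compare(suffix, p.size(), p) > 0;
        });
    std::vector<std::size_t> hits(first, last);
    std::sort(hits.begin(), hits.end());
    return hits;
}
```

For the text "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so the suffix array is 5, 3, 1, 0, 4, 2 and a query for "ana" returns positions 1 and 3. Each query costs a logarithmic number of comparisons in the text length, which is why construction, not lookup, is the part worth accelerating.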
ACCELERATED SUFFIX ARRAY
CONSTRUCTION ON HSA
M. Deo, “Parallel Suffix Array Construction and Least Common Prefix for the GPU”, submitted to “Principles and Practice of Parallel Programming (PPoPP’13)”, February 2013.
AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G, 6 compute units, 685MHz; 4GB RAM
By offloading data parallel computations to
GPU, HSA increases performance and
reduces energy for Suffix Array
Construction.
By efficiently sharing data between CPU and
GPU, HSA lets us move compute to data
without penalty of intermediate copies.
+5.8x INCREASED PERFORMANCE
-5x DECREASED ENERGY
[Figure: Skew Algorithm for Compute SA. Stages shown: Merge Sort::GPU, Radix Sort::GPU, Compute SA::CPU, Lexical Rank::CPU, Radix Sort::GPU.]
EASE OF PROGRAMMING
CODE COMPLEXITY VS. PERFORMANCE
LINES-OF-CODE AND PERFORMANCE FOR DIFFERENT
PROGRAMMING MODELS
AMD A10-5800K APU with Radeon™ HD Graphics – CPU: 4 cores, 3800MHz (4200MHz Turbo); GPU: AMD Radeon HD 7660D, 6 compute units, 800MHz; 4GB RAM.
Software – Windows 7 Professional SP1 (64-bit OS); AMD OpenCL™ 1.2 AMD-APP (937.2); Microsoft Visual Studio 11 Beta
[Figure: bar chart of lines of code (LOC axis, 0 to 350), broken down into Copy-back, Algorithm, Launch, Copy, Compile and Init, with a Performance overlay (axis 0 to 35.00), for seven programming models: Serial CPU, TBB, Intrinsics+TBB, OpenCL™-C, OpenCL™-C++, C++ AMP, HSA Bolt.]
(Exemplary ISV “Hessian” Kernel)
THE HSA FUTURE
 Architected heterogeneous processing on the SOC
 Programming of accelerators becomes much easier
 Accelerated software that runs across multiple hardware vendors
 Scalability from smart phones to super computers on a common architecture
 GPU acceleration of parallel processing is the initial target, with DSPs
and other accelerators coming to the HSA system architecture model
 Heterogeneous software ecosystem evolves at a much faster pace
 Lower power, more capable devices in your hand, on the wall, in the cloud
JOIN US!
WWW.HSAFOUNDATION.COM

Hsa programmers reference manual (version 1.0 provisional)
 
ISCA Final Presentaiton - Compilations
ISCA Final Presentaiton -  CompilationsISCA Final Presentaiton -  Compilations
ISCA Final Presentaiton - Compilations
 
Hsa Platform System Architecture Specification Provisional verl 1.0 ratifed
Hsa Platform System Architecture Specification Provisional  verl 1.0 ratifed Hsa Platform System Architecture Specification Provisional  verl 1.0 ratifed
Hsa Platform System Architecture Specification Provisional verl 1.0 ratifed
 
Apu13 cp lu-keynote-final-slideshare
Apu13 cp lu-keynote-final-slideshareApu13 cp lu-keynote-final-slideshare
Apu13 cp lu-keynote-final-slideshare
 
HSA Foundation BoF -Siggraph 2013 Flyer
HSA Foundation BoF -Siggraph 2013 Flyer HSA Foundation BoF -Siggraph 2013 Flyer
HSA Foundation BoF -Siggraph 2013 Flyer
 
HSA Programmer’s Reference Manual: HSAIL Virtual ISA and Programming Model, C...
HSA Programmer’s Reference Manual: HSAIL Virtual ISA and Programming Model, C...HSA Programmer’s Reference Manual: HSAIL Virtual ISA and Programming Model, C...
HSA Programmer’s Reference Manual: HSAIL Virtual ISA and Programming Model, C...
 
ARM Techcon Keynote 2012: Sensor Integration and Improved User Experiences at...
ARM Techcon Keynote 2012: Sensor Integration and Improved User Experiences at...ARM Techcon Keynote 2012: Sensor Integration and Improved User Experiences at...
ARM Techcon Keynote 2012: Sensor Integration and Improved User Experiences at...
 
Phil Rogers IFA Keynote 2012
Phil Rogers IFA Keynote 2012Phil Rogers IFA Keynote 2012
Phil Rogers IFA Keynote 2012
 
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA  by Ben Sanders, AMDBolt C++ Standard Template Libary for HSA  by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMD
 
Hsa2012 logo guidelines.
Hsa2012 logo guidelines.Hsa2012 logo guidelines.
Hsa2012 logo guidelines.
 
What Fabric Engine Can Do With HSA
What Fabric Engine Can Do With HSAWhat Fabric Engine Can Do With HSA
What Fabric Engine Can Do With HSA
 
Fabric Engine: Why HSA is Invaluable
Fabric Engine: Why HSA is  InvaluableFabric Engine: Why HSA is  Invaluable
Fabric Engine: Why HSA is Invaluable
 

Último

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 

Último (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

ISCA Final Presentation - Intro

  • 1. INTRODUCTION
      Phil Rogers, AMD Corporate Fellow & President of HSA Foundation
  • 2. HSA FOUNDATION
      - Founded in June 2012
      - Developing a new platform for heterogeneous systems
      - www.hsafoundation.com
      - Specifications under development in working groups to define the platform
      - Membership consists of 43 companies and 16 universities
      - Adding 1-2 new members each month
      © Copyright 2014 HSA Foundation. All Rights Reserved
  • 3. DIVERSE PARTNERS DRIVING FUTURE OF HETEROGENEOUS COMPUTING
      [Figure: member logos grouped by level: Founders, Promoters, Supporters, Contributors, Academic]
      Slide note: Needs Updating – Add Toshiba Logo
  • 4. MEMBERSHIP TABLE
      Membership Level | Number | List
      Founder | 6 | AMD, ARM, Imagination Technologies, MediaTek Inc., Qualcomm Inc., Samsung Electronics Co Ltd
      Promoter | 1 | LG Electronics
      Contributor | 25 | Analog Devices Inc., Apical, Broadcom, Canonical Limited, CEVA Inc., Digital Media Professionals, Electronics and Telecommunications Research Institute (ETRI), General Processor, Huawei, Industrial Technology Res. Institute, Marvell International Ltd., Mobica, Oracle, Sonics, Inc, Sony Mobile Communications, Swarm 64 GmbH, Synopsys, Tensilica, Inc., Texas Instruments Inc., Toshiba, VIA Technologies, Vivante Corporation
      Supporter | 13 | Allinea Software Ltd, Arteris Inc., Codeplay Software, Fabric Engine, Kishonti, Lawrence Livermore National Laboratory, Linaro, MultiCoreWare, Oak Ridge National Laboratory, Sandia Corporation, StreamComputing, SUSE LLC, UChicago Argonne LLC (Operator of Argonne National Laboratory)
      Academic | 17 | Institute for Computing Systems Architecture, Missouri University of Science & Technology, National Tsing Hua University, NMAM Institute of Technology, Northeastern University, Rice University, Seoul National University, System Software Lab National Tsing Hua University, Tampere University of Technology, TEI of Crete, The University of Mississippi, University of North Texas, University of Bologna, University of Bristol Microelectronic Research Group, University of Edinburgh, University of Illinois at Urbana-Champaign Department of Computer Science
  • 5. HETEROGENEOUS PROCESSORS HAVE PROLIFERATED - MAKE THEM BETTER
      - Heterogeneous SOCs have arrived and are a tremendous advance over previous platforms
      - SOCs combine CPU cores, GPU cores and other accelerators, with high bandwidth access to memory
      - How do we make them even better?
          - Easier to program
          - Easier to optimize
          - Higher performance
          - Lower power
      - HSA unites accelerators architecturally
      - Early focus on the GPU compute accelerator, but HSA will go well beyond the GPU
  • 6. INFLECTIONS IN PROCESSOR DESIGN
      [Figure: three performance-over-time charts, each marked "we are here"]
      - Single-Core Era: Single-thread Performance vs. Time. Enabled by: Moore's Law, Voltage Scaling. Constrained by: Power, Complexity. Programmed with: Assembly, C/C++, Java …
      - Multi-Core Era: Throughput Performance vs. Time (# of processors). Enabled by: Moore's Law, SMP architecture. Constrained by: Power, Parallel SW, Scalability. Programmed with: pthreads, OpenMP / TBB …
      - Heterogeneous Systems Era: Modern Application Performance vs. Time (Data-parallel exploitation). Enabled by: Abundant data parallelism, Power efficient GPUs. Temporarily Constrained by: Programming models, Comm. overhead. Programmed with: Shader, CUDA, OpenCL, C++ and Java
  • 7. LEGACY GPU COMPUTE
      [Diagram: CPUs with System Memory (Coherent), connected over PCIe™ to a GPU made of compute units (CU) with its own GPU Memory (Non-Coherent)]
      The limiters:
      - Multiple memory pools
      - Multiple address spaces
      - High overhead dispatch
      - Data copies across PCIe
      - New languages for programming
      - Dual source development
      - Proprietary environments
      - Expert programmers only
      Need to fix all of this to unleash our programmers
  • 8. EXISTING APUS AND SOCS
      [Diagram: CPU 1 … CPU N and CU 1 … CU M physically integrated on one chip (the CUs form the GPU), with System Memory (Coherent) and GPU Memory (Non-Coherent)]
      - Physical Integration
      - Good first step
      - Some copies gone
      - Two memory pools remain
      - Still queue through the OS
      - Still requires expert programmers
      - Need to finish the job
  • 9. AN HSA ENABLED SOC
      - Unified Coherent Memory enables data sharing across all processors
      - Processors architected to operate cooperatively
      - Designed to enable the application to run on different processors at different times
      [Diagram: CPU 1 … CPU N and CU 1 … CU M all attached to a single Unified Coherent Memory]
  • 10. PILLARS OF HSA*
      - Unified addressing across all processors
      - Operation into pageable system memory
      - Full memory coherency
      - User mode dispatch
      - Architected queuing language
      - Scheduling and context switching
      - HSA Intermediate Language (HSAIL)
      - High level language support for GPU compute processors
      * All features of HSA are subject to change, pending ratification of 1.0 Final specifications by the HSA Board of Directors
  • 11. HSA SPECIFICATIONS
      - HSA System Architecture Specification
          - Version 1.0 Provisional, Released April 2014
          - Defines discovery, memory model, queue management, atomics, etc.
      - HSA Programmers Reference Specification
          - Version 1.0 Provisional, Released June 2014
          - Defines the HSAIL language and object format
      - HSA Runtime Software Specification
          - Version 1.0 Provisional, expected to be released in July 2014
          - Defines the APIs through which an HSA application uses the platform
      - All released specifications can be found at the HSA Foundation web site:
          - www.hsafoundation.com/standards
  • 12. HSA - AN OPEN PLATFORM
      - Open Architecture, membership open to all
          - HSA Programmers Reference Manual
          - HSA System Architecture
          - HSA Runtime
      - Delivered via royalty free standards
          - Royalty Free IP, Specifications and APIs
      - ISA agnostic for both CPU and GPU
      - Membership from all areas of computing
          - Hardware companies
          - Operating Systems
          - Tools and Middleware
          - Applications
          - Universities
  • 13. HSA INTERMEDIATE LAYER - HSAIL
      - HSAIL is a virtual ISA for parallel programs
          - Finalized to ISA by a JIT compiler or "Finalizer"
          - ISA independent by design for CPU & GPU
      - Explicitly parallel
          - Designed for data parallel programming
      - Support for exceptions, virtual functions, and other high level language features
      - Lower level than OpenCL SPIR
          - Fits naturally in the OpenCL compilation stack
      - Suitable to support additional high level languages and programming models:
          - Java, C++, OpenMP, Python, etc.
  • 14. HSA MEMORY MODEL
      - Defines visibility ordering between all threads in the HSA System
      - Designed to be compatible with C++11, Java, OpenCL and .NET Memory Models
      - Relaxed consistency memory model for parallel compute performance
      - Visibility controlled by:
          - Load.Acquire
          - Store.Release
          - Fences
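Since the slide names C++11 as one of the memory models HSA is designed to be compatible with, the acquire/release pairing can be illustrated with plain C++11 atomics on two CPU threads. This is only a sketch of the ordering idea, not HSAIL or HSA runtime code; the function name `publish_and_consume` is made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Message passing with release/acquire ordering (C++11 memory model).
// Hypothetical helper: returns the payload value seen by the consumer.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the data
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary store
        ready.store(true, std::memory_order_release);  // plays the role of Store.Release
    });
    std::thread consumer([&] {
        // Plays the role of Load.Acquire: once this load sees true, every
        // write made before the release store is visible to this thread.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();
    return observed;
}
```

Without the release/acquire pair (for example with relaxed ordering on both sides) the consumer would have no guarantee of seeing the payload write, which is the kind of visibility question the memory model settles.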
  • 15. HSA QUEUING MODEL
      - User mode queuing for low latency dispatch
          - Application dispatches directly
          - No OS or driver required in the dispatch path
      - Architected Queuing Layer
          - Single compute dispatch path for all hardware
          - No driver translation, direct to hardware
      - Allows for dispatch to queue from any agent
          - CPU or GPU
      - GPU self enqueue enables lots of solutions
          - Recursion
          - Tree traversal
          - Wavefront reforming
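The queuing idea can be pictured as a ring buffer in ordinary user memory that any agent may write to, with work items that can enqueue more work while they run. The sketch below is a single-threaded toy model under that assumption: `Packet`, `UserQueue` and `count_tree_nodes` are hypothetical names, and the packet layout is not the architected packet format from the HSA specifications.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical packet: stands in for a kernel object plus its arguments.
struct Packet {
    std::function<void()> kernel;
};

// Fixed-size ring buffer living in ordinary user-mode memory.
template <std::size_t N>
class UserQueue {
public:
    // Producer side: write a packet, then advance the write index
    // (the step that a real system would signal with a doorbell).
    bool enqueue(Packet p) {
        if (write_ - read_ == N) return false;  // queue is full
        ring_[write_ % N] = std::move(p);
        ++write_;
        return true;
    }

    // Consumer side: run packets until the queue is empty, including
    // packets that were enqueued by the packets being run.
    std::size_t drain() {
        std::size_t done = 0;
        while (read_ != write_) {
            ring_[read_ % N].kernel();
            ++read_;
            ++done;
        }
        return done;
    }

private:
    std::array<Packet, N> ring_{};
    std::uint64_t write_ = 0;
    std::uint64_t read_ = 0;
};

// Self-enqueue demo: counts the nodes of a complete binary tree by letting
// each work item enqueue work for its two children. The queue size of 64 is
// enough for small depths; overflow handling is omitted in this sketch.
int count_tree_nodes(int depth) {
    UserQueue<64> q;
    int visited = 0;
    std::function<void(int)> visit = [&](int level) {
        ++visited;
        if (level < depth) {
            q.enqueue({[&visit, level] { visit(level + 1); }});
            q.enqueue({[&visit, level] { visit(level + 1); }});
        }
    };
    q.enqueue({[&visit] { visit(1); }});
    q.drain();
    return visited;
}
```

The point of the toy is that nothing in the enqueue path calls into an operating system or driver; the producer only writes memory and bumps an index, which is the property the slide attributes to user mode queuing.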
  • 17. EVOLUTION OF THE SOFTWARE STACK
      [Figure: two software stacks side by side, both running on Hardware - APUs, CPUs, GPUs]
      - Driver Stack: Apps; Domain Libraries; OpenCL™, DX Runtimes, User Mode Drivers; Graphics Kernel Mode Driver
      - HSA Software Stack: Apps; Task Queuing Libraries; HSA Domain Libraries, OpenCL™ 2.x Runtime; HSA Runtime; HSA JIT; HSA Kernel Mode Driver
      - Legend: User mode component, Kernel mode component, Components contributed by third parties
  • 18. OPENCL™ AND HSA
      - HSA is an optimized platform architecture for OpenCL
          - Not an alternative to OpenCL
      - OpenCL on HSA will benefit from
          - Avoidance of wasteful copies
          - Low latency dispatch
          - Improved memory model
          - Pointers shared between CPU and GPU
      - OpenCL 2.0 leverages HSA Features
          - Shared Virtual Memory
          - Platform Atomics
  • 19. ADDITIONAL LANGUAGES ON HSA
      - In development
      Language | Body | More Information
      Java | Sumatra OpenJDK | http://openjdk.java.net/projects/sumatra/
      LLVM | LLVM | Code generator for HSAIL
      C++ AMP | Multicoreware | https://bitbucket.org/multicoreware/cppamp-driver-ng/wiki/Home
      OpenMP, GCC | AMD, Suse | https://gcc.gnu.org/viewcvs/gcc/branches/hsa/gcc/README.hsa?view=markup&pathrev=207425
  • 20. SUMATRA PROJECT OVERVIEW
      - AMD/Oracle sponsored Open Source (OpenJDK) project
      - Targeted at Java 9 (2015 release)
      - Allows developers to efficiently represent data parallel algorithms in Java
      - Sumatra 'repurposes' Java 8's multi-core Stream/Lambda APIs to enable both CPU and GPU computing
      - At runtime, a Sumatra enabled Java Virtual Machine (JVM) will dispatch 'selected' constructs to available HSA enabled devices
      - Developers of Java libraries are already refactoring their library code to use these same constructs
          - So developers using existing libraries should see GPU acceleration without any code changes
      - http://openjdk.java.net/projects/sumatra/
      - https://wikis.oracle.com/display/HotSpotInternals/Sumatra
      - http://mail.openjdk.java.net/pipermail/sumatra-dev/
      [Diagram: Development: Application.java → Java Compiler → Application.class. Runtime: the application uses the Lambda/Stream API on a Sumatra Enabled JVM, which emits CPU ISA for the CPU and, through the HSA Finalizer, GPU ISA for the GPU]
  • 21. HSA OPEN SOURCE SOFTWARE
      - HSA will feature an open source Linux execution and compilation stack
          - Allows a single shared implementation for many components
          - Enables university research and collaboration in all areas
          - Because it's the right thing to do
      Component Name | IHV or Common | Rationale
      HSA Bolt Library | Common | Enable understanding and debug
      HSAIL Code Generator | Common | Enable research
      LLVM Contributions | Common | Industry and academic collaboration
      HSAIL Assembler | Common | Enable understanding and debug
      HSA Runtime | Common | Standardize on a single runtime
      HSA Finalizer | IHV | Enable research and debug
      HSA Kernel Driver | IHV | For inclusion in Linux distros
  • 22. WORKLOAD EXAMPLE: SUFFIX ARRAY CONSTRUCTION (CLOUD SERVER WORKLOAD)
  • 23. SUFFIX ARRAYS
      - Suffix Arrays are a fundamental data structure
          - Designed for efficient searching of a large text
          - Quickly locate every occurrence of a substring S in a text T
      - Suffix Arrays are used to accelerate in-memory cloud workloads
          - Full text index search
          - Lossless data compression
          - Bio-informatics
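To make the data structure concrete: a suffix array lists the start positions of all suffixes of T in lexicographic order, so every suffix that begins with S sits in one contiguous range that binary search can find. The sketch below uses a naive sort-based construction on the CPU, not the accelerated algorithm from the talk; `build_suffix_array` and `find_all` are illustrative names.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Naive construction: sort the suffix start positions lexicographically.
// Simple but slow for large texts; shown only to define the structure.
std::vector<int> build_suffix_array(const std::string& text) {
    std::vector<int> sa(text.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Returns every start position of `pattern` in `text`, in increasing order.
std::vector<int> find_all(const std::string& text,
                          const std::vector<int>& sa,
                          const std::string& pattern) {
    // Compare only the first pattern.size() characters of each suffix, so
    // all suffixes that start with the pattern compare equal to it.
    auto lo = std::lower_bound(sa.begin(), sa.end(), pattern,
        [&](int suffix, const std::string& p) {
            return text.compare(suffix, p.size(), p) < 0;
        });
    auto hi = std::upper_bound(sa.begin(), sa.end(), pattern,
        [&](const std::string& p, int suffix) {
            return text.compare(suffix, p.size(), p) > 0;
        });
    std::vector<int> hits(lo, hi);
    std::sort(hits.begin(), hits.end());
    return hits;
}
```

For the text "banana" the suffix array is 5, 3, 1, 0, 4, 2, and searching for "ana" returns positions 1 and 3. Once the array exists, each lookup costs a binary search over the suffixes rather than a scan of the whole text, which is why construction time is the part worth accelerating.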
  • 24. ACCELERATED SUFFIX ARRAY CONSTRUCTION ON HSA
      - By offloading data parallel computations to GPU, HSA increases performance and reduces energy for Suffix Array Construction.
      - By efficiently sharing data between CPU and GPU, HSA lets us move compute to data without penalty of intermediate copies.
      [Figure: +5.8x INCREASED PERFORMANCE, -5x DECREASED ENERGY. Skew Algorithm for Compute SA, with stages Merge Sort::GPU, Radix Sort::GPU, Compute SA::CPU, Lexical Rank::CPU, Radix Sort::GPU]
      Source: M. Deo, "Parallel Suffix Array Construction and Least Common Prefix for the GPU", submitted to "Principles and Practice of Parallel Programming (PPoPP'13)", February 2013.
      System: AMD A10 4600M APU with Radeon™ HD Graphics; CPU: 4 cores @ 2.3 GHz (turbo 3.2 GHz); GPU: AMD Radeon HD 7660G, 6 compute units, 685MHz; 4GB RAM
  • 25. EASE OF PROGRAMMING: CODE COMPLEXITY VS. PERFORMANCE
  • 26. LINES-OF-CODE AND PERFORMANCE FOR DIFFERENT PROGRAMMING MODELS
      (Exemplary ISV "Hessian" Kernel)
      [Figure: bar chart of lines of code (LOC, axis 0 to 350) broken down into Init, Compile, Copy, Launch, Algorithm and Copy-back, with Performance (axis 0 to 35) overlaid, for seven programming models: Serial CPU, TBB, Intrinsics+TBB, OpenCL™-C, OpenCL™-C++, C++ AMP, HSA Bolt]
      System: AMD A10-5800K APU with Radeon™ HD Graphics – CPU: 4 cores, 3800MHz (4200MHz Turbo); GPU: AMD Radeon HD 7660D, 6 compute units, 800MHz; 4GB RAM.
      Software: Windows 7 Professional SP1 (64-bit OS); AMD OpenCL™ 1.2 AMD-APP (937.2); Microsoft Visual Studio 11 Beta
  • 27. THE HSA FUTURE
      - Architected heterogeneous processing on the SOC
      - Programming of accelerators becomes much easier
      - Accelerated software that runs across multiple hardware vendors
      - Scalability from smart phones to super computers on a common architecture
      - GPU acceleration of parallel processing is the initial target, with DSPs and other accelerators coming to the HSA system architecture model
      - Heterogeneous software ecosystem evolves at a much faster pace
      - Lower power, more capable devices in your hand, on the wall, in the cloud

Editor's Notes

  1. We will be open on this. We will reach out to partners and collaborate to bring this to market in the right form
  2. Let's take a deeper dive here into the details of the architecture …
  3. The memory model for a new architecture is key
  4. The memory model for a new architecture is key
  5. Key Points: Writing optimal CPU implementations requires complex development too. Programmers have to use both intrinsics for vector parallelism and TBB for multicore parallelism.
     OpenCL C: OpenCL C is a widely known, fairly verbose C-based API, and it shows in the boilerplate initialization code, runtime-compile code, and kernel launch.
     OpenCL C++: Removes initialization code by providing sensible defaults for platform, context, device, and command-queue. There is no need to set these up, and no need to save them and drag them around for later OCL API calls. Reduces the code needed to compile by using C++ exceptions for error-checking and automatic memory allocation (rather than calling the API to determine the size of return args). Default arguments and type-checking let the code focus on the relevant parameters. The host-side support for C++ is available in a "cl.hpp" file which runs on any OpenCL implementation (including NV, Intel, etc). In addition, the AMD OpenCL implementation supports a "static" C++ kernel language: classes, namespaces, templates. (Not used in this implementation.)
     C++ AMP: Initialization is handled through sensible defaults. C++ AMP eliminates platform and context; accelerator_view combines device and queue. Single-source model: eliminates run-time compile code, since compilation happens at compile time together with the host code. Single-source model: streamlined kernel call convention (eliminates clSetKernelArg). The implementation here uses a C++11 lambda to reduce boilerplate code for functor construction (the kernel can directly access local variables). Data transfer code is reduced by the implicit transfers performed by array_view.
     Bolt: Moves the reduction code into the library; what is left is the reduction operator. Removes data transfer and copy-back, because the interface works directly with host data structures. Bolt-for-C++AMP uses lambda syntax; Bolt-for-OCL does not (not supported). The Bolt-for-OCL implementation relies on the C++ static kernel language, recently introduced in AMD APP SDK 2.6 (beta) and 2.7 (production?).
     Other notes: Serial CPU integrates algorithm and reduction, and we call it just "algorithm"; later implementations separate these for performance. Launch is argument setup and calling of the kernel or library routine. Copy-back includes code to copy data back to the host and run a host-side final reduction step. LOC includes appropriate spaces and comments. We attempted to use a similar coding style across all implementations. TBB init is one line to initialize the scheduler (tbb::task_scheduler_init).
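
The note that a library can absorb the reduction code, leaving only the reduction operator, can be illustrated with the C++ standard library. This is a CPU-only stand-in using `std::accumulate` and a C++11 lambda; it is not Bolt code, and both function names are made up for the comparison.

```cpp
#include <numeric>
#include <vector>

// Hand-written version: traversal and reduction are written out together.
float sum_of_squares_loop(const std::vector<float>& v) {
    float total = 0.0f;
    for (float x : v) {
        total += x * x;
    }
    return total;
}

// Library version: the traversal lives in the library and the caller
// supplies only the operator, here as a C++11 lambda.
float sum_of_squares_library(const std::vector<float>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0f,
                           [](float acc, float x) { return acc + x * x; });
}
```

Both functions return the same value for the same input; the second has less code to count because the loop itself is no longer the caller's responsibility, which is the effect the lines-of-code comparison attributes to library-based models.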