DESIGNING A GAME AUDIO ENGINE FOR HSA
LAURENT BETBEDER
SCEA
WHAT’S SO SPECIAL ABOUT CONSOLE GAME DEV?
NOW THAT CONSOLES MOSTLY RUN PC HARDWARE

 Extreme performance optimizations
‒ Until gamers opt for shorter upgrade cycles (phones/tablets business model)?
‒ Can’t run sub-optimal audio code when competing for cycles on crowded compute queues

 Custom hardware, OS, drivers and compilers
‒ To extract max perf from fixed hardware
‒ Helps lengthen platform lifetime
‒ “But but… where’s my OpenCL runtime?”

 Low latency
‒ Music games on consoles need it as much as professional music prod software on desktop
‒ But it is much harder to achieve reliably when a system is constantly overloaded
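
To put a number on "low latency": the delay contributed by one audio buffer is its length in frames divided by the sample rate, and every pipeline stage that holds a full buffer adds that delay again. A minimal sketch, assuming a 48 kHz output rate and hypothetical buffer sizes (neither figure comes from the slides):

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int = 48_000) -> float:
    """Delay added by one buffer of `frames` samples, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


def pipeline_latency_ms(frames: int, stages: int, sample_rate_hz: int = 48_000) -> float:
    """Worst-case delay when every stage holds one full buffer before passing it on."""
    return stages * buffer_latency_ms(frames, sample_rate_hz)


if __name__ == "__main__":
    # Hypothetical buffer sizes, chosen only to show the scaling.
    for frames in (128, 256, 512, 1024):
        print(f"{frames:5d} frames: {buffer_latency_ms(frames):6.2f} ms per buffer")
```

Halving the buffer halves the delay but doubles how often the audio thread must meet its deadline, which is why an overloaded system makes small buffers hard to sustain.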

2 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
GAME AUDIO DSP ON THE ACP
WHY?

 Heavy specialized DSP workloads
‒ Stuff games need badly but don’t really want to deal with
‒ Best fit for dedicated and/or fixed function hardware
‒ Codecs
  ‒ CELP codecs -> party chat
  ‒ 100s of MP3/AT9/AAC decode instances
  ‒ Huge impact on game assets footprint, down/load times
  ‒ Optional output bitstream encoding (AC3/DTS)
‒ Voice recognition
‒ Echo cancelation

 Platform wide IP licensing levels the playing field
‒ Good for indie developers
‒ And good for the platform!

 Available via asynchronous secure system APIs

GAME AUDIO DSP ON THE ACP
WHY NOT?

 Exotic hardware and dev environment
‒ Closed to games
‒ Closed to middleware
‒ Platform specific

 Asynchronous interface
‒ Can’t have sequential interleaving of DSP back and forth between CPU and ACP w/o latency buildup
‒ But ultimately, we want the DSP pipeline to be data driven (by artists who know nothing about this)
‒ Modularity

 Slow clock rate @ 800MHz, very limited SIMD and no FP support
‒ Tough sell against Jaguar for many DSP algorithms
‒ Very tight local memory shared by multiple DSP cores

 Already pretty busy with codec loads and system tasks
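
Without floating-point hardware, DSP code generally falls back on fixed-point arithmetic. A minimal sketch of a gain stage in Q15 format with rounding and saturation; the choice of Q15 and the function names are assumptions for illustration, not details of the ACP or its toolchain:

```python
Q15_ONE = 1 << 15        # scale factor: 1.0 maps to 32768
Q15_MAX = Q15_ONE - 1    # largest representable value, just under 1.0
Q15_MIN = -Q15_ONE       # smallest representable value, exactly -1.0


def to_q15(x: float) -> int:
    """Convert a float to Q15, saturating anything outside [-1.0, 1.0)."""
    return max(Q15_MIN, min(Q15_MAX, int(round(x * Q15_ONE))))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values with round-to-nearest and saturation."""
    product = a * b                          # Q30 intermediate
    rounded = (product + (1 << 14)) >> 15    # add half an LSB, shift back to Q15
    return max(Q15_MIN, min(Q15_MAX, rounded))


def apply_gain_q15(samples, gain):
    """Scale a block of Q15 samples by a Q15 gain."""
    return [q15_mul(s, gain) for s in samples]
```

The saturation step matters: -1.0 times -1.0 would be +1.0, which Q15 cannot represent, so the result is clamped instead of wrapping around.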

GAME AUDIO DSP ON THE GPU
WHY?

 Much more demand for real-time effects today and will keep growing

 CPU FLOPS likely to stagnate and could even decline in HSA as CUs take over SIMD workloads
 Flexibility: some games are CPU bound, others are GPU bound…
 hUMA is a game changer (removes NUMA’s main bottleneck: GPU write back)
 Compute queues with prioritized scheduling and even some form of preemption
 Many real-time audio DSP algorithms work well on wide SIMD units
‒ FFT convolution (spectral processing in general)
‒ Mixing, resampling, wave shaping, etc…

 Mostly coalesced mem accesses
 Low/med bandwidth (< 1GB/s)
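
To illustrate why FFT convolution suits wide SIMD units: once both inputs are in the frequency domain, the convolution reduces to one independent complex multiply per bin. A minimal pure-Python sketch with a textbook recursive radix-2 FFT; this is an illustration, not the engine's implementation, and all function names are hypothetical:

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(signal, impulse):
    """Linear convolution of two non-empty sequences via zero-padded FFTs."""
    out_len = len(signal) + len(impulse) - 1
    n = 1
    while n < out_len:
        n *= 2
    a = fft(list(signal) + [0.0] * (n - len(signal)))
    b = fft(list(impulse) + [0.0] * (n - len(impulse)))
    # Per-bin multiply: no bin depends on any other, so lanes never interact.
    product = [p * q for p, q in zip(a, b)]
    y = fft(product, inverse=True)
    return [v.real / n for v in y[:out_len]]


def direct_convolve(signal, impulse):
    """Reference time-domain convolution, used only to check the result."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out
```

Mixing has the same shape: each output sample is a weighted sum that reads no other output sample.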

GAME AUDIO DSP ON THE GPU
WHY NOT?

 Some algorithms do not work (as) well on wide SIMD units
‒ IIR filters, ADPCM decodes, dynamics: data recursion causes thread interdependencies within wavefronts
‒ Typical AAA game runs 1000s of biquads at various stages in the filtergraph

 Workloads may require batch voice processing to achieve high CU efficiency
‒ Build 2D grids (channels x samples) or 3D grids (channels x subbands x samples)
‒ Swizzling is key but watch out for runtime cost as SIMD widens (static vs dynamic)

 Batch processing goes against the free-form Max/MSP model artists are pushing for
‒ Unique DSP chain for each sound “just because we can!”
‒ Data driven filtergraph and DSP pipeline

 Complex prioritized scheduling & dispatching compute queues
‒ Do not prevent intermittent CU saturation caused by large graphics workloads
‒ Risky for low latency direct path audio DSP

 Proprietary hardware, drivers and shader compilers (PSSL)
‒ Audio middleware will need some incentive to move up there
‒ Most will probably stay on the CPU
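
The recursion problem and the batching workaround can be sketched together. In a biquad, each output sample depends on the two previous outputs, so one voice cannot be spread across SIMD lanes along the time axis; many voices can instead run in lockstep, one lane per voice, over a channels x samples grid. A minimal sketch, assuming a direct form I biquad with a0 normalized to 1 and a hypothetical sample-major swizzle (the slides do not specify the actual layout):

```python
def biquad_batch(voices, coeffs):
    """Run one biquad per voice, all voices in lockstep.

    voices: list of equal-length sample lists (channels x samples).
    coeffs: one (b0, b1, b2, a1, a2) tuple per voice.
    """
    lanes = len(voices)
    frames = len(voices[0]) if voices else 0
    # Swizzle to sample-major order: row n holds sample n of every voice,
    # so one SIMD-style row touches contiguous data.
    swizzled = [[voices[v][n] for v in range(lanes)] for n in range(frames)]

    x1 = [0.0] * lanes
    x2 = [0.0] * lanes
    y1 = [0.0] * lanes
    y2 = [0.0] * lanes
    out = []
    for row in swizzled:            # serial axis: y[n] needs y[n-1] and y[n-2]
        y_row = []
        for v in range(lanes):      # parallel axis: lanes never read each other's state
            b0, b1, b2, a1, a2 = coeffs[v]
            y = b0 * row[v] + b1 * x1[v] + b2 * x2[v] - a1 * y1[v] - a2 * y2[v]
            x2[v], x1[v] = x1[v], row[v]
            y2[v], y1[v] = y1[v], y
            y_row.append(y)
        out.append(y_row)
    # Unswizzle back to channels x samples.
    return [[out[n][v] for n in range(frames)] for v in range(lanes)]
```

The two list comprehensions that swizzle and unswizzle are the runtime cost the slide warns about: they are pure data movement, and the work grows with the number of lanes being filled.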
GAME AUDIO DSP ON JAGUAR
WHY?

 Well known and open x64 dev environment
‒ Middleware friendly
‒ CLANG/LLVM solid & stable

 Full FP unit with SSE4 support

 Early PA is surprisingly good for compiled intrinsics code
‒ ~10% slower than Core i7 @ same clock rate
‒ GDDR5 latency is not an issue
‒ < ~50% of 1 core @ 1.6GHz running the entire KZSF filtergraph

 Only reliable solution for ultra low latency
‒ Music and rhythm games
‒ Run 100% on CPU (including decoding)

GAME AUDIO DSP ON JAGUAR
WHY NOT?

 “Weak laptop CPU” compared to top of the line on desktop
‒ No FMA4
‒ Slow clock @ 1.6GHz (compared to typical desktop)

 256bit AVX mostly useless
 Possible bottleneck down the line

GAME ENGINE CODE
THIN COMPUTE

 3D audio
‒ Sound emitters (distance, directionality and size modeling)
‒ Sound listeners (mic and ear modeling)
‒ Sound geometry (collision meshes)
‒ Deeper physical modeling of sound propagation
‒ Simple ray casting (occlusion, obstruction, indirect audio)
‒ Advanced ray casting (diffraction, real-time individual early reflection tracking)

 Physics
‒ Rigid body dynamics (collisions, friction, destruction)
‒ Fluid dynamics (turbulence)

 Animation, special FX
‒ Inline audio sequencing and modulation
‒ Foley, coarse granular synthesis
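
As an illustration of the simple ray casting case, occlusion can be approximated by testing whether the straight path from emitter to listener passes through any blocking geometry. A minimal sketch that uses spheres as stand-in occluders; real engines would cast against collision meshes, and the function names and the 0.25 occluded gain are hypothetical:

```python
def segment_hits_sphere(start, end, center, radius):
    """True if the segment from start to end passes through the sphere."""
    d = [e - s for s, e in zip(start, end)]        # segment direction
    f = [s - c for s, c in zip(start, center)]     # sphere center to start
    a = sum(x * x for x in d)
    if a == 0.0:
        # Degenerate segment: emitter and listener coincide.
        return sum(x * x for x in f) <= radius * radius
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, -sum(x * y for x, y in zip(f, d)) / a))
    closest = [s + t * x for s, x in zip(start, d)]
    dist2 = sum((p - c) ** 2 for p, c in zip(closest, center))
    return dist2 <= radius * radius


def occlusion_gain(emitter, listener, occluders, occluded_gain=0.25):
    """Return a reduced gain if any occluder blocks the direct path."""
    for center, radius in occluders:
        if segment_hits_sphere(emitter, listener, center, radius):
            return occluded_gain
    return 1.0
```

Each emitter/occluder pair is an independent test, so many rays can be evaluated side by side.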

CONCLUSIONS
 HSA + hUMA is a great combo for high perf game audio!
‒ Maximized perf per W from specialized hardware (CPU + GPU + ACP)
‒ Our challenge is to figure out what to run where and when

 ACP is a great fit for codecs and OS services
‒ But not for modular synthesis and highly customized DSP pipelines

 GPU is a great fit for mid/high latency DSP and high level 3D thin compute
‒ Indirect (reflected) audio
‒ Convolution reverb
‒ 3D ray casting for occlusion/obstruction/diffraction

 CPU is still the best fit for everything else:
‒ Open modular synthesis frameworks and middleware
‒ Low latency audio
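
The "what to run where" decision above can be written down as a small placement rule. This is only a sketch of the conclusions as stated on this slide, not the engine's scheduler; the workload labels, the `place` function and the 10 ms low-latency threshold are all hypothetical:

```python
from enum import Enum


class Unit(Enum):
    ACP = "ACP"
    GPU = "GPU"
    CPU = "CPU"


# Hypothetical labels for the workload categories named in the slides.
ACP_WORKLOADS = {"codec_decode", "bitstream_encode", "voice_recognition", "echo_cancellation"}
GPU_WORKLOADS = {"convolution_reverb", "indirect_audio", "ray_casting"}


def place(workload: str, latency_budget_ms: float, low_latency_ms: float = 10.0) -> Unit:
    """Pick a processing unit for a workload given its latency budget."""
    if latency_budget_ms < low_latency_ms:
        return Unit.CPU      # low latency audio stays on the CPU, decoding included
    if workload in ACP_WORKLOADS:
        return Unit.ACP      # codecs and OS services
    if workload in GPU_WORKLOADS:
        return Unit.GPU      # mid/high latency DSP and 3D thin compute
    return Unit.CPU          # everything else, e.g. modular synthesis


if __name__ == "__main__":
    print(place("convolution_reverb", 40.0))
```

The latency check comes first because the slides place low latency titles entirely on the CPU, even for decoding.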

AUDIO SYNTHESIZER SCHEDULING IN HSA

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names
are for informational purposes only and may be trademarks of their respective owners.
MM-4085, Designing a game audio engine for HSA, by Laurent Betbeder

  • 1. DESIGNING A GAME AUDIO ENGINE FOR HSA
    LAURENT BETBEDER, SCEA
  • 2. WHAT’S SO SPECIAL ABOUT CONSOLE GAME DEV? NOW THAT CONSOLES MOSTLY RUN PC HARDWARE
    ▪ Extreme performance optimizations
      ‒ Until gamers opt for shorter upgrade cycles (phones/tablets business model)?
      ‒ Can’t run sub-optimal audio code when competing for cycles on crowded compute queues
    ▪ Custom hardware, OS, drivers and compilers
      ‒ To extract max perf from fixed hardware
      ‒ Helps lengthen platform lifetime
      ‒ “But but… where’s my OpenCL runtime?”
    ▪ Low latency
      ‒ Music games on consoles need it as much as professional music prod software on desktop
      ‒ But it is much harder to achieve reliably when a system is constantly overloaded
  • 3. GAME AUDIO DSP ON THE ACP: WHY?
    ▪ Heavy specialized DSP workloads
      ‒ Stuff games need badly but don’t really want to deal with
      ‒ Best fit for dedicated and/or fixed function hardware
      ‒ Codecs
        ‒ CELP codecs -> party chat
        ‒ 100s of MP3/AT9/AAC decode instances
        ‒ Huge impact on game assets footprint, down/load times
        ‒ Optional output bitstream encoding (AC3/DTS)
      ‒ Voice recognition
      ‒ Echo cancelation
    ▪ Platform wide IP licensing levels the playing field
      ‒ Good for indie developers
      ‒ And good for the platform!
    ▪ Available via asynchronous secure system APIs
  • 4. GAME AUDIO DSP ON THE ACP: WHY NOT?
    ▪ Exotic hardware and dev environment
      ‒ Closed to games
      ‒ Closed to middleware
      ‒ Platform specific
    ▪ Asynchronous interface
      ‒ Can’t have sequential interleaving of DSP back and forth between CPU and ACP w/o latency buildup
      ‒ But ultimately, we want the DSP pipeline to be data driven (by artists who know nothing about this)
      ‒ Modularity
    ▪ Slow clock rate @ 800MHz, very limited SIMD and no FP support
      ‒ Tough sell against Jaguar for many DSP algorithms
      ‒ Very tight local memory shared by multiple DSP cores
    ▪ Already pretty busy with codec loads and system tasks
  • 5. GAME AUDIO DSP ON THE GPU: WHY?
    ▪ Much more demand for real-time effects today, and it will keep growing
    ▪ CPU FLOPS likely to stagnate and could even decline in HSA as CUs take over SIMD workloads
    ▪ Flexibility: some games are CPU bound, others are GPU bound…
    ▪ hUMA is a game changer (removes NUMA’s main bottleneck: GPU write back)
    ▪ Compute queues with prioritized scheduling and even some form of preemption
    ▪ Many real-time audio DSP algorithms work well on wide SIMD units
      ‒ FFT convolution (spectral processing in general)
      ‒ Mixing, resampling, wave shaping, etc.
    ▪ Mostly coalesced mem accesses
    ▪ Low/med bandwidth (< 1GB/s)
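
To put the "low/med bandwidth (< 1GB/s)" point in perspective, here is a rough back-of-the-envelope calculation. The voice count, the 48 kHz sample rate, the 32-bit float sample size and the one-read-plus-one-write traffic pattern are all assumptions chosen for illustration; none of these figures come from the talk.

```python
def stream_bandwidth(voices, sample_rate_hz=48_000, bytes_per_sample=4, passes=2):
    """Bytes per second moved when every voice is read once and written once.

    All defaults are illustrative assumptions:
    mono voices, 48 kHz, 32-bit float samples, passes=2 (one read + one write).
    """
    return voices * sample_rate_hz * bytes_per_sample * passes


# 1000 simultaneous mono voices (hypothetical load)
bytes_per_s = stream_bandwidth(1000)
print(f"{bytes_per_s / 1e6:.0f} MB/s")  # prints "384 MB/s"
```

Under these assumptions even a thousand voices stay well below 1 GB/s, which is consistent with the slide's claim that audio DSP is not a bandwidth-heavy workload.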
  • 6. GAME AUDIO DSP ON THE GPU: WHY NOT?
    ▪ Some algorithms do not work (as) well on wide SIMD units
      ‒ IIR filters, ADPCM decodes, dynamics: data recursion causes thread interdependencies within wavefronts
      ‒ Typical AAA game runs 1000s of biquads at various stages in the filtergraph
    ▪ Workloads may require batch voice processing to achieve high CU efficiency
      ‒ Build 2D grids (channels x samples) or 3D grids (channels x subbands x samples)
      ‒ Swizzling is key but watch out for runtime cost as SIMD widens (static vs dynamic)
    ▪ Batch processing goes against the free form Max/MSP model artists are pushing for
      ‒ Unique DSP chain for each sound “just because we can!”
      ‒ Data driven filtergraph and DSP pipeline
    ▪ Complex prioritized scheduling & dispatching compute queues
      ‒ Do not prevent intermittent CU saturation caused by large graphics workloads
      ‒ Risky for low latency direct path audio DSP
    ▪ Proprietary hardware, drivers and shader compilers (PSSL)
      ‒ Audio middleware will need some incentive to move up there
      ‒ Most will probably stay on the CPU
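
The data-recursion problem and the batching workaround can be illustrated with a small sketch. It is written in plain Python for readability, not in PSSL or any GPU language, and the function names (`biquad_voice`, `swizzle`, `biquad_batch`) are hypothetical. The first function shows why a single biquad cannot be spread across SIMD lanes along the time axis: each output sample depends on state left by the previous one. The batch version keeps time serial but processes many independent voices per time step, which is the kind of work that could map to the lanes of a wavefront once the data is swizzled to a sample-major layout.

```python
def biquad_voice(x, b0, b1, b2, a1, a2):
    """One voice, transposed direct form II.
    Each output depends on state (z1, z2) written by the previous sample,
    so the loop over time cannot be parallelized as-is."""
    z1 = z2 = 0.0
    y = []
    for s in x:
        out = b0 * s + z1
        z1 = b1 * s - a1 * out + z2
        z2 = b2 * s - a2 * out
        y.append(out)
    return y


def swizzle(voices):
    """Channel-major [voice][sample] -> sample-major [sample][voice]."""
    return [list(frame) for frame in zip(*voices)]


def biquad_batch(frames, coeffs):
    """frames: [sample][voice]; coeffs: one (b0, b1, b2, a1, a2) per voice.
    The outer loop (time) stays serial; the inner loop (voices) has no
    cross-iteration dependency and stands in for the SIMD lanes."""
    n = len(coeffs)
    z1 = [0.0] * n
    z2 = [0.0] * n
    out_frames = []
    for frame in frames:          # serial: recursion in time
        out = [0.0] * n
        for v in range(n):        # independent: one lane per voice
            b0, b1, b2, a1, a2 = coeffs[v]
            o = b0 * frame[v] + z1[v]
            z1[v] = b1 * frame[v] - a1 * o + z2[v]
            z2[v] = b2 * frame[v] - a2 * o
            out[v] = o
        out_frames.append(out)
    return out_frames


# Two hypothetical voices with different (arbitrary but stable) coefficients
voices = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
coeffs = [(0.2, 0.4, 0.2, -0.5, 0.1), (0.3, 0.0, -0.3, -0.2, 0.05)]
batched = swizzle(biquad_batch(swizzle(voices), coeffs))
```

Note that `swizzle` runs twice here, once on the way in and once on the way out; that is the runtime cost the slide warns about when the layout cannot be fixed statically.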
  • 7. GAME AUDIO DSP ON JAGUAR: WHY?
    ▪ Well known and open x64 dev environment
      ‒ Middleware friendly
      ‒ CLANG/LLVM solid & stable
    ▪ Full FP unit with SSE4 support
    ▪ Early PA is surprisingly good for compiled intrinsics code
      ‒ ~10% slower than Core i7 @ same clock rate
      ‒ GDDR5 latency is not an issue
      ‒ < ~50% of 1 core @ 1.6GHz running the entire KZSF filtergraph
    ▪ Only reliable solution for ultra low latency
      ‒ Music and rhythm games
      ‒ Run 100% on CPU (including decoding)
  • 8. GAME AUDIO DSP ON JAGUAR: WHY NOT?
    ▪ “Weak laptop CPU” compared to top of the line on desktop
      ‒ No FMA4
      ‒ Slow clock @ 1.6GHz (compared to typical desktop)
    ▪ 256bit AVX mostly useless
    ▪ Possible bottleneck down the line
  • 9. GAME ENGINE CODE: THIN COMPUTE
    ▪ 3D audio
      ‒ Sound emitters (distance, directionality and size modeling)
      ‒ Sound listeners (mic and ear modeling)
      ‒ Sound geometry (collision meshes)
      ‒ Deeper physical modeling of sound propagation
      ‒ Simple ray casting (occlusion, obstruction, indirect audio)
      ‒ Advanced ray casting (diffraction, real-time individual early reflection tracking)
    ▪ Physics
      ‒ Rigid body dynamics (collisions, friction, destruction)
      ‒ Fluid dynamics (turbulences)
    ▪ Animation, special FX
      ‒ Inline audio sequencing and modulation
      ‒ Foley, coarse granular synthesis
  • 10. CONCLUSIONS
    ▪ HSA + hUMA is a great combo for high perf game audio!
      ‒ Maximized perf per W from specialized hardware (CPU + GPU + ACP)
      ‒ Our challenge is to figure out what to run where and when
    ▪ ACP is a great fit for codecs and OS services
      ‒ But not for modular synthesis and highly customized DSP pipelines
    ▪ GPU is a great fit for mid/high latency DSP and high level 3D thin compute
      ‒ Indirect (reflected) audio
      ‒ Convolution reverb
      ‒ 3D ray casting for occlusion/obstruction/diffraction
    ▪ CPU is still the best fit for everything else:
      ‒ Open modular synthesis frameworks and middleware
      ‒ Low latency audio
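
The "what to run where" conclusions can be read as a simple placement rule. The sketch below is only one possible encoding of that slide: the `Workload` fields, the `Unit` enum and the `place` function are hypothetical names invented for illustration, and a real engine would also have to account for the "when" (queue load, CU saturation), which this sketch ignores.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    ACP = "ACP"
    GPU = "GPU"
    CPU = "CPU"


@dataclass(frozen=True)
class Workload:
    # All fields are illustrative assumptions, not an API from the talk.
    name: str
    is_codec_or_os_service: bool = False
    latency_tolerant: bool = False      # mid/high latency is acceptable
    wide_simd_friendly: bool = False    # batches well on wide SIMD units


def place(w: Workload) -> Unit:
    """Static placement following the conclusions slide."""
    if w.is_codec_or_os_service:
        return Unit.ACP                 # codecs and OS services
    if w.latency_tolerant and w.wide_simd_friendly:
        return Unit.GPU                 # e.g. convolution reverb, ray casting
    return Unit.CPU                     # everything else, incl. low latency


jobs = [
    Workload("AT9 decode", is_codec_or_os_service=True),
    Workload("convolution reverb", latency_tolerant=True, wide_simd_friendly=True),
    Workload("rhythm game direct path"),
]
placement = {j.name: place(j).value for j in jobs}
```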
  • 11. AUDIO SYNTHESIZER SCHEDULING IN HSA
  • 12. DISCLAIMER & ATTRIBUTION
    The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
    AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
    ATTRIBUTION
    © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.