Marco Tassemeier
Osnabrück University
20 June 2022
Reconfigurable ML Accelerators in VEDLIoT
Big Picture
[Figure: VEDLIoT overview. Applications: Smart Home, Industrial IoT, Automotive AI, Open Call. Requirements. Security & Safety: Safety & Robustness, Modelling & Verification, Monitoring, Trusted Execution & Communication, RISC-V extensions. Middleware: Optimizer, Emulation, Benchmarking & Deployment. Hardware Platforms, Microservers & Accelerators (Embedded/Far Edge, Near Edge, Cloud): uRECS, t.RECS, RECS|Box; modules: Jetson AGX NVIDIA Xavier, COM-HPC Xilinx Zynq UltraScale+, SMARC Xilinx Zynq UltraScale+, Coral SoM, Xilinx Kria, RPi CM4, ARVSOM.]
• FPGA-based Accelerators in VEDLIoT
• Dynamic Reconfiguration of Accelerators
• First Results on Performance and Energy Efficiency
FPGA Infrastructure
• FPGA base architecture
• Integration of the required interfaces and accelerators
• Support for dynamic run-time reconfiguration
• Exchange accelerators on the FPGA at run-time to increase resource efficiency and flexibility
• FPGA task deployment mechanism
• Migration of a task from one FPGA to another FPGA
Logic cells (figure labels): 85k, 2800k, 25.2M, 75.6M
Basic FPGA Infrastructure
• FPGA base architecture for the µ.RECS
• Block-based design enabling easy customization of the FPGA platform in the µ.RECS
• Front-end based on Xilinx Vitis with additional (optional) IP cores from LiteX
• Scripting approach for complete system design
• Easy porting to new FPGA platforms, esp. µ.RECS, t.RECS, RECS|Box
• Flexible integration of accelerators
• Integration of the required interfaces for communication (Ethernet, PCIe, etc.) as well as sensors and actuators targeted in the use cases
• PetaLinux enables easy access to the system and to integrated accelerators for software developers
• µ.RECS testbed for early evaluation
FPGA Base Architecture for µ.RECS
[Figure: block diagram of the SMARC module. The SoC comprises the Processing System (dual/quad Arm Cortex-A53, dual Arm Cortex-R5, memory subsystem, interrupt controller, I/O interfaces, platform management, system functions and configuration) and the FPGA fabric (accelerator(s) attached via AXI and AXI-Lite, Xilinx/LiteX memory controller, I/O controller, clock). External interfaces and memories: HDMI, CSI, PCIe x4, GigE, USB, SATA, GPIO, UART, DDR (PS), DDR (PL), eMMC, Flash, SD.]
First Reference Design Based on Xilinx DPU
• Baseline for evaluation of FPGA accelerators developed in VEDLIoT
• Xilinx Deep Learning Processor Unit (DPU)
• Programmable engine for convolutional neural networks
• Easy integration as an IP core in Xilinx UltraScale+ MPSoCs
• Configurable hardware architecture (e.g., parallelism, memory/DSP usage)
• Evaluation on various platforms with Xilinx UltraScale+ MPSoCs
• ZU3EG on Avnet Ultra96-v2 (154k Logic Cells)
• ZU4EG in the µ.RECS testbed (192k Logic Cells)
• ZU15EG on Trenz TE0808 MPSoC Module (747k Logic Cells)
• ZU19EG on Trenz COM-HPC Module in t.RECS (1,143k Logic Cells)
DPU      Peak ops/clock   Peak performance (300 MHz) [GOPS]   Peak performance (200 MHz) [GOPS]
B512     512              153.6                               102.4
B2304    2304             691.2                               460.8
B4096    4096             1228.8                              819.2
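The peak figures in the table follow directly from the peak operations per clock and the clock frequency. The short Python sketch below reproduces them; the function and dictionary names are illustrative only and do not come from any Xilinx tool.

```python
# Peak throughput of a DPU configuration: ops per clock cycle times clock rate.
# Names here are illustrative assumptions, not part of any vendor API.

DPU_OPS_PER_CLOCK = {"B512": 512, "B2304": 2304, "B4096": 4096}


def peak_gops(ops_per_clock: int, clock_mhz: float) -> float:
    """Return peak performance in GOPS for a given clock frequency in MHz."""
    return ops_per_clock * clock_mhz / 1000.0


for name, ops in DPU_OPS_PER_CLOCK.items():
    print(f"{name}: {peak_gops(ops, 300):.1f} GOPS @ 300 MHz, "
          f"{peak_gops(ops, 200):.1f} GOPS @ 200 MHz")
```

These are theoretical upper bounds; the throughput actually achieved depends on the network and on memory bandwidth.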
Efficient Utilization of the Xilinx DPU
• Performance and power monitoring for single- and multi-threaded implementations
• Detailed power measurements on RECS platforms
• Power-aware profiling and optimization
Example DSE Using Different DPU Configurations
Benchmark Performance of DL Accelerators
[Figure: YoloV4 benchmark, performance [GOPS] (logarithmic axis, 10 to 10000) over power [Watt] (2 to 128) for INT8, FP16, and FP32; labelled data points include ZU4 and ZU15; the remaining data labels were not recoverable.]
Dynamic Reconfiguration of DL Accelerators
• Change the characteristics of the DL accelerator at run-time
(e.g., change performance-power trade-off or performance-accuracy trade-off)
[Figure: the SMARC module block diagram extended for dynamic reconfiguration using DFX. In the FPGA fabric, a PR-Region is attached through AXI CB, AXI-Lite CB, and a Disconnect block, and holds either Accelerator A or Accelerator B.]
[Figure: variant of the same block diagram with two PR-Regions, two Disconnect blocks, and several accelerators.]
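The Disconnect blocks in the diagrams suggest the usual sequence for exchanging an accelerator at run-time: isolate the region, load the new partial configuration, then reconnect. The Python sketch below models only that ordering. The class and method names are hypothetical and do not correspond to a real driver or to the VEDLIoT implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PRRegion:
    """Toy model of one partially reconfigurable region (hypothetical API)."""
    loaded: str = "none"
    decoupled: bool = False
    log: list = field(default_factory=list)

    def set_decouple(self, on: bool) -> None:
        # Models the Disconnect block: isolates the region from the AXI side.
        self.decoupled = on
        self.log.append("decouple" if on else "reconnect")

    def load_partial_bitstream(self, accelerator: str) -> None:
        # Reconfiguring a connected region could disturb the static design.
        if not self.decoupled:
            raise RuntimeError("region must be decoupled before reconfiguration")
        self.loaded = accelerator
        self.log.append(f"load:{accelerator}")


def swap_accelerator(region: PRRegion, accelerator: str) -> None:
    """Isolate the region, exchange its contents, then reconnect it."""
    region.set_decouple(True)
    try:
        region.load_partial_bitstream(accelerator)
    finally:
        region.set_decouple(False)


region = PRRegion()
swap_accelerator(region, "accelerator_a")
swap_accelerator(region, "accelerator_b")
print(region.loaded, region.log)
```

A run-time manager could call such a swap whenever the required performance-power or performance-accuracy trade-off changes.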
Own Accelerator Developments
• Generation of dataflow-architectures based on C++ templates
• Optimized for high-level synthesis
• Support for inference and training
• Targeting CNNs, deep reinforcement learning, and federated learning
• Definition of parameterizable layer templates in C++ (e.g., convolution, fully connected, pooling, and activation functions, …)
• Parameterizable, e.g., quantization (from low bit-width INT to float)
• All layers integrate three functions (if required): inference/forward propagation, backpropagation, and update function
• Inference utilizes only forward path
• Learning (DeepRL): utilizes the full functionality of the layer templates
Co-design of Accelerators
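To illustrate the three-function layer structure described above, here is a minimal sketch of a fully connected layer with forward, backward, and update functions. It is written in plain Python for readability and is not the project's C++ HLS template code. The class name, the learning-rate parameter, and the plain-list data layout are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FullyConnected:
    """Hypothetical layer exposing forward, backward, and update functions."""
    weights: list                                 # weights[o][i], one row per output
    lr: float = 0.1                               # learning rate used by update()
    _x: list = field(default_factory=list)        # input cached by forward()
    _grad: list = field(default_factory=list)     # weight gradient from backward()

    def forward(self, x):
        """Inference path: y[o] = sum_i W[o][i] * x[i]."""
        self._x = list(x)
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]

    def backward(self, grad_out):
        """Store dL/dW and return dL/dx for the preceding layer."""
        self._grad = [[g * xi for xi in self._x] for g in grad_out]
        return [sum(g * row[i] for g, row in zip(grad_out, self.weights))
                for i in range(len(self._x))]

    def update(self):
        """Gradient-descent step using the gradient stored by backward()."""
        self.weights = [[w - self.lr * dw for w, dw in zip(row, grow)]
                        for row, grow in zip(self.weights, self._grad)]


layer = FullyConnected(weights=[[1.0, 2.0]], lr=0.5)
y = layer.forward([3.0, 4.0])      # inference uses only this call
grad_in = layer.backward([1.0])    # learning additionally uses backward ...
layer.update()                     # ... and update
print(y, grad_in, layer.weights)
```

An inference-only accelerator would instantiate just the forward function, while a learning accelerator would use all three.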
Thank you for your attention.
Configurable SoC for ML Workflows
• Configurable soft SoC generator provides a platform for low-power AI accelerator exploration
• The generator can produce a system with the set of peripherals required for a specific task
• Scalable from MCU-class to Linux-capable platforms
• Support for a generic, vendor-independent accelerator integration interface makes it well suited as an AI research platform
• Portable across different hardware, based on open-source tooling
• CFUs (Custom Function Units): custom accelerators designed for specific workflows, tightly coupled with the CPU
• Accessed via custom RISC-V instructions
• Can be implemented in high-level hardware description languages such as the Python-based Amaranth
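As an illustration of how a custom RISC-V instruction addressing a CFU might be encoded, the Python sketch below assembles an R-type instruction word. It assumes the standard R-type field layout and the "custom-0" major opcode reserved for extensions by the RISC-V specification. The slides do not state which opcode or function codes the project uses, so the values in the example call are placeholders.

```python
CUSTOM_0 = 0b0001011  # RISC-V "custom-0" major opcode (assumed for this sketch)


def encode_r_type(funct7: int, rs2: int, rs1: int, funct3: int, rd: int,
                  opcode: int = CUSTOM_0) -> int:
    """Pack the R-type fields into a 32-bit instruction word."""
    fields = ((funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# Placeholder operation: funct7 = 0, funct3 = 0, operands in a0 (x10) and
# a1 (x11), result written back to a0.
word = encode_r_type(funct7=0, rs2=11, rs1=10, funct3=0, rd=10)
print(f"{word:#010x}")
```

In such a scheme the funct3 and funct7 fields would select the operation inside the CFU, while rs1, rs2, and rd name ordinary CPU registers, which is what makes the accelerator tightly coupled with the CPU.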
Configurable SoC for ML Workflows
• CFUs offer great flexibility
• Test various dedicated accelerators for specific workflows
• Renode simulation framework extended with CFU support
• Co-simulating functional models of the SoC with Verilated, cycle-accurate CFUs
• Invaluable tool for development
• Massive continuous integration testing
• Different CFU implementations
• Different inputs
• Allows for automatic result comparison and analysis
• Everything open-sourced
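The idea of testing different CFU implementations against different inputs with automatic result comparison can be sketched as a small differential test. The example below is plain Python and does not use Renode or its API. The multiply-accumulate operation, the function names, and the input set are all hypothetical stand-ins.

```python
import itertools


def reference_mac(acc: int, a: int, b: int) -> int:
    """Golden functional model: 32-bit wrapping multiply-accumulate."""
    return (acc + a * b) & 0xFFFFFFFF


def cfu_mac_shift_add(acc: int, a: int, b: int) -> int:
    """Hypothetical alternative implementation using shift-and-add."""
    product = 0
    while b:
        if b & 1:
            product += a
        a <<= 1
        b >>= 1
    return (acc + product) & 0xFFFFFFFF


def compare(implementations: dict, inputs: list) -> list:
    """Return (name, input, expected, actual) for every mismatch found."""
    mismatches = []
    for name, impl in implementations.items():
        for args in inputs:
            expected = reference_mac(*args)
            actual = impl(*args)
            if actual != expected:
                mismatches.append((name, args, expected, actual))
    return mismatches


inputs = list(itertools.product([0, 1, 0xFFFF], [0, 3, 255], [0, 7, 128]))
implementations = {"shift_add": cfu_mac_shift_add}
mismatches = compare(implementations, inputs)
print(f"{len(inputs)} inputs checked, {len(mismatches)} mismatches")
```

In a continuous integration setup, the candidate functions would be replaced by simulated CFU runs and the job would fail whenever the mismatch list is non-empty.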
Soft SoC Platform
• Generation of soft SoC platforms
• Utilize RISC-V soft cores
• Generic interface to AI accelerators
• Modelled in an open-source emulation environment
• Utilize LiteX SoC generator
• Run-time reconfiguration
• Accelerators
• Processor cores
[Figure: FPGA with the labels Base Architecture, AI-Accelerator, Run-Time Reconfiguration, and Interface.]

Mais conteúdo relacionado

Semelhante a HiPEAC 2022_Marco Tassemeier presentation

Hari Krishna Vetsa Resume
Hari Krishna Vetsa ResumeHari Krishna Vetsa Resume
Hari Krishna Vetsa ResumeHari Krishna
 
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptx
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptxProjectVault[VivekKumar_CS-C_6Sem_MIT].pptx
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptxVivek Kumar
 
AXONIM 2018 industrial automation technical support
AXONIM 2018 industrial automation technical supportAXONIM 2018 industrial automation technical support
AXONIM 2018 industrial automation technical supportVitaliy Bozhkov ✔
 
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentation
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentationSS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentation
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentationVEDLIoT Project
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoEmbarcados
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 
HiPEAC 2019 Workshop - Hardware Starter Kit Agri
HiPEAC 2019 Workshop - Hardware Starter Kit Agri HiPEAC 2019 Workshop - Hardware Starter Kit Agri
HiPEAC 2019 Workshop - Hardware Starter Kit Agri Tulipp. Eu
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AITyrone Systems
 
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptx
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptx
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSnehaLatha68
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V International
 
SoC~FPGA~ASIC~Embedded
SoC~FPGA~ASIC~EmbeddedSoC~FPGA~ASIC~Embedded
SoC~FPGA~ASIC~EmbeddedChili.CHIPS
 
Ti k2 e for mission critical applications
Ti k2 e for mission critical applicationsTi k2 e for mission critical applications
Ti k2 e for mission critical applicationsHitesh Jani
 

Semelhante a HiPEAC 2022_Marco Tassemeier presentation (20)

Sundance at the 49th Intelligent Sensing Program
Sundance at the 49th Intelligent Sensing ProgramSundance at the 49th Intelligent Sensing Program
Sundance at the 49th Intelligent Sensing Program
 
Hari Krishna Vetsa Resume
Hari Krishna Vetsa ResumeHari Krishna Vetsa Resume
Hari Krishna Vetsa Resume
 
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptx
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptxProjectVault[VivekKumar_CS-C_6Sem_MIT].pptx
ProjectVault[VivekKumar_CS-C_6Sem_MIT].pptx
 
AXONIM 2018 industrial automation technical support
AXONIM 2018 industrial automation technical supportAXONIM 2018 industrial automation technical support
AXONIM 2018 industrial automation technical support
 
Security and functional safety
Security and functional safetySecurity and functional safety
Security and functional safety
 
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentation
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentationSS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentation
SS-CPSIoT 2023_Kevin Mika and Piotr Zierhoffer presentation
 
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mãoWebinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
Webinar: NVIDIA JETSON – A Inteligência Artificial na palma de sua mão
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
FPGA @ UPB-BGA
FPGA @ UPB-BGAFPGA @ UPB-BGA
FPGA @ UPB-BGA
 
HiPEAC 2019 Workshop - Hardware Starter Kit Agri
HiPEAC 2019 Workshop - Hardware Starter Kit Agri HiPEAC 2019 Workshop - Hardware Starter Kit Agri
HiPEAC 2019 Workshop - Hardware Starter Kit Agri
 
Zynq 7010
Zynq 7010Zynq 7010
Zynq 7010
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptx
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptxSoftcore processor.pptx
Softcore processor.pptxSoftcore processor.pptxSoftcore processor.pptx
 
Techmeeting-17feb2016
Techmeeting-17feb2016Techmeeting-17feb2016
Techmeeting-17feb2016
 
DRIVE PX 2
DRIVE PX 2DRIVE PX 2
DRIVE PX 2
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
 
SoC~FPGA~ASIC~Embedded
SoC~FPGA~ASIC~EmbeddedSoC~FPGA~ASIC~Embedded
SoC~FPGA~ASIC~Embedded
 
uCluster
uClusteruCluster
uCluster
 
Ti k2 e for mission critical applications
Ti k2 e for mission critical applicationsTi k2 e for mission critical applications
Ti k2 e for mission critical applications
 

Mais de VEDLIoT Project

IoT Tech Expo 2023_Micha vor dem Berge presentation
IoT Tech Expo 2023_Micha vor dem Berge presentationIoT Tech Expo 2023_Micha vor dem Berge presentation
IoT Tech Expo 2023_Micha vor dem Berge presentationVEDLIoT Project
 
Computing Frontiers 2023_Pedro Trancoso presentation
Computing Frontiers 2023_Pedro Trancoso presentationComputing Frontiers 2023_Pedro Trancoso presentation
Computing Frontiers 2023_Pedro Trancoso presentationVEDLIoT Project
 
HiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationHiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationVEDLIoT Project
 
IoT Week 2022-NGIoT session_Micha vor dem Berge presentation
IoT Week 2022-NGIoT session_Micha vor dem Berge presentationIoT Week 2022-NGIoT session_Micha vor dem Berge presentation
IoT Week 2022-NGIoT session_Micha vor dem Berge presentationVEDLIoT Project
 
Next Generation IoT Architectures_Hans Salomonsson
Next Generation IoT Architectures_Hans SalomonssonNext Generation IoT Architectures_Hans Salomonsson
Next Generation IoT Architectures_Hans SalomonssonVEDLIoT Project
 
CONASENSE 2022_Jens Hagemeyer presentation
CONASENSE 2022_Jens Hagemeyer presentationCONASENSE 2022_Jens Hagemeyer presentation
CONASENSE 2022_Jens Hagemeyer presentationVEDLIoT Project
 
NGIoT standardisation workshops_Jens Hagemeyer presentation
NGIoT standardisation workshops_Jens Hagemeyer presentationNGIoT standardisation workshops_Jens Hagemeyer presentation
NGIoT standardisation workshops_Jens Hagemeyer presentationVEDLIoT Project
 
IoT Tech Expo 2023_Pedro Trancoso presentation
IoT Tech Expo 2023_Pedro Trancoso presentationIoT Tech Expo 2023_Pedro Trancoso presentation
IoT Tech Expo 2023_Pedro Trancoso presentationVEDLIoT Project
 
HiPEAC-CSW 2022_Kevin Mika presentation
HiPEAC-CSW 2022_Kevin Mika presentationHiPEAC-CSW 2022_Kevin Mika presentation
HiPEAC-CSW 2022_Kevin Mika presentationVEDLIoT Project
 
HiPEAC 2022-DL4IoT workshop_René Griessl presentation
HiPEAC 2022-DL4IoT workshop_René Griessl presentationHiPEAC 2022-DL4IoT workshop_René Griessl presentation
HiPEAC 2022-DL4IoT workshop_René Griessl presentationVEDLIoT Project
 
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentation
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentationHiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentation
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentationVEDLIoT Project
 
IoT Week 2021_Jens Hagemeyer presentation
IoT Week 2021_Jens Hagemeyer presentationIoT Week 2021_Jens Hagemeyer presentation
IoT Week 2021_Jens Hagemeyer presentationVEDLIoT Project
 
HiPEAC 2022_Marcelo Pasin presentation
HiPEAC 2022_Marcelo Pasin presentationHiPEAC 2022_Marcelo Pasin presentation
HiPEAC 2022_Marcelo Pasin presentationVEDLIoT Project
 
IoT Tech Expo 2023_Marcelo Pasin presentation
IoT Tech Expo 2023_Marcelo Pasin presentationIoT Tech Expo 2023_Marcelo Pasin presentation
IoT Tech Expo 2023_Marcelo Pasin presentationVEDLIoT Project
 
IoT Tech Expo 2023_Hans-Martin Heyn presentation
IoT Tech Expo 2023_Hans-Martin Heyn presentationIoT Tech Expo 2023_Hans-Martin Heyn presentation
IoT Tech Expo 2023_Hans-Martin Heyn presentationVEDLIoT Project
 
HiPEAC2022_António Casimiro presentation
HiPEAC2022_António Casimiro presentationHiPEAC2022_António Casimiro presentation
HiPEAC2022_António Casimiro presentationVEDLIoT Project
 
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentation
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentationNGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentation
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentationVEDLIoT Project
 
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...VEDLIoT Project
 
NGIoT Sustainability Workshop 2023_Rene Griessl presentation
NGIoT Sustainability Workshop 2023_Rene Griessl presentationNGIoT Sustainability Workshop 2023_Rene Griessl presentation
NGIoT Sustainability Workshop 2023_Rene Griessl presentationVEDLIoT Project
 
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar Azhar
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar AzharHiPEAC2022-DL4IoT workshop_ Muhammad Waqar Azhar
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar AzharVEDLIoT Project
 

Mais de VEDLIoT Project (20)

IoT Tech Expo 2023_Micha vor dem Berge presentation
IoT Tech Expo 2023_Micha vor dem Berge presentationIoT Tech Expo 2023_Micha vor dem Berge presentation
IoT Tech Expo 2023_Micha vor dem Berge presentation
 
Computing Frontiers 2023_Pedro Trancoso presentation
Computing Frontiers 2023_Pedro Trancoso presentationComputing Frontiers 2023_Pedro Trancoso presentation
Computing Frontiers 2023_Pedro Trancoso presentation
 
HiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentationHiPEAC-CSW 2022_Pedro Trancoso presentation
HiPEAC-CSW 2022_Pedro Trancoso presentation
 
IoT Week 2022-NGIoT session_Micha vor dem Berge presentation
IoT Week 2022-NGIoT session_Micha vor dem Berge presentationIoT Week 2022-NGIoT session_Micha vor dem Berge presentation
IoT Week 2022-NGIoT session_Micha vor dem Berge presentation
 
Next Generation IoT Architectures_Hans Salomonsson
Next Generation IoT Architectures_Hans SalomonssonNext Generation IoT Architectures_Hans Salomonsson
Next Generation IoT Architectures_Hans Salomonsson
 
CONASENSE 2022_Jens Hagemeyer presentation
CONASENSE 2022_Jens Hagemeyer presentationCONASENSE 2022_Jens Hagemeyer presentation
CONASENSE 2022_Jens Hagemeyer presentation
 
NGIoT standardisation workshops_Jens Hagemeyer presentation
NGIoT standardisation workshops_Jens Hagemeyer presentationNGIoT standardisation workshops_Jens Hagemeyer presentation
NGIoT standardisation workshops_Jens Hagemeyer presentation
 
IoT Tech Expo 2023_Pedro Trancoso presentation
IoT Tech Expo 2023_Pedro Trancoso presentationIoT Tech Expo 2023_Pedro Trancoso presentation
IoT Tech Expo 2023_Pedro Trancoso presentation
 
HiPEAC-CSW 2022_Kevin Mika presentation
HiPEAC-CSW 2022_Kevin Mika presentationHiPEAC-CSW 2022_Kevin Mika presentation
HiPEAC-CSW 2022_Kevin Mika presentation
 
HiPEAC 2022-DL4IoT workshop_René Griessl presentation
HiPEAC 2022-DL4IoT workshop_René Griessl presentationHiPEAC 2022-DL4IoT workshop_René Griessl presentation
HiPEAC 2022-DL4IoT workshop_René Griessl presentation
 
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentation
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentationHiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentation
HiPEAC2023-DL4IoT Workshop_Jean Hagemeyer presentation
 
IoT Week 2021_Jens Hagemeyer presentation
IoT Week 2021_Jens Hagemeyer presentationIoT Week 2021_Jens Hagemeyer presentation
IoT Week 2021_Jens Hagemeyer presentation
 
HiPEAC 2022_Marcelo Pasin presentation
HiPEAC 2022_Marcelo Pasin presentationHiPEAC 2022_Marcelo Pasin presentation
HiPEAC 2022_Marcelo Pasin presentation
 
IoT Tech Expo 2023_Marcelo Pasin presentation
IoT Tech Expo 2023_Marcelo Pasin presentationIoT Tech Expo 2023_Marcelo Pasin presentation
IoT Tech Expo 2023_Marcelo Pasin presentation
 
IoT Tech Expo 2023_Hans-Martin Heyn presentation
IoT Tech Expo 2023_Hans-Martin Heyn presentationIoT Tech Expo 2023_Hans-Martin Heyn presentation
IoT Tech Expo 2023_Hans-Martin Heyn presentation
 
HiPEAC2022_António Casimiro presentation
HiPEAC2022_António Casimiro presentationHiPEAC2022_António Casimiro presentation
HiPEAC2022_António Casimiro presentation
 
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentation
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentationNGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentation
NGIoT Sustainability Workshop 2023_ Hans-Martin Heyn presentation
 
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...
EU-IoT Training Workshops Series: AIoT and Edge Machine Learning 2021_Jens Ha...
 
NGIoT Sustainability Workshop 2023_Rene Griessl presentation
NGIoT Sustainability Workshop 2023_Rene Griessl presentationNGIoT Sustainability Workshop 2023_Rene Griessl presentation
NGIoT Sustainability Workshop 2023_Rene Griessl presentation
 
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar Azhar
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar AzharHiPEAC2022-DL4IoT workshop_ Muhammad Waqar Azhar
HiPEAC2022-DL4IoT workshop_ Muhammad Waqar Azhar
 

Último

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPirithiRaju
 

Último (20)

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...

HiPEAC 2022_Marco Tassemeier presentation

  • 1. Reconfigurable ML Accelerators in VEDLIoT. Marco Tassemeier, Osnabrück University, 20 June 2022
  • 2. Big Picture
      • [Overview diagram. Layers: Applications; Requirements; Security & Safety; Middleware; Hardware Platforms (Microservers & Accelerators)]
      • Applications: Smart Home, Industrial IoT, Automotive AI, Open Call
      • Further labels: Safety & Robustness, Modelling & Verification, Monitoring, Trusted Execution & Communication, RISC-V extensions
      • Middleware: Optimizer, Emulation, Benchmarking & Deployment
      • Deployment tiers: Embedded/Far Edge, Near Edge, Cloud
      • RECS platforms: uRECS, t.RECS, RECS|Box
      • Modules: Jetson AGX, NVIDIA Xavier, COM-HPC Xilinx Zynq UltraScale+, SMARC Xilinx Zynq UltraScale+, Coral SoM, Xilinx Kria, RPi CM4, ARVSOM
  • 3. Big Picture
      • [Same overview diagram as slide 2, with the Hardware Platforms layer (Microservers & Accelerators; Embedded/Far Edge, Near Edge, Cloud; uRECS, t.RECS, RECS|Box and their modules) shown again as a detail]
      • FPGA-based Accelerators in VEDLIoT
      • Dynamic Reconfiguration of Accelerators
      • First Results on Performance and Energy Efficiency
  • 4. FPGA Infrastructure
      • FPGA base architecture
      • Integration of the required interfaces and accelerators
      • Support for dynamic run-time reconfiguration
      • Exchange accelerators on the FPGA at run-time to increase resource efficiency and flexibility
      • FPGA task deployment mechanism
      • Migration of a task from one FPGA to another FPGA
      • [Figure labels: Logic Cells 85k, 2800k, 25.2M, 75.6M]
  • 5. Basic FPGA Infrastructure
      • FPGA base architecture for the µ.RECS
      • Block-based design enabling easy customization of the FPGA platform in the µ.RECS
      • Front-end based on Xilinx Vitis with additional (optional) IP cores from LiteX
      • Scripting approach for complete system design
      • Easy porting to new FPGA platforms, esp. µ.RECS, t.RECS, RECS|Box
      • Flexible integration of accelerators
      • Integration of the required interfaces for communication (Ethernet, PCIe, etc.) as well as sensors and actuators targeted in the use cases
      • PetaLinux enables easy access to the system and to integrated accelerators for software developers
      • µ.RECS testbed for early evaluation
      • [Block diagram: SMARC module with SoC, split into Processing System and FPGA fabric. Labels: Dual/Quad Arm Cortex-A53, Dual Arm Cortex-R5, Memory Subsystem, Interrupt Controller, DDR (PS), DDR (PL), Xilinx/LiteX Memory Ctrl, I/O Interfaces, I/O Ctrl, Accelerator(s) on AXI and AXI-Lite, Clk, Platform Mgmt, System Funct. & Configuration; interfaces HDMI, CSI, PCIe x4, GigE, USB, SATA, eMMC, Flash, SD, GPIO, UART]
  • 6. FPGA Base Architecture for µ.RECS
      • [Block diagram: SMARC module with SoC, split into Processing System and FPGA fabric. Labels: Dual/Quad Arm Cortex-A53, Dual Arm Cortex-R5, Memory Subsystem, Interrupt Controller, DDR (PS), DDR (PL), Xilinx/LiteX Memory Ctrl, I/O Interfaces, I/O Ctrl, Accelerator(s) on AXI and AXI-Lite, Clk, Platform Mgmt, System Funct. & Configuration; interfaces HDMI, CSI, PCIe x4, GigE, USB, SATA, eMMC, Flash, SD, GPIO, UART]
  • 7. First Reference Design Based on Xilinx DPU
      • Baseline for evaluation of FPGA accelerators developed in VEDLIoT
      • Xilinx Deep Learning Processor Unit (DPU)
      • Programmable engine for convolutional neural networks
      • Easy integration as an IP core in Xilinx UltraScale+ MPSoCs
      • Configurable hardware architecture (e.g., parallelism, memory/DSP usage)
      • Evaluation on various platforms with Xilinx UltraScale+ MPSoCs:
          • ZU3EG on Avnet Ultra96-v2 (154k Logic Cells)
          • ZU4EG in the µ.RECS testbed (192k Logic Cells)
          • ZU15EG on Trenz TE0808 MPSoC Module (747k Logic Cells)
          • ZU19EG on Trenz COM-HPC Module in t.RECS (1,143k Logic Cells)

      DPU      Peak ops/clock   Peak performance at 300 MHz [GOPS]   Peak performance at 200 MHz [GOPS]
      B512     512              153.6                                102.4
      B2304    2304             691.2                                460.8
      B4096    4096             1228.8                               819.2
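
The peak-performance figures in the DPU table follow from multiplying the peak operations per clock by the clock frequency. A minimal Python sketch of that arithmetic is given here; the names `DPU_OPS_PER_CLOCK` and `peak_gops` are illustrative and do not come from the slides or from any Xilinx tool.

```python
# Peak DPU throughput = operations per clock cycle * clock frequency.
# Ops/clock values and clock rates are the ones listed on slide 7;
# the helper name below is an assumption made for this sketch.
DPU_OPS_PER_CLOCK = {"B512": 512, "B2304": 2304, "B4096": 4096}


def peak_gops(ops_per_clock: int, clock_mhz: float) -> float:
    """Return the theoretical peak performance in GOPS."""
    ops_per_second = ops_per_clock * clock_mhz * 1e6
    return ops_per_second / 1e9


if __name__ == "__main__":
    for name, ops in DPU_OPS_PER_CLOCK.items():
        print(
            f"{name}: {peak_gops(ops, 300):.1f} GOPS at 300 MHz, "
            f"{peak_gops(ops, 200):.1f} GOPS at 200 MHz"
        )
```

These are theoretical upper bounds; throughput measured on a real network will be lower.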
  • 8. Efficient Utilization of the Xilinx DPU
      • Performance and power monitoring for single- and multi-threaded implementations
      • Detailed power measurements on RECS platforms
      • Power-aware profiling and optimization
  • 9. Example DSE Using Different DPU Configurations
  • 10. Benchmark Performance of DL Accelerators
      • [Figure: YoloV4 benchmark. Performance [GOPS] on a logarithmic axis from 10 to 10000 versus Power [Watt] from 2 to 128; data series INT8, FP16, FP32; ZU4 and ZU15 labelled]
  • 11. Dynamic Reconfiguration of DL Accelerators
      • Change the characteristics of the DL accelerator at run-time (e.g., change performance-power trade-off or performance-accuracy trade-off)
      • [Block diagram: FPGA base architecture for the µ.RECS extended with DFX. AXI CB and AXI-Lite CB connect through a Disconnect block to one PR-Region that can hold Accelerator A or Accelerator B]
  • 12. Dynamic Reconfiguration of DL Accelerators
      • Change the characteristics of the DL accelerator at run-time (e.g., change performance-power trade-off or performance-accuracy trade-off)
      • [Block diagram: FPGA base architecture for the µ.RECS extended with DFX. AXI CB and AXI-Lite CB connect through Disconnect blocks to two PR-Regions, each holding an accelerator]
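
The diagrams on slides 11 and 12 place a Disconnect block between the bus and each PR-Region. One plausible reading is the usual order of operations for partial reconfiguration: isolate the region, load the new accelerator, reconnect. The sketch below is a toy software model of that ordering only. The class `PRRegion` and the function `swap_accelerator` are hypothetical names and do not correspond to any VEDLIoT or Xilinx API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PRRegion:
    """Toy model of one partially reconfigurable region (hypothetical)."""

    name: str
    accelerator: Optional[str] = None
    connected: bool = False
    log: List[str] = field(default_factory=list)

    def disconnect(self) -> None:
        # Isolate the region so no bus transaction reaches it during the swap.
        self.connected = False
        self.log.append("disconnect")

    def load(self, accelerator: str) -> None:
        # Stand-in for loading a partial bitstream into the region.
        if self.connected:
            raise RuntimeError("region must be disconnected before loading")
        self.accelerator = accelerator
        self.log.append(f"load:{accelerator}")

    def connect(self) -> None:
        if self.accelerator is None:
            raise RuntimeError("no accelerator loaded")
        self.connected = True
        self.log.append("connect")


def swap_accelerator(region: PRRegion, accelerator: str) -> None:
    """Exchange the accelerator in a region: disconnect, load, reconnect."""
    region.disconnect()
    region.load(accelerator)
    region.connect()


if __name__ == "__main__":
    region = PRRegion("pr0")
    swap_accelerator(region, "Accelerator A")
    swap_accelerator(region, "Accelerator B")
    print(region.accelerator, region.connected, region.log)
```

The model enforces only the ordering constraint; a real system would also have to drain outstanding transactions and reset the newly loaded logic.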
  • 13. Own Accelerator Developments: Co-design of Accelerators
      • Generation of dataflow architectures based on C++ templates
      • Optimized for high-level synthesis
      • Support for inference and training
      • Targeting CNNs, deep reinforcement learning, and federated learning
      • Definition of parameterizable layer templates in C++ (e.g., convolution, fully connected, pooling, and activation functions)
      • Parameterizable, e.g., quantization (from low bit-width INT to float)
      • All layers integrate three functions (if required): inference/forward propagation, backpropagation, and update function
      • Inference utilizes only the forward path
      • Learning (DeepRL) utilizes the full functionality of the layer templates
  • 14. Thank you for your attention.
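
Slide 13 states that every layer integrates three functions: forward propagation, backpropagation, and update. The accelerators themselves are built from C++ templates for high-level synthesis, and that code is not shown in the slides. The Python sketch below only illustrates the three-function structure for a fully connected layer; the class name, method names, and the plain gradient-descent update are assumptions made for this example.

```python
from typing import List


class FullyConnected:
    """Dense layer y = W x + b with forward, backward, and update steps."""

    def __init__(self, weights: List[List[float]], bias: List[float]) -> None:
        self.w = [row[:] for row in weights]
        self.b = bias[:]
        self.x: List[float] = []
        self.grad_w = [[0.0] * len(row) for row in weights]
        self.grad_b = [0.0] * len(bias)

    def forward(self, x: List[float]) -> List[float]:
        # Inference path: the only function needed when the layer is not trained.
        self.x = x[:]
        return [
            sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(self.w, self.b)
        ]

    def backward(self, grad_out: List[float]) -> List[float]:
        # dL/dW[i][j] = grad_out[i] * x[j], dL/db[i] = grad_out[i]
        self.grad_w = [[g * xj for xj in self.x] for g in grad_out]
        self.grad_b = grad_out[:]
        # dL/dx[j] = sum_i W[i][j] * grad_out[i], passed to the previous layer
        return [
            sum(self.w[i][j] * grad_out[i] for i in range(len(self.w)))
            for j in range(len(self.x))
        ]

    def update(self, learning_rate: float) -> None:
        # Plain gradient-descent step (an assumption; the slides name no optimizer).
        self.w = [
            [w - learning_rate * g for w, g in zip(row_w, row_g)]
            for row_w, row_g in zip(self.w, self.grad_w)
        ]
        self.b = [b - learning_rate * g for b, g in zip(self.b, self.grad_b)]


if __name__ == "__main__":
    layer = FullyConnected([[1.0, 2.0], [0.0, -1.0]], [0.5, 0.0])
    print(layer.forward([1.0, 1.0]))
    print(layer.backward([1.0, 0.0]))
    layer.update(0.5)
    print(layer.w, layer.b)
```

Calling only `forward` corresponds to the inference case on the slide; training calls all three in sequence.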
  • 15. Configurable SoC for ML Workflows
      • Configurable soft SoC generator provides a platform for low-power AI accelerator exploration
      • The generator can produce a system with the set of peripherals required for a specific task
      • Scalable from MCU-class to Linux-capable platforms
      • Support for a generic, vendor-independent accelerator integration interface makes it well suited as an AI research platform
      • Portable across different hardware, based on open-source tooling
      • CFUs (Custom Function Units): custom accelerators designed for specific workflows, tightly coupled with the CPU
      • Accessed via custom RISC-V instructions
      • Can be implemented in high-level hardware description languages, e.g., Python-based Amaranth
  • 16. Configurable SoC for ML Workflows
      • CFUs offer great flexibility
      • Test various dedicated accelerators for specific workflows
      • Renode simulation framework extended with CFU support
      • Co-simulating functional models of the SoC with verilated, cycle-accurate CFUs
      • Invaluable tool for development
      • Massive continuous integration testing: different CFU implementations, different inputs
      • Allows for automatic result comparison and analysis
      • Everything open-sourced
  • 17. Soft SoC Platform
      • Generation of soft SoC platforms
      • Utilize RISC-V soft cores
      • Generic interface to AI accelerators
      • Modelled in an open-source emulation environment
      • Utilize LiteX SoC generator
      • Run-time reconfiguration: accelerators, processor cores
      • [Diagram labels: FPGA Base Architecture, AI-Accelerator, Run-Time Reconfiguration, Interface]