SlideShare a Scribd company logo
1 of 15
Download to read offline
All Programmable SoCs?
Platforms to enable the future of Embedded Machine Learning
Linaro Connect – Hong Kong March 2018
Tomas Evensen
CTO Embedded Software, Xilinx
© Copyright 2018 Xilinx
.
Processor frequency scaling ended in 2007
Multicore architecture scaling has flattened
CPU Architectures not Scaling with Workloads
Workloads require higher
performance, lower latency
– Cloud: video, big data, AI…
– Edge: auto, surveillance, AI…Andrew Danowitz, Kyle Kelley, James Mao, John P. Stevenson, Mark Horowitz
Communications of the ACM, Vol. 55 No. 4
Page 2
© Copyright 2018 Xilinx
.
Application
Processor
64-bit
Dual/Quad-Core
Zynq UltraScale+ MPSoC
Real-Time
Processors
32-bit Dual-Core
Platform & Power
Management
Granular Power Control
Functional Safety
Configuration & Security
Unit
Anti-Tamper & Trust
Industry Standards
Fabric Acceleration
Customizable Engines
High Speed Connectivity
Video Codec
8K4K (15fps)
4K2K (60fps)
High Speed
Peripherals
Key Interfaces
Graphics Processor
ARM Mali-400MP2
Memory
Subsystem
High Bandwidth
Low Latency
Page 3
© Copyright 2018 Xilinx
.
Page 4
FPGA: The “Chameleon” Chip
What is FPGA/Fabric/Programmable Logic:
Is it glue logic?
Is it a powerful parallel DSP engine?
Is it an RTL simulator?
Yes!!! And more…
FPGA includes:
Programmable logic (LUTs)
Hardened DSP blocks
Hardened memory (BRAM, URAM)
FPGAs are great to implement:
Parallel compute (e.g. MAC)
– With variable precision
Parallel, flexible dataflows
– Build your own buses
Flexible, multiport memory hierarchies
© Copyright 2018 Xilinx
.
Breakout in Programming Model
Traditional HW design
HDMI
video
proc.
video
enc.
Development
Productivity
15x productivity with HLS, IPI
Ethernet
IP
Video
decode
C++
Video
process
C++
Video
encode
C++
HDMI
IP
SW Programmability
Page 5
© Copyright 2018 Xilinx
.
SDSoC Example: Matrix Multiply + Add
main(){
malloc(A,B,C);
mmult(A,B,D);
madd(C,D,E);
printf(E);
}
madd(inA,inB,out){
}
HLS C/C++
mmult(inA,inB,out){
}
HLS C/C++
A,B datamovers
AXI Bus
Platform
Application
Driver
mmult madd
Generated
D
A B C
E
PS
PL
Page 6
© Copyright 2018 Xilinx
.
Supporting the Whole Stack
Accelerated
Open Frameworks
Accelerated
Libraries
Development
Environment
Boards w/
HLx-based platform
Machine
learning
Database
Analytics
Platform
Development Stack
VCU1525
Acceleration card
Page 7
© Copyright 2018 Xilinx
.
Deep learning-based multi-object
recognition for smart-city
Live video object detection
SSD @ 480x360 on Zynq MPSoC
End-customer obtained:
– 5x NVIDIA TX2 performance
– Better accuracy
Example of Embedded Vision Application at the Edge
Object detection
5X Perf/watt for SSD vs GPU
Page 8
© Copyright 2018 Xilinx
.
Xilinx and Avnet is partnering and is announcing the Ultra96 board
– Equipped with Zync Ultrascale+ MPSoC (ZU3EG)
– https://www.96boards.org/product/ultra96/
Ultra96 board makes ARM® based Xilinx SoCs available to developers at a low price point
– Built to 96Boards standard, suitable for software prototyping with standardized expansion kits
– Targeting a range of applications including Machine Learning, IoT, and compute
Leverages an open-source software development platform
– 96boards community: 12K actively contributing software engineers
– Supports both self-hosted and cross development
• Self-hosted: Compile on the board itself
• Cross: Develop on your workstation/laptop
• C to fabric/FPGA: SDSoC tools available later this year
• Unboxing to coding in less than 2 minutes
Page 9
$249 Ultra96 Board Targeted for Software Designers
Available from Avnet in April
© Copyright 2018 Xilinx
.
Submit your most creative, most out-of-the box AI or ML application
at the Xilinx or Avnet table during Demo Friday (12:00 – 14:00).
The best 30 get a FREE Ultra96 board plus software to
help you realize your vision.
The 1st twenty to submit a working design by MAY 25th, 2018 get a
$25 Amazon Gift Card.
ONE Winner announced through Xilinx social media channels. If
it’s you, you’re invited to present your design to your peers in
industry at Xilinx Developer Forum 2018.
Page 10
The Future is Ultra96 Xilinx Contest
© Copyright 2018 Xilinx
.
Page 11
Demo
© Copyright 2018 Xilinx
.
German Road Sign Database
– 50,000+ 32x32 bit images for training
– 44 classes (43 road signs, 1 background)
– Training via Amazon Web Services
• AWS: p2.xlarge Instance – 8 hours à $7.78 6,5e
Binary Neural Network Characteristics
– 6 convolutional layers
– 2 max pool layers
– 3 fully connected layers
Page 12
Neural Network Example
© Copyright 2018 Xilinx
.
Page 13
Neural Network Performance Results
Up to 8,600 times faster when accelerated with programmable logic
Performance Metric Software Only Programmable Logic
Accelerated
Tiles per second 2.2 19,000
Scene rate (fps) 0.011 (92 sec per frame) 94
Overall Acceleration - 8,600X
© Copyright 2018 Xilinx
.
Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors
– Wednesday 16:00-16:55
Accelerating Neural Networks for Vision Systems via FPGAs
– Thursday 11:00-11:25
Page 14
Learn More About FPGA’s and Software Acceleration
© Copyright 2018 Xilinx
.
Page 15
Questions?/ Thank You

More Related Content

What's hot

How Cisco Provides World-Class Technology Conference Experiences Using Automa...
How Cisco Provides World-Class Technology Conference Experiences Using Automa...How Cisco Provides World-Class Technology Conference Experiences Using Automa...
How Cisco Provides World-Class Technology Conference Experiences Using Automa...
InfluxData
 

What's hot (20)

Project Trillium: Arm Machine Learning Platform
Project Trillium: Arm Machine Learning PlatformProject Trillium: Arm Machine Learning Platform
Project Trillium: Arm Machine Learning Platform
 
Data on the move a RISC-V opportunity
Data on the move   a RISC-V opportunityData on the move   a RISC-V opportunity
Data on the move a RISC-V opportunity
 
IoTs Place in the World of 5G
IoTs Place in the World of 5GIoTs Place in the World of 5G
IoTs Place in the World of 5G
 
EPSRC CDT Conference
EPSRC CDT ConferenceEPSRC CDT Conference
EPSRC CDT Conference
 
Akraino and Edge Computing
Akraino and Edge ComputingAkraino and Edge Computing
Akraino and Edge Computing
 
Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
Seamless FPGA deployment over Spark in cloud computing: A use case on machine...
Seamless FPGA deployment over Spark in cloud computing: A use case on machine...Seamless FPGA deployment over Spark in cloud computing: A use case on machine...
Seamless FPGA deployment over Spark in cloud computing: A use case on machine...
 
Embedded Fest 2019. Dov Nimratz. Artificial Intelligence in Small Embedded Sy...
Embedded Fest 2019. Dov Nimratz. Artificial Intelligence in Small Embedded Sy...Embedded Fest 2019. Dov Nimratz. Artificial Intelligence in Small Embedded Sy...
Embedded Fest 2019. Dov Nimratz. Artificial Intelligence in Small Embedded Sy...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Open Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - OverviewOpen Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - Overview
 
Kubernetes Native Infrastructure and CoreOS Operator Framework for 5G Edge Cl...
Kubernetes Native Infrastructure and CoreOS Operator Framework for 5G Edge Cl...Kubernetes Native Infrastructure and CoreOS Operator Framework for 5G Edge Cl...
Kubernetes Native Infrastructure and CoreOS Operator Framework for 5G Edge Cl...
 
OpenStack for EDGE computing
OpenStack for EDGE computingOpenStack for EDGE computing
OpenStack for EDGE computing
 
Xilink
XilinkXilink
Xilink
 
How Cisco Provides World-Class Technology Conference Experiences Using Automa...
How Cisco Provides World-Class Technology Conference Experiences Using Automa...How Cisco Provides World-Class Technology Conference Experiences Using Automa...
How Cisco Provides World-Class Technology Conference Experiences Using Automa...
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
ADLINK And InfluxDB Deliver Operational Efficiency For Defense Industry With ...
ADLINK And InfluxDB Deliver Operational Efficiency For Defense Industry With ...ADLINK And InfluxDB Deliver Operational Efficiency For Defense Industry With ...
ADLINK And InfluxDB Deliver Operational Efficiency For Defense Industry With ...
 
ONAP and the K8s Ecosystem: A Converged Edge Application & Network Function P...
ONAP and the K8s Ecosystem: A Converged Edge Application & Network Function P...ONAP and the K8s Ecosystem: A Converged Edge Application & Network Function P...
ONAP and the K8s Ecosystem: A Converged Edge Application & Network Function P...
 
IoT Microservices at the Edge with Eclipse ioFog
IoT Microservices at the Edge with Eclipse ioFogIoT Microservices at the Edge with Eclipse ioFog
IoT Microservices at the Edge with Eclipse ioFog
 

Similar to HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to enable the future of Embedded Machine Learning

Similar to HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to enable the future of Embedded Machine Learning (20)

Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systems
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
 
Harnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligenceHarnessing the virtual realm for successful real world artificial intelligence
Harnessing the virtual realm for successful real world artificial intelligence
 
Xilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIXXilinx Data Center Strategy and CCIX
Xilinx Data Center Strategy and CCIX
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013IBM Special Announcement session Intel #IDF2013 September 10, 2013
IBM Special Announcement session Intel #IDF2013 September 10, 2013
 
HiPEAC 2019 Workshop - Vision Processing
HiPEAC 2019 Workshop - Vision ProcessingHiPEAC 2019 Workshop - Vision Processing
HiPEAC 2019 Workshop - Vision Processing
 
Optimize Content Delivery with Multi-Access Edge Computing
Optimize Content Delivery with Multi-Access Edge ComputingOptimize Content Delivery with Multi-Access Edge Computing
Optimize Content Delivery with Multi-Access Edge Computing
 
Tomorrow's Server Infrastructure Today
Tomorrow's Server Infrastructure TodayTomorrow's Server Infrastructure Today
Tomorrow's Server Infrastructure Today
 
Xilinx Inference solution for DL using OpenPOWER systems
Xilinx Inference solution for DL using OpenPOWER systemsXilinx Inference solution for DL using OpenPOWER systems
Xilinx Inference solution for DL using OpenPOWER systems
 
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta..."The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
 
XMOS Company Overview
XMOS Company OverviewXMOS Company Overview
XMOS Company Overview
 
Review of QNX
Review of QNXReview of QNX
Review of QNX
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to Cloud
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
High Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIOHigh Performance Object Storage in 30 Minutes with Supermicro and MinIO
High Performance Object Storage in 30 Minutes with Supermicro and MinIO
 
Vertex Perspectives | AI Optimized Chipsets | Part III
Vertex Perspectives | AI Optimized Chipsets | Part IIIVertex Perspectives | AI Optimized Chipsets | Part III
Vertex Perspectives | AI Optimized Chipsets | Part III
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning
 
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
 
Flexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoTFlexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoT
 

More from Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
Linaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
Linaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
Linaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
Linaro
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
Linaro
 
HKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready ProgramHKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready Program
Linaro
 

More from Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
 
HKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready ProgramHKG18-317 - Arm Server Ready Program
HKG18-317 - Arm Server Ready Program
 

Recently uploaded

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 

HKG18-300K2 - Keynote: Tomas Evensen - All Programmable SoCs? – Platforms to enable the future of Embedded Machine Learning

  • 1. All Programmable SoCs? Platforms to enable the future of Embedded Machine Learning Linaro Connect – Hong Kong March 2018 Tomas Evensen CTO Embedded Software, Xilinx
  • 2. © Copyright 2018 Xilinx . Processor frequency scaling ended in 2007 Multicore architecture scaling has flattened CPU Architectures not Scaling with Workloads Workloads require higher performance, lower latency – Cloud: video, big data, AI… – Edge: auto, surveillance, AI…Andrew Danowitz, Kyle Kelley, James Mao, John P. Stevenson, Mark Horowitz Communications of the ACM, Vol. 55 No. 4 Page 2
  • 3. © Copyright 2018 Xilinx . Application Processor 64-bit Dual/Quad-Core Zynq UltraScale+ MPSoC Real-Time Processors 32-bit Dual-Core Platform & Power Management Granular Power Control Functional Safety Configuration & Security Unit Anti-Tamper & Trust Industry Standards Fabric Acceleration Customizable Engines High Speed Connectivity Video Codec 8K4K (15fps) 4K2K (60fps) High Speed Peripherals Key Interfaces Graphics Processor ARM Mali-400MP2 Memory Subsystem High Bandwidth Low Latency Page 3
  • 4. © Copyright 2018 Xilinx . Page 4 FPGA: The “Chameleon” Chip What is FPGA/Fabric/Programmable Logic: Is it glue logic? Is it a powerful parallel DSP engine? Is it an RTL simulator? Yes!!! And more… FPGA includes: Programmable logic (LUTs) Hardened DSP blocks Hardened memory (BRAM, URAM) FPGAs are great to implement: Parallel compute (e.g. MAC) – With variable precision Parallel, flexible dataflows – Build your own buses Flexible, multiport memory hierarchies
  • 5. © Copyright 2018 Xilinx . Breakout in Programming Model Traditional HW design HDMI video proc. video enc. Development Productivity 15x productivity with HLS, IPI Ethernet IP Video decode C++ Video process C++ Video encode C++ HDMI IP SW Programmability Page 5
  • 6. © Copyright 2018 Xilinx . SDSoC Example: Matrix Multiply + Add main(){ malloc(A,B,C); mmult(A,B,D); madd(C,D,E); printf(E); } madd(inA,inB,out){ } HLS C/C++ mmult(inA,inB,out){ } HLS C/C++ A,B datamovers AXI Bus Platform Application Driver mmult madd Generated D A B C E PS PL Page 6
  • 7. © Copyright 2018 Xilinx . Supporting the Whole Stack Accelerated Open Frameworks Accelerated Libraries Development Environment Boards w/ HLx-based platform Machine learning Database Analytics Platform Development Stack VCU1525 Acceleration card Page 7
  • 8. © Copyright 2018 Xilinx . Deep learning-based multi-object recognition for smart-city Live video object detection SSD @ 480x360 on Zynq MPSoC End-customer obtained: – 5x NVIDIA TX2 performance – Better accuracy Example of Embedded Vision Application at the Edge Object detection 5X Perf/watt for SSD vs GPU Page 8
  • 9. © Copyright 2018 Xilinx . Xilinx and Avnet is partnering and is announcing the Ultra96 board – Equipped with Zync Ultrascale+ MPSoC (ZU3EG) – https://www.96boards.org/product/ultra96/ Ultra96 board makes ARM® based Xilinx SoCs available to developers at a low price point – Built to 96Boards standard, suitable for software prototyping with standardized expansion kits – Targeting a range of applications including Machine Learning, IoT, and compute Leverages an open-source software development platform – 96boards community: 12K actively contributing software engineers – Supports both self-hosted and cross development • Self-hosted: Compile on the board itself • Cross: Develop on your workstation/laptop • C to fabric/FPGA: SDSoC tools available later this year • Unboxing to coding in less than 2 minutes Page 9 $249 Ultra96 Board Targeted for Software Designers Available from Avnet in April
  • 10. © Copyright 2018 Xilinx . Submit your most creative, most out-of-the box AI or ML application at the Xilinx or Avnet table during Demo Friday (12:00 – 14:00). The best 30 get a FREE Ultra96 board plus software to help you realize your vision. The 1st twenty to submit a working design by MAY 25th, 2018 get a $25 Amazon Gift Card. ONE Winner announced through Xilinx social media channels. If it’s you, you’re invited to present your design to your peers in industry at Xilinx Developer Forum 2018. Page 10 The Future is Ultra96 Xilinx Contest
  • 11. © Copyright 2018 Xilinx . Page 11 Demo
  • 12. © Copyright 2018 Xilinx . German Road Sign Database – 50,000+ 32x32 bit images for training – 44 classes (43 road signs, 1 background) – Training via Amazon Web Services • AWS: p2.xlarge Instance – 8 hours à $7.78 6,5e Binary Neural Network Characteristics – 6 convolutional layers – 2 max pool layers – 3 fully connected layers Page 12 Neural Network Example
  • 13. © Copyright 2018 Xilinx . Page 13 Neural Network Performance Results Up to 8,600 times faster when accelerated with programmable logic Performance Metric Software Only Programmable Logic Accelerated Tiles per second 2.2 19,000 Scene rate (fps) 0.011 (92 sec per frame) 94 Overall Acceleration - 8,600X
  • 14. © Copyright 2018 Xilinx . Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors – Wednesday 16:00-16:55 Accelerating Neural Networks for Vision Systems via FPGAs – Thursday 11:00-11:25 Page 14 Learn More About FPGA’s and Software Acceleration
  • 15. © Copyright 2018 Xilinx . Page 15 Questions?/ Thank You