© 2019 Mentor Graphics, A Siemens Business
Using High-level Synthesis to Bridge
the Gap Between Deep Learning
Frameworks and Custom Hardware
Accelerators
Mike Fingeroff
High-level Synthesis Technologist
Agenda
Machine learning has massive design complexity requirements
Why Catapult High-level Synthesis (HLS) is crucial to getting designs to
market on time
Verification of the quantized algorithm
Customer successes and future direction of Machine Learning and HLS
Machine Learning
Hardware is Evolving
Rapidly
Machine Learning Algorithms Have Massive
Computational Complexity
Training
• Very large datasets & memory,
CPU/GPU farms, floating point
required
• Not real time, can take
days/weeks
This is where
Catapult HLS fits
Inferencing
• Uses weights from trained network
• Memory storage/bandwidth challenges
• Often real-time
• Can be reduced to fixed point, dramatically
reducing power
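A minimal sketch of what reducing inference to fixed point can look like, assuming a signed 8-bit format with 6 fractional bits. The format, the constant FRAC_BITS, and the helper names quantize, dequantize and dot_fixed are illustrative assumptions, not taken from the slides:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Assumed format for illustration: signed 8-bit, 6 fractional bits,
// so representable values span [-2.0, 1.984375] in steps of 1/64.
constexpr int FRAC_BITS = 6;

// Hypothetical helper: scale, round to nearest, then saturate to int8.
inline int8_t quantize(float w) {
    long scaled = std::lround(w * (1 << FRAC_BITS));
    scaled = std::max(-128L, std::min(127L, scaled));
    return static_cast<int8_t>(scaled);
}

// Hypothetical helper: convert a quantized value back to float.
inline float dequantize(int8_t q) {
    return static_cast<float>(q) / (1 << FRAC_BITS);
}

// Fixed-point dot product: products are accumulated in a wider integer,
// so only the final sum is rescaled (by 2^(2*FRAC_BITS)).
inline float dot_fixed(const int8_t* a, const int8_t* b, int n) {
    int32_t acc = 0;
    for (int i = 0; i < n; ++i)
        acc += static_cast<int32_t>(a[i]) * b[i];
    return static_cast<float>(acc) / (1 << (2 * FRAC_BITS));
}
```

The inner loop uses only 8-bit multiplies and an integer accumulator, which is the kind of datapath that is far cheaper in hardware than a floating-point unit.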
Numerous Possible Hardware/Memory NN Architectures for
Inference Engines
Machine learning
architectures are still
evolving
• How to know which one
is right for the
application?
• Not enough time to do
them all in RTL
On-chip memory, memory
bandwidth, power,
performance and area
are all important
Memory Architecture and Power Considerations
Keeping data local is key to
minimizing power consumption
• Very important for ASIC
Floating-point is costly
• Used in training of networks
• Not needed in network
inference engine
Processor ML architectures are
fixed bit-width
• Not power efficient
*MIT/NVIDIA 2017
Data in the Real World is Exploding
Data traffic is going to
increase exponentially
over the next decade
• Frame rates,
sensor/camera
resolution will keep
doubling every few
years
How can processing
technology keep up?
• General-purpose
solutions won't work:
too much power
Tractica 2018
EB = 10^18 bytes
Machine Learning Design Flow
Algorithm Engineers work here.
They don’t understand hardware
AI Development
Platforms
Pruning
Quantization
Compression
HW
Implementation
Weights
Retraining
Compilation
Hardware Engineers work here and
are already building NN HW using
Catapult HLS. They don’t
understand the NN platforms
Why Catapult HLS is
Crucial to Getting Designs
to Market on Time
Catapult HLS is the Best Solution for Rapid Algorithm to HW
void func (short a[N], …) {
  for (int i=0; i<N; i++) {
    if (cond)
      z+=a[i]*b[i];
    else
      …
  }
}
RTL
Enable late functional changes without impacting schedule
• Algorithms can be easily modified and regenerated
• New technology nodes are easy (or FPGA to ASIC)
Quickly evaluate power and performance of algorithms
• Rapidly explore multiple options for optimal power,
performance and area (PPA)
Accelerate design time with higher level of abstraction
• 1 Year reduced to a few months
• New features added in days not weeks
• 5X less code than RTL
Catapult Synthesizes C++ and SystemC to Optimal ASIC
or FPGA Hardware
void simple_function(<function interface variables>){
<function body>
}
class simpleClass{
…
public:
void simple_function(<function interface variables>){
<function body>
}
};
SC_MODULE(simpleClass){
<module ports>
SC_CTOR(simpleClass){
SC_THREAD(run)
}
void run(){
<function body>
}
};
ASIC FPGAs
Catapult synthesizes C++/SystemC to optimized Register Transfer Level (RTL)
High-level Synthesis Models Bit-accuracy in the C++
Source
Arbitrary precision Integer, fixed-point, and floating-point
• New bfloat16, ac_std_float<E,M>, ac_ieee_float
HLS uses exact bit-widths to meet specification and save
power/area
• Hardware bit-widths are not always a power of two (1, 8, 16, 32,
64 bits)
Rapid simulation of true hardware behavior
Bit-accurate
C++/SystemC
Verify
Refine/Explore
Precision
Model
using floating-point
Bit-accurate RTL
Catapult Ultra Verify
The Algorithmic C fixed-point
data types are declared as:
ac_fixed<W,I,S> x;
where W is the total width in bits, I is the number of integer bits, and S selects signed or unsigned
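To make the W and I parameters concrete, here is a small stand-in written in plain C++. It is not the Algorithmic C ac_fixed class; the Fixed template and its member names are hypothetical. It is signed only, limited to 31 bits, and it truncates and wraps on conversion, which are the default ac_fixed quantization and overflow modes:

```cpp
#include <cmath>
#include <cstdint>

// Illustrative stand-in only, NOT the Algorithmic C ac_fixed class.
// W = total width in bits, I = integer bits (sign bit included),
// leaving W - I fractional bits. Signed values only.
template <int W, int I>
struct Fixed {
    static_assert(W > 0 && W < 32 && I <= W, "sketch supports 1..31 bits");
    int32_t raw;  // represented value = raw * 2^-(W - I)

    // Assumes |v| is small enough that v * 2^(W - I) fits in 64 bits.
    static Fixed from_double(double v) {
        // Quantize by truncating toward negative infinity.
        double scaled = std::floor(std::ldexp(v, W - I));
        int64_t r = static_cast<int64_t>(scaled);
        // On overflow, wrap: keep the low W bits and sign-extend.
        const uint64_t mask = (uint64_t(1) << W) - 1;
        uint64_t u = static_cast<uint64_t>(r) & mask;
        if (u & (uint64_t(1) << (W - 1))) u |= ~mask;
        return Fixed{static_cast<int32_t>(static_cast<int64_t>(u))};
    }

    double to_double() const {
        return std::ldexp(static_cast<double>(raw), -(W - I));
    }
};
```

With this sketch, Fixed<8,4>::from_double(3.14159).to_double() gives 3.125, because only four fractional bits are kept, and a 17-bit type such as Fixed<17,1> covers [-1, 1) without paying for a 32-bit datapath.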
Constraint Driven Exploration of Parallelism/Timing
Exploration done using loop transformations
Loop unrolling drives parallelism
Timing closure is automatic
Architecture
Constraints
+x
+
x
x
x
x
+
+
Catapult Architectural Constraints View
Loops in design
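As an illustration of how loop unrolling exposes parallelism, the sketch below writes the same multiply-accumulate loop twice: rolled, and manually unrolled by four. In a constraint-driven flow the unrolling would be requested through tool constraints rather than by rewriting the source; the manual version only shows what the transformation means. N and the function names are assumptions:

```cpp
#include <cstdint>

constexpr int N = 16;  // assumed array length for illustration

// Rolled loop: one multiply and one add per iteration, so a single
// multiplier and adder can be reused over N iterations.
int32_t mac_rolled(const int16_t a[N], const int16_t b[N]) {
    int32_t acc = 0;
    for (int i = 0; i < N; ++i)
        acc += a[i] * b[i];
    return acc;
}

// Unrolled by 4: each iteration has four independent multiplies that
// hardware can perform in parallel, followed by a small adder tree.
int32_t mac_unrolled4(const int16_t a[N], const int16_t b[N]) {
    static_assert(N % 4 == 0, "sketch assumes N is a multiple of 4");
    int32_t acc = 0;
    for (int i = 0; i < N; i += 4) {
        int32_t p0 = a[i] * b[i];
        int32_t p1 = a[i + 1] * b[i + 1];
        int32_t p2 = a[i + 2] * b[i + 2];
        int32_t p3 = a[i + 3] * b[i + 3];
        acc += (p0 + p1) + (p2 + p3);
    }
    return acc;
}
```

Both functions return the same result; the unrolled form trades four multipliers' worth of area for roughly a quarter of the loop iterations.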
Simplifies designing memory architecture
C++ arrays automatically mapped to ASIC or FPGA memories/registers
User control over memory mapping, banking, etc.
Arrays on the design interface can be synthesized as memory interfaces or
AXI4 master/slave interfaces
Constraint-driven Creation of Memories/Memory Architecture
void simple_function(… ,int data[1024]){
int mem[1024];
<function body>
}
[Figure: a 43,264-word array (width = 17) split into 64 RAMs, Ram1 through Ram64, of 676 words each (width = 17)]
Catapult Constraint GUI
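The numbers in the figure are consistent with one 43,264-word array being split evenly across 64 banks of 676 words. The slide does not say how addresses are distributed, so the sketch below shows two common choices, block and interleaved partitioning, as assumptions; BankAddr, block_map and interleave_map are hypothetical names:

```cpp
#include <cstddef>

constexpr std::size_t TOTAL_WORDS = 43264;  // from the slide
constexpr std::size_t NUM_BANKS = 64;       // Ram1 .. Ram64
constexpr std::size_t WORDS_PER_BANK = TOTAL_WORDS / NUM_BANKS;
static_assert(WORDS_PER_BANK == 676, "matches the 676-word RAMs on the slide");
static_assert(WORDS_PER_BANK * NUM_BANKS == TOTAL_WORDS, "even split");

struct BankAddr {
    std::size_t bank;    // which RAM holds the word
    std::size_t offset;  // address inside that RAM
};

// Block partitioning (assumed): consecutive addresses stay in one bank.
constexpr BankAddr block_map(std::size_t addr) {
    return {addr / WORDS_PER_BANK, addr % WORDS_PER_BANK};
}

// Interleaved partitioning (assumed): consecutive addresses rotate
// through the banks, so 64 neighbouring words can be read in one cycle.
constexpr BankAddr interleave_map(std::size_t addr) {
    return {addr % NUM_BANKS, addr / NUM_BANKS};
}
```

Which mapping is better depends on the access pattern of the loops that read the array, which is why it helps to control it by constraint rather than hard-code it.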
Verification of the Quantized
Algorithm in Hardware
Automatic Verification of C++ vs Hardware
Implementation
C++ algorithm is fully verified before synthesis
No RTL debug required
Bit-accurate
C++ Reference
Model
Stimulus ==
Synthesizable
Model
Is the same?
C++ Testbench
Catapult C++
Synthesis
RTL
Automated
RTL Sanity
Check
Algorithm Input
Algorithm Output
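The comparison in the diagram can be sketched as a small C++ testbench: the same stimulus drives a reference model and a hardware-style model, and the outputs must match exactly. The two averaging functions and count_mismatches are hypothetical examples, not code from the presentation:

```cpp
#include <cstdint>
#include <random>

// Reference model: the algorithm written the obvious way.
uint8_t ref_avg4(const uint8_t s[4]) {
    return static_cast<uint8_t>((s[0] + s[1] + s[2] + s[3]) / 4);
}

// Hardware-style model: explicit adder tree and a shift instead of a
// divide, with intermediate widths chosen to hold the sums.
uint8_t hw_avg4(const uint8_t s[4]) {
    uint16_t lo = static_cast<uint16_t>(s[0] + s[1]);
    uint16_t hi = static_cast<uint16_t>(s[2] + s[3]);
    return static_cast<uint8_t>((lo + hi) >> 2);
}

// Testbench: apply identical random stimulus to both models and count
// the vectors where the outputs differ. Zero means bit-exact agreement.
int count_mismatches(unsigned seed, int num_vectors) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    int mismatches = 0;
    for (int v = 0; v < num_vectors; ++v) {
        uint8_t s[4];
        for (int j = 0; j < 4; ++j)
            s[j] = static_cast<uint8_t>(dist(rng));
        if (ref_avg4(s) != hw_avg4(s))
            ++mismatches;
    }
    return mismatches;
}
```

Because both models are ordinary C++, a regression of thousands of vectors runs in a fraction of a second, and the same stimulus can later be replayed against the generated RTL.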
Swap any layer or the entire design
Easily Test HW C++ Models Directly in TensorFlow
catapult
conv2d
Sliding-
Window
Convolution/
Max Pooling
FIFO
Sliding-
Window
Convolution/
Max Pooling
…. FIFO
In-place
Convolution/
Max Pooling
Off-chip DRAM
AXI4 stream
Weights and results
TensorFlow Python File
TensorFlow
Operator
wrapper call
TensorFlow C++ API
Operator Wrapper
HLS Model in C++
TensorFlow C++ API Wrapper
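The sliding-window blocks in the diagram can be illustrated with a one-dimensional streaming convolution: a small shift register holds the most recent samples, and one output is produced per input once the window is full. This is a simplified sketch under assumed names (TAPS, sliding_conv); the 2-D convolution, max pooling, FIFOs and the TensorFlow operator wrapper from the slide are not modelled here:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int TAPS = 3;  // assumed window size for illustration

// Streaming sliding-window convolution. Each input sample is shifted
// into the window once and reused for TAPS outputs, so the input is
// read from memory only one time.
std::vector<int32_t> sliding_conv(const std::vector<int16_t>& in,
                                  const int16_t (&k)[TAPS]) {
    int16_t window[TAPS] = {0};  // window[0] is the newest sample
    std::vector<int32_t> out;
    for (std::size_t n = 0; n < in.size(); ++n) {
        // Shift the window by one position and insert the new sample.
        for (int t = TAPS - 1; t > 0; --t)
            window[t] = window[t - 1];
        window[0] = in[n];
        // Emit an output only after the window has filled.
        if (n + 1 >= static_cast<std::size_t>(TAPS)) {
            int32_t acc = 0;
            for (int t = 0; t < TAPS; ++t)
                acc += window[t] * k[t];
            out.push_back(acc);
        }
    }
    return out;
}
```

For the input {1, 2, 3, 4} and kernel {1, 0, -1}, the function returns {2, 2}. A model of this shape is plain C++, so it can be called from a framework operator wrapper and compared layer by layer against the framework's own result.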
Customer Successes and
Future Direction of Machine
Learning and HLS
Chips&Media Success: Deep Learning Object Detection IP
Successfully delivered inference-targeted
deep learning IP with move to HLS
• RTL designers now plan to use HLS on all
future new computer vision/deep learning IP
• HLS is key to finding power optimized specific
DNN
Cut the block/IP design and verification
time in half
• New DNN architecture
• Delivered critical FPGA customer demonstrator
early
HLS helped find optimal power/performance
architecture that RTL “would not have had time”
NVIDIA Research with DARPA - New
methodology for 10x faster chip design
HLS to target 80% of future NVIDIA chips
• Open-Source Matchlib HLS IP
2 Tapeouts - 20M+ gate machine learning
accelerator SoC
Foundation for NVDLA HW
• NVIDIA Deep Learning Accelerators
NVIDIA Research New Methodology with Catapult
Machine Learning Accelerator SoC using an Object-Oriented HLS flow
Vision: Enable Fast Path to Custom AI/Neural
Network Accelerators with Catapult HLS
• Build low-power HW from
trained network
• Quickly produce deployable
proof-of-concept
• Optimize performance,
power and area when final
requirements are set
• Make FPGA design flow a
viable alternative for neural
networking vs GPU
AI Development
Platforms
HLS Model in C++
HLS IP
Catapult HLS
Conclusion
Machine learning hardware implementations are massively complex
• Implementing real-time HW solutions on time is very challenging
• General purpose solutions will not be power efficient
Catapult High-level Synthesis enables designers to rapidly deliver custom
hardware solutions for machine learning algorithms
• Hardware is optimized for the ML network/algorithm
• Most efficient power
Verification in C++ is the most flexible solution
• Easily verify the hardware model in the TensorFlow ML framework
Backup Material
Catapult HLS Resources
Catapult Customer White Papers
Chips&Media
Design and Verification of Deep Learning
Object Detection IP
NVIDIA
Digital VLSI Flow for High-Productivity SoC
Design
Hardware Accelerator for Mobile Computer
Vision Applications
Design and Verification of a Machine
Learning Accelerator SoC Using an Object-
Oriented HLS-Based Design Flow
SeeCubic
CATAPULT HLS Enables ULTRA-D 3D without
Glasses
ST Imaging
STMicroelectronics Quickly Brings
Automotive Image Signal Processing to
Market with HL
Google
Google White Paper
Google Presentation
Chips&Media Success for Deep Learning Object
Detection IP
Successfully delivered
inference-targeted
Deep Learning IP with move to HLS
• RTL designers now plan to use HLS
on all future new
computer vision/deep learning IP
• HLS is key to finding power optimized specific
DNN
Cut the block/IP design and verification
time in half
• New DNN architecture
• Delivered critical FPGA customer demonstrator early
HLS helped find optimal power/performance
architecture that RTL “would not have had time”
New detailed white paper: Design and Verification of Deep Learning
Object Detection IP
NVIDIA Research with DARPA - New methodology for 10x faster chip design
• HLS to target 80% of future NVIDIA chips
2 Tapeouts - 20M+ gate Machine Learning
accelerator SoC
Used for SoC performance verification
• 30X RTL, <2.6% error in cycle count
Foundation for NVDLA HW
• NVIDIA Deep Learning Accelerators
2 DAC papers (2016, 2018) available now
• Digital VLSI Flow for High-Productivity SoC Design
• Hardware Accelerator for Mobile Computer Vision Applications
• Design and Verification of a Machine Learning Accelerator SoC Using an Object-Oriented HLS-Based Design
Flow
NVIDIA Research New Methodology with Catapult
Machine Learning Accelerator SoC using an Object-Oriented HLS flow
NVIDIA Achieves Cost Reduction of ~80%
for Functional Verification with Catapult
Used in production-level, automotive-targeted SoCs
C++ functional verification uses ~500x fewer resources than RTL
Fast verification makes rapid product changes possible
• VP9/HEVC code changed from 8-bit to 10-bit color depth in 2 weeks
• Change from 20nm/500MHz to 28nm/800MHz in 3 days with HLS
Traditional RTL
Functional
Regression
3 months
1000 CPUs
Resources
Time
HLS C++
Functional
Regression
2 weeks
14 CPUs
Resources
Time
NVIDIA Xavier 12nFF SoC
Most Complex SoC Ever Made
9 Billion Transistors
~8,000 man years
NVIDIA Case Study available on
mentor.com
Video
Processor
DLA
FotoNation
Next-Gen Mobile Face Recognition With Catapult
DAC Presentation
• “A Designer’s Life with HLS - Faster Computer Vision/Neural
Networks”
“3 weeks from Caffe to FPGA”
• Initial FPGA from unique C algorithm - 10fps
• HLS for desired µArchitecture delivered 30fps FPGA
at 100MHz
Faster, easier reuse, testing and customization
• “4x faster than hand coding”
• “Verification is Easier - Bit exact between
HW and C is native”
• Instant retargeting to optimal ASIC RTL
3+ B DEVICES
High Performance,
Low-Power
Computational Imaging
SeeCubic/StreamTV Networks uses Catapult HLS to
Deliver Realistic 3D Experience without Glasses
New Ultra-D branded technology and algorithms
- Far more realistic 3D display
Target Automotive, Medical and Consumer
“Catapult HLS came to the rescue”
• First, must prove the image quality
and algorithms and demonstrate on FPGA
• Enables working with partners to embed
in ASIC/SoC
• Only Catapult HLS methodology delivers needed
technology independence
Presented at DAC 2017 and White paper
CATAPULT HLS Enables ULTRA-D 3D without Glasses
To date created 50+ Image Processing IPs using HLS Imaging Template
Why they use HLS and Catapult (their words)
• Increase IP value
• Improve IP performance versus power & area
• Reduce project cost
• Reduce IP development from 24 weeks to 4 weeks
Experience with HLS
• Less code to write and debug
• Fast integration of new features
• Algorithm and architecture exploration possible
• Fast Verification using C++
On-Demand Webinar and White Paper
STMicroelectronics Quickly Brings Automotive Image Signal Processing to Market with H
ST Imaging HLS Success for ISP (Automotive)
Google Continues Video CODEC Success with Catapult HLS
AV1 improving compression by 40-50% over VP9/HEVC
Goal: High-bandwidth, free-of-charge CODEC released every 3-4 years
(rather than every 10 years, as with HEVC)
Catapult HLS on VP9 CODEC
• Time to Verified RTL: 2x faster
• Simulation Speed: 500x faster
• >99% bugs caught in C simulation
Catapult HLS on AV1 CODEC
• Productivity – 90% less code, fewer bugs
• Leverage the whole team – Algorithm, architect, HW, DV
• Flexibility – SW-like process, late-stage algorithm change easy
• Empowering HW engineers – work on interesting/important problems
• Rapid HW prototyping – rapidly evaluate new ideas, algorithms
Google Presentation
Google White Paper
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

"Using High-level Synthesis to Bridge the Gap Between Deep Learning Frameworks and Custom Hardware Accelerators," a Presentation from Mentor

  • 1. Using High-level Synthesis to Bridge the Gap Between Deep Learning Frameworks and Custom Hardware Accelerators. Mike Fingeroff, High-level Synthesis Technologist. © 2019 Mentor Graphics, A Siemens Business
  • 2. Agenda: (1) machine learning has massive design complexity requirements; (2) why Catapult High-level Synthesis (HLS) is crucial to getting designs to market on time; (3) verification of the quantized algorithm; (4) customer successes and future direction of machine learning and HLS.
  • 3. Machine Learning Hardware is Evolving Rapidly
  • 4. Machine Learning Algorithms Have Massive Computational Complexity. Training: very large datasets and memory, CPU/GPU farms, floating point required; not real time, can take days or weeks. Inferencing (this is where Catapult HLS fits): uses weights from the trained network; memory storage/bandwidth challenges; often real-time; can be reduced to fixed point, dramatically reducing power.
  • 5. Numerous Possible Hardware/Memory NN Architectures for Inference Engines. Machine learning architectures are still evolving: how do you know which one is right for the application? There is not enough time to do them all in RTL. On-chip memory, memory bandwidth, power, performance and area are all important.
  • 6. Memory Architecture and Power Considerations. Keeping data local is key to minimizing power consumption (very important for ASIC). Floating-point is costly: it is used in training of networks but not needed in the network inference engine. Processor ML architectures are fixed bit-width and not power efficient. (Figure source: MIT/NVIDIA 2017.)
  • 7. Data in the Real World is Exploding. Data traffic is going to increase exponentially over the next decade: frame rates and sensor/camera resolution will keep doubling every few years. How can processing technology keep up? General-purpose solutions won't work; they use too much power. (Chart source: Tractica 2018; EB = 10^18 bytes.)
  • 8. Machine Learning Design Flow. [Flow diagram labels: AI development platforms; pruning, quantization, compression; HW implementation; weights; retraining; compilation.] Diagram callouts: algorithm engineers work on one side of the flow and don't understand hardware; hardware engineers work on the other side, are already building NN HW using Catapult HLS, and don't understand the NN platforms.
  • 9. Why Catapult HLS is Crucial to Getting Designs to Market on Time
  • 10. Catapult HLS is the Best Solution for Rapid Algorithm to HW. [Figure: a truncated C++ fragment, void func (short a[N], … with a loop for (int i=0; i<N; i++) that conditionally accumulates z+=a[i]*b[i], shown being synthesized to RTL.] Enable late functional changes without impacting schedule: algorithms can be easily modified and regenerated; new technology nodes (or a move from FPGA to ASIC) are easy. Quickly evaluate power and performance of algorithms: rapidly explore multiple options for optimal power, performance and area (PPA). Accelerate design time with a higher level of abstraction: 1 year reduced to a few months; new features added in days, not weeks; 5X less code than RTL.
  • 11. Catapult Synthesizes C++ and SystemC to Optimal ASIC or FPGA Hardware. Three source styles are shown. A C++ function: void simple_function(<function interface variables>){ <function body> }. A C++ class: class simpleClass{ … public: void simple_function(<function interface variables>){ <function body> } };. A SystemC module: SC_MODULE(simpleClass){ <module ports> SC_CTOR(simpleClass){ SC_THREAD(run); } void run(){ <function body> } };. Catapult synthesizes C++/SystemC to optimized Register Transfer Level (RTL) for ASICs and FPGAs.
  • 12. High-level Synthesis Models Bit-accuracy in the C++ Source. Arbitrary-precision integer, fixed-point, and floating-point types, including the new bfloat16, ac_std_float<E,M> and ac_ieee_float. HLS uses exact bit-widths to meet the specification and save power/area; hardware bit-widths are not always the standard sizes (1, 8, 16, 32, 64 bits). Rapid simulation of true hardware behavior. [Flow diagram labels: bit-accurate C++/SystemC; verify; refine/explore precision; model using floating-point; bit-accurate RTL; Catapult Ultra; verify.] The Algorithmic C fixed-point data types are declared as ac_fixed<W,I,S> x; where W is the width and I is the number of integer bits.
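The effect of choosing W and I can be tried out without the Algorithmic C headers. The following is a minimal sketch in plain C++, not the ac_fixed implementation: FixedFormat, quantize and to_double are hypothetical names, the format is assumed signed with W of at most 32 bits, and round-to-nearest with saturation is an assumed policy (ac_fixed itself defaults to truncation and wrap-around unless other modes are requested).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of the Algorithmic C library.
// Describes a signed fixed-point format: W total bits, I integer bits
// (sign bit included), so W - I fractional bits. Assumes W <= 32.
struct FixedFormat {
    int W;
    int I;
};

// Convert a real value to the nearest representable integer code,
// saturating at the ends of the range (assumed policy, see lead-in).
inline std::int64_t quantize(double x, FixedFormat f) {
    const int frac_bits = f.W - f.I;
    const std::int64_t max_code = (std::int64_t{1} << (f.W - 1)) - 1;
    const std::int64_t min_code = -(std::int64_t{1} << (f.W - 1));
    const double r = std::round(std::ldexp(x, frac_bits));  // x * 2^frac_bits
    if (r > static_cast<double>(max_code)) return max_code;
    if (r < static_cast<double>(min_code)) return min_code;
    return static_cast<std::int64_t>(r);
}

// Convert an integer code back to the real value it represents.
inline double to_double(std::int64_t code, FixedFormat f) {
    return std::ldexp(static_cast<double>(code), -(f.W - f.I));
}
```

Sweeping W and I over a set of trained weights and comparing to_double(quantize(w, f), f) against w gives a quick view of the precision lost at each candidate width.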
  • 13. Constraint Driven Exploration of Parallelism/Timing. Exploration is done using loop transformations. Loop unrolling drives parallelism. Timing closure is automatic. [Figure: Catapult Architectural Constraints view listing the loops in the design, next to multiplier/adder datapath diagrams for different architecture constraints.]
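What unrolling does to a multiply-accumulate loop can be sketched by hand in plain C++. In an HLS flow the transformation would come from a tool constraint rather than a source rewrite; the function names, the vector length kLen and the unroll factor of 4 below are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLen = 16;  // assumed vector length, multiple of 4

// Rolled form: one multiply-accumulate per iteration, so one multiplier
// can be reused across kLen steps.
inline std::int64_t mac_rolled(const std::int16_t a[kLen], const std::int16_t b[kLen]) {
    std::int64_t acc = 0;
    for (std::size_t i = 0; i < kLen; ++i) {
        acc += static_cast<std::int64_t>(a[i]) * b[i];
    }
    return acc;
}

// Unrolled by 4: four independent partial sums per iteration, which a
// scheduler could map onto four multipliers working in parallel.
inline std::int64_t mac_unrolled4(const std::int16_t a[kLen], const std::int16_t b[kLen]) {
    std::int64_t p0 = 0, p1 = 0, p2 = 0, p3 = 0;
    for (std::size_t i = 0; i < kLen; i += 4) {
        p0 += static_cast<std::int64_t>(a[i]) * b[i];
        p1 += static_cast<std::int64_t>(a[i + 1]) * b[i + 1];
        p2 += static_cast<std::int64_t>(a[i + 2]) * b[i + 2];
        p3 += static_cast<std::int64_t>(a[i + 3]) * b[i + 3];
    }
    return (p0 + p1) + (p2 + p3);  // small adder tree combines the partial sums
}
```

Both functions return the same value; the difference is the amount of work exposed per iteration, which is the trade between area and latency that the constraints explore.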
  • 14. Constraint-driven Creation of Memories/Memory Architecture. Simplifies designing memory architecture. C++ arrays are automatically mapped to ASIC or FPGA memories/registers. User control over memory mapping, banking, etc. Arrays on the design interface can be synthesized as memory interfaces or AXI4 master/slave interfaces. Example source: void simple_function(… ,int data[1024]){ int mem[1024]; <function body> }. [Figure: Catapult constraint GUI splitting a 43,264-word memory of width 17 into 64 RAMs, Ram1 to Ram64, of 676 words each.]
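The banking figures on this slide are consistent: 64 RAMs of 676 words hold exactly 43,264 words. A minimal sketch of how a flat array index could be split into a bank number and an offset follows. The slide does not say which partitioning scheme the tool applied, so both a block and an interleaved mapping are shown, and all names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kTotalWords = 43264;                  // from the slide
constexpr std::size_t kBanks = 64;                          // Ram1 .. Ram64
constexpr std::size_t kWordsPerBank = kTotalWords / kBanks; // 676
static_assert(kBanks * kWordsPerBank == kTotalWords,
              "banks must tile the array exactly");

struct BankAddr {
    std::size_t bank;    // which RAM, 0-based
    std::size_t offset;  // word address inside that RAM
};

// Block partitioning: consecutive addresses stay in the same bank.
constexpr BankAddr block_map(std::size_t addr) {
    return {addr / kWordsPerBank, addr % kWordsPerBank};
}

// Interleaved partitioning: consecutive addresses land in consecutive
// banks, so a burst of kBanks neighbouring words can be read in parallel.
constexpr BankAddr interleave_map(std::size_t addr) {
    return {addr % kBanks, addr / kBanks};
}
```

Which mapping suits a design depends on its access pattern: interleaving helps when neighbouring elements are needed in the same cycle, block partitioning when separate regions are accessed concurrently.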
  • 15. Verification of the Quantized Algorithm in Hardware
  • 16. Automatic Verification of C++ vs Hardware Implementation. The C++ algorithm is fully verified before synthesis. No RTL debug required. [Figure: a C++ testbench applies the same stimulus (algorithm input) to a bit-accurate C++ reference model and to the synthesizable model and checks whether the algorithm outputs are the same; Catapult C++ synthesis then produces RTL, followed by an automated RTL sanity check.]
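The comparison in this flow can be illustrated with a small self-checking testbench. This is a sketch in plain C++ under assumptions: the 3-tap sliding sum, the class SlidingSum3 and the function names are invented for illustration and do not come from the deck or from any Catapult API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference model: direct 3-tap sliding sum over the whole input,
// treating samples before the start as zero.
inline std::vector<std::int32_t> reference_model(const std::vector<std::int16_t>& in) {
    std::vector<std::int32_t> out(in.size());
    for (std::size_t n = 0; n < in.size(); ++n) {
        std::int32_t s = in[n];
        if (n >= 1) s += in[n - 1];
        if (n >= 2) s += in[n - 2];
        out[n] = s;
    }
    return out;
}

// Hardware-style model: one sample per call, history held in a
// two-entry shift register, the way a synthesizable block would keep it.
class SlidingSum3 {
public:
    std::int32_t step(std::int16_t x) {
        const std::int32_t y = x + d1_ + d2_;
        d2_ = d1_;
        d1_ = x;
        return y;
    }

private:
    std::int16_t d1_ = 0;
    std::int16_t d2_ = 0;
};

// Testbench: drive both models with the same stimulus and count the
// outputs that differ. Zero mismatches means the models agree bit for bit.
inline std::size_t count_mismatches(const std::vector<std::int16_t>& stimulus) {
    const std::vector<std::int32_t> expected = reference_model(stimulus);
    SlidingSum3 dut;
    std::size_t mismatches = 0;
    for (std::size_t n = 0; n < stimulus.size(); ++n) {
        if (dut.step(stimulus[n]) != expected[n]) ++mismatches;
    }
    return mismatches;
}
```

Because both models use integer arithmetic, the check is exact equality rather than a tolerance, which is what makes a bit-accurate reference useful.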
  • 17. Easily Test HW C++ Models Directly in Tensorflow. Swap any layer or the entire design. [Figure: a Tensorflow Python file makes a Tensorflow operator wrapper call (catapult conv2d); a Tensorflow C++ API operator wrapper invokes the HLS model in C++. The HLS model is drawn as a chain of sliding-window convolution/max pooling stages connected by FIFOs, followed by in-place convolution/max pooling, with weights and results exchanged with off-chip DRAM over AXI4 stream.]
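The sliding-window stages in this diagram process pixels as they arrive from a FIFO rather than holding a whole image. A minimal sketch of that style for 2x2, stride-2 max pooling is given below; the function name and the assumption of even width and height are illustrative, and the Tensorflow wrapper side is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Streaming 2x2, stride-2 max pooling over a row-major image.
// Assumes width and height are even. Pixels are consumed one at a time,
// and only half a row of state (the line buffer) is kept.
inline std::vector<std::int16_t> max_pool_2x2_stream(const std::vector<std::int16_t>& pixels,
                                                     std::size_t width,
                                                     std::size_t height) {
    std::vector<std::int16_t> line(width / 2);  // horizontal maxima of the even row
    std::vector<std::int16_t> out;
    out.reserve((width / 2) * (height / 2));
    std::int16_t left = 0;  // left pixel of the current horizontal pair
    for (std::size_t idx = 0; idx < width * height; ++idx) {
        const std::size_t row = idx / width;
        const std::size_t col = idx % width;
        const std::int16_t px = pixels[idx];
        if (col % 2 == 0) {
            left = px;  // wait for the right-hand neighbour
            continue;
        }
        const std::int16_t hmax = std::max(left, px);
        if (row % 2 == 0) {
            line[col / 2] = hmax;  // store until the row below arrives
        } else {
            out.push_back(std::max(line[col / 2], hmax));  // window complete
        }
    }
    return out;
}
```

For a 4x4 image holding 1 to 16 in row-major order, the function returns 6, 8, 14, 16.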
  • 18. Customer Successes and Future Direction of Machine Learning and HLS
  • 19. Chips&Media Success: Deep Learning Object Detection IP. Successfully delivered inference-targeted deep learning IP with the move to HLS: RTL designers now plan to use HLS on all future new computer vision/deep learning IP; HLS is key to finding a power-optimized, specific DNN. Cut the block/IP design and verification time in half: new DNN architecture; delivered a critical FPGA customer demonstrator early. HLS helped find the optimal power/performance architecture that RTL “would not have had time” for.
  • 20. NVIDIA Research New Methodology with Catapult. NVIDIA Research with DARPA: new methodology for 10x faster chip design. HLS to target 80% of future NVIDIA chips; open-source Matchlib HLS IP. 2 tapeouts: 20M+ gate machine learning accelerator SoC. Foundation for NVDLA HW (NVIDIA Deep Learning Accelerators). [Figure: machine learning accelerator SoC using an object-oriented HLS flow.]
  • 21. Vision: Enable Fast Path to Custom AI/Neural Network Accelerators with Catapult HLS. Build low-power HW from a trained network. Quickly produce a deployable proof-of-concept. Optimize performance, power and area when final requirements are set. Make the FPGA design flow a viable alternative to GPUs for neural networking. [Diagram labels: AI development platforms; HLS model in C++; HLS IP; Catapult HLS.]
  • 22. Conclusion. Machine learning hardware implementations are massively complex: implementing real-time HW solutions on time is very challenging; general-purpose solutions will not be power efficient. Catapult High-level Synthesis enables designers to rapidly deliver custom hardware solutions for machine learning algorithms: hardware is optimized for the ML network/algorithm, with the most efficient power. Verification in C++ is the most flexible solution: easily verify the hardware model in the Tensorflow ML framework.
  • 23. Backup Material
  • 24. Catapult HLS Resources: Catapult customer white papers. Chips&Media: Design and Verification of Deep Learning Object Detection IP. NVIDIA: Digital VLSI Flow for High-Productivity SoC Design; Hardware Accelerator for Mobile Computer Vision Applications; Design and Verification of a Machine Learning Accelerator SoC Using an Object-Oriented HLS-Based Design Flow. SeeCubic: CATAPULT HLS Enables ULTRA-D 3D without Glasses. ST Imaging: STMicroelectronics Quickly Brings Automotive Image Signal Processing to Market with HL. Google: Google White Paper; Google Presentation.
  • 25. Chips&Media Success for Deep Learning Object Detection IP. Successfully delivered inference-targeted deep learning IP with the move to HLS: RTL designers now plan to use HLS on all future new computer vision/deep learning IP; HLS is key to finding a power-optimized, specific DNN. Cut the block/IP design and verification time in half: new DNN architecture; delivered a critical FPGA customer demonstrator early. HLS helped find the optimal power/performance architecture that RTL “would not have had time” for. New detailed white paper: Design and Verification of Deep Learning Object Detection IP.
  • 26. NVIDIA Research New Methodology with Catapult. NVIDIA Research with DARPA: new methodology for 10x faster chip design; HLS to target 80% of future NVIDIA chips. 2 tapeouts: 20M+ gate machine learning accelerator SoC. Used for SoC performance verification: 30X RTL, <2.6% error in cycle count. Foundation for NVDLA HW (NVIDIA Deep Learning Accelerators). DAC papers (2016, 2018) available now: Digital VLSI Flow for High-Productivity SoC Design; Hardware Accelerator for Mobile Computer Vision Applications; Design and Verification of a Machine Learning Accelerator SoC Using an Object-Oriented HLS-Based Design Flow. [Figure: machine learning accelerator SoC using an object-oriented HLS flow.]
  • 27. NVIDIA Achieves Cost Reduction of ~80% for Functional Verification with Catapult. Used in production-level, automotive-targeted SoCs. C++ functional verification runtime uses ~500x fewer resources than RTL. Fast verification makes rapid product changes possible: VP9/HEVC code changed from 8 to 10 bit color depth in 2 weeks; change from 20nm/500MHz to 28nm/800MHz in 3 days with HLS. Traditional RTL functional regression: 3 months on 1000 CPUs. HLS C++ functional regression: 2 weeks on 14 CPUs. [Figure: NVIDIA Xavier 12nFF SoC, “Most Complex SoC Ever Made”, 9 billion transistors, ~8,000 man years, with the video processor and DLA blocks marked.] NVIDIA case study available on mentor.com.
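The regression figures on this slide can be checked against the ~500x resource claim with a rough calculation. It assumes that 3 months is about 13 weeks and that every CPU is busy for the whole run; neither assumption is stated on the slide.

```latex
\begin{align*}
\text{RTL regression} &\approx 13\ \text{weeks} \times 1000\ \text{CPUs} = 13{,}000\ \text{CPU-weeks} \\
\text{HLS C++ regression} &= 2\ \text{weeks} \times 14\ \text{CPUs} = 28\ \text{CPU-weeks} \\
\text{ratio} &\approx \frac{13{,}000}{28} \approx 464
\end{align*}
```

Under those assumptions the ratio is about 464, which is in line with the slide's rounded figure of ~500x.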
  • 28. FotoNation Next-Gen Mobile Face Recognition With Catapult. DAC presentation: “A Designer’s Life with HLS - Faster Computer Vision/Neural Networks”. “3 weeks from Caffe to FPGA”: the initial FPGA from a unique C algorithm ran at 10fps; HLS for the desired µArchitecture delivered a 30fps FPGA at 100MHz. Faster, easier reuse, testing and customization: “4x faster than hand coding”; “Verification is Easier - Bit exact between HW and C is native”; instant retargeting to optimal ASIC RTL. [Figure caption: 3+ B devices; high performance, low-power computational imaging.]
  • 29. SeeCubic/StreamTV Networks Uses Catapult HLS to Deliver Realistic 3D Experience without Glasses. New Ultra-D branded technology and algorithms give a far more realistic 3D display. Target markets: automotive, medical and consumer. “Catapult HLS came to the rescue”: first, they had to prove the image quality and algorithms and demonstrate them on FPGA; this enables work with partners to embed the technology in ASIC/SoC; only the Catapult HLS methodology delivers the needed technology independence. Presented at DAC 2017; white paper: CATAPULT HLS Enables ULTRA-D 3D without Glasses.
  • 30. ST Imaging HLS Success for ISP (Automotive). To date created 50+ image processing IPs using HLS Imaging Template. Why they use HLS and Catapult (their words): increase IP value; improve IP performance versus power and area; reduce project cost; reduce IP development from 24 weeks to 4 weeks. Experience with HLS: less code to write and debug; fast integration of new features; algorithm and architecture exploration possible; fast verification using C++. On-demand webinar and white paper: STMicroelectronics Quickly Brings Automotive Image Signal Processing to Market with H
  • 31. Google Continues Video CODEC Success with Catapult HLS. AV1 improves compression by 40-50% over VP9/HEVC. Goal: a high-bandwidth, free-of-charge CODEC released every 3-4 years (rather than every 10, as with HEVC). Catapult HLS on VP9 CODEC: time to verified RTL 2x faster; simulation speed 500x faster; >99% of bugs caught in C simulation. Catapult HLS on AV1 CODEC: productivity (90% less code, fewer bugs); leverage the whole team (algorithm, architect, HW, DV); flexibility (SW-like process, late-stage algorithm changes are easy); empowering HW engineers (work on interesting/important problems); rapid HW prototyping (rapidly evaluate new ideas and algorithms). Resources: Google Presentation; Google White Paper.