SlideShare uma empresa Scribd logo
1 de 18
Baixar para ler offline
1
High-Efficiency Edge Vision
Processing Based on
Dynamically Reconfigurable
TPU Technology
Cheng C. Wang, Co-Founder and CTO
Flex Logix Technologies
© 2022 Flex Logix Technologies
Flex Logix: A technology leader
2
Cheng Wang, CTO, Co-founder
• Industry expert with track record in
tech innovation
• Winner: ISSCC Outstanding Paper
Award, the premier chip design
award. (Recent winners include IBM,
Toshiba, Nvidia and Sandisk)
Flex Logix:
• Founded in 2014
• Profitable embedded FPGA IP business
• Edge AI Inference accelerator based on tensor array + programmable logic
• logic
• Backed by top technology and innovation investors
• Growing rapidly – We’re hiring!
• Based in Mountain View, CA, with offices in Austin, TX
Geoff Tate, CEO
• Experienced executive taking
company public
• Rambus: 4 people to IPO to
$2B
© 2022 Flex Logix Technologies
Short history of vision processing
In 1966, American computer scientist and co-founder of the
MIT AI Lab Marvin Minsky hired a first-year undergraduate
student, Gerald Sussman, to spend the summer linking a
camera to a computer and getting the computer to describe
what it saw.
Needless to say, Sussman didn’t make the deadline.
Vision turned out to be one of the most difficult and
frustrating challenges in AI over the next four decades.
3
© 2022 Flex Logix Technologies
• Operator Types: 11x11, 5x5, 3x3, MaxPool 3x3s2, FC
• Total Layers: 8
• Output is classification to 1000 classes
• Operations per Inference: 724 Million
Operations per Inference: 724 Million
AlexNet 2012 (ten years ago)
ImageNet competition winner
4
© 2022 Flex Logix Technologies
• General Purpose Processing?
• Operating Systems?
• Rugged/Industrial Computers?
• Network Connectivity?
• Software Paradigm?
• Imaging Sensor?
What are the tough new problems?
All Solved Problems
5
© 2022 Flex Logix Technologies
• Software Complexity
• How do you program a trillion
operations?
• TeraOp Processing Efficiency
• How do you fit TeraOps in a factory?
What are the tough new problems?
6
© 2022 Flex Logix Technologies
Miniaturizing AI vision systems
Providing TeraOps of vision processing into smaller form-factors is a real challenge
Industrial Vision Computers
Compact Vision Box
High Power – 250 W Card
System Power – 750 W
Low Power – 8 W Card
System Power – 30 W
$$$$ $
7
© 2022 Flex Logix Technologies
• Good for inference and training
• Tons of memory bandwidth
• Numerous precision choices
• Good SW ecosystem to get started
• But ..
• Large, expensive, power hungry
• Further amplified at the system level
• Easy to get started, but hard to optimize for
• Support difficulties
• Supply-chain difficulties
GPUs offer flexibility but at a price
Flexible but…
• Large
• Expensive
• Power hungry
8
© 2022 Flex Logix Technologies
It’s all about the memory
GPUs are architected for DDR, not local memories
• Computation is designed to access GDDR memory
• NVLink used for further expansion
• Only 4 + 12 MB of L2 + RF
Highly parallelized version of Von Neumann
architecture
• Still inefficient, but highly flexible
Turing TU104 Full Chip Diagram
9
© 2022 Flex Logix Technologies
• Models are evolving rapidly, even the same-
model is going through incremental changes
• Different versions, flavors (s/m/l/xl), image
sizes, are added
• New operators are introduced regularly
• Flexible architecture is a MUST
• Dedicated ASICs cannot chase a moving target
Fast model evolution – Flexibility is key
YOLOv5 Changes
V1 May, 2020 Initial Release
V2 July, 2020 LeakyReLU(0.1)
V3 August 2020 HardSwish activations added
V4 January 2021 siLU() replaces leakyReLU and
hardswish
V5 April 2021 Added P6 (1280x1280) models
V6 October 2021 SPP -> SPPF, C3 changes
10
© 2022 Flex Logix Technologies
• Graph streaming reduces DRAM
requirements, but BW matching
is difficult
• Each operator requires different
amount of compute
• How to maintain efficiency with
load imbalance?
Load-balancing difficult for streaming architectures
0
0.5
1
1.5
2
2.5
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100 103 106 109
TeraOps per Layer
Model Layer
11
© 2022 Flex Logix Technologies
Efficient data access:
Each 1D TPU core can stream data from:
• Neighboring 1D TPU (dedicated)
• Any 1D TPU (via XFLX)
• L2 SRAM (via XFLX)
• DDR (via NoC)
Flexible control and data path:
• “Future proof” compute, activations, and
generic operators via EFLX
Dynamic TPU:
Flexible, balanced & memory-efficient
12
© 2022 Flex Logix Technologies
• Stream data at a sub-graph level with efficient BW matching
Layer fusion – Match workloads at sublayer level
0
0.5
1
1.5
2
2.5
1
4
7
10
13
16
19
22
25
28
31
34
37
40
43
46
49
52
55
58
61
64
67
70
73
76
79
82
85
88
91
94
97
100
103
106
109
TeraOps per Layer
L2 1DTPU
+
1DTPU
L2
1DTPU
1DTPU
L2
L2 1DTPU
+
1DTPU
L2
1DTPU
1DTPU
L2
Input
Memory
MACS Misc Ops
Output
Memory
L2 1DTPU
+
1DTPU
L2
1DTPU
1DTPU
L2 1DTPU
+
1DTPU
L2
1DTPU
1DTPU
+
1DTPU
1DTPU
+
1DTPU
1DTPU
1DTPU
1DTPU
13
© 2022 Flex Logix Technologies
Comparing GPU to our dynamic TPU
GPU Dynamic TPU
Lots of MACs Fewer MACs, but more efficiently used
Computes predominantly via GDDR Compute via local connections, XFLX connections, flexible L2s,
and DDR
Lots of SW but hard to achieve efficiency Easy to achieve efficiency with X1 XDK
Many GDDR Memory (256-bit) Few LPDDR (32-bit)
75 – 300 W (typ.) 6 – 10 W (typ.)
Flexible Equally flexible via EFLX & XFLX
Large, expensive & brute-force Small, low-cost & efficient
14
© 2022 Flex Logix Technologies
Embedded Application
+ X1
X1 drivers X1 Runtime
Embedded App at The Edge
Runtime API
Neural Net
Conv
AvgPool
MaxPool
Concat
.ncf
InferXDK
Software
Development
Toolkit
System level view of InferXDK software
15
NN Model Framework
© 2022 Flex Logix Technologies
2019: Xin Feng, Computer vision algorithms
and hardware implementations: A survey
• X1 provides ASIC performance/efficiency with
flexibility of software
• InferX SDK directly converts neural network graph
model to dynamic InferX hardware instance
• Much more flexible & future proof vs ASIC solutions
• Much higher efficiency (Inf/W & Inf/$)
vs CPU and GPU based solution
• Thus enabling compact form
factors such as M.2 2280 B+M
>2x
efficiency
Superior performance with SW flexibility
InferX
© 2022 Flex Logix Technologies
• Putting TeraOps of performance in low power edge devices is today’s challenge
• InferX was designed from scratch to solve this problem
• > 10x improvement in efficiency versus comparable GPUs
• Smart architecture reduces memory bandwidth and capacity requirements
• While supporting complex models at high throughput
• Designed to fit in small and low power system form factors
Come visit us in the expo hall for demonstrations for InferX technology
Conclusions
17
Thank you!

Mais conteúdo relacionado

Semelhante a “High-Efficiency Edge Vision Processing Based on Dynamically Reconfigurable TPU Technology,” a Presentation from Flex Logix

Oracle ExaLogic Overview
Oracle ExaLogic OverviewOracle ExaLogic Overview
Oracle ExaLogic OverviewPeter Doolan
 
Automated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsAutomated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsJay Bryant
 
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...Edge AI and Vision Alliance
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_KaliaAchhar Kalia
 
Lenovo networking: top of the top of the rack
Lenovo networking: top of the top of the rackLenovo networking: top of the top of the rack
Lenovo networking: top of the top of the rackLenovo Data Center
 
Data Center of the Future v1.0.pptx
Data Center of the Future v1.0.pptxData Center of the Future v1.0.pptx
Data Center of the Future v1.0.pptxjuergenJaeckel
 
Ultime Novità di Prodotto Neo4j
Ultime Novità di Prodotto Neo4j Ultime Novità di Prodotto Neo4j
Ultime Novità di Prodotto Neo4j Neo4j
 
Microsofts Configurable Cloud
Microsofts Configurable CloudMicrosofts Configurable Cloud
Microsofts Configurable CloudChris Genazzio
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsGanesan Narayanasamy
 
OVNC 2015-Software-Defined Networking: Where Are We Today?
OVNC 2015-Software-Defined Networking: Where Are We Today?OVNC 2015-Software-Defined Networking: Where Are We Today?
OVNC 2015-Software-Defined Networking: Where Are We Today?NAIM Networks, Inc.
 
Consumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxConsumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxRebekah Rodriguez
 
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...“Challenges in Architecting Vision Inference Systems for Transformer Models,”...
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...Edge AI and Vision Alliance
 
Intels presentation at blue line industrial computer seminar
Intels presentation at blue line industrial computer seminarIntels presentation at blue line industrial computer seminar
Intels presentation at blue line industrial computer seminarBlue Line
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableRebekah Rodriguez
 
Moving to software-based production workflows and containerisation of media a...
Moving to software-based production workflows and containerisation of media a...Moving to software-based production workflows and containerisation of media a...
Moving to software-based production workflows and containerisation of media a...Kieran Kunhya
 
Building managedprivatecloud kvh_vancouversummit
Building managedprivatecloud kvh_vancouversummitBuilding managedprivatecloud kvh_vancouversummit
Building managedprivatecloud kvh_vancouversummitmatsunota
 
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraWhite Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraCharo Sanchez
 
Cisco storage networking protect scale-simplify_dec_2016
Cisco storage networking   protect scale-simplify_dec_2016Cisco storage networking   protect scale-simplify_dec_2016
Cisco storage networking protect scale-simplify_dec_2016Tony Antony
 

Semelhante a “High-Efficiency Edge Vision Processing Based on Dynamically Reconfigurable TPU Technology,” a Presentation from Flex Logix (20)

Rendering in the Cloud
Rendering in the CloudRendering in the Cloud
Rendering in the Cloud
 
Oracle ExaLogic Overview
Oracle ExaLogic OverviewOracle ExaLogic Overview
Oracle ExaLogic Overview
 
Automated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsAutomated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge Clouds
 
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...
“Introducing the Kria Robotics Starter Kit: Robotics and Machine Vision for S...
 
Resume_Achhar_Kalia
Resume_Achhar_KaliaResume_Achhar_Kalia
Resume_Achhar_Kalia
 
Lenovo networking: top of the top of the rack
Lenovo networking: top of the top of the rackLenovo networking: top of the top of the rack
Lenovo networking: top of the top of the rack
 
Data Center of the Future v1.0.pptx
Data Center of the Future v1.0.pptxData Center of the Future v1.0.pptx
Data Center of the Future v1.0.pptx
 
Ultime Novità di Prodotto Neo4j
Ultime Novità di Prodotto Neo4j Ultime Novità di Prodotto Neo4j
Ultime Novità di Prodotto Neo4j
 
Microsofts Configurable Cloud
Microsofts Configurable CloudMicrosofts Configurable Cloud
Microsofts Configurable Cloud
 
Xilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systemsXilinx Edge Compute using Power 9 /OpenPOWER systems
Xilinx Edge Compute using Power 9 /OpenPOWER systems
 
OVNC 2015-Software-Defined Networking: Where Are We Today?
OVNC 2015-Software-Defined Networking: Where Are We Today?OVNC 2015-Software-Defined Networking: Where Are We Today?
OVNC 2015-Software-Defined Networking: Where Are We Today?
 
Building a Digital Telco
Building a Digital TelcoBuilding a Digital Telco
Building a Digital Telco
 
Consumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxConsumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a Box
 
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...“Challenges in Architecting Vision Inference Systems for Transformer Models,”...
“Challenges in Architecting Vision Inference Systems for Transformer Models,”...
 
Intels presentation at blue line industrial computer seminar
Intels presentation at blue line industrial computer seminarIntels presentation at blue line industrial computer seminar
Intels presentation at blue line industrial computer seminar
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
 
Moving to software-based production workflows and containerisation of media a...
Moving to software-based production workflows and containerisation of media a...Moving to software-based production workflows and containerisation of media a...
Moving to software-based production workflows and containerisation of media a...
 
Building managedprivatecloud kvh_vancouversummit
Building managedprivatecloud kvh_vancouversummitBuilding managedprivatecloud kvh_vancouversummit
Building managedprivatecloud kvh_vancouversummit
 
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected EraWhite Box Hardware Challenges in the 5G & IoT Hyperconnected Era
White Box Hardware Challenges in the 5G & IoT Hyperconnected Era
 
Cisco storage networking protect scale-simplify_dec_2016
Cisco storage networking   protect scale-simplify_dec_2016Cisco storage networking   protect scale-simplify_dec_2016
Cisco storage networking protect scale-simplify_dec_2016
 

Mais de Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

Mais de Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Último

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Último (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

“High-Efficiency Edge Vision Processing Based on Dynamically Reconfigurable TPU Technology,” a Presentation from Flex Logix

  • 1. 1 High-Efficiency Edge Vision Processing Based on Dynamically Reconfigurable TPU Technology Cheng C. Wang, Co-Founder and CTO Flex Logix Technologies
  • 2. © 2022 Flex Logix Technologies Flex Logix: A technology leader 2 Cheng Wang, CTO, Co-founder • Industry expert with track record in tech innovation • Winner: ISSCC Outstanding Paper Award, the premier chip design award. (Recent winners include IBM, Toshiba, Nvidia and Sandisk) Flex Logix: • Founded in 2014 • Profitable embedded FPGA IP business • Edge AI Inference accelerator based on tensor array + programmable logic • logic • Backed by top technology and innovation investors • Growing rapidly – We’re hiring! • Based in Mountain View, CA, with offices in Austin, TX Geoff Tate, CEO • Experienced executive taking company public • Rambus: 4 people to IPO to $2B
  • 3. © 2022 Flex Logix Technologies Short history of vision processing In 1966, American computer scientist and co-founder of the MIT AI Lab Marvin Minsky hired a first-year undergraduate student, Gerald Sussman, to spend the summer linking a camera to a computer and getting the computer to describe what it saw. Needless to say, Sussman didn’t make the deadline. Vision turned out to be one of the most difficult and frustrating challenges in AI over the next four decades. 3
  • 4. © 2022 Flex Logix Technologies • Operator Types: 11x11, 5x5, 3x3, MaxPool 3x3s2, FC • Total Layers: 8 • Output is classification to 1000 classes • Operations per Inference: 724 Million Operations per Inference: 724 Million AlexNet 2012 (ten years ago) ImageNet competition winner 4
  • 5. © 2022 Flex Logix Technologies • General Purpose Processing? • Operating Systems? • Rugged/Industrial Computers? • Network Connectivity? • Software Paradigm? • Imaging Sensor? What are the tough new problems? All Solved Problems 5
  • 6. © 2022 Flex Logix Technologies • Software Complexity • How do you program a trillion operations? • TeraOp Processing Efficiency • How do you fit TeraOps in a factory? What are the tough new problems? 6
  • 7. © 2022 Flex Logix Technologies Miniaturizing AI vision systems Providing TeraOps of vision processing into smaller form-factors is a real challenge Industrial Vision Computers Compact Vision Box High Power – 250 W Card System Power – 750 W Low Power – 8 W Card System Power – 30 W $$$$ $ 7
  • 8. © 2022 Flex Logix Technologies • Good for inference and training • Tons of memory bandwidth • Numerous precision choices • Good SW ecosystem to get started • But .. • Large, expensive, power hungry • Further amplified at the system level • Easy to get started, but hard to optimize for • Support difficulties • Supply-chain difficulties GPUs offer flexibility but at a price Flexible but… • Large • Expensive • Power hungry 8
  • 9. © 2022 Flex Logix Technologies It’s all about the memory GPUs are architected for DDR, not local memories • Computation is designed to access GDDR memory • NVLink used for further expansion • Only 4 + 12 MB of L2 + RF Highly parallelized version of Von Neumann architecture • Still inefficient, but highly flexible Turing TU104 Full Chip Diagram 9
  • 10. © 2022 Flex Logix Technologies • Models are evolving rapidly, even the same- model is going through incremental changes • Different versions, flavors (s/m/l/xl), image sizes, are added • New operators are introduced regularly • Flexible architecture is a MUST • Dedicated ASICs cannot chase a moving target Fast model evolution – Flexibility is key YOLOv5 Changes V1 May, 2020 Initial Release V2 July, 2020 LeakyReLU(0.1) V3 August 2020 HardSwish activations added V4 January 2021 siLU() replaces leakyReLU and hardswish V5 April 2021 Added P6 (1280x1280) models V6 October 2021 SPP -> SPPF, C3 changes 10
  • 11. © 2022 Flex Logix Technologies • Graph streaming reduces DRAM requirements, but BW matching is difficult • Each operator requires different amount of compute • How to maintain efficiency with load imbalance? Load-balancing difficult for streaming architectures 0 0.5 1 1.5 2 2.5 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100 103 106 109 TeraOps per Layer Model Layer 11
  • 12. © 2022 Flex Logix Technologies Efficient data access: Each 1D TPU core can stream data from: • Neighboring 1D TPU (dedicated) • Any 1D TPU (via XFLX) • L2 SRAM (via XFLX) • DDR (via NoC) Flexible control and data path: • “Future proof” compute, activations, and generic operators via EFLX Dynamic TPU: Flexible, balanced & memory-efficient 12
  • 13. © 2022 Flex Logix Technologies • Stream data at a sub-graph level with efficient BW matching Layer fusion – Match workloads at sublayer level 0 0.5 1 1.5 2 2.5 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100 103 106 109 TeraOps per Layer L2 1DTPU + 1DTPU L2 1DTPU 1DTPU L2 L2 1DTPU + 1DTPU L2 1DTPU 1DTPU L2 Input Memory MACS Misc Ops Output Memory L2 1DTPU + 1DTPU L2 1DTPU 1DTPU L2 1DTPU + 1DTPU L2 1DTPU 1DTPU + 1DTPU 1DTPU + 1DTPU 1DTPU 1DTPU 1DTPU 13
  • 14. © 2022 Flex Logix Technologies Comparing GPU to our dynamic TPU GPU Dynamic TPU Lots of MACs Fewer MACs, but more efficiently used Computes predominantly via GDDR Compute via local connections, XFLX connections, flexible L2s, and DDR Lots of SW but hard to achieve efficiency Easy to achieve efficiency with X1 XDK Many GDDR Memory (256-bit) Few LPDDR (32-bit) 75 – 300 W (typ.) 6 – 10 W (typ.) Flexible Equally flexible via EFLX & XFLX Large, expensive & brute-force Small, low-cost & efficient 14
  • 15. © 2022 Flex Logix Technologies Embedded Application + X1 X1 drivers X1 Runtime Embedded App at The Edge Runtime API Neural Net Conv AvgPool MaxPool Concat .ncf InferXDK Software Development Toolkit System level view of InferXDK software 15 NN Model Framework
  • 16. © 2022 Flex Logix Technologies 2019: Xin Feng, Computer vision algorithms and hardware implementations: A survey • X1 provides ASIC performance/efficiency with flexibility of software • InferX SDK directly converts neural network graph model to dynamic InferX hardware instance • Much more flexible & future proof vs ASIC solutions • Much higher efficiency (Inf/W & Inf/$) vs CPU and GPU based solution • Thus enabling compact form factors such as M.2 2280 B+M >2x efficiency Superior performance with SW flexibility InferX
  • 17. © 2022 Flex Logix Technologies • Putting TeraOps of performance in low power edge devices is today’s challenge • InferX was designed from scratch to solve this problem • > 10x improvement in efficiency versus comparable GPUs • Smart architecture reduces memory bandwidth and capacity requirements • While supporting complex models at high throughput • Designed to fit in small and low power system form factors Come visit us in the expo hall for demonstrations for InferX technology Conclusions 17