1. A Software/Hardware Co-Design Framework
for the ‘Internet of Eyes’
Cathal Garry, Derek Molloy
Entwine Centre for IoT, Dublin City University
2. Introduction
o The main challenge examined in this paper was to bring ‘eyes’ to the
Internet of Things in real time
o Background research indicates current technologies that can facilitate this
are:
Cloud Computing
GPUs
FPGAs
Neuromorphic chipsets
SDSoCs
3. What are SDSoCs?
o An SDSoC is an integrated circuit that contains a processor, a number of
peripherals and some programmable logic
o An SDSoC such as the Xilinx Zynq chipset consists of two main components on the same
SoC:
The processing system (PS)
The programmable logic (PL)
o The PL is used to create custom IP (intellectual property), which is linked to the
processing system using standard AMBA AXI interfaces
o The processing system is used to run a software stack, which can access this
custom IP in the programmable logic
4. What are SDSoCs?
o The PL can be updated by software
either before or dynamically
during run-time operation
o This effectively allows software to
redefine the hardware
o This simplifies the software
development flow
[Xilinx SDSoC Overview]
5. Advantages and Disadvantages
o Cloud computing can offer real-time image processing while saving on local
power consumption, but in areas with restricted network access latency can be a
problem
o GPUs offer real-time image processing at the edge but have high power
requirements
o FPGAs can offer real-time image processing with relatively low power
consumption, but they require a developer to have a high level of expertise
o Neuromorphic chipsets like the Movidius compute stick are relatively new to the
market and require a high level of expertise in order to implement a solution
o Of the options considered, SDSoC is the only one that can offer low power consumption and real-time
image processing while keeping development complexity relatively low
7. Architecture
o The aim of this architecture
was to develop a solution that
could provide low power
consumption and real time
image processing using an
SDSoC
o The proposed architecture is
made up of three components
The producer
The handler
The consumer
o The architecture was applied to a
chosen application: a motorway
controlled by variable speed
limits
8. The Producer
o The SDSoC that was chosen for this
research was the Xilinx Zynq chipset
o The processing system on the Xilinx
Zynq chipset contains an ARM Cortex-A9
processor along with a number of
standard peripherals like UART and I2C
o The programmable logic contains a
number of system gates, DSP and RAM
o There are many Zynq platforms
available on the market; the one
chosen for this research was the
PYNQ platform
[Xilinx Zybo]
9. PYNQ Platform
o PYNQ (Python Productivity for Zynq) is a Xilinx platform that provides a software
stack that allows developers to access the benefits of an FPGA without learning
advanced hardware design skills
o The PYNQ platform provides this support through Python libraries for accessing the PL
o The PYNQ platform can be accessed over UART or through Jupyter notebooks
[Xilinx PYNQ]
10. PYNQ Platform
o The PYNQ platform runs an Ubuntu-based Linux, which is optimized for
developer productivity and provides support for many standard drivers and
libraries
o The framework also provides a feature called overlays, which allows the
hardware in the PL to be reprogrammed at run time
o The PYNQ framework can also be ported to other Zynq-based platforms
o The application of a variable speed limit (VSL) controlled motorway was
implemented by splitting the application between the PS and PL
11. The Producer – Processing System
o The PS was used to monitor the vehicle count and send data to the
handler using MQTT
o The PS read the result from a register in the custom IP. This was read
over an AXI Lite interface for each frame in the input video stream
o The PS also stored a history of the count values provided by the custom
IP in the PL. This history was then used to create a congestion level on
the motorway
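The step from a count history to a congestion level might be sketched as follows. This is only an illustration: the window length, the thresholds and the names (CongestionTracker, level) are assumptions, not values or code from the research.

```python
from collections import deque


class CongestionTracker:
    """Rolling history of per-frame vehicle counts (hypothetical helper)."""

    def __init__(self, window=30, thresholds=(5, 15)):
        # window and thresholds are assumed values, not taken from the slides
        self.history = deque(maxlen=window)
        self.low, self.high = thresholds

    def add(self, count):
        """Store the count read for the latest frame."""
        self.history.append(count)

    def level(self):
        """Classify the mean count over the window as low, medium or high."""
        if not self.history:
            return "unknown"
        mean = sum(self.history) / len(self.history)
        if mean < self.low:
            return "low"
        if mean < self.high:
            return "medium"
        return "high"
```

The resulting level string is the kind of small piece of information that could then be published to the handler over MQTT.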
12. Producer – Programmable Logic
o The PL was used to implement a
custom IP for counting the
number of vehicles in a given
image frame
o Performed by implementing a
number of image processing
techniques in Vivado HLS
o Result of this image
processing was stored in a
register which could be
accessed over an AXI Lite
interface
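From Python, a register like this is typically read through a memory-mapped I/O object that exposes read(offset), such as pynq.MMIO. The sketch below accepts any such object so that it can run without a board; the register offset 0x10 and the FakeMMIO stand-in are assumptions for illustration.

```python
COUNT_REG_OFFSET = 0x10  # assumed offset of the result register


def read_vehicle_count(mmio, offset=COUNT_REG_OFFSET):
    """Read the 32-bit count register through an object exposing read(offset)."""
    return mmio.read(offset) & 0xFFFFFFFF


class FakeMMIO:
    """Stand-in for a memory-mapped register file, for running without hardware."""

    def __init__(self, registers):
        self.registers = dict(registers)

    def read(self, offset):
        # Unwritten registers read back as zero in this stand-in
        return self.registers.get(offset, 0)


if __name__ == "__main__":
    print(read_vehicle_count(FakeMMIO({COUNT_REG_OFFSET: 7})))
```

On a board, the mmio argument could be a pynq.MMIO instance created with the base address and length assigned to the custom IP in the hardware design.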
13. The Handler and Consumer
o Remaining parts of the
architecture are the
handler and consumer
o The handler acts as an
intermediate agent
between the producers
and consumers in the
network
o The consumer acts as an
endpoint for the
producers' data.
It can receive data from a
single or multiple
producers in order to make
a decision
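For the variable speed limit application, a consumer decision over several producers could look like the sketch below. The level names, the speed values and the rule of taking the most restrictive limit are all assumptions made for illustration; the slides do not state the actual decision logic.

```python
SPEED_BY_LEVEL = {"low": 120, "medium": 100, "high": 60}  # km/h, assumed values


def choose_speed_limit(levels, default=120):
    """Pick the most restrictive limit over the levels reported by producers."""
    limits = [SPEED_BY_LEVEL[lv] for lv in levels if lv in SPEED_BY_LEVEL]
    # Fall back to the default when no producer reported a known level
    return min(limits) if limits else default
```

With a single producer the list simply holds one entry, so the same function covers both the single and the multiple producer case.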
16. Power Consumption
o The power consumption
was measured using a
number of different
profiles including:
Different types and amounts
of programmable logic in
the PL
Different processor states
in the PS
o The worst-case power
consumption when
performing image
processing in both the PL
and the PS was 2.5 W
17. Response Time
o The response time was
measured using a number of
different platforms and
processors
o The tests were also varied
using a number of different
image processing tasks
across different image
resolutions
o Worst-case response time
for the PL when processing a
1080p image at 30 fps was
40 ms
o This increases to 50 ms when
testing a 1080p image at
60 fps
18. MQTT Latency
o MQTT was used as the
transfer protocol so it was
important to determine
the latency across the
network
o The MQTT latency was
measured by varying the
number of messages
published per second
o The response time is in
the msec range, provided the
number of published
messages is less than 100
per second
o Above this rate the latency
increases by 1000x
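One way to measure publish-to-receive latency is to embed the send time in each message and subtract it on receipt, which relies on the producer and consumer clocks being synchronized. The sketch below shows only that bookkeeping; the JSON layout and field names are assumptions, and the actual MQTT publish and subscribe calls are left out.

```python
import json
import time


def make_payload(count, sent_at=None):
    """Encode a count together with its send time (seconds since the epoch)."""
    if sent_at is None:
        sent_at = time.time()
    return json.dumps({"count": count, "sent_at": sent_at})


def latency_ms(payload, received_at=None):
    """Return receive time minus the embedded send time, in milliseconds."""
    if received_at is None:
        received_at = time.time()
    message = json.loads(payload)
    return (received_at - message["sent_at"]) * 1000.0
```

Repeating the measurement while stepping up the number of messages published per second would give a latency-versus-rate curve of the kind described above.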
19. Register Access Times
o Since the result is
produced for each frame
in a video stream it was
important to determine
the register access time
from the PL
o The register access time
was measured over a
varying number of reads
per second across a
number of iterations
o The worst-case latency
was ~100 µs, which is well
within the ~16.7 ms frame
period of a 60 fps video
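The comparison between register access time and frame rate can be checked with a short calculation. The function names are illustrative; the 100 µs and 60 fps figures are the ones quoted on this slide.

```python
def frame_period_us(fps):
    """Time available per frame, in microseconds."""
    return 1_000_000 / fps


def budget_fraction(access_us, fps):
    """Fraction of one frame period consumed by a single register read."""
    return access_us / frame_period_us(fps)


if __name__ == "__main__":
    print(round(frame_period_us(60)))                # 16667
    print(round(budget_fraction(100, 60) * 100, 1))  # 0.6
```

A single read per frame therefore uses well under 1% of the time available at 60 fps.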
20. Overlay Switching
o Overlay switching allows
the user to change the
logic in the PL at run time
– e.g., change from day
time to night time image
processing algorithm
o This test measures how
long it takes for the
programmable logic to
change and for the image
processing to restart
o The worst overlay
switching time found in
this test was 30 seconds
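A timing harness for this kind of test might look like the sketch below. It takes the overlay load and the processing restart as callables so that it runs without hardware; the restart hook and the bitstream name in the comment are hypothetical.

```python
import time


def time_switch(load_overlay, restart_processing=lambda: None):
    """Return the seconds taken to load an overlay and restart processing."""
    start = time.perf_counter()
    load_overlay()        # e.g. lambda: Overlay("night.bit") on a PYNQ board
    restart_processing()  # hypothetical hook that restarts the video pipeline
    return time.perf_counter() - start
```

On a PYNQ board, constructing pynq.Overlay with a bitstream path downloads that bitstream to the PL by default, so a lambda wrapping it can serve as load_overlay.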
21. Other Analysis
o The implementation in this research found some types of
image processing are better suited to SDSoC than others:
Image processing techniques that are very spatially localized in
nature perform better in the programmable logic
The further away a required pixel is (spatially) the more memory is
required to store it
An alternative approach is to split the image processing
between the PS and PL (e.g., for higher level reasoning)
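The memory cost of spatial distance can be illustrated with a common streaming design, in which previous image rows are held in line buffers so that a window of pixels is available at once. Under that assumption (the slides do not describe the buffering scheme used), a window spanning K rows needs K - 1 full rows of storage:

```python
def line_buffer_bytes(width, kernel_rows, bytes_per_pixel=1):
    """Bytes needed to hold the (kernel_rows - 1) previous rows of a stream."""
    if kernel_rows < 1:
        raise ValueError("kernel_rows must be at least 1")
    return (kernel_rows - 1) * width * bytes_per_pixel


if __name__ == "__main__":
    # 1080p frames are 1920 pixels wide; 8-bit greyscale pixels assumed
    for k in (3, 5, 11):
        print(k, line_buffer_bytes(1920, k))  # 3840, 7680, 19200 bytes
```

Storage grows linearly with the vertical reach of the operation, which is why tightly localized filters fit more comfortably in the programmable logic.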
23. Conclusion
o The architecture provides a scalable IoT design using
software/hardware co-design for real-time IoE applications
o The research provides an implementation and evaluation of this
architecture through the development of a full stack IoE application
24. Questions?
ACKNOWLEDGEMENTS
This research was supported by Xilinx Inc. who provided the PYNQ platform used in this project.
In particular we would like to thank Cathal McCabe and Peter Ogden from Xilinx Inc who provided
technical support during the research. We would also like to thank the Intel Corporation for their
support throughout the DCU master’s program and the development of this research.
Cloud computing, GPUs and FPGAs have been around for a while, but SDSoCs and neuromorphic chipsets are relatively new
- When I looked at the Movidius compute stick, the level of complexity involved in using it in a full network stack was quite high and there was limited support available
- The producer's role is to process the large amounts of image data at the edge and reduce it down to a small piece of information. This small piece of information is then sent to the handler
The handler acts as an intermediate agent between the producer and consumer. It is responsible for collecting and storing the small pieces of information provided by the producers.
The consumer is responsible for making a decision based on the data received from the producer. This decision can be based on a single producer or on multiple producers.
Explain what AXI Lite is here
Video stream of the motorway is output by the laptop
The handler is implemented on a virtual machine running Ubuntu linux
The image on the monitor is of the post processed image
- NTP server used to synchronize the consumer and producer clocks
The architecture outlined provides a scalable IoT architecture in which the number of producers or consumers could easily be increased. For the chosen application, the structure could easily be changed to have a large number of producers along the motorway monitoring traffic and feeding this information to a single consumer.