SlideShare uma empresa Scribd logo
1 de 24
Intermediate Fabrics:
Virtual Architectures for
Near-Instant FPGA
Compilation
Greg Stitt and James Coole, Member, IEEE
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
Field-Programmable Gate Arrays (FPGAs) are digital
integrated circuits (ICs) that contain configurable blocks of
logic along with configurable interconnects between these
blocks . Specifically, an FPGA contains programmable logic
components called logic elements (LEs) and a hierarchy of
reconfigurable interconnects that allow the LEs to be physically
connected. LEs can be configured to perform complex combinational
functions, or merely simple logic gates like AND and XOR. In most
FPGAs, the logic blocks also include memory elements, which may be
simple flip-flops or more complete blocks of memory .
FPGAs are generally favoured in prototype building
because the device does not need to be thrown away
every time a change is
made. This allows one piece of hardware to perform
several different functions. Of course, those functions
cannot be performed at the same time.
Besides that, FPGAs are standard parts, they are not
designed for any particular function but are
programmed by the customer for a particular purpose.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
FPGAs have compensating advantages, largely
due to the fact that they are standard parts. There
is no wait from completing the design to obtaining
a working chip.
The design can be programmed into the FPGA
and tested immediately.
Apart from that, FPGAs are excellent prototyping
vehicles. When the FPGA is used in the final
design, the jump from prototype to product is
much smaller and easier to negotiate.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O Field-programmable gate arrays
(FPGAs) suffer from lower application
design productivity than other devices,
which is largely due to compilation
taking hours or even days.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O In this letter, virtual reconfigurable
architectures are evaluated, referred to
as intermediate fabrics, which enable
near-instant placement and routing of
applications for commercial FPGAs.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O DESPITE performance, power, and energy
advantages compared to other devices, field-
programmable gate array (FPGA) usage has been
limited by poor application design productivity.
WHY??
O Because of placement and routing(PAR), which
commonly requires many hours or even days.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O IFs are virtual reconfigurable architectures
implemented atop commercial-off-the-shelf (COTS)
FPGAs, which designers either select from a library
or custom generate to provide an application-
specialized virtual architecture.
O They contains appropriate resources (e.g., floating-
point cores, FFTs, filters, etc.), as opposed to
mapping the application onto possibly hundreds of
thousands of fine-grained look-up-tables (LUTs) in
the physical device.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O Provides several orders of magnitude faster PAR,
thus enabling fast compilation.
O IFs enable designers to use commercial-off-the-
shelf(COTS) devices, which have significant cost
advantages.
O Other advantages include application portability
across devices and rapid reconfiguration that is
orders of magnitude faster than partial
reconfiguration on FPGAs.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O The main limitation of IFs is area and
performance overhead.
O IFs are not intended for all usage
scenarios due to overhead that may be
too significant for small embedded
systems or large high-performance
systems.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O Flexibility is also a potential limitation.
A custom circuit can use any mixture of
operators and precisions. An IF circuit can also
include any mixture, but it is unlikely that highly
specialized IFs would exist in a predetermined
library.
O Certain application domains will likely not
benefit from IFs. For example, control-intensive
applications clearly cannot be implemented on
the IF.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O Previous work has focused on fast PAR
algorithms, but those studies either
provided insufficient speedup for runtime
compilation required custom devices, or
were intended for specific usage
scenarios (e.g., runtime reconfiguration).
O Compilation time can be reduced by using
pre-compiled circuit blocks (hard macros).
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O In this letter, a more realistic usage scenario of
a single, larger IF specialized for common
image-processing kernels is evaluated.
O Although the fabric could include any resource,
we chose resources common to image-
processing kernels:
• multipliers,
• subtractors with an optional absolute-value
output,
• adders,
• a square root,
• and delay components that act as configurable shift
registers.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O For the total size there are128 inputs, five delays
capable of delaying up to 9000 cycles, 64
multipliers, 64 subtractors, 63 adders, one square
root, and one single output.
O To load a bitfile, the IF uses a configuration shift
register that requires only 9234 bits, compared to
several megabytes for commercial FPGAs.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
Fig. 1. Common circuit structure for pipelined image-
processing kernels.
O This architecture consists of a controller and a
pipelined datapath that accepts a window—a
subset of the image—as input every cycle, and
generates an output every cycle after some initial
latency.
O The IF uses the window generator which buffers
image rows in block RAMs.
O The window generator continually shifts buffered
data into shift registers to create a new window
each cycle
O The architecture uses address generators and
memory controllers to stream data to and from
memories.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
Fig 2: datapath fabric used for the evaluated
image-processing kernels. Each row has 66 columns.
O We implemented the delays using block RAMs and
the square root using an Altera-provided IP core.
O connections boxes are to connect resources to virtual
routing tracks, and planar switch boxes to connect
tracks. The IF uses two 16-bit bidirectional tracks in
each row and column.
O We implement each track as a mux.
O Each mux has a configuration register (not shown)
that controls the mux select, as determined by PAR.
O In addition, each output has a register to pipeline the
interconnect, which was necessary to achieve clock
frequencies comparable to FPGAs.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O To implement the IF on the FPGA, an IF
generation tool is created that reads an
XML(Extensible Markup
Language) description of the fabric and
outputs synthesizable VHDL.
O FPGA used:- Stratix III E260 FPGA
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O The router uses Pathfinder, which does not
support pipelined routing, but as mentioned
earlier, the realignment registers allow the router
to ignore nets that are misaligned by eight or
less cycles.
O Currently, we have no synthesis tools for IFs
and therefore manually created technology-
mapped IF netlists. We could potentially
synthesize from any HDL, for runtime
compilation.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
O In this letter, we evaluated an intermediate fabric
for common image-processing kernels, which
enabled placement and routing 700 x speedups
averaging compared to vendor tools.
O The main limitation is overhead, which was a
modest 7% for performance, and a more significant
34% to 44% for area. However, FPGA vendors
could eliminate such area overhead by directly
mapping virtual routing resources onto physical
FPGA resources.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
[1] P. Athanas, J. Bowen, T. Dunham, C. Patterson, J. Rice, M. Shelburne, J. Suris,
M. Bucciero, and J. Graf, “Wires on demand: Run-time communication synthesis
for reconfigurable computing,” in Proc. Int. Workshop Field-Program. Logic Appl.
(FPL’97), London, U.K., 2007, pp. 513–516.
[2] V. Betz and J. Rose, “VPR: A new packing, placement and routing tool for
FPGA research,” in Proc. Int. Workshop Field-Program. Logic Appl. (FPL’97),
London, U.K., 1997, pp. 213–222.
[3] J. Coole and G. Stitt, “Intermediate fabrics:Virtual architectures for circuit
portability and fast placement and routing,” in Proc. IEEE/ACM/ IFIP Int. Conf.
Hardware/Software Codesign Syst. Synth. (CODES/ ISSS’10), Scottsdale, AZ, 2010,
pp. 13–22.
[4] A. DeHon, “The density advantage of configurable computing,” Computer, vol.
33, no. 4, pp. 41–49, 2000.
[5] Y. Dong, Y. Dou, and J. Zhou, “Optimized generation of memory structure in
compiling window operations onto reconfigurable hardware,” in Proc. Int. Symp.
Appl. Reconfigurable Comput. (ARC’07), Rio De Janeiro, Brazil, 2007, pp. 110–
121.
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3,
SEPTEMBER 2011
Presentation by:-
Deepika Ahlawat
12ECP025
M. Tech VLSI Design

Mais conteúdo relacionado

Mais procurados

Floorplanning in physical design
Floorplanning in physical designFloorplanning in physical design
Floorplanning in physical designMurali Rai
 
CAD: introduction to floorplanning
CAD:  introduction to floorplanningCAD:  introduction to floorplanning
CAD: introduction to floorplanningTeam-VLSI-ITMU
 
Physical Design overview
Physical Design overviewPhysical Design overview
Physical Design overviewMarcus Wei
 
Deep Learning Fast MRI Using Channel Attention in Magnitude Domain
Deep Learning Fast MRI Using Channel Attention in Magnitude DomainDeep Learning Fast MRI Using Channel Attention in Magnitude Domain
Deep Learning Fast MRI Using Channel Attention in Magnitude DomainJoonhyung Lee
 
PLA Minimization -Testing
PLA Minimization -TestingPLA Minimization -Testing
PLA Minimization -TestingDr.YNM
 
Field Programmable Gate Array for Data Processing in Medical Systems
Field Programmable Gate Array for Data Processing in Medical SystemsField Programmable Gate Array for Data Processing in Medical Systems
Field Programmable Gate Array for Data Processing in Medical SystemsIOSR Journals
 
Design and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated MultiplierDesign and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated Multiplierijsrd.com
 
Hardware Architecture for Calculating LBP-Based Image Region Descriptors
Hardware Architecture for Calculating LBP-Based Image Region DescriptorsHardware Architecture for Calculating LBP-Based Image Region Descriptors
Hardware Architecture for Calculating LBP-Based Image Region DescriptorsMarek Kraft
 
Vlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaVlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaeSAT Publishing House
 
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...Serhan
 
An Index-first Addressing Scheme for Multi-level Caches
An Index-first Addressing Scheme for Multi-level CachesAn Index-first Addressing Scheme for Multi-level Caches
An Index-first Addressing Scheme for Multi-level Cachesidescitation
 
Iaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformationIaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformationIaetsd Iaetsd
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...ijdpsjournal
 

Mais procurados (20)

Floorplanning in physical design
Floorplanning in physical designFloorplanning in physical design
Floorplanning in physical design
 
09 placement
09 placement09 placement
09 placement
 
CAD: introduction to floorplanning
CAD:  introduction to floorplanningCAD:  introduction to floorplanning
CAD: introduction to floorplanning
 
Physical Design overview
Physical Design overviewPhysical Design overview
Physical Design overview
 
Deep Learning Fast MRI Using Channel Attention in Magnitude Domain
Deep Learning Fast MRI Using Channel Attention in Magnitude DomainDeep Learning Fast MRI Using Channel Attention in Magnitude Domain
Deep Learning Fast MRI Using Channel Attention in Magnitude Domain
 
PLA Minimization -Testing
PLA Minimization -TestingPLA Minimization -Testing
PLA Minimization -Testing
 
Field Programmable Gate Array for Data Processing in Medical Systems
Field Programmable Gate Array for Data Processing in Medical SystemsField Programmable Gate Array for Data Processing in Medical Systems
Field Programmable Gate Array for Data Processing in Medical Systems
 
Design and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated MultiplierDesign and Implementation of a Programmable Truncated Multiplier
Design and Implementation of a Programmable Truncated Multiplier
 
Hardware Architecture for Calculating LBP-Based Image Region Descriptors
Hardware Architecture for Calculating LBP-Based Image Region DescriptorsHardware Architecture for Calculating LBP-Based Image Region Descriptors
Hardware Architecture for Calculating LBP-Based Image Region Descriptors
 
2021 03-02-spade
2021 03-02-spade2021 03-02-spade
2021 03-02-spade
 
Block diagrams
Block diagramsBlock diagrams
Block diagrams
 
Vlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaVlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpga
 
Scolari's ICCD17 Talk
Scolari's ICCD17 TalkScolari's ICCD17 Talk
Scolari's ICCD17 Talk
 
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...
Implementation and Optimization of FDTD Kernels by Using Cache-Aware Time-Ske...
 
An Index-first Addressing Scheme for Multi-level Caches
An Index-first Addressing Scheme for Multi-level CachesAn Index-first Addressing Scheme for Multi-level Caches
An Index-first Addressing Scheme for Multi-level Caches
 
Iaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformationIaetsd pipelined parallel fft architecture through folding transformation
Iaetsd pipelined parallel fft architecture through folding transformation
 
Kv3419501953
Kv3419501953Kv3419501953
Kv3419501953
 
2021 04-03-sean
2021 04-03-sean2021 04-03-sean
2021 04-03-sean
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...
A PROGRESSIVE MESH METHOD FOR PHYSICAL SIMULATIONS USING LATTICE BOLTZMANN ME...
 

Semelhante a Intermediate Fabrics

System designing and modelling using fpga
System designing and modelling using fpgaSystem designing and modelling using fpga
System designing and modelling using fpgaIAEME Publication
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsiigeeks1234
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsiigeeks1234
 
Me,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsiMe,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsiigeeks1234
 
Vlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaVlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaeSAT Journals
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 
Inter-Process communication using pipe in FPGA based adaptive communication
Inter-Process communication using pipe in FPGA based adaptive communicationInter-Process communication using pipe in FPGA based adaptive communication
Inter-Process communication using pipe in FPGA based adaptive communicationMayur Shah
 
Fpga based motor controller
Fpga based motor controllerFpga based motor controller
Fpga based motor controllerUday Wankar
 
Conference Paper: Universal Node: Towards a high-performance NFV environment
Conference Paper: Universal Node: Towards a high-performance NFV environmentConference Paper: Universal Node: Towards a high-performance NFV environment
Conference Paper: Universal Node: Towards a high-performance NFV environmentEricsson
 
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKS
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKSHIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKS
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKSijngnjournal
 

Semelhante a Intermediate Fabrics (20)

System designing and modelling using fpga
System designing and modelling using fpgaSystem designing and modelling using fpga
System designing and modelling using fpga
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
 
Me,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsiMe,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsi
 
Vlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpgaVlsi design process for low power design methodology using reconfigurable fpga
Vlsi design process for low power design methodology using reconfigurable fpga
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Fpga lecture
Fpga lectureFpga lecture
Fpga lecture
 
Inter-Process communication using pipe in FPGA based adaptive communication
Inter-Process communication using pipe in FPGA based adaptive communicationInter-Process communication using pipe in FPGA based adaptive communication
Inter-Process communication using pipe in FPGA based adaptive communication
 
Convolution
ConvolutionConvolution
Convolution
 
Streams on wires
Streams on wiresStreams on wires
Streams on wires
 
Ia3115171521
Ia3115171521Ia3115171521
Ia3115171521
 
Introduction to FPGAs
Introduction to FPGAsIntroduction to FPGAs
Introduction to FPGAs
 
Hbdfpga fpl07
Hbdfpga fpl07Hbdfpga fpl07
Hbdfpga fpl07
 
Fpg as 11 body
Fpg as 11 bodyFpg as 11 body
Fpg as 11 body
 
Fpga Knowledge
Fpga KnowledgeFpga Knowledge
Fpga Knowledge
 
Fpga based motor controller
Fpga based motor controllerFpga based motor controller
Fpga based motor controller
 
Conference Paper: Universal Node: Towards a high-performance NFV environment
Conference Paper: Universal Node: Towards a high-performance NFV environmentConference Paper: Universal Node: Towards a high-performance NFV environment
Conference Paper: Universal Node: Towards a high-performance NFV environment
 
FPGAs memory synchronization and performance evaluation using the open compu...
FPGAs memory synchronization and performance evaluation  using the open compu...FPGAs memory synchronization and performance evaluation  using the open compu...
FPGAs memory synchronization and performance evaluation using the open compu...
 
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKS
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKSHIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKS
HIGH PERFORMANCE ETHERNET PACKET PROCESSOR CORE FOR NEXT GENERATION NETWORKS
 

Mais de Team-VLSI-ITMU

Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagramTeam-VLSI-ITMU
 
Nmos design using synopsys TCAD tool
Nmos design using synopsys TCAD toolNmos design using synopsys TCAD tool
Nmos design using synopsys TCAD toolTeam-VLSI-ITMU
 
CAD: Layout Extraction
CAD: Layout ExtractionCAD: Layout Extraction
CAD: Layout ExtractionTeam-VLSI-ITMU
 
Computer Aided Design: Layout Compaction
Computer Aided Design: Layout CompactionComputer Aided Design: Layout Compaction
Computer Aided Design: Layout CompactionTeam-VLSI-ITMU
 
Computer Aided Design: Global Routing
Computer Aided Design:  Global RoutingComputer Aided Design:  Global Routing
Computer Aided Design: Global RoutingTeam-VLSI-ITMU
 
Cmos inverter design using tanner 180nm technology
Cmos inverter design using tanner 180nm technologyCmos inverter design using tanner 180nm technology
Cmos inverter design using tanner 180nm technologyTeam-VLSI-ITMU
 
SRAM- Ultra low voltage operation
SRAM- Ultra low voltage operationSRAM- Ultra low voltage operation
SRAM- Ultra low voltage operationTeam-VLSI-ITMU
 
All opam assignment2_main
All opam assignment2_mainAll opam assignment2_main
All opam assignment2_mainTeam-VLSI-ITMU
 
MOSFET Small signal model
MOSFET Small signal modelMOSFET Small signal model
MOSFET Small signal modelTeam-VLSI-ITMU
 
twin well cmos fabrication steps using Synopsys TCAD
twin well cmos fabrication steps using Synopsys TCADtwin well cmos fabrication steps using Synopsys TCAD
twin well cmos fabrication steps using Synopsys TCADTeam-VLSI-ITMU
 

Mais de Team-VLSI-ITMU (15)

Ch 6 randomization
Ch 6 randomizationCh 6 randomization
Ch 6 randomization
 
RTX Kernal
RTX KernalRTX Kernal
RTX Kernal
 
CNTFET
CNTFETCNTFET
CNTFET
 
scripting in Python
scripting in Pythonscripting in Python
scripting in Python
 
Reduced ordered binary decision diagram
Reduced ordered binary decision diagramReduced ordered binary decision diagram
Reduced ordered binary decision diagram
 
Nmos design using synopsys TCAD tool
Nmos design using synopsys TCAD toolNmos design using synopsys TCAD tool
Nmos design using synopsys TCAD tool
 
Linux Basics
Linux BasicsLinux Basics
Linux Basics
 
CAD: Layout Extraction
CAD: Layout ExtractionCAD: Layout Extraction
CAD: Layout Extraction
 
Computer Aided Design: Layout Compaction
Computer Aided Design: Layout CompactionComputer Aided Design: Layout Compaction
Computer Aided Design: Layout Compaction
 
Computer Aided Design: Global Routing
Computer Aided Design:  Global RoutingComputer Aided Design:  Global Routing
Computer Aided Design: Global Routing
 
Cmos inverter design using tanner 180nm technology
Cmos inverter design using tanner 180nm technologyCmos inverter design using tanner 180nm technology
Cmos inverter design using tanner 180nm technology
 
SRAM- Ultra low voltage operation
SRAM- Ultra low voltage operationSRAM- Ultra low voltage operation
SRAM- Ultra low voltage operation
 
All opam assignment2_main
All opam assignment2_mainAll opam assignment2_main
All opam assignment2_main
 
MOSFET Small signal model
MOSFET Small signal modelMOSFET Small signal model
MOSFET Small signal model
 
twin well cmos fabrication steps using Synopsys TCAD
twin well cmos fabrication steps using Synopsys TCADtwin well cmos fabrication steps using Synopsys TCAD
twin well cmos fabrication steps using Synopsys TCAD
 

Último

Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfChristianCDAM
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsResearcher Researcher
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTSneha Padhiar
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSneha Padhiar
 
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfNainaShrivastava14
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.elesangwon
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHSneha Padhiar
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmDeepika Walanjkar
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptxmohitesoham12
 

Último (20)

Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Ch10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdfCh10-Global Supply Chain - Cadena de Suministro.pdf
Ch10-Global Supply Chain - Cadena de Suministro.pdf
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solid
 
Novel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending ActuatorsNovel 3D-Printed Soft Linear and Bending Actuators
Novel 3D-Printed Soft Linear and Bending Actuators
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
 
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of Action
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithmComputer Graphics Introduction, Open GL, Line and Circle drawing algorithm
Computer Graphics Introduction, Open GL, Line and Circle drawing algorithm
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptx
 

Intermediate Fabrics

  • 1. Intermediate Fabrics: Virtual Architectures for Near-Instant FPGA Compilation Greg Stitt and James Coole, Member, IEEE IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 2. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011 Field-Programmable Gate Arrays (FPGAs) are digital integrated circuits (ICs) that contain configurable blocks of logic along with configurable interconnects between these blocks . Specifically, an FPGA contains programmable logic components called logic elements (LEs) and a hierarchy of reconfigurable interconnects that allow the LEs to be physically connected. LEs can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory .
  • 3. FPGAs are generally favoured in prototype building because the device does not need to be thrown away every time a change is made. This allows one piece of hardware to perform several different functions. Of course, those functions cannot be performed at the same time. Besides that, FPGAs are standard parts, they are not designed for any particular function but are programmed by the customer for a particular purpose. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 4. FPGAs have compensating advantages, largely due to the fact that they are standard parts. There is no wait from completing the design to obtaining a working chip. The design can be programmed into the FPGA and tested immediately. Apart from that, FPGAs are excellent prototyping vehicles. When the FPGA is used in the final design, the jump from prototype to product is much smaller and easier to negotiate. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 5. O Field-programmable gate arrays (FPGAs) suffer from lower application design productivity than other devices, which is largely due to compilation taking hours or even days. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 6. O In this letter, virtual reconfigurable architectures are evaluated, referred to as intermediate fabrics, which enable near-instant placement and routing of applications for commercial FPGAs. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 7. O DESPITE performance, power, and energy advantages compared to other devices, field- programmable gate array (FPGA) usage has been limited by poor application design productivity. WHY?? O Because of placement and routing(PAR), which commonly requires many hours or even days. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 8. O IFs are virtual reconfigurable architectures implemented atop commercial-off-the-shelf (COTS) FPGAs, which designers either select from a library or custom generate to provide an application- specialized virtual architecture. O They contains appropriate resources (e.g., floating- point cores, FFTs, filters, etc.), as opposed to mapping the application onto possibly hundreds of thousands of fine-grained look-up-tables (LUTs) in the physical device. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 9. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 10. O Provides several orders of magnitude faster PAR, thus enabling fast compilation. O IFs enable designers to use commercial-off-the- shelf(COTS) devices, which have significant cost advantages. O Other advantages include application portability across devices and rapid reconfiguration that is orders of magnitude faster than partial reconfiguration on FPGAs. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 11. O The main limitation of IFs is area and performance overhead. O IFs are not intended for all usage scenarios due to overhead that may be too significant for small embedded systems or large high-performance systems. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 12. O Flexibility is also a potential limitation. A custom circuit can use any mixture of operators and precisions. An IF circuit can also include any mixture, but it is unlikely that highly specialized IFs would exist in a predetermined library. O Certain application domains will likely not benefit from IFs. For example, control-intensive applications clearly cannot be implemented on the IF. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 13. O Previous work has focused on fast PAR algorithms, but those studies either provided insufficient speedup for runtime compilation required custom devices, or were intended for specific usage scenarios (e.g., runtime reconfiguration). O Compilation time can be reduced by using pre-compiled circuit blocks (hard macros). IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 14. O In this letter, a more realistic usage scenario of a single, larger IF specialized for common image-processing kernels is evaluated. O Although the fabric could include any resource, we chose resources common to image- processing kernels: • multipliers, • subtractors with an optional absolute-value output, • adders, • a square root, • and delay components that act as configurable shift registers. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 15. O For the total size there are128 inputs, five delays capable of delaying up to 9000 cycles, 64 multipliers, 64 subtractors, 63 adders, one square root, and one single output. O To load a bitfile, the IF uses a configuration shift register that requires only 9234 bits, compared to several megabytes for commercial FPGAs. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 16. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011 Fig. 1. Common circuit structure for pipelined image- processing kernels.
  • 17. O This architecture consists of a controller and a pipelined datapath that accepts a window—a subset of the image—as input every cycle, and generates an output every cycle after some initial latency. O The IF uses the window generator which buffers image rows in block RAMs. O The window generator continually shifts buffered data into shift registers to create a new window each cycle O The architecture uses address generators and memory controllers to stream data to and from memories. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 18. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011 Fig 2: datapath fabric used for the evaluated image-processing kernels. Each row has 66 columns.
  • 19. O We implemented the delays using block RAMs and the square root using an Altera-provided IP core. O connections boxes are to connect resources to virtual routing tracks, and planar switch boxes to connect tracks. The IF uses two 16-bit bidirectional tracks in each row and column. O We implement each track as a mux. O Each mux has a configuration register (not shown) that controls the mux select, as determined by PAR. O In addition, each output has a register to pipeline the interconnect, which was necessary to achieve clock frequencies comparable to FPGAs. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 20. O To implement the IF on the FPGA, an IF generation tool is created that reads an XML(Extensible Markup Language) description of the fabric and outputs synthesizable VHDL. O FPGA used:- Stratix III E260 FPGA IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 21. O The router uses Pathfinder, which does not support pipelined routing, but as mentioned earlier, the realignment registers allow the router to ignore nets that are misaligned by eight or less cycles. O Currently, we have no synthesis tools for IFs and therefore manually created technology- mapped IF netlists. We could potentially synthesize from any HDL, for runtime compilation. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 22. O In this letter, we evaluated an intermediate fabric for common image-processing kernels, which enabled placement and routing 700 x speedups averaging compared to vendor tools. O The main limitation is overhead, which was a modest 7% for performance, and a more significant 34% to 44% for area. However, FPGA vendors could eliminate such area overhead by directly mapping virtual routing resources onto physical FPGA resources. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 23. [1] P. Athanas, J. Bowen, T. Dunham, C. Patterson, J. Rice, M. Shelburne, J. Suris, M. Bucciero, and J. Graf, “Wires on demand: Run-time communication synthesis for reconfigurable computing,” in Proc. Int. Workshop Field-Program. Logic Appl. (FPL’97), London, U.K., 2007, pp. 513–516. [2] V. Betz and J. Rose, “VPR: A new packing, placement and routing tool for FPGA research,” in Proc. Int. Workshop Field-Program. Logic Appl. (FPL’97), London, U.K., 1997, pp. 213–222. [3] J. Coole and G. Stitt, “Intermediate fabrics:Virtual architectures for circuit portability and fast placement and routing,” in Proc. IEEE/ACM/ IFIP Int. Conf. Hardware/Software Codesign Syst. Synth. (CODES/ ISSS’10), Scottsdale, AZ, 2010, pp. 13–22. [4] A. DeHon, “The density advantage of configurable computing,” Computer, vol. 33, no. 4, pp. 41–49, 2000. [5] Y. Dong, Y. Dou, and J. Zhou, “Optimized generation of memory structure in compiling window operations onto reconfigurable hardware,” in Proc. Int. Symp. Appl. Reconfigurable Comput. (ARC’07), Rio De Janeiro, Brazil, 2007, pp. 110– 121. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011
  • 24. IEEE EMBEDDED SYSTEMS LETTERS, VOL. 3, NO. 3, SEPTEMBER 2011 Presentation by:- Deepika Ahlawat 12ECP025 M. Tech VLSI Design