3D Microprocessor Design
                                     Stacking at different granularities


                                              Alberto Villegas Erce

                                            Seminar on Computer Systems
                                                  Turku University


                                                       April 2010




Alberto Villegas Erce (Seminar on Computer Systems, Turku University)   3D Microprocessor Design   April 2010   1 / 29
Introduction


  Concepts review
  Previously on 3D world...


   Industry trends
   Make it faster, smaller and cuter, but do not forget the price

   3D Design
   Benefits: shorter wire length, speed increase, lower power consumption.
   Challenges: risk of defects, heat problems, design complexity.

   Through Silicon Vias (TSVs)
   Vertical electrical connection passing completely through a silicon die.
           Low power consumption
           Low latency
           Increasing integration level (10k-100k per cm²)
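
   To put the quoted TSV density in perspective, the following minimal sketch
   converts TSVs per cm² into an average spacing. It assumes, purely for
   illustration, that the TSVs sit on a uniform square grid; the function name
   is hypothetical and real TSV placement is rarely uniform.

```python
import math

def average_tsv_pitch_um(tsvs_per_cm2: float) -> float:
    """Average centre-to-centre spacing in micrometres on a uniform square grid (assumed layout)."""
    tsvs_per_cm = math.sqrt(tsvs_per_cm2)   # TSVs along one centimetre of the grid
    return 10_000.0 / tsvs_per_cm           # 1 cm = 10,000 um

for density in (10_000, 100_000):
    print(f"{density:>7} TSVs/cm^2 -> ~{average_tsv_pitch_um(density):.1f} um pitch")
```

   Under that assumption, 10k per cm² corresponds to roughly 100 µm between
   neighbouring TSVs and 100k per cm² to roughly 32 µm.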

Introduction


  Today
  Three dimensional Puzzle


                                             How to face 3D design?
                                             2D design decomposition at different
                                             granularities.
                                                1     Entire cores, cache: add functionality
                                                      with high 2D reuse.
                                                2     Functional unit blocks: performance
                                                      improvement and power reduction.
                                                      Must re-floorplan and retime paths.
                                                3     Logic gates (block splitting): reduce
                                                      latency and power on routes at every level.
                                                      Need new 3D circuit design,
                                                      methodologies and layout tools.


Introduction


  Index




                                                     1   Stacking Complete Modules
                                                     2   Stacking Functional Unit Blocks
                                                     3   Splitting Functional Unit Blocks
                                                     4   Conclusions




Stacking Complete Modules     Idea


  Three-Dimensional Stacked Caches

   Idea
   Break & stack existing modules.




                                                                 Conventional dual-core processor
                                                                 featuring a 4MB L2 cache.
                                                                 Design options for 3D stacking
                                                                        Reduce space.
                                                                        Increase storage.




Stacking Complete Modules     Increasing storage


  L2 cache controller in 3D


                                                                     Objective
                                                                     Add more storage to the L2
                                                                     cache.
                                                                     Stacking a second silicon
                                                                     layer
                                                                               Additional 8MB of cache
                                                                               Nearly no impact on L2
                                                                               access latency
                                                                     Traditional 2D solution
                                                                               Double silicon area.
                                                                               Latency increased.


Stacking Complete Modules     Increasing storage


  L2 cache controller in 3D (cont.)

                                                                                 DRAM Solution
                                                                                      Much greater
                                                                                      storage density.
                                                                                      Greater latency
                                                                                      (50-150 cycles).
                                                                                      Reduces silicon
                                                                                      area by half.
                                                                                 Hybrid solution
                                                                                      SRAM to store
                                                                                      only the tags.
                                                                                      DRAM to store
                                                                                      the actual data.
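
   The hybrid split works because the tags are far smaller than the data. As a
   rough sketch, the tag store for a DRAM data array can be sized as below. The
   slides do not give these parameters: the 32MB capacity is borrowed from the
   DRAM cache on the testing slide, and the 64-byte lines, 16 ways, 48-bit
   addresses and 2 state bits per line are illustrative assumptions.

```python
def tag_store_bytes(cache_bytes, line_bytes=64, ways=16, addr_bits=48, state_bits=2):
    """Size of the tag array of a set-associative cache, in bytes (all defaults are assumptions)."""
    lines = cache_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1   # log2 of a power-of-two line size
    index_bits = sets.bit_length() - 1          # log2 of a power-of-two set count
    tag_bits = addr_bits - index_bits - offset_bits
    total_bits = lines * (tag_bits + state_bits)
    return total_bits // 8

MiB = 1024 * 1024
size = tag_store_bytes(32 * MiB)
print(f"tag store for a 32MB data array: {size / MiB:.2f} MiB")
```

   With these assumed parameters the tags need under 2 MiB, which is small
   enough to keep in fast SRAM while the bulk data lives in the denser DRAM.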


Stacking Complete Modules     Increasing storage


  L2 cache controller in 3D (testing)
   Three test programs:
    Program A : small working set that fits in the 4MB SRAM cache.
    Program B : larger working set that does not fit in the 4MB SRAM cache but
               does fit within the 32MB DRAM cache.
    Program C : streaming memory access patterns. Poor cache hit rate for
               both configurations.
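
   The expected behaviour of the three programs can be sketched with a crude
   average-access-time model. Everything numeric here is an assumption for
   illustration, not a measurement from the slides: the 16-cycle SRAM hit time,
   the 300-cycle memory latency, the working-set sizes, and the simplistic
   hit-rate rule. The 50-cycle DRAM hit time is the lower end of the 50-150
   cycle range quoted earlier.

```python
from dataclasses import dataclass

@dataclass
class CacheConfig:
    name: str
    size_mb: float
    hit_cycles: int

MEMORY_CYCLES = 300  # assumed off-chip memory latency

def hit_rate(working_set_mb, cache_mb, streaming):
    """Crude model: streaming never hits; otherwise the resident fraction hits."""
    if streaming:
        return 0.0
    return min(1.0, cache_mb / working_set_mb)

def avg_access_cycles(cfg, working_set_mb, streaming=False):
    """Hit time plus the miss fraction times the memory latency."""
    h = hit_rate(working_set_mb, cfg.size_mb, streaming)
    return cfg.hit_cycles + (1.0 - h) * MEMORY_CYCLES

sram = CacheConfig("4MB SRAM", 4, 16)
dram = CacheConfig("32MB stacked DRAM", 32, 50)

# (working set in MB, streaming?) for hypothetical programs A, B and C
programs = {"A": (2, False), "B": (16, False), "C": (64, True)}
for name, (ws, streaming) in programs.items():
    for cfg in (sram, dram):
        print(name, cfg.name, avg_access_cycles(cfg, ws, streaming))
```

   In this model program A favours the faster SRAM, program B favours the
   larger DRAM cache, and program C gains from neither.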




Stacking Complete Modules     3D optionality


  3D Integration
  ... for everyone?



                                     3D Integration:
                                            Increase silicon required for the chip (layers)
                                            =⇒ Increase manufacturing cost
                                            Extra manufacturing steps for bonding.
                                            Impact on yield rates.
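
   One way to see the yield impact is a back-of-the-envelope model. The sketch
   below uses the Poisson die-yield formula and assumes dies are bonded without
   prior testing, so every layer and every bonding step must succeed. The
   model, the defect density and the bonding yield are illustrative
   assumptions, not figures from the slides.

```python
import math

def die_yield(area_cm2, defects_per_cm2):
    """Poisson yield model: probability that a die of the given area has no defect."""
    return math.exp(-area_cm2 * defects_per_cm2)

def stack_yield(die_areas_cm2, defects_per_cm2, bond_yield):
    """Yield of a stack of untested dies: all dies good and all bonding steps good."""
    y = 1.0
    for area in die_areas_cm2:
        y *= die_yield(area, defects_per_cm2)
    bonding_steps = len(die_areas_cm2) - 1
    return y * bond_yield ** bonding_steps

single = die_yield(1.0, 0.5)                  # one 1 cm^2 die, assumed 0.5 defects/cm^2
stacked = stack_yield([1.0, 1.0], 0.5, 0.95)  # two such dies, assumed 95% bonding yield
print(f"single die: {single:.2f}, two-layer stack: {stacked:.2f}")
```

   Testing dies before bonding would change the picture, which is one reason
   the model should be read as a sketch only.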

                                                       3D is not the general answer!

      One option is to use 3D stacking as a means to optionally augment the
                 processor with some additional functionality



Stacking Complete Modules     3D optionality


  Introspective 3D Processors
   Objective
   Access to more dynamic information about the internal state of a
   microprocessor.




Stacking Complete Modules     3D optionality


  Reliable 3D Processors

   Problem
   The small device sizes in modern processors make them vulnerable to data corruption

      Solutions
              Redundancy: two or three copies of the processor operating in
              lock-step =⇒ multiple pipelines increase cost.
              Leading execution/trailing checking cores: the trailing core
              re-executes instructions (not in lock-step) =⇒ the additional
              pipeline still increases area.

      Stack it!
              Extra wires eliminated.
              Optional checker core.
              Unutilized silicon area.
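
   The leading/trailing idea can be illustrated with a toy model: a leading
   core executes instructions and queues its results, and a trailing checker
   re-executes them later and compares. The instruction format, the function
   names and the fault-injection hook are hypothetical; this is a sketch of the
   concept, not of any real checker design.

```python
import operator
from collections import deque

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def execute(instr):
    """Execute one (op, a, b) instruction and return its result."""
    op, a, b = instr
    return OPS[op](a, b)

def run_with_checker(program, faulty_index=None):
    """Return the indices of instructions whose leading result the checker rejects."""
    queue = deque()
    for i, instr in enumerate(program):       # leading core runs ahead
        result = execute(instr)
        if i == faulty_index:
            result ^= 1                       # inject a single-bit flip (illustration only)
        queue.append((i, instr, result))

    mismatches = []
    while queue:                              # trailing core drains the queue, not in lock-step
        i, instr, leading_result = queue.popleft()
        if execute(instr) != leading_result:
            mismatches.append(i)
    return mismatches

program = [("add", 2, 3), ("mul", 4, 5), ("sub", 9, 1)]
print(run_with_checker(program))                  # no fault injected
print(run_with_checker(program, faulty_index=1))  # fault in the second instruction
```

   For simplicity the whole program is queued before checking starts; in
   hardware the queue would be bounded and both cores would run concurrently.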


Stacking Functional Unit Blocks   Introduction


  Stacking Functional Unit Blocks

                                                 Nowadays
                                                 These technologies are at an early stage
                                                 of development.
                                                 3D integration will require
                                                        Design automation tools.
                                                        Layout support.
                                                        Verification and validation
                                                        methodologies.

                                                 Future
                                                 Reorganize the processor pipeline in new
                                                 ways.

Stacking Functional Unit Blocks    Removing wires


  Removing Wires
  Pentium III & IV branch misprediction

   Problem
   Wire delays have not improved as fast as transistor speeds.



                                                    [Figure: PIII branch misprediction]

                                                    [Figure: PIV branch misprediction]



   Solution
   Use a 3D implementation so that distant blocks are stacked vertically on
   top of each other.
Stacking Functional Unit Blocks   Removing wires


  Removing Wires
  Alpha 21264


   Problem
   A superscalar processor with multiple execution units (EUs) requires a bypass
   network to forward results between all of the EUs =⇒ wiring.

      2D Solution
      Divide the EUs into two groups or
      clusters, each with its own bypass
      network, with links between the clusters.



      3D Solution
               Stack the clusters.
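
   A simple count shows why clustering reduces bypass wiring. The sketch below
   assumes one forwarding path from every unit's output to every unit's input
   and ignores the inter-cluster links; the unit count of 4 is an illustrative
   number and the function names are hypothetical.

```python
def bypass_paths(units: int) -> int:
    """Forwarding paths in a full bypass network: every result can reach every unit's input."""
    return units * units

def clustered_paths(units: int, clusters: int) -> int:
    """Paths when units are split evenly into clusters, each with its own full bypass network."""
    per_cluster = units // clusters
    return clusters * bypass_paths(per_cluster)

print(bypass_paths(4))        # one network spanning 4 units
print(clustered_paths(4, 2))  # two clusters of 2 units each
```

   Clustering halves the intra-cluster path count in this example. Results that
   cross clusters still need a link, and stacking the clusters is what keeps
   that link short.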


Stacking Functional Unit Blocks   Trade-offs


  Removing Wires
  Trade-offs




          Pros
                  Opportunities to optimize the processor pipeline.
                  Physical reduction in the amount of wiring.

          Cons
                  Non-trivial engineering effort.
                          Modify the pipeline.
                          Verify and validate the new design.
                  Additional costs.




Stacking Functional Unit Blocks   TSV Reality


  Removing Wires
  TSV Reality

   Problem
   After stacking two blocks there may not be enough room for placing TSVs.




   Solution
   Different layouts of the TSVs.
   Some wire overhead is reintroduced
           Reintroduced wires do not completely cancel the 3D wire reduction benefits.
Splitting Functional Unit Blocks   Introduction


  Splitting Functional Unit Blocks

                                                                  Last level
                                                                  Logic gates
                                                                         Split individual functional units
                                                                         across multiple layers.
                                                                         Reorganize the functional unit
                                                                         block =⇒ more compact 3D
                                                                         arrangement.

                                                                  Benefits
                                                                      Reduce length of intra-block
                                                                      wiring.
                                                                         Improve operating frequencies.

                          We introduce a starting point for thinking about this level.
Splitting Functional Unit Blocks   3D Cache Organizations


  3D Cache Organizations
  First view



                                                 Problem
                                                 L2 cache consumes about half of the overall
                                                 die area.

                                                         Worst case routing distance: 2x+4y

                                             Two stack possibilities.

    Banks on cores
            Half the footprint.
            Access distance unchanged.

    Banks on banks
            Half the footprint.
            Access distance reduced.


Splitting Functional Unit Blocks   Splitting the cache


  3D Splitting the cache


      Problem
      Wires within each bank also impact overall
      latency.


                       Split individual cache banks across multiple layers.


       Columns on columns
               Best latency.

       Rows on rows
               Energy reduction.
Splitting Functional Unit Blocks   Splitting the cache


  3D Splitting the cache
  Testing


   Experimental results
           SPICE simulation.
           Column on column organization.
           SRAM implementations in 65-nm process.




Splitting Functional Unit Blocks   3D Adders


  3D Adders
  Classic Carry-Lookahead Adder


      Carry-lookahead adder
              n = 16 bits
              Critical path runs from bit[0] to bit[n-1]


   Several ways to split the adder

 Based on inputs
         x on the bottom layer; y on the top layer.
         First propagate level is split across the layers.
         Half the wire length.

 By significance
         Least significant bits on the bottom layer; most significant bits on
         the top layer.
         TSV between root nodes.
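
 As a functional reference for the adder being split, here is a minimal
 sketch of a 16-bit carry-lookahead adder built from generate/propagate
 signals and a parallel-prefix carry tree. The Kogge-Stone style tree is a
 swapped-in choice for illustration and is not necessarily the structure on
 the slide. split_by_significance() is a loose analogy for the significance
 split: the single carry passed from the low half to the high half stands in
 for the inter-layer connection. All names are hypothetical.

```python
def cla_add(x: int, y: int, n: int = 16, cin: int = 0):
    """Add two n-bit numbers; returns (n-bit sum, carry out)."""
    g = [((x >> i) & (y >> i)) & 1 for i in range(n)]   # bit i generates a carry
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]   # bit i propagates a carry
    G, P = g[:], p[:]
    G[0] = g[0] | (p[0] & cin)       # fold the carry-in into bit 0
    dist = 1
    while dist < n:                  # log2(n) levels of (G, P) combining
        new_G, new_P = G[:], P[:]
        for i in range(dist, n):
            new_G[i] = G[i] | (P[i] & G[i - dist])
            new_P[i] = P[i] & P[i - dist]
        G, P = new_G, new_P
        dist *= 2
    carries_in = [cin] + G[:-1]      # G[i] is now the carry out of bit i
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries_in[i]) << i
    return total, G[n - 1]

def split_by_significance(x: int, y: int, n: int = 16):
    """Low half on one (assumed bottom) layer, high half on the other; one carry crosses."""
    half = n // 2
    mask = (1 << half) - 1
    low, carry = cla_add(x & mask, y & mask, half)
    high, cout = cla_add(x >> half, y >> half, half, cin=carry)
    return (high << half) | low, cout

print(cla_add(0x1234, 0x0FF0))
print(split_by_significance(0xFFFF, 0x0001))
```

 Both functions agree with ordinary integer addition on 16-bit operands, so
 they can serve as a reference when comparing alternative partitionings.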




Conclusions


  Conclusions

                                                                         Benefits of organizing
                                                                         components in 3D
                                                                             Can significantly reduce
                                                                             wire lengths.
                                                                               Devices from different
                                                                               technologies can be
                                                                               tightly integrated and
                                                                               combined.

                                                                         3D organizations may be
                                                                         required depending on the
                                                                         exact design constraints and
                                                                         objectives.

Conclusions


  Conclusions


                                                                         Cons
                                                                             Finer granularity ⇒
                                                                             more redesign.
                                                                               Stacking can increase
                                                                               heat.
                                                                               Long road of
                                                                               technological
                                                                               development ahead.

                                                                         Every redesign process leads
                                                                         to a cost increase.


The end     Questions




                    Thank you.
                     Questions?
                                                     Please be nice




Alberto Villegas Erce (Seminar on Computer Systems Turku University ) Design
                                                   3D Microprocessor           April 2010   29 / 29

Mais conteúdo relacionado

Mais procurados

Distributed systems
Distributed systemsDistributed systems
Distributed systemsSave Manos
 
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency Band
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency BandDWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency Band
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency BandIOSR Journals
 
Discrete cosine transform
Discrete cosine transformDiscrete cosine transform
Discrete cosine transformaniruddh Tyagi
 
Digital Image Watermarking Basics
Digital Image Watermarking BasicsDigital Image Watermarking Basics
Digital Image Watermarking BasicsIOSR Journals
 
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...IDES Editor
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
Image Authentication Using Digital Watermarking
Image Authentication Using Digital WatermarkingImage Authentication Using Digital Watermarking
Image Authentication Using Digital Watermarkingijceronline
 
Hybrid Approach for Robust Digital Video Watermarking
Hybrid Approach for Robust Digital Video WatermarkingHybrid Approach for Robust Digital Video Watermarking
Hybrid Approach for Robust Digital Video WatermarkingIJSRD
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiGaurav Raina
 
Defeating Windows memory forensics
Defeating Windows memory forensicsDefeating Windows memory forensics
Defeating Windows memory forensicslmilkovic
 
A Novel Digital Watermarking Technique for Video Copyright Protection
A Novel Digital Watermarking Technique for Video Copyright Protection A Novel Digital Watermarking Technique for Video Copyright Protection
A Novel Digital Watermarking Technique for Video Copyright Protection cscpconf
 

Mais procurados (17)

Distributed systems
Distributed systemsDistributed systems
Distributed systems
 
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency Band
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency BandDWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency Band
DWT-DCT-SVD Based Semi Blind Image Watermarking Using Middle Frequency Band
 
Discrete cosine transform
Discrete cosine transformDiscrete cosine transform
Discrete cosine transform
 
Distributed systems
Distributed systemsDistributed systems
Distributed systems
 
Digital Image Watermarking Basics
Digital Image Watermarking BasicsDigital Image Watermarking Basics
Digital Image Watermarking Basics
 
LSB & DWT BASED DIGITAL WATERMARKING SYSTEM FOR VIDEO AUTHENTICATION.
LSB & DWT BASED DIGITAL WATERMARKING SYSTEM FOR VIDEO AUTHENTICATION.LSB & DWT BASED DIGITAL WATERMARKING SYSTEM FOR VIDEO AUTHENTICATION.
LSB & DWT BASED DIGITAL WATERMARKING SYSTEM FOR VIDEO AUTHENTICATION.
 
145 153
145 153145 153
145 153
 
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...
Hardware Implementation of Genetic Algorithm Based Digital Colour Image Water...
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
[IJET-V1I6P5] Authors: Tawde Priyanka, Londhe Archana, Nazirkar Sandhya, Khat...
[IJET-V1I6P5] Authors: Tawde Priyanka, Londhe Archana, Nazirkar Sandhya, Khat...[IJET-V1I6P5] Authors: Tawde Priyanka, Londhe Archana, Nazirkar Sandhya, Khat...
[IJET-V1I6P5] Authors: Tawde Priyanka, Londhe Archana, Nazirkar Sandhya, Khat...
 
Image Authentication Using Digital Watermarking
Image Authentication Using Digital WatermarkingImage Authentication Using Digital Watermarking
Image Authentication Using Digital Watermarking
 
Hybrid Approach for Robust Digital Video Watermarking
Hybrid Approach for Robust Digital Video WatermarkingHybrid Approach for Robust Digital Video Watermarking
Hybrid Approach for Robust Digital Video Watermarking
 
Deep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon PhiDeep Convolutional Network evaluation on the Intel Xeon Phi
Deep Convolutional Network evaluation on the Intel Xeon Phi
 
Defeating Windows memory forensics
Defeating Windows memory forensicsDefeating Windows memory forensics
Defeating Windows memory forensics
 
45 135-1-pb
45 135-1-pb45 135-1-pb
45 135-1-pb
 
Watermarking
WatermarkingWatermarking
Watermarking
 
A Novel Digital Watermarking Technique for Video Copyright Protection
A Novel Digital Watermarking Technique for Video Copyright Protection A Novel Digital Watermarking Technique for Video Copyright Protection
A Novel Digital Watermarking Technique for Video Copyright Protection
 

Destaque

8086 addressing modes
8086 addressing modes8086 addressing modes
8086 addressing modesj4jiet
 
Superscalar Architecture_AIUB
Superscalar Architecture_AIUBSuperscalar Architecture_AIUB
Superscalar Architecture_AIUBNusrat Mary
 
Multicore Processsors
Multicore ProcesssorsMulticore Processsors
Multicore ProcesssorsAveen Meena
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processorMuhammad Ishaq
 
Introduction to 8086 microprocessor
Introduction to 8086 microprocessorIntroduction to 8086 microprocessor
Introduction to 8086 microprocessorShreyans Pathak
 
Addressing Modes Of 8086
Addressing Modes Of 8086Addressing Modes Of 8086
Addressing Modes Of 8086Ikhlas Rahman
 
Multi core processors
Multi core processorsMulti core processors
Multi core processorsAdithya Bhat
 
8086 microprocessor-architecture
8086 microprocessor-architecture8086 microprocessor-architecture
8086 microprocessor-architectureprasadpawaskar
 

Destaque (11)

Superscalar processors
Superscalar processorsSuperscalar processors
Superscalar processors
 
8086 addressing modes
8086 addressing modes8086 addressing modes
8086 addressing modes
 
Basic of ARM Processor
Basic of ARM Processor Basic of ARM Processor
Basic of ARM Processor
 
Chapter 2: Microprocessors
Chapter 2: MicroprocessorsChapter 2: Microprocessors
Chapter 2: Microprocessors
 
Superscalar Architecture_AIUB
Superscalar Architecture_AIUBSuperscalar Architecture_AIUB
Superscalar Architecture_AIUB
 
Multicore Processsors
Multicore ProcesssorsMulticore Processsors
Multicore Processsors
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processor
 
Introduction to 8086 microprocessor
Introduction to 8086 microprocessorIntroduction to 8086 microprocessor
Introduction to 8086 microprocessor
 
Addressing Modes Of 8086
Addressing Modes Of 8086Addressing Modes Of 8086
Addressing Modes Of 8086
 
Multi core processors
Multi core processorsMulti core processors
Multi core processors
 
8086 microprocessor-architecture
8086 microprocessor-architecture8086 microprocessor-architecture
8086 microprocessor-architecture
 

3D Microprocessor Design: Stacking at different granularities

  • 1. 3D Microprocessor Design: Stacking at different granularities. Alberto Villegas Erce, Seminar on Computer Systems, Turku University, April 2010.
  • 2. Introduction: Concepts review (previously on 3D world...). Industry trends: make it faster, smaller and cuter, but do not forget the price. 3D design benefits: shorter wire length, speed increase, lower power consumption. Challenges: risk of defects, heat problems, design complexity. Through-Silicon Vias (TSVs): vertical electrical connections passing completely through a silicon die, with low power consumption, low latency and an increasing integration level (10k-100k per cm²).
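To give a feel for the TSV densities quoted on this slide, the sketch below converts a density per cm² into the average pitch of a uniform square grid. The square-grid layout and the function name `tsv_pitch_um` are assumptions made for illustration only; the slide gives nothing beyond the density range.

```python
import math

UM2_PER_CM2 = 1e8  # 1 cm = 10^4 um, so 1 cm^2 = 10^8 um^2


def tsv_pitch_um(density_per_cm2: float) -> float:
    """Average centre-to-centre pitch in um, assuming a uniform square grid of TSVs."""
    area_per_tsv_um2 = UM2_PER_CM2 / density_per_cm2
    return math.sqrt(area_per_tsv_um2)


# The two ends of the density range quoted on the slide.
for density in (10_000, 100_000):
    print(f"{density} TSVs/cm^2 -> about {tsv_pitch_um(density):.1f} um pitch")
```

Under this assumption the quoted range corresponds to pitches of roughly 100 µm down to about 32 µm.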
  • 3. Introduction: Today (three-dimensional puzzle). How to face 3D design? Decompose the 2D design at different granularities. (1) Entire cores, cache: add functionality with high 2D reuse. (2) Functional unit blocks: performance improvement and power reduction; must re-floorplan and retime paths. (3) Logic gates (block splitting): reduce latency and power on routes at every level; needs new 3D circuit designs, methodologies and layout tools.
  • 4. Introduction: Index. 1 Stacking Complete Modules. 2 Stacking Functional Unit Blocks. 3 Splitting Functional Unit Blocks. 4 Conclusions.
  • 5. Section 1: Stacking Complete Modules.
  • 6. Stacking Complete Modules: Three-Dimensional Stacked Caches. Idea: break and stack existing modules. Starting point: a conventional dual-core processor featuring a 4MB L2 cache. Design options for 3D stacking: reduce space or increase storage.
  • 7. Stacking Complete Modules: Increasing storage (L2 cache controller in 3D). Objective: add more storage to the L2 cache. Stacking a second silicon layer: an additional 8MB of cache with nearly no impact on L2 access latency. Traditional 2D solution: double the silicon area, with increased latency.
  • 8. Stacking Complete Modules: Increasing storage (L2 cache controller in 3D, cont.). DRAM solution: much greater storage density but greater latency (50-150 cycles); reduces silicon area by half. Hybrid solution: SRAM stores only the tags, DRAM stores the actual data.
  • 9. Stacking Complete Modules: Increasing storage (L2 cache controller in 3D, testing). Three test programs. Program A: small working set that fits in the 4MB SRAM cache. Program B: larger working set that does not fit in the 4MB SRAM cache but does fit within the 32MB DRAM cache. Program C: streaming memory access pattern, with a poor cache hit rate in both configurations.
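The trade-off behind the three test programs can be illustrated with a small average-memory-access-time (AMAT) model. This is only a sketch under stated assumptions: the SRAM hit latency, the main-memory penalty and every hit rate below are hypothetical values chosen for illustration, not results from the talk. Only the 50-150 cycle DRAM latency range comes from the slides; its midpoint is used here.

```python
from dataclasses import dataclass


@dataclass
class CacheConfig:
    name: str
    hit_cycles: float  # latency of an access that hits in this L2


MEMORY_CYCLES = 400.0  # assumed off-chip miss penalty (hypothetical)

SRAM_4MB = CacheConfig("4MB SRAM", hit_cycles=15.0)     # assumed latency
DRAM_32MB = CacheConfig("32MB DRAM", hit_cycles=100.0)  # midpoint of 50-150 cycles

# Hypothetical hit rates per program for (SRAM, DRAM).
HIT_RATES = {
    "A (fits in 4MB)": (0.95, 0.95),
    "B (fits in 32MB only)": (0.30, 0.95),
    "C (streaming)": (0.05, 0.05),
}


def amat(cache: CacheConfig, hit_rate: float) -> float:
    """Average access time in cycles: hit latency plus the expected miss penalty."""
    return cache.hit_cycles + (1.0 - hit_rate) * MEMORY_CYCLES


for program, (sram_rate, dram_rate) in HIT_RATES.items():
    sram = amat(SRAM_4MB, sram_rate)
    dram = amat(DRAM_32MB, dram_rate)
    print(f"Program {program}: SRAM {sram:.0f} cycles, DRAM {dram:.0f} cycles")
```

With these assumed numbers the small SRAM cache wins when the working set fits in it, the large DRAM cache wins for the bigger working set, and neither helps the streaming program.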
  • 10. Stacking Complete Modules: 3D optionality (3D integration ... for everyone?). 3D integration increases the silicon required for the chip (layers) ⇒ increased manufacturing cost. It also adds extra manufacturing steps for bonding and has an impact on yield rates. 3D is not the general answer! One way to use 3D stacking is as a means to optionally augment the processor with some additional functionality.
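The cost argument on this slide can be made concrete with a deliberately simple yield model. It assumes that die defects and bonding failures are independent and that dies are not tested before bonding; the function names and all numbers are hypothetical.

```python
def stack_yield(die_yields: list[float], bond_yield: float) -> float:
    """Yield of a stack: every die must work and every bonding step must succeed."""
    result = 1.0
    for die_yield in die_yields:
        result *= die_yield
    bonding_steps = len(die_yields) - 1
    return result * bond_yield ** bonding_steps


def cost_per_good_unit(die_costs: list[float], bond_cost: float,
                       die_yields: list[float], bond_yield: float) -> float:
    """Manufacturing cost spread over the units that actually work."""
    total_cost = sum(die_costs) + bond_cost * (len(die_costs) - 1)
    return total_cost / stack_yield(die_yields, bond_yield)


# Hypothetical example: one 2D die versus two stacked dies of the same kind.
cost_2d = cost_per_good_unit([10.0], 1.0, [0.9], 0.95)
cost_3d = cost_per_good_unit([10.0, 10.0], 1.0, [0.9, 0.9], 0.95)
print(f"2D: {cost_2d:.2f} per good unit, 3D: {cost_3d:.2f} per good unit")
```

In this toy model the stacked part costs more than twice the single die per good unit, which is one way to read the slide's point that the extra layer is better treated as an optional add-on than as a default.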
  • 11. Stacking Complete Modules: 3D optionality (Introspective 3D Processors). Objective: access to more dynamic information about the internal state of a microprocessor.
  • 12. Stacking Complete Modules: 3D optionality (Reliable 3D Processors). Problem: small device sizes in modern processors make them vulnerable to data corruption. Solutions: (a) Redundancy: two or three copies of the processor operating in lock-step ⇒ multiple pipelines increase cost. (b) Leading execution / trailing checking cores: the trailing core re-executes instructions (not in lock-step) ⇒ the additional pipeline still increases area. Extra wires eliminated. Stack it! Optional checker core. Unutilized silicon area.
  • 13. Section 2: Stacking Functional Unit Blocks.
  • 14. Stacking Functional Unit Blocks: Introduction. Nowadays: these technologies are at an early stage of development. 3D integration will require design automation tools, layout support, and verification and validation methodologies. Future: reorganize the processor pipeline in new ways.
  • 15. Stacking Functional Unit Blocks: Removing wires (Pentium III & IV branch misprediction). Problem: wire delays have not improved as fast as transistor speeds. (Figures: PIII branch misprediction; PIV branch misprediction.) Solution: a 3D implementation in which distant blocks are vertically stacked on top of each other.
  • 16. Stacking Functional Unit Blocks: Removing wires (Alpha 21264). Problem: a superscalar processor with multiple execution units (EUs) requires a bypass network to forward results between all of the EUs ⇒ wiring. 2D solution: divide the EUs into two groups or clusters, each with its own bypass network, with communication between the clusters. 3D solution: stack the clusters.
  • 17. Stacking Functional Unit Blocks: Removing wires (trade-offs). Cons: non-trivial engineering effort (modify the pipeline; verify and validate the new design); additional costs. Pros: opportunities to optimize the processor pipeline; physical reduction of the amount of wiring.
  • 18. Stacking Functional Unit Blocks: Removing wires (TSV reality). Problem: after stacking two blocks there is enough room for placing TSVs. Solution: different layouts of the TSVs. Wire overhead reintroduction: the reintroduced wires do not completely cancel the wire-reduction benefits of 3D.
  • 19. Section 3: Splitting Functional Unit Blocks.
  • 20. Splitting Functional Unit Blocks: Introduction. Last level: logic gates. Split individual functional units across multiple layers. Reorganize the functional unit block ⇒ more compact 3D arrangement. Benefits: reduced length of intra-block wiring; improved operating frequencies. We will introduce a starting point for thinking about this.
  • 21. Splitting Functional Unit Blocks: 3D Cache Organizations (first view). Problem: the L2 cache consumes about half of the overall die area. Worst-case routing distance: 2x+4y. Two stacking possibilities. Banks on cores: half the space, access unchanged. Banks on banks: half the space, access reduced.
  • 22. Splitting Functional Unit Blocks: Splitting the cache in 3D. Problem: wires within each bank also impact overall latency. Approach: split individual cache banks across multiple layers. Columns on columns: best latency. Rows on rows: energy reduction.
  • 23. Splitting Functional Unit Blocks: Splitting the cache in 3D (testing). Experimental results: SPICE simulation; column-on-column organization; SRAM implementations in a 65-nm process.
  • 24. Splitting Functional Unit Blocks: 3D Adders. Classic carry look-ahead adder with n = 16 bits; the critical path runs from bit[0] to bit[n-1]. Several ways to split the adder. Based on inputs: x on the bottom layer, y on the top layer; the first level of the propagate layer is split; half the wire length. By significance: least significant bits on the bottom layer, most significant bits on the top layer; TSV between root nodes.
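The "by significance" split can be sketched in software as two 8-bit carry look-ahead halves. This is only an illustrative model, not the circuit from the talk: the function names are hypothetical, the "layers" are just function calls, and the group generate/propagate signals that hardware would produce with a tree are computed here with a simple loop. The single carry passed between the halves stands in for the signal that would cross layers through a TSV.

```python
def group_signals(x: int, y: int, width: int):
    """Per-bit propagate plus group generate/propagate over bits 0..i."""
    g = [(x >> i) & (y >> i) & 1 for i in range(width)]      # bit i generates a carry
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(width)]    # bit i propagates a carry
    group_g, group_p = [], []
    acc_g, acc_p = 0, 1
    for i in range(width):
        acc_g = g[i] | (p[i] & acc_g)  # bits 0..i generate a carry on their own
        acc_p = p[i] & acc_p           # bits 0..i pass an incoming carry through
        group_g.append(acc_g)
        group_p.append(acc_p)
    return p, group_g, group_p


def add_half(x: int, y: int, carry_in: int, width: int):
    """Add two width-bit values; each carry depends only on group signals and carry_in."""
    p, group_g, group_p = group_signals(x, y, width)
    carries = [carry_in] + [group_g[i] | (group_p[i] & carry_in) for i in range(width - 1)]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    carry_out = group_g[width - 1] | (group_p[width - 1] & carry_in)
    return total, carry_out


def add16_split_by_significance(x: int, y: int):
    """16-bit add: low byte on the 'bottom layer', high byte on the 'top layer'."""
    low_sum, carry_across = add_half(x & 0xFF, y & 0xFF, 0, 8)  # bottom layer
    high_sum, carry_out = add_half((x >> 8) & 0xFF, (y >> 8) & 0xFF, carry_across, 8)  # top layer
    return (high_sum << 8) | low_sum, carry_out
```

For example, `add16_split_by_significance(0x00FF, 0x0001)` produces a carry in the bottom half that is the only value handed to the top half.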
  • 25. Section 4: Conclusions.
  • 26. Conclusions: Benefits of organizing components in 3D. Can significantly reduce wire lengths. Devices from different technologies can be tightly integrated and combined. 3D organizations may be required depending on the exact design constraints and objectives.
  • 27. Conclusions: Cons. Finer granularity ⇒ more redesign. Stacking can increase heat. A long period of technological development is still needed. Every redesign process leads to a cost increase.
  • 29. The end: Thank you. Questions? Please be nice.