SlideShare uma empresa Scribd logo
1 de 59
Baixar para ler offline
Random Test Generators for
Microprocessor Design Validation
Joel Storm
Staff Engineer, Hardware
Technology, Validation, & Test
Sun Microsystems Inc.
12 May 2006
8th
EMICRO Porto Alegre, RS, Brazil
http://www.inf.ufrgs.br/emicro
2Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Agenda
• Design Verification Process Overview
> Functional Verification Only
>No performance
>No timing
>No electrical, circuit, power, etc.
• Random Instruction Generator Design
> Full Featured Generator
>Useful in simulation
>Bootable on hardware
• OpenSPARC http://opensparc.sunsource.net/
3Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
UltraSPARC-T1: Some Design
Choices • Simpler core architecture to
maximize cores on die
• Caches, dram channels shared
across cores give better area
utilization
• Shared L2 decreases cost of
coherence misses by an order of
magnitude
• On die memory controllers reduce
miss latency
• Crossbar good for b/w, latency,
functional verification
• 378mm2 die in 90nm dissipating
~70W
• http://opensparc.net
4Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Importance of Design Verification
• Cost of manufacturing prototype silicon increasing
• Chip manufacturing turn around time increasing
• Cost of design faults in hardware increasing
> Money
> Electronics now used in “life critical” applications
• Time is money
> A great product 2 years late is not so great
> “Hacking” is too slow
5Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
PART 1: Design Verification Process
• Design for Verification
• Pre silicon Verification
• Post silicon Verification
• Review results to improve things for next time
6Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Design for Verification
• Architecture Features
> Instruction monitors
> Address monitors
> Data saved on exceptions
• Microarchitecture Visibility
> Special access to microarchitecture for software
> External visibility: Scan
> Error injection for RAS testing
• Repeatability / Determinism
> Ability to see problem again (and again, and ....)
7Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Bad Design for Verification
• Architecture Features
> Write Only Registers
> Undefined Fields or Actions
• Repeatability / Determinism
> No way to sync clock domains
> Random state after reset
• Typical reasons for poor Design for Verification
> Mistaken belief decision makes verification easier
> Concerns about impact to size or performance
8Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Pre Silicon Verification
• Environments
> RTL simulation
>Stand Alone Test environments (SATs)
>Fullchip
> RTL simulation methods
>Software
>Hardware accelerated
> Architectural simulation
9Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Pre Silicon Verification 2
• Tools
> Formal verification
> Architectural directed tests
> Microarchitectural directed tests
> Pseudo Random tests
• Common problems
> No tests for “hard” cases
> Too many tests
10Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Post Silicon Verification
• “Bootable” or “Bare Metal” or “Native Mode” tests
> From PROM
> Like Operating System
• Operating Systems based tests
> Runs like normal application program
> May use special debug/validation system calls
11Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Design Verification Philosophy
• Test to the specification!!!
• Test all reasons for exceptions
• Do not limit testing to the “real” cases
> No one really knows all “real” cases
> New “real” cases will appear in the future
> Users will hit “no one would ever do that” cases
by accident
• Do give priority to “real” case
> OK, honestly we do have a pretty good idea what
users do
12Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Design Verification Philosophy 2
• Know critical path in schedule (important!)
• Understand “chicken & egg” problem
> What comes first? The test or the debug
platform?
> Best to work this out with software model
> Bad to work it out with RTL
13Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Priorities -> This is What's Important
• 1: Test Coverage
• 2: Usability
• 3: Efficiency (time/cycles to get to specific case)
• 4: Serious problems in 2 & 3 can affect #1
> Note that a common mistake is to use too many
resources on 2 & 3.
>An intuitive interface and cycle efficient code
that can test 80% of the design is not as useful
as a crude interface and slow code that can
test 95%
14Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
PART 2: Random Instruction
Generator Design
• Full featured, do everything CPU test generator
> Works in all simulation environments
> Boots on hardware like an Operating System
> Automatic stress test generation
> User selectable features
> Pseudo Random
>Some built in intelligence on randomness
>Completely random is not very usefull
> Too big for 1 Engineer
15Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Random Generator Usage
• Simulation environments
> Generate tests on good system
> Run only individual tests on target
• Hardware & hardware like environments
> Load (boot) test generator on target system
> Generate and run tests on target system
>Usually “infinite” loop
– Generate test
– Run test
– Repeat
16Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Architecture of Random Generator
• Three main sections
> Infrastructure
> Test Generator
> Run time environment
17Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Architecture of Random Generator 2
• Infrastructure
> User interface
>Command reader
>Output displays
> Boot code
> Debug aids
>Event history tables
>Instruction breakpoints
18Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Architecture of Random Generator 3
• Test Generator
> Memory Allocation
>Data areas
>Instruction areas
>Address translations (Virtual to Physical)
> Feature Selection
>Architectural / Microarchitectural features
>Instruction mix
19Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Architecture of Random Generator 4
• Run time environment
> Test start up code (context switch)
>Stand alone tests
>Switch from generator control code to test
> Exception handlers (interrupts, traps)
>Handle & recover from exception
>Verify correct behavior
> Test end code (context switch)
>Switch back to generator control code
>Final state checks
20Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Important Considerations
• Not a “normal” application!
• Must be debugable on broken hardware
> Limit outside dependencies (none is best)
>Outside code may not work
>Link to outside code may not work
> Software deterministic
> “Efficient” programing strategies not always good
>Keep run time environment in 100% assembly
– Debuggers will be stepping through this
21Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Important Considerations 2
• Leverage other code carefully!
> Don't try to convert another program into a test
> Grabbing small functions if fine (stack, parse,
print)
> Check if the code you want to use requires more
code, that uses a library, that includes....
• If in doubt, write it yourself
> Time to convert code for verification use often
longer than time to write new usage specific
code.
22Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Programming Techniques
• Data structure are better than logic!
> Tables of data can greatly reduce the need for
long switch/case logic statements
• Pointers are better than logic
> Pass pointers to data structures
> Use function pointers
• Just say “NO” to recursive algorithms
> Remember: need to debug on broken hardware
23Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Generator Goals
• Build all combinations of instructions
> Instruction X before and after instruction Y
> No limits on what can precede/follow instruction
• Conditional branches to/around any instruction
• Microarchitecture testing
> Fill instruction cache (subroutines a good way)
> Branches over “interesting” boundaries
>Cache lines
>Pages
24Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Data Table
• All instruction data in one place
> Opcode (bit pattern that is unique to instruction)
> Mnemonic
> Number of operands
> Operand size
> Pointers to build/simulation/disassembly functions
> Flags for valid exceptions
> Instruction family flags (Branch, Floating-Point)
25Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Data Table Example Entry
0x81a00d20 (opcode)
“fsmuld” (mnemonic)
SOURCE1_SINGLE (flag, mask, or constant for 1'st operand)
SOURCE2_SINGLE (flag, mask, or constant for 2'nd operand)
DESTINATION_DOUBLE (same info for destination reg)
FP_DISABLED | XYZ (flags for valid exceptions on this inst)
build_FP_s_to_d (function pointer to build code for this type)
sim_FP_s_to_d (function pointer to simulation code [if any])
disassemble_FP_s_to_d (fcn pointer to disassembly code)
26Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Data Table 2
• Organize instruction table as a tree
> Use opcode to determine layout
> Leaf nodes are variable length arrays of
instruction structures
> Easy to traverse
>With tree traversal algorithm
>Using opcode
27Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Random Instruction Generation
• Once during program initialization (or when table is
modified):
> Traverse whole instruction tree and build sub
tables
>Array of pointers to all Floating-Point
instructions
>All branch instructions
>All integer multiply instructions
>etc.
• Used by code to stress specific features
28Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Random Instruction Generation 2
• Once for each test built:
> Combine user specified adjustments with
automatic instruction tuning
> Fill instruction mix array with pointers to
instruction sub tables
>100 element array allows 1% adjustments
>1000 elements for 0.1% adjustments (duh)
29Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Main Instruction Generation Loop
• Get a random number
• Use it to index into the instruction mix array
• Use another random number and pointer from
instruction mix array to index into a sub table
• Follow pointer in sub table to instruction data
structure
• Call instruction build function pointed to by function
pointer in instruction data structure
30Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Build Function
• One for each type of instruction (not each inst.)
> Example: 2 integer register sources and 1 integer
register destination
> Floating-Point register load instructions
• Is passed a pointer to the instruction data entry
• Simple logic uses masks and flags from data entry
to build a complete instruction
• Calls common functions for access to resources
31Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Resource Allocation Functions
• One for each type of major resource
> Integer registers
> Floating-Point registers
> Memory addresses
• Can track usage and force data dependencies
• Can force no dependencies
• Can reserve resources for special uses
> Loop counters
32Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
“Split Up” Instruction Sequence
• Loop (backward branch sequence)
> Initialize loop counter
> Decrement/Increment counter
>Set condition code
> Conditional branch back to point after loop
counter initialization
• Forward branch
> Build branch skeleton (not finished instruction)
> Fill in offset for target address
33Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
“Split Up” Instruction Sequence 2
• Other sequences
> Load subroutine address
> Call subroutine
• Two ways to implement
> Recursive calls from build function to main loop
> Scoreboard
34Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Result Testing
• Simulation environments usually include built in
checking
> Results on RTL checked against software model
• Three major types of random generator built in
testing
> Sanity checks of exceptions
> Two pass comparison (Very Powerful)
> Instruction simulation
• End of test sanity checks
35Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Sanity Checks of Exceptions
• Divide by zero trap
> Divide instruction?
> Operand was zero?
• TLB miss trap
> Load or Store?
> Miss was expected/allowed
• Illegal instruction trap
> Is it illegal
> Did the generator put it in this test (Important!)
36Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Two Pass Comparison First Pass
• Run test in single step mode
> Use hardware feature or software traps
>Software trap method is self modifying code
that walks trap instruction through test code
• Save state data at periodic checkpoints
> Registers
> Exceptions taken
37Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Two Pass Comparison Second Pass
• Reset all data
• Run test normally (no single step)
• Check state data at periodic checkpoints
> Registers
> Exceptions taken
>Can be part of checkpoint
>Can be part of exception handling
38Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Instruction Simulation
• Include simulation functions in random generator
> Connections to separate simulation engines don't
work (trust me on this)
• Leverage pass one single step functions
> Easy to implement
> Allows partial simulation (don't need to simulate
simple instructions)
> Use function pointer from instruction data table
>Easy implementation: look up instruction & call
function from function pointer
39Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
End of Test Sanity Checks
• If CPU1 sent CPU7 12 messages, were they all
received?
• Did all CPUs complete the test?
• Were expected exceptions actually seen?
40Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Test Tuning
• Instruction mix
• Data stress
> Boundary conditions
> IEEE Floating-Point fun
• Microarchitecture stress
> Instruction cache
> Data cache
> Store buffers
> Data dependencies
41Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Test Tuning 2
• Controls
> On
> Off
> Random
• Range variables
> Percent forward branches or loops
> Percent register or immediate operands
42Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Usability Extras
• Common sense displays
> Registers & addresses in Hexadecimal
> Instruction dump disassembler
>Use function pointer in instruction data table
>Add helpful comments
• Access to microarchitecture: TLBs, Control registers
• Exception history logs
• Error logs
43Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
PART 3: OpenSPARC
• Freely available version of Sun's UltraSPARC T1
microprocessor
• Architecture documentation
• RTL
• Software models
• Verification tests
• http://OpenSPARC.net
44Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
OpenSPARC Introduction
All Following pages taken from David Yen's
“Opening Doors to the Multicore Era”
presentation for the Multicore Expo in March
2006
45Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Network Computing Is
Thread Rich
Web services, Java
TM
applications, database
transactions, ERP . . .
Moore’s Law
A fraction of the die can
already build a good processor
core; how am I going to use a
billion transistors?
Worsening
Memory Latency
It’s approaching 1000s
of CPU cycles! Friend or foe?
Forcing a rethinking of
processor architecture –
modularity, less is more,
time-to-market
Growing Complexity
of Processor Design
The Big Bang Has Happened
—
Four Converging Trends
46Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Attributes of Commercial
Workloads
Attribute
Application
Category
Web
Server
Instruction-level
Parallelism
Thread-level
Parallelism
Instruction/Data
Working Set
Data Sharing
SAP 2T SAP 3T
(DB)
DSS
(TPC-H)
Server
Java
OLTP ERP ERP DSS
Low Low Low LowMedium High
High High High High High High
Large Large Large Medium Large Large
Low Medium High Medium High Medium
TIER1
Web
(Web99)
TIER2
App Serv
(JBB)
TIER3
Data
(TPC-C)
Web Services Client Server Data
Warehouse
47Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Memory Bottleneck
Relative Performance
10000
1
1990 1995 20051980
1000
100
10
1985 2000
2x Every 6 Years
2x Every 2 Years
Gap
CPU Frequency
DRAM Speeds
Source: Sun World Wide Analyst Conference Feb. 25, 2003
48Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
C MC MC M
Memory Latency Compute
Time
Memory Latency Compute
Typical Complex High
Frequency Processor
Thread
Memory Latency Compute
Time
Time Saved
Memory Latency Compute
C M C M C MThread
Note: Up to 75% Cycles Waiting for
Memory
HURRY
UP AND
WAIT!
49Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Chip Multithreading (CMT)
Memory Latency Compute
Time
Memory Latency Compute
C MC MC MThread 1
Thread 2 C MC MC M
Thread 3 C MC MC M
Thread 4 C MC MC M
50Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Core 1
Memory Latency Compute
CMT – Multiple Multithreaded Cores
Thread 4
Thread 3
Thread 2
Thread 1
Core 2
Thread 4
Thread 3
Thread 2
Thread 1
Core 3
Thread 4
Thread 3
Thread 2
Thread 1
Thread 4
Thread 3
Thread 2
Thread 1
Core 4
Core 5
Thread 4
Thread 3
Thread 2
Thread 1
Core 6
Thread 4
Thread 3
Thread 2
Thread 1
Core 7
Thread 4
Thread 3
Thread 2
Thread 1
Core 8
Thread 4
Thread 3
Thread 2
Thread 1
Time
51Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Why CMT Works
“Goal: 100% Resource Utilization”
20% MaximumCore Size
SPARC: 4 threads per core
● Increases core die area by ~20%
● Improves performance by ~50–100%
.05
1.0
10.0
2.0
Multi-Thread, Single-Core
Multi-Thread, Multi-Core
Single-Thread, Single-Core
RelativePerformanceonthread-
richmemoryboundworkloads
52Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
• SPARC V9 implementation
• Up to eight 4-way multi-
threaded cores for up to
32 simultaneous threads
• All cores connected through a
134.4GB/s crossbar switch
• High-bandwidth 12-way associative
3MB Level-2 cache on chip
• 4 DDR2 channels (23GB/s)
• Power : < 80W
• ~300M transistors
• 378 sq. mm die
1 of 8 Cores
BUS
C8C7C6C5C4C3C2C1
L2$L2$L2$L2$
Xbar
DDR-2
SDRAM
DDR-2
SDRAM
DDR-2
SDRAM
DDR-2
SDRAM
FPU
UltraSPARC T1 Processor
Sys I/F
Buffer Switch
Core
53Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Single-Core Processor CMT Processor
(Not to Scale)
C1 C2 C3 C4
C5 C6 C7 C8
Faster Can Be Cooler
107C
102C
96C
91C
85C
80C
74C
69C
63C
58C
54Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
CMT: On-chip = High Bandwidth
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Niagara: 32 Threads
Direct crossbar interconnect
Lower cost, better RAS, lower BTUs,
lower and uniform latency,
greater and uniform bandwidth. . .
PP
PP
PP
PP
Mem Ctlr
Mem Ctlr
Mem Ctlr
Mem CtlrI/O
Switch
Traditional SMP: 32 Threads
Example: Typical SMP Machine Configuration
One motherboard, no switch ASICs
Switc
h
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Switch
P P P P
M M M M
IO
Switch
XBar
L2
L2
L2
L2
55Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
CMT Benefits
Performance
Cost
● Fewer servers
● Less floor space
● Reduced power consumption
● Less air conditioning
● Lower administration and
maintenance
Reliability
56Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
OpenSPARC: New Frontier in Choice!
• Sun's OpenSPARC initiative intends to
open source UltraSPARC T1 design
point
> Announced Dec. 6, 2005
> RTL in Verilog released March 21, 2006
• Initial publications also include:
> A verification suite and simulation models
> ISA specification (UltraSPARC Architecture
2005)
> UltraSPARC T1-specific ISA supplement
> A Solaris port
57Joel Storm 8th
EMICRO, Porto Alegre, RS, Brazil
Innovation
willhappeneverywhere
About the Community: opensparc.net
Innovation Happens Everywhere
Clustermaps for http://opensparc.net
64 bit, 32 threads,
free
www.opensparc.net
Get the code. Start
innovating.
Multi-threaded algorithms and applications,
Operating Systems, System Architecture, EDA
Tools/Methodology,
Circuit implementations, Compiler Tools, System
Modeling, System on a Chip, Debug tools,
Performance analysis and benchmarking
Joel Storm
Joel.Storm@sun.com
Random Test Generators for
Microprocessor Design Validation

Mais conteúdo relacionado

Destaque

Bab5pencemaranudara 131009223353-phpapp02
Bab5pencemaranudara 131009223353-phpapp02Bab5pencemaranudara 131009223353-phpapp02
Bab5pencemaranudara 131009223353-phpapp02Firdaus Bakhtiar
 
Evaluating a restructured university portal framework with reusable components
Evaluating a restructured university portal framework with reusable componentsEvaluating a restructured university portal framework with reusable components
Evaluating a restructured university portal framework with reusable componentseSAT Journals
 
ASQ QMD GSC Voice Of Customer Survey 020509
ASQ QMD GSC Voice Of Customer Survey 020509ASQ QMD GSC Voice Of Customer Survey 020509
ASQ QMD GSC Voice Of Customer Survey 020509John Cachat
 
Atualização manual direito prev 7ª edção
Atualização manual direito prev 7ª edçãoAtualização manual direito prev 7ª edção
Atualização manual direito prev 7ª edçãoFelipe Junior
 
OS SETE SAPATOS SUJOS, por Mia Couto
OS SETE SAPATOS SUJOS, por Mia CoutoOS SETE SAPATOS SUJOS, por Mia Couto
OS SETE SAPATOS SUJOS, por Mia CoutoDavid Pires
 
Aventura jogo em equipe montagem de texto em fichas
Aventura jogo em equipe montagem de texto em fichasAventura jogo em equipe montagem de texto em fichas
Aventura jogo em equipe montagem de texto em fichasGiselda morais rodrigues do
 
Computer graphics lab assignment
Computer graphics lab assignmentComputer graphics lab assignment
Computer graphics lab assignmentAbdullah Al Shiam
 
Residuos de farmacos en animales
Residuos de farmacos en animalesResiduos de farmacos en animales
Residuos de farmacos en animalesmodeltop
 
Evaluating philosophical claims and theories
Evaluating philosophical claims and theories Evaluating philosophical claims and theories
Evaluating philosophical claims and theories philipapeters
 
Information Technology Infrastructure
Information Technology Infrastructure Information Technology Infrastructure
Information Technology Infrastructure Hurriya Saeed rana
 

Destaque (13)

Bab5pencemaranudara 131009223353-phpapp02
Bab5pencemaranudara 131009223353-phpapp02Bab5pencemaranudara 131009223353-phpapp02
Bab5pencemaranudara 131009223353-phpapp02
 
Freud’s understanding of religion
Freud’s understanding of religionFreud’s understanding of religion
Freud’s understanding of religion
 
Abing grace ..
Abing grace ..Abing grace ..
Abing grace ..
 
THE KINGDOM
THE KINGDOMTHE KINGDOM
THE KINGDOM
 
Evaluating a restructured university portal framework with reusable components
Evaluating a restructured university portal framework with reusable componentsEvaluating a restructured university portal framework with reusable components
Evaluating a restructured university portal framework with reusable components
 
ASQ QMD GSC Voice Of Customer Survey 020509
ASQ QMD GSC Voice Of Customer Survey 020509ASQ QMD GSC Voice Of Customer Survey 020509
ASQ QMD GSC Voice Of Customer Survey 020509
 
Atualização manual direito prev 7ª edção
Atualização manual direito prev 7ª edçãoAtualização manual direito prev 7ª edção
Atualização manual direito prev 7ª edção
 
OS SETE SAPATOS SUJOS, por Mia Couto
OS SETE SAPATOS SUJOS, por Mia CoutoOS SETE SAPATOS SUJOS, por Mia Couto
OS SETE SAPATOS SUJOS, por Mia Couto
 
Aventura jogo em equipe montagem de texto em fichas
Aventura jogo em equipe montagem de texto em fichasAventura jogo em equipe montagem de texto em fichas
Aventura jogo em equipe montagem de texto em fichas
 
Computer graphics lab assignment
Computer graphics lab assignmentComputer graphics lab assignment
Computer graphics lab assignment
 
Residuos de farmacos en animales
Residuos de farmacos en animalesResiduos de farmacos en animales
Residuos de farmacos en animales
 
Evaluating philosophical claims and theories
Evaluating philosophical claims and theories Evaluating philosophical claims and theories
Evaluating philosophical claims and theories
 
Information Technology Infrastructure
Information Technology Infrastructure Information Technology Infrastructure
Information Technology Infrastructure
 

Semelhante a 53-rand-test-gen-validation-1530392

L1_Introduction.ppt
L1_Introduction.pptL1_Introduction.ppt
L1_Introduction.pptVarsha506533
 
IoT Best Practices: Unit Testing
IoT Best Practices: Unit TestingIoT Best Practices: Unit Testing
IoT Best Practices: Unit Testingfarmckon
 
39245147 intro-es-i
39245147 intro-es-i39245147 intro-es-i
39245147 intro-es-iEmbeddedbvp
 
HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?DVClub
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuDatabricks
 
Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Lari Hotari
 
Early Software Development through Palladium Emulation
Early Software Development through Palladium EmulationEarly Software Development through Palladium Emulation
Early Software Development through Palladium EmulationRaghav Nayak
 
S emb t10-development
S emb t10-developmentS emb t10-development
S emb t10-developmentJoão Moreira
 
Microcontroller Based Testing of Digital IP-Core
Microcontroller Based Testing of Digital IP-CoreMicrocontroller Based Testing of Digital IP-Core
Microcontroller Based Testing of Digital IP-CoreVLSICS Design
 
Architecture for Massively Parallel HDL Simulations
Architecture for Massively Parallel HDL Simulations Architecture for Massively Parallel HDL Simulations
Architecture for Massively Parallel HDL Simulations DVClub
 
SE2_Lec 20_Software Testing
SE2_Lec 20_Software TestingSE2_Lec 20_Software Testing
SE2_Lec 20_Software TestingAmr E. Mohamed
 
Lecture 2 computer function
Lecture 2  computer functionLecture 2  computer function
Lecture 2 computer functionPradeep Kumar TS
 
One integrated platform for all activities,from engineering to production
One integrated platform for all activities,from engineering to productionOne integrated platform for all activities,from engineering to production
One integrated platform for all activities,from engineering to productionS Jebaraj
 
Ppt on embedded system
Ppt on embedded systemPpt on embedded system
Ppt on embedded systemPankaj joshi
 
代码大全(内训)
代码大全(内训)代码大全(内训)
代码大全(内训)Horky Chen
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on EmulatorsDVClub
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...DVClub
 

Semelhante a 53-rand-test-gen-validation-1530392 (20)

L1_Introduction.ppt
L1_Introduction.pptL1_Introduction.ppt
L1_Introduction.ppt
 
IoT Best Practices: Unit Testing
IoT Best Practices: Unit TestingIoT Best Practices: Unit Testing
IoT Best Practices: Unit Testing
 
39245147 intro-es-i
39245147 intro-es-i39245147 intro-es-i
39245147 intro-es-i
 
JavaMicroBenchmarkpptm
JavaMicroBenchmarkpptmJavaMicroBenchmarkpptm
JavaMicroBenchmarkpptm
 
HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
 
Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014Performance tuning Grails Applications GR8Conf US 2014
Performance tuning Grails Applications GR8Conf US 2014
 
Early Software Development through Palladium Emulation
Early Software Development through Palladium EmulationEarly Software Development through Palladium Emulation
Early Software Development through Palladium Emulation
 
Embedded System-design technology
Embedded System-design technologyEmbedded System-design technology
Embedded System-design technology
 
S emb t10-development
S emb t10-developmentS emb t10-development
S emb t10-development
 
Microcontroller Based Testing of Digital IP-Core
Microcontroller Based Testing of Digital IP-CoreMicrocontroller Based Testing of Digital IP-Core
Microcontroller Based Testing of Digital IP-Core
 
Architecture for Massively Parallel HDL Simulations
Architecture for Massively Parallel HDL Simulations Architecture for Massively Parallel HDL Simulations
Architecture for Massively Parallel HDL Simulations
 
SE2_Lec 20_Software Testing
SE2_Lec 20_Software TestingSE2_Lec 20_Software Testing
SE2_Lec 20_Software Testing
 
Lecture 2 computer function
Lecture 2  computer functionLecture 2  computer function
Lecture 2 computer function
 
One integrated platform for all activities,from engineering to production
One integrated platform for all activities,from engineering to productionOne integrated platform for all activities,from engineering to production
One integrated platform for all activities,from engineering to production
 
Ppt on embedded system
Ppt on embedded systemPpt on embedded system
Ppt on embedded system
 
Kernel Debugging & Profiling
Kernel Debugging & ProfilingKernel Debugging & Profiling
Kernel Debugging & Profiling
 
代码大全(内训)
代码大全(内训)代码大全(内训)
代码大全(内训)
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on Emulators
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
 

53-rand-test-gen-validation-1530392

  • 1. Random Test Generators for Microprocessor Design Validation Joel Storm Staff Engineer, Hardware Technology, Validation, & Test Sun Microsystems Inc. 12 May 2006 8th EMICRO Porto Alegre, RS, Brazil http://www.inf.ufrgs.br/emicro
  • 2. 2Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Agenda • Design Verification Process Overview > Functional Verification Only >No performance >No timing >No electrical, circuit, power, etc. • Random Instruction Generator Design > Full Featured Generator >Useful in simulation >Bootable on hardware • OpenSPARC http://opensparc.sunsource.net/
  • 3. 3Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil UltraSPARC-T1: Some Design Choices • Simpler core architecture to maximize cores on die • Caches, dram channels shared across cores give better area utilization • Shared L2 decreases cost of coherence misses by an order of magnitude • On die memory controllers reduce miss latency • Crossbar good for b/w, latency, functional verification • 378mm2 die in 90nm dissipating ~70W • http://opensparc.net
  • 4. 4Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Importance of Design Verification • Cost of manufacturing prototype silicon increasing • Chip manufacturing turn around time increasing • Cost of design faults in hardware increasing > Money > Electronics now used in “life critical” applications • Time is money > A great product 2 years late is not so great > “Hacking” is too slow
  • 5. 5Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil PART 1: Design Verification Process • Design for Verification • Pre silicon Verification • Post silicon Verification • Review results to improve things for next time
  • 6. 6Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Design for Verification • Architecture Features > Instruction monitors > Address monitors > Data saved on exceptions • Microarchitecture Visibility > Special access to microarchitecture for software > External visibility: Scan > Error injection for RAS testing • Repeatability / Determinism > Ability to see problem again (and again, and ....)
  • 7. 7Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Bad Design for Verification • Architecture Features > Write Only Registers > Undefined Fields or Actions • Repeatability / Determinism > No way to sync clock domains > Random state after reset • Typical reasons for poor Design for Verification > Mistaken belief decision makes verification easier > Concerns about impact to size or performance
  • 8. 8Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Pre Silicon Verification • Environments > RTL simulation >Stand Alone Test environments (SATs) >Fullchip > RTL simulation methods >Software >Hardware accelerated > Architectural simulation
  • 9. 9Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Pre Silicon Verification 2 • Tools > Formal verification > Architectural directed tests > Microarchitectural directed tests > Pseudo Random tests • Common problems > No tests for “hard” cases > Too many tests
  • 10. 10Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Post Silicon Verification • “Bootable” or “Bare Metal” or “Native Mode” tests > From PROM > Like Operating System • Operating Systems based tests > Runs like normal application program > May use special debug/validation system calls
  • 11. 11Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Design Verification Philosophy • Test to the specification!!! • Test all reasons for exceptions • Do not limit testing to the “real” cases > No one really knows all “real” cases > New “real” cases will appear in the future > Users will hit “no one would ever do that” cases by accident • Do give priority to “real” case > OK, honestly we do have a pretty good idea what users do
  • 12. 12Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Design Verification Philosophy 2 • Know critical path in schedule (important!) • Understand “chicken & egg” problem > What comes first? The test or the debug platform? > Best to work this out with software model > Bad to work it out with RTL
  • 13. 13Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Priorities -> This is What's Important • 1: Test Coverage • 2: Usability • 3: Efficiency (time/cycles to get to specific case) • 4: Serious problems in 2 & 3 can affect #1 > Note that a common mistake is to use too many resources on 2 & 3. >An intuitive interface and cycle efficient code that can test 80% of the design is not as useful as a crude interface and slow code that can test 95%
  • 14. 14Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil PART 2: Random Instruction Generator Design • Full featured, do everything CPU test generator > Works in all simulation environments > Boots on hardware like an Operating System > Automatic stress test generation > User selectable features > Pseudo Random >Some built in intelligence on randomness >Completely random is not very usefull > Too big for 1 Engineer
  • 15. 15Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Random Generator Usage • Simulation environments > Generate tests on good system > Run only individual tests on target • Hardware & hardware like environments > Load (boot) test generator on target system > Generate and run tests on target system >Usually “infinite” loop – Generate test – Run test – Repeat
  • 16. 16Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Architecture of Random Generator • Three main sections > Infrastructure > Test Generator > Run time environment
  • 17. 17Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Architecture of Random Generator 2 • Infrastructure > User interface >Command reader >Output displays > Boot code > Debug aids >Event history tables >Instruction breakpoints
  • 18. 18Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Architecture of Random Generator 3 • Test Generator > Memory Allocation >Data areas >Instruction areas >Address translations (Virtual to Physical) > Feature Selection >Architectural / Microarchitectural features >Instruction mix
  • 19. 19Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Architecture of Random Generator 4 • Run time environment > Test start up code (context switch) >Stand alone tests >Switch from generator control code to test > Exception handlers (interrupts, traps) >Handle & recover from exception >Verify correct behavior > Test end code (context switch) >Switch back to generator control code >Final state checks
  • 20. 20Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Important Considerations • Not a “normal” application! • Must be debugable on broken hardware > Limit outside dependencies (none is best) >Outside code may not work >Link to outside code may not work > Software deterministic > “Efficient” programing strategies not always good >Keep run time environment in 100% assembly – Debuggers will be stepping through this
  • 21. 21Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Important Considerations 2 • Leverage other code carefully! > Don't try to convert another program into a test > Grabbing small functions if fine (stack, parse, print) > Check if the code you want to use requires more code, that uses a library, that includes.... • If in doubt, write it yourself > Time to convert code for verification use often longer than time to write new usage specific code.
  • 22. 22Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Programming Techniques • Data structure are better than logic! > Tables of data can greatly reduce the need for long switch/case logic statements • Pointers are better than logic > Pass pointers to data structures > Use function pointers • Just say “NO” to recursive algorithms > Remember: need to debug on broken hardware
  • 23. 23Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Generator Goals • Build all combinations of instructions > Instruction X before and after instruction Y > No limits on what can precede/follow instruction • Conditional branches to/around any instruction • Microarchitecture testing > Fill instruction cache (subroutines a good way) > Branches over “interesting” boundaries >Cache lines >Pages
  • 24. 24Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Data Table • All instruction data in one place > Opcode (bit pattern that is unique to instruction) > Mnemonic > Number of operands > Operand size > Pointers to build/simulation/disassembly functions > Flags for valid exceptions > Instruction family flags (Branch, Floating-Point)
  • 25. 25Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Data Table Example Entry 0x81a00d20 (opcode) “fsmuld” (mnemonic) SOURCE1_SINGLE (flag, mask, or constant for 1'st operand) SOURCE2_SINGLE (flag, mask, or constant for 2'nd operand) DESTINATION_DOUBLE (same info for destination reg) FP_DISABLED | XYZ (flags for valid exceptions on this inst) build_FP_s_to_d (function pointer to build code for this type) sim_FP_s_to_d (function pointer to simulation code [if any]) disassemble_FP_s_to_d (fcn pointer to disassembly code)
  • 26. 26Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Data Table 2 • Organize instruction table as a tree > Use opcode to determine layout > Leaf nodes are variable length arrays of instruction structures > Easy to traverse >With tree traversal algorithm >Using opcode
  • 27. 27Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Random Instruction Generation • Once during program initialization (or when table is modified): > Traverse whole instruction tree and build sub tables >Array of pointers to all Floating-Point instructions >All branch instructions >All integer multiply instructions >etc. • Used by code to stress specific features
  • 28. 28Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Random Instruction Generation 2 • Once for each test built: > Combine user specified adjustments with automatic instruction tuning > Fill instruction mix array with pointers to instruction sub tables >100 element array allows 1% adjustments >1000 elements for 0.1% adjustments (duh)
  • 29. 29Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Main Instruction Generation Loop • Get a random number • Use it to index into the instruction mix array • Use another random number and pointer from instruction mix array to index into a sub table • Follow pointer in sub table to instruction data structure • Call instruction build function pointed to by function pointer in instruction data structure
  • 30. 30Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Build Function • One for each type of instruction (not each inst.) > Example: 2 integer register sources and 1 integer register destination > Floating-Point register load instructions • Is passed a pointer to the instruction data entry • Simple logic uses masks and flags from data entry to build a complete instruction • Calls common functions for access to resources
  • 31. 31Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Resource Allocation Functions • One for each type of major resource > Integer registers > Floating-Point registers > Memory addresses • Can track usage and force data dependencies • Can force no dependencies • Can reserve resources for special uses > Loop counters
  • 32. 32Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil “Split Up” Instruction Sequence • Loop (backward branch sequence) > Initialize loop counter > Decrement/Increment counter >Set condition code > Conditional branch back to point after loop counter initialization • Forward branch > Build branch skeleton (not finished instruction) > Fill in offset for target address
  • 33. 33Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil “Split Up” Instruction Sequence 2 • Other sequences > Load subroutine address > Call subroutine • Two ways to implement > Recursive calls from build function to main loop > Scoreboard
  • 34. 34Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Result Testing • Simulation environments usually include built in checking > Results on RTL checked against software model • Three major types of random generator built in testing > Sanity checks of exceptions > Two pass comparison (Very Powerful) > Instruction simulation • End of test sanity checks
  • 35. 35Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Sanity Checks of Exceptions • Divide by zero trap > Divide instruction? > Operand was zero? • TLB miss trap > Load or Store? > Miss was expected/allowed • Illegal instruction trap > Is it illegal > Did the generator put it in this test (Important!)
  • 36. 36Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Two Pass Comparison First Pass • Run test in single step mode > Use hardware feature or software traps >Software trap method is self modifying code that walks trap instruction through test code • Save state data at periodic checkpoints > Registers > Exceptions taken
  • 37. 37Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Two Pass Comparison Second Pass • Reset all data • Run test normally (no single step) • Check state data at periodic checkpoints > Registers > Exceptions taken >Can be part of checkpoint >Can be part of exception handling
  • 38. 38Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Instruction Simulation • Include simulation functions in random generator > Connections to separate simulation engines don't work (trust me on this) • Leverage pass one single step functions > Easy to implement > Allows partial simulation (don't need to simulate simple instructions) > Use function pointer from instruction data table >Easy implementation: look up instruction & call function from function pointer
  • 39. 39Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil End of Test Sanity Checks • If CPU1 sent CPU7 12 messages, were they all received? • Did all CPUs complete the test? • Were expected exceptions actually seen?
  • 40. 40Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Test Tuning • Instruction mix • Data stress > Boundary conditions > IEEE Floating-Point fun • Microarchitecture stress > Instruction cache > Data cache > Store buffers > Data dependencies
  • 41. 41Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Test Tuning 2 • Controls > On > Off > Random • Range variables > Percent forward branches or loops > Percent register or immediate operands
  • 42. 42Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Usability Extras • Common sense displays > Registers & addresses in Hexadecimal > Instruction dump disassembler >Use function pointer in instruction data table >Add helpful comments • Access to microarchitecture: TLBs, Control registers • Exception history logs • Error logs
  • 43. 43Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil PART 3: OpenSPARC • Freely available version of Sun's UltraSPARC T1 microprocessor • Architecture documentation • RTL • Software models • Verification tests • http://OpenSPARC.net
  • 44. 44Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil OpenSPARC Introduction All Following pages taken from David Yen's “Opening Doors to the Multicore Era” presentation for the Multicore Expo in March 2006
  • 45. 45Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Network Computing Is Thread Rich Web services, Java TM applications, database transactions, ERP . . . Moore’s Law A fraction of the die can already build a good processor core; how am I going to use a billion transistors? Worsening Memory Latency It’s approaching 1000s of CPU cycles! Friend or foe? Forcing a rethinking of processor architecture – modularity, less is more, time-to-market Growing Complexity of Processor Design The Big Bang Has Happened — Four Converging Trends
  • 46. 46Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Attributes of Commercial Workloads Attribute Application Category Web Server Instruction-level Parallelism Thread-level Parallelism Instruction/Data Working Set Data Sharing SAP 2T SAP 3T (DB) DSS (TPC-H) Server Java OLTP ERP ERP DSS Low Low Low LowMedium High High High High High High High Large Large Large Medium Large Large Low Medium High Medium High Medium TIER1 Web (Web99) TIER2 App Serv (JBB) TIER3 Data (TPC-C) Web Services Client Server Data Warehouse
  • 47. 47Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Memory Bottleneck Relative Performance 10000 1 1990 1995 20051980 1000 100 10 1985 2000 2x Every 6 Years 2x Every 2 Years Gap CPU Frequency DRAM Speeds Source: Sun World Wide Analyst Conference Feb. 25, 2003
  • 48. 48Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil C MC MC M Memory Latency Compute Time Memory Latency Compute Typical Complex High Frequency Processor Thread Memory Latency Compute Time Time Saved Memory Latency Compute C M C M C MThread Note: Up to 75% Cycles Waiting for Memory HURRY UP AND WAIT!
  • 49. 49Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Chip Multithreading (CMT) Memory Latency Compute Time Memory Latency Compute C MC MC MThread 1 Thread 2 C MC MC M Thread 3 C MC MC M Thread 4 C MC MC M
  • 50. 50Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Core 1 Memory Latency Compute CMT – Multiple Multithreaded Cores Thread 4 Thread 3 Thread 2 Thread 1 Core 2 Thread 4 Thread 3 Thread 2 Thread 1 Core 3 Thread 4 Thread 3 Thread 2 Thread 1 Thread 4 Thread 3 Thread 2 Thread 1 Core 4 Core 5 Thread 4 Thread 3 Thread 2 Thread 1 Core 6 Thread 4 Thread 3 Thread 2 Thread 1 Core 7 Thread 4 Thread 3 Thread 2 Thread 1 Core 8 Thread 4 Thread 3 Thread 2 Thread 1 Time
  • 51. 51Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Why CMT Works “Goal: 100% Resource Utilization” 20% MaximumCore Size SPARC: 4 threads per core ● Increases core die area by ~20% ● Improves performance by ~50–100% .05 1.0 10.0 2.0 Multi-Thread, Single-Core Multi-Thread, Multi-Core Single-Thread, Single-Core RelativePerformanceonthread- richmemoryboundworkloads
  • 52. 52Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil • SPARC V9 implementation • Up to eight 4-way multi- threaded cores for up to 32 simultaneous threads • All cores connected through a 134.4GB/s crossbar switch • High-bandwidth 12-way associative 3MB Level-2 cache on chip • 4 DDR2 channels (23GB/s) • Power : < 80W • ~300M transistors • 378 sq. mm die 1 of 8 Cores BUS C8C7C6C5C4C3C2C1 L2$L2$L2$L2$ Xbar DDR-2 SDRAM DDR-2 SDRAM DDR-2 SDRAM DDR-2 SDRAM FPU UltraSPARC T1 Processor Sys I/F Buffer Switch Core
  • 53. 53Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Single-Core Processor CMT Processor (Not to Scale) C1 C2 C3 C4 C5 C6 C7 C8 Faster Can Be Cooler 107C 102C 96C 91C 85C 80C 74C 69C 63C 58C
  • 54. 54Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil CMT: On-chip = High Bandwidth Switch P P P P M M M M IO Switch P P P P M M M M IO Switch P P P P M M M M IO Switch P P P P M M M M IO Niagara: 32 Threads Direct crossbar interconnect Lower cost, better RAS, lower BTUs, lower and uniform latency, greater and uniform bandwidth. . . PP PP PP PP Mem Ctlr Mem Ctlr Mem Ctlr Mem CtlrI/O Switch Traditional SMP: 32 Threads Example: Typical SMP Machine Configuration One motherboard, no switch ASICs Switc h Switch P P P P M M M M IO Switch P P P P M M M M IO Switch P P P P M M M M IO Switch P P P P M M M M IO Switch XBar L2 L2 L2 L2
  • 55. 55Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil CMT Benefits Performance Cost ● Fewer servers ● Less floor space ● Reduced power consumption ● Less air conditioning ● Lower administration and maintenance Reliability
  • 56. 56Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil OpenSPARC: New Frontier in Choice! • Sun's OpenSPARC initiative intends to open source UltraSPARC T1 design point > Announced Dec. 6, 2005 > RTL in Verilog released March 21, 2006 • Initial publications also include: > A verification suite and simulation models > ISA specification (UltraSPARC Architecture 2005) > UltraSPARC T1-specific ISA supplement > A Solaris port
  • 57. 57Joel Storm 8th EMICRO, Porto Alegre, RS, Brazil Innovation willhappeneverywhere About the Community: opensparc.net Innovation Happens Everywhere Clustermaps for http://opensparc.net
  • 58. 64 bit, 32 threads, free www.opensparc.net Get the code. Start innovating. Multi-threaded algorithms and applications, Operating Systems, System Architecture, EDA Tools/Methodology, Circuit implementations, Compiler Tools, System Modeling, System on a Chip, Debug tools, Performance analysis and benchmarking
  • 59. Joel Storm Joel.Storm@sun.com Random Test Generators for Microprocessor Design Validation