Velammal Engineering College
Department of Computer Science
and Engineering
Welcome…
Slide Sources: Patterson & Hennessy COD book
website (copyright Morgan Kaufmann) adapted
and supplemented
Mr. A. Arockia Abins &
Ms. R. Amirthavalli,
Asst. Prof,
CSE,
Velammal Engineering College
Course Objectives
• This course aims to teach the basic structure and operations of
a computer.
• The course covers the ALU, pipelined execution,
parallelism and multi-core processors.
• The course will enable students to understand memory
hierarchies, cache memories and virtual memories.
Course Outcomes
CO 1 Discuss the basic structure of computers, operations and
instructions.
CO 2 Design arithmetic and logic unit.
CO 3 Analyze pipelined execution and design control unit.
CO 4 Analyze parallel processing architectures.
CO 5 Examine the performance of various memory systems
CO 6 Organize the various I/O communications.
Syllabus
Unit Titles:
• Unit I Basic Structure of a Computer System
• Unit II Arithmetic for Computers
• Unit III Processor and Control Unit
• Unit IV Parallelism
• Unit V Memory & I/O Systems
Syllabus – Unit I
UNIT-I BASIC STRUCTURE OF A COMPUTER
SYSTEM
Functional Units – Basic operational concepts – Instructions:
Operations, Operands – Instruction representation – Instruction
Types – MIPS addressing – Performance
Syllabus – Unit II
UNIT-II ARITHMETIC FOR COMPUTERS
Addition and Subtraction – Multiplication – Division – Floating
Point Representation – Floating Point Addition and Subtraction.
Syllabus – Unit III
UNIT-III PROCESSOR AND CONTROL UNIT
A Basic MIPS implementation – Building a Datapath – Control
Implementation Scheme – Pipelining – Pipelined datapath and
control – Handling Data Hazards & Control Hazards.
Syllabus – Unit IV
UNIT-IV PARALLELISM
Introduction to Multicore processors and other shared memory
multiprocessors – Flynn’s classification: SISD, MIMD, SIMD,
SPMD and Vector – Hardware multithreading – GPU
architecture.
Syllabus – Unit V
UNIT-V MEMORY & I/O SYSTEMS
Memory Hierarchy – Memory technologies – Cache Memory –
Performance Considerations, Virtual Memory, TLBs – Accessing
I/O devices – Interrupts – Direct Memory Access – Bus Structure
– Bus operation.
Text Books
• Book 1:
o Name: Computer Organization and Design: The
Hardware/Software Interface
o Authors: David A. Patterson and John L. Hennessy
o Publisher: Morgan Kaufmann / Elsevier
o Edition: Fifth Edition, 2014
• Book 2:
o Name: Computer Organization and Embedded Systems
o Authors: Carl Hamacher, Zvonko Vranesic, Safwat Zaky and
Naraig Manjikian
o Publisher: Tata McGraw Hill
o Edition: Sixth Edition, 2012
Introduction
• What is meant by Computer Architecture?
Hardware parts
Instruction set
Interface between hardware &
software
Introduction
ISA: a+b -> add a,b ->000100110101010
Instruction Set Architecture
(ISA)
ISA: The interface or contract between the hardware and
the software
Rules about how to code and interpret machine
instructions:
Execution model (program counter)
Operations (instructions)
Data formats (sizes, addressing modes)
Processor state (registers)
Input and Output (memory, etc.)
Introduction
• What is meant by Computer
Architecture?
Computer architecture encompasses
the specification of an instruction set
and the functional behavior of the
hardware units that implement the
instructions.
Introduction
Technology Evolution
UNIT-I
BASIC STRUCTURE OF A
COMPUTER SYSTEM
Topics:
• Functional Units
• Basic operational concepts
• Instructions: Operations, Operands
• Instruction representation
• Instruction Types
• MIPS addressing mode
• Performance
Functional Units
Also called
the datapath
Functional Units
Functional Units
• Input unit
• Output unit
• Memory unit
• Arithmetic Logic unit
• Control unit
Functional Units
• Input unit
Functional Units
• Output unit
Functional Units
• Memory unit
Functional Units
• Arithmetic and logic unit
• Control unit
Basic Operational Concepts
Unit I
Connection between the processor and the main memory
Code snippet:
Load R2, LOC
Add R4, R3, R2
Store R4, LOC
IR & PC
• Instruction Register:
The instruction register (IR) holds the
instruction that is currently being executed.
• Program Counter:
The program counter (PC) contains the
memory address of the next instruction to be
fetched and executed.
Memory Locations and Addresses
Examples of encoded information in a
32-bit word.
Instructions
Steps in program
translation
Translations
Machine vs Assembly
Language
Machine Language
• A particular set of instructions that the CPU can directly
execute, expressed as ones and zeros
• Ex: 0100001010101
Assembly Language
• A symbolic version of the equivalent machine language
• Ex: add a,b
Instructions
• Instruction Set:
o The vocabulary of commands understood by a
given architecture.
• Some ISA:
o ARM
o Intel x86
o IBM Power
o MIPS
o SPARC
• Different CPUs implement different sets of
instructions.
MIPS
MIPS - Microprocessor without Interlocked Pipeline Stages
Features:
• five-stage execution pipeline: fetch, decode, execute,
memory-access, write-result
• regular instruction set, all instructions are 32-bit
• three-operand arithmetical and logical instructions
• 32 general-purpose registers of 32-bits each
• only the load and store instructions access memory
• flat address space of 4 GBytes of main memory (2^32
bytes)
MIPS Assembly Language
• Categories:
o Arithmetic – Only processor and registers
involved (sum of two registers)
o Data transfer – Interacts with memory
(load and store)
o Logical – Only processor and registers
involved (and, sll)
o Conditional branch – Change flow of
execution (branch instructions)
o Unconditional jump – Change flow of
execution (jump to a subroutine)
MIPS Registers
Arithmetic
Data Transfer
Load & Store Instructions
• Load:
o Transfer data from memory to a register
• Store:
o Transfer data from a register to memory
• A memory address must be specified by
each load and store
(Figure: LOAD moves data from memory to the processor; STORE moves data from the processor to memory)
Logical
Conditional
Unconditional Jump
MIPS Arithmetic
• All MIPS arithmetic instructions have 3 operands
• Operand order is fixed (e.g., destination first)
• Example:
C code: A = B + C
MIPS code: add $s0, $s1, $s2
compiler’s job to associate
variables with registers
MIPS Arithmetic
• Design Principle 1: simplicity favors regularity.
Translation: Regular instructions make for simple hardware!
• Simpler hardware reduces design time and manufacturing cost.
• Of course this complicates some things...
C code: A = B + C + D;
E = F - A;
MIPS code add $t0, $s1, $s2
(arithmetic): add $s0, $t0, $s3
sub $s4, $s5, $s0
• Performance penalty: high-level code translates to denser machine
code.
Allowing variable number
of operands would
simplify the assembly
code but complicate the
hardware.
MIPS Arithmetic
Variables a, b, c, f, g, h, i, j are held in registers
$s0, $s1, $s2, $s3, $s4, $s5, $s6, $s7 respectively.
C code: a = b - c;
f = (g + h) - (i + j);
MIPS code: sub $s0, $s1, $s2
add $t0, $s4, $s5
add $t1, $s6, $s7
sub $s3, $t0, $t1
Try:
1. f = g + (h - 5)
2. f = (i + j) - (k - 20)
Registers vs. Memory
• Arithmetic instruction operands must be in registers
o MIPS has 32 registers
• Compiler associates variables with registers
• What about programs with lots of variables (arrays, etc.)? Use
memory: load/store operations transfer data between memory and
registers. If there are not enough registers, spill registers to memory
• MIPS is a load/store architecture
(Figure: the processor, made up of control and datapath, connected to memory and to I/O, i.e. input and output)
Memory Organization
• Viewed as a large single-dimension array with access by
address
• A memory address is an index into the memory array
• Byte addressing means that the index points to a byte of
memory, and that the unit of memory accessed by a load/store
is a byte
Address   Contents
0         8 bits of data
1         8 bits of data
2         8 bits of data
3         8 bits of data
4         8 bits of data
5         8 bits of data
6         8 bits of data
...
Memory Organization
• Bytes are load/store units, but most data items use larger words
• For MIPS, a word is 32 bits or 4 bytes.
• 2^32 bytes with byte addresses from 0 to 2^32-1
• 2^30 words with byte addresses 0, 4, 8, ... 2^32-4
o i.e., words are aligned
o what are the 2 least significant bits of a word address?
Address   Contents
0         32 bits of data
4         32 bits of data
8         32 bits of data
12        32 bits of data
...
Registers correspondingly hold 32 bits of data
The Endian Question
• MIPS can also load and store 4-byte words and 2-byte halfwords.
• The endian question: when you read a word, in what order do the bytes
appear?
• Little Endian: Intel, DEC, et al.
• Big Endian: Motorola, IBM, Sun, et al.
• MIPS can do either
• SPIM adopts its host’s convention
Big Endian (bit 31 down to bit 0): byte 0, byte 1, byte 2, byte 3
Little Endian (bit 31 down to bit 0): byte 3, byte 2, byte 1, byte 0
The Endian Question
x = 0x01234567
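The two byte orders for this value can be checked with a short sketch. It uses Python's standard `struct` module, which is not part of the slides, and lists the bytes in increasing memory-address order.

```python
import struct

x = 0x01234567

# Pack the 32-bit word in each byte order; index 0 is the lowest address.
big = struct.pack(">I", x)     # big endian: most significant byte first
little = struct.pack("<I", x)  # little endian: least significant byte first

print([f"{b:02x}" for b in big])     # ['01', '23', '45', '67']
print([f"{b:02x}" for b in little])  # ['67', '45', '23', '01']
```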
Load/Store Instructions
• Load and store instructions
• Example:
C code: A[8] = h + A[8];
MIPS code (load): lw $t0, 32($s3)
(arithmetic): add $t0, $s2, $t0
(store): sw $t0, 32($s3)
• Load word has destination first, store has destination last
• Remember MIPS arithmetic operands are registers, not memory
locations
o therefore, words must first be moved from memory to registers using
loads before they can be operated on; then result can be stored back to
memory
(In lw $t0, 32($s3): 32 is the offset, $s3 holds the base address, and $t0 receives the value.)
So far we’ve learned:
• MIPS
o loading words but addressing bytes
o arithmetic on registers only
• Instruction Meaning
add $s1, $s2, $s3 $s1 = $s2 + $s3
sub $s1, $s2, $s3 $s1 = $s2 – $s3
lw $s1, 100($s2) $s1 = Memory[$s2+100]
sw $s1, 100($s2) Memory[$s2+100]= $s1
• Try: Find the assembly code for B[8] = A[i] + A[j];
The base addresses of A and B are in $s6 and $s7 respectively
$s0-$s4 hold the values f-j
Exercise
Q: For the following C statement, what is the corresponding
MIPS assembly code? Assume that the variables f, g, h,
and i are given and could be considered 32-bit integers as
declared in a C program. Use a minimal number of MIPS
assembly instructions. f = g + (h − 5);
Solution:
f -> $s1, g -> $s2, h -> $s3
addi $t0, $s3,-5
add $s1, $s2, $t0
Representing Instructions
in the Computer
• Instruction format:
o A form of representation of an instruction
composed of fields of binary numbers.
• All MIPS instructions are 32 bits long.
• Three types of instruction formats:
o R-type (for register) or R-format
o I-type (for immediate) or I-format
o J-type (for jump) or J-format
R-type (for register)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• rs: The first register source operand
• rt: The second register source operand
• rd: The register destination operand
• shamt: Shift amount
• funct: Function. It selects the specific variant of the
operation in the op field. (function code)
Ex: add $t0, $s1, $s2
I-type (for immediate)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• rs: The register source operand
• rt: destination register, which receives the result of the
load
• constant or address: It contains 16 bit constant or
address value.
I-type (for immediate)
• MIPS fields:
Ex: addi $t1, $s0, 10
lw $t0, 40($s4)
bne $s5,$s6, 100
J-type (for jump)
• MIPS fields:
• op: Basic operation of the instruction (opcode)
• address: It contains 26 bit address value.
• Ex:
j 10000
Instruction formats for
MIPS architecture
MIPS instruction
encoding
MIPS Registers
Mapping register names
to register numbers
t0 t1 t2 t3 t4 t5 t6 t7
8 9 10 11 12 13 14 15
s0 s1 s2 s3 s4 s5 s6 S7
16 17 18 19 20 21 22 23
Translating a MIPS Assembly
Instruction into a Machine Instruction
Given instruction: add $t0,$s1,$s2
• Solution:
• Identify the instruction format: R-type
• Format: Operation rd, rs, rt
• rs -> $s1, rt -> $s2, rd -> $t0, shamt – NA
• op -> 0, funct -> 32
• Decimal representation:
op rs rt rd shamt funct
0 17 18 8 0 32
• Binary representation:
op rs rt rd shamt funct
000000 10001 10010 01000 00000 100000
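The field packing above can be reproduced with a small sketch. The helper name `encode_r` and the `REG` table are illustrative assumptions, not part of the slides; the register numbers follow the mapping table shown earlier.

```python
# Register numbers from the mapping table ($t0-$t7 = 8-15, $s0-$s7 = 16-23).
REG = {f"$t{i}": 8 + i for i in range(8)}
REG.update({f"$s{i}": 16 + i for i in range(8)})

def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2  ->  op = 0, funct = 32
word = encode_r(0, REG["$s1"], REG["$s2"], REG["$t0"], 0, 32)
print(f"{word:032b}")  # 00000010001100100100000000100000
```

Reading the printed bits in groups of 6, 5, 5, 5, 5 and 6 gives back the binary row of the table.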
Exercise
Q: Translate the following MIPS Assembly code
into binary code.
sub $t3,$s4,$s5
op rs rt rd shamt funct
0 20 21 11 0 34
000000 10100 10101 01011 00000 100010
Translating a MIPS Assembly
Instruction into a Machine Instruction
Given instruction: lw $t0,32($s3)
• Solution:
• Identify the instruction format: I-type
• Format: Operation rt, address(rs)
• rs -> $s3, rt -> $t0, immediate -> 32
• Decimal representation:
op rs rt address
35 19 8 32
• Binary representation:
op rs rt address
100011 10011 01000 0000 0000 0010 0000
Exercise
Q: Translate the following MIPS Assembly code
into binary code.
sw $t2,58($s5)
101011 10101 01010 0000 0000 0011 1010
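Both I-type encodings above (lw and sw) can be checked with a sketch like the following. The helper names `encode_i` and `fields` are hypothetical; the opcodes 35 and 43 and the register numbers are the ones used in the slides.

```python
def encode_i(op, rs, rt, imm):
    """Pack the I-type fields: op (6 bits), rs (5), rt (5), 16-bit immediate."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def fields(word):
    """Format a 32-bit word as 'op rs rt immediate' bit groups."""
    s = f"{word:032b}"
    return f"{s[:6]} {s[6:11]} {s[11:16]} {s[16:]}"

lw_word = encode_i(35, 19, 8, 32)   # lw $t0, 32($s3)
sw_word = encode_i(43, 21, 10, 58)  # sw $t2, 58($s5)

print(fields(lw_word))  # 100011 10011 01000 0000000000100000
print(fields(sw_word))  # 101011 10101 01010 0000000000111010
```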
Translating High level Language
into Machine Language
Q: Consider the following high level statement
A[300] = h + A[300];
If $t1 has the base of the array A and $s2 corresponds to
h, What is the MIPS machine language code?
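One possible answer follows the pattern of the earlier A[8] example: load A[300], add h, store the result back. The sketch below assumes $t0 as the temporary register and a byte offset of 300 × 4 = 1200; the helper names are illustrative only.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-type fields into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm):
    """Pack I-type fields into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

T0, T1, S2 = 8, 9, 18   # register numbers of $t0, $t1, $s2
OFFSET = 300 * 4        # A[300] is 1200 bytes past the base in $t1

program = [
    ("lw  $t0, 1200($t1)", encode_i(35, T1, T0, OFFSET)),
    ("add $t0, $s2, $t0", encode_r(0, S2, T0, T0, 0, 32)),
    ("sw  $t0, 1200($t1)", encode_i(43, T1, T0, OFFSET)),
]

for asm, word in program:
    print(f"{asm:20s} {word:032b}")
```

The lw and sw words differ only in the opcode field, since both use the same base register, temporary register and offset.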
Logical Operations
Shift operations
• Shifts allow bits to be moved around inside a register.
• Shift left logical
Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits
Machine Code:
op rs rt rd shamt funct
000000 00000 10000 01010 00100 000000
Shift Left Logical(sll)
• Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits
• If $s0=10
• Value of $t2=???
Shift operations
• Shift right logical
Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits
Machine Code:
op rs rt rd shamt funct
000000 00000 10011 01101 00010 000010
op rs rt rd shamt funct
0 0 19 13 2 2
Shift Right Logical(srl)
Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits
• If $s3=12
• Value of $t5=???
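The two shift questions above can be worked through with a sketch that models 32-bit registers in Python. The function names `sll` and `srl` mirror the mnemonics but are otherwise hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register

def sll(value, shamt):
    """Shift left logical: bits shifted past bit 31 are discarded."""
    return (value << shamt) & MASK32

def srl(value, shamt):
    """Shift right logical: zeros are shifted in from the left."""
    return (value & MASK32) >> shamt

print(sll(10, 4))  # 160, i.e. 10 * 2**4
print(srl(12, 2))  # 3, i.e. 12 // 2**2
```

Shifting left by n bits multiplies by 2^n and shifting right by n bits divides an unsigned value by 2^n, as long as no significant bits are lost.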
Logical Operations –
AND, OR & NOT
• A logical bit-by-bit operation with two operands.
• Ex:
and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2
or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2
nor $t0,$t1,$t3 # reg $t0 = ~ (reg $t1 | reg $t3)
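A sketch of the three operations on 32-bit values is given below. The operand values are arbitrary samples chosen for illustration. MIPS has no separate NOT instruction; nor with a zero operand produces the bitwise NOT of the other operand.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers

def mips_and(a, b):
    return a & b & MASK32

def mips_or(a, b):
    return (a | b) & MASK32

def mips_nor(a, b):
    return ~(a | b) & MASK32

t1 = 0x00003C00  # sample value for $t1
t2 = 0x00000DC0  # sample value for $t2
t3 = 0x00000000  # $t3 = 0 turns nor into NOT

print(f"{mips_and(t1, t2):08x}")  # 00000c00
print(f"{mips_or(t1, t2):08x}")   # 00003dc0
print(f"{mips_nor(t1, t3):08x}")  # ffffc3ff
```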
Example
Instructions for Making
Decisions
• Sequences that allow programs to execute statements in order
one after another.
• Branches that allow programs to jump to other points in a
program.
• Loops that allow a program to execute a fragment of code
multiple times.
• MIPS Instructions:
beq register1, register2, L1
bne register1, register2, L1
• beq and bne are mnemonics
• Conditional branches
Instructions for Making
Decisions
Q: In the following code segment, f, g, h, i, and j are
variables. If the five variables f through j correspond to the
five registers $s0 through $s4, what is the compiled MIPS
code for this C if statement?
if (i == j) f = g + h; else f = g - h;
Instructions for Making
Decisions
• Solution:
Instructions for Making
Decisions
High level code:
if (i == j)
f = g + h;
else
f = g - h;
MIPS code:
bne $s3,$s4,Else # go to Else if i ≠ j
add $s0,$s1,$s2 # f = g + h (skipped if i ≠ j)
j Exit # go to Exit
Else: sub $s0,$s1,$s2 # f = g - h (skipped if i = j)
Exit:
Compiling a while Loop
in C
while (save[i] == k)
i += 1;
Assume that i and k correspond to registers $s3 and $s5
and the base of the array save is in $s6. What is the MIPS
assembly code corresponding to this C segment?
Compiling a while Loop
in C
while (save[i] == k)
i += 1;
1. Load save[i] into a temporary register
o add i × 4 to the base of array save to form the address
2. Perform the loop test
o go to Exit if save[i] ≠ k
3. Add 1 to i
4. Jump back to the while test at the top of the loop
5. Exit
Solution:
Loop: sll $t1,$s3,2 # Temp reg $t1 = i * 4
add $t1,$t1,$s6 # $t1 = address of save[i]
lw $t0,0($t1) # Temp reg $t0 = save[i]
bne $t0,$s5, Exit # go to Exit if save[i] ≠ k
addi $s3,$s3,1 # i = i + 1
j Loop # go to Loop
Exit:
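To trace what the loop does, the following sketch mirrors each MIPS instruction with one Python statement. The dictionary `memory`, the base address 1000 and the array contents are hypothetical test data, not values from the slides.

```python
def run_loop(memory, save_base, i, k):
    """Mirror the MIPS loop; memory maps byte address -> 32-bit word."""
    while True:
        t1 = i << 2          # sll  $t1,$s3,2   : i * 4
        t1 = t1 + save_base  # add  $t1,$t1,$s6 : address of save[i]
        t0 = memory[t1]      # lw   $t0,0($t1)  : save[i]
        if t0 != k:          # bne  $t0,$s5,Exit
            break
        i += 1               # addi $s3,$s3,1
                             # j    Loop
    return i

base = 1000                   # assumed base address of save
save = [7, 7, 7, 3]           # assumed array contents
memory = {base + 4 * idx: v for idx, v in enumerate(save)}

print(run_loop(memory, base, 0, 7))  # 3
```

The loop stops at the first element that differs from k, so i ends at 3 for this data.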
MIPS Addressing Mode
• The different ways of specifying the locations
of instruction operands are known as
addressing modes.
• The MIPS addressing modes are the following:
1. Immediate addressing mode
2. Register addressing mode
3. Base or displacement addressing mode
4. PC-relative addressing mode
5. Pseudodirect addressing mode
Immediate addressing mode
• Def:
o the operand is a constant within the instruction itself
• Ex:
o addi $s1, $s2, 20 #$s1=$s2+20
• Illustration:
Register addressing mode
• Def:
o the source and destination operands are all held in
processor registers
o Direct addressing mode
• Ex:
o add $s1, $s2, $s3 #$s1=$s2+$s3
• Illustration:
Base or displacement
addressing mode
• Def:
o the operand is at the memory location whose address is the
sum of a register and a constant in the instruction
o Indirect addressing mode
• Ex:
o lw $s1, 20 ($s3) #$s1= Memory[$s3+20]
• Illustration:
PC-relative addressing mode
• Def:
o the branch address is the sum of the PC and a constant in
the instruction
• Ex:
o bne $s4, $s5, 25 # if ($s4 != $s5), go to
PC = 12 + 4 + 100 = 116 (assuming the branch is at address 12; the offset of 25 words is 100 bytes)
• Illustration:
Pseudodirect addressing
mode
• Def:
o the jump address is the 26 bits of the instruction
concatenated with the upper bits of the PC
• Ex:
o j 1000
• Illustration:
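The target-address arithmetic for the last two modes can be sketched as follows. The function names are hypothetical, and `addr26` is assumed to be the raw 26-bit field stored in the instruction (a word address), not the byte address written in assembly.

```python
def branch_target(pc, offset_words):
    """PC-relative: target = (PC + 4) + offset in words * 4."""
    return pc + 4 + (offset_words << 2)

def jump_target(pc, addr26):
    """Pseudodirect: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)

print(branch_target(12, 25))  # 116 = 12 + 4 + 100
print(jump_target(0, 2500))   # 10000: a field value of 2500 words is byte address 10000
```

Because instructions are word aligned, both modes store a word count and the hardware appends two zero bits, which extends the reach of the 16-bit and 26-bit fields by a factor of four.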
Decoding Machine Code
• Q: What is the assembly language statement
corresponding to this machine instruction?
00af8020hex
Solution:
converting hexadecimal to binary
Binary instruction format
Assembly instruction
Translating Machine Language
to Assembly Language
• Translate the following machine language code into
assembly language.
0x02F34022
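Both decoding questions can be checked with a small R-type decoder. This is a minimal sketch that handles only add and sub; the table and function names are assumptions for illustration, while the register names follow the standard MIPS numbering.

```python
REG_NAMES = (
    ["$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3"]
    + [f"$t{i}" for i in range(8)]
    + [f"$s{i}" for i in range(8)]
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)
FUNCT = {32: "add", 34: "sub"}  # only the two functions needed here

def decode_r(word):
    """Split a 32-bit word into R-type fields and rebuild the assembly text."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    if op != 0 or funct not in FUNCT:
        raise ValueError("not an add/sub R-type instruction")
    return f"{FUNCT[funct]} {REG_NAMES[rd]},{REG_NAMES[rs]},{REG_NAMES[rt]}"

print(decode_r(0x00AF8020))  # add $s0,$a1,$t7
print(decode_r(0x02F34022))  # sub $t0,$s7,$s3
```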
Performance
• Performance is the key to understanding the underlying motivation for
the hardware and its organization
• Measure, report, and summarize performance to enable users to
o make intelligent choices
o see through the marketing hype!
• Why is some hardware better than others for different programs?
• What factors of system performance are hardware related?
(e.g., do we need a new machine, or a new operating system?)
• How does the machine's instruction set affect performance?
Computer Performance:
TIME, TIME, TIME!!!
• Response Time (elapsed time, latency): individual user concerns
o how long does it take for my job to run?
o how long does it take to execute (start to
finish) my job?
o how long must I wait for the database query?
• Throughput: systems manager concerns
o how many jobs can the machine run at once?
o what is the average execution rate?
o how much work is getting done?
• If we upgrade a machine with a new processor what do we increase?
• If we add a new machine to the lab what do we increase?
Execution Time
• Elapsed Time
o counts everything (disk and memory accesses, waiting for I/O, running
other programs, etc.) from start to finish
o a useful number, but often not good for comparison purposes
elapsed time = CPU time + wait time (I/O, other programs, etc.)
• CPU time
o doesn't count waiting for I/O or time spent running other programs
o can be divided into user CPU time and system CPU time (OS calls)
CPU time = user CPU time + system CPU time
Therefore, elapsed time = user CPU time + system CPU time + wait time
• Our focus: user CPU time (CPU execution time or, simply, execution
time)
o time spent executing the lines of code that are in our program
Definition of Performance
• For some program running on machine X:
Performance_X = 1 / Execution time_X
• If there are two machines X and Y, and the performance of X is greater than the
performance of Y, then
Performance_X > Performance_Y
i.e., 1 / Execution time_X > 1 / Execution time_Y
• X is n times faster than Y means:
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
Q: If computer A runs a program in 10 sec
and computer B runs the same program in
15 sec, how much faster is A than B?
• We know that,
Performance_A / Performance_B
= Execution time_B / Execution time_A = n
Thus the performance ratio is,
Execution time_B / Execution time_A = 15 / 10 = 1.5
i.e., Performance_A / Performance_B = 1.5
Therefore, A is 1.5 times faster than B.
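The same ratio can be computed in a couple of lines. The function name `times_faster` is a hypothetical helper for illustration.

```python
def times_faster(time_slow, time_fast):
    """Performance ratio n = Execution time of slower / Execution time of faster."""
    return time_slow / time_fast

time_a = 10.0  # seconds on computer A
time_b = 15.0  # seconds on computer B

n = times_faster(time_b, time_a)
print(n)  # 1.5
```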
Clock Cycles
• Instead of reporting execution time in seconds, we often use cycles.
In modern computers hardware events progress cycle by cycle: in
other words, each event, e.g., multiplication, addition, etc., is a
sequence of cycles
• Clock ticks indicate start and end of cycles:
• cycle time = time between ticks = seconds per cycle
• clock rate (frequency) = clock cycles per second (1 Hz = 1
cycle/sec, 1 MHz = 10^6 cycles/sec)
• Example: A 200 MHz clock has a cycle time of ????
(Figure: clock ticks along a time axis; each cycle runs from one tick to the next)
seconds/program = (cycles/program) × (seconds/cycle)
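The 200 MHz question follows directly from cycle time = 1 / clock rate. A minimal sketch, with a hypothetical helper name:

```python
def cycle_time_seconds(clock_rate_hz):
    """Cycle time is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz

t = cycle_time_seconds(200e6)  # 200 MHz = 200 * 10^6 cycles/sec
print(f"{t * 1e9:.1f} ns")     # 5.0 ns
```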
Performance Equation I
• So, to improve performance one can either:
o reduce the number of cycles for a program, or
o reduce the clock cycle time, or, equivalently,
o increase the clock rate
seconds/program = (cycles/program) × (seconds/cycle)
CPU execution time for a program = CPU clock cycles for a program × Clock cycle time
or, equivalently,
CPU execution time for a program = CPU clock cycles for a program / Clock rate
Our favorite program runs in 10 seconds on computer A, which has a 2
GHz clock. We are trying to help a computer designer build a computer,
B, which will run this program in 6 seconds. The designer has determined
that a substantial increase in the clock rate is possible, but this increase
will affect the rest of the CPU design, causing computer B to require 1.2
times as many clock cycles as computer A for this program. What clock
rate should we tell the designer to target?
CPU time_A = CPU clock cycles_A / clock rate_A
10 sec = CPU clock cycles_A / (2 × 10^9 cycles/sec)
CPU clock cycles_A = 10 sec × 2 × 10^9 cycles/sec
= 20 × 10^9 cycles
CPU time_B = 1.2 × CPU clock cycles_A / clock rate_B
6 sec = 1.2 × 20 × 10^9 cycles / clock rate_B
clock rate_B = 1.2 × 20 × 10^9 cycles / 6 sec = 4 × 10^9 Hz
To run the program in 6 sec, B must have a clock rate of 4 × 10^9 Hz (4 GHz)
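The calculation can be replayed step by step. The variable names below are illustrative; the numbers are the ones given in the problem statement.

```python
time_a = 10.0       # seconds on computer A
rate_a = 2e9        # A's clock rate: 2 GHz
time_b = 6.0        # target time on computer B
cycle_factor = 1.2  # B needs 1.2 times as many clock cycles as A

cycles_a = time_a * rate_a         # CPU clock cycles on A
cycles_b = cycle_factor * cycles_a # CPU clock cycles on B
rate_b = cycles_b / time_b         # clock rate = cycles / time

print(f"{cycles_a:.2e} cycles on A")  # 2.00e+10 cycles on A
print(f"{rate_b / 1e9:.1f} GHz")      # 4.0 GHz
```

B needs twice A's clock rate to finish in 6 seconds, because it also executes 20% more cycles.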
Instruction Performance
• The previous equation makes no reference to the number of instructions
• The execution time depends on the number of
instructions in the program
Clock cycles per instruction (CPI)
• Average number of clock cycles per instruction for a
program or program fragment
Suppose we have two implementations of the same instruction
set architecture. Computer A has a clock cycle time of 250 ps
and a CPI of 2.0 for some program, and computer B has a
clock cycle time of 500 ps and a CPI of 1.2 for the same
program. Which computer is faster for this program and by
how much?
• The same number of instructions is executed on both computers
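A worked sketch of this comparison is shown below. Because both computers execute the same number of instructions, the instruction count cancels out of the ratio; the value chosen for it here is an arbitrary assumption.

```python
instruction_count = 1_000_000  # arbitrary; cancels out of the ratio

# CPU time = instruction count * CPI * clock cycle time (in picoseconds here)
time_a = instruction_count * 2.0 * 250  # computer A: CPI 2.0, 250 ps cycle
time_b = instruction_count * 1.2 * 500  # computer B: CPI 1.2, 500 ps cycle

ratio = time_b / time_a
print(time_a / instruction_count)  # 500.0 ps per instruction on A
print(time_b / instruction_count)  # 600.0 ps per instruction on B
print(round(ratio, 2))             # 1.2
```

Under these figures computer A is 1.2 times faster than computer B for this program, despite its higher CPI.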
Instruction Performance
CPU execution time for a program = Instruction count × average CPI × Clock cycle time
or
CPU execution time for a program = Instruction count × average CPI / Clock rate
Instruction Performance
Which code sequence
executes the most instructions?
• Sequence 1 executes
2 + 1 + 2 = 5 instructions
• Sequence 2 executes
4 + 1 + 1 = 6 instructions
Sequence 2 executes the larger number of instructions
Which will be faster?
• Code sequence 2 is faster, because it takes fewer clock cycles
What is the CPI for each
sequence?
• Sequence 2 has the lower CPI: it takes fewer clock cycles
even though it executes more instructions
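The per-class CPI values did not survive in the text of these slides. The sketch below assumes three instruction classes A, B and C with CPIs of 1, 2 and 3, which is the setup of the corresponding textbook example; treat those values as an assumption. The instruction counts are the ones listed above.

```python
CPI = {"A": 1, "B": 2, "C": 3}  # assumed cycles per instruction for each class

seq1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
seq2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

def total_cycles(counts):
    """Clock cycles = sum over classes of (CPI of class * instruction count)."""
    return sum(CPI[c] * n for c, n in counts.items())

def average_cpi(counts):
    """Average CPI = total clock cycles / total instruction count."""
    return total_cycles(counts) / sum(counts.values())

print(total_cycles(seq1), average_cpi(seq1))  # 10 2.0
print(total_cycles(seq2), average_cpi(seq2))  # 9 1.5
```

Under these assumed CPIs, sequence 2 needs 9 cycles against 10 for sequence 1, which matches the conclusion that it is faster and has the lower CPI.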
Basic components of
Performance
Factors affecting
Performance
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 

Basic Structure of a Computer System

  • 1. Velammal Engineering College Department of Computer Science and Engineering Welcome… Slide Sources: Patterson & Hennessy COD book website (copyright Morgan Kaufmann) adapted and supplemented Mr. A. Arockia Abins & Ms. R. Amirthavalli, Asst. Prof, CSE, Velammal Engineering College
  • 2. Course Objectives • This course aims to teach the basic structure and operations of a computer. • The course covers the ALU, pipelined execution, parallelism and multi-core processors. • The course will enable students to understand memory hierarchies, cache memories and virtual memories.
  • 3. Course Outcomes CO 1 Discuss the basic structure of computers, operations and instructions. CO 2 Design arithmetic and logic unit. CO 3 Analyze pipelined execution and design control unit. CO 4 Analyze parallel processing architectures. CO 5 Examine the performance of various memory systems. CO 6 Organize the various I/O communications.
  • 4. Syllabus Unit Titles: • Unit I Basic Structure of a Computer System • Unit II Arithmetic for Computers • Unit III Processor and Control Unit • Unit IV Parallelism • Unit V Memory & I/O Systems
  • 5. Syllabus – Unit I UNIT-I BASIC STRUCTURE OF A COMPUTER SYSTEM Functional Units – Basic operational concepts –– Instructions: Operations, Operands – Instruction representation – Instruction Types – MIPS addressing, Performance
  • 6. Syllabus – Unit II UNIT-II ARITHMETIC FOR COMPUTERS Addition and Subtraction – Multiplication – Division – Floating Point Representation – Floating Point Addition and Subtraction.
  • 7. Syllabus – Unit III UNIT-III PROCESSOR AND CONTROL UNIT A Basic MIPS implementation – Building a Datapath – Control Implementation Scheme – Pipelining – Pipelined datapath and control – Handling Data Hazards & Control Hazards.
  • 8. Syllabus – Unit IV UNIT-IV PARALLELISM Introduction to Multicore processors and other shared memory multiprocessors – Flynn’s classification: SISD, MIMD, SIMD, SPMD and Vector – Hardware multithreading – GPU architecture.
  • 9. Syllabus – Unit V • UNIT-V MEMORY & I/O SYSTEMS Memory Hierarchy – memory technologies – Cache Memory – Performance Considerations, Virtual Memory,TLB’s – Accessing I/O devices – Interrupts – Direct Memory Access – Bus Structure – Bus operation.
  • 10. Text Books • Book 1: o Name: Computer Organization and Design: The Hardware/Software Interface o Authors: David A. Patterson and John L. Hennessy o Publisher: Morgan Kaufmann / Elsevier o Edition: Fifth Edition, 2014 • Book 2: o Name: Computer Organization and Embedded Systems Interface o Authors: Carl Hamacher, Zvonko Vranesic, Safwat Zaky and Naraig Manjikian o Publisher: Tata McGraw Hill o Edition: Sixth Edition, 2012
  • 11. Introduction • What is meant by Computer Architecture? Hardware parts Instruction set Interface between hardware & software
  • 12. Introduction ISA: a+b -> add a,b ->000100110101010
  • 13. Instruction Set Architecture (ISA) ISA: The interface or contract between the hardware and the software. Rules about how to code and interpret machine instructions: Execution model (program counter) Operations (instructions) Data formats (sizes, addressing modes) Processor state (registers) Input and Output (memory, etc.)
  • 14. Introduction • What is meant by Computer Architecture? Computer architecture encompasses the specification of an instruction set and the functional behavior of the hardware units that implement the instructions.
  • 17. UNIT-I BASIC STRUCTURE OF A COMPUTER SYSTEM Topics: • Functional Units • Basic operational concepts • Instructions: Operations, Operands • Instruction representation • Instruction Types • MIPS addressing mode • Performance
  • 20. Functional Units • Input unit • Output unit • Memory unit • Arithmetic Logic unit • Control unit
  • 26. Functional Units Arithmetic & Logic unit and Control unit
  • 28. Connection between the processor and the main memory Code Snippet: Load R2, LOC Add R4, R3, R2 Store LOC, R4
  • 29. IR & PC • Instruction Register: The instruction register (IR) holds the instruction that is currently being executed. • Program Counter: The program counter (PC) contains the memory address of the next instruction to be fetched and executed.
  • 30. Memory Locations and Addresses
  • 31. Examples of encoded information in a 32-bit word.
  • 35. Machine vs Assembly Language • Machine language: a particular set of instructions that the CPU can directly execute, written as ones and zeros. Ex: 0100001010101 • Assembly language: a symbolic version of the equivalent machine language. Ex: add a,b
  • 37. Instructions • Instruction Set: o The vocabulary of commands understood by a given architecture. • Some ISAs: o ARM o Intel x86 o IBM Power o MIPS o SPARC • Different CPUs implement different sets of instructions.
  • 38. MIPS MIPS - Microprocessor without Interlocked Pipeline Stages Features: • five-stage execution pipeline: fetch, decode, execute, memory-access, write-result • regular instruction set, all instructions are 32-bit • three-operand arithmetical and logical instructions • 32 general-purpose registers of 32 bits each • only the load and store instructions access memory • flat address space of 4 GBytes of main memory (2^32 bytes)
  • 39. MIPS Assembly Language • Categories: o Arithmetic – Only processor and registers involved (sum of two registers) o Data transfer – Interacts with memory (load and store) o Logical – Only processor and registers involved (and, sll) o Conditional branch – Change flow of execution (branch instructions) o Unconditional jump – Change flow of execution (jump to a subroutine)
  • 43. Load & Store Instructions • Load: o Transfer data from memory to a register • Store: o Transfer data from a register to memory • The memory address must be specified by load and store [Figure: LOAD moves data from memory to the processor; STORE moves data from the processor to memory]
  • 48. MIPS Arithmetic • All MIPS arithmetic instructions have 3 operands • Operand order is fixed (e.g., destination first) • Example: C code: A = B + C MIPS code: add $s0, $s1, $s2 compiler’s job to associate variables with registers
  • 49. MIPS Arithmetic • Design Principle 1: simplicity favors regularity. Translation: Regular instructions make for simple hardware! • Simpler hardware reduces design time and manufacturing cost. • Of course this complicates some things...
      C code: A = B + C + D; E = F - A;
      MIPS code (arithmetic):
        add $t0, $s1, $s2
        add $s0, $t0, $s3
        sub $s4, $s5, $s0
    • Performance penalty: high-level code translates to denser machine code. Allowing a variable number of operands would simplify the assembly code but complicate the hardware.
  • 50. MIPS Arithmetic Register assignment: a, b, c, f, g, h, i, j are held in $s0, $s1, $s2, $s3, $s4, $s5, $s6, $s7.
      a = b - c;
        sub $s0, $s1, $s2
      f = (g + h) – (i + j);
        add $t0, $s4, $s5
        add $t1, $s6, $s7
        sub $s3, $t0, $t1
      Try: 1. f = g + (h – 5) 2. f = (i + j) – (k – 20)
  • 51. Registers vs. Memory • Operands of arithmetic instructions must be in registers o MIPS has 32 registers • The compiler associates variables with registers • What about programs with lots of variables (arrays, etc.)? Use memory, with load/store operations to transfer data between memory and registers; if there are not enough registers, spill registers to memory • MIPS is a load/store architecture [Figure: processor (control and datapath), memory, and I/O (input and output)]
  • 52. Memory Organization • Viewed as a large single-dimension array with access by address • A memory address is an index into the memory array • Byte addressing means that the index points to a byte of memory, and that the unit of memory accessed by a load/store is a byte [Figure: byte addresses 0, 1, 2, 3, 4, 5, 6, ..., each holding 8 bits of data]
  • 53. Memory Organization • Bytes are load/store units, but most data items use larger words • For MIPS, a word is 32 bits or 4 bytes. • 2^32 bytes with byte addresses from 0 to 2^32-1 • 2^30 words with byte addresses 0, 4, 8, ... 2^32-4 o i.e., words are aligned o what are the 2 least significant bits of a word address? [Figure: word addresses 0, 4, 8, 12, ..., each holding 32 bits of data] Registers correspondingly hold 32 bits of data
  • 54. The Endian Question MIPS can also load and store 4-byte words and 2-byte halfwords. The endian question: when you read a word, in what order do the bytes appear? Little Endian: Intel, DEC, et al. Big Endian: Motorola, IBM, Sun, et al. MIPS can do either; SPIM adopts its host’s convention. [Figure: bit 31 to bit 0 of a word. Big Endian: byte 0, byte 1, byte 2, byte 3. Little Endian: byte 3, byte 2, byte 1, byte 0]
  • 55. The Endian Question x = 0x01234567
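The byte order for the word x = 0x01234567 can be checked with a short Python sketch. It uses the standard `struct` module; the helper name `byte_layout` is made up for this illustration and is not part of the slides.

```python
import struct

x = 0x01234567

# Pack the 32-bit word in both byte orders; index 0 is the lowest address.
big = struct.pack(">I", x)     # big endian: most significant byte first
little = struct.pack("<I", x)  # little endian: least significant byte first

def byte_layout(data):
    """Return the bytes as two-digit hex strings, lowest address first."""
    return [f"{b:02x}" for b in data]

print("big endian   :", byte_layout(big))     # 01 23 45 67
print("little endian:", byte_layout(little))  # 67 45 23 01
```

The same four bytes are stored either way; only the address at which each byte lands differs.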
  • 56. Load/Store Instructions • Load and store instructions • Example:
      C code: A[8] = h + A[8];
      MIPS code:
        lw $t0, 32($s3)     (load)
        add $t0, $s2, $t0   (arithmetic)
        sw $t0, 32($s3)     (store)
    • In 32($s3), 32 is the offset and $s3 holds the base address • Load word has destination first, store has destination last • Remember MIPS arithmetic operands are registers, not memory locations o therefore, words must first be moved from memory to registers using loads before they can be operated on; then the result can be stored back to memory
  • 57. So far we’ve learned: • MIPS o loading words but addressing bytes o arithmetic on registers only • Instruction and meaning:
      add $s1, $s2, $s3    $s1 = $s2 + $s3
      sub $s1, $s2, $s3    $s1 = $s2 – $s3
      lw $s1, 100($s2)     $s1 = Memory[$s2+100]
      sw $s1, 100($s2)     Memory[$s2+100] = $s1
    • Try: Find the assembly code of B[8]=A[i]+A[j]; the base addresses of A and B are available in $s6 and $s7 respectively; $s0-$s5 hold the values f-j
  • 58. Exercise Q: For the following C statement, what is the corresponding MIPS assembly code? Assume that the variables f, g, h, and i are given and could be considered 32-bit integers as declared in a C program. Use a minimal number of MIPS assembly instructions. f = g + (h − 5); Solution: f -> $s1, g -> $s2, h -> $s3
      addi $t0, $s3, -5
      add $s1, $s2, $t0
  • 59. Representing Instructions in the Computer • Instruction format: o A form of representation of an instruction composed of fields of binary numbers. • All MIPS instructions are 32 bits long. • Three types of instruction formats: o R-type (for register) or R-format o I-type (for immediate) or I-format o J-type (for jump) or J-format
  • 60. R-type (for register) • MIPS fields: • op: Basic operation of the instruction (opcode) • rs: The first register source operand • rt: The second register source operand • rd: The register destination operand • shamt: Shift amount • funct: Function. It selects the specific variant of the operation in the op field. (function code) Ex: add $t0, $s1, $s2
  • 61. I-type (for immediate) • MIPS fields: • op: Basic operation of the instruction (opcode) • rs: The register source operand • rt: destination register, which receives the result of the load • constant or address: It contains a 16-bit constant or address value.
  • 62. I-type (for immediate) • MIPS fields: Ex: addi $t1, $s0, 10 lw $t0, 40($s4) bne $s5,$s6, 100
  • 63. J-type (for jump) • MIPS fields: • op: Basic operation of the instruction (opcode) • address: It contains 26 bit address value. • Ex: j 10000
  • 67. Mapping register names to register numbers
      $t0 $t1 $t2 $t3 $t4 $t5 $t6 $t7 -> 8 9 10 11 12 13 14 15
      $s0 $s1 $s2 $s3 $s4 $s5 $s6 $s7 -> 16 17 18 19 20 21 22 23
  • 68. Translating a MIPS Assembly Instruction into a Machine Instruction Given instruction: add $t0,$s1,$s2 • Solution: • Identify the instruction format: R-type • Format: Operation rd, rs, rt • rs -> $s1, rt -> $s2, rd -> $t0, shamt – NA • op -> 0, funct -> 32
      • Decimal representation: op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32
      • Binary representation: op = 000000, rs = 10001, rt = 10010, rd = 01000, shamt = 00000, funct = 100000
  • 69. Exercise Q: Translate the following MIPS Assembly code into binary code. sub $t3,$s4,$s5
      Decimal: op = 0, rs = 20, rt = 21, rd = 11, shamt = 0, funct = 34
      Binary: 000000 10100 10101 01011 00000 100010
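The field-by-field translation of an R-type instruction can be sketched in Python. This is an illustrative sketch only: the names `REGISTERS`, `FUNCT` and `encode_r_type` are made up for this example, and the table covers just the registers and the two funct codes (add = 32, sub = 34) that appear in these slides.

```python
# Register numbers from the mapping slide: $t0-$t7 -> 8-15, $s0-$s7 -> 16-23.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})

# funct codes for the two R-type instructions used in these slides.
FUNCT = {"add": 32, "sub": 34}

def encode_r_type(mnemonic, rd, rs, rt, shamt=0):
    """Pack op | rs | rt | rd | shamt | funct into one 32-bit word."""
    op = 0  # every R-type instruction here has opcode 0
    return ((op << 26) | (REGISTERS[rs] << 21) | (REGISTERS[rt] << 16)
            | (REGISTERS[rd] << 11) | (shamt << 6) | FUNCT[mnemonic])

# add $t0,$s1,$s2: operands are written rd, rs, rt in assembly.
word = encode_r_type("add", "$t0", "$s1", "$s2")
print(f"{word:032b}")  # 00000010001100100100000000100000
```

Note that the assembly operand order (rd, rs, rt) differs from the order of the fields in the machine word (rs, rt, rd).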
  • 70. Exercise Q: Translate the following MIPS Assembly code into binary code. sub $t3,$s4,$s5 000000 10100 10101 01011 00000 100010
  • 71. Translating a MIPS Assembly Instruction into a Machine Instruction Given instruction: lw $t0,32($s3) • Solution: • Identify the instruction format: I-type • Format: Operation rt, address(rs) • rs -> $s3, rt -> $t0, immediate -> 32
      • Decimal representation: op = 35, rs = 19, rt = 8, address = 32
      • Binary representation: op = 100011, rs = 10011, rt = 01000, address = 0000 0000 0010 0000
  • 72. Exercise Q: Translate the following MIPS Assembly code into binary code. sw $t2,58($s5) 101011 10101 01010 0000 0000 0011 1010
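The I-type translation for lw and sw can be sketched the same way. The names `REGISTERS`, `OPCODE` and `encode_i_type` are hypothetical helpers for this illustration; only the opcodes 35 (lw) and 43 (sw) used in these slides are included.

```python
# Register numbers from the mapping slide: $t0-$t7 -> 8-15, $s0-$s7 -> 16-23.
REGISTERS = {f"$t{i}": 8 + i for i in range(8)}
REGISTERS.update({f"$s{i}": 16 + i for i in range(8)})

# Opcodes for the two data-transfer instructions used in these slides.
OPCODE = {"lw": 35, "sw": 43}

def encode_i_type(mnemonic, rt, offset, rs):
    """Pack op | rs | rt | 16-bit immediate into one 32-bit word."""
    imm = offset & 0xFFFF  # keep the low 16 bits (two's complement if negative)
    return ((OPCODE[mnemonic] << 26) | (REGISTERS[rs] << 21)
            | (REGISTERS[rt] << 16) | imm)

lw_word = encode_i_type("lw", "$t0", 32, "$s3")  # lw $t0,32($s3)
sw_word = encode_i_type("sw", "$t2", 58, "$s5")  # sw $t2,58($s5)
print(f"{lw_word:032b}")  # 10001110011010000000000000100000
print(f"{sw_word:032b}")  # 10101110101010100000000000111010
```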
  • 73. Translating High level Language into Machine Language Q: Consider the following high level statement A[300] = h + A[300]; If $t1 has the base of the array A and $s2 corresponds to h, What is the MIPS machine language code?
  • 75. Shift operations • Shifts allow bits to be moved around inside a register. • Shift left logical Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits
      Machine code: op = 000000, rs = 00000, rt = 10000, rd = 01010, shamt = 00100, funct = 000000
  • 76. Shift Left Logical(sll) • Example: sll $t2,$s0,4 # reg $t2 = reg $s0 << 4 bits • If $s0=10 • Value of $t2=???
  • 77. Shift operations • Shift right logical Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits
      Machine code (binary): op = 000000, rs = 00000, rt = 10011, rd = 01101, shamt = 00010, funct = 000010
      Machine code (decimal): op = 0, rs = 0, rt = 19, rd = 13, shamt = 2, funct = 2
  • 78. Shift Right Logical(srl) Example: srl $t5,$s3,2 # reg $t5 = reg $s3 >> 2 bits • If $s3=12 • Value of $t5=???
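The two shift questions above can be checked with a small Python sketch that models a 32-bit register. The helper names `sll` and `srl` simply mirror the MIPS mnemonics and the mask `MASK32` is an assumption of this sketch.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register

def sll(value, shamt):
    """Shift left logical: bits shifted past bit 31 are dropped."""
    return (value << shamt) & MASK32

def srl(value, shamt):
    """Shift right logical: zeros are shifted in from the left."""
    return (value & MASK32) >> shamt

print(sll(10, 4))  # $s0 = 10 -> $t2 = 160 (10 * 2^4)
print(srl(12, 2))  # $s3 = 12 -> $t5 = 3   (12 / 2^2)
```

Shifting left by n bits multiplies by 2^n, and shifting an unsigned value right by n bits divides by 2^n and discards the remainder.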
  • 79. Logical Operations – AND, OR & NOT • A logical bit-by-bit operation with two operands. • EX: and $t0,$t1,$t2 # reg $t0 = reg $t1 & reg $t2 or $t0,$t1,$t2 # reg $t0 = reg $t1 | reg $t2 nor $t0,$t1,$t3 # reg $t0 = ~ (reg $t1 | reg $t3)
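A minimal Python sketch of the three bit-by-bit operations follows. The register contents 0x00000DC0 and 0x00003C00 are example values chosen for this illustration and do not come from the slides; `nor` with a zero operand acts as NOT.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers

def mips_and(a, b):
    return a & b & MASK32

def mips_or(a, b):
    return (a | b) & MASK32

def mips_nor(a, b):
    # NOT of (a OR b); the mask turns Python's negative result into 32 bits.
    return ~(a | b) & MASK32

t1 = 0x00003C00  # example value, assumed for this sketch
t2 = 0x00000DC0  # example value, assumed for this sketch
t3 = 0           # nor with zero gives NOT $t1

print(f"{mips_and(t1, t2):#010x}")  # 0x00000c00
print(f"{mips_or(t1, t2):#010x}")   # 0x00003dc0
print(f"{mips_nor(t1, t3):#010x}")  # 0xffffc3ff
```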
  • 81. Instructions for Making Decisions • Sequences that allow programs to execute statements in order one after another. •  Branches that allow programs to jump to other points in a program. •  Loops that allow a program to execute a fragment of code multiple times. • MIPS Instructions: beq register1, register2, L1 bne register1, register2, L1 • beq and bne are mnemonics • Conditional branches
  • 82. Instructions for Making Decisions Q: In the following code segment, f, g, h, i, and j are variables. If the five variables f through j correspond to the five registers $s0 through $s4, what is the compiled MIPS code for this C if statement? if (i == j) f = g + h; else f = g - h;
  • 84. Instructions for Making Decisions High level code: if (i == j) f = g + h; else f = g - h; MIPS code:
            bne $s3,$s4,Else   # go to Else if i ≠ j
            add $s0,$s1,$s2    # f = g + h (skipped if i ≠ j)
            j Exit             # go to Exit
      Else: sub $s0,$s1,$s2    # f = g - h (skipped if i = j)
      Exit:
  • 85. Compiling a while Loop in C while (save[i] == k) i += 1; Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment?
  • 86. Compiling a while Loop in C while (save[i] == k) i += 1; 1. Load save[i] into a temporary register: first add i (multiplied by 4) to the base of array save to form the address. 2. Perform the loop test: go to Exit if save[i] ≠ k. 3. Add 1 to i. 4. Jump back to the while test at the top of the loop. 5. Exit.
  • 87. while (save[i] == k) i += 1; Assume that i and k correspond to registers $s3 and $s5 and the base of the array save is in $s6. What is the MIPS assembly code corresponding to this C segment? Solution:
      Loop: sll $t1,$s3,2      # Temp reg $t1 = i * 4
            add $t1,$t1,$s6    # $t1 = address of save[i]
            lw $t0,0($t1)      # Temp reg $t0 = save[i]
            bne $t0,$s5,Exit   # go to Exit if save[i] ≠ k
            addi $s3,$s3,1     # i = i + 1
            j Loop             # go to Loop
      Exit:
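To trace the loop by hand, the following Python sketch mirrors each MIPS instruction in one line. It is only a model: memory is a dictionary keyed by byte address, and the base address 1000 and the array contents are made-up values for this illustration.

```python
def run_while_loop(memory, save_base, i, k):
    """Mirror the MIPS loop above and return the final value of i."""
    while True:
        t1 = i << 2          # sll  $t1,$s3,2   -> byte offset i * 4
        t1 = t1 + save_base  # add  $t1,$t1,$s6 -> address of save[i]
        t0 = memory[t1]      # lw   $t0,0($t1)  -> save[i]
        if t0 != k:          # bne  $t0,$s5,Exit
            return i
        i += 1               # addi $s3,$s3,1 ; j Loop

# Hypothetical array save = [7, 7, 7, 3] stored at byte address 1000.
base = 1000
memory = {base + 4 * n: value for n, value in enumerate([7, 7, 7, 3])}

final_i = run_while_loop(memory, base, i=0, k=7)
print(final_i)  # 3: save[3] is the first element that differs from k
```

The shift by 2 is needed because MIPS addresses bytes while each array element is a 4-byte word.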
  • 88. MIPS Addressing Mode • The different ways for specifying the locations of instruction operands are known as addressing mode. • The MIPS addressing modes are the following: 1. Immediate addressing mode 2. Register addressing mode 3. Base or displacement addressing mode 4. PC-relative addressing mode 5. Pseudodirect addressing mode
  • 89. Immediate addressing mode • Def: o the operand is a constant within the instruction itself • Ex: o addi $s1, $s2, 20 #$s1=$s2+20 • Illustration:
  • 90. Register addressing mode • Def: o source and destination operands are registers which are available in processor registers. o Direct addressing mode • Ex: o add $s1, $s2, $s3 #$s1=$s2+$s3 • Illustration:
  • 91. Base or displacement addressing mode • Def: o the operand is at the memory location whose address is the sum of a register and a constant in the instruction o Indirect addressing mode • Ex: o lw $s1, 20($s3) #$s1= Memory[$s3+20] • Illustration:
  • 92. PC-relative addressing mode • Def: o the branch address is the sum of the PC and a constant in the instruction • Ex: o bne $s4, $s5, 25 # if ($s4 != $s5), go to PC = 12 + 4 + 100 (the branch is taken to be at address 12; the offset of 25 words is 100 bytes) • Illustration:
  • 93. Pseudodirect addressing mode • Def: o the jump address is the 26 bits of the instruction concatenated with the upper bits of the PC • Ex: o j 1000 • Illustration:
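The target addresses for the PC-relative and pseudodirect modes can be computed with the Python sketch below. The function names are made up for this illustration. Two assumptions are labeled in the code: the branch sits at address 12 (as in the bne example), and the value 1000 in `j 1000` is treated as the raw 26-bit field stored in the instruction.

```python
MASK32 = 0xFFFFFFFF

def branch_target(pc, offset_words):
    """PC-relative: (PC + 4) plus the 16-bit word offset scaled to bytes."""
    return (pc + 4 + (offset_words << 2)) & MASK32

def jump_target(pc, address_field):
    """Pseudodirect: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    return ((pc + 4) & 0xF0000000) | ((address_field & 0x03FFFFFF) << 2)

# bne $s4,$s5,25 with the branch assumed to be at address 12.
print(branch_target(12, 25))  # 12 + 4 + 100 = 116

# j 1000, assuming 1000 is the raw field value and the jump is at address 0.
print(jump_target(0, 1000))   # 1000 * 4 = 4000
```

Both modes scale the stored value by 4 because every MIPS instruction is one 4-byte word.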
  • 94. Decoding Machine Code • Q: What is the assembly language statement corresponding to this machine instruction? 00af8020hex Solution: converting hexadecimal to binary Binary instruction format Assembly instruction
  • 95. Translating Machine Language to Assembly Language • Translate the following machine language code into assembly language. 0x02F34022
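Decoding runs the encoding steps in reverse: split the 32-bit word into fields, then look up the names. The Python sketch below handles only R-type add and sub, which is enough for the two words above. The names `REG_NAMES`, `FUNCT_NAMES` and `decode_r_type` are hypothetical; the register names outside $t0-$t7 and $s0-$s7 follow the usual MIPS convention rather than these slides.

```python
# Conventional MIPS register names indexed by register number 0-31.
REG_NAMES = (["$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3"]
             + [f"$t{i}" for i in range(8)]
             + [f"$s{i}" for i in range(8)]
             + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"])

FUNCT_NAMES = {32: "add", 34: "sub"}  # only the funct codes used in the slides

def decode_r_type(word):
    """Split a 32-bit word into R-type fields and return the assembly text."""
    op = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    if op != 0 or funct not in FUNCT_NAMES:
        raise ValueError("this sketch only handles R-type add and sub")
    return f"{FUNCT_NAMES[funct]} {REG_NAMES[rd]},{REG_NAMES[rs]},{REG_NAMES[rt]}"

print(decode_r_type(0x00AF8020))  # add $s0,$a1,$t7
print(decode_r_type(0x02F34022))  # sub $t0,$s7,$s3
```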
  • 96. Performance • Performance is the key to understanding underlying motivation for the hardware and its organization • Measure, report, and summarize performance to enable users to o make intelligent choices o see through the marketing hype! • Why is some hardware better than others for different programs? • What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?) • How does the machine's instruction set affect performance?
  • 97. Computer Performance: TIME, TIME, TIME!!! • Response Time (elapsed time, latency), an individual user's concern: o how long does it take for my job to run? o how long does it take to execute (start to finish) my job? o how long must I wait for the database query? • Throughput, a systems manager's concern: o how many jobs can the machine run at once? o what is the average execution rate? o how much work is getting done? • If we upgrade a machine with a new processor what do we increase? • If we add a new machine to the lab what do we increase?
  • 98. Execution Time • Elapsed Time o counts everything (disk and memory accesses, waiting for I/O, running other programs, etc.) from start to finish o a useful number, but often not good for comparison purposes elapsed time = CPU time + wait time (I/O, other programs, etc.) • CPU time o doesn't count waiting for I/O or time spent running other programs o can be divided into user CPU time and system CPU time (OS calls) CPU time = user CPU time + system CPU time, so elapsed time = user CPU time + system CPU time + wait time • Our focus: user CPU time (CPU execution time or, simply, execution time) o time spent executing the lines of code that are in our program
  • 99. Definition of Performance • For some program running on machine X: Performance_X = 1 / Execution time_X • If there are two machines X and Y and the performance of X is greater than the performance of Y: Performance_X > Performance_Y, i.e., 1 / Execution time_X > 1 / Execution time_Y • X is n times faster than Y means: Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
  • 100. Q: If computer A runs a program in 10 sec and computer B runs the same program in 15 sec, how much faster is A than B? • We know that Performance_A / Performance_B = Execution time_B / Execution time_A = n. Thus the performance ratio is Execution time_B / Execution time_A = 15 / 10 = 1.5, i.e., Performance_A / Performance_B = 1.5. Therefore A is 1.5 times faster than B.
  • 101. Clock Cycles • Instead of reporting execution time in seconds, we often use cycles. In modern computers hardware events progress cycle by cycle: in other words, each event, e.g., multiplication, addition, etc., is a sequence of cycles • Clock ticks indicate start and end of cycles [Figure: timeline with ticks marking cycle boundaries] • cycle time = time between ticks = seconds per cycle • clock rate (frequency) = clock cycles per second (1 Hz = 1 cycle/sec, 1 MHz = 10^6 cycles/sec) • Example: A 200 MHz clock has a cycle time of ????
      seconds/program = (cycles/program) × (seconds/cycle)
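The cycle-time question can be worked out directly from the definition that cycle time is the reciprocal of clock rate. A minimal Python sketch, with a made-up helper name:

```python
def cycle_time_seconds(clock_rate_hz):
    """Cycle time is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz

# A 200 MHz clock: 1 / (200 * 10^6 cycles/sec)
t = cycle_time_seconds(200e6)
print(f"{t * 1e9:.1f} ns")  # 5.0 ns
```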
  • 102. Performance Equation I • So, to improve performance one can either: o reduce the number of cycles for a program, or o reduce the clock cycle time, or, equivalently, o increase the clock rate
      seconds/program = (cycles/program) × (seconds/cycle)
      CPU execution time for a program = CPU clock cycles for a program × clock cycle time
      Equivalently: CPU execution time for a program = CPU clock cycles for a program / clock rate
  • 103. Our favorite program runs in 10 seconds on computer A, which has a 2 GHz clock. We are trying to help a computer designer build a computer, B, which will run this program in 6 seconds. The designer has determined that a substantial increase in the clock rate is possible, but this increase will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A for this program. What clock rate should we tell the designer to target?
      CPU time_A = CPU clock cycles_A / clock rate_A
      10 sec = CPU clock cycles_A / (2 × 10^9 cycles/sec)
      CPU clock cycles_A = 10 sec × 2 × 10^9 cycles/sec = 20 × 10^9 cycles
      CPU time_B = 1.2 × CPU clock cycles_A / clock rate_B
      6 sec = 1.2 × 20 × 10^9 cycles / clock rate_B
      clock rate_B = 1.2 × 20 × 10^9 cycles / 6 sec = 4 × 10^9 Hz
      To run the program in 6 seconds, B must have a clock rate of 4 × 10^9 Hz (4 GHz), twice that of A.
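The same calculation can be written as a short Python sketch, which makes it easy to try other target times. The function and parameter names are made up for this illustration.

```python
def required_clock_rate(time_a, rate_a, cycle_factor, time_b):
    """Clock rate B needs so that it finishes the program in time_b seconds."""
    cycles_a = time_a * rate_a          # cycles = time * clock rate
    cycles_b = cycle_factor * cycles_a  # B needs 1.2 times as many cycles
    return cycles_b / time_b            # rate = cycles / time

# A: 10 s at 2 GHz; B: 1.2 times the cycles, target 6 s.
rate_b = required_clock_rate(10, 2e9, 1.2, 6)
print(f"{rate_b / 1e9:.1f} GHz")  # 4.0 GHz
```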
  • 104. Instruction Performance • The previous equation makes no reference to the number of instructions • The execution time depends on the number of instructions in the program • Clock cycles per instruction (CPI): the average number of clock cycles per instruction for a program or program fragment
  • 105. Suppose we have two implementations of the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for some program, and computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program and by how much? • The same number of instructions is executed on both computers
  • 106. Instruction Performance
      CPU execution time for a program = Instruction count for a program × average CPI × Clock cycle time
      Or
      CPU execution time for a program = Instruction count for a program × average CPI / Clock rate
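Applying this equation to the two-computer question (A: 250 ps cycle time, CPI 2.0; B: 500 ps, CPI 1.2) gives the comparison below. This is a sketch with made-up names; the instruction count is set to 1 because it is the same on both machines and cancels in the ratio.

```python
def cpu_time(instruction_count, cpi, cycle_time):
    """CPU execution time = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * cycle_time

count = 1  # same program on both machines, so the count cancels out

time_a = cpu_time(count, 2.0, 250e-12)  # 500 ps per instruction
time_b = cpu_time(count, 1.2, 500e-12)  # 600 ps per instruction

ratio = time_b / time_a
print(f"A is {ratio:.1f} times faster than B")  # A is 1.2 times faster than B
```

Computer A wins despite its higher CPI because its clock cycle is half as long.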
  • 108. Which code sequence executes the most instructions? • Sequence 1 executes 2 + 1 + 2 = 5 instructions • Sequence 2 executes 4 + 1 + 1 = 6 instructions • Sequence 2 executes more instructions
  • 109. Which will be faster? • So code sequence 2 is faster
  • 110. What is the CPI for each sequence? • Sequence 2 has lower CPI as it takes fewer clock cycles but has more instructions
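The three answers above can be reproduced with a short Python sketch. The instruction counts per class (2, 1, 2 and 4, 1, 1) come from the slides, but the per-class CPI values of 1, 2 and 3 are an assumption taken from the usual textbook version of this example, since the slide that states them is not part of this transcript.

```python
# Assumed CPI per instruction class (not stated in this transcript).
CPI_PER_CLASS = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class, as given on the slides.
SEQUENCE_1 = {"A": 2, "B": 1, "C": 2}
SEQUENCE_2 = {"A": 4, "B": 1, "C": 1}

def total_cycles(sequence):
    """Sum of (instruction count * CPI) over all classes."""
    return sum(count * CPI_PER_CLASS[cls] for cls, count in sequence.items())

def average_cpi(sequence):
    """Total clock cycles divided by total instruction count."""
    return total_cycles(sequence) / sum(sequence.values())

for name, seq in (("Sequence 1", SEQUENCE_1), ("Sequence 2", SEQUENCE_2)):
    print(name, total_cycles(seq), "cycles, CPI =", average_cpi(seq))
```

Under these assumed CPI values, sequence 1 takes 10 cycles (CPI 2.0) and sequence 2 takes 9 cycles (CPI 1.5), which agrees with the slides' conclusion that sequence 2 is faster even though it executes more instructions.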