QUESTIONS AND ANSWERS
FOR
COMPUTER ORGANIZATION
AND ARCHITECTURE
G.Appasami, M.Sc., M.C.A., M.Phil., M.Tech.,
Assistant Professor
Department of Computer Science and Engineering
Dr. Pauls Engineering College
Pauls Nagar, Villupuram
Tamilnadu, India.
SARUMATHI PUBLICATIONS
No. 109, Pillayar Kovil Street, Periya Kalapet
Pondicherry – 605014, India
Phone: 0413 – 2656368
Mobile: 9786554175, 8940872274
First Edition: July 2010
Second Edition: July 2011
Published By
SARUMATHI PUBLICATIONS
© All rights reserved. No part of this publication may be reproduced or stored in any form or
by any means of photocopying, recording or otherwise without the prior written permission of
the author.
Price Rs. 160/-
Copies can be had from
SARUMATHI PUBLICATIONS
No. 109, Pillayar Kovil Street, Periya Kalapet
Pondicherry – 605014, India
Phone: 0413 – 2656368
Mobile: 9786554175, 8940872274
Printed at
Meenam Offset
Pondicherry – 605014, India
ANNA UNIVERSITY SYLLABUS
CS 2253 COMPUTER ORGANIZATION AND ARCHITECTURE
(Common to CSE & IT)
1. BASIC STRUCTURE OF COMPUTERS 9
Functional units – Basic operational concepts – Bus structures – Performance and metrics –
Instructions and instruction sequencing – Hardware – Software Interface – Instruction set
architecture – Addressing modes – RISC – CISC. ALU design – Fixed point and floating point
operations.
2. BASIC PROCESSING UNIT 9
Fundamental concepts – Execution of a complete instruction – Multiple bus organization –
Hardwired control – Micro programmed control – Nano programming.
3. PIPELINING 9
Basic concepts – Data hazards – Instruction hazards – Influence on instruction sets – Data path and
control considerations – Performance considerations – Exception handling.
4. MEMORY SYSTEM 9
Basic concepts – Semiconductor RAM – ROM – Speed – Size and cost – Cache memories –
Improving cache performance – Virtual memory – Memory management requirements –
Associative memories – Secondary storage devices.
5. I/O ORGANIZATION 9
Accessing I/O devices – Programmed Input/Output – Interrupts – Direct Memory Access – Buses –
Interface circuits – Standard I/O Interfaces (PCI, SCSI, USB), I/O devices and processors.
TOTAL = 45
TABLE OF CONTENTS
UNIT 1: BASIC STRUCTURE OF COMPUTERS
1.1 Explain functional unit of a computer 1
1.2 Discuss basic operational concepts 4
1.3 Explain bus structure of a computer 6
1.4 Explain performance and metrics 9
1.5 Explain instruction and instruction sequencing 13
1.6 Discuss Hardware – Software Interface 19
1.7 Explain various addressing modes 22
1.8 Discuss RISC and CISC 25
1.9 Design ALU 29
UNIT 2: BASIC PROCESSING UNIT
2.1 Explain the fundamental concepts 31
2.2 Explain complete instruction execution 36
2.3 Discuss multiple bus organization 38
2.4 Discuss Hardwired control 40
2.5 Micro programmed control 43
2.6 Explain Nano programming 47
UNIT 3: PIPELINING
3.1 Explain Basic concepts of pipeline 49
3.2 Discuss Data hazards 54
3.3 Discuss Instruction hazards 58
3.4 Discuss Influence on Instruction Sets 66
3.5 Explain Datapath and control considerations 69
3.6 Explain Performance consideration 71
3.7 Discuss Exception Handling 73
UNIT 4: MEMORY SYSTEM
4.1 Explain some basic concepts of memory system 76
4.2 Discuss Semiconductor RAM Memories 77
4.3 Discuss Read Only Memories (ROM) 87
4.4 Discuss speed, size and cost of Memories 89
4.5 Discuss about Cache memories 91
4.6 Explain improving cache performance 96
4.7 Explain virtual memory 100
4.8 Explain Memory management requirements 104
4.9 Write about Associative memory 105
4.10 Discuss Secondary memory 108
UNIT 5: I/O ORGANIZATION
5.1 Discuss Accessing I/O Devices 118
5.2 Explain Program-controlled I/O 120
5.3 Explain Interrupts 121
5.4 Explain direct memory access (DMA) 129
5.5 Explain about Buses 134
5.6 Discuss interface circuits 139
5.7 Describe about Standard I/O Interface 147
5.8 Discuss I/O devices 160
5.9 Discuss Processors 161
Appendix: Short answers for questions
1.1 Explain functional unit of a computer.
A computer consists of five functionally independent main parts: input, memory, arithmetic
and logic, output, and control units, as shown in Figure 1.1. The input unit accepts coded
information from human operators, from electromechanical devices such as keyboards, or from
other computers over digital communication lines. The information received is either stored in the
computer's memory for later reference or immediately used by the arithmetic and logic circuitry to
perform the desired operations. The processing steps are determined by a program stored in the
memory. Finally, the results are sent back to the outside world through the output unit. All of these
actions are coordinated by the control unit. Figure 1.1 does not show the connections among the
functional units. The arithmetic and logic circuits, in conjunction with the main control circuits, are
collectively referred to as the processor, and input and output equipment is often collectively
referred to as the input-output (I/O) unit.
Figure 1.1 Basic functional unit of a computer.
Input Unit
Computers accept coded information through input units, which read data. The most well-
known input device is the keyboard. Whenever a key is pressed, the corresponding letter or digit is
automatically translated into its corresponding binary code and transmitted over a cable to either
the memory or the processor.
Many other kinds of input devices are available, including joysticks, trackballs, and mice,
which can be used as pointing devices. Touch screens are often used as graphic input devices in
conjunction with displays. Microphones can be used to capture audio input, which is then sampled
and converted into digital codes for storage and processing. Cameras and scanners are used to
capture digital images.
Memory Unit
The function of the memory unit is to store programs and data. There are two classes of
storage, called primary and secondary.
Primary storage is a fast memory that operates at electronic speeds. Programs must be
stored in the memory while they are being executed. The memory contains a large number of
semiconductor storage cells, each capable of storing one bit of information. These cells are rarely
read or written as individual cells but instead are processed in groups of fixed size called words.
The memory is organized so that the contents of one word, containing n bits, can be stored or
retrieved in one basic operation.
To provide easy access to any word in the memory, a distinct address is associated with
each word location. Addresses are numbers that identify successive locations. A given word is
accessed by specifying its address and issuing a control command that starts the storage or retrieval
process. The number of bits in each word is often referred to as the word length of the computer.
Typical word lengths range from 16 to 64 bits. The capacity of the memory is one factor that
characterizes the size of a computer.
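To make the relationship between word count, word length, and capacity concrete, here is a small
C sketch; both sizes are arbitrary sample values chosen for the illustration, not figures from the
text.

#include <stdio.h>

int main(void) {
    /* A hypothetical memory with 64K word locations, 32 bits per word. */
    unsigned long words = 1UL << 16;            /* 65,536 addressable words */
    unsigned word_bits = 32;                    /* word length              */
    unsigned long capacity_bits = words * word_bits;
    printf("capacity = %lu bits = %lu bytes\n",
           capacity_bits, capacity_bits / 8);   /* 2,097,152 bits = 256 KB  */
    return 0;
}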
Programs must reside in the memory during execution. Instructions and data can be written
into the memory or read out under the control of the processor. It is essential to be able to access
any word location in the memory as quickly as possible. Memory in which any location can be
reached in a short and fixed amount of time after specifying its address is called random-access
memory (RAM). The time required to access one word is called the memory access time. This
time is fixed, independent of the location of the word being accessed. It typically ranges from a
few nanoseconds (ns) to about 100 ns for modern RAM units. The memory of a computer is
normally implemented as a memory hierarchy of three or four levels of semiconductor RAM units
with different speeds and sizes. The small, fast RAM units are called caches. They are tightly
coupled with the processor and are often contained on the same integrated circuit chip to achieve
high performance. The largest and slowest unit is referred to as the main memory.
Although primary storage is essential, it tends to be expensive. Thus additional, cheaper,
secondary storage is used when large amounts of data and many programs have to be stored,
particularly for information that is accessed infrequently. A wide selection of secondary storage
devices is available, including magnetic disks, tapes, and optical disks.
Arithmetic and Logic Unit
Most computer operations are executed in the arithmetic and logic unit (ALU) of the
processor. Consider a typical example: Suppose two numbers located in the memory are to be
added. They are brought into the processor, and the actual addition is carried out by the ALU. The
sum may then be stored in the memory or retained in the processor for immediate use.
Any other arithmetic or logic operation, for example, multiplication, division, or
comparison of numbers, is initiated by bringing the required operands into the processor, where the
operation is performed by the ALU. When operands are brought into the processor, they are stored
in high-speed storage elements called registers. Each register can store one word of data. Access
times to registers are somewhat faster than access times to the fastest cache unit in the memory
hierarchy.
The control and the arithmetic and logic units are many times faster than other devices
connected to a computer system. This enables a single processor to control a number of external
devices such as keyboards, displays, magnetic and optical disks, sensors, and mechanical
controllers.
Output Unit
The output unit is the counterpart of the input unit. Its function is to send processed results
to the outside world. The most familiar example of such a device is a printer. Printers employ
mechanical impact heads, inkjet streams, or photocopying techniques, as in laser printers, to
perform the printing. It is possible to produce printers capable of printing as many as 10,000 lines
per minute. This is a tremendous speed for a mechanical device but is still very slow compared to
the electronic speed of a processor unit.
Monitors, speakers, headphones and projectors are also output devices. Some units, such as
graphic displays, provide both an output function and an input function; the dual role of such units
is often referred to by the single name I/O unit. Storage devices such as hard disks, floppy disks
and flash drives are used for input as well as output.
Control Unit
The memory, arithmetic and logic, and input and output units store and process
information and perform input and output operations. The operation of these units must be
coordinated in some way. This is the task of the control unit. The control unit is effectively the
nerve center that sends control signals to other units and senses their states.
I/O transfers, consisting of input and output operations, are controlled by the instructions of
I/O programs that identify the devices involved and the information to be transferred. However,
the actual timing signals that govern the transfers are generated by the control circuits. Timing
signals are signals that determine when a given action is to take place. Data transfers between the
processor and the memory are also controlled by the control unit through timing signals. It is
reasonable to think of a control unit as a well-defined, physically separate unit that interacts with
other parts of the machine. In practice, however, this is seldom the case. Much of the control
circuitry is physically distributed throughout the machine. A large set of control lines (wires)
carries the signals used for timing and synchronization of events in all units.
The operation of a computer can be summarized as follows:
1. The computer accepts information in the form of programs and data through an input unit
and stores it in the memory.
2. Information stored in the memory is fetched, under program control, into an arithmetic and
logic unit, where it is processed.
3. Processed information leaves the computer through an output unit.
4. All activities inside the machine are directed by the control unit.
1.2 Discuss basic operational concepts.
The processor contains an arithmetic and logic unit (ALU), a control unit (CU) and a
number of registers used for several different purposes. The instruction register (IR) holds the
instruction that is currently being executed. Its output is available to the control circuits, which
generate the timing signals that control the various processing elements involved in executing the
instruction.
The program counter (PC) is another specialized register. It keeps track of the execution of
a program. It contains the memory address of the next instruction to be fetched and executed.
During the execution of an instruction, the contents of the PC are updated to correspond to the
address of the next instruction to be executed. It is customary to say that the PC points to the next
instruction that is to be fetched from the memory. Besides the IR and PC, Figure 1.2 shows n
general-purpose registers, R0 through Rn-1.
Figure 1.2 Connections between processor and memory
Finally, two registers facilitate communication with the memory. These are the memory
address register (MAR) and the memory data register (MDR). The MAR holds the address of the
location to be accessed. The MDR contains the data to be written into or read out of the addressed
location.
Let us now consider some typical operating steps. Programs reside in the memory and
usually get there through the input unit. Execution of the program starts when the PC is set to point
to the first instruction of the program. The contents of the PC are transferred to the MAR and a
Read control signal is sent to the memory. After the time required to access the memory elapses,
the addressed word (in this case, the first instruction of the program) is read out of the memory and
loaded into the MDR. Next, the contents of the MDR are transferred to the IR. At this point, the
instruction is ready to be decoded and executed.
If the instruction involves an operation to be performed by the ALU, it is necessary to
obtain the required operands. If an operand resides in the memory (it could also be in a general-
purpose register in the processor), it has to be fetched by sending its address to the MAR and
initiating a Read cycle. When the operand has been read from the memory into the MDR, it is
transferred from the MDR to the ALU. After one or more operands are fetched in this way, the
ALU can perform the desired operation. If the result of this operation is to be stored in the
memory, then the result is sent to the MDR. The address of the location where the result is to be
stored is sent to the MAR, and a Write cycle is initiated. At some point during the execution of the
current instruction, the contents of the PC are incremented so that the PC points to the next
instruction to be executed. Thus, as soon as the execution of the current instruction is completed, a
new instruction fetch may be started.
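The register transfers described above can be sketched in ordinary C. The toy simulation below
only illustrates the PC/MAR/MDR/IR data flow; the word-addressable memory, the instruction
encoding, and all the values are invented for this sketch and do not model any real machine.

#include <stdint.h>
#include <stdio.h>

#define MEM_WORDS 256
static uint32_t memory[MEM_WORDS];   /* toy word-addressable memory */

int main(void) {
    uint32_t pc = 0;                 /* program counter               */
    uint32_t mar, mdr, ir;           /* memory address/data, instr.   */

    memory[0] = 0x01000010u;         /* invented encoding: opcode 0x01,
                                        operand address 0x10          */
    memory[0x10] = 42;               /* the operand itself            */

    /* Instruction fetch: PC -> MAR, Read cycle, MDR -> IR, advance PC */
    mar = pc;
    mdr = memory[mar];
    ir = mdr;
    pc = pc + 1;                     /* next word in this toy memory  */

    /* Operand fetch: address field of IR -> MAR, Read cycle          */
    mar = ir & 0xFFu;
    mdr = memory[mar];

    printf("IR = %08X, operand = %u, PC = %u\n",
           (unsigned)ir, (unsigned)mdr, (unsigned)pc);
    return 0;
}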
To perform a given task, an appropriate program consisting of a list of instructions is stored
in the memory. Individual instructions are brought from the memory into the processor, which
executes the specified operations. Data to be used as operands are also stored in the memory. A
typical instruction may be
Add LOCA, R0
This instruction adds the operand at memory location LOCA to the operand in a register in
the processor, R0, and places the sum into register R0. The original contents of location LOCA are
preserved, whereas those of R0 are overwritten. This instruction requires the performance of
several steps. First, the instruction is fetched from the memory into the processor. Next, the
operand at LOCA is fetched and added to the contents of R0. Finally, the resulting sum is stored in
register R0.
Primary storage (or main memory or internal memory), often referred to simply as
memory, is the only storage directly accessible to the processor. The processor continuously reads
instructions stored there and executes them as required. Any data being actively operated on are
also stored there in a uniform manner. The processor transfers data to and from memory using the
MAR and MDR registers, under the direction of the control unit.
1.3 Explain bus structure of a computer.
Single bus structure
In computer architecture, a bus is a subsystem that transfers data between components
inside a computer, or between computers. Early computer buses were literally parallel electrical
wires with multiple connections, but modern computer buses can use both parallel and bit-serial
connections.
Figure 1.3.1 Single bus structure
To achieve a reasonable speed of operation, a computer must be organized so that all its
units can handle one full word of data at a given time. When a word of data is transferred between
units, all its bits are transferred in parallel, that is, the bits are transferred simultaneously over
many wires, or lines, one bit per line. A group of lines that serves as a connecting path for several
devices is called a bus. In addition to the lines that carry the data, the bus must have lines for
address and control purposes. The simplest way to interconnect functional units is to use a single
bus, as shown in Figure 1.3.1. All units are connected to this bus. Because the bus can be used for
only one transfer at a time, only two units can actively use the bus at any given time. Bus control
lines are used to arbitrate multiple requests for use of the bus. The main virtue of the single-bus
structure is its low cost and its flexibility for attaching peripheral devices. Systems that contain
multiple buses achieve more concurrency in operations by allowing two or more transfers to be
carried out at the same time. This leads to better performance but at an increased cost.
Parts of a System bus
The processor, memory, and input and output devices are connected by the system bus,
which consists of separate buses, as shown in Figure 1.3.2. They are:
(i) Address bus: The address bus is used to carry the address. It is a unidirectional bus: the address
is sent from the CPU to memory and I/O ports, and hence travels in one direction only. It consists
of 16, 20, 24 or more parallel signal lines.
(ii) Data bus: The data bus is used to transfer data to and from memory and I/O ports. It is
bidirectional: the processor can read data from memory and I/O ports on the data lines, and can
also write data to memory. It consists of 8, 16, 32 or more parallel signal lines.
(iii) Control bus: The control bus is used to carry control signals in order to regulate control
activities. It is bidirectional. The CPU sends control signals on the control bus to enable the
outputs of addressed memory devices or port devices. Some of the control signals are: MEMR
(memory read), MEMW (memory write), IOR (I/O read), IOW (I/O write), BR (bus request), BG
(bus grant), INTR (interrupt request), INTA (interrupt acknowledge), RST (reset), RDY (ready),
HLD (hold), and HLDA (hold acknowledge).
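Since an n-line address bus can select 2^n distinct locations, the reachable address space follows
directly from the bus width. A short C sketch of this calculation (the widths are the examples
mentioned above plus 32):

#include <stdio.h>

int main(void) {
    /* An n-line address bus can select 2^n distinct locations. */
    int widths[] = {16, 20, 24, 32};
    for (int i = 0; i < 4; i++) {
        unsigned long long locations = 1ULL << widths[i];
        printf("%2d address lines -> %llu locations\n",
               widths[i], locations);
    }
    return 0;
}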
Figure 1.3.2 Bus interconnection scheme
The devices connected to a bus vary widely in their speed of operation. Some
electromechanical devices, such as keyboards and printers, are relatively slow. Others, such as
magnetic or optical disks, are considerably faster. Memory and processor units operate at
electronic speeds, making them the fastest parts of a computer. Because all these devices must
communicate with each other over a bus, an efficient transfer mechanism that is not constrained by
the slow devices and that can be used to smooth out the differences in timing among processors,
memories, and external devices is necessary.
A common approach is to include buffer registers with the devices to hold the information
during transfers. To illustrate this technique, consider the transfer of an encoded character from a
processor to a character printer. The processor sends the character over the bus to the printer
buffer. Since the buffer is an electronic register, this transfer requires relatively little time. Once
the buffer is loaded, the printer can start printing without further intervention by the processor. The
bus and the processor are no longer needed and can be released for other activity. The printer
continues printing the character in its buffer and is not available for further transfers until this
process is completed. Thus, buffer registers smooth out timing differences among processors,
memories, and I/O devices. They prevent a high-speed processor from being locked to a slow I/O
device during a sequence of data transfers. This allows the processor to switch rapidly from one
device to another, interweaving its processing activity with data transfers involving several I/O
devices.
Figure 1.3.3 shows a traditional bus configuration and Figure 1.3.4 shows a high-speed
bus configuration. The traditional bus configuration uses three buses: local bus, system bus and
expansion bus. The high-speed bus configuration adds a high-speed bus to the three buses used in
the traditional configuration. Here, the cache controller is connected to the high-speed bus. This
bus supports connections to high-speed LANs, such as Fiber Distributed Data Interface (FDDI),
video and graphics workstation controllers, as well as interface controllers to local peripherals,
including SCSI.
Figure 1.3.3 Traditional bus configuration
Figure 1.3.4 High speed bus configuration
1.4 Explain performance and metrics
The most important measure of the performance of a computer is how quickly it can execute
programs. The speed with which a computer executes programs is affected by the design of its
hardware and its machine language instructions. Because programs are usually written in a high-level
language, performance is also affected by the compiler that translates programs into machine language.
For best performance, it is necessary to design the compiler, the machine instruction set, and the
hardware in a coordinated way. The operating system overlaps processing, disk transfers, and printing
for several programs to make the best possible use of the resources available. The total time required to
execute a program is called the elapsed time. This elapsed time is a measure of the performance of the
entire computer system; it is affected by the speed of the processor, the disk, and the printer.
CACHE MEMORY
Just as the elapsed time for the execution of a program depends on all units in a computer
system, the processor time depends on the hardware involved in the execution of individual machine
instructions. This hardware comprises the processor and the memory, which are usually connected by a
bus, as shown in Figure 1.3.1. The pertinent parts of this figure are repeated in Figure 1.4, including the
cache memory as part of the processor unit. Let us examine the flow of program instructions and data
between the memory and the processor. At the start of execution, all program instructions and the
required data are stored in the main memory. As execution proceeds, instructions are fetched one by
one over the bus into the processor, and a copy is placed in the cache. When the execution of an
instruction calls for data located in the main memory, the data are fetched and a copy is placed in the
cache. Later, if the same instruction or data item is needed a second time, it is read directly from the
cache.
Figure 1.4 Processor Cache
The processor and a relatively small cache memory can be fabricated on a single integrated
circuit chip. The internal speed of performing the basic steps of instruction processing on such chips is
very high and is considerably faster than the speed at which instructions and data can be fetched from
the main memory. A program will be executed faster if the movement of instructions and data between
the main memory and the processor is minimized, which is achieved by using the cache. For example,
suppose a number of instructions are executed repeatedly over a short period of time, as happens in a
program loop. If these instructions are available in the cache, they can be fetched quickly during the
period of repeated use.
PROCESSOR CLOCK
Processor circuits are controlled by a timing signal called a clock. The clock defines regular
time intervals, called clock cycles. To execute a machine instruction, the processor divides the action to
be performed into a sequence of basic steps, such that each step can be completed in one clock cycle.
The length P of one clock cycle is an important parameter that affects processor performance. Its
inverse is the clock rate, R = 1/P, which is measured in cycles per second. Processors used in today's
personal computers and workstations have clock rates that range from a few hundred million to over a
billion cycles per second. In standard electrical engineering terminology, the term "cycles per second"
is called hertz (Hz). The term "million" is denoted by the prefix Mega (M), and "billion" is denoted by
the prefix Giga (G). Hence, 500 million cycles per second is usually abbreviated to 500 Megahertz
(MHz), and 1250 million cycles per second is abbreviated to 1.25 Gigahertz (GHz). The corresponding
clock periods are 2 and 0.8 nanoseconds (ns), respectively.
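The relation R = 1/P is easy to check numerically. The short C program below converts the two
clock rates mentioned above into their periods, reproducing the 2 ns and 0.8 ns figures.

#include <stdio.h>

int main(void) {
    /* R = 1/P: clock rate in Hz, period P converted to nanoseconds. */
    double rates_hz[] = {500e6, 1.25e9};      /* 500 MHz and 1.25 GHz */
    for (int i = 0; i < 2; i++) {
        double period_ns = 1e9 / rates_hz[i];
        printf("R = %7.1f MHz -> P = %.1f ns\n",
               rates_hz[i] / 1e6, period_ns);  /* 2.0 ns and 0.8 ns   */
    }
    return 0;
}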
BASIC PERFORMANCE EQUATION
Let T be the processor time required to execute a program that has been prepared in some high-
level language. The compiler generates a machine language object program that corresponds to the
source program. Assume that complete execution of the program requires the execution of N machine
language instructions. The number N is the actual number of instruction executions, and is not
necessarily equal to the number of machine instructions in the object program. Some instructions may
be executed more than once, which is the case for instructions inside a program loop. Others may not
be executed at all, depending on the input data used. Suppose that the average number of basic steps
needed to execute one machine instruction is S, where each basic step is completed in one clock cycle.
If the clock rate is R cycles per second, the program execution time is

T = (N × S) / R        (1.1)
This is often referred to as the basic performance equation. The performance parameter T for
an application program is much more important to the user than the individual values of the parameters
N, S, or R. To achieve high performance, the computer designer must seek ways to reduce the value of
T, which means reducing N and S, and increasing R. The value of N is reduced if the source program is
compiled into fewer machine instructions. The value of S is reduced if instructions have a smaller
number of basic steps to perform or if the execution of instructions is overlapped. Using a
higher-frequency clock increases the value of R, which means that the time required to complete a
basic execution step is reduced.
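As a worked example of Equation 1.1, the following C fragment evaluates T for one set of
parameter values; the numbers are made up purely to exercise the formula.

#include <stdio.h>

int main(void) {
    /* T = (N x S) / R, Equation 1.1; all values are illustrative. */
    double N = 50e6;        /* instruction executions              */
    double S = 4.0;         /* average basic steps per instruction */
    double R = 500e6;       /* clock rate, cycles per second       */
    printf("T = %.3f s\n", (N * S) / R);       /* 0.400 s          */
    return 0;
}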
N, S, and R are not independent parameters; changing one may affect another. Introducing
a new feature in the design of a processor will lead to improved performance only if the overall result
is to reduce the value of T. A processor advertised as having a 900-MHz clock does not necessarily
provide better performance than a 700-MHz processor because it may have a different value of S.
PIPELINING AND SUPERSCALAR OPERATION
Usually in sequential execution, the instructions are executed one after another. Hence, the
value of S is the total number of basic steps, or clock cycles, required to execute an instruction. A
substantial improvement in performance can be achieved by overlapping the execution of successive
instructions, using a technique called pipelining. Consider the instruction
Add R1,R2,R3
which adds the contents of registers R1 and R2, and places the sum into R3. The contents of
R1 and R2 are first transferred to the inputs of the ALU. After the add operation is performed, the sum
is transferred to R3. The processor can read the next instruction from the memory while the addition
operation is being performed. Then, if that instruction also uses the ALU, its operands can be transferred
to the ALU inputs at the same time that the result of the Add instruction is being transferred to R3. In the
ideal case, if all instructions are overlapped to the maximum degree possible, execution proceeds at the
rate of one instruction completed in each clock cycle. Individual instructions still require several clock
cycles to complete. But, for the purpose of computing T, the effective value of S is 1.
The ideal value S = 1 cannot be attained in practice for a variety of reasons. However,
pipelining increases the rate of executing instructions significantly and causes the effective value of S
to approach 1.
A higher degree of concurrency can be achieved if multiple instruction pipelines are
implemented in the processor. This means that multiple functional units are used, creating parallel
paths through which different instructions can be executed in parallel. With such an arrangement, it
becomes possible to start the execution of several instructions in every clock cycle. This mode of
operation is called superscalar execution. If it can be sustained for a long time during program
execution, the effective value of S can be reduced to less than one. Of course, parallel execution must
preserve the logical correctness of programs, that is, the results produced must be the same as those
produced by serial execution of program instructions. Many of today's high-performance processors are
designed to operate in this manner.
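To see how the effective value of S changes T, the sketch below reuses the made-up N and R from
the earlier example and compares sequential, pipelined, and superscalar values of S; all three S
values are illustrative, not measurements.

#include <stdio.h>

int main(void) {
    double N = 50e6, R = 500e6;           /* same sample values as above  */
    double s_values[] = {4.0, 1.0, 0.5};  /* sequential, pipelined,
                                             sustained superscalar        */
    for (int i = 0; i < 3; i++)
        printf("S = %.1f -> T = %.3f s\n",
               s_values[i], N * s_values[i] / R);
    return 0;
}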
CLOCK RATE
There are two possibilities for increasing the clock rate, R. First, improving the integrated-
circuit (IC) technology makes logic circuits faster, which reduces the time needed to complete a basic
step. This allows the clock period, P, to be reduced and the clock rate, R, to be increased. Second,
reducing the amount of processing done in one basic step also makes it possible to reduce the clock
period, P. However, if the actions that have to be performed by an instruction remain the same, the
number of basic steps needed may increase.
Increases in the value of R that are entirely caused by improvements in IC technology affect all
aspects of the processor's operation equally with the exception of the time it takes to access the main
memory. In the presence of a cache, the percentage of accesses to the main memory is small. Hence,
much of the performance gain expected from the use of faster technology can be realized. The value of
T will be reduced by the same factor as R is increased because S and N are not affected.
INSTRUCTION SET: CISC AND RISC
Simple instructions require a small number of basic steps to execute. Complex instructions
involve a large number of steps. For a processor that has only simple instructions, a large number of
instructions may be needed to perform a given programming task. This could lead to a large value for
N and a small value for S. On the other hand, if individual instructions perform more complex
operations, fewer instructions will be needed, leading to a lower value of N and a larger value of S. It is
not obvious if one choice is better than the other.
A key consideration in comparing the two choices is the use of pipelining. We pointed out
earlier that the effective value of S in a pipelined processor is close to 1 even though the number of
basic steps per instruction may be considerably larger. This seems to imply that complex instructions
combined with pipelining would achieve the best performance. However, it is much easier to
implement efficient pipelining in processors with simple instruction sets. The suitability of the
instruction set for pipelined execution is an important and often deciding consideration.
The terms RISC and CISC refer to design principles and techniques. Reduced instruction set
computing (RISC), is a CPU design strategy based on the insight that simplified (as opposed to
complex) instructions can provide higher performance if this simplicity enables much faster execution
of each instruction. A complex instruction set computer (CISC) is a computer where single instructions
can execute several low-level operations (such as a load from memory, an arithmetic operation, and a
memory store) and/or are capable of multi-step operations or addressing modes within single
instructions.
COMPILER
A compiler translates a high-level language program into a sequence of machine instructions.
To reduce N, we need to have a suitable machine instruction set and a compiler that makes good use of
it. An optimizing compiler takes advantage of various features of the target processor to reduce the
product N x S, which is the total number of clock cycles needed to execute a program. The number of
cycles is dependent not only on the choice of instructions, but also on the order in which they appear in
the program. The compiler may rearrange program instructions to achieve better performance. Of
course, such changes must not affect the result of the computation.
Superficially, a compiler appears as a separate entity from the processor with which it is used
and may even be available from a different vendor. However, a high quality compiler must be closely
linked to the processor architecture. The compiler and the processor are often designed at the same
time, with much interaction between the designers to achieve best results. The ultimate objective is to
reduce the total number of clock cycles needed to perform a required programming task.
PERFORMANCE MEASUREMENT
It is important to be able to assess the performance of a computer. Computer designers use
performance estimates to evaluate the effectiveness of new features. Manufacturers use performance
indicators in the marketing process. Buyers use such data to choose among many available computer
models.
The previous discussion suggests that the only parameter that properly describes the
performance of a computer is the execution time, T, for the programs of interest. Despite the
conceptual simplicity of Equation 1.1, computing the value of T is not simple. Moreover, parameters
such as the clock speed and various architectural features are not reliable indicators of the expected
performance.
For these reasons, the computer community adopted the idea of measuring computer
performance using benchmark programs. To make comparisons possible, standardized programs must
be used. The performance measure is the time it takes a computer to execute a given benchmark.
Initially, some attempts were made to create artificial programs that could be used as standard
benchmarks. But synthetic programs do not properly predict the performance obtained when real
application programs are run.
A nonprofit organization called System Performance Evaluation Corporation (SPEC) selects
and publishes representative application programs for different application domains, together with test
results for many commercially available computers. For general-purpose computers, a suite of
benchmark programs was selected in 1989. It was modified somewhat and published in 1995 and again
in 2000. For SPEC2000, the reference computer is an UltraSPARC10 workstation with a 300-MHz
UltraSPARC-III processor.
The SPEC rating is computed as follows:

SPEC rating = (Running time on the reference computer) / (Running time on the computer under test)
Thus a SPEC rating of 50 means that the computer under test is 50 times as fast as the
UltraSPARC10 for this particular benchmark. The test is repeated for all the programs in the SPEC
suite, and the geometric mean of the results is computed. Let SPECi be the rating for program i in the
suite. The overall SPEC rating for the computer is given by

SPEC rating = (SPEC1 × SPEC2 × ... × SPECn)^(1/n)
where n is the number of programs in the suite. Because the actual execution time is measured,
the SPEC rating is a measure of the combined effect of all factors affecting performance, including the
compiler, the operating system, the processor, and the memory of the computer being tested.
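A minimal sketch of the overall-rating computation in C, using invented per-program ratings; real
SPEC results would of course come from measured run times. Compile with -lm for the math
library.

#include <stdio.h>
#include <math.h>

int main(void) {
    /* Overall SPEC rating as the geometric mean of per-program
       ratings; the three ratings below are sample values.        */
    double spec[] = {40.0, 50.0, 62.5};
    int n = 3;
    double product = 1.0;
    for (int i = 0; i < n; i++)
        product *= spec[i];
    printf("overall rating = %.2f\n", pow(product, 1.0 / n));  /* 50.00 */
    return 0;
}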
1.5 Explain instruction and instruction sequencing
A computer program consists of a sequence of small steps, such as adding two numbers,
testing for a particular condition, reading a character from the keyboard, or sending a character to
be displayed on a display screen.
A computer must have instructions capable of performing four types of operations:
· Data transfers between the memory and the processor registers
· Arithmetic and logic operations on data
· Program sequencing and control
· I/O transfers
REGISTER TRANSFER NOTATION
In general, the information is transferred from one location to another in a computer.
Possible locations for such transfers are memory locations, processor registers, or registers in the
I/O subsystem.
For example, names for the addresses of memory locations may be LOC, PLACE, A,
VAR2; processor register names may be R0, R5; and I/O register names may be DATAIN,
OUTSTATUS, and so on. The contents of a location are denoted by placing square brackets
around the name of the location.
Thus, the expression

R1 ← [LOC]

means that the contents of memory location LOC are transferred into processor register R1.
As another example, consider the operation that adds the contents of registers R1 and R2,
and then places their sum into register R3. This action is indicated as

R3 ← [R1] + [R2]
This type of notation is known as Register Transfer Notation (RTN). Note that the right-
hand side of an RTN expression always denotes a value, and the left-hand side is the name of a
location where the value is to be placed, overwriting the old contents of that location.
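The two RTN examples map naturally onto ordinary assignment. In the C sketch below, plain
variables stand in for the memory location LOC and the processor registers; this is an analogy for
illustration, not a machine-level model.

#include <stdio.h>

int main(void) {
    int LOC = 7;            /* contents of memory location LOC */
    int R1, R2 = 5, R3;

    R1 = LOC;               /* R1 <- [LOC]                     */
    R3 = R1 + R2;           /* R3 <- [R1] + [R2]               */
    printf("R3 = %d\n", R3);                     /* prints 12  */
    return 0;
}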
ASSEMBLY TRANSFER NOTATION
Another type of notation used to represent machine instructions and programs is assembly transfer
notation. For example, an instruction that causes the transfer described above, from memory location LOC
to processor register R1, is specified by the statement
Move LOC,R1
The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R1
are overwritten.
The second example, of adding two numbers contained in processor registers R1 and R2 and placing
their sum in R3, can be specified by the assembly language statement
Add R1,R2,R3
BASIC INSTRUCTION TYPES
Instruction types are classified based on the number of operands used in instructions. They are:
(i). Three-Address Instruction,
(ii). Two-Address Instruction,
(iii). One-Address Instruction, and
(iv). Zero-Address Instruction.
Consider a high-level language program statement that adds two variables A and B, and assigns
the sum to a third variable C:
C = A + B ;
To carry out this action, the contents of memory locations A and B are fetched from the memory
and transferred into the processor, where their sum is computed. This result is then sent back to the memory
and stored in location C.
C ← [A] + [B]
Example: Add two variables A and B, and assign the sum to a third variable C.
(i). Three-Address Instruction
The general instruction format is: Operation Source1,Source2,Destination
Symbolic add instruction: ADD A,B,C
Operands A and B are called the source operands, C is called Destination operand, and Add is the
operation to be performed on the operands.
If k bits are needed to specify the memory address of each operand, the encoded form of the above
instruction must contain 3k bits for addressing purposes in addition to the bits needed to denote the Add
operation. For a modern processor with a 32-bit address space, a 3-address instruction is too large to fit in
one word for a reasonable word length. Thus, a format that allows multiple words to be used for a single
instruction would be needed to represent an instruction of this type. An alternative approach is to use a
sequence of simpler instructions to perform the same task, with each instruction having only one or two
operands.
(ii). Two-Address Instruction
The general instruction format is: Operation Source,Destination
Symbolic add instruction: MOVE B,C
ADD A,C
An Add instruction of this type is Add A,B, which performs the operation B ←
[A] + [B]. When the sum is calculated, the result is sent to the memory and stored in location B,
replacing the original contents of this location. This means that operand B is both a source and a
destination.
A single two-address instruction cannot be used to solve our original problem, which is to
add the contents of locations A and B, without destroying either of them, and to place the sum in
location C. The problem can be solved by using another two- address instruction that copies the
contents of one memory location into another. Such an instruction is Move B,C which performs
the operation C ← [B], leaving the contents of location B unchanged. The word "Move" is a
misnomer here; it should be "Copy." However, this instruction name is deeply entrenched in
computer nomenclature. The operation C ← [A] + [B] can now be performed by the two-
instruction sequence
Move B,C
Add A,C
Even two-address instructions will not normally fit into one word for usual word lengths
and address sizes. Another possibility is to have machine instructions that specify only one
memory operand. When a second operand is needed, as in the case of an Add instruction, it is
understood implicitly to be in a unique location. A processor register, usually called the
accumulator, may be used for this purpose.
(iii). One-Address Instruction
The general instruction format is: Operation operand
Symbolic add instruction: LOAD A
ADD B
STORE C
ADD A means the following: add the contents of memory location A to the contents of the
accumulator register and place the sum back into the accumulator. Let us also introduce the
one-address instructions Load A and Store A.
The Load instruction copies the contents of memory location A into the accumulator, and
the Store instruction copies the contents of the accumulator into memory location A. Using only
one-address instructions, the operation C ← [A] + [B] can be performed by executing the sequence
of instructions Load A, Add B, Store C.
Note that the operand specified in the instruction may be a source or a destination,
depending on the instruction. In the Load instruction, address A specifies the source operand, and
the destination location, the accumulator, is implied. On the other hand, C denotes the destination
location in the Store instruction, whereas the source, the accumulator, is implied.
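The one-address style can be made concrete with a toy accumulator machine. The C sketch below
executes the sequence Load A, Add B, Store C; the opcode values and the tiny memory layout are
invented for this illustration.

#include <stdio.h>

enum { LOAD, ADD, STORE, HALT };     /* invented opcodes             */

int main(void) {
    int mem[8] = {2, 3, 0};          /* A=mem[0], B=mem[1], C=mem[2] */
    int prog[][2] = { {LOAD, 0}, {ADD, 1}, {STORE, 2}, {HALT, 0} };
    int acc = 0;                     /* the accumulator              */

    for (int pc = 0; prog[pc][0] != HALT; pc++) {
        int addr = prog[pc][1];      /* the single explicit operand  */
        switch (prog[pc][0]) {
        case LOAD:  acc = mem[addr];  break;  /* Load A              */
        case ADD:   acc += mem[addr]; break;  /* Add B               */
        case STORE: mem[addr] = acc;  break;  /* Store C             */
        }
    }
    printf("C = %d\n", mem[2]);               /* prints 5            */
    return 0;
}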
INSTRUCTION EXECUTION AND STRAIGHT LINE SEQUENCING
To perform a particular task on the computer, it is the programmer's job to select and write
appropriate instructions one after the other, i.e. the programmer has to write instructions in a
proper sequence. This job of the programmer is known as instruction sequencing. The instructions
written in a proper sequence to execute a particular task are called a program.
Figure 1.5.1 Basic instruction cycle
The complete instruction cycle involves three operations: instruction fetching, opcode
decoding and instruction execution, as shown in Figure 1.5.1.
The processor executes a program with the help of the program counter (PC). The PC holds
the address of the instruction to be executed next. To begin execution of a program, the address of
its first instruction is placed into the PC. Then, the processor control circuits use the information (address
of memory) in the PC to fetch and execute instructions, one at a time, in the order of increasing
addresses. This is called straight-line sequencing. During the execution of an instruction, the PC is
incremented by the length of the current instruction. For example, if the currently executing
instruction is 4 bytes long, then the PC is incremented by 4 so that it points to the instruction
to be executed next.
Consider the task C ← [A] + [B] for illustration. Figure 1.5.2 shows a possible program
segment for this task as it appears in the memory of a computer. Assume the word length is 32 bits
and the memory is byte addressable. The three instructions of the program are in successive word
locations, starting at location i. Since each instruction is 4 bytes long, the second and third
instructions start at addresses i + 4 and i + 8.
Figure 1.5.2 A program for C ← [A] + [B]
The processor contains a register called the program counter (PC), which holds the address
of the instruction to be executed next. To begin executing a program, the address of its first
instruction (i in our example) must be placed into the PC. Then, the processor control circuits use
the information in the PC to fetch and execute instructions, one at a time, in the order of increasing
addresses. This is called straight-line sequencing. During the execution of each instruction, the PC
is incremented by 4 to point to the next instruction. Thus, after the Move instruction at location i +
8 is executed, the PC contains the value i + 12, which is the address of the first instruction of the
next program segment.
Executing a given instruction is a two-phase procedure. In the first phase, called instruction
fetch, the instruction is fetched from the memory location whose address is in the PC. This
instruction is placed in the instruction register (IR) in the processor. At the start of the second
phase, called instruction execute, the instruction in IR is examined to determine which operation is
to be performed. The specified operation is then performed by the processor. This often involves
fetching operands from the memory or from processor registers, performing an arithmetic or logic
operation, and storing the result in the destination location. At some point during this two-phase
procedure, the contents of the PC are advanced to point to the next instruction. When the execute
phase of an instruction is completed, the PC contains the address of the next instruction, and a new
instruction fetch phase can begin. In most processors, the execute phase itself is divided into a
small number of distinct phases corresponding to fetching operands, performing the operation, and
storing the result.
BRANCHING
Consider the task of adding a list of n numbers. The program outlined in Figure 1.5.3 is a
generalization of the program in Figure 1.5.2. The addresses of the memory locations containing
the n numbers are symbolically given as NUMl, NUM2, ..., NUMn, and a separate Add instruction
is used to add each number to the contents of register R0. After all the numbers have been added,
the result is placed in memory location SUM. Instead of using a long list of Add instructions, it is
possible to place a single Add instruction in a program loop, as shown in Figure 1.5.4. The loop is
a straight-line sequence of instructions executed as many times as needed. It starts at location
LOOP and ends at the instruction Branch>0. During each pass through this loop, the address of
the next list entry is determined, and that entry is fetched and added to R0.
Figure 1.5.3 A straight-line program for adding n numbers
Figure 1.5.4 A program using a loop for adding n numbers
Assume that the number of entries in the list, n, is stored in memory location N, as shown.
Register R1 is used as a counter to determine the number of times the loop is executed. Hence, the
contents of location N are loaded into register R1 at the beginning of the program. Then, within the
body of the loop, the instruction Decrement R1 reduces the contents of R1 by 1 each time through
the loop. (A similar type of operation is performed by an Increment instruction, which adds 1 to its
operand.) Execution of the loop is repeated as long as the result of the decrement operation is
greater than zero.
Such repetition requires a branch instruction, which loads a new value into the program
counter. As a result, the processor fetches and executes the instruction at this new address, called
the branch target, instead of the instruction at the location that follows the branch instruction in
sequential address order. A conditional branch instruction causes a branch only if a specified
condition is satisfied. If the condition is not satisfied, the PC is incremented in the normal way, and
the next instruction in sequential address order is fetched and executed.
In the program in Figure 1.5.4, the instruction Branch>0 LOOP (branch if greater than 0)
is a conditional branch instruction that causes a branch to location LOOP if the result of the
immediately preceding instruction, which is the decremented value in register R1, is greater than
zero. This means that the loop is repeated as long as there are entries in the list that are yet to be
added to R0. At the end of the nth pass through the loop, the Decrement instruction produces a
value of zero, and, hence, branching does not occur. Instead, the Move instruction is fetched and
executed. It moves the final result from R0 into memory location SUM.
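The control flow of Figure 1.5.4 corresponds to a counted do-while loop. Below is a C rendering
of the same logic, with comments naming the instruction each statement stands for; the list values
are sample data invented for the illustration.

#include <stdio.h>

int main(void) {
    int num[] = {10, 20, 30, 40};   /* NUM1..NUMn (sample values) */
    int N = 4;                      /* number of entries          */
    int r0 = 0;                     /* Clear R0                   */
    int r1 = N;                     /* Move N,R1                  */
    int i = 0;

    do {
        r0 += num[i++];             /* Add (next entry),R0        */
        r1--;                       /* Decrement R1               */
    } while (r1 > 0);               /* Branch>0 LOOP              */

    printf("SUM = %d\n", r0);       /* Move R0,SUM -> 100         */
    return 0;
}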
CONDITION CODES
The processor keeps track of information about the results of various operations for use by
subsequent conditional branch instructions. This is accomplished by recording the required
information in individual bits, often called condition code flags. These flags are usually grouped
together in a special processor register called the condition code register or status register.
Individual condition code flags are set to 1 or cleared to 0, depending on the outcome of the
operation performed.
Four commonly used flags are
N (negative) Set to 1 if the result is negative; otherwise, cleared to 0
Z (zero) Set to 1 if the result is 0; otherwise, cleared to 0
V (overflow) Set to 1 if arithmetic overflow occurs; otherwise, cleared to 0
C (carry) Set to 1 if a carry-out results from the operation; otherwise, cleared to 0
The N and Z flags indicate whether the result of an arithmetic or logic operation is negative
or zero. The N and Z flags may also be affected by instructions that transfer data, such as Move,
Load, or Store. This makes it possible for a later conditional branch instruction to cause a branch
based on the sign and value of the operand that was moved. Some computers also provide a special
Test instruction that examines a value in a register or in the memory and sets or clears the N and Z
flags accordingly.
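The four flags can be computed directly from the result of an operation. The C sketch below sets
N, Z, V and C after an 8-bit addition; the operand width and the particular overflow test are
illustrative choices, not prescribed by the text.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t a = 0x7F, b = 0x01;          /* 127 + 1 in 8 bits       */
    uint16_t wide = (uint16_t)a + b;     /* keep the carry-out bit  */
    uint8_t r = (uint8_t)wide;

    int N = (r & 0x80) != 0;                      /* sign bit of result   */
    int Z = (r == 0);                             /* result is zero       */
    int C = (wide > 0xFF);                        /* carry out of bit 7   */
    int V = (~(a ^ b) & (a ^ r) & 0x80) != 0;     /* signed overflow:
                                                     same-sign operands,
                                                     different-sign result */

    printf("N=%d Z=%d V=%d C=%d\n", N, Z, V, C);  /* N=1 Z=0 V=1 C=0 */
    return 0;
}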
GENERATION OF MEMORY ADDRESSES
Executing programs requires flexible ways to specify the address of an operand. The
instruction set of a computer typically provides a number of such methods, called addressing
modes. While the details differ from one computer to another, the underlying concepts are the
same.
1.6 Discuss Hardware – Software Interface
Hardware
Computer hardware is the collection of physical elements that comprise a computer system,
for example the processor, memory, hard disk, floppy disk, keyboard, mouse, monitor and
printer.
Software
Computer software, or just software, is a collection of computer programs and related data
that provides the instructions telling a computer what to do and how to do it. In other words,
software is a conceptual entity: a set of programs, procedures, algorithms, and associated
documentation concerned with the operation of a data processing system, held in the storage of the
computer for some purpose. Software performs the function of the program it implements, either by
directly providing instructions to the computer hardware or by serving as input to another piece of
software. The term was coined in contrast to the old term hardware (meaning physical devices). In
contrast to hardware, software "cannot be touched". Software is also sometimes used in a
narrower sense, meaning application software only.
Types of software
System software
System software provides the basic functions for computer usage and helps run the computer
hardware and system. It includes a combination of the following:
1. Device drivers:
A device driver or software driver is a computer program allowing higher-level computer
programs to interact with a hardware device.
2. Operating systems:
An operating system (OS) is a set of programs that manage computer hardware resources
and provide common services for application software. The operating system is the most important
type of system software in a computer system. A user cannot run an application program on the
computer without an operating system, unless the application program is self-booting.
The main functions of an OS are device management, storage management, user
interface, memory management and processor management.
3. Servers:
A server is a computer program running to serve the requests of other programs, the
"clients". Thus, the "server" performs some computational task on behalf of "clients". The clients
either run on the same computer or connect through the network.
4. Utilities:
Utility software is system software designed to help analyze, configure, optimize or
maintain a computer. A single piece of utility software is usually called a utility or tool. Utility
software usually focuses on how the computer infrastructure (including the computer hardware,
operating system, application software and data storage) operates.
5. Window systems:
A windowing system (or window system) is a component of a graphical user interface
(GUI), and more specifically of a desktop environment, which supports the implementation of
window managers, and provides basic support for graphics hardware, pointing devices such as
mice, and keyboards. The mouse cursor is also generally drawn by the windowing system.
System software is responsible for managing a variety of independent hardware
components, so that they can work together harmoniously. Its purpose is to unburden the
application software programmer from the often complex details of the particular computer being
used, including such accessories as communications devices, printers, device readers, displays and
keyboards, and also to partition the computer's resources such as memory and processor time in a
safe and stable manner.
Programming software
Programming software provides tools that assist a programmer in writing computer
programs and software in different programming languages in a more convenient way. The
tools include:
1. Compilers:
A compiler is a computer program (or set of programs) that transforms source code written
in a programming language (the source language) into another computer language (the target
language, often having a binary form known as object code). The most common reason for
wanting to transform source code is to create an executable program.
The name "compiler" is primarily used for programs that translate source code from a high-
level programming language to a lower level language (e.g., assembly language or machine code).
If the compiled program can run on a computer whose CPU or operating system is different from
the one on which the compiler runs, the compiler is known as a cross-compiler. A program that
translates from a low level language to a higher level one is a decompiler.
A compiler is likely to perform many or all of the following operations: lexical analysis,
preprocessing, parsing, semantic analysis (Syntax-directed translation), code generation, and code
optimization.
2. Debuggers:
A debugger or debugging tool is a computer program that is used to test and debug other
programs (the "target" program). The code to be examined might alternatively be running on an
instruction set simulator (ISS), a technique that allows great power in its ability to halt when
specific conditions are encountered but which will typically be somewhat slower than executing
the code directly on the appropriate (or the same) processor. Some debuggers offer two modes of
operation - full or partial simulation, to limit this impact.
3. Interpreters:
An interpreter is a computer program that executes another program directly, translating
its source code and executing it line by line.
4. Linkers:
A linker or link editor is a program that takes one or more objects generated by a compiler
and combines them into a single executable program.
G. APPASAMI / COA / CSE / DR. PAULS ENGINEERING COLLEGE 21
5. Text editors:
A text editor is a type of program used for editing plain text files. Text editors are often
provided with operating systems or software development packages, and can be used to change
configuration files and programming language source code.
An Integrated development environment (IDE) is a single application that attempts to manage all
these functions.
Application software
Application software is developed to perform any task that benefits from computation. It is a
broad category that encompasses software of many kinds, including the Internet browser used to
display web pages. This category includes:
Business software
Computer-aided design
Databases
Decision making software
Educational software
Image editing
Industrial automation
Mathematical software
Medical software
Simulation software
Spreadsheets
Word processing
Hardware – Software Interface
An interface is a point of interaction between components, and is applicable at the level of
both hardware and software. It allows a component, whether a piece of hardware such as a
graphics card or a piece of software such as an Internet browser, to function independently while
using interfaces to communicate with other components via an input/output system and an
associated protocol.
All device drivers, operating systems, servers, utilities, window systems, compilers,
debuggers, interpreters, linkers and text editors are considered Hardware – Software Interfaces.
1.7 Explain various addressing modes
A program operates on data that reside in the computer's memory. These data can be
organized in a variety of ways. For example to record marks in various courses, we may organize
this information in the form of a table. Programmers use organizations called data structures to
represent the data used in computations. These include lists, linked lists, arrays, queues, and so on.
Programs are normally written in a high-level language, which enables the programmer to
use constants, local and global variables, pointers, and arrays. When translating a high-level
language program into assembly language, the compiler must be able to implement these
constructs using the facilities provided in the instruction set of the computer in which the program
will be run. The different ways in which the location of an operand is specified in an instruction
are referred to as addressing modes.
IMPLEMENTATION OF VARIABLES AND CONSTANTS
Variables and constants are the simplest data types and are found in almost every computer
program. A variable is represented by allocating a register or a memory location to hold its value.
Thus, the value can be changed as needed using appropriate instructions.
1. Register addressing mode - The operand is the contents of a processor register; the name
(address) of the register is given in the instruction.
Example: MOVE R1,R2
This instruction copies the contents of register R1 into register R2.
2. Absolute addressing mode - The operand is in a memory location; the address of this location
is given explicitly in the instruction. (In some assembly languages, this mode is called Direct.)
Example: MOVE LOC,R2
This instruction copies the contents of memory location LOC into register R2.
3. Immediate addressing mode - The operand is given explicitly in the instruction.
Example: MOVE #200,R0
The above statement places the value 200 in the register R0. A common convention is to
use the sharp sign (#) in front of the value to indicate that this value is to be used as an immediate
operand.
INDIRECTION AND POINTERS
In the addressing modes that follow, the instruction does not give the operand or its address
explicitly. Instead, it provides information from which the memory address of the operand can be
determined. We refer to this address as the effective address (EA) of the operand.
4. Indirect addressing mode - The effective address of the operand is the contents of a register or
memory location whose address appears in the instruction.
Example: ADD (R2),R0
This instruction adds the operand at the memory location whose address is in register R2 to
register R0. In a program that adds a list of numbers, register R2 can be used as a pointer to the
numbers in the list, and the operands are accessed indirectly through R2. The initialization section
of such a program loads the counter value n from memory location N into R1 and uses the
Immediate addressing mode to place the address value NUM1, which is the address of the first
number in the list, into R2.
INDEXING AND ARRAY
It is useful in dealing with lists and arrays.
5. Index mode - The effective address of the operand is generated by adding a constant value to
the contents of a register.
The register used may be either a special register provided for this purpose or, more
commonly, any one of a set of general-purpose registers in the processor. In either case, it
is referred to as an index register. We indicate the Index mode symbolically as X(Ri),
where X denotes the constant value contained in the instruction and Ri is the name of the
register involved. The effective address of the operand is given by EA = X + [Ri]. The contents of
the index register are not changed in the process of generating the effective address.
RELATIVE ADDRESSING
We have defined the Index mode using general-purpose processor registers. A useful
version of this mode is obtained if the program counter, PC, is used instead of a general purpose
register. Then, X(PC) can be used to address a memory location that is X bytes away from the
location presently pointed to by the program counter. Since the addressed location is identified
"relative" to the program counter, which always identifies the current execution point in a program,
the name Relative mode is associated with this type of addressing.
6. Relative mode - The effective address is determined by the Index mode using the program
counter in place of the general-purpose register Ri.
This mode can be used to access data operands, but its most common use is to specify the target
address in branch instructions. An instruction such as Branch>0 LOOP causes program
execution to go to the branch target location identified by the name LOOP if the branch condition
is satisfied. This location can be computed by specifying it as an offset from the current value of
is satisfied. This location can be computed by specifying it as an offset from the current value of
the program counter. Since the branch target may be either before or after the branch instruction,
the offset is given as a signed number.
ADDITIONAL MODES
The two additional modes described are useful for accessing data items in successive
locations in the memory.
7. Autoincrement mode - The effective address of the operand is the contents of a register
specified in the instruction. After accessing the operand, the contents of this register are
automatically incremented to point to the next item in a list.
We denote the Autoincrement mode by putting the specified register in parentheses, to
show that the contents of the register are used as the effective address, followed by a plus sign to
indicate that these contents are to be incremented after the operand is accessed. Thus, the
Autoincrement mode is written as (Ri)+.
As a companion for the Autoincrement mode, another useful mode accesses the items of a list in
the reverse order:
8. Autodecrement mode - The contents of a register specified in the instruction are first
automatically decremented and are then used as the effective address of the operand.
We denote the Autodecrement mode by putting the specified register in parentheses, preceded by a
minus sign to indicate that the contents of the register are to be decremented before being used as
the effective address. Thus, we write -(Ri).
Table 1.1 Generic addressing modes
Name                     Assembler syntax     Addressing function
Immediate                #Value               Operand = Value
Register                 Ri                   EA = Ri
Absolute (Direct)        LOC                  EA = LOC
Indirect                 (Ri)                 EA = [Ri]
                         (LOC)                EA = [LOC]
Index                    X(Ri)                EA = [Ri] + X
Base with index          (Ri,Rj)              EA = [Ri] + [Rj]
Base with index
  and offset             X(Ri,Rj)             EA = [Ri] + [Rj] + X
Relative                 X(PC)                EA = [PC] + X
Autoincrement            (Ri)+                EA = [Ri]; Increment Ri
Autodecrement            -(Ri)                Decrement Ri; EA = [Ri]

EA = effective address            Value = a signed number
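To make the addressing functions concrete, here is a minimal Python sketch (not part of the original text; the register and memory contents are invented for illustration) that computes effective addresses for several of the modes in Table 1.1:

    # Minimal sketch of the addressing functions in Table 1.1.
    # The register and memory contents below are invented for illustration.
    memory = {1000: 25, 2000: 1000}      # address -> stored word
    reg = {"R1": 2000, "PC": 3000}       # register file

    def ea_indirect(ri):                 # Indirect: EA = [Ri]
        return reg[ri]

    def ea_index(x, ri):                 # Index: EA = [Ri] + X
        return reg[ri] + x

    def ea_relative(x):                  # Relative: EA = [PC] + X
        return reg["PC"] + x

    def ea_autoincrement(ri, step=4):    # EA = [Ri]; then increment Ri
        ea = reg[ri]
        reg[ri] += step
        return ea

    def ea_autodecrement(ri, step=4):    # decrement Ri; then EA = [Ri]
        reg[ri] -= step
        return reg[ri]

    print(ea_indirect("R1"))             # 2000
    print(ea_index(8, "R1"))             # 2008
    print(ea_relative(-4))               # 2996
    print(ea_autoincrement("R1"))        # 2000, and R1 becomes 2004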
1.8 Discuss RISC and CISC
In recent years, the boundary between RISC and CISC architectures has been blurred. Future
processors may be designed with features from both RISC and CISC types.
Architectural description of RISC
Figure 1.8.1 RISC Architecture
As shown in figure 1.8.1, RISC architecture uses separate instruction and data caches. Their access
paths are also different. The hardwired control unit is found in most of the RISC processors.
Architectural description of CISC
Figure 1.8.2 CISC Architecture
As shown in figure 1.8.2, a CISC processor has a unified cache for holding both
instructions and data, so they have to share a common path. CISC processors traditionally use a
microprogrammed control unit, but modern CISC processors may also use a hardwired control
unit.
Table 1.2 Difference between RISC and CISC

RISC                                             CISC
Used by Apple                                    Used by Intel and AMD processors
Requires fewer transistors, therefore it is      Slower than RISC chips when performing
easier to design                                 instructions
Faster than CISC                                 More expensive to make compared to RISC
Reduced Instruction Set Computer                 Complex Instruction Set Computer
Pipelining can be implemented easily             Pipelining implementation is not easy
Direct addition between data in two memory       Direct addition between data in two memory
locations is not possible                        locations is possible, e.g., the 8085
Fewer, simpler and faster instructions           Large number of different and complex
                                                 instructions
RISC architecture is not widely used             At least 75% of processors use the CISC
                                                 architecture
RISC chips require fewer transistors and are     CISC chips are relatively slow per
cheaper to produce; it is also easier to write   instruction compared to RISC chips, but
powerful optimized compilers                     programs need fewer instructions
RISC puts a greater burden on the software;      In CISC, software developers need not write
developers need to write more lines for the      more lines for the same tasks
same tasks
Mainly used for real-time applications           Mainly used in normal PCs, workstations
                                                 and servers
Large number of registers, most of which can     CISC processors cannot have a large number
be used as general-purpose registers             of registers
RISC processor has a number of hardwired         CISC processor executes microcode
instructions                                     instructions
Instructions are executed by hardware            Instructions are executed by microprogram
Fixed-format instructions                        Variable-format instructions
Few instructions                                 Many instructions
Multiple register sets                           Single register set
Highly pipelined                                 Less pipelined
Complexity is in the compiler                    Complexity is in the microprogram
1.9 Design ALU
Most of the computer's operations are performed in the arithmetic and logic unit (ALU).
The data and operands are brought from memory to the processor registers, and the actual addition
is carried out in the ALU. Each register can store one word of data. The ALU and control unit are
many times faster than other devices connected to the system.
The following operations can be performed by the ALU:
(i) Arithmetic operations (addition, subtraction, multiplication and division)
(ii) Logical operations (AND, OR, NOT and XOR)
Adders
For a two-input adder there are four possible input combinations:
0 + 0 = 0 ⇒ 1-digit sum and no carry
0 + 1 = 1 ⇒ 1-digit sum and no carry
1 + 0 = 1 ⇒ 1-digit sum and no carry
1 + 1 = 10 ⇒ 2-digit sum (higher bit = carry, lower bit = sum)
Addition without carry-in ⇒ half adder
Addition with carry-in ⇒ full adder
Half adder
The half adder is an example of a simple, functional digital circuit built from two logic gates. A
half adder adds two one-bit binary numbers A and B. It has two outputs, S and C (the value
theoretically carried on to the next addition); the final sum is 2C + S. The simplest half-adder
design, shown in Figure 1.9.4, incorporates an XOR gate for S and an AND gate for C. Half adders
cannot be cascaded on their own, since they provide no carry-in bit.
Inputs          Outputs
A    B          Carry    Sum
0    0          0        0
1    0          0        1
0    1          0        1
1    1          1        0
Figure 1.9.1 Half adder's one-bit truth table
Figure 1.9.2 Half adder block diagram
Figure 1.9.3 Half adder Karnaugh map for carry and sum
Figure 1.9.4 Half adder logic diagram
Disadvantage: a carry from a previous stage cannot be added in a half adder.
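The half adder's behaviour is easy to express in Python (a minimal sketch, not from the text):

    def half_adder(a, b):
        """One-bit half adder: returns (carry, sum) = (A AND B, A XOR B)."""
        return a & b, a ^ b

    # Reproduces the truth table of Figure 1.9.1:
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", half_adder(a, b))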
Full adder
In the schematic symbol for a 1-bit full adder, Cin and Cout are often drawn on the sides of the
block to emphasize their use in a multi-bit adder. A full adder adds binary numbers and accounts
for values carried in as well as out. A one-bit full adder adds three one-bit numbers, often written
as A, B, and Cin; A and B are the operands, and Cin is a bit carried in (in theory from a past
addition). The full adder is usually a component in a cascade of adders, which add 8-, 16-, 32-bit,
etc. binary numbers. The circuit produces a two-bit output sum typically represented by the signals
Cout and S, where sum = 2 × Cout + S.
The one-bit full adder's truth table is:
Inputs                  Outputs
A    B    Cin           Cout    S
0    0    0             0       0
1    0    0             0       1
0    1    0             0       1
1    1    0             1       0
0    0    1             0       1
1    0    1             1       0
0    1    1             1       0
1    1    1             1       1
Figure 1.9.5 Full adder's one-bit truth table
(Half adder Karnaugh maps, Figure 1.9.3: Carry = AB; Sum = A'B + AB' = A ⊕ B)
Figure 1.9.6 Full adder block diagram
Figure 1.9.7 Full adder Karnaugh map for carry and sum
Figure 1.9.8 Full adder logic diagram
Example full adder logic diagram; the AND gates and the OR gate can be replaced with
NAND gates for the same results. A full adder can be implemented in many different ways, such as
with a custom transistor-level circuit or composed of other gates. One example implementation is
with
S = A ⊕ B ⊕ Cin  and  Cout = (A · B) + (Cin · (A ⊕ B)).
In this implementation, the final OR gate before the carry-out output may be replaced by an XOR
gate without altering the resulting logic. Using only two types of gates is convenient if the circuit
is being implemented using simple IC chips which contain only one gate type per chip. In this
light, Cout can be implemented as
Cout = (A · B) ⊕ (Cin · (A ⊕ B)).
A full adder can be constructed from two half adders by connecting A and B to the inputs of one
half adder, connecting the sum from that to one input of the second half adder, connecting Cin to
the other input, and ORing the two carry outputs (see the sketch below). Equivalently, S could be
made the three-bit XOR of A, B, and Cin, and Cout could be made the three-bit majority function
of A, B, and Cin.
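The two-half-adder construction just described can be checked with a short Python sketch (half_adder as in the earlier sketch):

    def half_adder(a, b):
        return a & b, a ^ b               # (carry, sum)

    def full_adder(a, b, cin):
        """Full adder from two half adders, with the carries ORed together."""
        c1, s1 = half_adder(a, b)         # first half adder: A + B
        c2, s = half_adder(s1, cin)       # second half adder: partial sum + Cin
        return c1 | c2, s                 # Cout = OR of the two carry outputs

    # Verify against the truth table of Figure 1.9.5: sum = 2*Cout + S.
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                cout, s = full_adder(a, b, cin)
                assert 2 * cout + s == a + b + cin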
(Full adder Karnaugh maps, Figure 1.9.7: Carry = AB + BC + AC; Sum = A'B'C + A'BC' + AB'C'
+ ABC = (A ⊕ B) ⊕ C)
More complex adders
Ripple carry adder
Figure 1.9.9 4-bit adder with logic gates shown
It is possible to create a logical circuit using multiple full adders to add N-bit numbers.
Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple
carry adder, since each carry bit "ripples" to the next full adder. Note that the first (and only the
first) full adder may be replaced by a half adder.
The layout of a ripple carry adder is simple, which allows for fast design time; however,
the ripple carry adder is relatively slow, since each full adder must wait for the carry bit to be
calculated from the previous full adder. The gate delay can easily be calculated by inspection of
the full adder circuit. Each full adder requires three levels of logic. In a 32-bit ripple carry adder,
there are 32 full adders, so the critical path (worst case) delay is 3 (for carry propagation in the
first adder) + 31 × 2 (for carry propagation in the later adders) = 65 gate delays.
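A sketch of an N-bit ripple carry adder in Python (full_adder as above; bit lists are least-significant bit first):

    def full_adder(a, b, cin):
        s = a ^ b ^ cin
        cout = (a & b) | (cin & (a ^ b))
        return cout, s

    def ripple_carry_add(a_bits, b_bits):
        """Each stage's carry-in is the previous stage's carry-out."""
        carry, result = 0, []
        for a, b in zip(a_bits, b_bits):
            carry, s = full_adder(a, b, carry)
            result.append(s)
        return carry, result

    # 6 + 3 = 9, as 4-bit numbers (LSB first): 0110 + 0011 = 1001
    print(ripple_carry_add([0, 1, 1, 0], [1, 1, 0, 0]))   # (0, [1, 0, 0, 1])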
Carry-lookahead adders
Figure 1.9.10 4-bit adder with carry lookahead
To reduce the computation time, engineers devised faster ways to add two binary numbers
by using carry-lookahead adders. They work by creating two signals (P and G) for each bit
position, based on whether a carry is propagated through from a less significant bit position (at
least one input is a '1'), generated in that bit position (both inputs are '1'), or killed in that bit
position (both inputs are '0'). In most cases, P is simply the sum output of a half adder and
G is the carry output of the same adder. After P and G are generated, the carries for every bit
position are created, as the sketch below illustrates.
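The P/G scheme can be sketched in Python as follows (an illustrative model: the carry recurrence is evaluated in a loop here, whereas real lookahead hardware expands it so all carries are produced in parallel):

    def carry_lookahead_add(a_bits, b_bits):
        """Bit lists are LSB first; returns (carry_out, sum_bits)."""
        p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate (half-adder sum)
        g = [a & b for a, b in zip(a_bits, b_bits)]   # generate (half-adder carry)
        c = [0]                                       # carry into bit 0
        for i in range(len(a_bits)):
            c.append(g[i] | (p[i] & c[i]))            # c[i+1] = G_i + P_i * c[i]
        sums = [p[i] ^ c[i] for i in range(len(a_bits))]
        return c[-1], sums

    print(carry_lookahead_add([0, 1, 1, 0], [1, 1, 0, 0]))   # (0, [1, 0, 0, 1])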
2.1 Explain the process Fundamental concepts
To execute a program, the processor fetches one instruction at a time and performs the
operations specified. Instructions are fetched from successive memory locations until a branch or a
jump instruction is encountered. The processor keeps track of the address of the memory location
containing the next instruction to be fetched using the program counter, PC. After fetching an
instruction, the contents of the PC are updated to point to the next instruction in the sequence. A
branch instruction may load a different value into the PC.
Another key register in the processor is the instruction register, IR. Suppose that each
instruction comprises 4 bytes, and that it is stored in one memory word. To execute an instruction,
the processor has to perform the following three steps:
1. Fetch the contents of the memory location pointed to by the PC. The contents of this
location are interpreted as an instruction to be executed. Hence, they are loaded into the IR.
Symbolically, this can be written as IR ← [[PC]]
2. Assuming that the memory is byte addressable, increment the contents of the PC by 4,
that is, PC ← [PC] + 4
3. Carry out the actions specified by the instruction in the IR.
In cases where an instruction occupies more than one word, steps 1 and 2 must be repeated as
many times as necessary to fetch the complete instruction. These two steps are usually referred to
as the fetch phase; step 3 constitutes the execution phase. Figure 2.1 shows an organization in
which the arithmetic and logic unit (ALU) and all the registers are interconnected via a single
common bus. This bus is internal to the processor and should not be confused with the external bus
that connects the processor to the memory and I/O devices.
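The three steps above can be summarized in a toy Python loop (a sketch; the instruction encoding and memory contents are invented for illustration):

    memory = {0: "nop", 4: "nop", 8: "halt"}   # word address -> instruction

    PC = 0
    while True:
        IR = memory[PC]       # step 1: IR <- [[PC]]   (fetch)
        PC = PC + 4           # step 2: PC <- [PC] + 4 (byte-addressable, 4-byte words)
        if IR == "halt":      # step 3: carry out the action specified in IR
            break
    print("stopped with PC =", PC)   # 12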
The data and address lines of the external memory bus are shown in Figure 2.1 connected
to the internal processor bus via the memory data register, MDR, and the memory address register,
MAR, respectively. Register MDR has two inputs and two outputs. Data may be loaded into MDR
either from the memory bus or from the internal processor bus. The data stored in MDR may be
placed on either bus. The input of MAR is connected to the internal bus, and its output is
connected to the external bus. The control lines of the memory bus are connected to the instruction
decoder and control logic block. This unit is responsible for issuing the signals that control the
operation of all the units inside the processor and for interacting with the memory bus.
The number and use of the processor registers R0 through R(n - 1) vary considerably from
one processor to another. Registers may be provided for general-purpose use by the programmer.
Some may be dedicated as special-purpose registers, such as index registers or stack pointers.
Three registers, Y, Z, and TEMP in Figure 2.1, have not been mentioned before. These registers
are transparent to the programmer, that is, the programmer need not be concerned with them
because they are never referenced explicitly by any instruction. They are used by the processor for
temporary storage during execution of some instructions. These registers are never used for storing
data generated by one instruction for later use by another instruction.
The multiplexer MUX selects either the output of register Y or a constant value 4 to be
provided as input A of the ALU. The constant 4 is used to increment the contents of the program
counter. We will refer to the two possible values of the MUX control input Select as Select4 and
SelectY for selecting the constant 4 or register Y, respectively.
As instruction execution progresses, data are transferred from one register to another, often
passing through the ALU to perform some arithmetic or logic operation. The instruction decoder
and control logic unit is responsible for implementing the actions specified by the instruction
loaded in the IR register. The decoder generates the control signals needed to select the registers
involved and direct the transfer of data. The registers, the ALU, and the interconnecting bus are
collectively referred to as the datapath.
Figure 2.1 Single bus organization of the data path inside a processor.
With few exceptions, an instruction can be executed by performing one or more of the
following operations in some specified sequence:
· Transfer a word of data from one processor register to another or to the ALU
· Perform arithmetic or a logic operation and store the result in a processor register
· Fetch the contents of a given memory location and load them into a processor register
· Store a word of data from a processor register into a given memory location
REGISTER TRANSFERS
Instruction execution involves a sequence of steps in which data are transferred from one
register to another. For each register, two control signals are used to place the contents of that
register on the bus or to load the data on the bus into the register. This is represented symbolically
in Figure 2.2. The input and output of register Ri are connected to the bus via switches controlled
by the signals Riin and Ri out respectively. When Riin is set to 1, the data on the bus are loaded into
Ri. Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus. While Riout is
equal to 0, the bus can be used for transferring data from other registers.
Suppose that we wish to transfer the contents of register R1 to register R4. This can be
accomplished as follows:
· Enable the output of register R1 by setting R1out to 1. This places the contents of R1 on the
processor bus.
· Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into
register R4.
Figure 2.2 Input and output gating for the registers in figure 2.1.
All operations and data transfers within the processor take place within time periods
defined by the processor clock. The control signals that govern a particular transfer are asserted at
the start of the clock cycle. In our example, R1out and R4in are set to 1. The registers consist of
edge-triggered flip-flops. Hence, at the next active edge of the clock, the flip-flops that constitute
R4 will load the data present at their inputs. At the same time, the control signals R1out and R4in
will return to 0. We will use this simple model of the timing of data transfers for the rest of this
chapter. However, we should point out that other schemes are possible. For example, data transfers
may use both the rising and falling edges of the clock. Also, when edge-triggered flip-flops are not
used, two or more clock signals may be needed to guarantee proper transfer of data. This is known
as multiphase clocking.
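As a toy model, this Python sketch (invented, not from the text) treats the bus as a value driven by the register whose out signal is asserted and latched by the register whose in signal is asserted in the same clock cycle:

    # Sketch of one clock cycle with R1out = 1 and R4in = 1.
    registers = {"R1": 42, "R4": 0}

    def clock_cycle(out_reg, in_reg):
        bus = registers[out_reg]     # Riout = 1: contents placed on the bus
        registers[in_reg] = bus      # Riin = 1: bus latched at the active clock edge

    clock_cycle("R1", "R4")          # corresponds to asserting R1out and R4in
    print(registers)                 # {'R1': 42, 'R4': 42}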
PERFORMING ARITHMETIC AND LOGICAL OPERATION
The ALU is a combinational circuit that has no internal storage. It performs arithmetic and
logic operations on the two operands applied to its A and B inputs. In Figures 2.1 and 2.2, one of
the operands is the output of the multiplexer MUX and the other operand is obtained directly from
the bus. The result produced by the ALU is stored temporarily in register Z. Therefore, a sequence
of operations to add the contents of register R1 to those of register R2 and store the result in
register R3 is
1. R1out, Yin
2. R2out, Select Y, Add, Zin
3. Zout, R3in
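The three control steps above can be traced in Python (a sketch with invented register values; Y and Z stand for the ALU's input latch and output register):

    regs = {"R1": 10, "R2": 32, "R3": 0, "Y": 0, "Z": 0}

    bus = regs["R1"]; regs["Y"] = bus               # step 1: R1out, Yin
    bus = regs["R2"]; regs["Z"] = regs["Y"] + bus   # step 2: R2out, SelectY, Add, Zin
    bus = regs["Z"]; regs["R3"] = bus               # step 3: Zout, R3in

    print(regs["R3"])                               # 42 = 10 + 32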
FETCHING A WORD FROM MEMORY
The connections for register MDR are illustrated in Figure 2.4. It has four control signals:
MDRin and MDRout control the connection to the internal bus, and MDRinE and MDRoutE control
the connection to the external bus. The circuit in Figure 2.3 is easily modified to provide the
additional connections. A three-input multiplexer can be used, with the memory bus data line
connected to the third input. This input is selected when MDRinE = 1. A second tri-state gate,
controlled by MDRoutE can be used to connect the output of the flip-flop to the memory bus.
Figure 2.3 Input and output gating for one register bit.
Figure 2.4 Connections and control signals for register MDR.
As an example of a read operation, consider the instruction Move (R1),R2. The actions needed to
execute this instruction are:
1. MAR ← [R1]
2. Start a Read operation on the memory bus
3. Wait for the MFC response from the memory
4. Load MDR from the memory bus
5. R2 ← [MDR]
These actions may be carried out as separate steps, but some can be combined into a single step.
Each action can be completed in one clock cycle, except action 3 which requires one or more clock
cycles, depending on the speed of the addressed device.
The memory read operation requires three steps, which can be described by the signals
being activated as follows:
1. R1out, MARin, Read
2. MDRinE, WMFC
3. MDRout, R2in
where WMFC is the control signal that causes the processor's control circuitry to wait for the
arrival of the MFC signal.
STORING A WORD IN MEMORY
Writing a word into a memory location follows a similar procedure. The desired address is
loaded into MAR. Then, the data to be written are loaded into MDR, and a Write command is
issued. Hence, executing the instruction Move R2,(R1) requires the following sequence:
1. R1out, MARin
2. R2out, MDRin, Write
3. MDRoutE, WMFC
As in the case of the read operation, the Write control signal causes the memory bus interface
hardware to issue a Write command on the memory bus. The processor remains in step 3 until the
memory operation is completed and an MFC response is received.
G. APPASAMI / COA / CSE / DR. PAULS ENGINEERING COLLEGE 36
2.2 Explain the process complete instruction execution
Consider the instruction Add (R3),R1, which adds the contents of a memory location
pointed to by R3 to register R1. Executing this instruction requires the following actions:
1. Fetch the instruction.
2. Fetch the first operand (the contents of the memory location pointed to by R3).
3. Perform the addition.
4. Load the result into R1.
Figure 2.5 gives the sequence of control steps required to perform these operations for the single-
bus architecture of Figure 2.1. Instruction execution proceeds as follows. In step 1, the instruction
fetch operation is initiated by loading the contents of the PC into the MAR and sending a Read
request to the memory. The Select signal is set to Select4, which causes the multiplexer MUX to
select the constant 4. This value is added to the operand at input B, which is the contents of the PC,
and the result is stored in register Z. The updated value is moved from register Z back into the PC
during step 2, while waiting for the memory to respond. In step 3, the word fetched from the
memory is loaded into the IR.
Steps 1 through 3 constitute the instruction fetch phase, which is the same for all instructions. The
instruction decoding circuit interprets the contents of the IR at the beginning of step 4. This
enables the control circuitry to activate the control signals for steps 4 through 7, which constitute
the execution phase. The contents of register R3 are transferred to the MAR in step 4, and a
memory read operation is initiated. Then
Figure 2.5 Control sequence for execution of instruction Add (R3),R1.
the contents of R1 are transferred to register Y in step 5, to prepare for the addition operation.
When the Read operation is completed, the memory operand is available in register MDR, and the
addition operation is performed in step 6. The contents of MDR are gated to the bus, and thus also
to the B input of the ALU, and register Y is selected as the second input to the ALU by choosing
SelectY. The sum is stored in register Z, then transferred to R1 in step 7. The End signal causes a
new instruction fetch cycle to begin by returning to step 1.
This discussion accounts for all control signals in Figure 2.5 except Yin in step 2. There is no need
to copy the updated contents of PC into register Y when executing the Add instruction. But, in
Branch instructions the updated value of the PC is needed to compute the Branch target address.
To speed up the execution of Branch instructions, this value is copied into register Y in step 2.
G. APPASAMI / COA / CSE / DR. PAULS ENGINEERING COLLEGE 37
Since step 2 is part of the fetch phase, the same action will be performed for all instructions. This
does not cause any harm because register Y is not used for any other purpose at that time.
BRANCH INSTRUCTION
A branch instruction replaces the contents of the PC with the branch target address. This
address is usually obtained by adding an offset X, which is given in the branch instruction, to the
updated value of the PC. Figure 2.6 gives a control sequence that implements an unconditional
branch instruction. Processing starts, as usual, with the fetch phase. This phase ends when the
instruction is loaded into the IR in step 3. The offset value is extracted from the IR by the
instruction decoding circuit, which will also perform sign extension if required. Since the value of
the updated PC is already available in register Y, the offset X is gated onto the bus in step 4, and
an addition operation is performed. The result, which is the branch target address, is loaded into
the PC in step 5.
The offset X used in a branch instruction is usually the difference between the branch target
address and the address immediately following the branch instruction. For example, if the branch
instruction is at location 2000 and if the branch target address is 2050, the value of X must be 46.
The reason for this can be readily appreciated from the control sequence in Figure 2.6. The PC is
incremented during the fetch phase, before knowing the type of instruction being executed. Thus,
when the branch address is computed in step 4, the PC value used is the updated value, which
points to the instruction following the branch instruction in the memory.
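The offset arithmetic can be checked directly (a small sketch using the numbers from the example above):

    # Branch instruction at 2000, target at 2050, 4-byte instructions.
    branch_address = 2000
    target_address = 2050
    updated_pc = branch_address + 4          # PC already advanced past the branch
    X = target_address - updated_pc          # offset stored in the instruction
    print(X)                                 # 46
    print(updated_pc + X == target_address)  # True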
Figure 2.6 Control sequence for an unconditional branch instruction.
Consider now a conditional branch. In this case, we need to check the status of the condition codes
before loading a new value into the PC. For example, for a Branch-on-negative (Branch<0)
instruction, step 4 in Figure 2.6 is replaced with
4. Offset-field-of-IRout, Add, Zin, If N = 0 then End
Thus, if N = 0 the processor returns to step 1 immediately after step 4. If N = 1, step 5 is performed
to load a new value into the PC, thus performing the branch operation.
2.3 Discuss multiple bus organization
In single bus organization, only one data item can be transferred over the bus in a clock
cycle. To reduce the number of steps needed, most commercial processors provide multiple
internal paths that enable several transfers to take place in parallel.
Figure 2.7 Three bus organization of data path.
Figure 2.7 illustrates a three-bus structure used to connect the registers and the ALU of a
processor. All general-purpose registers are combined into a single block called the register file.
The register file in Figure 2.7 is said to have three ports. There are two outputs, allowing the
contents of two different registers to be accessed simultaneously and have their contents placed on
buses A and B. The third port allows the data on bus C to be loaded into a third register during the
same clock cycle.
Buses A and B are used to transfer the source operands to the A and B inputs of the ALU,
where an arithmetic or logic operation may be performed. The result is transferred to the
destination over bus C. If needed, the ALU may simply pass one of its two input operands
unmodified to bus C. We will call the ALU control signals for such an operation R=A or R=B.
A second feature in Figure 2.7 is the introduction of the Incrementer unit, which is used to
increment the PC by 4. Using the Incrementer eliminates the need to add 4 to the PC using the
main ALU, as was done in the single bus organization. The source for the constant 4 at the ALU
input multiplexer is still useful. It can be used to increment other addresses, such as the memory
addresses in LoadMultiple and StoreMultiple instructions.
Figure 2.8 Control sequence for execution of instruction Add R4,R5,R6 for 3 bus organization
Consider the three-operand instruction Add R4,R5,R6
The control sequence for executing this instruction is given in Figure 2.8. In step 1, the
contents of the PC are passed through the ALU, using the R=B control signal, and loaded into the
MAR to start a memory read operation. At the same time the PC is incremented by 4. Note that the
value loaded into MAR is the original contents of the PC. The incremented value is loaded into the
PC at the end of the clock cycle and will not affect the contents of MAR. In step 2, the processor
waits for MFC and loads the data received into MDR, then transfers them to IR in step 3. Finally,
the execution phase of the instruction requires only one control step to complete, step 4. By
providing more paths for data transfer a significant reduction in the number of clock cycles needed
to execute an instruction is achieved.
2.4 Discuss Hardwired control
To execute instructions, the processor must have some means of generating the control
signals needed in the proper sequence. Computer designers use a wide variety of techniques to
solve this problem. The approaches used fall into one of two categories: hardwired control and
microprogrammed control.
The required control signals are determined by the following information:
· Contents of the control step counter
· Contents of the instruction register
· Contents of the condition code flags
· External input signals, such as MFC and interrupt requests
Figure 2.10 Control unit organization.
To gain insight into the structure of the control unit, we start with a simplified view of the
hardware involved. The decoder/encoder block in Figure 2.10 is a combinational circuit that
generates the required control outputs, depending on the state of all its inputs. By separating the
decoding and encoding functions, we obtain the more detailed block diagram in Figure 2.11. The
step decoder provides a separate signal line for each step, or time slot, in the control sequence.
Similarly, the output of the instruction decoder consists of a separate line for each machine
instruction. For any instruction loaded in the IR, one of the output lines INS1 through INSm is set
to 1, and all other lines are set to 0. (For design details of decoders, refer to Appendix A.) The
input signals to the encoder block in Figure 2.11 are combined to generate the individual control
signals Yin, PCout, Add, End, and so on. An example of how the encoder generates the Zin
control signal for the processor organization is given in Figure 2.12. This circuit implements the
logic function
Zin = T1 + T6 · ADD + T4 · BR + …
This signal is asserted during time slot T1 for all instructions, during T6 for an Add
instruction, during T4 for an unconditional branch instruction, and so on. Figure 2.13 gives a
circuit that generates the End control signal from the logic function
End = T7 · ADD + T5 · BR + (T5 · N + T4 · N′) · BRN + …
The End signal starts a new instruction fetch cycle by resetting the control step counter to its
starting value. Figure 2.11 contains another control signal called RUN. When set to 1, RUN causes
the counter to be incremented by one at the end of every clock cycle. When RUN is equal to 0, the
counter stops counting. This is needed whenever the WMFC signal is issued, to cause the
processor to wait for the reply from the memory.
Figure 2.11 Separation of decoding and encoding functions.
The control hardware shown in Figure 2.10 or 2.11 can be viewed as a state machine that
changes from one state to another in every clock cycle, depending on the contents of the
instruction register, the condition codes, and the external inputs. The outputs of the state machine
are the control signals. The sequence of operations carried out by this machine is determined by
the wiring of the logic elements, hence the name "hardwired." A controller that uses this approach
can operate at high speed. However, it has little flexibility, and the complexity of the instruction
set it can implement is limited.
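A small Python sketch can imitate this state-machine view (assuming the Zin and End functions reconstructed above; the names ADD, BR and BRN stand for the decoded instruction lines):

    def zin(T, ins):
        # Zin = T1 + T6.ADD + T4.BR + ...
        return T == 1 or (T == 6 and ins == "ADD") or (T == 4 and ins == "BR")

    def end(T, ins, N=0):
        # End = T7.ADD + T5.BR + (T5.N + T4.N') . BRN + ...
        return ((T == 7 and ins == "ADD")
                or (T == 5 and ins == "BR")
                or (ins == "BRN" and ((T == 5 and N == 1) or (T == 4 and N == 0))))

    print(zin(1, "ADD"), zin(6, "ADD"), zin(4, "BR"))   # True True True
    print(end(4, "BRN", N=0), end(5, "BRN", N=1))       # True True (not taken / taken)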
Figure 2.12 Generation of the Zin control signal for the processor in Figure 2.1.
Figure 2.13 Generation of the End control signal.
A COMPLETE PROCESSOR
A complete processor can be designed using the structure shown in Figure 2.14.
This structure has an instruction unit that fetches instructions from an instruction cache or from the
main memory when the desired instructions are not already in the cache. It has separate processing
units to deal with integer data and floating-point data. A data cache is inserted between these units
and the main memory. Using separate caches for instructions and data is common practice in many
processors today. Other processors use a single cache that stores both instructions and data. The
processor is connected to the system bus and, hence, to the rest of the computer, by means of a bus
interface.
Figure 2.14 Block diagram of a complete processor.
2.5 Micro programmed control
An alternative scheme to hardwired control is called microprogrammed control, in which
control signals are generated by a program similar to machine language programs.
Figure 2.15 An example of microinstructions for Figure 2.6
A control word (CW) is a word whose individual bits represent the various control signals.
Each of the control steps in the control sequence of an instruction defines a unique combination of
1s and 0s in the CW. The CWs corresponding to the 7 steps of Figure 2.6 are shown in Figure
2.15. SelectY is represented by Select = 0 and Select4 by Select = 1. A sequence of CWs
corresponding to the control sequence of a machine instruction constitutes the microroutine for that
instruction, and the individual control words in this microroutine are referred to as
microinstructions.
Figure 2.16 Basic organization of a microprogrammed control unit.
The microroutines for all instructions in the instruction set of a computer are stored in a
special memory called the control store. The control unit can generate the control signals for any
instruction by sequentially reading the CW s of the corresponding microroutine from the control
store. This suggests organizing the control unit as shown in Figure 2.16. To read the control words
sequentially from the control store, a microprogram counter (µPC) is used. Every time a new
instruction is loaded into the IR, the output of the block labeled "starting address generator" is
loaded into the µPC. The µPC is then automatically incremented by the clock, causing successive
microinstructions to be read from the control store. Hence, the control signals are delivered to
various parts of the processor in the correct sequence.
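The µPC/control-store mechanism can be sketched in a few lines of Python (the control words and the starting address 25 below are invented placeholders, not the book's actual encoding):

    control_store = {
        25: {"Offset-of-IRout", "Add", "Zin"},   # hypothetical microinstruction
        26: {"Zout", "PCin", "End"},             # hypothetical microinstruction
    }

    def run_microroutine(start):
        uPC = start                   # loaded from the starting address generator
        while True:
            cw = control_store[uPC]   # one control word read per clock cycle
            print(uPC, sorted(cw))    # control signals issued in this cycle
            if "End" in cw:           # End returns control to instruction fetch
                break
            uPC += 1                  # µPC incremented by the clock

    run_microroutine(25)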
An alternative approach in microprogrammed control is to use conditional branch
microinstructions. In addition to the branch address, these microinstructions specify which of the
external inputs, condition codes, or, possibly, bits of the instruction register should be checked as a
condition for branching to take place.
Figure 2.17 Microroutine for the instruction Branch<0
Figure 2.18 Organization of a control unit to allow conditional branching in microprogram
The instruction Branch<0 may now be implemented by a microroutine such as that shown
in Figure 2.17. After loading this instruction into the IR, a branch microinstruction transfers
control to the corresponding microroutine, which is assumed to start at location 25 in the control
store. This address is the output of the starting address generator block in Figure 2.16. The
microinstruction at location 25 tests the N bit of the condition codes. If this bit is equal to 0, a
branch takes place to location 0 to fetch a new machine instruction. Otherwise, the
microinstruction at location 26 is executed, and the branch target address is loaded into the PC.
 
FUNDAMENTAL UNITS OF COMPUTER.pptx
FUNDAMENTAL UNITS OF COMPUTER.pptxFUNDAMENTAL UNITS OF COMPUTER.pptx
FUNDAMENTAL UNITS OF COMPUTER.pptx
 
Computer Fundamental
Computer FundamentalComputer Fundamental
Computer Fundamental
 
Basic Organisation and fundamental Of Computer.pptx
Basic Organisation and fundamental Of Computer.pptxBasic Organisation and fundamental Of Computer.pptx
Basic Organisation and fundamental Of Computer.pptx
 
Computer Fundamentals
Computer FundamentalsComputer Fundamentals
Computer Fundamentals
 
Tìm hiểu về Công nghệ thông tin (IT) toàn tập
Tìm hiểu về Công nghệ thông tin (IT) toàn tậpTìm hiểu về Công nghệ thông tin (IT) toàn tập
Tìm hiểu về Công nghệ thông tin (IT) toàn tập
 
Basic of computers
Basic of computersBasic of computers
Basic of computers
 
Co notes3 sem
Co notes3 semCo notes3 sem
Co notes3 sem
 
os mod1 notes
 os mod1 notes os mod1 notes
os mod1 notes
 
unit-i.pdf
unit-i.pdfunit-i.pdf
unit-i.pdf
 
Computer Organization and Architecture for engineering
Computer Organization and Architecture for engineeringComputer Organization and Architecture for engineering
Computer Organization and Architecture for engineering
 
Introduction to pc operations nc ii
Introduction to pc operations nc iiIntroduction to pc operations nc ii
Introduction to pc operations nc ii
 
Introductiontopcoperationsncii 130724004019-phpapp01
Introductiontopcoperationsncii 130724004019-phpapp01Introductiontopcoperationsncii 130724004019-phpapp01
Introductiontopcoperationsncii 130724004019-phpapp01
 
Computer system
Computer systemComputer system
Computer system
 
3. Component of computer - System Unit ( CSI-321)
3. Component of computer - System Unit  ( CSI-321) 3. Component of computer - System Unit  ( CSI-321)
3. Component of computer - System Unit ( CSI-321)
 
The Deal
The DealThe Deal
The Deal
 
Csc 2313 (lecture 1)
Csc 2313 (lecture 1)Csc 2313 (lecture 1)
Csc 2313 (lecture 1)
 
Csc 2313 (lecture 1)
Csc 2313 (lecture 1)Csc 2313 (lecture 1)
Csc 2313 (lecture 1)
 
Project on computer assembling
Project on computer assemblingProject on computer assembling
Project on computer assembling
 
Introduction to Computer and Generations of Computer by Er. Kamlesh Tripathi
Introduction to Computer and Generations of Computer by Er. Kamlesh TripathiIntroduction to Computer and Generations of Computer by Er. Kamlesh Tripathi
Introduction to Computer and Generations of Computer by Er. Kamlesh Tripathi
 

Mais de appasami

Data visualization using python
Data visualization using pythonData visualization using python
Data visualization using pythonappasami
 
Cs6503 theory of computation book notes
Cs6503 theory of computation book notesCs6503 theory of computation book notes
Cs6503 theory of computation book notesappasami
 
Cs6503 theory of computation april may 2017
Cs6503 theory of computation april may 2017Cs6503 theory of computation april may 2017
Cs6503 theory of computation april may 2017appasami
 
Cs6660 compiler design may june 2017 answer key
Cs6660 compiler design may june 2017  answer keyCs6660 compiler design may june 2017  answer key
Cs6660 compiler design may june 2017 answer keyappasami
 
Cs6660 compiler design november december 2016 Answer key
Cs6660 compiler design november december 2016 Answer keyCs6660 compiler design november december 2016 Answer key
Cs6660 compiler design november december 2016 Answer keyappasami
 
Cs6660 compiler design may june 2016 Answer Key
Cs6660 compiler design may june 2016 Answer KeyCs6660 compiler design may june 2016 Answer Key
Cs6660 compiler design may june 2016 Answer Keyappasami
 
CS2303 theory of computation Toc answer key november december 2014
CS2303 theory of computation Toc answer key november december 2014CS2303 theory of computation Toc answer key november december 2014
CS2303 theory of computation Toc answer key november december 2014appasami
 
Cs6503 theory of computation november december 2016
Cs6503 theory of computation november december 2016Cs6503 theory of computation november december 2016
Cs6503 theory of computation november december 2016appasami
 
Cs2303 theory of computation all anna University question papers
Cs2303 theory of computation all anna University question papersCs2303 theory of computation all anna University question papers
Cs2303 theory of computation all anna University question papersappasami
 
CS2303 Theory of computation April may 2015
CS2303 Theory of computation April may  2015CS2303 Theory of computation April may  2015
CS2303 Theory of computation April may 2015appasami
 
Cs2303 theory of computation may june 2016
Cs2303 theory of computation may june 2016Cs2303 theory of computation may june 2016
Cs2303 theory of computation may june 2016appasami
 
Cs2303 theory of computation november december 2015
Cs2303 theory of computation november december 2015Cs2303 theory of computation november december 2015
Cs2303 theory of computation november december 2015appasami
 
CS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookCS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookappasami
 
Cs6503 theory of computation november december 2015 be cse anna university q...
Cs6503 theory of computation november december 2015  be cse anna university q...Cs6503 theory of computation november december 2015  be cse anna university q...
Cs6503 theory of computation november december 2015 be cse anna university q...appasami
 
Cs6503 theory of computation may june 2016 be cse anna university question paper
Cs6503 theory of computation may june 2016 be cse anna university question paperCs6503 theory of computation may june 2016 be cse anna university question paper
Cs6503 theory of computation may june 2016 be cse anna university question paperappasami
 
Cs6402 design and analysis of algorithms impartant part b questions appasami
Cs6402 design and analysis of algorithms  impartant part b questions appasamiCs6402 design and analysis of algorithms  impartant part b questions appasami
Cs6402 design and analysis of algorithms impartant part b questions appasamiappasami
 
Cs6702 graph theory and applications question bank
Cs6702 graph theory and applications question bankCs6702 graph theory and applications question bank
Cs6702 graph theory and applications question bankappasami
 
Cs6702 graph theory and applications Anna University question paper apr may 2...
Cs6702 graph theory and applications Anna University question paper apr may 2...Cs6702 graph theory and applications Anna University question paper apr may 2...
Cs6702 graph theory and applications Anna University question paper apr may 2...appasami
 
Cs6503 theory of computation syllabus
Cs6503 theory of computation syllabusCs6503 theory of computation syllabus
Cs6503 theory of computation syllabusappasami
 
Cs6503 theory of computation index page
Cs6503 theory of computation index pageCs6503 theory of computation index page
Cs6503 theory of computation index pageappasami
 

Mais de appasami (20)

Data visualization using python
Data visualization using pythonData visualization using python
Data visualization using python
 
Cs6503 theory of computation book notes
Cs6503 theory of computation book notesCs6503 theory of computation book notes
Cs6503 theory of computation book notes
 
Cs6503 theory of computation april may 2017
Cs6503 theory of computation april may 2017Cs6503 theory of computation april may 2017
Cs6503 theory of computation april may 2017
 
Cs6660 compiler design may june 2017 answer key
Cs6660 compiler design may june 2017  answer keyCs6660 compiler design may june 2017  answer key
Cs6660 compiler design may june 2017 answer key
 
Cs6660 compiler design november december 2016 Answer key
Cs6660 compiler design november december 2016 Answer keyCs6660 compiler design november december 2016 Answer key
Cs6660 compiler design november december 2016 Answer key
 
Cs6660 compiler design may june 2016 Answer Key
Cs6660 compiler design may june 2016 Answer KeyCs6660 compiler design may june 2016 Answer Key
Cs6660 compiler design may june 2016 Answer Key
 
CS2303 theory of computation Toc answer key november december 2014
CS2303 theory of computation Toc answer key november december 2014CS2303 theory of computation Toc answer key november december 2014
CS2303 theory of computation Toc answer key november december 2014
 
Cs6503 theory of computation november december 2016
Cs6503 theory of computation november december 2016Cs6503 theory of computation november december 2016
Cs6503 theory of computation november december 2016
 
Cs2303 theory of computation all anna University question papers
Cs2303 theory of computation all anna University question papersCs2303 theory of computation all anna University question papers
Cs2303 theory of computation all anna University question papers
 
CS2303 Theory of computation April may 2015
CS2303 Theory of computation April may  2015CS2303 Theory of computation April may  2015
CS2303 Theory of computation April may 2015
 
Cs2303 theory of computation may june 2016
Cs2303 theory of computation may june 2016Cs2303 theory of computation may june 2016
Cs2303 theory of computation may june 2016
 
Cs2303 theory of computation november december 2015
Cs2303 theory of computation november december 2015Cs2303 theory of computation november december 2015
Cs2303 theory of computation november december 2015
 
CS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookCS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf book
 
Cs6503 theory of computation november december 2015 be cse anna university q...
Cs6503 theory of computation november december 2015  be cse anna university q...Cs6503 theory of computation november december 2015  be cse anna university q...
Cs6503 theory of computation november december 2015 be cse anna university q...
 
Cs6503 theory of computation may june 2016 be cse anna university question paper
Cs6503 theory of computation may june 2016 be cse anna university question paperCs6503 theory of computation may june 2016 be cse anna university question paper
Cs6503 theory of computation may june 2016 be cse anna university question paper
 
Cs6402 design and analysis of algorithms impartant part b questions appasami
Cs6402 design and analysis of algorithms  impartant part b questions appasamiCs6402 design and analysis of algorithms  impartant part b questions appasami
Cs6402 design and analysis of algorithms impartant part b questions appasami
 
Cs6702 graph theory and applications question bank
Cs6702 graph theory and applications question bankCs6702 graph theory and applications question bank
Cs6702 graph theory and applications question bank
 
Cs6702 graph theory and applications Anna University question paper apr may 2...
Cs6702 graph theory and applications Anna University question paper apr may 2...Cs6702 graph theory and applications Anna University question paper apr may 2...
Cs6702 graph theory and applications Anna University question paper apr may 2...
 
Cs6503 theory of computation syllabus
Cs6503 theory of computation syllabusCs6503 theory of computation syllabus
Cs6503 theory of computation syllabus
 
Cs6503 theory of computation index page
Cs6503 theory of computation index pageCs6503 theory of computation index page
Cs6503 theory of computation index page
 

Último

Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Erbil Polytechnic University
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.elesangwon
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosVictor Morales
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSneha Padhiar
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书rnrncn29
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodManicka Mamallan Andavar
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHSneha Padhiar
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 

Último (20)

Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solid
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitos
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
『澳洲文凭』买麦考瑞大学毕业证书成绩单办理澳洲Macquarie文凭学位证书
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument method
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptx
 
Designing pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptxDesigning pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptx
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 

Computer organization-and-architecture-questions-and-answers

4.9 Write about Associative memory 105
4.10 Discuss Secondary memory 108
UNIT 5: I/O ORGANIZATION
5.1 Discuss Accessing I/O Devices 118
5.2 Explain Program-controlled I/O 120
5.3 Explain Interrupts 121
5.4 Explain direct memory access (DMA) 129
5.5 Explain about Buses 134
5.6 Discuss interface circuits 139
5.7 Describe about Standard I/O Interface 147
5.8 Discuss I/O device 160
5.9 Discuss Processors 161
Appendix: Short answers for questions
1.1 Explain functional unit of a computer.

A computer consists of five functionally independent main parts: input, memory, arithmetic and logic, output, and control units, as shown in Figure 1.1. The input unit accepts coded information from human operators, from electromechanical devices such as keyboards, or from other computers over digital communication lines. The information received is either stored in the computer's memory for later reference or immediately used by the arithmetic and logic circuitry to perform the desired operations. The processing steps are determined by a program stored in the memory. Finally, the results are sent back to the outside world through the output unit. All of these actions are coordinated by the control unit. Figure 1.1 does not show the connections among the functional units. The arithmetic and logic circuits, in conjunction with the main control circuits, constitute the processor, and input and output equipment is often collectively referred to as the input-output (I/O) unit.

Figure 1.1 Basic functional unit of a computer (processor with arithmetic and logic unit and control unit; input, output and memory units).

Input Unit
Computers accept coded information through input units, which read data. The most well-known input device is the keyboard. Whenever a key is pressed, the corresponding letter or digit is automatically translated into its corresponding binary code and transmitted over a cable to either the memory or the processor. Many other kinds of input devices are available, including joysticks, trackballs, and mice, which can be used as pointing devices. Touch screens are often used as graphic input devices in conjunction with displays. Microphones can be used to capture audio input, which is then sampled and converted into digital codes for storage and processing. Cameras and scanners are used to capture digital images.

Memory Unit
The function of the memory unit is to store programs and data. There are two classes of storage, called primary and secondary. Primary storage is a fast memory that operates at electronic speeds. Programs must be stored in the memory while they are being executed. The memory contains a large number of semiconductor storage cells, each capable of storing one bit of information. These cells are rarely read or written as individual cells but instead are processed in groups of fixed size called words. The memory is organized so that the contents of one word, containing n bits, can be stored or retrieved in one basic operation.
To provide easy access to any word in the memory, a distinct address is associated with each word location. Addresses are numbers that identify successive locations. A given word is accessed by specifying its address and issuing a control command that starts the storage or retrieval process.

The number of bits in each word is often referred to as the word length of the computer. Typical word lengths range from 16 to 64 bits. The capacity of the memory is one factor that characterizes the size of a computer.

Programs must reside in the memory during execution. Instructions and data can be written into the memory or read out under the control of the processor. It is essential to be able to access any word location in the memory as quickly as possible. Memory in which any location can be reached in a short and fixed amount of time after specifying its address is called random-access memory (RAM). The time required to access one word is called the memory access time. This time is fixed, independent of the location of the word being accessed. It typically ranges from a few nanoseconds (ns) to about 100 ns for modern RAM units.

The memory of a computer is normally implemented as a memory hierarchy of three or four levels of semiconductor RAM units with different speeds and sizes. The small, fast RAM units are called caches. They are tightly coupled with the processor and are often contained on the same integrated circuit chip to achieve high performance. The largest and slowest unit is referred to as the main memory.

Although primary storage is essential, it tends to be expensive. Thus additional, cheaper, secondary storage is used when large amounts of data and many programs have to be stored, particularly for information that is accessed infrequently. A wide selection of secondary storage devices is available, including magnetic disks and tapes and optical disks.

Arithmetic and Logic Unit
Most computer operations are executed in the arithmetic and logic unit (ALU) of the processor. Consider a typical example: Suppose two numbers located in the memory are to be added. They are brought into the processor, and the actual addition is carried out by the ALU. The sum may then be stored in the memory or retained in the processor for immediate use. Any other arithmetic or logic operation, for example, multiplication, division, or comparison of numbers, is initiated by bringing the required operands into the processor, where the operation is performed by the ALU.

When operands are brought into the processor, they are stored in high-speed storage elements called registers. Each register can store one word of data. Access times to registers are somewhat faster than access times to the fastest cache unit in the memory hierarchy.

The control and the arithmetic and logic units are many times faster than other devices connected to a computer system. This enables a single processor to control a number of external devices such as keyboards, displays, magnetic and optical disks, sensors, and mechanical controllers.
Output Unit
The output unit is the counterpart of the input unit. Its function is to send processed results to the outside world. The most familiar example of such a device is a printer. Printers employ mechanical impact heads, inkjet streams, or photocopying techniques, as in laser printers, to perform the printing. It is possible to produce printers capable of printing as many as 10,000 lines per minute. This is a tremendous speed for a mechanical device but is still very slow compared to the electronic speed of a processor unit. Monitors, speakers, headphones and projectors are also output devices.

Some units, such as graphic displays, provide both an output function and an input function. The dual role of such units is often referred to with the single name I/O unit. Storage devices such as hard disks, floppy disks and flash drives are also used for input as well as output.

Control Unit
The memory, arithmetic and logic, and input and output units store and process information and perform input and output operations. The operation of these units must be coordinated in some way. This is the task of the control unit. The control unit is effectively the nerve center that sends control signals to other units and senses their states.

I/O transfers, consisting of input and output operations, are controlled by the instructions of I/O programs that identify the devices involved and the information to be transferred. However, the actual timing signals that govern the transfers are generated by the control circuits. Timing signals are signals that determine when a given action is to take place. Data transfers between the processor and the memory are also controlled by the control unit through timing signals.

It is reasonable to think of a control unit as a well-defined, physically separate unit that interacts with other parts of the machine. In practice, however, this is seldom the case. Much of the control circuitry is physically distributed throughout the machine. A large set of control lines (wires) carries the signals used for timing and synchronization of events in all units.

The operation of a computer can be summarized as follows:
1. The computer accepts information in the form of programs and data through an input unit and stores it in the memory.
2. Information stored in the memory is fetched, under program control, into an arithmetic and logic unit, where it is processed.
3. Processed information leaves the computer through an output unit.
4. All activities inside the machine are directed by the control unit.
1.2 Discuss basic operational concepts.

The processor contains an arithmetic logic unit (ALU), a control unit (CU) and a number of registers used for several different purposes. The instruction register (IR) holds the instruction that is currently being executed. Its output is available to the control circuits, which generate the timing signals that control the various processing elements involved in executing the instruction.

The program counter (PC) is another specialized register. It keeps track of the execution of a program. It contains the memory address of the next instruction to be fetched and executed. During the execution of an instruction, the contents of the PC are updated to correspond to the address of the next instruction to be executed. It is customary to say that the PC points to the next instruction that is to be fetched from the memory. Besides the IR and PC, Figure 1.2 shows n general-purpose registers, R0 through Rn-1.

Figure 1.2 Connections between processor and memory (processor containing the control unit, ALU, PC, IR, MAR, MDR and general-purpose registers R0 through Rn-1, connected to the memory).

Finally, two registers facilitate communication with the memory. These are the memory address register (MAR) and the memory data register (MDR). The MAR holds the address of the location to be accessed. The MDR contains the data to be written into or read out of the addressed location.

Let us now consider some typical operating steps. Programs reside in the memory and usually get there through the input unit. Execution of the program starts when the PC is set to point to the first instruction of the program. The contents of the PC are transferred to the MAR and a Read control signal is sent to the memory. After the time required to access the memory elapses, the addressed word (in this case, the first instruction of the program) is read out of the memory and loaded into the MDR. Next, the contents of the MDR are transferred to the IR. At this point, the instruction is ready to be decoded and executed.
If the instruction involves an operation to be performed by the ALU, it is necessary to obtain the required operands. If an operand resides in the memory (it could also be in a general-purpose register in the processor), it has to be fetched by sending its address to the MAR and initiating a Read cycle. When the operand has been read from the memory into the MDR, it is transferred from the MDR to the ALU. After one or more operands are fetched in this way, the ALU can perform the desired operation. If the result of this operation is to be stored in the memory, then the result is sent to the MDR. The address of the location where the result is to be stored is sent to the MAR, and a Write cycle is initiated. At some point during the execution of the current instruction, the contents of the PC are incremented so that the PC points to the next instruction to be executed. Thus, as soon as the execution of the current instruction is completed, a new instruction fetch may be started.

To perform a given task, an appropriate program consisting of a list of instructions is stored in the memory. Individual instructions are brought from the memory into the processor, which executes the specified operations. Data to be used as operands are also stored in the memory. A typical instruction may be

Add LOCA, R0

This instruction adds the operand at memory location LOCA to the operand in a register in the processor, R0, and places the sum into register R0. The original contents of location LOCA are preserved, whereas those of R0 are overwritten. This instruction requires the performance of several steps. First, the instruction is fetched from the memory into the processor. Next, the operand at LOCA is fetched and added to the contents of R0. Finally, the resulting sum is stored in register R0.

Primary storage (or main memory or internal memory), often referred to simply as memory, is the only storage directly accessible to the processor. The processor continuously reads instructions stored there and executes them as required. Any data actively operated on is also stored there in a uniform manner. The processor transfers data through the MAR and MDR registers under the direction of the control unit.
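The fetch-execute traffic just described can be made concrete with a short simulation. The following C sketch models a hypothetical one-instruction machine: the Word format, the opcode values, and the single register R0 are invented for illustration, and only the PC/MAR/MDR/IR sequence mirrors the steps above.

#include <stdio.h>

/* A minimal sketch of the fetch-execute cycle, for a hypothetical
   machine (not any real ISA): each memory word holds either an
   "Add LOCA,R0"-style instruction or a data value.                */

#define MEM_SIZE 16
#define OP_ADD   1   /* R0 <- R0 + mem[operand] */
#define OP_HALT  0

typedef struct { int opcode; int operand; } Word;

int main(void) {
    Word mem[MEM_SIZE] = {
        {OP_ADD, 3},       /* address 0: add contents of location 3 to R0 */
        {OP_HALT, 0},      /* address 1: stop                              */
        {0, 0},
        {0, 25},           /* address 3 (LOCA): the data operand           */
    };
    int PC = 0, R0 = 17;           /* program counter, one register   */
    int MAR, IR_op, IR_operand;    /* MAR and a two-field IR          */
    Word MDR;

    for (;;) {
        MAR = PC;                  /* PC -> MAR, issue Read           */
        MDR = mem[MAR];            /* memory -> MDR                   */
        IR_op = MDR.opcode;        /* MDR -> IR, then decode          */
        IR_operand = MDR.operand;
        PC = PC + 1;               /* point to the next instruction   */
        if (IR_op == OP_HALT) break;
        if (IR_op == OP_ADD) {
            MAR = IR_operand;      /* fetch the operand at LOCA       */
            MDR = mem[MAR];
            R0 = R0 + MDR.operand; /* ALU adds; result stays in R0    */
        }
    }
    printf("R0 = %d\n", R0);       /* prints R0 = 42                  */
    return 0;
}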
1.3 Explain bus structure of a computer.

Single bus structure
In computer architecture, a bus is a subsystem that transfers data between components inside a computer, or between computers. Early computer buses were literally parallel electrical wires with multiple connections, but modern computer buses can use both parallel and bit-serial connections.

Figure 1.3.1 Single bus structure (processor, memory, input and output units attached to one common bus).

To achieve a reasonable speed of operation, a computer must be organized so that all its units can handle one full word of data at a given time. When a word of data is transferred between units, all its bits are transferred in parallel, that is, the bits are transferred simultaneously over many wires, or lines, one bit per line. A group of lines that serves as a connecting path for several devices is called a bus. In addition to the lines that carry the data, the bus must have lines for address and control purposes.

The simplest way to interconnect functional units is to use a single bus, as shown in Figure 1.3.1. All units are connected to this bus. Because the bus can be used for only one transfer at a time, only two units can actively use the bus at any given time. Bus control lines are used to arbitrate multiple requests for use of the bus. The main virtue of the single-bus structure is its low cost and its flexibility for attaching peripheral devices. Systems that contain multiple buses achieve more concurrency in operations by allowing two or more transfers to be carried out at the same time. This leads to better performance but at an increased cost.

Parts of a system bus
The processor, memory, input and output devices are connected by the system bus, which consists of separate buses as shown in Figure 1.3.2. They are:
(i) Address bus: The address bus is used to carry the address. It is a unidirectional bus: the address is sent from the CPU to memory and I/O ports, and hence flows in one direction only. It consists of 16, 20, 24 or more parallel signal lines.
(ii) Data bus: The data bus is used to carry or transfer data to and from memory and I/O ports. It is bidirectional: the processor can read data from memory and I/O ports as well as write data to them. It consists of 8, 16, 32 or more parallel signal lines.
(iii) Control bus: The control bus is used to carry control signals in order to regulate the control activities. It is bidirectional. The CPU sends control signals on the control bus to enable the outputs of addressed memory devices or port devices. Some of the control signals are: MEMR (memory read), MEMW (memory write), IOR (I/O read), IOW (I/O write), BR (bus request), BG (bus grant), INTR (interrupt request), INTA (interrupt acknowledge), RST (reset), RDY (ready), HLD (hold), and HLDA (hold acknowledge).
Figure 1.3.2 Bus interconnection scheme (processor, memory, input and output units connected by the address, data and control buses that make up the system bus).

The devices connected to a bus vary widely in their speed of operation. Some electromechanical devices, such as keyboards and printers, are relatively slow. Other devices, like magnetic or optical disks, are considerably faster. Memory and processor units operate at electronic speeds, making them the fastest parts of a computer. Because all these devices must communicate with each other over a bus, an efficient transfer mechanism that is not constrained by the slow devices and that can be used to smooth out the differences in timing among processors, memories, and external devices is necessary.

A common approach is to include buffer registers with the devices to hold the information during transfers. To illustrate this technique, consider the transfer of an encoded character from a processor to a character printer. The processor sends the character over the bus to the printer buffer. Since the buffer is an electronic register, this transfer requires relatively little time. Once the buffer is loaded, the printer can start printing without further intervention by the processor. The bus and the processor are no longer needed and can be released for other activity. The printer continues printing the character in its buffer and is not available for further transfers until this process is completed. Thus, buffer registers smooth out timing differences among processors, memories, and I/O devices. They prevent a high-speed processor from being locked to a slow I/O device during a sequence of data transfers. This allows the processor to switch rapidly from one device to another, interweaving its processing activity with data transfers involving several I/O devices, as sketched in code below.
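The buffering idea can be sketched in C as a one-character buffer register shared by a fast producer (the processor) and a slow consumer (the printer). The PrinterBuffer type and the function names are hypothetical, chosen only to illustrate the handshake.

#include <stdio.h>

/* One-character printer buffer smoothing the speed gap between a
   fast processor and a slow printer (all names are illustrative). */

typedef struct {
    char data;
    int  full;                /* 1 while the printer is printing it */
} PrinterBuffer;

PrinterBuffer buf = {0, 0};

/* Fast side: the processor deposits a character and moves on. */
int cpu_send(char c) {
    if (buf.full) return 0;   /* printer busy: try again later      */
    buf.data = c;
    buf.full = 1;             /* transfer took only a bus cycle     */
    return 1;
}

/* Slow side: the printer drains the buffer at mechanical speed. */
void printer_service(void) {
    if (buf.full) {
        putchar(buf.data);    /* the slow mechanical print          */
        buf.full = 0;         /* buffer free for the next character */
    }
}

int main(void) {
    const char *msg = "Hi";
    for (int i = 0; msg[i]; ) {
        if (cpu_send(msg[i])) i++;   /* CPU is free between sends   */
        printer_service();
    }
    printer_service();
    putchar('\n');
    return 0;
}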
Figure 1.3.3 shows the traditional bus configuration and Figure 1.3.4 shows the high-speed bus configuration. The traditional bus connection uses three buses: local bus, system bus and expansion bus. The high-speed bus configuration uses a high-speed bus along with the three buses used in the traditional bus connection. Here, the cache controller is connected to the high-speed bus. This bus supports connection to high-speed LANs, such as the Fiber Distributed Data Interface (FDDI), video and graphics workstation controllers, as well as interface controllers to local peripherals including SCSI.

Figure 1.3.3 Traditional bus configuration (processor and cache on a local bus with a local I/O controller; main memory on the system bus; SCSI, network, modem and serial devices on an expansion bus behind an expansion bus interface).

Figure 1.3.4 High speed bus configuration (as in Figure 1.3.3, with an added high-speed bus carrying SCSI, video, graphics and LAN traffic between the cache and the expansion bus interface, which also serves FAX, modem and serial devices).
1.4 Explain performance and metrics.

The most important measure of the performance of a computer is how quickly it can execute programs. The speed with which a computer executes programs is affected by the design of its hardware and its machine language instructions. Because programs are usually written in a high-level language, performance is also affected by the compiler that translates programs into machine language. For best performance, it is necessary to design the compiler, the machine instruction set, and the hardware in a coordinated way.

The operating system overlaps processing, disk transfers, and printing for several programs to make the best possible use of the resources available. The total time required to execute a program is called elapsed time in the operating system. This elapsed time is a measure of the performance of the entire computer system. It is affected by the speed of the processor, the disk, and the printer.

CACHE MEMORY
Just as the elapsed time for the execution of a program depends on all units in a computer system, the processor time depends on the hardware involved in the execution of individual machine instructions. This hardware comprises the processor and the memory, which are usually connected by a bus, as shown in Figure 1.3.1. The pertinent parts of this figure are repeated in Figure 1.4, including the cache memory as part of the processor unit.

Figure 1.4 Processor cache (main memory connected over the system bus to a processor containing a cache memory).

Let us examine the flow of program instructions and data between the memory and the processor. At the start of execution, all program instructions and the required data are stored in the main memory. As execution proceeds, instructions are fetched one by one over the bus into the processor, and a copy is placed in the cache. When the execution of an instruction calls for data located in the main memory, the data are fetched and a copy is placed in the cache. Later, if the same instruction or data item is needed a second time, it is read directly from the cache.

The processor and a relatively small cache memory can be fabricated on a single integrated circuit chip. The internal speed of performing the basic steps of instruction processing on such chips is very high and is considerably faster than the speed at which instructions and data can be fetched from the main memory. A program will be executed faster if the movement of instructions and data between the main memory and the processor is minimized, which is achieved by using the cache. For example, suppose a number of instructions are executed repeatedly over a short period of time, as happens in a program loop. If these instructions are available in the cache, they can be fetched quickly during the period of repeated use.

PROCESSOR CLOCK
Processor circuits are controlled by a timing signal called a clock. The clock defines regular time intervals, called clock cycles. To execute a machine instruction, the processor divides the action to be performed into a sequence of basic steps, such that each step can be completed in one clock cycle. The length P of one clock cycle is an important parameter that affects processor performance. Its inverse is the clock rate, R = 1/P, which is measured in cycles per second. Processors used in today's personal computers and workstations have clock rates that range from a few hundred million to over a billion cycles per second.

In standard electrical engineering terminology, the term "cycles per second" is called hertz (Hz). The term "million" is denoted by the prefix Mega (M), and "billion" is denoted by the prefix Giga (G). Hence, 500 million cycles per second is usually abbreviated to 500 Megahertz (MHz), and 1250 million cycles per second is abbreviated to 1.25 Gigahertz (GHz). The corresponding clock periods are 2 and 0.8 nanoseconds (ns), respectively.
BASIC PERFORMANCE EQUATION
Let T be the processor time required to execute a program that has been prepared in some high-level language. The compiler generates a machine language object program that corresponds to the source program. Assume that complete execution of the program requires the execution of N machine language instructions. The number N is the actual number of instruction executions, and is not necessarily equal to the number of machine instructions in the object program. Some instructions may be executed more than once, which is the case for instructions inside a program loop. Others may not be executed at all, depending on the input data used.

Suppose that the average number of basic steps needed to execute one machine instruction is S, where each basic step is completed in one clock cycle. If the clock rate is R cycles per second, the program execution time is

T = (N × S) / R     (1.1)

This is often referred to as the basic performance equation. The performance parameter T for an application program is much more important to the user than the individual values of the parameters N, S, or R. To achieve high performance, the computer designer must seek ways to reduce the value of T, which means reducing N and S, and increasing R. The value of N is reduced if the source program is compiled into fewer machine instructions. The value of S is reduced if instructions have a smaller number of basic steps to perform or if the execution of instructions is overlapped. Using a higher-frequency clock increases the value of R, which means that the time required to complete a basic execution step is reduced.

N, S, and R are not independent parameters; changing one may affect another. Introducing a new feature in the design of a processor will lead to improved performance only if the overall result is to reduce the value of T. A processor advertised as having a 900-MHz clock does not necessarily provide better performance than a 700-MHz processor because it may have a different value of S.

PIPELINING AND SUPERSCALAR OPERATION
Usually in sequential execution, the instructions are executed one after another. Hence, the value of S is the total number of basic steps, or clock cycles, required to execute an instruction. A substantial improvement in performance can be achieved by overlapping the execution of successive instructions, using a technique called pipelining. Consider the instruction

Add R1,R2,R3

which adds the contents of registers R1 and R2, and places the sum into R3. The contents of R1 and R2 are first transferred to the inputs of the ALU. After the add operation is performed, the sum is transferred to R3. The processor can read the next instruction from the memory while the addition operation is being performed. Then, if that instruction also uses the ALU, its operands can be transferred to the ALU inputs at the same time that the result of the Add instruction is being transferred to R3. In the ideal case, if all instructions are overlapped to the maximum degree possible, execution proceeds at the rate of one instruction completed in each clock cycle.

Individual instructions still require several clock cycles to complete. But, for the purpose of computing T, the effective value of S is 1. The ideal value S = 1 cannot be attained in practice for a variety of reasons. However, pipelining increases the rate of executing instructions significantly and causes the effective value of S to approach 1.
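A quick numerical check of Equation 1.1 shows why reducing the effective value of S matters. In the C sketch below, the values of N and R are made-up illustrative figures, not measurements of any particular processor.

#include <stdio.h>

/* Evaluating the basic performance equation T = (N x S) / R (Eq. 1.1)
   for illustrative values of N, S and R (not real measurements).     */

int main(void) {
    double N = 500e6;   /* dynamic instruction count: 500 million     */
    double R = 500e6;   /* clock rate: 500 MHz                        */

    double T_seq  = (N * 4.0) / R;  /* S = 4 steps, sequential        */
    double T_pipe = (N * 1.0) / R;  /* effective S = 1, ideal pipeline */

    printf("Sequential (S=4): T = %.2f s\n", T_seq);   /* 4.00 s */
    printf("Pipelined  (S=1): T = %.2f s\n", T_pipe);  /* 1.00 s */
    return 0;
}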
A higher degree of concurrency can be achieved if multiple instruction pipelines are implemented in the processor. This means that multiple functional units are used, creating parallel paths through which different instructions can be executed in parallel. With such an arrangement, it becomes possible to start the execution of several instructions in every clock cycle. This mode of operation is called superscalar execution. If it can be sustained for a long time during program execution, the effective value of S can be reduced to less than one. Of course, parallel execution must preserve the logical correctness of programs, that is, the results produced must be the same as those produced by serial execution of program instructions. Many of today's high-performance processors are designed to operate in this manner.

CLOCK RATE
There are two possibilities for increasing the clock rate, R. First, improving the integrated-circuit (IC) technology makes logic circuits faster, which reduces the time needed to complete a basic step. This allows the clock period, P, to be reduced and the clock rate, R, to be increased. Second, reducing the amount of processing done in one basic step also makes it possible to reduce the clock period, P. However, if the actions that have to be performed by an instruction remain the same, the number of basic steps needed may increase.

Increases in the value of R that are entirely caused by improvements in IC technology affect all aspects of the processor's operation equally, with the exception of the time it takes to access the main memory. In the presence of a cache, the percentage of accesses to the main memory is small. Hence, much of the performance gain expected from the use of faster technology can be realized. The value of T will be reduced by the same factor as R is increased because S and N are not affected.

INSTRUCTION SET: CISC AND RISC
Simple instructions require a small number of basic steps to execute. Complex instructions involve a large number of steps. For a processor that has only simple instructions, a large number of instructions may be needed to perform a given programming task. This could lead to a large value for N and a small value for S. On the other hand, if individual instructions perform more complex operations, fewer instructions will be needed, leading to a lower value of N and a larger value of S. It is not obvious if one choice is better than the other.

A key consideration in comparing the two choices is the use of pipelining. We pointed out earlier that the effective value of S in a pipelined processor is close to 1 even though the number of basic steps per instruction may be considerably larger. This seems to imply that complex instructions combined with pipelining would achieve the best performance. However, it is much easier to implement efficient pipelining in processors with simple instruction sets. The suitability of the instruction set for pipelined execution is an important and often deciding consideration.

The terms RISC and CISC refer to design principles and techniques. Reduced instruction set computing (RISC) is a CPU design strategy based on the insight that simplified (as opposed to complex) instructions can provide higher performance if this simplicity enables much faster execution of each instruction.

A complex instruction set computer (CISC) is a computer where single instructions can execute several low-level operations (such as a load from memory, an arithmetic operation, and a memory store) and/or are capable of multi-step operations or addressing modes within single instructions.

COMPILER
A compiler translates a high-level language program into a sequence of machine instructions. To reduce N, we need to have a suitable machine instruction set and a compiler that makes good use of it. An optimizing compiler takes advantage of various features of the target processor to reduce the product N × S, which is the total number of clock cycles needed to execute a program. The number of cycles is dependent not only on the choice of instructions, but also on the order in which they appear in the program. The compiler may rearrange program instructions to achieve better performance. Of course, such changes must not affect the result of the computation.
Superficially, a compiler appears as a separate entity from the processor with which it is used and may even be available from a different vendor. However, a high-quality compiler must be closely linked to the processor architecture. The compiler and the processor are often designed at the same time, with much interaction between the designers to achieve the best results. The ultimate objective is to reduce the total number of clock cycles needed to perform a required programming task.

PERFORMANCE EQUATION
It is important to be able to assess the performance of a computer. Computer designers use performance estimates to evaluate the effectiveness of new features. Manufacturers use performance indicators in the marketing process. Buyers use such data to choose among many available computer models.
The previous discussion suggests that the only parameter that properly describes the performance of a computer is the execution time, T, for the programs of interest. Despite the conceptual simplicity of Equation 1.1, computing the value of T is not simple. Moreover, parameters such as the clock speed and various architectural features are not reliable indicators of the expected performance. For these reasons, the computer community adopted the idea of measuring computer performance using benchmark programs. To make comparisons possible, standardized programs must be used. The performance measure is the time it takes a computer to execute a given benchmark. Initially, some attempts were made to create artificial programs that could be used as standard benchmarks. But synthetic programs do not properly predict the performance obtained when real application programs are run.
A nonprofit organization called the Standard Performance Evaluation Corporation (SPEC) selects and publishes representative application programs for different application domains, together with test results for many commercially available computers. For general-purpose computers, a suite of benchmark programs was selected in 1989. It was modified somewhat and published in 1995 and again in 2000. For SPEC2000, the reference computer is an UltraSPARC 10 workstation with a 300-MHz UltraSPARC-III processor. The SPEC rating is computed as follows:
SPEC rating = Running time on the reference computer / Running time on the computer under test
Thus a SPEC rating of 50 means that the computer under test is 50 times as fast as the UltraSPARC 10 for this particular benchmark. The test is repeated for all the programs in the SPEC suite, and the geometric mean of the results is computed. Let SPECi be the rating for program i in the suite. The overall SPEC rating for the computer is given by
SPEC rating = (SPEC1 × SPEC2 × ... × SPECn)^(1/n)
where n is the number of programs in the suite. Because the actual execution time is measured, the SPEC rating is a measure of the combined effect of all factors affecting performance, including the compiler, the operating system, the processor, and the memory of the computer being tested.
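Both formulas translate directly into code. The sketch below uses invented benchmark times; only the structure of the calculation follows the text:

# SPEC rating per program = (time on reference computer) / (time on test computer)
# Overall rating = geometric mean of the individual ratings.
def spec_rating(ref_time, test_time):
    return ref_time / test_time

def overall_spec(ratings):
    product = 1.0
    for r in ratings:
        product *= r
    return product ** (1.0 / len(ratings))

# Invented running times (in seconds), for illustration only.
ref_times  = [500.0, 320.0, 810.0]
test_times = [10.0, 8.0, 18.0]
ratings = [spec_rating(r, t) for r, t in zip(ref_times, test_times)]
print(ratings)                 # per-program SPEC ratings
print(overall_spec(ratings))   # overall SPEC rating (geometric mean)

The geometric mean is used so that no single benchmark can dominate the overall rating.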
1.5 Explain instruction and instruction sequencing
A computer program consists of a sequence of small steps, such as adding two numbers, testing for a particular condition, reading a character from the keyboard, or sending a character to be displayed on a display screen. A computer must have instructions capable of performing four types of operations:
· Data transfers between the memory and the processor registers
· Arithmetic and logic operations on data
· Program sequencing and control
· I/O transfers

REGISTER TRANSFER NOTATION
In general, information is transferred from one location to another in a computer. Possible locations for such transfers are memory locations, processor registers, or registers in the I/O subsystem. For example, names for the addresses of memory locations may be LOC, PLACE, A, VAR2; processor register names may be R0, R5; and I/O register names may be DATAIN, OUTSTATUS, and so on. The contents of a location are denoted by placing square brackets around the name of the location. Thus, the expression
R1 ← [LOC]
means that the contents of memory location LOC are transferred into processor register R1. As another example, consider the operation that adds the contents of registers R1 and R2, and then places their sum into register R3. This action is indicated as
R3 ← [R1] + [R2]
This type of notation is known as Register Transfer Notation (RTN). Note that the right-hand side of an RTN expression always denotes a value, and the left-hand side is the name of a location where the value is to be placed, overwriting the old contents of that location.

ASSEMBLY TRANSFER NOTATION
Another type of notation used to represent machine instructions and programs is assembly transfer notation. For example, an instruction that causes the transfer described above, from memory location LOC to processor register R1, is specified by the statement
Move LOC,R1
The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R1 are overwritten. The second example of adding two numbers contained in processor registers R1 and R2 and placing their sum in R3 can be specified by the assembly language statement
Add R1,R2,R3
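The semantics of these notations can be mimicked with a few lines of code. In the sketch below, registers and memory locations are modeled as Python dictionaries, so each RTN statement becomes one assignment; the initial values are invented for illustration:

# Registers and memory locations modeled as dictionaries.
reg = {"R1": 0, "R2": 7, "R3": 0}
mem = {"LOC": 42}

# R1 <- [LOC] : contents of memory location LOC transferred into R1
reg["R1"] = mem["LOC"]

# R3 <- [R1] + [R2] : sum of R1 and R2 placed in R3
reg["R3"] = reg["R1"] + reg["R2"]

print(reg)   # {'R1': 42, 'R2': 7, 'R3': 49}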
BASIC INSTRUCTION TYPES
Instruction types are classified based on the number of operands used in instructions. They are: (i) three-address instructions, (ii) two-address instructions, (iii) one-address instructions, and (iv) zero-address instructions.
Consider a high-level language program command that adds two variables A and B and assigns the sum to a third variable C:
C = A + B ;
To carry out this action, the contents of memory locations A and B are fetched from the memory and transferred into the processor, where their sum is computed. This result is then sent back to the memory and stored in location C:
C ← [A] + [B]
Example: Add two variables A and B, and assign the sum to a third variable C.
(i). Three - Address Instruction
The general instruction format is: Operation Source1,Source2,Destination
Symbolic add instruction: ADD A,B,C
Operands A and B are called the source operands, C is called the destination operand, and Add is the operation to be performed on the operands. If k bits are needed to specify the memory address of each operand, the encoded form of the above instruction must contain 3k bits for addressing purposes, in addition to the bits needed to denote the Add operation. For a modern processor with a 32-bit address space, a 3-address instruction is too large to fit in one word for a reasonable word length. Thus, a format that allows multiple words to be used for a single instruction would be needed to represent an instruction of this type. An alternative approach is to use a sequence of simpler instructions to perform the same task, with each instruction having only one or two operands.
(ii). Two - Address Instruction
The general instruction format is: Operation Source,Destination
Symbolic add instructions: MOVE B,C and ADD A,C
An Add instruction of this type is Add A,B, which performs the operation B ← [A] + [B]. When the sum is calculated, the result is sent to the memory and stored in location B, replacing the original contents of this location. This means that operand B is both a source and a destination. A single two-address instruction cannot be used to solve our original problem, which is to add the contents of locations A and B, without destroying either of them, and to place the sum in location C. The problem can be solved by using another two-address instruction that copies the contents of one memory location into another. Such an instruction is Move B,C, which performs the operation C ← [B], leaving the contents of location B unchanged. The word "Move" is a misnomer here; it should be "Copy." However, this instruction name is deeply entrenched in computer nomenclature. The operation C ← [A] + [B] can now be performed by the two-instruction sequence
Move B,C
Add A,C
Even two-address instructions will not normally fit into one word for usual word lengths and address sizes. Another possibility is to have machine instructions that specify only one
memory operand. When a second operand is needed, as in the case of an Add instruction, it is understood implicitly to be in a unique location. A processor register, usually called the accumulator, may be used for this purpose.
(iii). One - Address Instruction
The general instruction format is: Operation Operand
Symbolic add instructions: LOAD A, ADD B, STORE C
The one-address instruction ADD A means the following: add the contents of memory location A to the contents of the accumulator register and place the sum back into the accumulator. Let us also introduce the one-address instructions Load A and Store A. The Load instruction copies the contents of memory location A into the accumulator, and the Store instruction copies the contents of the accumulator into memory location A. Using only one-address instructions, the operation C ← [A] + [B] can be performed by executing the sequence of instructions
Load A
Add B
Store C
Note that the operand specified in the instruction may be a source or a destination, depending on the instruction. In the Load instruction, address A specifies the source operand, and the destination location, the accumulator, is implied. On the other hand, C denotes the destination location in the Store instruction, whereas the source, the accumulator, is implied.

INSTRUCTION EXECUTION AND STRAIGHT LINE SEQUENCING
To perform a particular task on the computer, it is the programmer's job to select and write appropriate instructions one after the other, i.e., the programmer has to write instructions in a proper sequence. This job of the programmer is known as instruction sequencing. The instructions written in a proper sequence to execute a particular task are called a program.
The complete instruction cycle involves three operations: instruction fetching, opcode decoding, and instruction execution, as shown in Figure 1.5.1.
Figure 1.5.1 Basic instruction cycle
The processor executes a program with the help of the program counter (PC). The PC holds the address of the instruction to be executed next. To begin execution of a program, the address of its first instruction is placed into the PC. Then, the processor control circuits use the information (address
of memory) in the PC to fetch and execute instructions, one at a time, in the order of increasing addresses. This is called straight-line sequencing. During the execution of an instruction, the PC is incremented by the length of the current instruction. For example, if the currently executing instruction is 4 bytes long, the PC is incremented by 4 so that it points to the instruction to be executed next.
Consider the task C ← [A] + [B] for illustration. Figure 1.5.2 shows a possible program segment for this task as it appears in the memory of a computer. Assume the word length is 32 bits and the memory is byte addressable. The three instructions of the program are in successive word locations, starting at location i. Since each instruction is 4 bytes long, the second and third instructions start at addresses i + 4 and i + 8.
Figure 1.5.2 A program for C ← [A] + [B]
To begin executing this program, the address of its first instruction (i in our example) must be placed into the PC. During the execution of each instruction, the PC is incremented by 4 to point to the next instruction. Thus, after the Move instruction at location i + 8 is executed, the PC contains the value i + 12, which is the address of the first instruction of the next program segment.
Executing a given instruction is a two-phase procedure. In the first phase, called instruction fetch, the instruction is fetched from the memory location whose address is in the PC. This instruction is placed in the instruction register (IR) in the processor. At the start of the second phase, called instruction execute, the instruction in the IR is examined to determine which operation is to be performed. The specified operation is then performed by the processor. This often involves fetching operands from the memory or from processor registers, performing an arithmetic or logic operation, and storing the result in the destination location. At some point during this two-phase procedure, the contents of the PC are advanced to point to the next instruction. When the execute phase of an instruction is completed, the PC contains the address of the next instruction, and a new instruction fetch phase can begin. In most processors, the execute phase itself is divided into a
small number of distinct phases corresponding to fetching operands, performing the operation, and storing the result.

BRANCHING
Consider the task of adding a list of n numbers. The program outlined in Figure 1.5.3 is a generalization of the program in Figure 1.5.2. The addresses of the memory locations containing the n numbers are symbolically given as NUM1, NUM2, ..., NUMn, and a separate Add instruction is used to add each number to the contents of register R0. After all the numbers have been added, the result is placed in memory location SUM. Instead of using a long list of Add instructions, it is possible to place a single Add instruction in a program loop, as shown in Figure 1.5.4. The loop is a straight-line sequence of instructions executed as many times as needed. It starts at location LOOP and ends at the instruction Branch>0. During each pass through this loop, the address of the next list entry is determined, and that entry is fetched and added to R0.
Figure 1.5.3 A straight-line program for adding n numbers
Figure 1.5.4 A program using a loop for adding n numbers
Assume that the number of entries in the list, n, is stored in memory location N, as shown. Register R1 is used as a counter to determine the number of times the loop is executed. Hence, the contents of location N are loaded into register R1 at the beginning of the program. Then, within the body of the loop, the instruction
Decrement R1
reduces the contents of R1 by 1 each time through the loop. (A similar type of operation is performed by an Increment instruction, which adds 1 to its operand.) Execution of the loop is repeated as long as the result of the decrement operation is greater than zero. This is achieved with a branch instruction, which loads a new value into the program counter. As a result, the processor fetches and executes the instruction at this new address, called the branch target, instead of the instruction at the location that follows the branch instruction in sequential address order. A conditional branch instruction causes a branch only if a specified condition is satisfied. If the condition is not satisfied, the PC is incremented in the normal way, and the next instruction in sequential address order is fetched and executed.
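The behaviour of the loop program of Figure 1.5.4 can be traced with a short simulation. In the sketch below, R0 accumulates the sum, R1 is the counter loaded from N, R2 indexes the current list entry, and the while-test plays the role of the Branch>0 decision; the list contents are invented for illustration:

# Simulation of the loop of Figure 1.5.4.
NUM = [5, 3, 8, 1]          # the n numbers at NUM1..NUMn (invented values)
N = len(NUM)

R0 = 0                      # Clear R0
R1 = N                      # Move N,R1 (loop counter)
R2 = 0                      # pointer to the first list entry

while True:                 # LOOP:
    R0 += NUM[R2]           #   Add (R2),R0
    R2 += 1                 #   advance the pointer to the next entry
    R1 -= 1                 #   Decrement R1
    if not (R1 > 0):        #   Branch>0 LOOP taken while the result is > 0
        break

SUM = R0                    # Move R0,SUM
print(SUM)                  # 17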
In the program in Figure 1.5.4, the instruction Branch>0 LOOP (branch if greater than 0) is a conditional branch instruction that causes a branch to location LOOP if the result of the immediately preceding instruction, which is the decremented value in register R1, is greater than zero. This means that the loop is repeated as long as there are entries in the list that are yet to be added to R0. At the end of the nth pass through the loop, the Decrement instruction produces a value of zero, and hence, branching does not occur. Instead, the Move instruction is fetched and executed. It moves the final result from R0 into memory location SUM.

CONDITION CODES
The processor keeps track of information about the results of various operations for use by subsequent conditional branch instructions. This is accomplished by recording the required information in individual bits, often called condition code flags. These flags are usually grouped together in a special processor register called the condition code register or status register. Individual condition code flags are set to 1 or cleared to 0, depending on the outcome of the operation performed. Four commonly used flags are:
N (negative) Set to 1 if the result is negative; otherwise, cleared to 0
Z (zero) Set to 1 if the result is 0; otherwise, cleared to 0
V (overflow) Set to 1 if arithmetic overflow occurs; otherwise, cleared to 0
C (carry) Set to 1 if a carry-out results from the operation; otherwise, cleared to 0
The N and Z flags indicate whether the result of an arithmetic or logic operation is negative or zero. The N and Z flags may also be affected by instructions that transfer data, such as Move, Load, or Store. This makes it possible for a later conditional branch instruction to cause a branch based on the sign and value of the operand that was moved. Some computers also provide a special Test instruction that examines a value in a register or in the memory and sets or clears the N and Z flags accordingly.
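As an illustration, the following sketch computes the four flags for an addition; the 8-bit width and the two's-complement interpretation are assumptions made for the example:

# Compute N, Z, V, C for an 8-bit two's-complement addition a + b.
def add_with_flags(a, b, bits=8):
    mask = (1 << bits) - 1
    raw = (a & mask) + (b & mask)
    result = raw & mask
    sign = 1 << (bits - 1)
    N = 1 if result & sign else 0    # result is negative
    Z = 1 if result == 0 else 0      # result is zero
    C = 1 if raw > mask else 0       # a carry-out was produced
    # Overflow: operands have the same sign but the result's sign differs.
    V = 1 if ((a & sign) == (b & sign)) and ((a & sign) != (result & sign)) else 0
    return result, N, Z, V, C

print(add_with_flags(0x7F, 0x01))   # (128, 1, 0, 1, 0): result 0x80, N=1, V=1
print(add_with_flags(0xFF, 0x01))   # (0, 0, 1, 0, 1): result 0x00, Z=1, C=1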
GENERATION OF MEMORY ADDRESSES
The many ways in which programs organize and access data give rise to the need for flexible ways to specify the address of an operand. The instruction set of a computer typically provides a number of such methods, called addressing modes. While the details differ from one computer to another, the underlying concepts are the same.

1.6 Discuss Hardware – Software Interface
Hardware
Computer hardware is the collection of physical elements that comprise a computer system. Examples: processor, memory, hard disk, floppy disk, keyboard, mouse, monitors, printers, and so on.
Software
Computer software, or just software, is a collection of computer programs and related data that provide the instructions for telling a computer what to do and how to do it. In other words, software is a conceptual entity: a set of computer programs, procedures, algorithms, and associated documentation concerned with the operation of a data processing system. Program software performs the function of the program it implements, either by directly providing instructions to the computer hardware or by serving as input to another piece of software. The term was coined to contrast with the older term hardware (meaning physical devices). In contrast to hardware, software "cannot be touched." Software is also sometimes used in a narrower sense, meaning application software only.
Types of software
System software
System software provides the basic functions for computer usage and helps run the computer hardware and system. It includes a combination of the following:
1. Device drivers: A device driver or software driver is a computer program allowing higher-level computer programs to interact with a hardware device.
2. Operating systems: An operating system (OS) is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system. A user cannot run an application program on the computer without an operating system, unless the application program is self-booting. The main functions of an OS are device management, storage management, user interface, memory management, and processor management.
3. Servers: A server is a computer program running to serve the requests of other programs, the "clients". Thus, the "server" performs some computational task on behalf of the "clients". The clients either run on the same computer or connect through the network.
4. Utilities: Utility software is system software designed to help analyze, configure, optimize or maintain a computer. A single piece of utility software is usually called a utility or tool. Utility software usually focuses on how the computer infrastructure (including the computer hardware, operating system, application software and data storage) operates.
5. Window systems: A windowing system (or window system) is a component of a graphical user interface (GUI), and more specifically of a desktop environment, which supports the implementation of window managers and provides basic support for graphics hardware, pointing devices such as mice, and keyboards. The mouse cursor is also generally drawn by the windowing system.
System software is responsible for managing a variety of independent hardware components so that they can work together harmoniously. Its purpose is to unburden the application software programmer from the often complex details of the particular computer being used, including such accessories as communications devices, printers, device readers, displays and keyboards, and also to partition the computer's resources, such as memory and processor time, in a safe and stable manner.
Programming software
Programming software usually provides tools to assist a programmer in writing computer programs and software using different programming languages in a more convenient way. The tools include:
1. Compilers: A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language (e.g., assembly language or machine code). If the compiled program can run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low-level language to a higher-level one is a decompiler. A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (syntax-directed translation), code generation, and code optimization.
2. Debuggers: A debugger or debugging tool is a computer program that is used to test and debug other programs (the "target" program). The code to be examined might alternatively be running on an instruction set simulator (ISS), a technique that allows great power in its ability to halt when specific conditions are encountered, but which will typically be somewhat slower than executing the code directly on the appropriate (or the same) processor. Some debuggers offer two modes of operation, full or partial simulation, to limit this impact.
3. Interpreters: An interpreter normally means a computer program that executes a program by converting its source code to object code line by line.
4. Linkers: A linker or link editor is a program that takes one or more objects generated by a compiler and combines them into a single executable program.
5. Text editors: A text editor is a type of program used for editing plain text files. Text editors are often provided with operating systems or software development packages, and can be used to change configuration files and programming language source code. An integrated development environment (IDE) is a single application that attempts to manage all these functions.
Application software
Application software is developed to perform any task that benefits from computation. It is a broad category, and it encompasses software of many kinds, including the internet browser being used to display this page. This category includes:
Business software
Computer-aided design
Databases
Decision-making software
Educational software
Image editing
Industrial automation
Mathematical software
Medical software
Simulation software
Spreadsheets
Word processing
Hardware – Software Interface
An interface is a point of interaction between components, and it is applicable at the level of both hardware and software. It allows a component, whether a piece of hardware such as a graphics card or a piece of software such as an Internet browser, to function independently while using interfaces to communicate with other components via an input/output system and an associated protocol. All device drivers, operating systems, servers, utilities, window systems, compilers, debuggers, interpreters, linkers and text editors are considered as hardware–software interfaces.
1.7 Explain various addressing modes
A program operates on data that reside in the computer's memory. These data can be organized in a variety of ways. For example, to record marks in various courses, we may organize this information in the form of a table. Programmers use organizations called data structures to represent the data used in computations. These include lists, linked lists, arrays, queues, and so on.
Programs are normally written in a high-level language, which enables the programmer to use constants, local and global variables, pointers, and arrays. When translating a high-level language program into assembly language, the compiler must be able to implement these constructs using the facilities provided in the instruction set of the computer in which the program will be run. The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
IMPLEMENTATION OF VARIABLES AND CONSTANTS
Variables and constants are the simplest data types and are found in almost every computer program. A variable is represented by allocating a register or a memory location to hold its value. Thus, the value can be changed as needed using appropriate instructions.
1. Register addressing mode - The operand is the contents of a processor register; the name (address) of the register is given in the instruction.
Example: MOVE R1,R2
This instruction copies the contents of register R1 to register R2.
2. Absolute addressing mode - The operand is in a memory location; the address of this location is given explicitly in the instruction. (In some assembly languages, this mode is called Direct.)
Example: MOVE LOC,R2
This instruction copies the contents of memory location LOC to register R2.
3. Immediate addressing mode - The operand is given explicitly in the instruction.
Example: MOVE #200,R0
This statement places the value 200 in register R0. A common convention is to use the sharp sign (#) in front of the value to indicate that this value is to be used as an immediate operand.
INDIRECTION AND POINTERS
In the addressing modes that follow, the instruction does not give the operand or its address explicitly. Instead, it provides information from which the memory address of the operand can be determined. We refer to this address as the effective address (EA) of the operand.
4. Indirect addressing mode - The effective address of the operand is the contents of a register or memory location whose address appears in the instruction.
Example: Add (R2),R0
Here register R2 is used as a pointer to the numbers in a list, and the operands are accessed indirectly through R2. In a list-summation program, the initialization section loads the counter value n from memory location N into R1 and uses the Immediate addressing mode to place the address value NUM1, which is the address of the first number in the list, into R2.
INDEXING AND ARRAYS
The following mode is useful in dealing with lists and arrays.
5. Index mode - The effective address of the operand is generated by adding a constant value to the contents of a register.
The register used may be either a special register provided for this purpose or, more commonly, any one of a set of general-purpose registers in the processor. In either case, it is referred to as an index register. We indicate the Index mode symbolically as X(Ri), where X denotes the constant value contained in the instruction and Ri is the name of the register involved. The effective address of the operand is given by EA = X + [Ri]. The contents of the index register are not changed in the process of generating the effective address.
RELATIVE ADDRESSING
We have defined the Index mode using general-purpose processor registers. A useful version of this mode is obtained if the program counter, PC, is used instead of a general-purpose register. Then, X(PC) can be used to address a memory location that is X bytes away from the location presently pointed to by the program counter. Since the addressed location is identified "relative" to the program counter, which always identifies the current execution point in a program, the name Relative mode is associated with this type of addressing.
6. Relative mode - The effective address is determined by the Index mode using the program counter in place of the general-purpose register Ri.
This mode can be used to access data operands. But its most common use is to specify the target address in branch instructions. An instruction such as Branch>0 LOOP causes program execution to go to the branch target location identified by the name LOOP if the branch condition is satisfied. This location can be computed by specifying it as an offset from the current value of the program counter. Since the branch target may be either before or after the branch instruction, the offset is given as a signed number.
ADDITIONAL MODES
The two additional modes described below are useful for accessing data items in successive locations in the memory.
7. Autoincrement mode - The effective address of the operand is the contents of a register specified in the instruction. After accessing the operand, the contents of this register are automatically incremented to point to the next item in a list. We denote the Autoincrement mode by putting the specified register in parentheses, to show that the contents of the register are used as the effective address, followed by a plus sign to indicate that these contents are to be incremented after the operand is accessed. Thus, the Autoincrement mode is written as (Ri)+
As a companion for the Autoincrement mode, another useful mode accesses the items of a list in the reverse order:
8. Autodecrement mode - The contents of a register specified in the instruction are first automatically decremented and then used as the effective address of the operand. We denote the Autodecrement mode by putting the specified register in parentheses, preceded by a minus sign to indicate that the contents of the register are to be decremented before being used as the effective address. Thus, we write -(Ri)
Table 1.1 Generic addressing modes
Name                        Assembler syntax    Addressing function
Immediate                   #Value              Operand = Value
Register                    Ri                  EA = Ri
Absolute (Direct)           LOC                 EA = LOC
Indirect                    (Ri)                EA = [Ri]
                            (LOC)               EA = [LOC]
Index                       X(Ri)               EA = [Ri] + X
Base with index             (Ri,Rj)             EA = [Ri] + [Rj]
Base with index and offset  X(Ri,Rj)            EA = [Ri] + [Rj] + X
Relative                    X(PC)               EA = [PC] + X
Autoincrement               (Ri)+               EA = [Ri]; Increment Ri
Autodecrement               -(Ri)               Decrement Ri; EA = [Ri]
EA = effective address
Value = a signed number
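The addressing functions in Table 1.1 can be written out directly. A minimal sketch for a few of the modes, again modeling registers and memory as dictionaries; the register and memory contents are invented for illustration:

# Effective-address calculation for several modes of Table 1.1.
reg = {"R2": 1000, "PC": 2004}
mem = {1000: 5000}

def ea_absolute(loc):      return loc                    # EA = LOC
def ea_indirect(r):        return reg[r]                 # EA = [Ri]; operand is mem[EA]
def ea_index(x, r):        return reg[r] + x             # EA = [Ri] + X
def ea_relative(x):        return reg["PC"] + x          # EA = [PC] + X

def ea_autoincrement(r, step=4):                         # EA = [Ri]; then increment Ri
    ea = reg[r]
    reg[r] += step
    return ea

def ea_autodecrement(r, step=4):                         # decrement Ri; then EA = [Ri]
    reg[r] -= step
    return reg[r]

print(ea_indirect("R2"), mem[ea_indirect("R2")])   # EA = 1000, operand = 5000
print(ea_index(20, "R2"))                          # 1020
print(ea_relative(46))                             # 2050
print(ea_autoincrement("R2"), reg["R2"])           # 1000, and R2 becomes 1004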
1.8 Discuss RISC and CISC
In recent years, the boundary between RISC and CISC architectures has been blurred. Future processors may be designed with features from both RISC and CISC types.
Architectural description of RISC
Figure 1.8.1 RISC Architecture
As shown in Figure 1.8.1, the RISC architecture uses separate instruction and data caches; their access paths are also different. A hardwired control unit is found in most RISC processors.
Architectural description of CISC
Figure 1.8.2 CISC Architecture
As shown in Figure 1.8.2, in a CISC processor there is a unified cache buffer for holding both instructions and data, so they have to share a common path. CISC processors traditionally use a microprogrammed control unit, but modern CISC processors may also use a hardwired control unit.
Table 2 Difference between RISC and CISC
RISC | CISC
Used by Apple | Used by Intel and AMD processors
Requires fewer registers, therefore it is easier to design | Slower than RISC chips when performing instructions
Faster than CISC | More expensive to make compared to RISC
Reduced Instruction Set Computer | Complex Instruction Set Computer
Pipelining can be implemented easily | Pipelining implementation is not easy
Direct addition is not possible | Direct addition between data in two memory locations is possible (e.g., 8085)
Fewer, simpler and faster instructions | A large number of different and complex instructions
RISC architecture is not widely used | At least 75% of processors use the CISC architecture
RISC chips require fewer transistors and are cheaper to produce; it is also easier to write powerful optimized compilers | CISC chips are relatively slow per instruction (compared to RISC chips), but programs use fewer instructions
RISC puts a greater burden on the software; software developers need to write more lines for the same tasks | In CISC, software developers do not need to write more lines for the same tasks
Mainly used for real-time applications | Mainly used in normal PCs, workstations and servers
Large number of registers, most of which can be used as general-purpose registers | CISC processors cannot have a large number of registers
The RISC processor has a number of hardwired instructions | The CISC processor executes microcode instructions
Instructions are executed by hardware | Instructions are executed by microprogram
Fixed-format instructions | Variable-format instructions
Few instructions | Many instructions
Multiple register sets | Single register set
Highly pipelined | Less pipelined
Complexity is in the compiler | Complexity is in the microprogram

1.9 Design ALU
Most computer operations are performed in the arithmetic and logic unit (ALU). The data operands are brought from memory into processor registers, and the actual addition is carried out in the ALU. Each register can store one word of data. The ALU and the control unit are many times faster than the other devices connected to the system.
The following operations can be performed by the ALU:
(i) Arithmetic operations (addition, subtraction, multiplication and division)
(ii) Logical operations (AND, OR, NOT and XOR)
Adders
For a two-input adder there are four possible input combinations:
0 + 0 = 0 ⇒ 1-digit sum and no carry
0 + 1 = 1 ⇒ 1-digit sum and no carry
1 + 0 = 1 ⇒ 1-digit sum and no carry
1 + 1 = 10 ⇒ 2-digit sum (higher bit = carry and lower bit = sum)
Addition without carry-in ⇒ half adder
Addition with carry-in ⇒ full adder
Half adder
The half adder is an example of a simple, functional digital circuit built from two logic gates. A half adder adds two one-bit binary numbers A and B. It has two outputs, S and C (the value theoretically carried on to the next addition); the final sum is 2C + S. The simplest half-adder design incorporates an XOR gate for S and an AND gate for C. Half adders cannot be used compositely, given their incapacity for a carry-in bit.
A  B | Carry  Sum
0  0 |   0     0
1  0 |   0     1
0  1 |   0     1
1  1 |   1     0
Figure 1.9.1 Half adder's one-bit truth table
Figure 1.9.2 Half adder block diagram
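The truth table corresponds to two gate-level expressions, Sum = A ⊕ B and Carry = A·B, which the following minimal sketch reproduces:

# Half adder: Sum = A XOR B, Carry = A AND B, for one-bit inputs.
def half_adder(a, b):
    return a & b, a ^ b        # (carry, sum)

for a in (0, 1):
    for b in (0, 1):
        carry, s = half_adder(a, b)
        print(a, b, carry, s)  # reproduces the truth table above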
From the Karnaugh map, the half adder equations are Carry = AB and Sum = A'B + AB' = A ⊕ B.
Figure 1.9.3 Half adder Karnaugh map for carry and sum
Figure 1.9.4 Half adder logic diagram
Disadvantage: a previous carry cannot be added in a half adder.
Full adder
The schematic symbol for a 1-bit full adder is drawn with Cin and Cout on the sides of the block to emphasize their use in a multi-bit adder. A full adder adds binary numbers and accounts for values carried in as well as out. A one-bit full adder adds three one-bit numbers, often written as A, B, and Cin; A and B are the operands, and Cin is a bit carried in (in theory from a past addition). The full adder is usually a component in a cascade of adders, which add 8, 16, 32, etc. binary numbers. The circuit produces a two-bit output typically represented by the signals Cout and S, where sum = 2·Cout + S. The one-bit full adder's truth table is:
A  B  Cin | Cout  S
0  0  0   |  0    0
1  0  0   |  0    1
0  1  0   |  0    1
1  1  0   |  1    0
0  0  1   |  0    1
1  0  1   |  1    0
0  1  1   |  1    0
1  1  1   |  1    1
Figure 1.9.5 Full adder's one-bit truth table
Figure 1.9.6 Full adder block diagram
From the Karnaugh map, the full adder equations are Carry = AB + BC + AC and Sum = A'B'C + A'BC' + AB'C' + ABC = (A ⊕ B) ⊕ C.
Figure 1.9.7 Full adder Karnaugh map for carry and sum
Figure 1.9.8 Full adder logic diagram
In the example full adder logic diagram, the AND gates and the OR gate can be replaced with NAND gates for the same results. A full adder can be implemented in many different ways, such as with a custom transistor-level circuit or composed of other gates. One example implementation is with S = A ⊕ B ⊕ Cin and Cout = (A · B) + (Cin · (A ⊕ B)). In this implementation, the final OR gate before the carry-out output may be replaced by an XOR gate without altering the resulting logic. Using only two types of gates is convenient if the circuit is being implemented using simple IC chips which contain only one gate type per chip. In this light, Cout can be implemented as (A · B) ⊕ (Cin · (A ⊕ B)).
A full adder can be constructed from two half adders by connecting A and B to the input of one half adder, connecting the sum from that to an input of the second half adder, connecting Cin to the other input, and ORing the two carry outputs. Equivalently, S could be made the three-bit XOR of A, B, and Cin, and Cout could be made the three-bit majority function of A, B, and Cin.
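The two-half-adder construction described above can be checked in code. A minimal sketch that enumerates all eight input combinations:

# Full adder built from two half adders: the second half adder adds Cin to
# the first sum, and the two carry outputs are ORed to form Cout.
def half_adder(a, b):
    return a & b, a ^ b          # (carry, sum)

def full_adder(a, b, cin):
    c1, s1 = half_adder(a, b)
    c2, s = half_adder(s1, cin)
    cout = c1 | c2
    return cout, s

for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, full_adder(a, b, cin))  # matches the truth table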
More complex adders
Ripple carry adder
Figure 1.9.9 4-bit adder with logic gates shown
It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry adder, since each carry bit "ripples" to the next full adder. Note that the first (and only the first) full adder may be replaced by a half adder. The layout of a ripple carry adder is simple, which allows for fast design time; however, the ripple carry adder is relatively slow, since each full adder must wait for the carry bit to be calculated from the previous full adder. The gate delay can easily be calculated by inspection of the full adder circuit. Each full adder requires three levels of logic. In a 32-bit ripple carry adder, there are 32 full adders, so the critical path (worst case) delay is 3 (for carry propagation in the first adder) + 31 × 2 (for carry propagation in the later adders) = 65 gate delays.
Carry-lookahead adders
Figure 1.9.10 4-bit adder with carry lookahead
To reduce the computation time, engineers devised faster ways to add two binary numbers by using carry-lookahead adders. They work by creating two signals (P and G) for each bit position, based on whether a carry is propagated through from a less significant bit position (at least one input is a '1'), a carry is generated in that bit position (both inputs are '1'), or a carry is killed in that bit position (both inputs are '0'). In most cases, P is simply the sum output of a half adder and G is the carry output of the same adder. After P and G are generated, the carries for every bit position are created.
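A minimal sketch of the P/G scheme for a 4-bit adder is shown below. Note that it evaluates the carry recurrence c(i+1) = G(i) + P(i)·c(i) sequentially for clarity, whereas real lookahead logic expands the recurrence so that all carries are produced in a fixed number of gate levels:

# 4-bit carry-lookahead signals: P (propagate), G (generate), then carries.
# Bit lists are given least-significant bit first.
def cla_4bit(a_bits, b_bits, c0=0):
    P = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate: A XOR B
    G = [a & b for a, b in zip(a_bits, b_bits)]   # generate:  A AND B
    c = [c0]
    for i in range(4):
        c.append(G[i] | (P[i] & c[i]))            # c(i+1) = G(i) + P(i)*c(i)
    s = [P[i] ^ c[i] for i in range(4)]           # sum bits
    return s, c[4]                                # 4 sum bits and carry-out

# 0111 + 0011 = 1010 (7 + 3 = 10), bit 0 first:
print(cla_4bit([1, 1, 1, 0], [1, 1, 0, 0]))       # ([0, 1, 0, 1], 0)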
2.1 Explain the process Fundamental concepts
To execute a program, the processor fetches one instruction at a time and performs the operations specified. Instructions are fetched from successive memory locations until a branch or a jump instruction is encountered. The processor keeps track of the address of the memory location containing the next instruction to be fetched using the program counter, PC. After fetching an instruction, the contents of the PC are updated to point to the next instruction in the sequence. A branch instruction may load a different value into the PC. Another key register in the processor is the instruction register, IR. Suppose that each instruction comprises 4 bytes and that it is stored in one memory word. To execute an instruction, the processor has to perform the following three steps:
1. Fetch the contents of the memory location pointed to by the PC. The contents of this location are interpreted as an instruction to be executed. Hence, they are loaded into the IR. Symbolically, this can be written as IR ← [[PC]]
2. Assuming that the memory is byte addressable, increment the contents of the PC by 4, that is, PC ← [PC] + 4
3. Carry out the actions specified by the instruction in the IR.
In cases where an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as necessary to fetch the complete instruction. These two steps are usually referred to as the fetch phase; step 3 constitutes the execution phase.
Figure 2.1 shows an organization in which the arithmetic and logic unit (ALU) and all the registers are interconnected via a single common bus. This bus is internal to the processor and should not be confused with the external bus that connects the processor to the memory and I/O devices. The data and address lines of the external memory bus are shown in Figure 2.1 connected to the internal processor bus via the memory data register, MDR, and the memory address register, MAR, respectively. Register MDR has two inputs and two outputs. Data may be loaded into MDR either from the memory bus or from the internal processor bus. The data stored in MDR may be placed on either bus. The input of MAR is connected to the internal bus, and its output is connected to the external bus. The control lines of the memory bus are connected to the instruction decoder and control logic block. This unit is responsible for issuing the signals that control the operation of all the units inside the processor and for interacting with the memory bus.
The number and use of the processor registers R0 through R(n-1) vary considerably from one processor to another. Registers may be provided for general-purpose use by the programmer. Some may be dedicated as special-purpose registers, such as index registers or stack pointers. Three registers, Y, Z, and TEMP in Figure 2.1, have not been mentioned before. These registers are transparent to the programmer; that is, the programmer need not be concerned with them because they are never referenced explicitly by any instruction. They are used by the processor for temporary storage during execution of some instructions. These registers are never used for storing data generated by one instruction for later use by another instruction. The multiplexer MUX selects either the output of register Y or a constant value 4 to be provided as input A of the ALU. The constant 4 is used to increment the contents of the program counter.
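The fetch phase described in steps 1 and 2 above can be traced with a toy model; the instruction words and addresses below are placeholders, not a real encoding:

# Toy model of the fetch phase for 4-byte instructions.
mem = {2000: "Add (R3),R1", 2004: "Move R2,(R1)"}   # placeholder instruction words
PC = 2000

IR = mem[PC]        # Step 1: IR <- [[PC]], fetch the word addressed by the PC
PC = PC + 4         # Step 2: PC <- [PC] + 4, point to the next instruction
# Step 3 would carry out the actions specified by the instruction in the IR.

print(IR, PC)       # 'Add (R3),R1' 2004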
We will refer to the two possible values of the MUX control input Select as Select4 and SelectY, for selecting the constant 4 or register Y, respectively. As instruction execution progresses, data are transferred from one register to another, often passing through the ALU to perform some arithmetic or logic operation. The instruction decoder and control logic unit is responsible for implementing the actions specified by the instruction loaded in the IR register. The decoder generates the control signals needed to select the registers
involved and to direct the transfer of data. The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath.
Figure 2.1 Single bus organization of the data path inside a processor.
With few exceptions, an instruction can be executed by performing one or more of the following operations in some specified sequence:
· Transfer a word of data from one processor register to another or to the ALU
· Perform an arithmetic or a logic operation and store the result in a processor register
· Fetch the contents of a given memory location and load them into a processor register
· Store a word of data from a processor register into a given memory location

REGISTER TRANSFERS
Instruction execution involves a sequence of steps in which data are transferred from one register to another. For each register, two control signals are used to place the contents of that register on the bus or to load the data on the bus into the register. This is represented symbolically in Figure 2.2. The input and output of register Ri are connected to the bus via switches controlled by the signals Riin and Riout, respectively. When Riin is set to 1, the data on the bus are loaded into Ri. Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus. While Riout is equal to 0, the bus can be used for transferring data from other registers.
Suppose that we wish to transfer the contents of register R1 to register R4. This can be accomplished as follows:
· Enable the output of register R1 by setting R1out to 1. This places the contents of R1 on the processor bus.
· Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into register R4.
Figure 2.2 Input and output gating for the registers in Figure 2.1.
All operations and data transfers within the processor take place within time periods defined by the processor clock. The control signals that govern a particular transfer are asserted at the start of the clock cycle. In our example, R1out and R4in are set to 1. The registers consist of edge-triggered flip-flops. Hence, at the next active edge of the clock, the flip-flops that constitute R4 will load the data present at their inputs. At the same time, the control signals R1out and R4in will return to 0. We will use this simple model of the timing of data transfers for the rest of this chapter. However, we should point out that other schemes are possible. For example, data transfers may use both the rising and falling edges of the clock. Also, when edge-triggered flip-flops are not used, two or more clock signals may be needed to guarantee proper transfer of data. This is known as multiphase clocking.

PERFORMING ARITHMETIC AND LOGICAL OPERATIONS
The ALU is a combinational circuit that has no internal storage. It performs arithmetic and logic operations on the two operands applied to its A and B inputs. In Figures 2.1 and 2.2, one of
the operands is the output of the multiplexer MUX and the other operand is obtained directly from the bus. The result produced by the ALU is stored temporarily in register Z. Therefore, a sequence of operations to add the contents of register R1 to those of register R2 and store the result in register R3 is:
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in

FETCHING A WORD FROM MEMORY
The connections for register MDR are illustrated in Figure 2.4. It has four control signals: MDRin and MDRout control the connection to the internal bus, and MDRinE and MDRoutE control the connection to the external bus. The circuit in Figure 2.3 is easily modified to provide the additional connections. A three-input multiplexer can be used, with the memory bus data line connected to the third input. This input is selected when MDRinE = 1. A second tri-state gate, controlled by MDRoutE, can be used to connect the output of the flip-flop to the memory bus.
Figure 2.3 Input and output gating for one register bit.
Figure 2.4 Connections and control signals for register MDR.
As an example of a read operation, consider the instruction Move (R1),R2. The actions needed to execute this instruction are:
1. MAR ← [R1]
2. Start a Read operation on the memory bus
3. Wait for the MFC response from the memory
4. Load MDR from the memory bus
5. R2 ← [MDR]
These actions may be carried out as separate steps, but some can be combined into a single step. Each action can be completed in one clock cycle, except action 3, which requires one or more clock cycles, depending on the speed of the addressed device. The memory read operation requires three steps, which can be described by the signals being activated as follows:
1. R1out, MARin, Read
2. MDRinE, WMFC
3. MDRout, R2in
where WMFC is the control signal that causes the processor's control circuitry to wait for the arrival of the MFC signal.

STORING A WORD IN MEMORY
Writing a word into a memory location follows a similar procedure. The desired address is loaded into MAR. Then, the data to be written are loaded into MDR, and a Write command is issued. Hence, executing the instruction Move R2,(R1) requires the following sequence:
1. R1out, MARin
2. R2out, MDRin, Write
3. MDRoutE, WMFC
As in the case of the read operation, the Write control signal causes the memory bus interface hardware to issue a Write command on the memory bus. The processor remains in step 3 until the memory operation is completed and an MFC response is received.
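Both signal sequences can be mimicked at the register-transfer level. In the sketch below, the Read/Write commands and the WMFC wait are collapsed into ordinary dictionary accesses; addresses and data values are invented for illustration:

# Register-transfer-level model of the two memory operations above.
reg = {"R1": 1000, "R2": 0}
mem = {1000: 0x1234}

# Move (R1),R2 -- read sequence
MAR = reg["R1"]          # 1. R1out, MARin, Read
MDR = mem[MAR]           # 2. MDRinE, WMFC (wait for MFC collapsed into one step)
reg["R2"] = MDR          # 3. MDRout, R2in
print(hex(reg["R2"]))    # 0x1234

# Move R2,(R1) -- write sequence
reg["R2"] = 0x5678       # new data to be stored
MAR = reg["R1"]          # 1. R1out, MARin
MDR = reg["R2"]          # 2. R2out, MDRin, Write
mem[MAR] = MDR           # 3. MDRoutE, WMFC
print(hex(mem[1000]))    # 0x5678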
2.2 Explain the process complete instruction execution
Consider the instruction Add (R3),R1, which adds the contents of a memory location pointed to by R3 to register R1. Executing this instruction requires the following actions:
1. Fetch the instruction.
2. Fetch the first operand (the contents of the memory location pointed to by R3).
3. Perform the addition.
4. Load the result into R1.
Figure 2.5 gives the sequence of control steps required to perform these operations for the single-bus architecture of Figure 2.1.
Figure 2.5 Control sequence for execution of instruction Add (R3),R1.
Instruction execution proceeds as follows. In step 1, the instruction fetch operation is initiated by loading the contents of the PC into the MAR and sending a Read request to the memory. The Select signal is set to Select4, which causes the multiplexer MUX to select the constant 4. This value is added to the operand at input B, which is the contents of the PC, and the result is stored in register Z. The updated value is moved from register Z back into the PC during step 2, while waiting for the memory to respond. In step 3, the word fetched from the memory is loaded into the IR.
Steps 1 through 3 constitute the instruction fetch phase, which is the same for all instructions. The instruction decoding circuit interprets the contents of the IR at the beginning of step 4. This enables the control circuitry to activate the control signals for steps 4 through 7, which constitute the execution phase. The contents of register R3 are transferred to the MAR in step 4, and a memory read operation is initiated. Then the contents of R1 are transferred to register Y in step 5, to prepare for the addition operation. When the Read operation is completed, the memory operand is available in register MDR, and the addition operation is performed in step 6. The contents of MDR are gated to the bus, and thus also to the B input of the ALU, and register Y is selected as the second input to the ALU by choosing SelectY. The sum is stored in register Z, then transferred to R1 in step 7. The End signal causes a new instruction fetch cycle to begin by returning to step 1.
This discussion accounts for all control signals in Figure 2.5 except Yin in step 2. There is no need to copy the updated contents of the PC into register Y when executing the Add instruction. But in Branch instructions the updated value of the PC is needed to compute the Branch target address. To speed up the execution of Branch instructions, this value is copied into register Y in step 2.
Since step 2 is part of the fetch phase, the same action will be performed for all instructions. This does not cause any harm because register Y is not used for any other purpose at that time.

BRANCH INSTRUCTION
A branch instruction replaces the contents of the PC with the branch target address. This address is usually obtained by adding an offset X, which is given in the branch instruction, to the updated value of the PC. Figure 2.6 gives a control sequence that implements an unconditional branch instruction. Processing starts, as usual, with the fetch phase. This phase ends when the instruction is loaded into the IR in step 3. The offset value is extracted from the IR by the instruction decoding circuit, which will also perform sign extension if required. Since the value of the updated PC is already available in register Y, the offset X is gated onto the bus in step 4, and an addition operation is performed. The result, which is the branch target address, is loaded into the PC in step 5.
The offset X used in a branch instruction is usually the difference between the branch target address and the address immediately following the branch instruction. For example, if the branch instruction is at location 2000 and if the branch target address is 2050, the value of X must be 46. The reason for this can be readily appreciated from the control sequence in Figure 2.6. The PC is incremented during the fetch phase, before the type of instruction being executed is known. Thus, when the branch address is computed in step 4, the PC value used is the updated value, which points to the instruction following the branch instruction in the memory.
Figure 2.6 Control sequence for an unconditional branch instruction.
Consider now a conditional branch. In this case, we need to check the status of the condition codes before loading a new value into the PC. For example, for a Branch-on-negative (Branch<0) instruction, step 4 in Figure 2.6 is replaced with
Offset-field-of-IRout, Add, Zin, If N = 0 then End
Thus, if N = 0 the processor returns to step 1 immediately after step 4. If N = 1, step 5 is performed to load a new value into the PC, thus performing the branch operation.

2.3 Discuss multiple bus organization
In the single bus organization, only one data item can be transferred over the bus in a clock cycle. To reduce the number of steps needed, most commercial processors provide multiple internal paths that enable several transfers to take place in parallel.
Figure 2.7 Three bus organization of the data path.
Figure 2.7 illustrates a three-bus structure used to connect the registers and the ALU of a processor. All general-purpose registers are combined into a single block called the register file. The register file in Figure 2.7 is said to have three ports. There are two outputs, allowing the contents of two different registers to be accessed simultaneously and have their contents placed on
buses A and B. The third port allows the data on bus C to be loaded into a third register during the same clock cycle. Buses A and B are used to transfer the source operands to the A and B inputs of the ALU, where an arithmetic or logic operation may be performed. The result is transferred to the destination over bus C. If needed, the ALU may simply pass one of its two input operands unmodified to bus C. We will call the ALU control signals for such an operation R=A or R=B.
A second feature in Figure 2.7 is the introduction of the Incrementer unit, which is used to increment the PC by 4. Using the Incrementer eliminates the need to add 4 to the PC using the main ALU, as was done in the single bus organization. The source for the constant 4 at the ALU input multiplexer is still useful. It can be used to increment other addresses, such as the memory addresses in LoadMultiple and StoreMultiple instructions.
Figure 2.8 Control sequence for execution of instruction Add R4,R5,R6 for the three-bus organization.
Consider the three-operand instruction Add R4,R5,R6. The control sequence for executing this instruction is given in Figure 2.8. In step 1, the contents of the PC are passed through the ALU, using the R=B control signal, and loaded into the MAR to start a memory read operation. At the same time, the PC is incremented by 4. Note that the value loaded into MAR is the original contents of the PC. The incremented value is loaded into the PC at the end of the clock cycle and will not affect the contents of MAR. In step 2, the processor waits for MFC and loads the data received into MDR, then transfers them to the IR in step 3. Finally, the execution phase of the instruction requires only one control step to complete, step 4. By providing more paths for data transfer, a significant reduction in the number of clock cycles needed to execute an instruction is achieved.
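In the control unit, such a sequence is simply an ordered list of sets of asserted control signals. The sketch below writes the four steps described above as a data structure; the exact signal mnemonics (MDRoutB, R4outA, and so on) are assumed names patterned on the text, and the code only prints the steps rather than modeling the datapath:

# Control sequence for Add R4,R5,R6 in the three-bus organization,
# one set of asserted control signals per clock cycle.
control_sequence = [
    {"PCout", "R=B", "MARin", "Read", "IncPC"},             # step 1
    {"WMFC"},                                               # step 2
    {"MDRoutB", "R=B", "IRin"},                             # step 3
    {"R4outA", "R5outB", "SelectA", "Add", "R6in", "End"},  # step 4
]

for i, signals in enumerate(control_sequence, start=1):
    print("step", i, sorted(signals))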
2.4 Discuss Hardwired control

To execute instructions, the processor must have some means of generating the control signals needed in the proper sequence. Computer designers use a wide variety of techniques to solve this problem. The approaches used fall into one of two categories: hardwired control and microprogrammed control.

The required control signals are determined by the following information:
· Contents of the control step counter
· Contents of the instruction register
· Contents of the condition code flags
· External input signals, such as MFC and interrupt requests

Figure 2.10 Control unit organization.

To gain insight into the structure of the control unit, we start with a simplified view of the hardware involved. The decoder/encoder block in Figure 2.10 is a combinational circuit that generates the required control outputs, depending on the state of all its inputs. By separating the decoding and encoding functions, we obtain the more detailed block diagram in Figure 2.11. The step decoder provides a separate signal line for each step, or time slot, in the control sequence. Similarly, the output of the instruction decoder consists of a separate line for each machine instruction. For any instruction loaded in the IR, one of the output lines INS1 through INSm is set to 1, and all other lines are set to 0. (For design details of decoders, refer to Appendix A.)

The input signals to the encoder block in Figure 2.11 are combined to generate the individual control signals Yin, PCout, Add, End, and so on. An example of how the encoder generates the Zin control signal for the processor organization is given in Figure 2.12. This circuit implements the logic function

Zin = T1 + T6 · ADD + T4 · BR + …

This signal is asserted during time slot T1 for all instructions, during T6 for an Add instruction, during T4 for an unconditional branch instruction, and so on.

Figure 2.12 Generation of the Zin control signal.

Figure 2.13 gives a circuit that generates the End control signal from the logic function

End = T7 · ADD + T5 · BR + (T5 · N + T4 · N') · BRN + …

Figure 2.13 Generation of the End control signal.
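Since the encoder is purely combinational, the two functions above can be written directly as Boolean expressions over the decoder outputs. The C sketch below does this for the Zin and End signals only; the struct layout and signal names are assumptions for illustration, and the omitted "…" terms for other instructions are left out.

    #include <stdbool.h>
    #include <stdio.h>

    /* Inputs to the encoder: one-hot time slots T1..T7 from the step
       decoder, one-hot instruction lines from the instruction decoder,
       and the N condition code flag. */
    typedef struct {
        bool T[8];          /* T[1]..T[7]; index 0 unused */
        bool ADD, BR, BRN;  /* decoded Add, Branch, Branch<0 */
        bool N;             /* negative flag */
    } CtrlInputs;

    /* Zin = T1 + T6.ADD + T4.BR + ... */
    static bool Zin(const CtrlInputs *s) {
        return s->T[1] || (s->T[6] && s->ADD) || (s->T[4] && s->BR);
    }

    /* End = T7.ADD + T5.BR + (T5.N + T4.N').BRN + ... */
    static bool End(const CtrlInputs *s) {
        return (s->T[7] && s->ADD) || (s->T[5] && s->BR)
            || (((s->T[5] && s->N) || (s->T[4] && !s->N)) && s->BRN);
    }

    int main(void) {
        CtrlInputs s = { .T = { false }, .ADD = true };
        s.T[6] = true;  /* time slot T6 of an Add instruction */
        printf("Zin=%d End=%d\n", Zin(&s), End(&s));  /* Zin=1 End=0 */
        return 0;
    }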
The End signal starts a new instruction fetch cycle by resetting the control step counter to its starting value. Figure 2.11 contains another control signal called RUN. When set to 1, RUN causes the counter to be incremented by one at the end of every clock cycle. When RUN is equal to 0, the counter stops counting. This is needed whenever the WMFC signal is issued, to make the processor wait for the reply from the memory.

Figure 2.11 Separation of decoding and encoding functions.

The control hardware shown in Figure 2.10 or 2.11 can be viewed as a state machine that changes from one state to another in every clock cycle, depending on the contents of the instruction register, the condition codes, and the external inputs. The outputs of the state machine are the control signals. The sequence of operations carried out by this machine is determined by the wiring of the logic elements, hence the name "hardwired." A controller that uses this approach can operate at high speed. However, it has little flexibility, and the complexity of the instruction set it can implement is limited.
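The interaction of RUN and End with the control step counter can be summarized in a few lines of C. This is a minimal sketch of the per-clock behaviour only; the function name and encoding of time slots as integers are illustrative assumptions.

    #include <stdbool.h>
    #include <stdio.h>

    /* One clock edge of the control step counter: End resets it to T1
       to begin a new fetch, RUN = 0 freezes it (e.g. while WMFC waits
       for the memory), otherwise it advances one time slot. */
    static unsigned step_counter_tick(unsigned step, bool run, bool end) {
        if (end)
            return 1;     /* reset to starting value T1 */
        if (!run)
            return step;  /* hold: waiting for MFC */
        return step + 1;  /* next time slot */
    }

    int main(void) {
        unsigned t = 1;
        t = step_counter_tick(t, true, false);   /* T1 -> T2 */
        t = step_counter_tick(t, false, false);  /* frozen during WMFC */
        t = step_counter_tick(t, true, true);    /* End: back to T1 */
        printf("step = T%u\n", t);               /* prints T1 */
        return 0;
    }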
A COMPLETE PROCESSOR

A complete processor can be designed using the structure shown in Figure 2.14. This structure has an instruction unit that fetches instructions from an instruction cache, or from the main memory when the desired instructions are not already in the cache. It has separate processing units to deal with integer data and floating-point data. A data cache is inserted between these units and the main memory. Using separate caches for instructions and data is common practice in many processors today; other processors use a single cache that stores both instructions and data. The processor is connected to the system bus, and hence to the rest of the computer, by means of a bus interface.

Figure 2.14 Block diagram of a complete processor.
2.5 Micro programmed control

An alternative scheme to hardwired control is called microprogrammed control, in which control signals are generated by a program similar to machine language programs.

Figure 2.15 An example of microinstructions for the control sequence of an instruction.

A control word (CW) is a word whose individual bits represent the various control signals. Each of the control steps in the control sequence of an instruction defines a unique combination of 1s and 0s in the CW. The CWs corresponding to the 7 steps of the control sequence discussed in Section 2.2 are shown in Figure 2.15. SelectY is represented by Select = 0 and Select4 by Select = 1.

A sequence of CWs corresponding to the control sequence of a machine instruction constitutes the microroutine for that instruction, and the individual control words in this microroutine are referred to as microinstructions.

Figure 2.16 Basic organization of a microprogrammed control unit.

The microroutines for all instructions in the instruction set of a computer are stored in a special memory called the control store. The control unit can generate the control signals for any instruction by sequentially reading the CWs of the corresponding microroutine from the control store. This suggests organizing the control unit as shown in Figure 2.16. To read the control words sequentially from the control store, a microprogram counter (µPC) is used.
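A short C sketch of this arrangement follows, anticipating the loading of the starting address described next. It models the control store as an array of control words, the starting address generator as a lookup table indexed by opcode, and the µPC as an index that the clock advances. The table contents, the CW width, and the convention that bit 0 of a CW is the End signal are all assumptions for the example.

    #include <stdint.h>
    #include <stdio.h>

    #define CSTORE_SIZE 256

    static uint32_t control_store[CSTORE_SIZE]; /* one CW per micro-step */
    static unsigned start_address[64];          /* starting address generator */

    /* Drive the datapath with one control word (stub for illustration). */
    static void apply(uint32_t cw) {
        printf("CW = 0x%08x\n", (unsigned)cw);
    }

    /* Load the uPC from the starting address generator, then let the
       clock step it, reading successive microinstructions. */
    static void run_microroutine(unsigned opcode) {
        unsigned upc = start_address[opcode];
        uint32_t cw;
        do {
            cw = control_store[upc++];  /* uPC incremented each clock cycle */
            apply(cw);                  /* CW bits are the control signals */
        } while ((cw & 1u) == 0);       /* assumed encoding: bit 0 is End */
    }

    int main(void) {
        /* Illustrative microroutine for opcode 7 at control-store
           location 25; the CW values themselves are made up. */
        start_address[7] = 25;
        control_store[25] = 0x00000040;
        control_store[26] = 0x00000081;  /* last CW has the End bit set */
        run_microroutine(7);
        return 0;
    }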
Every time a new instruction is loaded into the IR, the output of the block labeled "starting address generator" is loaded into the µPC. The µPC is then automatically incremented by the clock, causing successive microinstructions to be read from the control store. Hence, the control signals are delivered to various parts of the processor in the correct sequence.

In microprogrammed control, an alternative approach is to use conditional branch microinstructions. In addition to the branch address, these microinstructions specify which of the external inputs, condition codes, or, possibly, bits of the instruction register should be checked as a condition for branching to take place.

Figure 2.17 Microroutine for the instruction Branch<0.

Figure 2.18 Organization of a control unit to allow conditional branching in the microprogram.

The instruction Branch<0 may now be implemented by a microroutine such as that shown in Figure 2.17. After loading this instruction into IR, a branch microinstruction transfers control to the corresponding microroutine, which is assumed to start at location 25 in the control store. This address is the output of the starting address generator block in Figure 2.16. The microinstruction at location 25 tests the N bit of the condition codes. If this bit is equal to 0, a branch takes place to location 0 to fetch a new machine instruction. Otherwise, the microinstruction at location 26 is