3. Introduction
Definition:
In computing, code generation is the process by which a
compiler's code generator converts some intermediate representation
of source code into a form (e.g., machine code) that can be readily
executed by a machine. Sophisticated compilers typically perform
multiple passes over various intermediate forms.
4.
5. Cont.….
So far we discussed front-end completely
Where we discussed lexical phase ,syntax phase, semantic phase
completely
After doing that we did intermediate code generator where we
have converted code into three address code
All the (Front-end, code optimizer, code generator ) are access the
symbol table
6. Cont.…
Code produced by complier must be correct
Source to target program transformation is semantic preserving
Code produced by compiler should be of high quality
Effective use of target machine resources
The code which is generated is in optimal state
7. Issues in the design of Code
Generator
Input to the code generator
Target Program
Memory Management
Instruction selection
Register allocation
8. Input to the code generator:
Intermediate representation of the source program
Linear – Postfix
Tables – Quadruples,Triples,Indirect triples
No-Linear – AST,DAG
In addition we also need a symbol table information give it to the
code generator
9. Target Program:
The Back-end code generator of a complier may different forms of
code, depending on the requirements.
Absolute machine code – Executable
Relocatable machine code – Object files
Assembly language
Byte code forms for interpreters like JVM
10. Implementing code generation requires thorough understanding of
the target machine architecture and its instruction set
Our machine:
Byte – addressable(word = 4 bytes)
Has n general purpose registers R0,R1,……,Rn-1
Two – address instructions of the form
OP SOURCE , DESTINATION
11. Target machine :
Op-Code&Address modes
Op – code , for example
MOV (move content of source to destination)
ADD (add content of source to destination)
SUB (subtract content of source from destination)
It has 6 address modes, They are
12.
13. Instruction selection : Cost
Machine is a simple , non-super-scalar processor with fixed
instruction costs
Realistic machine have deep pipelines, I –cache, D –cache etc.
Define the cost of instruction
1+cost(source node)+cost(destination node)
14.
15.
16. Instruction selection :
Addressing modes
Suppose we translate “ a:b+c” into
MOV b,R0
ADD c,R0
MOV R0,a
Assuming address of a , b and c are stored in R0,R1 and R2
MOV *R1,*R2
ADD *R2,*R0
Assuming R1 and R2 contain values of b and c
ADD R2,R1
MOV R1,a
17. Need for Global code optimization:
Suppose we translate three – address code ” x:=y+z” to:
MOV y,R0
ADD z,R0
MOV R0,x
This is for single instruction…..
18. Cont…..
For double instructions:
EX: a:=b+c
d:=a+e
to:
MOV a,R0
ADD b,R0
MOV R0,a Redundant as R0 is used.
MOV a,R0
ADD e,R0
MOV R0,d
19. Register Allocation and Assignment:
Efficient utilization of the limited set of register s is important to
generate good code
Registers are assigned by
Register allocation to select the set of variables that will reside in
registers at a point in the code
Register assignment to pick the specific register that a variable will
reside in particular point of time
20.
21. Choice of Evaluation Order:
a+b-(c+d)*e t1:=a+b MOV a,R0
t2:=c+d ADD b,R0
t3:=e*t2 MOV R0,t1
t4:=t1-t3 MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4
22. Reordered Instructions and code:
t2: = c+d MOV c,R0
t3: = e*t2 ADD d,R0
t1: = a+b MOV e,R1
t4: = t1-t3 MUL R0,R1
MOV a ,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
23. Cont.….
When instruction are independent , their evolution order can be
changed to utilize registers and save on instruction set
This is also possible with code generator