This document discusses the phases of a compiler:
1. Lexical analysis breaks the program into individual tokens.
2. Syntax analysis checks that the tokens are arranged according to grammar rules.
3. Semantic analysis checks for semantic errors like variable usage before declaration. It annotates the syntax tree with types.
4. The intermediate code generator produces a three-address code that is easy to optimize and translate to target code.
5. Code optimization simplifies the three-address code through techniques like constant folding and dead code elimination. Target code optimization further improves the code.
6. The code generator produces assembly instructions for the target machine based on the optimized three-address code.
3. Dr. Hussien M. Sharaf
3. SEMANTIC ANALYZER
The semantics of a program are its meaning, as opposed to its syntax or structure.
The semantics consist of:
Runtime semantics: behavior of the program at run time.
Static semantics: checked by the compiler.
STATIC SEMANTICS
Static semantics include:
Declaration of variables and constants before use, e.g.
int x;
x = 3;
Calling functions that exist (predefined in a library or defined by the user), e.g.
n = Max(4,7);
int Max(int x, int y)
{
    int z;
    if (x > y)
        z = x;
    else
        z = y;
    return z;
}
STATIC SEMANTICS (CONT.)
Passing parameters properly.
Type checking, e.g.
int x, y;
x = 3;
y = 2.5;
The last assignment is a type error in a strictly typed language, because 2.5 is a floating-point value, not an int.
Static semantics cannot be checked by the parser.
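The declaration-before-use rule above can be sketched as a symbol-table lookup. This is only a minimal illustration, assuming a flat table with no scopes; the names SymbolTable, declare, and is_declared are hypothetical and do not come from any particular compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

/* Hypothetical symbol table: a flat list of declared names (no scopes). */
typedef struct {
    const char *names[MAX_SYMBOLS];
    int count;
} SymbolTable;

/* Record a declaration such as "int x;". Returns 0 if the table is full. */
int declare(SymbolTable *t, const char *name) {
    assert(t != NULL && name != NULL);
    if (t->count >= MAX_SYMBOLS) return 0;
    t->names[t->count++] = name;
    return 1;
}

/* Returns 1 if name was declared earlier, 0 otherwise. */
int is_declared(const SymbolTable *t, const char *name) {
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) return 1;
    }
    return 0;
}
```

A semantic analyzer built this way would call declare when it sees "int x;" and report an error when is_declared returns 0 for the left-hand side of "x = 3;".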
SEMANTIC ANALYZER (CONT.)
The semantic analyzer does the following:
Checks the static semantics of the language.
Annotates the syntax tree with type information, as shown in the example:
a := x + y * 2.5;
Annotated syntax tree:
:= real
  id a real
  + real
    id x real
    * real
      inttoreal
        id y integer
      literal 2.5 real
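The annotation step can be sketched as a bottom-up walk that gives each operator node a type and wraps an integer operand in an inttoreal node when the other operand is real. This is a simplified sketch under stated assumptions: only two types exist, leaves already carry their declared type, and the caller supplies spare nodes for conversions. The names Node, coerce, and annotate are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TY_INTEGER, TY_REAL } Type;
typedef enum { N_ID, N_LITERAL, N_BINOP, N_INTTOREAL } Kind;

typedef struct Node {
    Kind kind;
    Type type;            /* set by annotate() for N_BINOP and N_INTTOREAL */
    struct Node *left;    /* operand of N_INTTOREAL, left operand of N_BINOP */
    struct Node *right;   /* right operand of N_BINOP */
} Node;

/* Wrap an integer operand in a conversion node taken from caller storage. */
static Node *coerce(Node *operand, Node *storage) {
    storage->kind = N_INTTOREAL;
    storage->type = TY_REAL;
    storage->left = operand;
    storage->right = NULL;
    return storage;
}

/* Annotate the tree bottom-up and return the type of n.
   *conv points to spare nodes; one spare per binary operator is enough. */
Type annotate(Node *n, Node **conv) {
    assert(n != NULL && conv != NULL);
    if (n->kind != N_BINOP) return n->type;   /* leaves keep their declared type */
    Type lt = annotate(n->left, conv);
    Type rt = annotate(n->right, conv);
    if (lt != rt) {                           /* mixed operands: widen the integer side */
        if (lt == TY_INTEGER) n->left = coerce(n->left, (*conv)++);
        else                  n->right = coerce(n->right, (*conv)++);
        n->type = TY_REAL;
    } else {
        n->type = lt;
    }
    return n->type;
}
```

For x + y * 2.5 with an integer y, the walk inserts inttoreal above y and marks both operators as real, which matches the annotated tree on the slide.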
4. INTERMEDIATE CODE GENERATOR
The intermediate language is called “three-address code” (TAC).
It should have two important properties:
It should be easy to produce.
It should be easy to translate to the target language.
TAC has at most one instruction per line, and each instruction can have at most three operands.
INTERMEDIATE CODE GENERATOR (CONT.)
Example:
a := x + y * 2.5;
Annotated syntax tree:
:= real
  id a real
  + real
    id x real
    * real
      inttoreal
        id y integer
      literal 2.5 real
Three-address code:
temp1 := inttoreal(y)
temp2 := temp1 real* 2.5
temp3 := x real+ temp2
a := temp3
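One way to produce such code is a post-order walk of the annotated tree that emits one instruction per operator and returns the name holding each result. The sketch below is illustrative only: the Expr type and gen_tac function are hypothetical, names are limited to 15 characters, and the final copy into the assignment target is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { E_NAME, E_CONVERT, E_BINOP } ExprKind;

typedef struct Expr {
    ExprKind kind;
    const char *text;          /* identifier or literal (E_NAME), operator (E_BINOP) */
    const struct Expr *left;   /* operand of E_CONVERT, left operand of E_BINOP */
    const struct Expr *right;  /* right operand of E_BINOP */
} Expr;

/* Append TAC for e to the string out (capacity cap) and write the name that
   holds its value into place (at least 16 bytes). *next_temp numbers temporaries. */
void gen_tac(const Expr *e, char *out, size_t cap, int *next_temp, char *place) {
    char l[16] = "", r[16] = "", line[64];
    assert(e != NULL && out != NULL && next_temp != NULL && place != NULL);
    if (e->kind == E_NAME) {               /* names and literals need no instruction */
        snprintf(place, 16, "%s", e->text);
        return;
    }
    gen_tac(e->left, out, cap, next_temp, l);
    if (e->kind == E_BINOP) gen_tac(e->right, out, cap, next_temp, r);
    snprintf(place, 16, "temp%d", (*next_temp)++);
    if (e->kind == E_CONVERT)
        snprintf(line, sizeof line, "%s := inttoreal(%s)\n", place, l);
    else
        snprintf(line, sizeof line, "%s := %s %s %s\n", place, l, e->text, r);
    strncat(out, line, cap - strlen(out) - 1);
}
```

Each emitted line has one operator and at most three names, so the output stays within the three-address form.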
5. CODE OPTIMIZER
Code optimization can be applied to:
Intermediate code: independent of the target machine.
Target code: dependent on the target machine.
INTERMEDIATE CODE OPTIMIZATION
Intermediate code optimization includes:
A. Constant folding.
B. Elimination of common subexpressions.
C. Identification and elimination of unreachable code (called dead code).
D. Improving loops.
E. Improving function calls.
A. CONSTANT FOLDING
Constant folding simplifies constant expressions at compile time.
Example:
i = 320 * 200 * 32
Modern compilers recognize such constant expressions and substitute the computed value at compile time (in this case, 2,048,000).
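Constant folding can be sketched as a recursive pass that replaces any operator whose operands are both constants with the computed constant. This is a simplified sketch, assuming only addition and multiplication on long values and ignoring overflow; the CExpr type and the fold function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { C_CONST, C_VAR, C_MUL, C_ADD } CKind;

typedef struct CExpr {
    CKind kind;
    long value;                  /* valid when kind == C_CONST */
    struct CExpr *left, *right;  /* operands of C_MUL and C_ADD */
} CExpr;

/* Fold constant subtrees in place. Returns 1 if e is now a constant.
   Overflow is not checked in this sketch. */
int fold(CExpr *e) {
    assert(e != NULL);
    if (e->kind == C_CONST) return 1;
    if (e->kind == C_VAR) return 0;
    int lc = fold(e->left);      /* fold both sides even if one is not constant */
    int rc = fold(e->right);
    if (!(lc && rc)) return 0;
    e->value = (e->kind == C_MUL) ? e->left->value * e->right->value
                                  : e->left->value + e->right->value;
    e->kind = C_CONST;
    e->left = e->right = NULL;
    return 1;
}
```

Applied to the tree for 320 * 200 * 32, the pass first folds 320 * 200 to 64,000 and then folds the outer product to 2,048,000.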
B. ELIMINATION OF COMMON SUBEXPRESSIONS
Replace the common expressions with a single variable holding the computed value.
Example, before:
a = b * c + g;
d = b * c * e;
After:
tmp = b * c;
a = tmp + g;
d = tmp * e;
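On three-address code, the same idea can be sketched as a scan that turns a repeated computation into a copy of the earlier result. This is a minimal sketch, assuming straight-line code in which no operand is reassigned between the two computations; the Quad type and eliminate_common function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char dst[8], lhs[8], rhs[8];
    char op;   /* '+', '*', or '=' for a plain copy (rhs unused) */
} Quad;

/* Rewrite each repeated "lhs op rhs" as a copy of the first result.
   Assumes no operand is reassigned between the two occurrences.
   Returns the number of instructions rewritten. */
int eliminate_common(Quad *code, int n) {
    int replaced = 0;
    assert(code != NULL || n == 0);
    for (int i = 1; i < n; i++) {
        if (code[i].op == '=') continue;           /* copies compute nothing */
        for (int j = 0; j < i; j++) {
            if (code[j].op == code[i].op &&
                strcmp(code[j].lhs, code[i].lhs) == 0 &&
                strcmp(code[j].rhs, code[i].rhs) == 0) {
                code[i].op = '=';                  /* dst := earlier result */
                strcpy(code[i].lhs, code[j].dst);
                code[i].rhs[0] = '\0';
                replaced++;
                break;
            }
        }
    }
    return replaced;
}
```

A real optimizer would also have to track reassignments and commutative operand order, which this sketch deliberately leaves out.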
C. IDENTIFICATION AND ELIMINATION OF DEAD CODE
Dead code is code in a program that is executed but whose result is never used in any other computation.
Dead code wastes computation time.
Example:
int f (int x, int y)
{
    int z = x + y;
    return x * y;
}
The statement int z = x + y; is dead, because z is never used, and should be eliminated.
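Detecting such an assignment can be sketched as a forward scan: the value is dead if it is overwritten before being read, or if the block ends without a read and the variable is not the returned result. This sketch assumes straight-line code with side-effect-free assignments and checks one instruction at a time; the Assign type and is_dead function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *dst;    /* variable written */
    const char *src1;   /* first variable read */
    const char *src2;   /* second variable read, or NULL */
} Assign;

/* Returns 1 if instruction a reads variable v. */
static int reads(const Assign *a, const char *v) {
    return strcmp(a->src1, v) == 0 || (a->src2 != NULL && strcmp(a->src2, v) == 0);
}

/* Returns 1 if the value written by code[i] is never read afterwards.
   result names the variable returned at the end of the block. */
int is_dead(const Assign *code, int n, int i, const char *result) {
    assert(code != NULL && i >= 0 && i < n);
    for (int k = i + 1; k < n; k++) {
        if (reads(&code[k], code[i].dst)) return 0;               /* value is used */
        if (strcmp(code[k].dst, code[i].dst) == 0) return 1;      /* overwritten first */
    }
    return strcmp(code[i].dst, result) != 0;                      /* live only if returned */
}
```

For the function f above, modeled as z := x + y followed by r := x * y with r returned, the first assignment is reported dead and the second is not.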
D. IMPROVING LOOPS
There are many strategies for improving loops.
A simple one is to move loop invariants out of the loop.
Example, before:
For y = 0 to Height-1
    For x = 0 to Width-1
        ' y*Width is invariant
        i = y*Width + x
        Process i
    Next x
Next y
After:
For y = 0 to Height-1
    temp = y*Width
    For x = 0 to Width-1
        i = temp + x
        Process i
    Next x
Next y
The page at http://www.aivosto.com/vbtips/loopopt.html illustrates the strategies for interested readers.
E. IMPROVING FUNCTION CALLS
One strategy is removing recursion (a function that calls itself).
Example of a recursive function:
unsigned int factorial(unsigned int n)
{
    if (n == 0) { return 1; }
    else { return n * factorial(n - 1); }
}
After removing the recursion and using a loop instead, it becomes:
unsigned int factorial(unsigned int n)
{
    unsigned int result = 1;
    unsigned int i;    /* loop counter */
    if (n == 0) { return 1; }
    else {
        for (i = n; i >= 1; i--)
        {
            result = result * i;
        }
        return result;
    }
}
TARGET CODE OPTIMIZATION
Target code optimization is done by improving the intermediate code and removing redundant code.
Example, before:
temp1 := int-to-real(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
After:
temp1 := id3 * 60.0
id1 := id2 + temp1
TARGET CODE OPTIMIZATION (CONT.)
Target code optimization includes:
a. Allocation and use of registers.
b. Selection of better (safer) instructions and addressing modes.
6. CODE GENERATOR
Generates code for the target machine.
Selects appropriate machine instructions.
Allocates memory locations for variables.
Allocates registers for intermediate computations.
Three-address code:
temp1 := int2real(y)
temp2 := temp1 * 2.5
temp3 := x + temp2
a := temp3
Assembly code:
LOADI R1, y          ;; R1 <- y
MOVF  F1, R1         ;; F1 <- int2real(R1)
MULF  F2, F1, 2.5    ;; F2 <- F1 * 2.5
LOADF F3, x          ;; F3 <- x
ADDF  F4, F3, F2     ;; F4 <- F3 + F2
STORF a, F4          ;; a <- F4