2. Course Goals
• To provide students with an understanding of the major phases of a compiler.
• To introduce students to the theory behind the various phases, including regular expressions, context-free grammars, and finite state automata.
• To provide students with an understanding of the design and implementation of a compiler.
• To have the students build a compiler, through type checking and intermediate code generation, for a small language.
• To provide students with an opportunity to work in a group on a large project.
3. Course Outcomes
• Students will have experience using current compiler generation tools.
• Students will be familiar with the different phases of compilation.
• Students will have experience defining and specifying the semantic rules of a programming language.
4. Prerequisites
• In-depth knowledge of at least one structured programming language.
• Strong background in algorithms, data structures, and abstract data types, including stacks, binary trees, and graphs.
• Understanding of grammar theories.
• Understanding of data types and control structures, their design and implementation.
• Understanding of the design and implementation of subprograms, parameter passing mechanisms, and scope.
6. Textbook
“Compilers: Principles, Techniques, and Tools” by Aho, Lam, Sethi, and Ullman, 2nd edition.
11. Preprocessors, Compilers, Assemblers, and Linkers
[Figure: Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker (together with Libraries and Relocatable Object Files) → Absolute Machine Code]
Try for example: gcc -v myprog.c
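The four stages on this slide can also be run one at a time. The following is a minimal sketch, assuming gcc is installed and a file named myprog.c exists; the helper names `build_stages` and `run_stages` are hypothetical, while `-E`, `-S`, `-c`, and `-o` are standard gcc options.

```python
import shutil
import subprocess
from pathlib import Path


def build_stages(source="myprog.c", cc="gcc"):
    """Return one command per stage: preprocess, compile, assemble, link."""
    stem = Path(source).stem
    return [
        # -E stops after preprocessing (skeletal source -> source program)
        ("preprocess", [cc, "-E", source, "-o", f"{stem}.i"]),
        # -S stops after compilation proper (source -> target assembly)
        ("compile", [cc, "-S", f"{stem}.i", "-o", f"{stem}.s"]),
        # -c stops after assembly (assembly -> relocatable object code)
        ("assemble", [cc, "-c", f"{stem}.s", "-o", f"{stem}.o"]),
        # no stop flag: link the object file with libraries into an executable
        ("link", [cc, f"{stem}.o", "-o", stem]),
    ]


def run_stages(source="myprog.c", cc="gcc"):
    """Run every stage in order; return False if gcc or the source is missing."""
    if shutil.which(cc) is None or not Path(source).exists():
        return False
    for name, cmd in build_stages(source, cc):
        print(f"[{name}]", " ".join(cmd))
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    run_stages()
```

Inspecting the intermediate .i, .s, and .o files after each step shows what each tool in the figure hands to the next one.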
12. The Phases of a Compiler
• Programmer (source code producer). Output: source string. Sample: A=B+C;
• Scanner (performs lexical analysis). Output: token string, and a symbol table with names. Sample: ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’
• Parser (performs syntax analysis based on the grammar of the programming language). Output: parse tree or abstract syntax tree. Sample: root ‘;’ with child ‘=’; ‘=’ has children A and ‘+’; ‘+’ has children B and C
• Semantic analyzer (type checking, etc.). Output: annotated parse tree or abstract syntax tree.
• Intermediate code generator. Output: three-address code, quads, or RTL. Sample: int2fp B t1; + t1 C t2; := t2 A
• Optimizer. Output: three-address code, quads, or RTL. Sample: int2fp B t1; + t1 #2.3 A
• Code generator. Output: assembly code. Sample: MOVF #2.3,r1; ADDF2 r1,r2; MOVF r2,A
• Peephole optimizer. Output: assembly code. Sample: ADDF2 #2.3,r2; MOVF r2,A
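The front-end phases above can be traced on the sample statement with a few lines of code. This is only a sketch under stated assumptions: the grammar is restricted to the single form `id = id + id ;`, the function names (`scan`, `parse`, `gen_ir`) are hypothetical, and the type table (B is an int, A and C are floats) is assumed so that the output matches the int2fp conversion in the sample.

```python
import re

# One token is an identifier or one of the punctuation marks = + ;
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[=+;])")


def scan(source):
    """Lexical analysis: turn the source string into a list of tokens."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Syntax analysis for the toy grammar: stmt -> id '=' id '+' id ';'"""
    if len(tokens) != 6 or (tokens[1], tokens[3], tokens[5]) != ("=", "+", ";"):
        raise SyntaxError("expected: id = id + id ;")
    target, _, left, _, right, _ = tokens
    # Same shape as the tree on the slide: ';' above '=' above '+'
    return (";", ("=", target, ("+", left, right)))


def gen_ir(tree, types):
    """Type-directed generation of quads (op, arg1, arg2, result)."""
    _, (_, target, (_, left, right)) = tree
    quads, counter = [], 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def to_float(name):
        # Insert a conversion only where the (assumed) type table says int
        if types[name] == "int":
            temp = new_temp()
            quads.append(("int2fp", name, None, temp))
            return temp
        return name

    lhs, rhs = to_float(left), to_float(right)
    result = new_temp()
    quads.append(("+", lhs, rhs, result))
    quads.append((":=", result, None, target))
    return quads


if __name__ == "__main__":
    assumed_types = {"A": "float", "B": "int", "C": "float"}
    for quad in gen_ir(parse(scan("A=B+C;")), assumed_types):
        print(quad)
```

A real compiler would build the symbol table during scanning and attach types during semantic analysis; here the type table is passed in by hand to keep the sketch short.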
22. The Back End
Responsibilities
• Translate IR into target machine code
• Choose instructions to implement each IR operation
• Decide which values to keep in registers
• Ensure conformance with system interfaces
Automation has been much less successful in the back end.
[Figure: IR → Instruction Selection → IR → Register Allocation → IR → Instruction Scheduling → Machine code; each stage may report Errors]
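Two of the back-end responsibilities, choosing instructions and deciding what lives in registers, can be illustrated on straight-line quads. The sketch below is not a real code generator: the MOVF and ADDF2 mnemonics are borrowed from the phases example earlier in these notes, the `%v` virtual-register naming and both function names are hypothetical, only `+` and `:=` are handled, and instruction scheduling is left out entirely.

```python
def select_instructions(quads):
    """Instruction selection: map each IR operation to two-operand
    pseudo-instructions (mnemonic, source, destination) over virtual registers."""
    code, where = [], {}  # where: IR temporary -> virtual register holding it

    def operand(name):
        return where.get(name, name)

    for op, a, b, result in quads:
        if op == "+":
            vreg = f"%v{len(where) + 1}"
            code.append(("MOVF", operand(a), vreg))   # vreg = a
            code.append(("ADDF2", operand(b), vreg))  # vreg = vreg + b
            where[result] = vreg
        elif op == ":=":
            code.append(("MOVF", operand(a), result))  # store to the variable
        else:
            raise NotImplementedError(op)
    return code


def allocate_registers(code, physical=("r1", "r2")):
    """Naive register allocation: give each virtual register the next free
    physical register and never reuse one. A real allocator would compute
    live ranges and spill to memory when registers run out."""
    mapping, lines = {}, []
    for mnemonic, src, dst in code:
        operands = []
        for name in (src, dst):
            if name.startswith("%"):
                if name not in mapping:
                    if len(mapping) == len(physical):
                        raise RuntimeError("out of registers (no spilling in this sketch)")
                    mapping[name] = physical[len(mapping)]
                name = mapping[name]
            operands.append(name)
        lines.append(f"{mnemonic} {operands[0]},{operands[1]}")
    return lines


if __name__ == "__main__":
    ir = [("+", "B", "#2.3", "t1"), (":=", "t1", None, "A")]
    for line in allocate_registers(select_instructions(ir)):
        print(line)
```

Separating the two steps mirrors the figure: selection works with an unlimited supply of virtual registers, and allocation then maps them onto the small set the machine actually has.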
27. The Optimizer (or Middle End)
Modern optimizers are structured as a series of passes.
Typical Transformations
• Discover & propagate some constant value
• Move a computation to a less frequently executed place
• Discover a redundant computation & remove it
• Remove useless or unreachable code
[Figure: IR → Opt 1 → IR → Opt 2 → IR → Opt 3 → IR → ... → Opt n → IR; each pass may report Errors]
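The "series of passes" structure can be sketched as functions that each take IR and return IR. The example below assumes straight-line quads of the form (op, arg1, arg2, result) with only `+` and `:=`, represents constants as Python numbers and names as strings, and uses hypothetical function names. It shows two of the listed transformations: constant propagation (with folding) and removal of useless code.

```python
def propagate_constants(quads):
    """Pass 1: replace names known to hold a constant and fold constant sums."""
    known, out = {}, []
    for op, a, b, result in quads:
        a = known.get(a, a)
        b = known.get(b, b)
        if op == ":=" and isinstance(a, (int, float)):
            known[result] = a
        elif op == "+" and isinstance(a, (int, float)) and isinstance(b, (int, float)):
            op, a, b = ":=", a + b, None  # fold the addition at compile time
            known[result] = a
        else:
            known.pop(result, None)       # result no longer holds a known constant
        out.append((op, a, b, result))
    return out


def remove_dead_code(quads, live_out):
    """Pass 2: walk backwards and drop quads whose result is never used."""
    live, kept = set(live_out), []
    for op, a, b, result in reversed(quads):
        if result in live:
            live.discard(result)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((op, a, b, result))
    kept.reverse()
    return kept


def optimize(quads, live_out):
    """Run the passes in sequence; each consumes IR and produces IR."""
    passes = [propagate_constants, lambda q: remove_dead_code(q, live_out)]
    for run_pass in passes:
        quads = run_pass(quads)
    return quads


if __name__ == "__main__":
    ir = [
        (":=", 2, None, "t1"),
        ("+", "t1", 3, "t2"),
        ("+", "x", "t2", "y"),
        ("+", "t1", "t1", "t3"),  # t3 is never used afterwards
    ]
    print(optimize(ir, live_out={"y"}))
```

Because every pass has the same IR-in, IR-out shape, passes can be added, removed, or reordered without touching the others, which is the point of the pipeline in the figure.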
36. Set Operations (refresher) You need to know these definitions