SlideShare uma empresa Scribd logo
1 de 129
Baixar para ler offline
Lecture 2: Declarative Syntax Definition
CS4200 Compiler Construction
Eelco Visser
TU Delft
September 2019
This Lecture
!2
Java Type Check
JVM
bytecode
Parse CodeGenOptimize
Specification of syntax definition from which we can derive parsers
Language Workbench
3
Programming Environment
!4
Architecture: Language Workbench vs Compiler
!5
Java Type Check
JVM
bytecode
Parse CodeGenOptimize
Language Workbench: Live Language Development
!6
Reading Material
7
The perspective of this lecture on declarative
syntax definition is explained more elaborately in
this Onward! 2010 essay. It uses an on older version
of SDF than used in these slides. Production rules
have the form
X1 … Xn -> N {cons(“C”)}
instead of
N.C = X1 … Xn
http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf
https://doi.org/10.1145/1932682.1869535
The SPoofax Testing (SPT) language used in
the section on testing syntax definitions was
introduced in this OOPSLA 2011 paper.
https://doi.org/10.1145/2076021.2048080
http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-011.pdf
The SDF3 syntax definition
formalism is documented at the
metaborg.org website.
http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
Syntax
11
What is Syntax?
!12
https://en.wikipedia.org/wiki/Syntax
In mathematics, syntax refers to the rules governing the behavior of
mathematical systems, such as formal languages used in logic. (See
logical syntax.)
In linguistics, syntax (/ˈsɪntæks/[1][2]) is the set of rules, principles, and
processes that govern the structure of sentences in a given language,
specifically word order and punctuation. 

The term syntax is also used to refer to the study of such principles
and processes.[3] 

The goal of many syntacticians is to discover the syntactic rules
common to all languages.
The word syntax comes from Ancient Greek: σύνταξις "coordination",
which consists of σύν syn, "together," and τάξις táxis, "an ordering".
Syntax (Programming Languages)
!13
In computer science, the syntax of a computer language is the set of
rules that defines the combinations of symbols that are considered to be
a correctly structured document or fragment in that language.

This applies both to programming languages, where the document
represents source code, and markup languages, where the document
represents data. 

The syntax of a language defines its surface form.[1] 

Text-based computer languages are based on sequences of characters,
while visual programming languages are based on the spatial layout and
connections between symbols (which may be textual or graphical). 

Documents that are syntactically invalid are said to have a syntax error.
https://en.wikipedia.org/wiki/Syntax_(programming_languages)
Syntax
- The set of rules, principles, and processes that govern the structure
of sentences in a given language

- The set of rules that defines the combinations of symbols that are
considered to be a correctly structured document or fragment in that
language

How to describe such a set of rules?
!14
That Govern the Structure
The Structure of Programs
15
What do we call the elements of programs?
!16
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
What kind of program element is this?
!17
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Program
Compilation Unit
What kind of program element is this?
!18
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Preprocessor Directive
What kind of program element is this?
!19
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Function Declaration
Function Prototype
What kind of program element is this?
!20
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Comment
What kind of program element is this?
!21
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Function Definition
What kind of program element is this?
!22
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Variable Declaration
What kind of program element is this?
!23
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Statement
For Loop
What kind of program element is this?
!24
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Statement
Function Call
What kind of program element is this?
!25
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Expression
What kind of program element is this?
!26
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Formal Function Parameter
What kind of program element is this?
!27
#include <stdio.h>
int power(int m, int n);
/* test power function */
main() {
int i;
for (i = 0; i < 10; ++i)
printf("%d %d %dn", i, power(2, i), power(-3, i));
return 0;
}
/* power: raise base to n-th power; n >= 0 */
int power(int base, int n) {
int i, p;
p = 1;
for (i = 1; i <= n; ++i)
p = p * base;
return p;
}
Type
Syntactic Categories
!28
Program
Compilation Unit
Preprocessor Directive
Function Declaration
Function Prototype
Function Definition
Variable Declaration
Statement
For Loop
Expression
Formal Function 

Parameter
Type
Function Call
Programs consist of different kinds of elements
Hierarchy of Syntactic Categories
!29
Program
Compilation Unit
Preprocessor 

Directive
Function Declaration
Function Prototype
Function 

Definition
Variable 

Declaration
Statement
For LoopExpression
Formal Function 

Parameter
Type
Function Call
Some kinds of constructs are contained in others
The Tiger Language
!30
Example language used in lectures
https://www.lrde.epita.fr/~tiger/tiger.html
Documentation
https://github.com/MetaBorgCube/metaborg-tiger
Spoofax project
!31
let
var N := 8
type intArray = array of int
var row := intArray[N] of 0
var col := intArray[N] of 0
var diag1 := intArray[N + N - 1] of 0
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")
);
print("n"))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 & diag2[r + 7 - c] = 0 then (
row[r] := 1;
diag1[r + c] := 1;
diag2[r + 7 - c] := 1;
col[c] := r;
try(c + 1);
row[r] := 0;
diag1[r + c] := 0;
diag2[r + 7 - c] := 0))
in
try(0)
end
A Tiger program that solves
the n-queens problem
!32
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
A mutilated n-queens Tiger
program with ‘redundant’
elements removed
What are the syntactic
categories of Tiger?
What are the language
constructs of Tiger called?
!33
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Program
Expression
Let Binding
!34
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Variable Declaration
!35
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Type Declaration
!36
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Type Expression
!37
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Expression
Array Initialization
!38
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Function Definition
!39
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Function Name
!40
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Function Body
Expression
For Loop
!41
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Expression
Sequential Composition
!42
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Formal Parameter
!43
let
var N := 8
type intArray = array of int
var diag2 := intArray[N + N - 1] of 0
function printboard() = (
for i := 0 to N - 1 do (
for j := 0 to N - 1 do
print(if col[i] = j then
" O"
else
" .");
print("n")))
function try(c : int) = (
if c = N then
printboard()
else
for r := 0 to N - 1 do
if row[r] = 0 & diag1[r + c] = 0 then (
diag2[r + 7 - c] := 1;
try(c + 1);))
in
try(0)
end
Expression
Assignment
!44
let
…
function prettyprint(tree : tree) : string =
let
var output := ""
function write(s : string) =
output := concat(output, s)
function show(n : int, t : tree) =
let
function indent(s : string) = (
write("n");
for i := 1 to n do
write(" ") ;
output := concat(output, s))
in
if t = nil then
indent(".")
else (
indent(t.key);
show(n + 1, t.left);
show(n + 1, t.right))
end
in
show(0, tree);
output
end
…
in
…
end
Functions can be nested
Structure
- Programs have structure

Categories
- Program elements come in multiple categories

- Elements cannot be arbitrarily interchanged

Constructs
- Some categories have multiple elements

Hierarchy
- Categories are organized in a hierarchy
!45
Elements of Programs
Decomposing Programs
46
Decomposing a Program into Elements
!47
let function fact(n : int) : int =
if n < 1 then
1
else
n * fact(n - 1)
in
fact(10)
end
Decomposing a Program into Elements
!48
function fact(n : int) : int =
if n < 1 then
1
else
n * fact(n - 1)
fact(10)
Exp.Let
Decomposing a Program into Elements
!49
fact(10)
Exp.Let
n : int if n < 1 then
1
else
n * fact(n - 1)
FunDec
fact int
Decomposing a Program into Elements
!50
fact(10)
Exp.Let
int
if n < 1 then
1
else
n * fact(n - 1)
FunDec
fact
n
FArg
Tid
int
Tid
Decomposing a Program into Elements
!51
fact(10)
Exp.Let
int
n < 1
FunDec
fact
n
FArg
Tid
Exp.If
1 n * fact(n - 1)
int
Tid
Decomposing a Program into Elements
!52
fact(10)
Exp.Let
int
n
FunDec
fact
n
FArg
Tid
Exp.If
1
n * fact(n - 1)
int
Tid
Exp.Lt
Exp.Int
1
Exp.Var
Exp.Int
Decomposing a Program into Elements
!53
fact(10)
Exp.Let
int
n
FunDec
fact
n
FArg
Tid
Exp.If
1
fact
int
Tid
Exp.Lt
Exp.Int
1
Exp.Var
Exp.Int Exp.Times
n
Exp.Var Exp.Call
n - 1
Etc.
let function fact(n : int) : int =
if n < 1 then
1
else
n * fact(n - 1)
in
fact(10)
end
Tree Structure Represented as (First-Order) Term
!54
Mod(
Let(
[ FunDecs(
[ FunDec(
"fact"
, [FArg("n", Tid("int"))]
, Tid("int")
, If(
Lt(Var("n"), Int("1"))
, Int("1")
, Times(
Var("n")
, Call("fact", [Minus(Var("n"), Int("1"))])
)
)
)
]
)
]
, [Call("fact", [Int("10")])]
)
)
let function fact(n : int) : int =
if n < 1 then
1
else
n * fact(n - 1)
in
fact(10)
end
Textual representation
- Convenient to read and write (human processing)

- Concrete syntax / notation

Structural tree/term representation
- Represents the decomposition of a program into elements

- Convenient for machine processing

- Abstract syntax
!55
Decomposing Programs
What are well-formed textual programs?
What are well-formed terms/trees?
How to decompose programs automatically?
!56
Formalizing Program Decomposition
Abstract Syntax:
Formalizing Program
Structure
57
Algebraic Signatures
!58
signature
sorts S0 S1 S2 …
constructors
C : S1 * S2 * … -> S0
…
Sorts: syntactic categories
Constructors: language constructs
Well-Formed Terms
!59
The family of well-formed terms T(Sig) 

defined by signature Sig 

is inductively defined as follows:

If C : S1 * S2 * … -> S0 is a constructor in Sig and 



if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 



then C(t1, t2, …) is a term in T(Sig)(S0)
Well-Formed Terms: Example
!60
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
if n < 1 then
1
else
n * fact(n - 1)
If(
Lt(Var("n"), Int("1"))
, Int("1")
, Times(
Var("n")
, Call("fact", [Minus(Var("n"), Int("1"))])
)
)
decompose
well-formed wrt
Lists of Terms
!61
signature
sorts Exp
constructors
…
Call : ID * List(Exp) -> Exp
[Minus(Var("n"), Int(“1”))]
[Minus(Var("n"), Int(“1”)), Lt(Var("n"), Int("1"))]
[]
Well-Formed Terms with Lists
!62
The family of well-formed terms T(Sig) 

defined by signature Sig 

is inductively defined as follows:

If C : S1 * S2 * … -> S0 is a constructor in Sig and 

if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 

then C(t1, t2, …) is a term in T(Sig)(S0)

If t1, t2, … are terms in T(Sig)(S),

Then [t1, t2, …] is a term in T(Sig)(List(S))
Abstract syntax of a language
- Defined by algebraic signature

- Sorts: syntactic categories

- Constructors: language constructs

Program structure
- Represented by (first-order) term

- Well-formed with respect to abstract syntax

- (Isomorphic to tree structure)
!63
Abstract Syntax
From Abstract Syntax to
Concrete Syntax
64
What does Abstract Syntax Abstract from?
!65
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
Signature does not define ‘notation’
What is Notation?
!66
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
Notation: literals, keywords, delimiters, punctuation
n
x
e1 * e2
e1 - e2
e1 < e2
if e1 then e2 else e3
f(e1, e2, …)
if n < 1 then
1
else
n * fact(n - 1)
How can we couple notation to abstract syntax?
!67
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
n
x
e1 * e2
e1 - e2
e1 < e2
if e1 then e2 else e3
f(e1, e2, …)
if n < 1 then
1
else
n * fact(n - 1)
Notation: literals, keywords, delimiters, punctuation
Context-Free Grammars
!68
grammar
non-terminals N0 N1 N2 …
terminals T0 T1 T2 …
productions
N0 = S1 S2 …
…
Non-terminals (N): syntactic categories
Terminals (T): words of sentences
Productions: rules to create sentences
Symbols (S): non-terminals and terminals
Well-Formed Sentences
!69
The family of sentences L(G) defined by context-free
grammar G is inductively defined as follows:

A terminal T is a sentence in L(G)(T)

If N0 = S1 S2 … is a production in G and 

if w1, w2, … are sentences in L(G)(S1), L(G)(S2), …, 

then w1 w2 … is a sentence in L(G)(N0)
Well-Formed Sentences
!70
if n < 1 then
1
else
n * fact(n - 1)
grammar
non-terminals Exp
productions
Exp = IntConst
Exp = Id
Exp = Exp "*" Exp
Exp = Exp "-" Exp
Exp = Exp "<" Exp
Exp = "if" Exp "then" Exp "else" Exp
Exp = Id "(" {Exp ","}* ")"
What is the relation between concrete and abstract syntax?
!71
if n < 1 then
1
else
n * fact(n - 1)
If(
Lt(Var("n"), Int("1"))
, Int("1")
, Times(
Var("n")
, Call("fact", [Minus(Var("n"), Int("1"))])
)
)
grammar
non-terminals Exp
productions
Exp = IntConst
Exp = Id
Exp = Exp "*" Exp
Exp = Exp "-" Exp
Exp = Exp "<" Exp
Exp = "if" Exp "then" Exp "else" Exp
Exp = Id "(" {Exp ","}* ")"
?
?
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
Context-Free Grammars with Constructor Declarations
!72
sorts Exp
context-free syntax
Exp.Int = IntConst
Exp.Var = Id
Exp.Times = Exp "*" Exp
Exp.Minus = Exp "-" Exp
Exp.Lt = Exp "<" Exp
Exp.If = "if" Exp "then" Exp "else" Exp
Exp.Call = Id "(" {Exp ","}* ")"
signature
sorts Exp
constructors
Int : IntConst -> Exp
Var : ID -> Exp
Times : Exp * Exp -> Exp
Minus : Exp * Exp -> Exp
Lt : Exp * Exp -> Exp
If : Exp * Exp * Exp -> Exp
Call : ID * List(Exp) -> Exp
grammar
non-terminals Exp
productions
Exp = IntConst
Exp = Id
Exp = Exp "*" Exp
Exp = Exp "-" Exp
Exp = Exp "<" Exp
Exp = "if" Exp "then" Exp "else" Exp
Exp = Id "(" {Exp ","}* ")"
Context-Free Grammars with Constructor Declarations
!73
sorts Exp
context-free syntax
Exp.Int = IntConst
Exp.Var = Id
Exp.Times = Exp "*" Exp
Exp.Minus = Exp "-" Exp
Exp.Lt = Exp "<" Exp
Exp.If = "if" Exp "then" Exp "else" Exp
Exp.Call = Id "(" {Exp ","}* ")"
Abstract syntax: productions define
constructor and sorts of arguments
Concrete syntax: productions define
notation for language constructs
Abstract syntax
- Production defines constructor, argument sorts, result sort

- Abstract from notation: lexical elements of productions

Concrete syntax
- Productions define context-free grammar rules

Some details to discuss
- Ambiguities 

- Sequences 

- Lexical syntax

- Converting text to tree and back (parsing, unparsing)
!74
CFG with Constructors defines Abstract and Concrete Syntax
Ambiguity &
Disambiguation
75
Ambiguity
!76
t1 != t2 / format(t1) = format(t2)
Add
VarRef VarRef
“y”“x”
Mul
Int
“3”
Add
VarRef
VarRef
“y”
“x”
Mul
Int
“3”
3 * x + y
!77
Disambiguation Filters
!78
parse filter
Disambiguation Filters [Klint & Visser; 1994], [Van den Brand, Scheerder, Vinju, Visser; CC 2002]
Associativity and Priority
!79
context-free syntax
Expr.Int = INT
Expr.Add = "Expr" + "Expr" {left}
Expr.Mul = "Expr" * "Expr" {left}
context-free priorities
Expr.Mul > Expr.Add
Recent improvement: safe disambiguation of operator precedence [SLE13, Onward15, Programming18]
Add
VarRef VarRef
“y”“x”
Mul
Int
“3”
Add
VarRef
VarRef
“y”
“x”
Mul
Int
“3”
3 * x + y
context-free syntax
Exp.Add = Exp "+" Exp
Exp.Sub = Exp "-" Exp
Exp.Mul = Exp "*" Exp
Exp.Div = Exp "/" Exp
Exp.Eq = Exp "=" Exp
Exp.Neq = Exp "<>" Exp
Exp.Gt = Exp ">" Exp
Exp.Lt = Exp "<" Exp
Exp.Gte = Exp ">=" Exp
Exp.Lte = Exp "<=" Exp
Exp.And = Exp "&" Exp
Exp.Or = Exp "|" Exp
Ambiguous Operator Syntax
!80
Disambiguation by Associativity and Priority Rules
!81
context-free syntax
Exp.Add = Exp "+" Exp {left}
Exp.Sub = Exp "-" Exp {left}
Exp.Mul = Exp "*" Exp {left}
Exp.Div = Exp "/" Exp {left}
Exp.Eq = Exp "=" Exp {non-assoc}
Exp.Neq = Exp "<>" Exp {non-assoc}
Exp.Gt = Exp ">" Exp {non-assoc}
Exp.Lt = Exp "<" Exp {non-assoc}
Exp.Gte = Exp ">=" Exp {non-assoc}
Exp.Lte = Exp "<=" Exp {non-assoc}
Exp.And = Exp "&" Exp {left}
Exp.Or = Exp "|" Exp {left}
context-free priorities
{ left:
Exp.Mul
Exp.Div
} > { left:
Exp.Add
Exp.Sub
} > { non-assoc:
Exp.Eq
Exp.Neq
Exp.Gt
Exp.Lt
Exp.Gte
Exp.Lte
} > Exp.And
> Exp.Or
Disambiguation by Associativity and Priority Rules
!82
context-free syntax
Exp.Add = Exp "+" Exp {left}
Exp.Sub = Exp "-" Exp {left}
Exp.Mul = Exp "*" Exp {left}
Exp.Div = Exp "/" Exp {left}
Exp.Eq = Exp "=" Exp {non-assoc}
Exp.Neq = Exp "<>" Exp {non-assoc}
Exp.Gt = Exp ">" Exp {non-assoc}
Exp.Lt = Exp "<" Exp {non-assoc}
Exp.Gte = Exp ">=" Exp {non-assoc}
Exp.Lte = Exp "<=" Exp {non-assoc}
Exp.And = Exp "&" Exp {left}
Exp.Or = Exp "|" Exp {left}
context-free priorities
{ left:
Exp.Mul
Exp.Div
} > { left:
Exp.Add
Exp.Sub
} > { non-assoc:
Exp.Eq
Exp.Neq
Exp.Gt
Exp.Lt
Exp.Gte
Exp.Lte
} > Exp.And
> Exp.Or
1 + 3 <= 4 - 35 & 12 > 16
And(
Leq(
Plus(Int("1"), Int("3"))
, Minus(Int("4"), Int("35"))
)
, Gt(Int("12"), Int("16"))
)
Sequences (Lists)
83
Encoding Sequences (Lists)
!84
sorts Exp
context-free syntax
Exp.Int = IntConst
Exp.Var = Id
Exp.Times = Exp "*" Exp
Exp.Minus = Exp "-" Exp
Exp.Lt = Exp "<" Exp
Exp.If = "if" Exp "then" Exp "else" Exp
Exp.Call = Id "(" ExpList ")"
ExpList.Nil =
ExpList = ExpListNE
ExpListNE.One = Exp
ExpListNE.Snoc = ExpListNE “," Exp
printlist(merge(list1,list2))
Call("printlist"
, [Call("merge", [Var("list1")
, Var("list2")])]
)
Sugar for Sequences and Optionals
!85
context-free syntax
Exp.Call = Id "(" {Exp ","}* ")"
printlist(merge(list1,list2))
Call("printlist"
, [Call("merge", [Var("list1")
, Var("list2")])]
)
context-free syntax
// automatically generated
{Exp “,"}*.Nil = // empty list
{Exp “,”}* = {Exp ","}+
{Exp “,"}+.One = Exp
{Exp “,"}+.Snoc = {Exp ","}+ "," Exp
Exp*.Nil = // empty list
Exp* = Exp+
Exp+.One = Exp
Exp+.Snoc = Exp+ Exp
Exp?.None = // no expression
Exp?.Some = Exp // one expression
Normalizing Lists
!86
rules
Snoc(Nil(), x) -> Cons(x, Nil())
Snoc(Cons(x, xs), y) -> Cons(x, Snoc(xs, y))
One(x) -> Cons(x, Nil())
Nil() -> []
Cons(x, xs) -> [x | xs]
context-free syntax
// automatically generated
{Exp “,"}*.Nil = // empty list
{Exp “,”}* = {Exp ","}+
{Exp “,"}+.One = Exp
{Exp “,"}+.Snoc = {Exp ","}+ "," Exp
Exp*.Nil = // empty list
Exp* = Exp+
Exp+.One = Exp
Exp+.Snoc = Exp+ Exp
Exp?.None = // no expression
Exp?.Some = Exp // one expression
Using Sugar for Sequences
!87
module Functions
imports Identifiers
imports Types
context-free syntax
Dec = FunDec+
FunDec = "function" Id "(" {FArg ","}* ")" "=" Exp
FunDec = "function" Id "(" {FArg ","}* ")" ":" Type "=" Exp
FArg = Id ":" Type
Exp = Id "(" {Exp ","}* ")"
let function power(x: int, n: int): int =
if n <= 0 then 1
else x * power(x, n - 1)
in power(3, 10)
end
module Bindings
imports Control-Flow Identifiers Types Functions Variables
sorts Declarations
context-free syntax
Exp = "let" Dec* "in" {Exp ";"}* "end"
Declarations = "declarations" Dec*
Lexical Syntax
88
Context-Free Syntax vs Lexical Syntax
!89
let function power(x: int, n: int): int =
if n <= 0 then 1
else x * power(x, n - 1)
in power(3, 10)
end
Mod(
Let(
[ FunDecs(
[ FunDec(
"power"
, [FArg("x", Tid("int")), FArg("n", Tid("int"))]
, Tid("int")
, If(
Leq(Var("n"), Int("0"))
, Int("1")
, Times(
Var("x")
, Call(
"power"
, [Var("x"), Minus(Var("n"), Int("1"))]
)
)
)
)
]
)
]
, [Call("power", [Int("3"), Int("10")])]
)
)
phrase structure
lexeme / token
separated by layout
structure not relevant
not separated by layout
Character Classes
!90
lexical syntax // character codes
Character = [65]
Range = [65-90]
Union = [65-90] / [97-122]
Difference = [0-127] / [1013]
Union = [0-911-1214-255]
Character class represents choice from a set of characters
Sugar for Character Classes
!91
lexical syntax // sugar
CharSugar = [a]
= [97]
CharClass = [abcdefghijklmnopqrstuvwxyz]
= [97-122]
SugarRange = [a-z]
= [97-122]
Union = [a-z] / [A-Z] / [0-9]
= [48-5765-9097-122]
RangeCombi = [a-z0-9_]
= [48-579597-122]
Complement = ~[nr]
= [0-255] / [1013]
= [0-911-1214-255]
Literals are Sequences of Characters
!92
lexical syntax // literals
Literal = "then" // case sensitive sequence of characters
CaseInsensitive = 'then' // case insensitive sequence of characters
syntax
"then" = [116] [104] [101] [110]
'then' = [84116] [72104] [69101] [78110]
syntax
"then" = [t] [h] [e] [n]
'then' = [Tt] [Hh] [Ee] [Nn]
Identifiers
!93
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = a
Id = B
Id = cD
Id = xyz10
Id = internal_
Id = CamelCase
Id = lower_case
Id = ...
Lexical Ambiguity: Longest Match
!94
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left} // curried function call
ab
Mod(
amb(
[Var("ab"),
Call(Var("a"), Var("b"))]
)
)
abc
Lexical Ambiguity: Longest Match
!95
Mod(
amb(
[ amb(
[ Var("abc")
, Call(
amb(
[Var("ab"), Call(Var("a"), Var("b"))]
)
, Var("c")
)
]
)
, Call(Var("a"), Var("bc"))
]
)
)
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left} // curried function call
abc def ghi
Lexical Restriction => Longest Match
!96
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
Id -/- [a-zA-Z0-9_] // longest match for identifiers
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left} // curried function call
Call(Call(Var("abc"), Var("def")), Var("ghi"))
Lexical restriction: phrase cannot be followed by character in character class
if def then ghi
Lexical Ambiguity: Keywords overlap with Identifiers
!97
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
Id -/- [a-zA-Z0-9_] // longest match for identifiers
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left}
Exp.IfThen = "if" Exp "then" Exp
amb(
[ Mod(
Call(
Call(Call(Var("if"), Var("def")), Var("then"))
, Var("ghi")
)
)
, Mod(IfThen(Var("def"), Var("ghi")))
]
)
ifdef then ghi
Lexical Ambiguity: Keywords overlap with Identifiers
!98
amb(
[ Mod(
Call(Call(Var("ifdef"), Var("then")), Var("ghi"))
)
, Mod(IfThen(Var("def"), Var("ghi")))
]
)
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
Id -/- [a-zA-Z0-9_] // longest match for identifiers
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left}
Exp.IfThen = "if" Exp "then" Exp
Reject Productions => Reserved Words
!99
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = "if" {reject}
Id = "then" {reject}
lexical restrictions
Id -/- [a-zA-Z0-9_] // longest match for identifiers
"if" "then" -/- [a-zA-Z0-9_]
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left}
Exp.IfThen = "if" Exp "then" Exp
if def then ghi IfThen(Var("def"), Var("ghi"))
ifdef then ghi
Reject Productions => Reserved Words
!100
parse error
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = "if" {reject}
Id = "then" {reject}
lexical restrictions
Id -/- [a-zA-Z0-9_] // longest match for identifiers
"if" "then" -/- [a-zA-Z0-9_]
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left}
Exp.IfThen = "if" Exp "then" Exp
Tiger Lexical Syntax
101
Tiger Lexical Syntax: Identifiers
!102
module Identifiers
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
lexical restrictions
Id -/- [a-zA-Z0-9_]
Tiger Lexical Syntax: Number Literals
!103
module Numbers
lexical syntax
IntConst = [0-9]+
lexical syntax
RealConst.RealConstNoExp = IntConst "." IntConst
RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst
Sign = "+"
Sign = "-"
context-free syntax
Exp.Int = IntConst
Tiger Lexical Syntax: String Literals
!104
module Strings
sorts StrConst
lexical syntax
StrConst = """ StrChar* """
StrChar = ~["n]
StrChar = [] [n]
StrChar = [] [t]
StrChar = [] [^] [A-Z]
StrChar = [] [0-9] [0-9] [0-9]
StrChar = [] ["]
StrChar = [] []
StrChar = [] [ tn]+ []
context-free syntax // records
Exp.String = StrConst
Tiger Lexical Syntax: Whitespace
!105
module Whitespace
lexical syntax
LAYOUT = [ tnr]
context-free restrictions
// Ensure greedy matching for comments
LAYOUT? -/- [ tnr]
LAYOUT? -/- [/].[/]
LAYOUT? -/- [/].[*]
syntax
LAYOUT-CF = LAYOUT-LEX
LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
Implicit composition of layout
Tiger Lexical Syntax: Comment
!106
lexical syntax
CommentChar = [*]
LAYOUT = "/*" InsideComment* "*/"
InsideComment = ~[*]
InsideComment = CommentChar
lexical restrictions
CommentChar -/- [/]
context-free restrictions
LAYOUT? -/- [/].[*]
lexical syntax
LAYOUT = SingleLineComment
SingleLineComment = "//" ~[nr]* NewLineEOF
NewLineEOF = [nr]
NewLineEOF = EOF
EOF =
lexical restrictions
EOF -/- ~[]
context-free restrictions
LAYOUT? -/- [/].[/]
Desugaring Lexical Syntax
107
Core language
- context-free grammar productions

- with constructors

- only character classes as terminals

- explicit definition of layout

Desugaring
- express lexical syntax in terms of character classes

- explicate layout between context-free syntax symbols

- separate lexical and context-free syntax non-terminals
!108
Explication of Lexical Syntax
Explication of Layout by Transformation
!109
context-free syntax
Exp.Int = IntConst
Exp.Uminus = "-" Exp
Exp.Times = Exp "*" Exp {left}
Exp.Divide = Exp "/" Exp {left}
Exp.Plus = Exp "+" Exp {left}
syntax
Exp-CF.Int = IntConst-CF
Exp-CF.Uminus = "-" LAYOUT?-CF Exp-CF
Exp-CF.Times = Exp-CF LAYOUT?-CF "*" LAYOUT?-CF Exp-CF {left}
Exp-CF.Divide = Exp-CF LAYOUT?-CF "/" LAYOUT?-CF Exp-CF {left}
Exp-CF.Plus = Exp-CF LAYOUT?-CF "+" LAYOUT?-CF Exp-CF {left}
Symbols in context-free syntax are
implicitly separated by optional layout
Separation of Lexical and Context-free Syntax
!110
syntax
Id-LEX = [65-9097-122] [48-5765-909597-122]*-LEX
Id-LEX = "if" {reject}
Id-LEX = "then" {reject}
Id-CF = Id-LEX
Exp-CF.Var = Id-CF
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = "if" {reject}
Id = "then" {reject}
context-free syntax
Exp.Var = Id
syntax
Id = [65-9097-122] [48-5765-909597-122]*
Id = "if" {reject}
Id = "then" {reject}
Exp.Var = Id
Why Separation of Lexical and Context-Free Syntax?
!111
Homework: what would go wrong if we not do this?
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = "if" {reject}
Id = "then" {reject}
context-free syntax
Exp.Var = Id
!112
syntax
"if" = [105] [102]
"then" = [116] [104] [101] [110]
[48-5765-909597-122]+-LEX = [48-5765-909597-122]
[48-5765-909597-122]+-LEX = [48-5765-909597-122]+-LEX [48-5765-909597-122]
[48-5765-909597-122]*-LEX =
[48-5765-909597-122]*-LEX = [48-5765-909597-122]+-LEX
Id-LEX = [65-9097-122] [48-5765-909597-122]*-LEX
Id-LEX = "if" {reject}
Id-LEX = "then" {reject}
Id-CF = Id-LEX
Exp-CF.Var = Id-CF
Exp-CF.Call = Exp-CF LAYOUT?-CF Exp-CF {left}
Exp-CF.IfThen = "if" LAYOUT?-CF Exp-CF LAYOUT?-CF "then" LAYOUT?-CF Exp-CF
LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left}
LAYOUT?-CF = LAYOUT-CF
LAYOUT?-CF =
restrictions
Id-LEX -/- [48-5765-909597-122]
"if" -/- [48-5765-909597-122]
"then" -/- [48-5765-909597-122]
priorities
Exp-CF.Call left Exp-CF.Call,
LAYOUT-CF = LAYOUT-CF LAYOUT-CF left LAYOUT-CF = LAYOUT-CF LAYOUT-CF
separate lexical and
context-free syntax
separate context-
free symbols by
optional layout
character classes
as only terminals
lexical syntax
Id = [a-zA-Z] [a-zA-Z0-9_]*
Id = “if" {reject}
Id = "then" {reject}
lexical restrictions
Id -/- [a-zA-Z0-9_]
"if" "then" -/- [a-zA-Z0-9_]
context-free syntax
Exp.Var = Id
Exp.Call = Exp Exp {left}
Exp.IfThen = "if" Exp "then" Exp
Testing Syntax Definitions
with SPT
113
Are you defining the right language?
Test Suites
!114
module example-suite
language Tiger
start symbol Start
test name
[[...]]
parse succeeds
test another name
[[...]]
parse fails
SPT: SPoofax Testing language
Success is Failure, Failure is Success, …
!115
module success-failure
language Tiger
test this test succeeds [[
1 + 3 * 4
]] parse succeeds
test this test fails [[
1 + 3 * 4
]] parse fails
test this test fails [[
1 + 3 *
]] parse succeeds
test this test succeeds [[
1 + 3 *
]] parse fails
Test Cases: Valid
!116
Language
valid program
test valid program
[[...]]
parse succeeds
Test Cases: Invalid
!117
test invalid program
[[...]]
parse fails
invalid program
Language
Test Cases
!118
module syntax/identifiers
language Tiger start symbol Id
test single lower case [[x]] parse succeeds
test single upper case [[X]] parse succeeds
test single digit [[1]] parse fails
test single lc digit [[x1]] parse succeeds
test single digit lc [[1x]] parse fails
test single uc digit [[X1]] parse succeeds
test single digit uc [[1X]] parse fails
test double digit [[11]] parse fails
Test Corner Cases
!119
Language
Testing Structure
!120
module structure
language Tiger
test add times [[
21 + 14 + 7
]] parse to Mod(Plus(Int(“21"),Times(Int("14"),Int("7"))))
test times add [[
3 * 7 + 21
]] parse to Mod(Plus(Times(Int(“3"),Int("7")),Int("21")))
test if [[
if x then 3 else 4
]] parse to Mod(If(Var("x"),Int("3"),Int("4")))
The fragment did not parse to the expected ATerm. Parse result was:
Mod(Plus(Plus(Int("21"),Int("14")),Int("7"))) Expected result was:
Mod(Plus(Int(21), Times(Int(14), Int(7))))
Testing Ambiguity
!121
Note : this does not actually work in Tiger, since () is a
sequencing construct; but works fine in Mini-Java
module precedence
language Tiger start symbol Exp
test parentheses
[[(42)]] parse to [[42]]
test left-associative addition
[[21 + 14 + 7]] parse to [[(21 + 14) + 7]]
test precedence multiplication
[[3 * 7 + 21]] parse to [[(3 * 7) + 21]]
Testing Ambiguity
!122
module precedence
language Tiger start symbol Exp
test plus/times priority [[
x + 4 * 5
]] parse to Plus(Var("x"), Times(Int("4"), Int("5")))
test plus/times sequence [[
(x + 4) * 5
]] parse to Times(
Seq([Plus(Var("x"), Int("4"))])
, Int("5")
)
Syntax Engineering in
Spoofax
123
Developing syntax definition
- Define syntax of language in multiple modules

- Syntax checking, colouring

- Checking for undefined non-terminals

Testing syntax definition
- Write example programs in editor for language under def

- Inspect abstract syntax terms

‣ Spoofax > Syntax > Show Parsed AST

- Write SPT test for success and failure cases

‣ Updated after build of syntax definition
!125
Syntax Engineering in Spoofax
Declarative Syntax
Definition: Summary
126
Syntax definition
- Define structure (decomposition) of programs

- Define concrete syntax: notation

- Define abstract syntax: constructors

Using syntax definitions (next)
- Parsing: converting text to abstract syntax term

- Pretty-printing: convert abstract syntax term to text

- Editor services: syntax highlighting, syntax checking, completion
!127
Declarative Syntax Definition
Next: Parsing
128
Except where otherwise noted, this work is licensed under

Mais conteúdo relacionado

Mais procurados

Introduction of bison
Introduction of bisonIntroduction of bison
Introduction of bison
vip_du
 

Mais procurados (20)

Declare Your Language: Transformation by Strategic Term Rewriting
Declare Your Language: Transformation by Strategic Term RewritingDeclare Your Language: Transformation by Strategic Term Rewriting
Declare Your Language: Transformation by Strategic Term Rewriting
 
CS4200 2019 | Lecture 3 | Parsing
CS4200 2019 | Lecture 3 | ParsingCS4200 2019 | Lecture 3 | Parsing
CS4200 2019 | Lecture 3 | Parsing
 
Declare Your Language: Type Checking
Declare Your Language: Type CheckingDeclare Your Language: Type Checking
Declare Your Language: Type Checking
 
46630497 fun-pointer-1
46630497 fun-pointer-146630497 fun-pointer-1
46630497 fun-pointer-1
 
Compiler Construction | Lecture 12 | Virtual Machines
Compiler Construction | Lecture 12 | Virtual MachinesCompiler Construction | Lecture 12 | Virtual Machines
Compiler Construction | Lecture 12 | Virtual Machines
 
Lex (lexical analyzer)
Lex (lexical analyzer)Lex (lexical analyzer)
Lex (lexical analyzer)
 
Advanced C - Part 2
Advanced C - Part 2Advanced C - Part 2
Advanced C - Part 2
 
Python unit 2 as per Anna university syllabus
Python unit 2 as per Anna university syllabusPython unit 2 as per Anna university syllabus
Python unit 2 as per Anna university syllabus
 
Thinking in Functions: Functional Programming in Python
Thinking in Functions: Functional Programming in PythonThinking in Functions: Functional Programming in Python
Thinking in Functions: Functional Programming in Python
 
Functional programming
Functional programmingFunctional programming
Functional programming
 
Advance python programming
Advance python programming Advance python programming
Advance python programming
 
Functional programming ii
Functional programming iiFunctional programming ii
Functional programming ii
 
C++ theory
C++ theoryC++ theory
C++ theory
 
GE8151 Problem Solving and Python Programming
GE8151 Problem Solving and Python ProgrammingGE8151 Problem Solving and Python Programming
GE8151 Problem Solving and Python Programming
 
Chapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQChapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQ
 
Python Programming
Python ProgrammingPython Programming
Python Programming
 
Python programming workshop session 1
Python programming workshop session 1Python programming workshop session 1
Python programming workshop session 1
 
Library functions in c++
Library functions in c++Library functions in c++
Library functions in c++
 
Python 3000
Python 3000Python 3000
Python 3000
 
Introduction of bison
Introduction of bisonIntroduction of bison
Introduction of bison
 

Semelhante a CS4200 2019 | Lecture 2 | syntax-definition

Dam31303 dti2143 lab sheet 7
Dam31303 dti2143 lab sheet 7Dam31303 dti2143 lab sheet 7
Dam31303 dti2143 lab sheet 7
alish sha
 
Dti2143 lab sheet 6
Dti2143 lab sheet 6Dti2143 lab sheet 6
Dti2143 lab sheet 6
alish sha
 
Dti2143 lab sheet 9
Dti2143 lab sheet 9Dti2143 lab sheet 9
Dti2143 lab sheet 9
alish sha
 
Datastructure notes
Datastructure notesDatastructure notes
Datastructure notes
Srikanth
 

Semelhante a CS4200 2019 | Lecture 2 | syntax-definition (20)

Dam31303 dti2143 lab sheet 7
Dam31303 dti2143 lab sheet 7Dam31303 dti2143 lab sheet 7
Dam31303 dti2143 lab sheet 7
 
C++ manual Report Full
C++ manual Report FullC++ manual Report Full
C++ manual Report Full
 
Functions
FunctionsFunctions
Functions
 
Functions in c
Functions in cFunctions in c
Functions in c
 
Functions
FunctionsFunctions
Functions
 
6. function
6. function6. function
6. function
 
Dti2143 lab sheet 6
Dti2143 lab sheet 6Dti2143 lab sheet 6
Dti2143 lab sheet 6
 
An imperative study of c
An imperative study of cAn imperative study of c
An imperative study of c
 
Unit 4 (1)
Unit 4 (1)Unit 4 (1)
Unit 4 (1)
 
Function in c program
Function in c programFunction in c program
Function in c program
 
Program presentation
Program presentationProgram presentation
Program presentation
 
Array Cont
Array ContArray Cont
Array Cont
 
Fundamentals of computer programming by Dr. A. Charan Kumari
Fundamentals of computer programming by Dr. A. Charan KumariFundamentals of computer programming by Dr. A. Charan Kumari
Fundamentals of computer programming by Dr. A. Charan Kumari
 
Functions
FunctionsFunctions
Functions
 
Dti2143 lab sheet 9
Dti2143 lab sheet 9Dti2143 lab sheet 9
Dti2143 lab sheet 9
 
C function
C functionC function
C function
 
CHAPTER 6
CHAPTER 6CHAPTER 6
CHAPTER 6
 
7 functions
7  functions7  functions
7 functions
 
Datastructure notes
Datastructure notesDatastructure notes
Datastructure notes
 
Preprocessor directives
Preprocessor directivesPreprocessor directives
Preprocessor directives
 

Mais de Eelco Visser

Mais de Eelco Visser (18)

CS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: IntroductionCS4200 2019 Lecture 1: Introduction
CS4200 2019 Lecture 1: Introduction
 
A Direct Semantics of Declarative Disambiguation Rules
A Direct Semantics of Declarative Disambiguation RulesA Direct Semantics of Declarative Disambiguation Rules
A Direct Semantics of Declarative Disambiguation Rules
 
Compiler Construction | Lecture 17 | Beyond Compiler Construction
Compiler Construction | Lecture 17 | Beyond Compiler ConstructionCompiler Construction | Lecture 17 | Beyond Compiler Construction
Compiler Construction | Lecture 17 | Beyond Compiler Construction
 
Domain Specific Languages for Parallel Graph AnalytiX (PGX)
Domain Specific Languages for Parallel Graph AnalytiX (PGX)Domain Specific Languages for Parallel Graph AnalytiX (PGX)
Domain Specific Languages for Parallel Graph AnalytiX (PGX)
 
Compiler Construction | Lecture 15 | Memory Management
Compiler Construction | Lecture 15 | Memory ManagementCompiler Construction | Lecture 15 | Memory Management
Compiler Construction | Lecture 15 | Memory Management
 
Compiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code GenerationCompiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code Generation
 
Compiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone FrameworksCompiler Construction | Lecture 11 | Monotone Frameworks
Compiler Construction | Lecture 11 | Monotone Frameworks
 
Compiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow AnalysisCompiler Construction | Lecture 10 | Data-Flow Analysis
Compiler Construction | Lecture 10 | Data-Flow Analysis
 
Compiler Construction | Lecture 7 | Type Checking
Compiler Construction | Lecture 7 | Type CheckingCompiler Construction | Lecture 7 | Type Checking
Compiler Construction | Lecture 7 | Type Checking
 
Compiler Construction | Lecture 6 | Introduction to Static Analysis
Compiler Construction | Lecture 6 | Introduction to Static AnalysisCompiler Construction | Lecture 6 | Introduction to Static Analysis
Compiler Construction | Lecture 6 | Introduction to Static Analysis
 
Compiler Construction | Lecture 4 | Parsing
Compiler Construction | Lecture 4 | Parsing Compiler Construction | Lecture 4 | Parsing
Compiler Construction | Lecture 4 | Parsing
 
Compiler Construction | Lecture 3 | Syntactic Editor Services
Compiler Construction | Lecture 3 | Syntactic Editor ServicesCompiler Construction | Lecture 3 | Syntactic Editor Services
Compiler Construction | Lecture 3 | Syntactic Editor Services
 
Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?
 
Declare Your Language: Virtual Machines & Code Generation
Declare Your Language: Virtual Machines & Code GenerationDeclare Your Language: Virtual Machines & Code Generation
Declare Your Language: Virtual Machines & Code Generation
 
Declare Your Language: Dynamic Semantics
Declare Your Language: Dynamic SemanticsDeclare Your Language: Dynamic Semantics
Declare Your Language: Dynamic Semantics
 
Declare Your Language: Constraint Resolution 2
Declare Your Language: Constraint Resolution 2Declare Your Language: Constraint Resolution 2
Declare Your Language: Constraint Resolution 2
 
Declare Your Language: Constraint Resolution 1
Declare Your Language: Constraint Resolution 1Declare Your Language: Constraint Resolution 1
Declare Your Language: Constraint Resolution 1
 
Declare Your Language: Name Resolution
Declare Your Language: Name ResolutionDeclare Your Language: Name Resolution
Declare Your Language: Name Resolution
 

Último

%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 

Último (20)

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 

CS4200 2019 | Lecture 2 | syntax-definition

  • 1. Lecture 2: Declarative Syntax Definition CS4200 Compiler Construction Eelco Visser TU Delft September 2019
  • 2. This Lecture !2 Java Type Check JVM bytecode Parse CodeGenOptimize Specification of syntax definition from which we can derive parsers
  • 5. Architecture: Language Workbench vs Compiler !5 Java Type Check JVM bytecode Parse CodeGenOptimize
  • 6. Language Workbench: Live Language Development !6
  • 8. The perspective of this lecture on declarative syntax definition is explained more elaborately in this Onward! 2010 essay. It uses an on older version of SDF than used in these slides. Production rules have the form X1 … Xn -> N {cons(“C”)} instead of N.C = X1 … Xn http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf https://doi.org/10.1145/1932682.1869535
  • 9. The SPoofax Testing (SPT) language used in the section on testing syntax definitions was introduced in this OOPSLA 2011 paper. https://doi.org/10.1145/2076021.2048080 http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-011.pdf
  • 10. The SDF3 syntax definition formalism is documented at the metaborg.org website. http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
  • 12. What is Syntax? !12 https://en.wikipedia.org/wiki/Syntax In mathematics, syntax refers to the rules governing the behavior of mathematical systems, such as formal languages used in logic. (See logical syntax.) In linguistics, syntax (/ˈsɪntæks/[1][2]) is the set of rules, principles, and processes that govern the structure of sentences in a given language, specifically word order and punctuation. The term syntax is also used to refer to the study of such principles and processes.[3] The goal of many syntacticians is to discover the syntactic rules common to all languages. The word syntax comes from Ancient Greek: σύνταξις "coordination", which consists of σύν syn, "together," and τάξις táxis, "an ordering".
  • 13. Syntax (Programming Languages) !13 In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language. This applies both to programming languages, where the document represents source code, and markup languages, where the document represents data. The syntax of a language defines its surface form.[1] Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical). Documents that are syntactically invalid are said to have a syntax error. https://en.wikipedia.org/wiki/Syntax_(programming_languages)
  • 14. Syntax - The set of rules, principles, and processes that govern the structure of sentences in a given language - The set of rules that defines the combinations of symbols that are considered to be a correctly structured document or fragment in that language How to describe such a set of rules? !14 That Govern the Structure
  • 15. The Structure of Programs 15
  • 16. What do we call the elements of programs? !16 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; }
  • 17. What kind of program element is this? !17 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Program Compilation Unit
  • 18. What kind of program element is this? !18 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Preprocessor Directive
  • 19. What kind of program element is this? !19 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Declaration Function Prototype
  • 20. What kind of program element is this? !20 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Comment
  • 21. What kind of program element is this? !21 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Function Definition
  • 22. What kind of program element is this? !22 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Variable Declaration
  • 23. What kind of program element is this? !23 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement For Loop
  • 24. What kind of program element is this? !24 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Statement Function Call
  • 25. What kind of program element is this? !25 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Expression
  • 26. What kind of program element is this? !26 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Formal Function Parameter
  • 27. What kind of program element is this? !27 #include <stdio.h> int power(int m, int n); /* test power function */ main() { int i; for (i = 0; i < 10; ++i) printf("%d %d %dn", i, power(2, i), power(-3, i)); return 0; } /* power: raise base to n-th power; n >= 0 */ int power(int base, int n) { int i, p; p = 1; for (i = 1; i <= n; ++i) p = p * base; return p; } Type
  • 28. Syntactic Categories !28 Program Compilation Unit Preprocessor Directive Function Declaration Function Prototype Function Definition Variable Declaration Statement For Loop Expression Formal Function Parameter Type Function Call Programs consist of different kinds of elements
  • 29. Hierarchy of Syntactic Categories !29 Program Compilation Unit Preprocessor Directive Function Declaration Function Prototype Function Definition Variable Declaration Statement For LoopExpression Formal Function Parameter Type Function Call Some kinds of constructs are contained in others
  • 30. The Tiger Language !30 Example language used in lectures https://www.lrde.epita.fr/~tiger/tiger.html Documentation https://github.com/MetaBorgCube/metaborg-tiger Spoofax project
  • 31. !31 let var N := 8 type intArray = array of int var row := intArray[N] of 0 var col := intArray[N] of 0 var diag1 := intArray[N + N - 1] of 0 var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n") ); print("n")) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 & diag2[r + 7 - c] = 0 then ( row[r] := 1; diag1[r + c] := 1; diag2[r + 7 - c] := 1; col[c] := r; try(c + 1); row[r] := 0; diag1[r + c] := 0; diag2[r + 7 - c] := 0)) in try(0) end A Tiger program that solves the n-queens problem
  • 32. !32 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end A mutilated n-queens Tiger program with ‘redundant’ elements removed What are the syntactic categories of Tiger? What are the language constructs of Tiger called?
  • 33. !33 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Program Expression Let Binding
  • 34. !34 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Variable Declaration
  • 35. !35 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Declaration
  • 36. !36 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Type Expression
  • 37. !37 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Array Initialization
  • 38. !38 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Definition
  • 39. !39 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Name
  • 40. !40 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Function Body Expression For Loop
  • 41. !41 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Sequential Composition
  • 42. !42 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Formal Parameter
  • 43. !43 let var N := 8 type intArray = array of int var diag2 := intArray[N + N - 1] of 0 function printboard() = ( for i := 0 to N - 1 do ( for j := 0 to N - 1 do print(if col[i] = j then " O" else " ."); print("n"))) function try(c : int) = ( if c = N then printboard() else for r := 0 to N - 1 do if row[r] = 0 & diag1[r + c] = 0 then ( diag2[r + 7 - c] := 1; try(c + 1);)) in try(0) end Expression Assignment
  • 44. !44 let … function prettyprint(tree : tree) : string = let var output := "" function write(s : string) = output := concat(output, s) function show(n : int, t : tree) = let function indent(s : string) = ( write("n"); for i := 1 to n do write(" ") ; output := concat(output, s)) in if t = nil then indent(".") else ( indent(t.key); show(n + 1, t.left); show(n + 1, t.right)) end in show(0, tree); output end … in … end Functions can be nested
  • 45. Structure - Programs have structure Categories - Program elements come in multiple categories - Elements cannot be arbitrarily interchanged Constructs - Some categories have multiple elements Hierarchy - Categories are organized in a hierarchy !45 Elements of Programs
  • 47. Decomposing a Program into Elements !47 let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 48. Decomposing a Program into Elements !48 function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) fact(10) Exp.Let
  • 49. Decomposing a Program into Elements !49 fact(10) Exp.Let n : int if n < 1 then 1 else n * fact(n - 1) FunDec fact int
  • 50. Decomposing a Program into Elements !50 fact(10) Exp.Let int if n < 1 then 1 else n * fact(n - 1) FunDec fact n FArg Tid int Tid
  • 51. Decomposing a Program into Elements !51 fact(10) Exp.Let int n < 1 FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid
  • 52. Decomposing a Program into Elements !52 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 n * fact(n - 1) int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int
  • 53. Decomposing a Program into Elements !53 fact(10) Exp.Let int n FunDec fact n FArg Tid Exp.If 1 fact int Tid Exp.Lt Exp.Int 1 Exp.Var Exp.Int Exp.Times n Exp.Var Exp.Call n - 1 Etc. let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 54. Tree Structure Represented as (First-Order) Term !54 Mod( Let( [ FunDecs( [ FunDec( "fact" , [FArg("n", Tid("int"))] , Tid("int") , If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) ) ] ) ] , [Call("fact", [Int("10")])] ) ) let function fact(n : int) : int = if n < 1 then 1 else n * fact(n - 1) in fact(10) end
  • 55. Textual representation - Convenient to read and write (human processing) - Concrete syntax / notation Structural tree/term representation - Represents the decomposition of a program into elements - Convenient for machine processing - Abstract syntax !55 Decomposing Programs
  • 56. What are well-formed textual programs? What are well-formed terms/trees? How to decompose programs automatically? !56 Formalizing Program Decomposition
  • 58. Algebraic Signatures !58 signature sorts S0 S1 S2 … constructors C : S1 * S2 * … -> S0 … Sorts: syntactic categories Constructors: language constructs
  • 59. Well-Formed Terms !59 The family of well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0)
  • 60. Well-Formed Terms: Example !60 signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) decompose well-formed wrt
  • 61. Lists of Terms !61 signature sorts Exp constructors … Call : ID * List(Exp) -> Exp [Minus(Var("n"), Int(“1”))] [Minus(Var("n"), Int(“1”)), Lt(Var("n"), Int("1"))] []
  • 62. Well-Formed Terms with Lists !62 The family of well-formed terms T(Sig) defined by signature Sig is inductively defined as follows: If C : S1 * S2 * … -> S0 is a constructor in Sig and 
 if t1, t2, … are terms in T(Sig)(S1), T(Sig)(S2), …, 
 then C(t1, t2, …) is a term in T(Sig)(S0) If t1, t2, … are terms in T(Sig)(S), Then [t1, t2, …] is a term in T(Sig)(List(S))
  • 63. Abstract syntax of a language - Defined by algebraic signature - Sorts: syntactic categories - Constructors: language constructs Program structure - Represented by (first-order) term - Well-formed with respect to abstract syntax - (Isomorphic to tree structure) !63 Abstract Syntax
  • 64. From Abstract Syntax to Concrete Syntax 64
  • 65. What does Abstract Syntax Abstract from? !65 signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Signature does not define ‘notation’
  • 66. What is Notation? !66 signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp Notation: literals, keywords, delimiters, punctuation n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1)
  • 67. How can we couple notation to abstract syntax? !67 signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp n x e1 * e2 e1 - e2 e1 < e2 if e1 then e2 else e3 f(e1, e2, …) if n < 1 then 1 else n * fact(n - 1) Notation: literals, keywords, delimiters, punctuation
  • 68. Context-Free Grammars !68 grammar non-terminals N0 N1 N2 … terminals T0 T1 T2 … productions N0 = S1 S2 … … Non-terminals (N): syntactic categories Terminals (T): words of sentences Productions: rules to create sentences Symbols (S): non-terminals and terminals
  • 69. Well-Formed Sentences !69 The family of sentences L(G) defined by context-free grammar G is inductively defined as follows: A terminal T is a sentence in L(G)(T) If N0 = S1 S2 … is a production in G and 
 if w1, w2, … are sentences in L(G)(S1), L(G)(S2), …, 
 then w1 w2 … is a sentence in L(G)(N0)
  • 70. Well-Formed Sentences !70 if n < 1 then 1 else n * fact(n - 1) grammar non-terminals Exp productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
  • 71. What is the relation between concrete and abstract syntax? !71 if n < 1 then 1 else n * fact(n - 1) If( Lt(Var("n"), Int("1")) , Int("1") , Times( Var("n") , Call("fact", [Minus(Var("n"), Int("1"))]) ) ) grammar non-terminals Exp productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")" ? ? signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp
  • 72. Context-Free Grammars with Constructor Declarations !72 sorts Exp context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" signature sorts Exp constructors Int : IntConst -> Exp Var : ID -> Exp Times : Exp * Exp -> Exp Minus : Exp * Exp -> Exp Lt : Exp * Exp -> Exp If : Exp * Exp * Exp -> Exp Call : ID * List(Exp) -> Exp grammar non-terminals Exp productions Exp = IntConst Exp = Id Exp = Exp "*" Exp Exp = Exp "-" Exp Exp = Exp "<" Exp Exp = "if" Exp "then" Exp "else" Exp Exp = Id "(" {Exp ","}* ")"
  • 73. Context-Free Grammars with Constructor Declarations !73 sorts Exp context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" {Exp ","}* ")" Abstract syntax: productions define constructor and sorts of arguments Concrete syntax: productions define notation for language constructs
  • 74. Abstract syntax - Production defines constructor, argument sorts, result sort - Abstract from notation: lexical elements of productions Concrete syntax - Productions define context-free grammar rules Some details to discuss - Ambiguities - Sequences - Lexical syntax - Converting text to tree and back (parsing, unparsing) !74 CFG with Constructors defines Abstract and Concrete Syntax
  • 76. Ambiguity !76 t1 != t2 / format(t1) = format(t2) Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
  • 77. !77
  • 78. Disambiguation Filters !78 parse filter Disambiguation Filters [Klint & Visser; 1994], [Van den Brand, Scheerder, Vinju, Visser; CC 2002]
  • 79. Associativity and Priority !79 context-free syntax Expr.Int = INT Expr.Add = "Expr" + "Expr" {left} Expr.Mul = "Expr" * "Expr" {left} context-free priorities Expr.Mul > Expr.Add Recent improvement: safe disambiguation of operator precedence [SLE13, Onward15, Programming18] Add VarRef VarRef “y”“x” Mul Int “3” Add VarRef VarRef “y” “x” Mul Int “3” 3 * x + y
  • 80. context-free syntax Exp.Add = Exp "+" Exp Exp.Sub = Exp "-" Exp Exp.Mul = Exp "*" Exp Exp.Div = Exp "/" Exp Exp.Eq = Exp "=" Exp Exp.Neq = Exp "<>" Exp Exp.Gt = Exp ">" Exp Exp.Lt = Exp "<" Exp Exp.Gte = Exp ">=" Exp Exp.Lte = Exp "<=" Exp Exp.And = Exp "&" Exp Exp.Or = Exp "|" Exp Ambiguous Operator Syntax !80
  • 81. Disambiguation by Associativity and Priority Rules !81 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or
  • 82. Disambiguation by Associativity and Priority Rules !82 context-free syntax Exp.Add = Exp "+" Exp {left} Exp.Sub = Exp "-" Exp {left} Exp.Mul = Exp "*" Exp {left} Exp.Div = Exp "/" Exp {left} Exp.Eq = Exp "=" Exp {non-assoc} Exp.Neq = Exp "<>" Exp {non-assoc} Exp.Gt = Exp ">" Exp {non-assoc} Exp.Lt = Exp "<" Exp {non-assoc} Exp.Gte = Exp ">=" Exp {non-assoc} Exp.Lte = Exp "<=" Exp {non-assoc} Exp.And = Exp "&" Exp {left} Exp.Or = Exp "|" Exp {left} context-free priorities { left: Exp.Mul Exp.Div } > { left: Exp.Add Exp.Sub } > { non-assoc: Exp.Eq Exp.Neq Exp.Gt Exp.Lt Exp.Gte Exp.Lte } > Exp.And > Exp.Or 1 + 3 <= 4 - 35 & 12 > 16 And( Leq( Plus(Int("1"), Int("3")) , Minus(Int("4"), Int("35")) ) , Gt(Int("12"), Int("16")) )
  • 84. Encoding Sequences (Lists) !84 sorts Exp context-free syntax Exp.Int = IntConst Exp.Var = Id Exp.Times = Exp "*" Exp Exp.Minus = Exp "-" Exp Exp.Lt = Exp "<" Exp Exp.If = "if" Exp "then" Exp "else" Exp Exp.Call = Id "(" ExpList ")" ExpList.Nil = ExpList = ExpListNE ExpListNE.One = Exp ExpListNE.Snoc = ExpListNE “," Exp printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] )
  • 85. Sugar for Sequences and Optionals !85 context-free syntax Exp.Call = Id "(" {Exp ","}* ")" printlist(merge(list1,list2)) Call("printlist" , [Call("merge", [Var("list1") , Var("list2")])] ) context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
  • 86. Normalizing Lists !86 rules Snoc(Nil(), x) -> Cons(x, Nil()) Snoc(Cons(x, xs), y) -> Cons(x, Snoc(xs, y)) One(x) -> Cons(x, Nil()) Nil() -> [] Cons(x, xs) -> [x | xs] context-free syntax // automatically generated {Exp “,"}*.Nil = // empty list {Exp “,”}* = {Exp ","}+ {Exp “,"}+.One = Exp {Exp “,"}+.Snoc = {Exp ","}+ "," Exp Exp*.Nil = // empty list Exp* = Exp+ Exp+.One = Exp Exp+.Snoc = Exp+ Exp Exp?.None = // no expression Exp?.Some = Exp // one expression
  • 87. Using Sugar for Sequences !87 module Functions imports Identifiers imports Types context-free syntax Dec = FunDec+ FunDec = "function" Id "(" {FArg ","}* ")" "=" Exp FunDec = "function" Id "(" {FArg ","}* ")" ":" Type "=" Exp FArg = Id ":" Type Exp = Id "(" {Exp ","}* ")" let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end module Bindings imports Control-Flow Identifiers Types Functions Variables sorts Declarations context-free syntax Exp = "let" Dec* "in" {Exp ";"}* "end" Declarations = "declarations" Dec*
  • 89. Context-Free Syntax vs Lexical Syntax !89 let function power(x: int, n: int): int = if n <= 0 then 1 else x * power(x, n - 1) in power(3, 10) end Mod( Let( [ FunDecs( [ FunDec( "power" , [FArg("x", Tid("int")), FArg("n", Tid("int"))] , Tid("int") , If( Leq(Var("n"), Int("0")) , Int("1") , Times( Var("x") , Call( "power" , [Var("x"), Minus(Var("n"), Int("1"))] ) ) ) ) ] ) ] , [Call("power", [Int("3"), Int("10")])] ) ) phrase structure lexeme / token separated by layout structure not relevant not separated by layout
  • 90. Character Classes !90 lexical syntax // character codes Character = [65] Range = [65-90] Union = [65-90] / [97-122] Difference = [0-127] / [1013] Union = [0-911-1214-255] Character class represents choice from a set of characters
  • 91. Sugar for Character Classes !91 lexical syntax // sugar CharSugar = [a] = [97] CharClass = [abcdefghijklmnopqrstuvwxyz] = [97-122] SugarRange = [a-z] = [97-122] Union = [a-z] / [A-Z] / [0-9] = [48-5765-9097-122] RangeCombi = [a-z0-9_] = [48-579597-122] Complement = ~[nr] = [0-255] / [1013] = [0-911-1214-255]
  • 92. Literals are Sequences of Characters !92 lexical syntax // literals Literal = "then" // case sensitive sequence of characters CaseInsensitive = 'then' // case insensitive sequence of characters syntax "then" = [116] [104] [101] [110] 'then' = [84116] [72104] [69101] [78110] syntax "then" = [t] [h] [e] [n] 'then' = [Tt] [Hh] [Ee] [Nn]
  • 93. Identifiers !93 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = a Id = B Id = cD Id = xyz10 Id = internal_ Id = CamelCase Id = lower_case Id = ...
  • 94. Lexical Ambiguity: Longest Match !94 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call ab Mod( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) )
  • 95. abc Lexical Ambiguity: Longest Match !95 Mod( amb( [ amb( [ Var("abc") , Call( amb( [Var("ab"), Call(Var("a"), Var("b"))] ) , Var("c") ) ] ) , Call(Var("a"), Var("bc")) ] ) ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call
  • 96. abc def ghi Lexical Restriction => Longest Match !96 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} // curried function call Call(Call(Var("abc"), Var("def")), Var("ghi")) Lexical restriction: phrase cannot be followed by character in character class
  • 97. if def then ghi Lexical Ambiguity: Keywords overlap with Identifiers !97 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp amb( [ Mod( Call( Call(Call(Var("if"), Var("def")), Var("then")) , Var("ghi") ) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] )
  • 98. ifdef then ghi Lexical Ambiguity: Keywords overlap with Identifiers !98 amb( [ Mod( Call(Call(Var("ifdef"), Var("then")), Var("ghi")) ) , Mod(IfThen(Var("def"), Var("ghi"))) ] ) lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
  • 99. Reject Productions => Reserved Words !99 lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp if def then ghi IfThen(Var("def"), Var("ghi"))
  • 100. ifdef then ghi Reject Productions => Reserved Words !100 parse error lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] // longest match for identifiers "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
  • 102. Tiger Lexical Syntax: Identifiers !102 module Identifiers lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* lexical restrictions Id -/- [a-zA-Z0-9_]
  • 103. Tiger Lexical Syntax: Number Literals !103 module Numbers lexical syntax IntConst = [0-9]+ lexical syntax RealConst.RealConstNoExp = IntConst "." IntConst RealConst.RealConst = IntConst "." IntConst "e" Sign IntConst Sign = "+" Sign = "-" context-free syntax Exp.Int = IntConst
  • 104. Tiger Lexical Syntax: String Literals !104 module Strings sorts StrConst lexical syntax StrConst = """ StrChar* """ StrChar = ~["n] StrChar = [] [n] StrChar = [] [t] StrChar = [] [^] [A-Z] StrChar = [] [0-9] [0-9] [0-9] StrChar = [] ["] StrChar = [] [] StrChar = [] [ tn]+ [] context-free syntax // records Exp.String = StrConst
  • 105. Tiger Lexical Syntax: Whitespace !105 module Whitespace lexical syntax LAYOUT = [ tnr] context-free restrictions // Ensure greedy matching for comments LAYOUT? -/- [ tnr] LAYOUT? -/- [/].[/] LAYOUT? -/- [/].[*] syntax LAYOUT-CF = LAYOUT-LEX LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left} Implicit composition of layout
  • 106. Tiger Lexical Syntax: Comment !106 lexical syntax CommentChar = [*] LAYOUT = "/*" InsideComment* "*/" InsideComment = ~[*] InsideComment = CommentChar lexical restrictions CommentChar -/- [/] context-free restrictions LAYOUT? -/- [/].[*] lexical syntax LAYOUT = SingleLineComment SingleLineComment = "//" ~[nr]* NewLineEOF NewLineEOF = [nr] NewLineEOF = EOF EOF = lexical restrictions EOF -/- ~[] context-free restrictions LAYOUT? -/- [/].[/]
  • 108. Core language - context-free grammar productions - with constructors - only character classes as terminals - explicit definition of layout Desugaring - express lexical syntax in terms of character classes - explicate layout between context-free syntax symbols - separate lexical and context-free syntax non-terminals !108 Explication of Lexical Syntax
  • 109. Explication of Layout by Transformation !109 context-free syntax Exp.Int = IntConst Exp.Uminus = "-" Exp Exp.Times = Exp "*" Exp {left} Exp.Divide = Exp "/" Exp {left} Exp.Plus = Exp "+" Exp {left} syntax Exp-CF.Int = IntConst-CF Exp-CF.Uminus = "-" LAYOUT?-CF Exp-CF Exp-CF.Times = Exp-CF LAYOUT?-CF "*" LAYOUT?-CF Exp-CF {left} Exp-CF.Divide = Exp-CF LAYOUT?-CF "/" LAYOUT?-CF Exp-CF {left} Exp-CF.Plus = Exp-CF LAYOUT?-CF "+" LAYOUT?-CF Exp-CF {left} Symbols in context-free syntax are implicitly separated by optional layout
  • 110. Separation of Lexical and Context-free Syntax !110 syntax Id-LEX = [65-9097-122] [48-5765-909597-122]*-LEX Id-LEX = "if" {reject} Id-LEX = "then" {reject} Id-CF = Id-LEX Exp-CF.Var = Id-CF lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} context-free syntax Exp.Var = Id
  • 111. syntax Id = [65-9097-122] [48-5765-909597-122]* Id = "if" {reject} Id = "then" {reject} Exp.Var = Id Why Separation of Lexical and Context-Free Syntax? !111 Homework: what would go wrong if we not do this? lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = "if" {reject} Id = "then" {reject} context-free syntax Exp.Var = Id
  • 112. !112 syntax "if" = [105] [102] "then" = [116] [104] [101] [110] [48-5765-909597-122]+-LEX = [48-5765-909597-122] [48-5765-909597-122]+-LEX = [48-5765-909597-122]+-LEX [48-5765-909597-122] [48-5765-909597-122]*-LEX = [48-5765-909597-122]*-LEX = [48-5765-909597-122]+-LEX Id-LEX = [65-9097-122] [48-5765-909597-122]*-LEX Id-LEX = "if" {reject} Id-LEX = "then" {reject} Id-CF = Id-LEX Exp-CF.Var = Id-CF Exp-CF.Call = Exp-CF LAYOUT?-CF Exp-CF {left} Exp-CF.IfThen = "if" LAYOUT?-CF Exp-CF LAYOUT?-CF "then" LAYOUT?-CF Exp-CF LAYOUT-CF = LAYOUT-CF LAYOUT-CF {left} LAYOUT?-CF = LAYOUT-CF LAYOUT?-CF = restrictions Id-LEX -/- [48-5765-909597-122] "if" -/- [48-5765-909597-122] "then" -/- [48-5765-909597-122] priorities Exp-CF.Call left Exp-CF.Call, LAYOUT-CF = LAYOUT-CF LAYOUT-CF left LAYOUT-CF = LAYOUT-CF LAYOUT-CF separate lexical and context-free syntax separate context- free symbols by optional layout character classes as only terminals lexical syntax Id = [a-zA-Z] [a-zA-Z0-9_]* Id = “if" {reject} Id = "then" {reject} lexical restrictions Id -/- [a-zA-Z0-9_] "if" "then" -/- [a-zA-Z0-9_] context-free syntax Exp.Var = Id Exp.Call = Exp Exp {left} Exp.IfThen = "if" Exp "then" Exp
  • 113. Testing Syntax Definitions with SPT 113 Are you defining the right language?
  • 114. Test Suites !114 module example-suite language Tiger start symbol Start test name [[...]] parse succeeds test another name [[...]] parse fails SPT: SPoofax Testing language
  • 115. Success is Failure, Failure is Success, … !115 module success-failure language Tiger test this test succeeds [[ 1 + 3 * 4 ]] parse succeeds test this test fails [[ 1 + 3 * 4 ]] parse fails test this test fails [[ 1 + 3 * ]] parse succeeds test this test succeeds [[ 1 + 3 * ]] parse fails
  • 116. Test Cases: Valid !116 Language valid program test valid program [[...]] parse succeeds
  • 117. Test Cases: Invalid !117 test invalid program [[...]] parse fails invalid program Language
  • 118. Test Cases !118 module syntax/identifiers language Tiger start symbol Id test single lower case [[x]] parse succeeds test single upper case [[X]] parse succeeds test single digit [[1]] parse fails test single lc digit [[x1]] parse succeeds test single digit lc [[1x]] parse fails test single uc digit [[X1]] parse succeeds test single digit uc [[1X]] parse fails test double digit [[11]] parse fails
  • 120. Testing Structure !120 module structure language Tiger test add times [[ 21 + 14 + 7 ]] parse to Mod(Plus(Int(“21"),Times(Int("14"),Int("7")))) test times add [[ 3 * 7 + 21 ]] parse to Mod(Plus(Times(Int(“3"),Int("7")),Int("21"))) test if [[ if x then 3 else 4 ]] parse to Mod(If(Var("x"),Int("3"),Int("4"))) The fragment did not parse to the expected ATerm. Parse result was: Mod(Plus(Plus(Int("21"),Int("14")),Int("7"))) Expected result was: Mod(Plus(Int(21), Times(Int(14), Int(7))))
  • 121. Testing Ambiguity !121 Note : this does not actually work in Tiger, since () is a sequencing construct; but works fine in Mini-Java module precedence language Tiger start symbol Exp test parentheses [[(42)]] parse to [[42]] test left-associative addition [[21 + 14 + 7]] parse to [[(21 + 14) + 7]] test precedence multiplication [[3 * 7 + 21]] parse to [[(3 * 7) + 21]]
  • 122. Testing Ambiguity !122 module precedence language Tiger start symbol Exp test plus/times priority [[ x + 4 * 5 ]] parse to Plus(Var("x"), Times(Int("4"), Int("5"))) test plus/times sequence [[ (x + 4) * 5 ]] parse to Times( Seq([Plus(Var("x"), Int("4"))]) , Int("5") )
  • 124.
  • 125. Developing syntax definition - Define syntax of language in multiple modules - Syntax checking, colouring - Checking for undefined non-terminals Testing syntax definition - Write example programs in editor for language under def - Inspect abstract syntax terms ‣ Spoofax > Syntax > Show Parsed AST - Write SPT test for success and failure cases ‣ Updated after build of syntax definition !125 Syntax Engineering in Spoofax
  • 127. Syntax definition - Define structure (decomposition) of programs - Define concrete syntax: notation - Define abstract syntax: constructors Using syntax definitions (next) - Parsing: converting text to abstract syntax term - Pretty-printing: convert abstract syntax term to text - Editor services: syntax highlighting, syntax checking, completion !127 Declarative Syntax Definition
  • 129. Except where otherwise noted, this work is licensed under