Language extensions increase programmer productivity by providing concise, often domain-specific syntax, and support for static verification of correctness, security, and style constraints. Language extensions can often be realized through translation to the base language, supported by preprocessors and extensible compilers. However, various kinds of extensions require further adaptation of a base compiler's internal stages and components, for example to support separate compilation or to make use of low-level primitives of the platform (e.g., jump instructions or unbalanced synchronization). To allow for a more loosely coupled approach, we propose an open compiler model based on normalization steps from a high-level language to a subset of it, the core language. We developed such a compiler for a mixed Java and (core) bytecode language, and evaluate its effectiveness for composition mechanisms such as traits, as well as statement-level and expression-level language extensions.
Mixing Source and Bytecode: A Case for Compilation By Normalization (OOPSLA 2008)
1. Mixing Sour ce and Bytecode
A Case for Compilation by Normalization
OOPSLA'08, Nashville, Tennessee
Lennart Kats, TU Delft
Martin Bravenboer, UMass
Eelco Visser, TU Delft
October 21, 2008
Software Engineering Research Group
3. DSLs and Language Extensions
Domain-specific void browse() {
List<Book> books =
• Database queries
<| SELECT *
• Regular expressions FROM books
• XML processing WHERE price < 100.00
|>;
• Matrices
• Complex numbers …
• ...
for (int i = 0; i < books.size(); i++)
books.get(i).title =~ s/^The //;
}
4. DSLs and Language Extensions
Domain-specific General-purpose
• Database queries • AOP
• Regular expressions • Traits
• XML processing • Enums
• Matrices • Iterators
• Complex numbers • Case classes
• ... • Properties/accessors
• ...
5. Example: The Foreach Statement
Program B: Foo.java
Program A: Foo.mylang public class Foo {
void bar(BarList list) {
public class Foo { Iterator it = list.iterator();
void bar(BarList list) { while (it.hasNext()) {
foreach (Bar b in list) { Bar b = it.next();
b.print(); b.print();
} }
Code generator
} }
} }
javac
Program C: Foo.class
(bytecode)
6. The Preprocessor
Extended
language
+ loosely coupled
+ small, modular extension
Preprocessor
+ no need to know compiler
internals
Host language – no parser
– semantic analysis, errors?
INTERFACE
Compiler
Bytecode
7. The Front-end Extensible Compiler
Extended + front-end reuse
language
– tied to the front-end
Front-end Extension internals
Host language
INTERFACE
Compiler
Polyglot MetaBorg
Bytecode
ableJ
OJ
8. Traits: Composable Units of Behavior
[Schärli, Ducasse, Nierstrasz, Black 2003]
public trait TDrawing {
void draw() {
…
}
require Vertices getVertices(); public class Shape {
} void draw() { … }
Traits
Vertices getVertices() { … }
public class Shape with TDrawing { }
…
Vertices getVertices() { … }
}
9. Foo.java TBar.java
Traits
• Simple idea
Traits
• Most libraries are
distributed as compiled
code
→ Why not do this for traits? FooBar.java
javac
Not extensible
FooBar.class
11. Extended
language
Parsing
Name
Parsing
Frontend
The
analysis
Fully
Type
Parsing Extensible
analysis
Compiler?
Code
Parsing
generation
+ reuse everything
Backend
Optimize + access to back-end
–– tied to all compiler internals
Parsing
Emit – complexity
– performance trade-offs
Bytecode
12. The
White
Box
Model?
–– tied to all compiler internals
– complexity
– performance trade-offs
13. The Normalizing Compiler
Extended
language
+ simple interface
Extension
processor + access to back-end
++ not tied to
Source+bytecode compiler internals
INTERFACE
Normalizing compiler
Bytecode
core language
17. Typical Applications
Class-method / Statement / Expression level
Traits X
Partial/open classes X
Iterator generators X
Finite State Automata X
Expression-based X
extensions
21. Method-body Extensions
• Operators: x as T, x ?? y, ...
• Embedded (external) DSLs:
Database queries, XML, ...
• Closures No
r m
ali
ze
Lower-level
code
22. Stratego Normalization Rules
desugar-foreach:
|[ foreach (t x in e) stm ]|
|[ Iterator x_iterator = e.iterator();
while (x_iterator.hasNext()) {
t x = x_iterator.next();
stm;
}
]|
with x_iterator := <newname> “iterator”
24. Normalizing ??
“But placing statements in front of the
assimilated expressions is easy, isn’t it?”
Requires syntactic knowledge:
• for, while, return, throw, …
• other expressions
• other extensions
25. Normalizing ??
“But placing statements in front of the
assimilated expressions is easy, isn’t it?
… Right?”
Also requires semantic knowledge:
Iterator<String> it = ...;
for (String e = e0; it.hasNext(); e = it.next() ?? “default”) {
...
}
26. Normalizing ??
“But placing statements in front of the
assimilated expressions is easy, isn’t it?
… Right?”
Also requires semantic knowledge:
Iterator<String> it = ...;
Integer temp = ...;
for (String e = e0; it.hasNext(); e = temp) {
...
}
30. Normalizing '??' !! - with an EBlock
String property = {|
String temp = readProperty(“x”);
if (temp == null) temp = “default”;
| temp
|};
System.out.println(property);
31. desugar-coalescing:
|[ e1 ?? e2 ]|
|[{|
String x_temp = e1;
if (x_temp == null) x_temp = e2;
| x_temp
|}
]|
with x_temp := <newname> “temp”
'??' in a Normalization Rule
32. desugar-coalescing:
|[ e1 ?? e2 ]|
|[{|
String x_temp = e1;
if (x_temp == null) x_temp = e2;
| x_temp
|} Run-time:
Exception in thread "main"
]| java.lang.NullPointerException
…
with x_temp := <newname> “temp”
at Generated.getVertices(Generated.java:10)
Position Information In
Generated Code
33. desugar-coalescing:
|[ e1 ?? e2 ]|
|[ trace (e1 ?? e2 @ <File>:<Line>:<Column>) {{|
String x_temp = e1;
if (x_temp == null) x_temp = e2;
| x_temp
|}}
]|
with x_temp := <newname> “temp”
Run-time:
Maintaining Position Information With
Exception in thread "main"
java.lang.NullPointerException
… Source Tracing
at Shape.getVertices(Shape.java:10)
34. Reflection
Class-method / Statement / Expression level
Traits X
Partial/open classes X
Iterator generators X
Finite State Automata X
Expression-based X
extensions
Aspect weaving X X X
35. Concluding Remarks
• Extension effort must be proportional to the benefits
• Black Boxes vs. White Boxes
• Don't throw away your primitives
• Normalization rules
Mixing Source and Bytecode. A Case for Compilation by Normalization.
Lennart C. L. Kats, Martin Bravenboer, and Eelco Visser. In OOPSLA'08.
http://www.programtransformation.org/Stratego/TheDryadCompiler
http://www.strategoxt.org