1. Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
2. JVM: A Platform for Multiple Languages
Krystal Mo
Member of Technical Staff
HotSpot JVM Compiler Team
3. The following is intended to outline our general product direction. It is intended
for information purposes only, and may not be incorporated into any contract.
It is not a commitment to deliver any material, code, or functionality, and should
not be relied upon in making purchasing decisions. The development, release,
and timing of any features or functionality described for Oracle’s products
remains at the sole discretion of Oracle.
4. Once there was a time…
Java
Source: http://www.tiobe.com
5. Oh wait…
Java™
Source: http://www.tiobe.com
6. But really, what we meant was…
Java™
Source: http://www.tiobe.com
7. Fortunately, clearer minds prevail
Language Implementations on JVM
Gosu, Jaskell, C, JudoScript, Fortress, X10, Jacl, BeanShell, Fantom, jgo, ABCL, myForth, jdart, Erjang, ANTLR, Nice
(and many more…)
8. Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
10. Why Make a Language At All? (1)
Syntax
– the “easy” part
– pick one that fits your eyes
Semantics and Capabilities
– static vs. dynamic
– sequential vs. parallel
– …one that fits the problem domain
11. Why Make a Language At All? (2)
Versus writing a library
Language can use alternative syntax
– whereas a library has to adhere to some host language
Language can impose more restrictions
– e.g. controlling capability
– whereas a library has no control over the host language’s capabilities
12. Why on JVM?
Mature low-level services
– Dynamic (“JIT”) compilation
– Garbage collection
– Threading
– Debugging Support
Cross-platform
Vast array of libraries
13. JVM Strengths
Compiler Optimizations
compiler tactics: delayed compilation, tiered compilation, on-stack replacement, delayed reoptimization, program dependence graph representation, static single assignment representation
proof-based techniques: exact type inference, memory value inference, memory value tracking, constant folding, reassociation, operator strength reduction, null check elimination, type test strength reduction, type test elimination, algebraic simplification, common subexpression elimination, integer range typing
flow-sensitive rewrites: conditional constant propagation, dominating test detection, flow-carried type narrowing, dead code elimination
language-specific techniques: class hierarchy analysis, devirtualization, symbolic constant propagation, autobox elimination, escape analysis, lock elision, lock fusion, de-reflection
speculative (profile-based) techniques: optimistic nullness assertions, optimistic type assertions, optimistic type strengthening, optimistic array length strengthening, untaken branch pruning, optimistic N-morphic inlining, branch frequency prediction, call frequency prediction
memory and placement transformation: expression hoisting, expression sinking, redundant store elimination, adjacent store fusion, card-mark elimination, merge-point splitting
loop transformations: loop unrolling, loop peeling, safepoint elimination, iteration range splitting, range check elimination, loop vectorization
global code shaping: inlining (graph integration), global code motion, heat-based code layout, switch balancing, throw inlining
control flow graph transformation: local code scheduling, local code bundling, delay slot filling, graph-coloring register allocation, linear scan register allocation, live range splitting, copy coalescing, constant splitting, copy removal, address mode matching, instruction peepholing, DFA-based code generator
14. Why in Java?
Instead of C/C++?
Robustness: Runtime exceptions not fatal
Reflection: Annotations instead of macros
Tooling: Java IDEs speed up the development process
etc.
15. Ease of Development
Excellent Tooling Support
Good IDEs
Good profilers
Good tooling for developing parsers and other language support (e.g. ANTLR)
16. Developing a Language on JVM
Syntax
• Mature parser libraries
• e.g. ANTLR, JavaCC
Semantics
• (your work goes here)
• Backed by various libraries
• e.g. ASM, dynalink
Low-level Details
• Backed by JVM
17. Case Study: Writing a Compiler in Java
Using Reflection
// Fields annotated with @Input / @Data are discovered by the framework via reflection
class CompareNode extends FloatingNode
        implements ValueNumberable, Canonicalizable {
    @Input ValueNode x;
    @Input ValueNode y;
    @Data Condition condition;

    public Node canonical(CanonicalizerTool t) {
        return this;
    }
}

// Iterate over all nodes of a given class
for (IfNode n : graph.getNodes(IfNode.class)) { ... }
18. Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
19. It can always be done
Even without direct native support from JVM
Java and JVM provide a rich set of primitives to build on
Almost any language feature can be implemented on JVM
– albeit not necessarily efficiently
20. “All problems in computer science can be solved by
another level of indirection.”
David Wheeler
21. “… except for the problem of too many layers of
indirection.”
Kevlin Henney
22. Case Study: A Bytecode Interpreter in Java
“Double interpretation”
Pipeline: Java Source Program → Bytecode → Bytecode Interpreter in Java → Host JVM (also an interpreter)
Java source:
    i = j + 1
Bytecode:
    iload_2
    iconst_1
    iadd
    istore_1
Bytecode interpreter in Java:
    while (true) {
        byte opcode = code[pc++];
        switch (opcode) {
        // ...
        case ILOAD_2:
            int i = locals[2];
            stack[sp++] = i;
            break;
        // ...
        }
    }
Host JVM (also an interpreter): …
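The interpreter fragment on this slide can be fleshed out into a small runnable sketch. The opcode values below are the real JVM ones, but the class name, the fixed-size operand stack, and the `run` helper are assumptions made for illustration; this is not the interpreter the talk refers to.

```java
// Illustrative mini interpreter: handles only the four opcodes from the slide, plus return.
final class MiniInterpreter {
    static final byte ICONST_1 = 0x04;
    static final byte ILOAD_2  = 0x1c;
    static final byte ISTORE_1 = 0x3c;
    static final byte IADD     = 0x60;
    static final byte RETURN   = (byte) 0xb1;

    /** Interprets code against the given local variable array, mutating it in place. */
    static void run(byte[] code, int[] locals) {
        int[] stack = new int[8];   // assumed maximum operand stack depth
        int sp = 0;
        int pc = 0;
        while (true) {
            byte opcode = code[pc++];
            switch (opcode) {
                case ICONST_1:
                    stack[sp++] = 1;
                    break;
                case ILOAD_2:
                    stack[sp++] = locals[2];
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case ISTORE_1:
                    locals[1] = stack[--sp];
                    break;
                case RETURN:
                    return;
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] locals = {0, 0, 41};   // slot 1 holds i, slot 2 holds j
        run(new byte[] {ILOAD_2, ICONST_1, IADD, ISTORE_1, RETURN}, locals);
        System.out.println(locals[1]);   // i = j + 1
    }
}
```

Every iteration of the `while` loop is itself executed by the host JVM, which is where the second level of interpretation comes from.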
23. Case Study: A JVM in Java
Ideally, redundant indirections are squeezed out
Pipeline: Java Source Program → Bytecode → Compiler in Java → Host Machine
Java source:
    i = j + 1
Bytecode:
    iload_2
    iconst_1
    iadd
    istore_1
Host machine code:
    lea eax, [edx+1]
24. Alternative Method Dispatching
e.g. prototype-based dispatch, metaclass, etc.
Emulate with reflection
– Custom lookup / binding
– Then java.lang.reflect.Method.invoke()
– Reflective invocation overhead
Security checking
Argument boxing / unboxing
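A minimal sketch of the reflective emulation described above, assuming a prototype-style object model. `PrototypeDispatch`, `ProtoObject`, `define`, `send`, and `twice` are hypothetical names; only `java.lang.reflect.Method.invoke()` comes from the slide.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical prototype-based object model built on plain reflection.
final class PrototypeDispatch {
    /** An object whose methods live in per-object slots, with a parent to delegate to. */
    static final class ProtoObject {
        private final ProtoObject prototype;   // null at the root of the chain
        private final Map<String, Method> slots = new HashMap<>();

        ProtoObject(ProtoObject prototype) {
            this.prototype = prototype;
        }

        void define(String name, Method m) {
            slots.put(name, m);
        }

        /** Custom lookup: walk the prototype chain until some slot matches. */
        Method lookup(String name) {
            for (ProtoObject o = this; o != null; o = o.prototype) {
                Method m = o.slots.get(name);
                if (m != null) return m;
            }
            throw new IllegalArgumentException("no such method: " + name);
        }

        /** Reflective call: arguments are boxed into Object[], and so is the result. */
        Object send(String name, Object... args) {
            try {
                return lookup(name).invoke(null, args);   // null receiver: slots hold static methods
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static int twice(int x) {
        return 2 * x;
    }

    static Object demo() {
        try {
            ProtoObject base = new ProtoObject(null);
            base.define("twice", PrototypeDispatch.class.getMethod("twice", int.class));
            ProtoObject child = new ProtoObject(base);   // finds "twice" via its prototype
            return child.send("twice", 21);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each `send` pays for a map lookup per link in the chain, an access check inside `invoke`, and boxing of both the `int` argument and the `int` result, which are the overheads the slide lists.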
25. Tail-calls
Often seen in functional languages
Emulate with trampoline loop
Special case:
– Direct tail-recursions can easily be transformed into loops
– e.g. Scala does this
26. Case Study: tail-call
Rewrite into trampoline
Original:
static int a() {
    return b();
}

static int b() {
    return c();
}

static int c() {
    return 42;
}
Rewritten:
static int trampolineLoop(Task t) {
    Context ctx = new Context();
    while (t != null) {
        t = t.invoke(ctx);
    }
    return ctx.value;
}

static Task a(Context ctx) {
    return new Task(#b);
}

static Task b(Context ctx) {
    return new Task(#c);
}

static Task c(Context ctx) {
    ctx.value = 42;
    return null;
}
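The `#b` notation on this slide is not valid Java. As a runnable approximation, the sketch below assumes Java 8 method references and lambdas as a stand-in, with `Task` modelled as a functional interface. `Trampoline` and `countDown` are illustrative additions, not part of the talk.

```java
// Trampoline sketch: each "tail call" returns the next task instead of calling it.
final class Trampoline {
    /** Carries the final result, mirroring the slide's Context. */
    static final class Context {
        int value;
    }

    /** One bounce: do some work, then return the next task, or null when finished. */
    interface Task {
        Task invoke(Context ctx);
    }

    static int trampolineLoop(Task t) {
        Context ctx = new Context();
        while (t != null) {
            t = t.invoke(ctx);   // the previous frame has already returned here
        }
        return ctx.value;
    }

    static Task a(Context ctx) { return Trampoline::b; }
    static Task b(Context ctx) { return Trampoline::c; }
    static Task c(Context ctx) { ctx.value = 42; return null; }

    /** A long chain of tail calls that would overflow the stack if made directly. */
    static Task countDown(int n) {
        return ctx -> {
            if (n == 0) {
                ctx.value = 42;
                return null;
            }
            return countDown(n - 1);
        };
    }

    public static void main(String[] args) {
        System.out.println(trampolineLoop(Trampoline::a));
        System.out.println(trampolineLoop(countDown(1_000_000)));
    }
}
```

The price of the constant stack depth is one heap-allocated task per bounce in `countDown`, which is the kind of emulation overhead this section is about.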
27. Case Study: tail-recursion
Rewrite into loop
Original:
static int fib(int n) {
    return fibInner(n, 0, 1);
}

static int fibInner(int n, int a, int b) {
    if (n < 2) return b;
    return fibInner(n - 1, b, a + b);
}
Rewritten:
static int fib(int n) {
    int a = 0, b = 1;
    while (n >= 2) {
        n = n - 1;
        int temp = a + b;
        a = b;
        b = temp;
    }
    return b;
}
28. Coroutines
Emulate with threads
– Can implement full (“stackful”) coroutine semantics
– Often use thread pooling as an optimization
– Waste (virtual) memory
– Could leak memory
– e.g. used by JRuby on stock JVMs
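As a rough illustration of the thread-based approach, here is a generator whose body runs on its own thread and hands values over through a `java.util.concurrent.SynchronousQueue`. All names are hypothetical, and this is only a sketch of the idea, not how JRuby implements it.

```java
import java.util.concurrent.SynchronousQueue;

// Thread-backed generator sketch: the producer thread plays the role of the coroutine.
final class ThreadCoroutine {
    static final class Naturals {
        private final SynchronousQueue<Integer> channel = new SynchronousQueue<>();

        Naturals() {
            Thread producer = new Thread(() -> {
                try {
                    int i = 1;
                    while (true) {
                        channel.put(i++);   // "yield": blocks until the consumer asks for a value
                    }
                } catch (InterruptedException e) {
                    // interrupted: let the coroutine thread finish
                }
            });
            producer.setDaemon(true);   // an abandoned generator must not keep the JVM alive
            producer.start();
        }

        /** Resumes the coroutine and returns the next yielded value. */
        int next() {
            try {
                return channel.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    static int sumOfFirst(int n) {
        Naturals naturals = new Naturals();
        int sum = 0;
        for (int k = 0; k < n; k++) {
            sum += naturals.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirst(4));   // 1 + 2 + 3 + 4
    }
}
```

Once `sumOfFirst` returns, the producer thread stays parked in `put` with its whole stack reserved, which illustrates the wasted memory and leak risk listed above.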
29. Coroutines
Emulate with Finite State Machines
– Compile-time transformation
– Can only implement “stackless coroutines”
Can only yield from the main method
– e.g. C# does this with its iterator
– e.g. there’s a coroutines library for Java that does the same
30. Case Study: C#’s iterator
Original source
static IEnumerable<int> GetNaturals() {
int i = 1;
while (true) {
yield return i++;
}
}
31. Case Study: C#’s iterator
Transformed into FSM (simplified from actual generated code)
static IEnumerable<int> GetNaturals() {
    return new NaturalsIterator(0);
}

sealed class NaturalsIterator :
        IEnumerable<int>, IEnumerable,
        IEnumerator<int>, IEnumerator,
        IDisposable {
    int _current, _state, _i;

    public NaturalsIterator(int state) {
        _state = state;
    }

    int IEnumerator<int>.Current {
        get { return _current; }
    }

    bool IEnumerator.MoveNext() {
        switch (_state) {
        case 0:
            _i = 1;
            break;
        case 1:
            break;
        default:
            return false;
        }
        _current = _i++;
        _state = 1;
        return true;
    }
}
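The same state-machine shape can be written by hand in Java. The sketch below is an assumed analogue of the generated C# class, using `java.util.Iterator`; it is not the output of any actual tool, and the class name is illustrative.

```java
import java.util.Iterator;

// Hand-written Java analogue of the C# state machine above.
final class NaturalsFsm implements Iterator<Integer> {
    private int state = 0;   // 0 = not started, 1 = suspended at the yield
    private int i;           // the local variable "i", hoisted into a field

    @Override
    public boolean hasNext() {
        return true;         // the original loop never terminates
    }

    @Override
    public Integer next() {
        switch (state) {
            case 0:          // first entry: run the code before the loop
                i = 1;
                break;
            case 1:          // re-entry: resume right after the yield
                break;
            default:
                throw new IllegalStateException("bad state " + state);
        }
        int current = i++;   // corresponds to "yield return i++"
        state = 1;
        return current;
    }

    public static void main(String[] args) {
        NaturalsFsm naturals = new NaturalsFsm();
        for (int k = 0; k < 3; k++) {
            System.out.println(naturals.next());
        }
    }
}
```

Because the local `i` and the resume point are stored in fields of one object, only this single method can be suspended, which is the "stackless" restriction mentioned on the previous slide.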
32. Reference Counting
Emulate with reified reference
– Boxing overhead
– You really don’t want to do this…
public class CountedReference<T extends AutoCloseable> {
    private int refCount = 1;
    private T target;   // not final: cleared on the last release

    public CountedReference(T target) {
        this.target = target;
    }

    // synchronized: a bare ++ on a volatile field is not atomic
    public synchronized T addRef() {
        refCount++;
        return target;
    }

    public synchronized void release() throws Exception {
        if (refCount >= 1) refCount--;
        if (refCount < 1 && target != null) {
            // Object.finalize() is protected and cannot be called here,
            // so the target exposes its cleanup through close() instead
            target.close();
            target = null;
        }
    }
}
33. Infinite Precision Integer
Emulate with java.math.BigInteger
– Boxing overhead
Performance impact
Heap bloat
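A small sketch of the trade-off, assuming factorial as the workload (the class and method names are illustrative). The `long` version silently wraps around once the result no longer fits in 64 bits, while the `BigInteger` version stays exact but allocates a fresh object for every intermediate result.

```java
import java.math.BigInteger;

// Fixed-width arithmetic versus arbitrary precision emulated with BigInteger.
final class BigFactorial {
    /** Fast, but wraps around silently once n! no longer fits in 64 bits (n >= 21). */
    static long factorialLong(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    /** Exact for any n, at the cost of one new heap object per multiplication. */
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorialLong(20));   // still fits in a long
        System.out.println(factorial(21));       // needs more than 64 bits
    }
}
```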
34. Closer to the Metal
Reducing redundant emulation
Fewer indirections
Better performance
35. The “J” in JVM
Geared towards Java semantics
Semantics match => good perf, easy to impl
Closer to Java => closer to the metal on JVM
36. The “J” in JVM
Should optimize for non-Java features, too
Languages on JVM shouldn’t be forced to be like Java to be performant
Improve the VM to accommodate non-Java language features
– In turn, benefits Java itself, e.g. lambdas
37. Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
38. Why dynamic languages?
Fast turnaround time for simple programs
– no compile step required
– direct interpretation possible
– loose binding to the environment
Data-driven programming
– program shape can change along with data shape
– radically open-ended code (plugins, aspects, closures)
39. Dynamic languages are here to stay
Source: http://www.tiobe.com
40. What slows down a JVM
Non-Java languages require special call sites.
– e.g.: Smalltalk message sending (no static types).
– e.g.: JavaScript or Ruby method call (different lookup rules).
In the past, special calls required simulation overheads
– ...such as reflection and/or extra levels of lookup and indirection
– ...which have inhibited JIT optimizations.
Result: Pain for non-Java developers.
Enter Java 7.
41. Key Features
New bytecode instruction: invokedynamic.
– Linked reflectively, under user control.
– User-visible object: java.lang.invoke.CallSite
– Dynamic call sites can be linked and relinked, dynamically.
New unit of behavior: method handle
– The content of a dynamic call site is a method handle.
– Method handles are function pointers for the JVM.
– (Or if you like, each MH implements a single-method interface.)
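To make the two concepts concrete, the sketch below uses the `java.lang.invoke` API directly from Java code: it looks up two method handles, installs one in a `MutableCallSite`, calls through the site, relinks it, and calls again. The class and the `hello`/`bye` methods are made up for illustration. A real language runtime would reach the call site through an `invokedynamic` instruction rather than through `dynamicInvoker()`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Method handles as "function pointers", and a call site that is relinked at run time.
final class CallSiteDemo {
    static String hello(String name) { return "Hello, " + name; }
    static String bye(String name)   { return "Bye, " + name; }

    static String[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);
            MethodHandle hello = lookup.findStatic(CallSiteDemo.class, "hello", type);
            MethodHandle bye = lookup.findStatic(CallSiteDemo.class, "bye", type);

            MutableCallSite site = new MutableCallSite(hello);   // initial linkage
            MethodHandle invoker = site.dynamicInvoker();        // always calls the current target

            String first = (String) invoker.invokeExact("JVM");
            site.setTarget(bye);                                 // relink the same call site
            String second = (String) invoker.invokeExact("JVM");
            return new String[] {first, second};
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```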
42. Dynamic Program Composition
Bytecodes: Bytecodes are created by Java compilers or dynamic runtimes
Dynamic Call Sites: A dynamic call site is created for each invokedynamic call bytecode
Method Handles: Each call site is bound to one or more method handles, which point back to bytecoded methods
JVM / JIT: The JVM seamlessly integrates execution, optimizing to native code as necessary
43. Passing the burden to the JVM
Non-Java languages require special call sites.
In the past, special calls required simulation overheads
Now, invokedynamic call sites are fully user-configurable
– ...and are fully optimizable by the JIT.
Result: Much simpler code for language implementors
– ...and new leverage for the JIT.
44. What’s in a method call? (before invokedynamic)
          | Source code         | Bytecode       | Linking            | Executing
Naming    | Identifiers         | Utf8 constants | JVM “dictionary”   | -
Selecting | Scopes              | Class names    | Loaded classes     | V-table lookup
Adapting  | Argument conversion | -              | C2I / I2C adapters | Receiver narrowing
Calling   | -                   | -              | -                  | Jump with arguments
45. What’s in a method call? (using invokedynamic)
          | Source code | Bytecode          | Linking               | Executing
Naming    | ∞           | ∞                 | ∞                     | ∞
Selecting | ∞           | Bootstrap methods | Bootstrap method call | ∞
Adapting  | ∞           | -                 | Method handles        | ∞
Calling   | -           | -                 | -                     | Jump with arguments
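The bootstrap step can be sketched in plain Java. A bootstrap method receives a `MethodHandles.Lookup`, the name, and the `MethodType` recorded at the `invokedynamic` instruction, and returns a `CallSite`. Java source cannot emit `invokedynamic` for an arbitrary call, so this sketch calls the bootstrap method by hand; a language compiler would instead generate the instruction with a bytecode library. `BootstrapDemo` and `plusOne` are hypothetical names.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// A bootstrap method decides, at link time, what a dynamic call site will call.
final class BootstrapDemo {
    /** Has the standard bootstrap shape: (Lookup, String, MethodType) to CallSite. */
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        // "Selecting" is entirely up to the language runtime. In this sketch the
        // name is simply resolved to a static method on this class.
        MethodHandle target = lookup.findStatic(BootstrapDemo.class, name, type);
        return new ConstantCallSite(target);   // linked once, never relinked
    }

    static int plusOne(int j) {
        return j + 1;
    }

    static int demo() {
        try {
            // Simulates what the JVM does the first time an invokedynamic instruction runs.
            CallSite site = bootstrap(
                    MethodHandles.lookup(),
                    "plusOne",
                    MethodType.methodType(int.class, int.class));
            return (int) site.dynamicInvoker().invokeExact(41);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

After linking, calls through the site need no further lookup, which is what leaves only "jump with arguments" in the Executing column.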
46. “Invokedynamic is the most important addition to Java in
years. It will change the face of the platform.”
Charles Nutter
JRuby Lead,
Red Hat
47. Agenda
Why make a language on JVM
Language features by emulation
What we did in JDK 7
Building the future
48. Loose ends in the Java 7 API
Method handle introspection (reflection)
Generalized proxies (more than single-method intfs)
Class hierarchy analysis (override notification)
Smaller issues:
– usability (MethodHandle.toString, polymorphic bindTo)
– sharp corners (MethodHandle.invokeWithArguments)
– repertoire (tryFinally, more fold/spread/collect options)
Integration with other APIs (java.lang.reflect)
49. Support for Lambda in OpenJDK8
More transforms for SAM types (as needed).
Faster bindTo operation to create bound MHs
– No JNI calls.
– Maybe multiple-value bindTo.
Faster inexact invoke (as needed).
50. Let’s continue building our “future VM”
http://hg.openjdk.java.net/mlvm/mlvm/hotspot/
Da Vinci Machine Project: an open source
incubator for JVM futures
Contains code fragments (patches).
Movement to OpenJDK requires:
– a standard (e.g., JSR 292)
– a feature release plan (7 vs. 8 vs. ...)
bsd-port for developer friendliness.
mlvm-dev@openjdk.java.net
51. Current Da Vinci Machine Patches
MLVM patches
meth: method handles implementation
indy: invokedynamic
coro: lightweight coroutines (Lukas Stadler)
inti: interface injection (Tobias Ivarsson)
tailc: hard tail call optimization (Arnold Schwaighofer)
tuple: integrating tuple types (Michael Barker)
hotswap: online general class schema updates (Thomas Wuerthinger)
anonk: anonymous classes; lightweight bytecode loading
52. Caveat: Change is hard and slow
(especially the “last 20%”)
Hacking code is relatively simple.
Removing bugs is harder.
Verifying is difficult (millions of users).
Integrating into a giant system is very hard.
– interpreter, multiple compilers
– managed heap (multiple GC algos.)
– debugging, monitoring, profiling machinery
– security interactions
Specifying is hard (the last 20%...).
Running process is time-consuming.
53. Further Reading
Multi-Language VM (MLVM) Project on OpenJDK
JVM Language Summit
JSR 292 Cookbook
54. References
VM Optimizations for Language Designers, John Pampuch, JVM Language Summit 2008
Method Handles and Beyond, Some basis vectors, John Rose, JVM Language Summit 2011
Editor's Notes
JavaFX is an example of a good language that got dumped.
Suppose register allocation is done: i is allocated in eax, j is allocated in edx.