Welcome to “Thnad’s Revenge,” a programming language implementation tale in three acts.
Not to be confused with...
http://en.wikipedia.org/wiki/Yars'_Revenge
...Yars’ Revenge, the awesome Atari video game from the ’80s.
Cucumber Recipes




                                                                            Ian Dees
                                                                 with Aslak Hellesøy
                                                                    and Matt Wynne




     pragprog/titles/JRUBY
     discount code: JRubyIanDees



Before we get to the talk, let me make a couple of quick announcements. First, we’re
updating the JRuby book this summer with a JRuby 1.7-ready PDF. To celebrate that, we’re
offering a discount code on the book during the conference. Second, I’m working on a new
book with the Cucumber folks, which has some JRuby/JVM stuff in it—if you’d like to be a
tech reviewer, please find me after this talk.
I. Meet Thnad
            II. Enter the Frenemy
            III. Thnad’s Revenge

(with apologies to Ira Glass) Act I, Meet Thnad, in which we encounter Thnad, a programming
language built with JRuby and designed not for programmer happiness, but for implementer
happiness. Act II, Enter the Frenemy, in which we meet a new Ruby runtime. Act III, Thnad's
Revenge, in which we port Thnad to run on the Rubinius runtime and encounter some
surprises along the way.
I. Meet Thnad



Thnad is a programming language I created last summer as an excuse to learn some fun
JRuby tools and see what it's like to write a compiler.
The name comes from a letter invented by Dr. Seuss in his book, “On Beyond Zebra.” Since
most of the real letters are already taken by programming languages, a fictional one seems
appropriate.
A Fictional Programming
                Language
                   Optimized for Implementer Happiness




Just as Ruby is optimized for programmer happiness, Thnad is optimized for implementer
happiness. It was designed to be implemented with a minimum of time and effort, and a
maximum amount of fun.
function factorial(n) {
                 if (eq(n, 1)) {
                   1
                 } else {
                   times(n, factorial(minus(n, 1)))
                 }
               }

               print(factorial(4))




Here’s a sample Thnad program demonstrating all the major features. Thnad has integers,
functions, conditionals, and... not much else. These minimal features were easy to add,
thanks to the great tools available in the JRuby ecosystem (and other ecosystems, as we’ll
see).
Thnad Features

          1. Names and Numbers
          2. Function Calls
          3. Conditionals
          4. Function Definitions




In the next few minutes, we’re going to trace through each of these four language features,
from parsing the source all the way to generating the final binary. We won’t show every
single grammar rule, but we will hit the high points.
As Tom mentioned in his talk, there are a number of phases a piece of source code goes
through during compilation.
Stages of Parsing

        tokenize
        parse
        transform
        emit




These break down into four main stages in a typical language: finding the tokens or parts of
speech of the text, parsing the tokens into an in-memory tree, transforming the tree, and
generating the bytecode. We’re going to look at each of Thnad’s major features in the
context of these stages.
1. Names and Numbers



First, let’s look at the easiest language feature: numbers and function parameters.
{:number => '42'}


                                                                   root


     '42'                                                       :number


                                                                   "42"




Our parser needs to transform this input text into some kind of Ruby data structure.
Parslet
                            kschiess.github.com/parslet




I used a library called Parslet for that. Parslet handles the first two stages of compilation
(tokenizing and parsing) using a Parsing Expression Grammar, or PEG. PEGs are like regular
expressions attached to blocks of code. They sound like a hack, but there’s solid compiler
theory behind them.
{:number => '42'}


                                                                  root


    '42'                                                       :number


                                                                  "42"




   rule(:number) {
     match('[0-9]').repeat(1).as(:number) >> space? }



The rule at the bottom of the page is Parslet’s notation for matching one or more digits
followed by an optional space.
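For a feel of what that rule does without installing Parslet, here is a rough standard-library sketch. The helper name parse_number is made up for this illustration, and it assumes space? means "optional whitespace"; Parslet itself works declaratively rather than with a hand-driven scanner.

```ruby
require 'strscan'

# Hypothetical helper (not part of Parslet or Thnad): mimics the shape of
# the :number rule using the standard library's StringScanner.
def parse_number(scanner)
  digits = scanner.scan(/[0-9]+/)   # like match('[0-9]').repeat(1)
  return nil unless digits          # the rule fails when no digit is present
  scanner.skip(/\s*/)               # like space?, assumed to be optional whitespace
  { :number => digits }             # like .as(:number), labels the captured text
end

scanner = StringScanner.new('42 ')
tree = parse_number(scanner)
p tree
```

The result has the same shape as the hash on the slide: the digits stay a string at this stage, and turning them into an integer is left to the transform step.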
{:number => '42'}                                  Thnad::Number.new(42)

                                                                  root
              root
                                                            Thnad::Number
           :number
                                                                 :value

              "42"                                                  42




     rule(:number => simple(:value)) {
       Number.new(value.to_i) }



Now for the third stage, transformation. We could generate the bytecode straight from the
original tree, using a bunch of hard-to-test case statements. But it would be nicer to have a
specific Ruby class for each Thnad language feature. The rule at the bottom of this slide tells
Parslet to transform a Hash with a key called :number into an instance of a Number class we
provide.
BiteScript
                               github/headius/bitescript




The final stage, outputting bytecode, is handled by the BiteScript library, which is basically a
domain-specific language for emitting JVM opcodes.
main do
         ldc 42
         ldc 1
         invokestatic :Example, :baz, [int, int, int]
         returnvoid
       end




Here's an example, just to get an idea of the flavor. To call a method, you just push the
arguments onto the stack and then call a specific opcode, in this case invokestatic. The VM
you're writing for is aware of classes, interfaces, and so on—you don't have to implement
method lookup like you would with plain machine code.
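If the push-then-invoke idea is new to you, a toy stack machine in plain Ruby may help. This is only an illustration: it is not the JVM and not BiteScript, and the METHODS table and the body of baz are invented for the example.

```ruby
# Toy stack machine, for illustration only.
# METHODS stands in for the static methods a class would define.
METHODS = {
  :baz => lambda { |a, b| a - b }   # hypothetical two-argument method
}

def run(instructions)
  stack = []
  instructions.each do |op, *operands|
    case op
    when :ldc
      stack.push(operands.first)                    # push a constant
    when :invokestatic
      name, arity = operands
      args = stack.pop(arity)                       # pop arguments, keeping call order
      stack.push(METHODS.fetch(name).call(*args))   # push the return value
    else
      raise ArgumentError, "unknown opcode #{op}"
    end
  end
  stack
end

result = run([[:ldc, 42], [:ldc, 1], [:invokestatic, :baz, 2]])
p result
```

After the call, the two arguments are gone from the stack and the single return value sits in their place, which is the same contract the real opcode follows.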
“JVM Bytecode for Dummies”
                           Charles Nutter, Øredev 2010
         slideshare/CharlesNutter/redev-2010-jvm-bytecode-for-dummies
When I first saw BiteScript, I thought it was something you'd only need if you were doing
deep JVM hacking. But when I read the slides from Charlie's presentation at Øredev, it
clicked. This library takes me way back to my college days, when we'd write assembler
programs for a really simple instruction set like MIPS. BiteScript evokes that same kind of
feeling. I'd always thought the JVM would have a huge, crufty instruction set—but it's actually
quite manageable to keep the most important parts of it in your head.
class Number < Struct.new :value
   def eval(context, builder)
     builder.ldc value
   end
 end




We can generate the bytecode any way we want. One simple way is to give each of our
classes an eval() method that takes a BiteScript generator and calls various methods on it to
generate JVM instructions.
class Name < Struct.new :name
   def eval(context, builder)
     param_names = context[:params] || []
     position    = param_names.index(name)
     raise "Unknown parameter #{name}" unless position

     builder.iload position
   end
 end




Dealing with passed-in parameters is nearly as easy as dealing with raw integers; we just
look up the parameter's position by name, and then push the parameter at that position onto the stack.
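One way to poke at these eval() methods without a JVM is to hand them a fake builder that records opcodes. RecordingBuilder and the Sketch module below are assumptions made for this example; the two node classes are copied from the slides.

```ruby
module Sketch
  # Hypothetical stand-in for a BiteScript method builder: it records each
  # opcode call instead of emitting real JVM bytecode.
  class RecordingBuilder
    attr_reader :instructions

    def initialize
      @instructions = []
    end

    def ldc(value)
      @instructions << [:ldc, value]
    end

    def iload(position)
      @instructions << [:iload, position]
    end
  end

  # Node classes from the slides, repeated so the sketch runs standalone.
  class Number < Struct.new(:value)
    def eval(context, builder)
      builder.ldc value
    end
  end

  class Name < Struct.new(:name)
    def eval(context, builder)
      param_names = context[:params] || []
      position    = param_names.index(name)
      raise "Unknown parameter #{name}" unless position

      builder.iload position
    end
  end
end

builder = Sketch::RecordingBuilder.new
context = { :params => ['a', 'b'] }
Sketch::Number.new(42).eval(context, builder)
Sketch::Name.new('b').eval(context, builder)
p builder.instructions
```

Because 'b' is the second entry in context[:params], the recorded instruction is an iload of position 1.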
2. Function Calls



The next major feature is function calls. Once we have those, we will be able to run a trivial
Thnad program.
{:funcall =>
                                 {:name => 'baz',
                                  :args => [
                                    {:arg => {:number => '42'}},
                                    {:arg => {:name   => 'foo'}}]}}


                                                       root

'baz(42, foo)'                                      :funcall


                                                 :name     :args


                                                 "baz"        :arg     :arg


                                                         :number       :name


                                                           "42"        "foo"



We’re going to move a little faster here, to leave time for Rubinius. Here, we want to
transform this source code into this Ruby data structure representing a function call.
Thnad::Funcall.new 'foo',
                         [Thnad::Number.new(42)]



                                     root


                              Thnad::Funcall


                              :name           :args


                            "foo"           Thnad::Number


                                               :value


                                                  42



Now, we want to transform generic Ruby data structures into purpose-built ones that we can
attach bytecode-emitting behavior to.
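As a rough picture of that step, here is a hand-rolled version in plain Ruby. It is an assumption-laden sketch: Parslet's own Transform class does this matching declaratively, and the Sketch.transform function below is invented for the example.

```ruby
module Sketch
  Number  = Struct.new(:value)
  Name    = Struct.new(:name)
  Funcall = Struct.new(:name, :args)

  # Hypothetical hand-written equivalent of the transform rules: walk the
  # generic hash tree and build a purpose-built node for each shape.
  def self.transform(tree)
    if tree.key?(:number)
      Number.new(tree[:number].to_i)              # '42' becomes the integer 42
    elsif tree.key?(:funcall)
      call = tree[:funcall]
      args = call[:args].map { |a| transform(a[:arg]) }
      Funcall.new(call[:name], args)
    elsif tree.key?(:name)
      Name.new(tree[:name])
    else
      raise ArgumentError, "unrecognized node: #{tree.inspect}"
    end
  end
end

tree = { :funcall => { :name => 'baz',
                       :args => [{ :arg => { :number => '42' } },
                                 { :arg => { :name   => 'foo' } }] } }
ast = Sketch.transform(tree)
p ast
```

The arguments are transformed recursively, so a call nested inside another call's argument list would come out as a Funcall inside a Funcall.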
class Funcall < Struct.new :name, :args
        def eval(context, builder)
          args.each { |a| a.eval(context, builder) }
          types = [builder.int] * (args.length + 1)

          builder.invokestatic(
            builder.class_builder, name, types)
        end
      end




The bytecode for a function call is really simple in BiteScript. All functions in Thnad are static
methods on a single class.
3. Conditionals



The first two features we’ve defined are enough to write simple programs like print(42). The
next two features will let us add conditionals and custom functions.
{:cond =>
                                      {:number => '0'},
                                      :if_true =>
                                       {:body => {:number => '42'}},
                                      :if_false =>
                                       {:body => {:number => '667'}}}
  'if (0) {
      42
                                                           root
   } else {
      667
                                            :cond        :if_true          :if_false
   }'
                                           :number         :body             :body


                                             "0"         :number            :number


                                                           "42"              "667"



A conditional consists of the “if” keyword, followed by a condition in parentheses and a body
of code inside braces, then the “else” keyword, followed by another body of code in braces.
Thnad::Conditional.new(
                          Thnad::Number.new(0),
                          Thnad::Number.new(42),
                          Thnad::Number.new(667))



                                            root


                                    Thnad::Conditional


                           :cond          :if_true         :if_false


                 Thnad::Number         Thnad::Number         Thnad::Number


                     :value                :value               :value


                       0                     42                   667




Here’s the transformed tree representing a set of custom Ruby classes.
class Conditional < Struct.new :cond, :if_true, :if_false
  def eval(context, builder)
    cond.eval context, builder
    builder.ifeq :else
    if_true.eval context, builder
    builder.goto :endif
    builder.label :else
    if_false.eval context, builder
    builder.label :endif
  end
end




The bytecode emitter for conditionals has a new twist. The Conditional struct points to three
other Thnad nodes. It needs to eval() them at the right time to emit their bytecode in
between all the zero checks and gotos.
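To see the order in which those pieces land, you can run the same emitter against a recorder instead of BiteScript. The RecordingBuilder here is a made-up stand-in; Number and Conditional are copied from the slides.

```ruby
module Sketch
  # Hypothetical recorder standing in for a BiteScript method builder.
  class RecordingBuilder
    attr_reader :instructions

    def initialize
      @instructions = []
    end

    # Each opcode takes one operand and is simply appended to the list.
    [:ldc, :ifeq, :goto, :label].each do |opcode|
      define_method(opcode) { |operand| @instructions << [opcode, operand] }
    end
  end

  class Number < Struct.new(:value)
    def eval(context, builder)
      builder.ldc value
    end
  end

  class Conditional < Struct.new(:cond, :if_true, :if_false)
    def eval(context, builder)
      cond.eval context, builder
      builder.ifeq :else
      if_true.eval context, builder
      builder.goto :endif
      builder.label :else
      if_false.eval context, builder
      builder.label :endif
    end
  end
end

node = Sketch::Conditional.new(Sketch::Number.new(0),
                               Sketch::Number.new(42),
                               Sketch::Number.new(667))
builder = Sketch::RecordingBuilder.new
node.eval({}, builder)
builder.instructions.each { |instruction| p instruction }
```

The condition's code comes first, the true branch sits between the ifeq and the goto, and the false branch sits between the two labels.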
4. Function Definitions



On to the final piece of Thnad: defining new functions.
{:func =>
                                               {:name => 'foo'},
                                                :params =>
                                                  {:param =>
                                                     {:name => 'x'}},
                                                :body =>
                                                  {:number => '5'}}

'function foo(x) {
                                                               root
    5
 }'
                                              :func         :params           :body


                                              :name          :param          :number


                                              "foo"           :name             "5"


                                                               "x"

A function definition looks a lot like a function call, but with a body attached to it.
Thnad::Function.new(
                          'foo',
                          [Thnad::Name.new('x')],
                          Thnad::Number.new(5))


                                    root


                             Thnad::Function


                    :name         :params             :body


                  "foo"        Thnad::Name            Thnad::Number


                                   :name                   :value


                                     "x"                      5



Here’s the transformation we want to perform for this language feature.
class Function < Struct.new :name, :params, :body
  def eval(context, builder)
    param_names = [params].flatten.map(&:name)
    context[:params] = param_names
    types = [builder.int] * (param_names.count + 1)

     builder.public_static_method(self.name, [], *types) do
       |method|

      self.body.eval(context, method)
      method.ireturn
    end
  end
end



Since all Thnad parameters and return types are integers, emitting a function definition is
really easy. We count the parameters so that we can give the JVM a correct signature. Then,
we just pass a block to the public_static_method helper, a feature of BiteScript that will
inspire the Rubinius work later on.
Compiler



We’ve seen how to generate individual chunks of bytecode; how do they all get stitched
together into a .class file?
builder = BiteScript::FileBuilder.build(@filename) do
  public_class classname, object do |klass|
     # ...

     klass.public_static_method 'main', [], void, string[] do
       |method|

         context = Hash.new
         exprs.each do |e|
           e.eval(context, method)
         end

      method.returnvoid
    end
  end
end


Here’s the core of class generation. We output a standard Java main() function...


...inside which we eval() our Thnad expressions (not counting function definitions) one by
one.
Built-ins
                  plus, minus, times, eq, print




Thnad ships with a few basic arithmetic operations, plus a print() function. Let’s look at one
of those now.
public_static_method 'minus', [], int, int, int do
     iload 0
     iload 1
     isub
     ireturn
   end




Here’s the definition of minus(). It just pushes its two arguments onto the stack and then
subtracts them. The rest of the built-ins are nearly identical to this one, so we won’t show
them here.
II. Enter the Frenemy



So that's a whirlwind tour of Thnad. Last year, I was telling someone about this project—it
was either Shane Becker or Brian Ford, I think—and he said,...
Rubinius



...“Hey, you should port this to Rubinius!” I thought, “Hmm, why not? Sounds fun.” Let’s
take a look at this other runtime that has sprung up as a rival for Thnad’s affections.
Ruby in Ruby

          • As much as performance allows
          • Initially 100%, now around half (?)
          • Core in C++ / LLVM
          • Tons in Ruby: primitives, parser, bytecode


The goal of Rubinius is to implement Ruby in Ruby as much as performance allows. Quite a
lot of functionality you’d think would need to be in C is actually in Ruby.
RubySpec, FFI
                             Brought to you by Rubinius
                                   (Thank you!)




We have Rubinius to thank for the executable Ruby specification that all Rubies are now
judged against, and for the excellent foreign-function interface that lets you call C code in a
way that’s compatible with at least four Rubies.
Looking Inside Your Code



Rubinius also has tons of mechanisms for looking inside your code, which was very helpful
when I needed to learn what bytecode I’d need to output to accomplish a particular task in
Thnad.
class Example
                                 def add(a, b)
                                   a + b
                                 end
                               end




For example, with this class,...
AST
 $ rbx compile -S example.rb
 [:script,
  [:class,
   :Example,
   nil,
   [:scope,
    [:block,
     [:defn,
      :add,
      [:args, :a, :b],
      [:scope,
        [:block,
         [:call,
           [:lvar, :a], :+, [:arglist, [:lvar, :b]]]]]]]]]]

...you can get a Lisp-like representation of the syntax tree,...
Bytecode
            $ rbx compile -B example.rb
            ...
            ================= :add =================
            Arguments:   2 required, 2 total
            Locals:      2: a, b
            Stack size: 4
            Lines to IP: 2: -1..-1, 3: 0..6

            0000: push_local                  0    # a
            0002: push_local                  1    # b
            0004: meta_send_op_plus           :+
            0006: ret
            ----------------------------------------

...or a dump of the actual bytecode for the Rubinius VM.
“Ruby Platform Throwdown”
                             Moderated by Dr Nic, 2011
                                 vimeo/26773441

For more on the similarities and differences between Rubinius and JRuby, see the throwdown
video moderated by Dr Nic.
III. Thnad’s Revenge



Now that we’ve gotten to know Rubinius a little...
Let’s port Thnad to Rubinius!



...let’s see what it would take to port Thnad to it.
photo: JSConf US




               Our Guide Through the Wilderness
                                        @brixen

Brian Ford was a huge help during this effort, answering tons of my “How do I...?” questions
in an awesome Socratic way (“Let’s take a look at the Generator class source code....”)
Same parser
             Same AST transformation
             Different bytecode
             (But similar bytecode ideas)



Because the Thnad syntax is unchanged, we can reuse the parser and syntax transformation.
All we need to change is the bytecode output. And even that’s not drastically different.
Thnad’s Four Features,
                 Revisited



Let’s go back through Thnad’s four features in the context of Rubinius.
1. Names and Numbers



First, function parameters and integers.
JVM                                             RBX

        # Numbers:                                       # Numbers:
        ldc 42                                           push 42

        # Names:                                         # Names:
        iload 0                                          push_local 0




See how similar the JVM and Rubinius bytecode is for these basic features?
class Number < Struct.new :value
  def eval(context, builder)
    builder.push value
  end
end




All we had to change was the name of the opcode both for numbers...
class Name < Struct.new :name
   def eval(context, builder)
     param_names = context[:params] || []
     position    = param_names.index(name)
     raise "Unknown parameter #{name}" unless position

     builder.push_local position
   end
 end




...and for parameter names.
2. Function Calls



Function calls were similarly easy.
JVM                                             RBX

                                                     push_const :Example
ldc 42                                               push 42
ldc 1                                                push 1
invokestatic #2; //Method                            send_stack #<CM>, 2
                 //add:(II)I




In Rubinius, there are no truly static methods. We are calling the method on a Ruby object—
namely, an entire Ruby class. So we have to push the name of that class onto the stack first.
The other big difference is that in Rubinius, we don’t just push the method name onto the
stack—we push a reference to the compiled code itself. Fortunately, there’s a helper method
to make this look more BiteScript-like.
class Funcall < Struct.new :name, :args
         def eval(context, builder)
           builder.push_const :Thnad
           args.each { |a| a.eval(context, builder) }
           builder.allow_private
           builder.send name.to_sym, args.length
         end
       end




Here’s how that difference affects the bytecode. Notice the allow_private() call? I’m not sure
exactly why we need this. It may be an “onion in the varnish,” a reference to a story by Primo
Levi in _The Periodic Table_.
flickr/black-and-white-prints/1366095561
flickr/ianfuller/76775606
In the story, the workers at a varnish factory wondered why the recipe called for an onion.
They couldn’t work out chemically why it would be needed, but it had always been one of the
ingredients. It turned out that it was just a crude old-school thermometer: when the onion
sizzled, the varnish was ready.
3. Conditionals



On to conditionals.
JVM                                            RBX
        0:     iconst_0                              37:    push 0
        1:     ifeq   9                              38:    push 0
        4:     bipush 42                             39:    send :==
        6:     goto   12                             41:    goto_if_false 47
        9:     sipush 667                            43:    push 42
        12:    ...                                   45:    goto 49
                                                     47:    push 667
                                                     49:    ...




Here, the JVM directly supports an “if equal to zero” opcode, whereas in Rubinius we have to
explicitly compare the item on the stack with zero.
class Conditional < Struct.new :cond, :if_true, :if_false
  def eval(context, builder)
    else_label = builder.new_label
    endif_label = builder.new_label

      cond.eval context, builder
      builder.push 0
      builder.send :==, 1

      builder.goto_if_true else_label

      if_true.eval context, builder
      builder.goto endif_label

    else_label.set!
    if_false.eval context, builder
    endif_label.set!
  end
end
Labels are a little different in Rubinius, too; here’s what the emitter for conditionals
looks like now.
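The label-as-object style can be tried out with a toy generator and interpreter. Everything below is an assumption for illustration: ToyGenerator is not the real Rubinius::Generator, send_op is a made-up name (chosen so the sketch does not override Ruby's own send), and the interpreter handles only the four opcodes used here.

```ruby
# Toy generator imitating the new_label / set! style from the slide.
class ToyGenerator
  # A label is an object whose position is filled in when set! is called,
  # so a jump can refer to it before its target is known.
  class Label
    attr_reader :position

    def initialize(generator)
      @generator = generator
    end

    def set!
      @position = @generator.instructions.length
    end
  end

  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def new_label
    Label.new(self)
  end

  def push(value)
    @instructions << [:push, value]
  end

  def send_op(name, argc)
    @instructions << [:send, name, argc]
  end

  def goto(label)
    @instructions << [:goto, label]
  end

  def goto_if_true(label)
    @instructions << [:goto_if_true, label]
  end
end

# Same emission order as the Conditional emitter, with plain integers as branches.
def emit_conditional(g, cond, if_true, if_false)
  else_label  = g.new_label
  endif_label = g.new_label

  g.push cond
  g.push 0
  g.send_op :==, 1
  g.goto_if_true else_label    # cond == 0 counts as false in Thnad

  g.push if_true
  g.goto endif_label

  else_label.set!
  g.push if_false
  endif_label.set!
end

# Minimal interpreter for the four toy opcodes.
def run(instructions)
  stack = []
  ip = 0
  while ip < instructions.length
    op, a, b = instructions[ip]
    ip += 1
    case op
    when :push then stack.push(a)
    when :send
      args = stack.pop(b)
      receiver = stack.pop
      stack.push(receiver.public_send(a, *args))
    when :goto then ip = a.position
    when :goto_if_true
      taken = stack.pop
      ip = a.position if taken
    else
      raise ArgumentError, "unknown opcode #{op}"
    end
  end
  stack
end

def evaluate_conditional(cond)
  g = ToyGenerator.new
  emit_conditional(g, cond, 42, 667)
  run(g.instructions).last
end

puts evaluate_conditional(0)   # prints 667
puts evaluate_conditional(1)   # prints 42
```

The jump instructions hold the label objects themselves, so a forward jump works even though the label's position is only filled in later by set!.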
4. Function Definitions



The trickiest part to implement was function definitions.
JVM                                            RBX
public int add(int, int);                          push_rubinius
  iload_1                                          push :add
  iload_2                                          push #<CM>
  iadd                                             push_scope
  ireturn                                          push_self
                                                   push_const :Thnad
                                                   send :attach_method, 4




Remember that in Ruby, there’s no compile-time representation of a class. So rather than
emitting a class definition, we emit code that creates a class at runtime.
class Function < Struct.new :name, :params, :body
     def eval(context, builder)
       param_names = [params].flatten.map(&:name)
       context[:params] = param_names

       # create a new Rubinius::Generator
       builder.begin_method name.to_sym, param_names.count
       self.body.eval(context, builder.current_method)
       builder.current_method.ret
       builder.end_method
     end
   end




The code to define a method in Rubinius requires spinning up a completely separate
bytecode generator. I stuck all this hairy logic in a set of helpers to make it more BiteScript-
like.
class Rubinius::Generator
       def end_method
         # ...

            cm = @inner.package Rubinius::CompiledMethod

         push_rubinius
         push_literal inner.name
         push_literal cm
         push_scope
         push_const :Thnad
         send :attach_method, 4
         pop
       end
     end


Here’s the most interesting part of those helpers. After the function definition is compiled,
we push it onto the stack and tell Rubinius to attach it to our class.
Compiler



How does the compiled code make its way into a .rbc file?
g = Rubinius::Generator.new

                     # ...

                     context = Hash.new
                     exprs.each do |e|
                       e.eval(context, g)
                     end

                     # ...




As with JRuby, we create a bytecode generation object, then evaluate all the Thnad
statements into it.
main = g.package Rubinius::CompiledMethod

          Rubinius::CompiledFile.dump(
            main, @outname, Rubinius::Signature, 18)




Finally, we tell Rubinius to marshal the compiled code to a .rbc file.
Runner (new!)



That means we now need a small script to unmarshal that compiled code and run it. This is
new; on the Java runtime, we already have a runner: the java binary.
#!/usr/bin/env rbx -rubygems

    (puts("Usage: #{$0} BINARY"); exit) if ARGV.empty?

    loader = Rubinius::CodeLoader.new(ARGV.first)
    method = loader.load_compiled_file(
      ARGV.first, Rubinius::Signature, 18)
    result = Rubinius.run_script(method)




Here’s the entirety of the code to load and run a compiled Rubinius file.
Built-ins



As we’ve just seen, defining a function in Rubinius takes a lot of steps, even with helper
functions to abstract away some of the hairiness.
g.begin_method :minus, 2
                      g.current_method.push_local 0
                      g.current_method.push_local 1
                      g.current_method.send :-, 1
                      g.current_method.ret
                      g.end_method




For example, here’s the built-in minus() function. I wanted to avoid writing a bunch of these.
function plus(a, b) {
                   minus(a, minus(0, b))
                 }




I realized that you could write plus() in Thnad instead, defining it in terms of minus.
function times(a, b) {
                   if (eq(b, 0)) {
                     0
                   } else {
                     plus(a, times(a, minus(b, 1)))
                   }
                 }




If you don’t care about bounds checking, you can also do times()...
function eq(a, b) {
                 if (minus(a, b)) {
                   0
                 } else {
                   1
                 }
               }




...and eq()!
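If you want to convince yourself that these definitions hang together, here is a Ruby transliteration. It is a sketch for checking the arithmetic only: thnad_if is a made-up helper that models Thnad's conditional (the else branch runs when the condition is zero), and minus is written directly in Ruby because it is the one built-in the others lean on.

```ruby
# Hypothetical helper modelling Thnad's if/else: zero means false.
# Branches are lambdas so that only the chosen one is evaluated.
def thnad_if(cond, if_true, if_false)
  cond.zero? ? if_false.call : if_true.call
end

def minus(a, b)
  a - b                     # the remaining arithmetic built-in
end

def plus(a, b)
  minus(a, minus(0, b))     # a - (0 - b) == a + b
end

def eq(a, b)
  thnad_if(minus(a, b), lambda { 0 }, lambda { 1 })
end

def times(a, b)
  thnad_if(eq(b, 0),
           lambda { 0 },
           lambda { plus(a, times(a, minus(b, 1))) })
end

def factorial(n)
  thnad_if(eq(n, 1),
           lambda { 1 },
           lambda { times(n, factorial(minus(n, 1))) })
end

puts factorial(4)   # prints 24
```

In this sketch, times never reaches its base case when b is negative, which is one reading of the earlier caveat about bounds checking.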
stdthnadlib?!?
                            We have a standard library!




That means we have a standard library! Doing the Rubinius implementation helped me
improve the JRuby version. I was able to go back and rip out most of the built-in functions
from that implementation.
Thnad Online
                      github/undees/thnad/tree/master
                      github/undees/thnad/tree/rbx




Here’s where you can download and play with either implementation.
This has been a fantastic conference. Thank you to our hosts...
Special Thanks

         Kaspar Schiess for Parslet
         Charles Nutter for BiteScript
         Ryan Davis and Aja Hammerly for Graph
         Brian Ford for guidance
         Our tireless conference organizers!



...and to the makers of JRuby, Rubinius, Parslet, BiteScript, and everything else that made this
project possible. Cheers!

 
Specialized Compiler for Hash Cracking
Specialized Compiler for Hash CrackingSpecialized Compiler for Hash Cracking
Specialized Compiler for Hash CrackingPositive Hack Days
 
name name2 n
name name2 nname name2 n
name name2 ncallroom
 

Similar to Thnad's Revenge (20)

What we can learn from Rebol?
What we can learn from Rebol?What we can learn from Rebol?
What we can learn from Rebol?
 
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...Processing massive amount of data with Map Reduce using Apache Hadoop  - Indi...
Processing massive amount of data with Map Reduce using Apache Hadoop - Indi...
 
What's new in Ruby 2.0
What's new in Ruby 2.0What's new in Ruby 2.0
What's new in Ruby 2.0
 
Ida python intro
Ida python introIda python intro
Ida python intro
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
 
Perl Basics with Examples
Perl Basics with ExamplesPerl Basics with Examples
Perl Basics with Examples
 
Python basic
Python basicPython basic
Python basic
 
Pydiomatic
PydiomaticPydiomatic
Pydiomatic
 
Python idiomatico
Python idiomaticoPython idiomatico
Python idiomatico
 
Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011
 
Ruby 1.9.3 Basic Introduction
Ruby 1.9.3 Basic IntroductionRuby 1.9.3 Basic Introduction
Ruby 1.9.3 Basic Introduction
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible research
 
Tackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in RTackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in R
 
Sugar Presentation - YULHackers March 2009
Sugar Presentation - YULHackers March 2009Sugar Presentation - YULHackers March 2009
Sugar Presentation - YULHackers March 2009
 
What lies beneath the beautiful code?
What lies beneath the beautiful code?What lies beneath the beautiful code?
What lies beneath the beautiful code?
 
Cross Compiling for Perl Hackers
Cross Compiling for Perl HackersCross Compiling for Perl Hackers
Cross Compiling for Perl Hackers
 
Specialized Compiler for Hash Cracking
Specialized Compiler for Hash CrackingSpecialized Compiler for Hash Cracking
Specialized Compiler for Hash Cracking
 
ppt7
ppt7ppt7
ppt7
 
ppt2
ppt2ppt2
ppt2
 
name name2 n
name name2 nname name2 n
name name2 n
 

More from Erin Dees

Logic Lessons That Last Generations
Logic Lessons That Last GenerationsLogic Lessons That Last Generations
Logic Lessons That Last GenerationsErin Dees
 
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookHow 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookErin Dees
 
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookHow 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookErin Dees
 
A jar-nORM-ous Task
A jar-nORM-ous TaskA jar-nORM-ous Task
A jar-nORM-ous TaskErin Dees
 
Cucumber meets iPhone
Cucumber meets iPhoneCucumber meets iPhone
Cucumber meets iPhoneErin Dees
 

More from Erin Dees (6)

Logic Lessons That Last Generations
Logic Lessons That Last GenerationsLogic Lessons That Last Generations
Logic Lessons That Last Generations
 
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookHow 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
 
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 bookHow 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
How 5 people with 4 day jobs in 3 time zones enjoyed 2 years writing 1 book
 
A jar-nORM-ous Task
A jar-nORM-ous TaskA jar-nORM-ous Task
A jar-nORM-ous Task
 
Cucumber meets iPhone
Cucumber meets iPhoneCucumber meets iPhone
Cucumber meets iPhone
 
Yes, But
Yes, ButYes, But
Yes, But
 

Recently uploaded

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Recently uploaded (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

Thnad's Revenge

  • 1. Welcome to “Thnad’s Revenge,” a programming language implementation tale in three acts.
  • 2. Not to be confused with...
  • 3. http://en.wikipedia.org/wiki/Yars'_Revenge ...Yars’ Revenge, the awesome Atari video game from the ’80s.
  • 4. Cucumber Recipes Ian Dees with Aslak Hellesøy and Matt Wynne pragprog/titles/JRUBY discount code: JRubyIanDees Before we get to the talk, let me make a couple of quick announcements. First, we’re updating the JRuby book this summer with a JRuby 1.7-ready PDF. To celebrate that, we’re offering a discount code on the book during the conference. Second, I’m working on a new book with the Cucumber folks, which has some JRuby/JVM stuff in it—if you’d like to be a tech reviewer, please find me after this talk.
  • 5. I. Meet Thnad II. Enter the Frenemy III. Thnad’s Revenge (with apologies to Ira Glass) Act I, Meet Thnad, in which we encounter Thnad, a programming language built with JRuby and designed not for programmer happiness, but for implementer happiness. Act II, Enter the Frenemy, in which we meet a new Ruby runtime. Act III, Thnad's Revenge, in which we port Thnad to run on the Rubinius runtime and encounter some surprises along the way.
  • 6. I. Meet Thnad Thnad is a programming language I created last summer as an excuse to learn some fun JRuby tools and see what it's like to write a compiler.
  • 7. The name comes from a letter invented by Dr. Seuss in his book, “On Beyond Zebra.” Since most of the real letters are already taken by programming languages, a fictional one seems appropriate.
  • 8. A Fictional Programming Language Optimized for Implementer Happiness Just as Ruby is optimized for programmer happiness, Thnad is optimized for implementer happiness. It was designed to be implemented with a minimum of time and effort, and a maximum amount of fun.
  • 9. function factorial(n) { if (eq(n, 1)) { 1 } else { times(n, factorial(minus(n, 1))) } } print(factorial(4)) Here’s a sample Thnad program demonstrating all the major features. Thnad has integers, functions, conditionals, and... not much else. These minimal features were easy to add, thanks to the great tools available in the JRuby ecosystem (and other ecosystems, as we’ll see).
  • 10. Thnad Features 1. Names and Numbers 2. Function Calls 3. Conditionals 4. Function Definitions In the next few minutes, we’re going to trace through each of these four language features, from parsing the source all the way to generating the final binary. We won’t show every single grammar rule, but we will hit the high points.
  • 11. As Tom mentioned in his talk, there are a number of phases a piece of source code goes through during compilation.
  • 12. Stages of Parsing tokenize parse transform emit These break down into four main stages in a typical language: finding the tokens or parts of speech of the text, parsing the tokens into an in-memory tree, transforming the tree, and generating the bytecode. We’re going to look at each of Thnad’s major features in the context of these stages.
  • 13. 1. Names and Numbers First, let’s look at the easiest language feature: numbers and function parameters.
  • 14. '42' → {:number => '42'} (tree: root → :number → "42") Our parser needs to transform this input text into some kind of Ruby data structure.
  • 15. Parslet kschiess.github.com/parslet I used a library called Parslet for that. Parslet handles the first two stages of compilation (tokenizing and parsing) using a Parsing Expression Grammar, or PEG. PEGs are like regular expressions attached to blocks of code. They sound like a hack, but there’s solid compiler theory behind them.
  • 16. '42' → {:number => '42'} (tree: root → :number → "42") rule(:number) { match('[0-9]').repeat(1).as(:number) >> space? } The rule at the bottom of the page is Parslet’s notation for matching one or more digits followed by an optional space.
  • 17. {:number => '42'} (tree: root → :number → "42") becomes Thnad::Number.new(42) (tree: root → Thnad::Number → :value → 42) rule(:number => simple(:value)) { Number.new(value.to_i) } Now for the third stage, transformation. We could generate the bytecode straight from the original tree, using a bunch of hard-to-test case statements. But it would be nicer to have a specific Ruby class for each Thnad language feature. The rule at the bottom of this slide tells Parslet to transform a Hash with a key called :number into an instance of a Number class we provide.
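To make the parse and transform stages concrete without installing Parslet, here is a minimal plain-Ruby sketch. The helpers `parse_number` and `transform_number` are hypothetical stand-ins for the two Parslet rules above, not Parslet's API; only the Hash shape and the `Thnad::Number` struct come from the slides.

```ruby
module Thnad
  # Same shape as the talk's Number node: a struct with a single field.
  Number = Struct.new(:value)
end

# Hypothetical stand-in for the Parslet :number rule: one or more digits,
# then optional trailing whitespace, captured under the :number key.
def parse_number(source)
  match = /\A([0-9]+)\s*\z/.match(source)
  raise ArgumentError, "not a number: #{source.inspect}" unless match
  { :number => match[1] }
end

# Hypothetical stand-in for the transform rule: a Hash whose only key is
# :number becomes a Thnad::Number holding an Integer. Anything else is
# returned untouched.
def transform_number(tree)
  return tree unless tree.is_a?(Hash) && tree.keys == [:number]
  Thnad::Number.new(tree[:number].to_i)
end

tree = parse_number('42 ')
node = transform_number(tree)
puts node.value
```

Keeping the two steps separate mirrors the talk's point: the parser only produces generic Hashes, and the transform is the single place that knows about Thnad's own classes.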
  • 18. BiteScript github/headius/bitescript The final stage, outputting bytecode, is handled by the BiteScript library, which is basically a domain-specific language for emitting JVM opcodes.
  • 19. main do ldc 42 ldc 1 invokestatic :Example, :baz, [int, int, int] returnvoid end Here's an example, just to get an idea of the flavor. To call a method, you just push the arguments onto the stack and then call a specific opcode, in this case invokestatic. The VM you're writing for is aware of classes, interfaces, and so on—you don't have to implement method lookup like you would with plain machine code.
  • 20. “JVM Bytecode for Dummies” Charles Nutter, Øredev 2010 slideshare/CharlesNutter/redev-2010-jvm-bytecode-for-dummies When I first saw BiteScript, I thought it was something you'd only need if you were doing deep JVM hacking. But when I read the slides from Charlie's presentation at Øredev, it clicked. This library takes me way back to my college days, when we'd write assembler programs for a really simple instruction set like MIPS. BiteScript evokes that same kind of feeling. I'd always thought the JVM would have a huge, crufty instruction set, but it's actually quite manageable to keep the most important parts of it in your head.
  • 21. class Number < Struct.new :value def eval(context, builder) builder.ldc value end end We can generate the bytecode any way we want. One simple way is to give each of our classes an eval() method that takes a BiteScript generator and calls various methods on it to generate JVM instructions.
  • 22. class Name < Struct.new :name def eval(context, builder) param_names = context[:params] || [] position = param_names.index(name) raise "Unknown parameter #{name}" unless position builder.iload position end end Dealing with passed-in parameters is nearly as easy as dealing with raw integers; we just look up the parameter's position by name, and then push the nth parameter onto the stack.
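The two node classes above can be exercised without a JVM. In the sketch below, `RecordingBuilder` and `emit_all` are hypothetical helpers invented for illustration: the builder is not BiteScript, it only records which opcode methods were called, in order.

```ruby
# Hypothetical stand-in for a BiteScript method builder. It records the
# instructions it is asked to emit instead of writing real bytecode.
class RecordingBuilder
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def ldc(value)
    @instructions << [:ldc, value]
  end

  def iload(position)
    @instructions << [:iload, position]
  end
end

module Thnad
  # An integer literal pushes itself as a constant.
  class Number < Struct.new(:value)
    def eval(context, builder)
      builder.ldc value
    end
  end

  # A name is resolved to its position in the enclosing parameter list.
  class Name < Struct.new(:name)
    def eval(context, builder)
      param_names = context[:params] || []
      position = param_names.index(name)
      raise "Unknown parameter #{name}" unless position
      builder.iload position
    end
  end
end

# Evaluate each node against a fresh recorder and return what was emitted.
def emit_all(nodes, params)
  builder = RecordingBuilder.new
  context = { :params => params }
  nodes.each { |node| node.eval(context, builder) }
  builder.instructions
end

p emit_all([Thnad::Number.new(42), Thnad::Name.new('y')], ['x', 'y'])
```

A recorder like this is also a cheap way to unit-test each node, which is the benefit the talk claims for one class per language feature.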
  • 23. 2. Function Calls The next major feature is function calls. Once we have those, we will be able to run a trivial Thnad program.
  • 24. 'baz(42, foo)' → {:funcall => {:name => 'baz', :args => [{:arg => {:number => '42'}}, {:arg => {:name => 'foo'}}]}} (tree: root → :funcall → :name "baz"; :args → :arg → :number "42", :arg → :name "foo") We’re going to move a little faster here, to leave time for Rubinius. Here, we want to transform this source code into this Ruby data structure representing a function call.
  • 25. Thnad::Funcall.new 'foo', [Thnad::Number.new(42)] (tree: root → Thnad::Funcall → :name "foo"; :args → Thnad::Number → :value 42) Now, we want to transform generic Ruby data structures into purpose-built ones that we can attach bytecode-emitting behavior to.
  • 26. class Funcall < Struct.new :name, :args def eval(context, builder) args.each { |a| a.eval(context, builder) } types = [builder.int] * (args.length + 1) builder.invokestatic builder.class_builder, name, types end end The bytecode for a function call is really simple in BiteScript. All functions in Thnad are static methods on a single class.
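As a rough illustration of the call sequence, the sketch below runs the Funcall node against a recorder. `RecordingBuilder`, its `:int` and `:Example` placeholders, and `emit_call` are assumptions made for this example; real BiteScript builders return JVM type objects and a class builder instead of symbols.

```ruby
# Hypothetical recorder standing in for a BiteScript method builder.
class RecordingBuilder
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  # Placeholder for the JVM int type.
  def int
    :int
  end

  # Placeholder for the single class that owns every Thnad function.
  def class_builder
    :Example
  end

  def ldc(value)
    @instructions << [:ldc, value]
  end

  def invokestatic(owner, name, types)
    @instructions << [:invokestatic, owner, name, types]
  end
end

module Thnad
  class Number < Struct.new(:value)
    def eval(context, builder)
      builder.ldc value
    end
  end

  class Funcall < Struct.new(:name, :args)
    def eval(context, builder)
      # Arguments go onto the stack first, left to right.
      args.each { |a| a.eval(context, builder) }
      # One int per argument, plus one more for the return type.
      types = [builder.int] * (args.length + 1)
      builder.invokestatic builder.class_builder, name, types
    end
  end
end

# Emit a call whose arguments are all integer literals.
def emit_call(name, *numbers)
  builder = RecordingBuilder.new
  args = numbers.map { |n| Thnad::Number.new(n) }
  Thnad::Funcall.new(name, args).eval({}, builder)
  builder.instructions
end

p emit_call('baz', 42, 1)
```

The recorded sequence has the same shape as the earlier BiteScript example: two constants pushed, then one invokestatic carrying a three-int signature.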
  • 27. 3. Conditionals The first two features we’ve defined are enough to write simple programs like print(42). The next two features will let us add conditionals and custom functions.
  • 28. 'if (0) { 42 } else { 667 }' → {:cond => {:number => '0'}, :if_true => {:body => {:number => '42'}}, :if_false => {:body => {:number => '667'}}} (tree: root → :cond → :number "0"; :if_true → :body → :number "42"; :if_false → :body → :number "667") A conditional consists of the “if” keyword, a condition in parentheses, a body of code inside braces, then the “else” keyword, followed by another body of code in braces.
  • 29. Thnad::Conditional.new Thnad::Number.new(0), Thnad::Number.new(42), Thnad::Number.new(667) (tree: root → Thnad::Conditional → :cond → Thnad::Number → :value 0; :if_true → Thnad::Number → :value 42; :if_false → Thnad::Number → :value 667) Here’s the transformed tree representing a set of custom Ruby classes.
  • 30. class Conditional < Struct.new :cond, :if_true, :if_false def eval(context, builder) cond.eval context, builder builder.ifeq :else if_true.eval context, builder builder.goto :endif builder.label :else if_false.eval context, builder builder.label :endif end end The bytecode emitter for conditionals has a new twist. The Conditional struct points to three other Thnad nodes. It needs to eval() them at the right time to emit their bytecode in between all the zero checks and gotos.
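To see why the branches and labels land where they do, the following sketch records the instructions from the Conditional node and then runs them on a toy stack machine. `RecordingBuilder` and `run_instructions` are hypothetical; the interpreter models only the four opcodes used here and is not the JVM.

```ruby
# Hypothetical recorder: every opcode call is stored as [opcode, operand].
class RecordingBuilder
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  [:ldc, :ifeq, :goto, :label].each do |opcode|
    define_method(opcode) { |operand| @instructions << [opcode, operand] }
  end
end

module Thnad
  class Number < Struct.new(:value)
    def eval(context, builder)
      builder.ldc value
    end
  end

  class Conditional < Struct.new(:cond, :if_true, :if_false)
    def eval(context, builder)
      cond.eval context, builder
      builder.ifeq :else        # zero means false: jump to the else branch
      if_true.eval context, builder
      builder.goto :endif       # skip over the else branch
      builder.label :else
      if_false.eval context, builder
      builder.label :endif
    end
  end
end

# Toy interpreter for the recorded instructions (an assumption of this
# sketch): ifeq pops a value and jumps when it is zero.
def run_instructions(instructions)
  labels = {}
  instructions.each_with_index { |(op, arg), i| labels[arg] = i if op == :label }
  stack = []
  pc = 0
  while pc < instructions.length
    op, arg = instructions[pc]
    pc += 1
    case op
    when :ldc
      stack.push(arg)
    when :ifeq
      pc = labels.fetch(arg) if stack.pop == 0
    when :goto
      pc = labels.fetch(arg)
    end
  end
  stack.last
end

# Build and run 'if (cond) { 42 } else { 667 }' for a literal condition.
def run_conditional(cond_value)
  builder = RecordingBuilder.new
  node = Thnad::Conditional.new(Thnad::Number.new(cond_value),
                                Thnad::Number.new(42),
                                Thnad::Number.new(667))
  node.eval({}, builder)
  run_instructions(builder.instructions)
end

puts run_conditional(0)
puts run_conditional(1)
```

One thing the sketch makes visible: the node always uses the label names :else and :endif, so this toy interpreter would mis-resolve jumps for a conditional nested inside another one.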
  • 31. 4. Function Definitions On to the final piece of Thnad: defining new functions.
  • 32. 'function foo(x) { 5 }' → {:func => {:name => 'foo'}, :params => {:param => {:name => 'x'}}, :body => {:number => '5'}} (tree: root → :func → :name "foo"; :params → :param → :name "x"; :body → :number "5") A function definition looks a lot like a function call, but with a body attached to it.
  • 33. Thnad::Function.new 'foo', [Thnad::Name.new('x')], Thnad::Number.new(5) (tree: root → Thnad::Function → :name "foo"; :params → Thnad::Name → :name "x"; :body → Thnad::Number → :value 5) Here’s the transformation we want to perform for this language feature.
  • 34. class Function < Struct.new :name, :params, :body def eval(context, builder) param_names = [params].flatten.map(&:name) context[:params] = param_names types = [builder.int] * (param_names.count + 1) builder.public_static_method(self.name, [], *types) do |method| self.body.eval(context, method) method.ireturn end end end Since all Thnad parameters and return types are integers, emitting a function definition is really easy. We count the parameters so that we can give the JVM a correct signature. Then, we just pass a block to the public_static_method helper, a feature of BiteScript that will inspire the Rubinius work later on.
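Here is a small sketch of how the signature and body of a function definition come together. `ClassRecorder`, `MethodRecorder`, and `compile_identity` are hypothetical stand-ins; only the `public_static_method(name, exceptions, *types)` call shape is taken from the slide, and the recorder's behavior is an assumption.

```ruby
# Hypothetical recorder for a method body.
class MethodRecorder
  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def iload(position)
    @instructions << [:iload, position]
  end

  def ireturn
    @instructions << [:ireturn]
  end
end

# Hypothetical recorder for the enclosing class.
class ClassRecorder
  attr_reader :defined_methods

  def initialize
    @defined_methods = {}
  end

  # Placeholder for the JVM int type.
  def int
    :int
  end

  # Mirrors the call shape on the slide: name, exceptions, then the types.
  def public_static_method(name, exceptions, *types)
    method = MethodRecorder.new
    yield method
    @defined_methods[name] = { :signature => types, :body => method.instructions }
  end
end

module Thnad
  class Name < Struct.new(:name)
    def eval(context, builder)
      position = (context[:params] || []).index(name)
      raise "Unknown parameter #{name}" unless position
      builder.iload position
    end
  end

  class Function < Struct.new(:name, :params, :body)
    def eval(context, builder)
      param_names = [params].flatten.map(&:name)
      context[:params] = param_names
      # One int per parameter, plus one for the return value.
      types = [builder.int] * (param_names.count + 1)
      builder.public_static_method(self.name, [], *types) do |method|
        self.body.eval(context, method)
        method.ireturn
      end
    end
  end
end

# Compile 'function identity(x) { x }' and return what was recorded.
def compile_identity
  recorder = ClassRecorder.new
  x = Thnad::Name.new('x')
  Thnad::Function.new('identity', [x], x).eval({}, recorder)
  recorder.defined_methods
end

p compile_identity
```

Storing the parameter names in the shared context is what lets the Name node in the body find its slot, at the cost of one function's names staying in the context until the next definition overwrites them.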
  • 35. Compiler We’ve seen how to generate individual chunks of bytecode; how do they all get stitched together into a .class file?
  • 36. builder = BiteScript::FileBuilder.build(@filename) do public_class classname, object do |klass| # ... klass.public_static_method 'main', [], void, string[] do |method| context = Hash.new exprs.each do |e| e.eval(context, method) end method.returnvoid end end end Here’s the core of class generation. We output a standard Java main() function...
  • 37. builder = BiteScript::FileBuilder.build(@filename) do public_class classname, object do |klass| # ... klass.public_static_method 'main', [], void, string[] do |method| context = Hash.new exprs.each do |e| e.eval(context, method) end method.returnvoid end end end ...inside which we eval() our Thnad expressions (not counting function definitions) one by one.
  • 38. Built-ins plus, minus, times, eq, print Thnad ships with a few basic arithmetic operations, plus a print() function. Let’s look at one of those now.
  • 39. public_static_method 'minus', [], int, int, int do iload 0 iload 1 isub ireturn end Here’s the definition of minus(). It just pushes its two arguments onto the stack and then subtracts them. The rest of the built-ins are nearly identical to this one, so we won’t show them here.
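The operand order in minus() is easy to check with a toy stack machine. `MINUS` and `run_method` below are hypothetical names for this sketch, which models only the three opcodes on the slide and assumes the usual stack convention that the value pushed last is the right-hand operand.

```ruby
# The body of minus() from the slide, written as plain data.
MINUS = [[:iload, 0], [:iload, 1], [:isub], [:ireturn]].freeze

# Toy interpreter: args plays the role of the method's local variable slots.
def run_method(instructions, args)
  stack = []
  instructions.each do |op, operand|
    case op
    when :iload
      stack.push(args.fetch(operand))
    when :isub
      right = stack.pop   # pushed last
      left = stack.pop    # pushed first
      stack.push(left - right)
    when :ireturn
      return stack.pop
    else
      raise ArgumentError, "unsupported opcode: #{op}"
    end
  end
  nil
end

puts run_method(MINUS, [5, 3])
```

Loading parameter 0 before parameter 1 is what makes minus(5, 3) compute 5 - 3 rather than 3 - 5.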
  • 40. II. Enter the Frenemy So that's a whirlwind tour of Thnad. Last year, I was telling someone about this project—it was either Shane Becker or Brian Ford, I think—and he said,...
  • 41. Rubinius ...“Hey, you should port this to Rubinius!” I thought, “Hmm, why not? Sounds fun.” Let’s take a look at this other runtime that has sprung up as a rival for Thnad’s affections.
  • 42. Ruby in Ruby • As much as performance allows • Initially 100%, now around half (?) • Core in C++ / LLVM • Tons in Ruby: primitives, parser, bytecode The goal of Rubinius is to implement Ruby in Ruby as much as performance allows. Quite a lot of functionality you’d think would need to be in C is actually in Ruby.
  • 43. RubySpec, FFI Brought to you by Rubinius (Thank you!) We have Rubinius to thank for the executable Ruby specification that all Rubies are now judged against, and for the excellent foreign-function interface that lets you call C code in a way that’s compatible with at least four Rubies.
  • 44. Looking Inside Your Code Rubinius also has tons of mechanisms for looking inside your code, which was very helpful when I needed to learn what bytecode I’d need to output to accomplish a particular task in Thnad.
  • 45. class Example def add(a, b) a + b end end For example, with this class,...
  • 46. AST $ rbx compile -S example.rb [:script, [:class, :Example, nil, [:scope, [:block, [:defn, :add, [:args, :a, :b], [:scope, [:block, [:call, [:lvar, :a], :+, [:arglist, [:lvar, :b]]]]]]]]]] ...you can get a Lisp-like representation of the syntax tree,...
  • 47. Bytecode $ rbx compile -B example.rb ... ================= :add ================= Arguments: 2 required, 2 total Locals: 2: a, b Stack size: 4 Lines to IP: 2: -1..-1, 3: 0..6 0000: push_local 0 # a 0002: push_local 1 # b 0004: meta_send_op_plus :+ 0006: ret ---------------------------------------- ...or a dump of the actual bytecode for the Rubinius VM.
  • 48. “Ruby Platform Throwdown” Moderated by Dr Nic, 2011 vimeo/26773441 For more on the similarities and differences between Rubinius and JRuby, see the throwdown video moderated by Dr Nic.
  • 49. III: Thnad’s Revenge Now that we’ve gotten to know Rubinius a little...
  • 50. Let’s port Thnad to Rubinius! ...let’s see what it would take to port Thnad to it.
  • 51. photo: JSConf US Our Guide Through the Wilderness @brixen Brian Ford was a huge help during this effort, answering tons of my “How do I...?” questions in an awesome Socratic way (“Let’s take a look at the Generator class source code....”)
  • 52. Same parser Same AST transformation Different bytecode (But similar bytecode ideas) Because the Thnad syntax is unchanged, we can reuse the parser and syntax transformation. All we need to change is the bytecode output. And even that’s not drastically different.
  • 53. Thnad’s Four Features, Revisited Let’s go back through Thnad’s four features in the context of Rubinius.
  • 54. 1. Names and Numbers First, function parameters and integers.
  • 55. JVM: ldc 42 (numbers), iload 0 (names). RBX: push 42 (numbers), push_local 0 (names). See how similar the JVM and Rubinius bytecode is for these basic features?
  • 56. class Number < Struct.new :value def eval(context, builder) builder.push value end end All we had to change was the name of the opcode both for numbers...
  • 57. class Name < Struct.new :name def eval(context, builder) param_names = context[:params] || [] position = param_names.index(name) raise "Unknown parameter #{name}" unless position builder.push_local position end end ...and for parameter names.
  • 58. 2. Function Calls Function calls were similarly easy.
  • 59. JVM: ldc 42 | ldc 1 | invokestatic #2; //Method add:(II)I. RBX: push_const :Example | push 42 | push 1 | send_stack #<CM>, 2. In Rubinius, there are no truly static methods. We are calling the method on a Ruby object, namely an entire Ruby class. So we have to push the name of that class onto the stack first. The other big difference is that in Rubinius, we don’t just push the method name onto the stack; we push a reference to the compiled code itself. Fortunately, there’s a helper method to make this look more BiteScript-like.
  • 60. class Funcall < Struct.new :name, :args def eval(context, builder) builder.push_const :Thnad args.each { |a| a.eval(context, builder) } builder.allow_private builder.send name.to_sym, args.length end end Here’s how that difference affects the bytecode. Notice the allow_private() call? I’m not sure exactly why we need this. It may be an “onion in the varnish,” a reference to a story by Primo Levi in _The Periodic Table_.
  • 61. flickr/black-and-white-prints/1366095561 flickr/ianfuller/76775606 In the story, the workers at a varnish factory wondered why the recipe called for an onion. They couldn’t work out chemically why it would be needed, but it had always been one of the ingredients. It turned out that it was just a crude old-school thermometer: when the onion sizzled, the varnish was ready.
  • 62. 3. Conditionals On to conditionals.
  • 63. JVM: 0: iconst_0; 1: ifeq 9; 4: bipush 42; 6: goto 12; 9: sipush 667; 12: ... RBX: 37: push 0; 38: push 0; 39: send :==; 41: goto_if_false 47; 43: push 42; 45: goto 49; 47: push 667; 49: ... Here, the JVM directly supports an “if equal to zero” opcode, whereas in Rubinius we have to explicitly compare the item on the stack with zero.
  • 64. class Conditional < Struct.new :cond, :if_true, :if_false def eval(context, builder) else_label = builder.new_label endif_label = builder.new_label cond.eval context, builder builder.push 0 builder.send :==, 1 builder.goto_if_true else_label if_true.eval context, builder builder.goto endif_label else_label.set! if_false.eval context, builder endif_label.set! end end Labels are also a little different in Rubinius; here’s what the code that emits the bytecode for conditionals looks like now.
  • 65. 4. Function Definitions The trickiest part to implement was function definitions.
  • 66. JVM: public int add(int, int); iload_1; iload_2; iadd; ireturn. RBX: push_rubinius; push :add; push #<CM>; push_scope; push_self; push_const :Thnad; send :attach_method, 4. Remember that in Ruby, there’s no compile-time representation of a class. So rather than emitting a class definition, we emit code that creates a class at runtime.
  • 67. class Function < Struct.new :name, :params, :body def eval(context, builder) param_names = [params].flatten.map(&:name) context[:params] = param_names # create a new Rubinius::Generator builder.begin_method name.to_sym, params.count self.body.eval(context, builder.current_method) builder.current_method.ret builder.end_method end end The code to define a method in Rubinius requires spinning up a completely separate bytecode generator. I stuck all this hairy logic in a set of helpers to make it more BiteScript-like.
  • 68. class Rubinius::Generator def end_method # ... cm = @inner.package Rubinius::CompiledMethod push_rubinius push_literal inner.name push_literal cm push_scope push_const :Thnad send :attach_method, 4 pop end end Here’s the most interesting part of those helpers. After the function definition is compiled, we push it onto the stack and tell Rubinius to attach it to our class.
  • 69. Compiler How does the compiled code make its way into a .rbc file?
  • 70. g = Rubinius::Generator.new # ... context = Hash.new exprs.each do |e| e.eval(context, g) end # ... As with JRuby, we create a bytecode generation object, then evaluate all the Thnad statements into it.
  • 71. main = g.package Rubinius::CompiledMethod Rubinius::CompiledFile.dump main, @outname, Rubinius::Signature, 18 Finally, we tell Rubinius to marshal the compiled code to a .rbc file.
  • 72. Runner (new!) That means we now need a small script to unmarshal that compiled code and run it. This is new; on the Java runtime, we already have a runner: the java binary.
  • 73. #!/usr/bin/env rbx -rubygems (puts("Usage: #{} BINARY"); exit) if ARGV.empty? loader = Rubinius::CodeLoader.new(ARGV.first) method = loader.load_compiled_file( ARGV.first, Rubinius::Signature, 18) result = Rubinius.run_script(method) Here’s the entirety of the code to load and run a compiled Rubinius file.
  • 74. Built-ins As we’ve just seen, defining a function in Rubinius takes a lot of steps, even with helper functions to abstract away some of the hairiness.
  • 75. g.begin_method :minus, 2 g.current_method.push_local 0 g.current_method.push_local 1 g.current_method.send :-, 1 g.current_method.ret g.end_method For example, here’s the built-in minus() function. I wanted to avoid writing a bunch of these.
  • 76. function plus(a, b) { minus(a, minus(0, b)) } I realized that you could write plus() in Thnad instead, defining it in terms of minus.
  • 77. function times(a, b) { if (eq(b, 0)) { 0 } else { plus(a, times(a, minus(b, 1))) } } If you don’t care about bounds checking, you can also do times()...
  • 78. function eq(a, b) { if (minus(a, b)) { 0 } else { 1 } } ...and eq()!
  • 79. stdthnadlib?!? We have a standard library! That means we have a standard library! Doing the Rubinius implementation helped me improve the JRuby version. I was able to go back and rip out most of the built-in functions from that implementation.
  • 80. Thnad Online github/undees/thnad/tree/master github/undees/thnad/tree/rbx Here’s where you can download and play with either implementation.
  • 81. This has been a fantastic conference. Thank you to our hosts...
  • 82. Special Thanks Kaspar Schiess for Parslet Charles Nutter for BiteScript Ryan Davis and Aja Hammerly for Graph Brian Ford for guidance Our tireless conference organizers! ...and to the makers of JRuby, Rubinius, Parslet, BiteScript, and everything else that made this project possible. Cheers!
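
The conditional code generation on slides 63 and 64 can be tried out without Rubinius. The sketch below is a toy stack machine in plain Ruby: `ToyBuilder`, its `Label` class, and `run` are hypothetical stand-ins invented for illustration, not the `Rubinius::Generator` API. Only the `Number` and `Conditional` node classes follow the slides.

```ruby
# Toy stand-in for a bytecode generator. This is NOT Rubinius::Generator;
# it only records instructions and interprets them afterwards.
class ToyBuilder
  # A label remembers the instruction index it was bound to by set!.
  class Label
    attr_reader :position

    def initialize(builder)
      @builder = builder
    end

    def set!
      @position = @builder.instructions.length
    end
  end

  attr_reader :instructions

  def initialize
    @instructions = []
  end

  def new_label
    Label.new(self)
  end

  def push(value)
    @instructions << [:push, value]
  end

  # Mirrors the builder.send call on the slides; records a method call.
  def send(name, argc)
    @instructions << [:send, name, argc]
  end

  def goto(label)
    @instructions << [:goto, label]
  end

  def goto_if_true(label)
    @instructions << [:goto_if_true, label]
  end

  # Interpret the recorded instructions and return the top of the stack.
  def run
    stack = []
    pc = 0
    while pc < @instructions.length
      op, a, b = @instructions[pc]
      pc += 1
      case op
      when :push
        stack << a
      when :send
        args = stack.pop(b)
        receiver = stack.pop
        stack << receiver.public_send(a, *args)
      when :goto
        pc = a.position
      when :goto_if_true
        pc = a.position if stack.pop
      end
    end
    stack.last
  end
end

# AST nodes as shown on slides 56 and 64.
class Number < Struct.new(:value)
  def eval(context, builder)
    builder.push value
  end
end

class Conditional < Struct.new(:cond, :if_true, :if_false)
  def eval(context, builder)
    else_label  = builder.new_label
    endif_label = builder.new_label
    cond.eval context, builder
    builder.push 0
    builder.send :==, 1               # compare the condition with zero
    builder.goto_if_true else_label   # zero means "false" in Thnad
    if_true.eval context, builder
    builder.goto endif_label
    else_label.set!
    if_false.eval context, builder
    endif_label.set!
  end
end

# Hypothetical helper: compile "if (cond) { 42 } else { 667 }" and run it.
def run_conditional(cond_value)
  builder = ToyBuilder.new
  node = Conditional.new(Number.new(cond_value), Number.new(42), Number.new(667))
  node.eval({}, builder)
  builder.run
end
```

Calling `run_conditional(1)` takes the true branch and `run_conditional(0)` takes the else branch, which matches the JVM listing where `ifeq` jumps to the `sipush 667` instruction when the condition is zero.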

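The standard-library definitions on slides 76 to 78 can be sanity-checked by transcribing them into plain Ruby. This is a sketch rather than Thnad itself: `thnad_true?` is a hypothetical helper that models the rule from the conditional on slide 64, where zero is false and any other integer is true.

```ruby
# The one arithmetic built-in, as on slide 75.
def minus(a, b)
  a - b
end

# Hypothetical helper: Thnad conditionals treat 0 as false,
# and any other integer as true.
def thnad_true?(value)
  value != 0
end

# function plus(a, b) { minus(a, minus(0, b)) }
def plus(a, b)
  minus(a, minus(0, b))
end

# function eq(a, b) { if (minus(a, b)) { 0 } else { 1 } }
def eq(a, b)
  thnad_true?(minus(a, b)) ? 0 : 1
end

# function times(a, b) { if (eq(b, 0)) { 0 } else { plus(a, times(a, minus(b, 1))) } }
# No bounds checking: a negative b never reaches zero and recurses until the stack overflows.
def times(a, b)
  if thnad_true?(eq(b, 0))
    0
  else
    plus(a, times(a, minus(b, 1)))
  end
end
```

With only `minus` supplied by the runtime, `plus`, `eq`, and `times` all fall out of subtraction and the conditional, which is why most of the other built-ins could be removed.
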
Editor's Notes

  1. Welcome to Thnad's Revenge, a programming language implementation tale in three acts.
  2. (with apologies to Ira Glass) Act I, Meet Thnad, in which we encounter Thnad, a programming language built with JRuby and designed not for programmer happiness, but for implementer happiness. Act II, Enter the Challenger: Rubinius, in which we meet a new Ruby runtime. Act III, Thnad's Revenge, in which we port Thnad to run on the Rubinius runtime and encounter some surprises along the way.
  3. Thnad is a programming language I created last summer as an excuse to learn some fun JRuby tools and see what it's like to write a compiler.
  7. We're going to look at a couple of those tools today. Starting at the low level of generating code, we have the BiteScript library, a DSL for generating Java bytecode.
  9. Here's an example, just to get an idea of the flavor. To call a method, you just push the arguments onto the stack and then call a specific opcode, in this case invokevirtual. The VM you're writing for is aware of classes, interfaces, and so on; you don't have to implement method lookup like you would on a typical physical CPU.
  10. When I first saw the library, I thought it was something you'd only need if you were doing deep JVM hacking. But when I read the slides from Charlie's presentation at Øredev, it clicked. This library takes me way back to my college days, when we'd write assembler programs for a really simple instruction set like MIPS. BiteScript evokes that same kind of feeling. I'd always thought the JVM would have a huge, crufty instruction set, but it's actually quite manageable to keep the most important parts of it in your head.
  11. That covers generating the final stage of compilation. But what about parsing the input? For that, I used a Ruby library called Parslet.
  17. Parslet is a little different: it basically does the tokenizing and parsing together.
  19. Those two tools are all we need to build a simple programming language. I decided to call mine Thnad, which is named after a fictional letter in a Dr. Seuss book about extending the alphabet.
  50. So that's a whirlwind tour of Thnad. I was telling someone about this project (it was either Shane Becker or Brian Ford, I think) and he said, "Hey, you should port this to Rubinius!" I thought, "Hey, why not? Sounds fun." Before I could do this, I needed to learn a little more about the runtime.