PyCon TW 2017 - PyPy's approach to construct domain-specific language runtime - Part 2
These are the slides for the PyCon TW 2017 Day 3 talk "PyPy's approach to construct domain-specific language runtime". This is Part 2; Part 1 is jserv's work, so refer to his slides for it.
2. Test Environment
• Vagrant, Ubuntu 16.04
• Benchmark results on the host OS and the guest OS are very close, so I used a VM to get the results
  (BTW, it's really easy to get your VM dirty)
3. Test Versions
• CPython2 2.7.12
• CPython3 3.5.2
• PyPy2 5.1.2 (installed from apt-get)
• PyPy2 5.6.0 (compiled from source)
• PyPy3 5.7.1 (compiled from source)
4. Why only CPython & PyPy
• Cython
  • You'll need to learn Cython's syntax, which mixes C and Python
• Jython
  • The latest Jython release, 2.7.0, came out in May 2015, so it's outdated
5. Some Notes
• PyPy3 is still in beta, so don't be surprised if it's slower than CPython 3
• Not every module runs faster on PyPy than on CPython; there will be samples later
6. Why is PyPy3 in beta?
• CPython and PyPy are developed in different ways
• CPython
  • Focuses on Python 3 and only maintains Python 2 when a security issue pops up
• PyPy
  • Focuses on PyPy2; PyPy3 is also updated, but it's not the main line of development
9. PyPy Installation
• If you want to compile PyPy from scratch
  • First, install the dependencies
    • http://doc.pypy.org/en/latest/build.html
  • Then, cd to pypy/goal
11. PyPy Installation
• Notice
  • Compiling PyPy takes a long time, and compiling it with the JIT enabled takes even longer
    • Usually 30 minutes or more
  • You need at least 4 GB of RAM to compile it on a 64-bit machine; make sure you have enough RAM, or the build may be killed by the system
24. Not every case should use PyPy
• For example, with code like the snippet below (repeated string concatenation), CPython is faster than PyPy
  myStr = ""
  for x in xrange(1, 10**6):
      myStr += str(x)
31. What to do to add JIT
• We need to find the "Reds" and the "Greens"
  • Greens -> the variables that define the instruction being executed
  • Reds -> the data being manipulated
32. What to do to add JIT
• from rpython.rlib.jit import JitDriver
• jitdriver = JitDriver(greens=[], reds=[])
• Then add a jit_merge_point call to your main loop
35. Optimize
• Speed up loops
  • Every loop iteration has to look up the jump address in a dictionary, but the dictionary is static, so we can move the lookup into its own function and mark it with the @elidable decorator to speed things up
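A minimal sketch of that idea, assuming the dictionary is a bracket map from one bracket position to its partner. The helper name get_matching_bracket and the fallback decorator are assumptions for illustration; the fallback only exists so the snippet runs where rpython is not installed.

```python
try:
    # Real hint from the RPython toolchain.
    from rpython.rlib.jit import elidable
except ImportError:
    # Stand-in (an assumption): behaves like a no-op decorator on plain Python.
    def elidable(func):
        return func


@elidable
def get_matching_bracket(bracket_map, pc):
    # The map never changes after parsing, so the same (bracket_map, pc)
    # always yields the same answer; @elidable tells the JIT it may rely on that.
    return bracket_map[pc]
```

In the main loop, a direct `bracket_map[pc]` lookup would then be replaced by `get_matching_bracket(bracket_map, pc)`.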
41. Basic Knowledge
• It reads a Brainf*ck file and turns it into IR
• Then you can choose to optimize the IR
• Finally, it turns the IR into Python code, which is compiled with PyPy to generate a binary file
Brainf*ck Code -> IR -> Python Code -> Binary File
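The pipeline above can be sketched in a few lines. The IR shape used here, a flat list of (op, arg) tuples, and the function names parse, emit_python and run_generated are assumptions made for illustration; they are not the contents of ir.py or trans.py. The final compile-to-binary step is left out, and the generated code is simply executed with exec.

```python
def parse(source):
    """Brainf*ck source -> hypothetical IR: a list of (op, arg) tuples."""
    table = {'+': ('add', 1), '-': ('add', -1),
             '>': ('move', 1), '<': ('move', -1),
             '.': ('out', 0),
             '[': ('loop_start', 0), ']': ('loop_end', 0)}
    return [table[ch] for ch in source if ch in table]


def emit_python(ir):
    """IR -> Python source text."""
    lines = ['mem = [0] * 30000', 'p = 0', 'out = []']
    depth = 0
    for op, arg in ir:
        pad = '    ' * depth
        if op == 'add':
            lines.append('%smem[p] += %d' % (pad, arg))
        elif op == 'move':
            lines.append('%sp += %d' % (pad, arg))
        elif op == 'out':
            lines.append('%sout.append(chr(mem[p]))' % pad)
        elif op == 'loop_start':
            lines.append('%swhile mem[p]:' % pad)
            lines.append('%s    pass' % pad)   # keeps an empty loop body valid
            depth += 1
        elif op == 'loop_end':
            depth -= 1
    return '\n'.join(lines)


def run_generated(source):
    """Run the generated Python code and return what it printed."""
    namespace = {}
    exec(emit_python(parse(source)), namespace)
    return ''.join(namespace['out'])
```

An optimization pass would sit between parse and emit_python, rewriting the list of tuples before any Python code is generated.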
42. Architecture
• ir.py -> Brainf*ck to IR, and IR to Python
• trans.py -> Main program
  • python trans.py <input> <output> <optmode>
  • optmode: 1 enables optimization, 0 disables it
• opt.py -> Optimization tricks
43. Optimizations
• opt_contract (Contraction)
  • A sequence like "+++++" means we have to do "mem[p] += 1" five times
  • Because we have IR, we can replace those instructions with a single "mem[p] += 5"
  • The same trick applies to "+", "-", ">" and "<"
44. Optimizations
• opt_clearloop (Clear Loop)
  • A command like [-] means: while mem[p] is non-zero, do mem[p] -= 1
  • We know what the result is, so we can set mem[p] to zero directly
    mem[p] = 0
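As a sketch, the pass only has to spot the three-instruction pattern and replace it. The (op, arg) tuple IR and the 'clear' op name are assumptions for illustration, not taken from opt.py.

```python
CLEAR_PATTERN = [('loop_start', 0), ('add', -1), ('loop_end', 0)]


def opt_clearloop(ir):
    """Replace every [-] pattern with a single 'clear' op (mem[p] = 0)."""
    out = []
    i = 0
    while i < len(ir):
        if ir[i:i + 3] == CLEAR_PATTERN:
            out.append(('clear', 0))
            i += 3
        else:
            out.append(ir[i])
            i += 1
    return out
```

The code generator would then emit `mem[p] = 0` for a 'clear' op instead of a while loop.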
45. Optimizations
• opt_multiloop & opt_copyloop (Multiplication and Copy)
  • A command like [->+>+<<] adds mem[p]'s value to mem[p+1] and mem[p+2] (a copy when both start at zero) and sets mem[p] to zero
  • Once we recognize what the loop is doing, we can replace it with a few direct assignments
46. Optimizations
• opt_multiloop & opt_copyloop (Multiplication and Copy)
  • The same trick applies to [->++<], which does mem[p+1] += 2 * mem[p] and sets mem[p] = 0
  • That is a multiplication
47. Optimizations
• opt_offsetops (Operation Offsets)
  • Brainf*ck has a pointer that indicates the current cell, and that pointer usually moves a lot
  • If we compute the offset for each instruction directly, we don't need to move the pointer around
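A sketch of the idea for a straight-line block (no loops or I/O), again on a hypothetical (op, arg) tuple IR. The 'add_at' op, meaning mem[p + offset] += amount, is an assumption introduced here, not a name from opt.py.

```python
def opt_offsetops(block):
    """Turn moves + adds into adds at fixed offsets, plus one final move."""
    out = []
    offset = 0
    for op, arg in block:
        if op == 'move':
            offset += arg                         # track the pointer, emit nothing
        elif op == 'add':
            out.append(('add_at', offset, arg))   # mem[p + offset] += arg
        else:
            raise ValueError('only straight-line add/move blocks are handled')
    if offset != 0:
        out.append(('move', offset))              # one pointer update at the end
    return out
```

For example, ">>+<---<" (after contraction) becomes two 'add_at' ops and no pointer movement at all, since the net move is zero.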
48. Optimizations
• opt_cancel (Cancel Instructions)
  • ++++-->>+-<<< does the same thing as ++<
  • So why waste time on those instructions?
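The cancellation can be sketched with a stack: each instruction either cancels the one before it or gets pushed. This version works directly on the Brainf*ck text rather than on IR, which is a simplification chosen for this example.

```python
def opt_cancel(code):
    """Drop adjacent pairs that undo each other: +-, -+, ><, <>."""
    opposite = {'+': '-', '-': '+', '>': '<', '<': '>'}
    out = []
    for ch in code:
        if out and opposite.get(ch) == out[-1]:
            out.pop()        # ch undoes the previous instruction
        else:
            out.append(ch)
    return ''.join(out)
```

Running it on the example from the slide, opt_cancel('++++-->>+-<<<') returns '++<'.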
55. Wait a sec...
• Not every case can use a JIT
  • A JIT needs warm-up and analysis, and the warm-up may take longer than your code actually runs
  • It's important to exclude the warm-up time when you do benchmarking
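One simple way to keep warm-up out of the numbers is to call the function a few times before starting the clock. The helper name bench and the default round counts below are arbitrary choices for illustration; how many warm-up rounds a real workload needs is something to measure, not assume.

```python
import timeit


def bench(func, warmup=5, rounds=10):
    """Return the best time of `rounds` timed calls, after `warmup` untimed ones."""
    for _ in range(warmup):
        func()               # gives the JIT a chance to warm up; not measured
    timings = []
    for _ in range(rounds):
        start = timeit.default_timer()
        func()
        timings.append(timeit.default_timer() - start)
    return min(timings)
```

Usage looks like `bench(lambda: sum(range(10**6)))`. Comparing the result with warmup=0 gives a rough idea of how much the warm-up phase costs.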
56. Wait a sec...
• And do you really need a JIT?
  • Introducing a JIT into a project can cost a lot
  • Sometimes buying more servers is a better choice than introducing a JIT into your project
57. Wait a sec...
• But if you have analyzed your project and know how difficult it is to bring PyPy and a JIT into it, then you're good to go!
• BTW, an executable built with the JIT enabled is larger than one built without it