Extending Python in C 
Cluj.py meetup, Nov 19th 
Steffen Wenz, CTO TrustYou
Goals of today’s talk 
● Look behind the scenes of the CPython interpreter: gain insights into “how Python works”
● Explore the CPython C API 
● Build a Python extension in C 
● Introduction to Cython
Who are we? 
● For each hotel on the planet, provide a summary of all reviews
● Expertise: 
○ NLP 
○ Machine Learning 
○ Big Data 
● Clients: 

TrustYou Tech Stack 
Batch Layer 
● Hadoop (HDP 2.1) 
● Python 
● Pig 
● Luigi 
Service Layer 
● PostgreSQL 
● MongoDB 
● Redis 
● Cassandra 
[Diagram: Hadoop cluster (100 nodes) and application machines, with arrows labeled "Data" and "Queries"]
Let’s dive in! Assigning an integer
Python: a = 4
C API: PyObject* a = PyInt_FromLong(4);
// How does this differ from int a = 4?
Documentation: PyInt_FromLong
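One way to approach the question on this slide from the Python side: every Python integer is a full object with a type and a reference count, not a bare machine word. The following is a minimal sketch, assuming CPython 3 (the slides use Python 2); the exact byte size is an implementation detail and varies by build.

```python
import sys

a = 4

# In CPython, even a small integer is a full object with a header
# (reference count, type pointer), not a bare 4-byte machine word.
print(type(a))            # the object's type
print(sys.getsizeof(a))   # size in bytes, header included; build-dependent

# Integers are objects, so they carry methods like any other object.
print(a.bit_length())
```

A C `int a = 4;` by contrast occupies only `sizeof(int)` bytes and has no type information at runtime.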
List item access
Python: x = xs[i]
C API: PyObject* x = PyList_GetItem(xs, i);
Documentation: PyList_GetItem
Returning None
Python: return None
C API: Py_INCREF(Py_None); return Py_None;
Documentation: Py_INCREF
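Calling a function
Python: foo(1337, "bar")
C API:
// argument list
PyObject *args = Py_BuildValue("is", 1337, "bar");
// make call (returns a new reference, or NULL on error)
PyObject *result = PyObject_CallObject(foo, args);
// release arguments
Py_DECREF(args);
// release the result once it is no longer needed
Py_XDECREF(result);
Documentation: Py_BuildValue, PyObject_CallObject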
What’s the CPython C API? 
● API to manipulate Python objects, and interact with Python code, from C/C++
● Purpose: Extend Python with new modules/types 
● Why?
CPython internals
def slangify(s):
    return s + ", yo!"
>>> slangify("hey")
'hey, yo!'
[Diagram: Compiler, bytecode '|\x00\x00d\x01\x00\x17S', Interpreter, C API]
Not true for Jython, IronPython, PyPy, Stackless

Why is Python slow?
Python:
a = 1
a = a + 1
C:
int a = 1;
a++;
Why is Python slow?
Python:
class Point:
    def __init__(self, x, y):
        self.x = x; self.y = y
p = Point(1, 2)
print p.x
C:
typedef struct { int x, y; } point;
int main() {
    point p = {1, 2};
    printf("%i", p.x);
}
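The contrast above can be made visible from Python itself. This is a small sketch, assuming CPython 3, of where the attributes of a `Point` live: conceptually in a per-instance dictionary rather than at fixed struct offsets (newer CPython versions optimize the common case internally, but the observable model is the same).

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Instance attributes are exposed as a per-object dictionary, so p.x is
# resolved by name at runtime rather than by a fixed struct offset.
print(p.__dict__)

# Attributes can even be added after construction, which a C struct
# cannot do; the interpreter therefore cannot assume a fixed layout.
p.z = 3
print(sorted(p.__dict__))
```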
Why is Python slow? 
The GIL
Writing in C
● No OOP:
typedef struct { /* ... */ } complexType;
void fun(complexType* obj, int x, char* y) {
    // ...
}
● Macros for code generation:
#define SWAP(x,y) {int tmp = x; x = y; y = tmp;}
SWAP(a, b);
Writing in C 
● Manual memory management: 
○ C: static, stack, malloc/free 
○ Python C API: Reference counting 
● No exceptions 
○ Error handling via returning values 
○ CPython: return NULL; signals an error
Reference Counting
void Py_INCREF(PyObject *o)
Increment the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use Py_XINCREF().
= I want to hold on to this object and use it again after a while*
*) any interaction with the Python interpreter that may invalidate my reference
void Py_DECREF(PyObject *o)
Decrement the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use Py_XDECREF(). If the reference count reaches zero, the object’s type’s deallocation function (which must not be NULL) is invoked.
= I’m done, and don’t care if the object is discarded at this call
See documentation
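Reference counts can also be observed from Python, which may help build intuition before writing Py_INCREF and Py_DECREF by hand. A minimal sketch, assuming CPython 3; the absolute numbers are implementation details, only the differences matter here.

```python
import sys

obj = object()

# getrefcount reports one extra reference: the temporary one held by
# the function argument itself.
before = sys.getrefcount(obj)

# Storing the object in a container takes new references, the
# Python-level counterpart of Py_INCREF in C.
holders = [obj, obj, obj]
after_store = sys.getrefcount(obj)

# Dropping the container releases them again (Py_DECREF).
del holders
after_release = sys.getrefcount(obj)

print(before, after_store, after_release)
```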
Anatomy of a refcount bug
void buggy(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0); // borrowed ref.
    PyList_SetItem(list, 1, PyInt_FromLong(0L)); // calls destructor of previous element
    PyObject_Print(item, stdout, 0); // BUG! item may already have been freed
}
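Why replacing element 1 can invalidate a borrowed reference to element 0 is easier to see with a concrete destructor. The sketch below reproduces the situation in pure Python, assuming CPython 3 (where destructors run as soon as the reference count reaches zero); the `Item` and `Saboteur` classes are hypothetical and exist only for illustration.

```python
import weakref

class Item:
    """Plain element; stands in for the object behind the borrowed pointer."""

class Saboteur:
    """Hypothetical element whose destructor mutates the list that held it."""
    def __init__(self, target):
        self.target = target

    def __del__(self):
        # Runs as soon as the last reference to this object disappears.
        del self.target[0]

lst = [Item(), None]
lst[1] = Saboteur(lst)

# A weak reference lets us observe the first item without keeping it
# alive, similar in spirit to a borrowed reference in C.
watched = weakref.ref(lst[0])

# Replacing element 1 drops the last reference to the Saboteur; CPython
# runs its destructor immediately, which deletes element 0 as a side effect.
lst[1] = 0

print(lst)        # the first item is gone from the list
print(watched())  # the weak reference is dead
```

In C, `item` would now point at freed memory; taking an owned reference with Py_INCREF before the call to PyList_SetItem (and releasing it afterwards) avoids this.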
Our First Extension Module
Adding integers in C 
>>> import arithmetic 
>>> arithmetic.add(1, 1337) 
1338
#include <Python.h>
static PyObject*
arithmetic_add(PyObject* self, PyObject* args)
{
    int i, j;
    PyArg_ParseTuple(args, "ii", &i, &j);
    PyObject* sum = PyInt_FromLong(i + j);
    return sum;
}
static PyObject*
arithmetic_add(PyObject* self, PyObject* args)
{
    int i, j;
    PyObject* sum = NULL;
    if (!PyArg_ParseTuple(args, "ii", &i, &j))
        goto error;
    sum = PyInt_FromLong(i + j);
    if (sum == NULL)
        goto error;
    return sum;
error:
    Py_XDECREF(sum);
    return NULL;
}
Boilerplate²
static PyMethodDef ArithmeticMethods[] = {
    {"add", arithmetic_add, METH_VARARGS, "Add two integers."},
    {NULL, NULL, 0, NULL} // sentinel
};
PyMODINIT_FUNC
initarithmetic(void)
{
    (void) Py_InitModule("arithmetic", ArithmeticMethods);
}
and build your module
from distutils.core import setup, Extension
module = Extension("arithmetic", sources=["arithmeticmodule.c"])
setup(
    name="Arithmetic",
    version="1.0",
    ext_modules=[module]
)
$ sudo python setup.py install
# build with gcc, any compiler errors & warnings are shown here
$ python
>>> import arithmetic
>>> arithmetic
<module 'arithmetic' from '/usr/local/lib/python2.7/dist-packages/arithmetic.so'>
>>> arithmetic.add
<built-in function add>
>>> arithmetic.add(1, "1337")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required
Why on earth would I do that?
Why go through all this trouble? 
● Performance 
○ C extensions & Cython optimize CPU-bound code (vs. memory-bound, IO-bound)
○ Pareto principle: 20% of the code is responsible for 80% of the runtime
● Also: Interfacing with existing C/C++ code
Is my Python code performance-critical? 
import cProfile, pstats, sys 
pr = cProfile.Profile() 
pr.enable() 
setup() 
# run code you want to profile 
pr.disable() 
stats = pstats.Stats(pr, stream=sys.stdout).sort_stats("time") 
stats.print_stats()
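For readers who want to try the profiling recipe directly, here is a self-contained variant as a sketch, assuming Python 3. The `workload` function is a hypothetical stand-in for the code under investigation, and the report is limited to the five most expensive entries.

```python
import cProfile
import io
import pstats

def workload():
    # Hypothetical stand-in for the code to be profiled.
    return sum(i * i for i in range(100_000))

pr = cProfile.Profile()
pr.enable()
result = workload()
pr.disable()

# Collect the report in a string buffer, sorted by internal time like
# the slide's example, and show only the five top entries.
buffer = io.StringIO()
stats = pstats.Stats(pr, stream=buffer).sort_stats("time")
stats.print_stats(5)
report = buffer.getvalue()
print(report)
```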
55705589 function calls (55688041 primitive calls) in 69.216 seconds 
Ordered by: internal time 
ncalls tottime percall cumtime percall filename:lineno(function) 
45413 21.856 0.000 21.856 0.000 {method 'get' of 'pytc.HDB' objects} 
32275 9.490 0.000 9.656 0.000 /usr/local/lib/python2.7/dist-packages/simplejson/decoder.py:376(raw_decode) 
18760 6.403 0.000 12.797 0.001 /home/steffen/apps/group/lib/util/timeseries.py:29(reindex_pad) 
56992 2.586 0.000 2.624 0.000 {sorted} 
1383832 2.244 0.000 2.244 0.000 /home/steffen/apps/group/lib/hotel/index.py:231(<lambda>) 
2708692 1.845 0.000 5.657 0.000 /home/steffen/apps/group/lib/hotel/index.py:21(<genexpr>) 
497989 1.718 0.000 2.456 0.000 {_heapq.heapreplace} 
4734466 1.624 0.000 2.491 0.000 /home/steffen/apps/group/lib/util/timeseries.py:43(<genexpr>) 
346738 1.475 0.000 1.475 0.000 /usr/lib/python2.7/json/decoder.py:371(raw_decode) 
510726 1.354 0.000 10.432 0.000 /usr/lib/python2.7/heapq.py:357(merge) 
2691966 1.310 0.000 1.310 0.000 /home/steffen/apps/group/lib/util/timeseries.py:21(float_parse) 
357260 1.160 0.000 5.122 0.000 /home/steffen/apps/group/lib/hotel/index.py:471(<genexpr>) 
5348564 0.912 0.000 0.912 0.000 /home/steffen/apps/group/lib/util/timeseries.py:90(<genexpr>) 
758026 0.882 0.000 0.882 0.000 {method 'match' of '_sre.SRE_Pattern' objects} 
9470443 0.868 0.000 0.868 0.000 {method 'append' of 'list' objects} 
4715746 0.867 0.000 0.867 0.000 /home/steffen/apps/group/lib/util/timeseries.py:31(bound) 
1 0.857 0.857 69.220 69.220 /home/steffen/apps/group/lib/pages/table_page.py:37(calculate) 
644766 0.839 0.000 1.752 0.000 {sum}
You can’t observe without changing
import timeit
def setup():
    pass
def stmt():
    pass
print timeit.timeit(stmt=stmt, setup=setup, number=100)
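The timeit skeleton above can be filled in as follows. This is a sketch, assuming Python 3; `build_list` is a hypothetical statement under test, and the measured times depend entirely on the machine.

```python
import timeit

def build_list():
    # Hypothetical statement under test.
    return [i * 2 for i in range(1000)]

# timeit accepts callables as well as source strings. It runs the
# statement `number` times and returns the total elapsed seconds.
total = timeit.timeit(stmt=build_list, number=100)
print("total: %.6f s, per call: %.9f s" % (total, total / 100))

# repeat() makes several independent measurements; the minimum is
# usually the least disturbed by other activity on the machine.
best = min(timeit.repeat(stmt=build_list, number=100, repeat=3))
print("best of 3: %.6f s" % best)
```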
Example: QuickSort
Pythonic QuickSort
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    middle = len(xs) // 2
    pivot = xs[middle]
    del xs[middle]
    left, right = [], []
    for x in xs:
        append_to = left if x < pivot else right
        append_to.append(x)
    return quicksort(left) + [pivot] + quicksort(right)
Results: Python vs. C extension 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s
Cython
Adding integers in Cython
# add.pyx
def add(i, j):
    return i + j
# main.py
import pyximport; pyximport.install()
import add
if __name__ == "__main__":
    print add.add(1, 1337)
What is Cython?
● Compiles Python to C code
● “Superset” of Python: Accepts type annotations to compile more efficient code (optional!)
    cdef int i = 2
● No manual reference counting, error handling, or boilerplate
● Plus nicer compiling workflows
Results: 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s 
Cython QuickSort (unchanged): 0.82s
cdef partition(xs, int left, int right, int pivot_index):
    cdef int pivot = xs[pivot_index]
    cdef int el
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    pivot_index = left
    for i in xrange(left, right):
        el = xs[i]
        if el <= pivot:
            xs[i], xs[pivot_index] = xs[pivot_index], xs[i]
            pivot_index += 1
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    return pivot_index
def quicksort(xs, left=0, right=None):
    if right is None:
        right = len(xs) - 1
    if left < right:
        middle = (left + right) // 2
        pivot_index = partition(xs, left, right, middle)
        quicksort(xs, left, pivot_index - 1)
        quicksort(xs, pivot_index + 1, right)
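The C-like Cython version switches from building new lists to partitioning in place. For experimenting without a compiler, the same algorithm can be sketched in plain Python 3 by dropping the cdef declarations; this port is an illustration, not the code that was benchmarked in the talk.

```python
import random

def partition(xs, left, right, pivot_index):
    # Lomuto-style partition: move the pivot to the end, sweep once,
    # then put the pivot into its final position.
    pivot = xs[pivot_index]
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    store = left
    for i in range(left, right):
        if xs[i] <= pivot:
            xs[i], xs[store] = xs[store], xs[i]
            store += 1
    xs[store], xs[right] = xs[right], xs[store]
    return store

def quicksort(xs, left=0, right=None):
    # Sorts xs in place and returns None, like list.sort().
    if right is None:
        right = len(xs) - 1
    if left < right:
        middle = (left + right) // 2
        pivot_index = partition(xs, left, right, middle)
        quicksort(xs, left, pivot_index - 1)
        quicksort(xs, pivot_index + 1, right)

data = [random.randint(0, 999) for _ in range(200)]
expected = sorted(data)
quicksort(data)
print(data == expected)
```

Unlike the first, list-building quicksort, this variant returns nothing and modifies its argument, so callers need to pass a copy if the original order matters.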
Results: 
Pythonic QuickSort: 2.0s 
C extension module: 0.092s 
Cython QuickSort (unchanged): 0.82s 
Cython QuickSort (C-like): 0.37s 
● Unscientific result. Cython can be faster than hand-written C extensions!
Further Reading on Cython
● O’Reilly Book
See code samples on TrustYou GitHub account: https://github.com/trustyou/meetups/tree/master/python-c
TrustYou wants you! 
We offer positions 
in Cluj & Munich: 
● Data engineer 
● Application developer 
● Crawling engineer 
Write me at swenz@trustyou.net, check out our website, 
or see you at the next meetup!
Thank you!
Python Bytecode
>>> def slangify(s):
...     return s + ", yo!"
...
>>> slangify.func_code.co_code
'|\x00\x00d\x01\x00\x17S'
>>> import dis
>>> dis.dis(slangify)
  2           0 LOAD_FAST                0 (s)
              3 LOAD_CONST               1 (', yo!')
              6 BINARY_ADD
              7 RETURN_VALUE
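The same inspection works on Python 3 with small changes. The following sketch assumes CPython 3; opcode names and the byte layout differ between CPython versions, so the listing will not match the Python 2.7 output above exactly.

```python
import dis

def slangify(s):
    return s + ", yo!"

# In Python 3 the code object is reached via __code__
# (func_code in Python 2), and co_code is a bytes object.
raw = slangify.__code__.co_code
print(type(raw), len(raw))

# get_instructions yields one named tuple per bytecode instruction.
opnames = [ins.opname for ins in dis.get_instructions(slangify)]
print(opnames)
```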
Anatomy of a memory leak
PyObject* buggier()
{
    PyObject *lst = PyList_New(10);
    return Py_BuildValue("Oi", lst, 10); // increments refcount of lst; our own reference is never released
}
// read the doc carefully before using *any* C API function

Mais conteĂșdo relacionado

Mais procurados

All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2goMoriyoshi Koizumi
 
Hacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnHacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnMoriyoshi Koizumi
 
Letswift19-clean-architecture
Letswift19-clean-architectureLetswift19-clean-architecture
Letswift19-clean-architectureJung Kim
 
DevTalks Cluj - Open-Source Technologies for Analyzing Text
DevTalks Cluj - Open-Source Technologies for Analyzing TextDevTalks Cluj - Open-Source Technologies for Analyzing Text
DevTalks Cluj - Open-Source Technologies for Analyzing TextSteffen Wenz
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
 
Concurrent applications with free monads and stm
Concurrent applications with free monads and stmConcurrent applications with free monads and stm
Concurrent applications with free monads and stmAlexander Granin
 
PyCon KR 2019 sprint - RustPython by example
PyCon KR 2019 sprint  - RustPython by examplePyCon KR 2019 sprint  - RustPython by example
PyCon KR 2019 sprint - RustPython by exampleYunWon Jeong
 
Python GC
Python GCPython GC
Python GCdelimitry
 
Python Objects
Python ObjectsPython Objects
Python ObjectsQuintagroup
 
Compose Async with RxJS
Compose Async with RxJSCompose Async with RxJS
Compose Async with RxJSKyung Yeol Kim
 
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015NAVER / MusicPlatform
 
ClojureScript loves React, DomCode May 26 2015
ClojureScript loves React, DomCode May 26 2015ClojureScript loves React, DomCode May 26 2015
ClojureScript loves React, DomCode May 26 2015Michiel Borkent
 
C++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingC++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingcppfrug
 
RxJS Evolved
RxJS EvolvedRxJS Evolved
RxJS Evolvedtrxcllnt
 
C++totural file
C++totural fileC++totural file
C++totural filehalaisumit
 
Basic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersBasic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersAppier
 
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev Tools
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev ToolsĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev Tools
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev ToolsFDConf
 
C c++-meetup-1nov2017-autofdo
C c++-meetup-1nov2017-autofdoC c++-meetup-1nov2017-autofdo
C c++-meetup-1nov2017-autofdoKim Phillips
 
ClojureScript for the web
ClojureScript for the webClojureScript for the web
ClojureScript for the webMichiel Borkent
 

Mais procurados (20)

All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
 
Hacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnHacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 Autumn
 
Letswift19-clean-architecture
Letswift19-clean-architectureLetswift19-clean-architecture
Letswift19-clean-architecture
 
DevTalks Cluj - Open-Source Technologies for Analyzing Text
DevTalks Cluj - Open-Source Technologies for Analyzing TextDevTalks Cluj - Open-Source Technologies for Analyzing Text
DevTalks Cluj - Open-Source Technologies for Analyzing Text
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 
Concurrent applications with free monads and stm
Concurrent applications with free monads and stmConcurrent applications with free monads and stm
Concurrent applications with free monads and stm
 
PyCon KR 2019 sprint - RustPython by example
PyCon KR 2019 sprint  - RustPython by examplePyCon KR 2019 sprint  - RustPython by example
PyCon KR 2019 sprint - RustPython by example
 
Python GC
Python GCPython GC
Python GC
 
Python Objects
Python ObjectsPython Objects
Python Objects
 
Compose Async with RxJS
Compose Async with RxJSCompose Async with RxJS
Compose Async with RxJS
 
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015
서ëȄ 개발자가 바띌 ëłž Functional Reactive Programming with RxJava - SpringCamp2015
 
ClojureScript loves React, DomCode May 26 2015
ClojureScript loves React, DomCode May 26 2015ClojureScript loves React, DomCode May 26 2015
ClojureScript loves React, DomCode May 26 2015
 
C++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingC++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogramming
 
RxJS Evolved
RxJS EvolvedRxJS Evolved
RxJS Evolved
 
C++totural file
C++totural fileC++totural file
C++totural file
 
Basic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersBasic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python Programmers
 
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev Tools
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev ToolsĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev Tools
ĐŸŃ€ĐŸĐŽĐČĐžĐœŃƒŃ‚Đ°Ń ĐŸŃ‚Đ»Đ°ĐŽĐșĐ° JavaScript с ĐżĐŸĐŒĐŸŃ‰ŃŒŃŽ Chrome Dev Tools
 
C++ tutorial
C++ tutorialC++ tutorial
C++ tutorial
 
C c++-meetup-1nov2017-autofdo
C c++-meetup-1nov2017-autofdoC c++-meetup-1nov2017-autofdo
C c++-meetup-1nov2017-autofdo
 
ClojureScript for the web
ClojureScript for the webClojureScript for the web
ClojureScript for the web
 

Semelhante a Cluj.py Meetup: Extending Python in C

Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Yung-Yu Chen
 
掀蔷 Swift 的靱算
掀蔷 Swift çš„éąçŽ—æŽ€è”· Swift 的靱算
掀蔷 Swift 的靱算Pofat Tseng
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientistsaeberspaecher
 
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Edureka!
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAdam Getchell
 
Python-GTK
Python-GTKPython-GTK
Python-GTKYuren Ju
 
C# 6.0 Preview
C# 6.0 PreviewC# 6.0 Preview
C# 6.0 PreviewFujio Kojima
 
Python高çș§çŒ–皋äșŒïŒ‰
Python高çș§çŒ–皋äșŒïŒ‰Python高çș§çŒ–皋äșŒïŒ‰
Python高çș§çŒ–皋äșŒïŒ‰Qiangning Hong
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonYi-Lung Tsai
 
2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - english2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - englishJen Yee Hong
 
Pemrograman Python untuk Pemula
Pemrograman Python untuk PemulaPemrograman Python untuk Pemula
Pemrograman Python untuk PemulaOon Arfiandwi
 
Python bootcamp - C4Dlab, University of Nairobi
Python bootcamp - C4Dlab, University of NairobiPython bootcamp - C4Dlab, University of Nairobi
Python bootcamp - C4Dlab, University of Nairobikrmboya
 
Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Yuren Ju
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance PythonIan Ozsvald
 
Object Oriented Technologies
Object Oriented TechnologiesObject Oriented Technologies
Object Oriented TechnologiesUmesh Nikam
 
Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Juan Pablo
 
PyHEP 2018: Tools to bind to Python
PyHEP 2018:  Tools to bind to PythonPyHEP 2018:  Tools to bind to Python
PyHEP 2018: Tools to bind to PythonHenry Schreiner
 
Python lecture 03
Python lecture 03Python lecture 03
Python lecture 03Tanwir Zaman
 

Semelhante a Cluj.py Meetup: Extending Python in C (20)

Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020
 
掀蔷 Swift 的靱算
掀蔷 Swift çš„éąçŽ—æŽ€è”· Swift 的靱算
掀蔷 Swift 的靱算
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientists
 
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...Python Functions Tutorial | Working With Functions In Python | Python Trainin...
Python Functions Tutorial | Working With Functions In Python | Python Trainin...
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional Programming
 
Python-GTK
Python-GTKPython-GTK
Python-GTK
 
C# 6.0 Preview
C# 6.0 PreviewC# 6.0 Preview
C# 6.0 Preview
 
Python高çș§çŒ–皋äșŒïŒ‰
Python高çș§çŒ–皋äșŒïŒ‰Python高çș§çŒ–皋äșŒïŒ‰
Python高çș§çŒ–皋äșŒïŒ‰
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded Python
 
2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - english2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - english
 
Pemrograman Python untuk Pemula
Pemrograman Python untuk PemulaPemrograman Python untuk Pemula
Pemrograman Python untuk Pemula
 
Python bootcamp - C4Dlab, University of Nairobi
Python bootcamp - C4Dlab, University of NairobiPython bootcamp - C4Dlab, University of Nairobi
Python bootcamp - C4Dlab, University of Nairobi
 
Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
 
Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Python GTK (Hacking Camp)
Python GTK (Hacking Camp)
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
 
C++ Programming
C++ ProgrammingC++ Programming
C++ Programming
 
Object Oriented Technologies
Object Oriented TechnologiesObject Oriented Technologies
Object Oriented Technologies
 
Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#Lo Mejor Del Pdc2008 El Futrode C#
Lo Mejor Del Pdc2008 El Futrode C#
 
PyHEP 2018: Tools to bind to Python
PyHEP 2018:  Tools to bind to PythonPyHEP 2018:  Tools to bind to Python
PyHEP 2018: Tools to bind to Python
 
Python lecture 03
Python lecture 03Python lecture 03
Python lecture 03
 

Último

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Christopher Logan Kennedy
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 

Último (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls đŸ„° 8617370543 Service Offer VIP Hot Model
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

Cluj.py Meetup: Extending Python in C

  • 1. Extending Python in C Cluj.py meetup, Nov 19th Steffen Wenz, CTO TrustYou
  • 2. Goals of today’s talk ● Look behind the scenes of the CPython interpreter - gain insights into “how Python works” ● Explore the CPython C API ● Build a Python extension in C ● Introduction to Cython
  • 3. Who are we? ● For each hotel on the planet, provide a summary of all reviews ● Expertise: ○ NLP ○ Machine Learning ○ Big Data ● Clients: 

  • 4.
  • 5. TrustYou Tech Stack Batch Layer ● Hadoop (HDP 2.1) ● Python ● Pig ● Luigi Service Layer ● PostgreSQL ● MongoDB ● Redis ● Cassandra Data Data Queries Hadoop cluster (100 nodes) Application machines
  • 6. Let’s dive in! Assigning an integer a = 4 PyObject* a = PyInt_FromLong(4); // what's the difference to int a = 4? Documentation: PyInt_FromLong
  • 7. List item access x = xs[i] PyObject* x = PyList_GetItem(xs, i); Documentation: PyList_GetItem
  • 8. Returning None return None Py_INCREF(Py_None); return Py_None; Documentation: Py_INCREF
  • 9. Calling a function foo(1337, "bar") // argument list PyObject *args = Py_BuildValue ("is", 1337, "bar"); // make call PyObject_CallObject(foo, args); // release arguments Py_DECREF(args); Documentation: Py_BuildValue, PyObject_CallObject
  • 10. What’s the CPython C API? ● API to manipulate Python objects, and interact with Python code, from C/C++ ● Purpose: Extend Python with new modules/types ● Why?
  • 11. CPython internals def slangify(s): return s + ", yo!" C API Compiler Interpreter >>> slangify("hey") 'hey, yo!' |\x00\x00d\x01\x00\x17S Not true for Jython, IronPython, PyPy, Stackless

  • 12. Why is Python slow? a = 1 a = a + 1 int a = 1; a++;
  • 13. Why is Python slow? class Point: def __init__(self, x, y): self.x = x; self.y = y p = Point(1, 2) print p.x typedef struct { int x, y; } point; int main() { point p = {1, 2}; printf("%i", p.x); }
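The contrast drawn on the two slides above can be observed from Python itself. This is a minimal sketch, assuming CPython 3 (the slides use Python 2 syntax); `Point` mirrors the class from the slide, and the exact sizes vary by platform and version.

```python
import sys


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


a = 1
# A Python int is a heap-allocated object that carries a reference count
# and a type pointer, so it occupies more space than a 4-byte C int.
int_size = sys.getsizeof(a)

p = Point(1, 2)
# Conceptually, p.x is a lookup by name in the instance's attributes,
# not a read at a fixed memory offset as with a C struct.
attrs = vars(p)

print(int_size, attrs)
```

Every `a = a + 1` therefore allocates or looks up another object, where the C version updates a machine word in place.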
  • 14. Why is Python slow? The GIL
  • 15. Writing in C ● No OOP : typedef struct { /* ... */ } complexType; void fun(complexType* obj, int x, char* y) { // ... } ● Macros for code generation: #define SWAP(x,y) {int tmp = x; x = y; y = tmp;} SWAP(a, b);
  • 16. Writing in C ● Manual memory management: ○ C: static, stack, malloc/free ○ Python C API: Reference counting ● No exceptions ○ Error handling via returning values ○ CPython: return NULL; signals an error
  • 17. Reference Counting void Py_INCREF(PyObject *o) Increment the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use Py_XINCREF(). = I want to hold on to this object and use it again after a while* *) any interaction with Python interpreter that may invalidate my reference void Py_DECREF(PyObject *o) Decrement the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL, use Py_XDECREF(). If the reference count reaches zero, the object’s type’s deallocation function (which must not be NULL) is invoked. = I’m done, and don’t care if the object is discarded at this call See documentation
  • 18. Anatomy of a refcount bug void buggy(PyObject *list) { PyObject *item = PyList_GetItem(list, 0); // borrowed ref. PyList_SetItem(list, 1, PyInt_FromLong(0L)); // calls destructor of previous element PyObject_Print(item, stdout, 0); // BUG! }
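Reference counts can also be watched from the Python side, which may help when reasoning about owned versus borrowed references. This is a minimal sketch, assuming CPython 3; `sys.getrefcount` is CPython-specific, and the names `item` and `holder` are made up for illustration.

```python
import sys

item = object()
# getrefcount reports one extra reference for its own argument.
before = sys.getrefcount(item)

holder = [item]   # the list takes its own reference (compare Py_INCREF)
during = sys.getrefcount(item)

holder.clear()    # the list releases its reference (compare Py_DECREF)
after = sys.getrefcount(item)

print(before, during, after)
```

In the buggy C function above, `item` is only borrowed from the list, so nothing keeps the count above zero once the list drops or replaces that element.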
  • 20. Adding integers in C >>> import arithmetic >>> arithmetic.add(1, 1337) 1338
  • 21. #include <Python.h> static PyObject* arithmetic_add(PyObject* self, PyObject* args) { int i, j; PyArg_ParseTuple(args, "ii", &i, &j); PyObject* sum = PyInt_FromLong(i + j); return sum; }
  • 22. static PyObject* arithmetic_add(PyObject* self, PyObject* args) { int i, j; PyObject* sum = NULL; if (!PyArg_ParseTuple(args, "ii", &i, &j)) goto error; sum = PyInt_FromLong(i + j); if (sum == NULL) goto error; return sum; error: Py_XDECREF(sum); return NULL; }
  • 23. Boilerplate² static PyMethodDef ArithmeticMethods[] = { {"add", arithmetic_add, METH_VARARGS, "Add two integers."}, {NULL, NULL, 0, NULL} // sentinel }; PyMODINIT_FUNC initarithmetic(void) { (void) Py_InitModule("arithmetic", ArithmeticMethods); }
  • 24. … and build your module from distutils.core import setup, Extension module = Extension("arithmetic", sources=["arithmeticmodule.c"]) setup( name="Arithmetic", version="1.0", ext_modules=[module] )
  • 25. $ sudo python setup.py install # build with gcc, any compiler errors & warnings are shown here $ python >>> import arithmetic >>> arithmetic <module 'arithmetic' from '/usr/local/lib/python2.7/dist-packages/arithmetic.so'> >>> arithmetic.add <built-in function add> >>> arithmetic.add(1, "1337") Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: an integer is required
  • 26. Why on earth would I do that?
  • 27. Why go through all this trouble? ● Performance ○ C extensions & Cython optimize CPU-bound code (vs. memory-bound, IO-bound) ○ Pareto principle: 20% of the code responsible for 80% of the runtime ● Also: Interfacing with existing C/C++ code
  • 28. Is my Python code performance-critical? import cProfile, pstats, sys pr = cProfile.Profile() pr.enable() setup() # run code you want to profile pr.disable() stats = pstats.Stats(pr, stream=sys.stdout).sort_stats("time") stats.print_stats()
  • 29. 55705589 function calls (55688041 primitive calls) in 69.216 seconds Ordered by: internal time ncalls tottime percall cumtime percall filename:lineno(function) 45413 21.856 0.000 21.856 0.000 {method 'get' of 'pytc.HDB' objects} 32275 9.490 0.000 9.656 0.000 /usr/local/lib/python2.7/dist-packages/simplejson/decoder.py:376(raw_decode) 18760 6.403 0.000 12.797 0.001 /home/steffen/apps/group/lib/util/timeseries.py:29(reindex_pad) 56992 2.586 0.000 2.624 0.000 {sorted} 1383832 2.244 0.000 2.244 0.000 /home/steffen/apps/group/lib/hotel/index.py:231(<lambda>) 2708692 1.845 0.000 5.657 0.000 /home/steffen/apps/group/lib/hotel/index.py:21(<genexpr>) 497989 1.718 0.000 2.456 0.000 {_heapq.heapreplace} 4734466 1.624 0.000 2.491 0.000 /home/steffen/apps/group/lib/util/timeseries.py:43(<genexpr>) 346738 1.475 0.000 1.475 0.000 /usr/lib/python2.7/json/decoder.py:371(raw_decode) 510726 1.354 0.000 10.432 0.000 /usr/lib/python2.7/heapq.py:357(merge) 2691966 1.310 0.000 1.310 0.000 /home/steffen/apps/group/lib/util/timeseries.py:21(float_parse) 357260 1.160 0.000 5.122 0.000 /home/steffen/apps/group/lib/hotel/index.py:471(<genexpr>) 5348564 0.912 0.000 0.912 0.000 /home/steffen/apps/group/lib/util/timeseries.py:90(<genexpr>) 758026 0.882 0.000 0.882 0.000 {method 'match' of '_sre.SRE_Pattern' objects} 9470443 0.868 0.000 0.868 0.000 {method 'append' of 'list' objects} 4715746 0.867 0.000 0.867 0.000 /home/steffen/apps/group/lib/util/timeseries.py:31(bound) 1 0.857 0.857 69.220 69.220 /home/steffen/apps/group/lib/pages/table_page.py:37(calculate) 644766 0.839 0.000 1.752 0.000 {sum}
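The profiling recipe above can be tried on a toy workload. This is a minimal sketch, assuming Python 3; `work` is a hypothetical stand-in for the code under test, and the report is captured in a string instead of being written to stdout directly.

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for the code to be profiled.
    return sum(i * i for i in range(100000))


pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

out = io.StringIO()
stats = pstats.Stats(pr, stream=out).sort_stats("time")
stats.print_stats(5)  # limit the report to the top 5 entries
report = out.getvalue()
print(report)
```

Sorting by `"time"` orders entries by internal time, matching the "Ordered by: internal time" header in the output above.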
  • 30. You can’t observe without changing … import timeit def setup(): pass def stmt(): pass print timeit.timeit(stmt=stmt, setup=setup, number=100)
  • 32. Pythonic QuickSort def quicksort(xs): if len(xs) <= 1: return xs middle = len(xs) / 2 pivot = xs[middle] del xs[middle] left, right = [], [] for x in xs: append_to = left if x < pivot else right append_to.append(x) return quicksort(left) + [pivot] + quicksort(right)
  • 33. Results: Python vs. C extension Pythonic QuickSort: 2.0s C extension module: 0.092s
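A measurement like the one above could be set up roughly as follows. This is a minimal sketch, assuming Python 3; the input size, seed, and repeat count are arbitrary choices, so it does not reproduce the numbers on the slide. Unlike the slide's version, this port works on a copy so that repeated timing runs see the same input.

```python
import random
import timeit


def quicksort(xs):
    # Python 3 port of the slide's function; it copies the input so the
    # caller's list is left untouched between timing runs.
    if len(xs) <= 1:
        return xs
    xs = list(xs)
    pivot = xs.pop(len(xs) // 2)
    left, right = [], []
    for x in xs:
        (left if x < pivot else right).append(x)
    return quicksort(left) + [pivot] + quicksort(right)


random.seed(0)
data = [random.randint(0, 10**6) for _ in range(10000)]

elapsed = timeit.timeit(lambda: quicksort(data), number=5)
result = quicksort(data)
print("5 runs took %.3fs" % elapsed)
```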
  • 35. Adding integers in Cython # add.pyx def add(i, j): return i + j # main.py import pyximport; pyximport.install() import add if __name__ == "__main__": print add.add(1, 1337)
  • 36. What is Cython? ● Compiles Python to C code ● “Superset” of Python: Accepts type annotations to compile more efficient code (optional!) cdef int i = 2 ● No reference counting, error handling, boilerplate … plus nicer compiling workflows
  • 37. Results: Pythonic QuickSort: 2.0s C extension module: 0.092s Cython QuickSort (unchanged): 0.82s
  • 38. cdef partition(xs, int left, int right, int pivot_index): cdef int pivot = xs[pivot_index] cdef int el xs[pivot_index], xs[right] = xs[right], xs[pivot_index] pivot_index = left for i in xrange(left, right): el = xs[i] if el <= pivot: xs[i], xs[pivot_index] = xs[pivot_index], xs[i] pivot_index += 1 xs[pivot_index], xs[right] = xs[right], xs[pivot_index] return pivot_index def quicksort(xs, left=0, right=None): if right is None: right = len(xs) - 1 if left < right: middle = (left + right) / 2 pivot_index = partition(xs, left, right, middle) quicksort(xs, left, pivot_index - 1) quicksort(xs, pivot_index + 1, right)
  • 39. Results: Pythonic QuickSort: 2.0s C extension module: 0.092s Cython QuickSort (unchanged): 0.82s Cython QuickSort (C-like): 0.37s ● Unscientific result. Cython can be faster than hand-written C extensions!
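For comparison, the C-like variant can be written without the Cython type declarations. This is a minimal sketch, assuming Python 3 (`range` and `//` replace the slide's `xrange` and `/`); the variable `store` is renamed from the slide's reused `pivot_index` for readability. It shows the in-place algorithm only and says nothing about the timings above.

```python
def partition(xs, left, right, pivot_index):
    pivot = xs[pivot_index]
    # Move the pivot out of the way, to the right end.
    xs[pivot_index], xs[right] = xs[right], xs[pivot_index]
    store = left
    for i in range(left, right):
        if xs[i] <= pivot:
            xs[i], xs[store] = xs[store], xs[i]
            store += 1
    # Put the pivot into its final position.
    xs[store], xs[right] = xs[right], xs[store]
    return store


def quicksort(xs, left=0, right=None):
    # Sorts xs in place; no new lists are allocated.
    if right is None:
        right = len(xs) - 1
    if left < right:
        middle = (left + right) // 2
        pivot_index = partition(xs, left, right, middle)
        quicksort(xs, left, pivot_index - 1)
        quicksort(xs, pivot_index + 1, right)


values = [5, 3, 8, 1, 9, 2, 8]
quicksort(values)
print(values)
```

Because this version only swaps elements of one list, the typed Cython build can keep the loop index and comparisons in C integers, which is where the `cdef int` declarations come in.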
  • 40. Further Reading on Cython ● O’Reilly Book ● See code samples on TrustYou GitHub account: https://github.com/trustyou/meetups/tree/master/python-c
  • 41. TrustYou wants you! We offer positions in Cluj & Munich: ● Data engineer ● Application developer ● Crawling engineer Write me at swenz@trustyou.net, check out our website, or see you at the next meetup!
  • 43. Python Bytecode >>> def slangify(s): ... return s + ", yo!" ... >>> slangify.func_code.co_code '|\x00\x00d\x01\x00\x17S' >>> import dis >>> dis.dis(slangify) 2 0 LOAD_FAST 0 (s) 3 LOAD_CONST 1 (', yo!') 6 BINARY_ADD 7 RETURN_VALUE
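The same inspection works on current interpreters with slightly different names. This is a minimal sketch, assuming Python 3, where the code object lives on `__code__` instead of `func_code`; the opcode names and byte values differ between versions, so none are assumed here.

```python
import dis


def slangify(s):
    return s + ", yo!"


# In Python 3 the raw bytecode is a bytes object on __code__.
raw = slangify.__code__.co_code

# Collect the opcode names for this interpreter version.
opnames = [ins.opname for ins in dis.get_instructions(slangify)]

print(raw)
print(opnames)
```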
  • 44. Anatomy of a memory leak PyObject *buggier() { PyObject *lst = PyList_New(10); return Py_BuildValue("Oi", lst, 10); // increments refcount } // read the doc carefully before using *any* C API function