This document discusses various techniques for optimizing Python code, including:
1. Using the right algorithms and data structures to minimize time complexity, such as choosing lists, sets or dictionaries based on needed functionality.
2. Leveraging Python-specific optimizations for string concatenation, lookups, loops and imports.
3. Profiling code with tools like timeit, cProfile and visualizers to identify bottlenecks before optimizing.
4. Optimizing only after validating a performance need and starting with general strategies before rewriting hotspots in Python or other languages. Premature optimization can complicate code.
7. When to start?
Need for optimization
are you sure you need to do it at all?
is your code really so bad?
benchmarking
fast enough vs. faster
Time for optimization
is it worth the time to tune it?
how much time is going to be spent running that code?
8. When to start?
Cost of optimization
costly developer time
addition of new features
new bugs in algorithms
speed vs. space
Optimize only if necessary!
9. Where to start?
Are you sure you're done coding?
frosting a half-baked cake
Premature optimization is the root of all evil!
- Donald Knuth
Working, well-architected code is always a must
10. General strategies
Algorithms - the big-O notation
Architecture
Choice of Data structures
LRU techniques
Loop invariant code out of loops
Nested loops
try...except instead of if...else (EAFP style)
Multithreading for I/O bound code
DBMS instead of flat files
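The try...except point above is Python's EAFP style ("easier to ask forgiveness than permission"): when the common case succeeds, one try block beats an if check on every iteration. A minimal sketch (the word-count example is invented for illustration):

```python
# EAFP: assume the key exists and handle the rare failure,
# instead of testing with `if` before every access (LBYL).
def count_words(words):
    counts = {}
    for w in words:
        try:
            counts[w] += 1   # common case: key already present
        except KeyError:
            counts[w] = 1    # rare case: first occurrence
    return counts

print(count_words(["spam", "egg", "spam"]))  # {'spam': 2, 'egg': 1}
```

The exception path is only paid when the key is actually missing, so this wins when misses are rare; for the miss-heavy case, collections.defaultdict avoids the exception entirely.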
11. General strategies
Big-O - The Boss!
performance of the algorithms
a function of N - the input size to the algorithm
O(1) - constant time
O(log n) - logarithmic
O(n) - linear
O(n^2) - quadratic
12. Common big-O's
Order       Said to be "... time"   Examples
--------------------------------------------------
O(1)        constant                key in dict; dict[key] = value; list.append(item)
O(log n)    logarithmic             binary search
O(n)        linear                  item in sequence; str.join(list)
O(n log n)  n log n                 list.sort()
O(n^2)      quadratic               nested loops (with constant-time bodies)
13. Note the notation
O(N^2):
def slow(it):
    result = []
    for item in it:
        result.insert(0, item)    # insert at index 0 shifts every element: O(N) each time
    return result

O(N):
def fast(it):
    result = []
    for item in it:
        result.append(item)       # append at the right end: O(1) each time
    result.reverse()
    return result

or simply:
result = list(it)
result.reverse()
14. Big-O’s of Python Building blocks
lists - vectors
dictionaries - hash tables
sets - hash tables
15. Big-O's of Python Building blocks
Let L be any list, T any string (plain or Unicode), D any dict, and S any set, with (say) numbers as items (with O(1) hashing and comparison), and x any number:
O(1) - len(L), len(T), len(D), len(S), L[i], T[i], D[i], del D[i], if x in D, if x in S, S.add(x), S.remove(x), additions or removals to/from the right end of L
16. Big-O's of Python Building blocks
O(N) - loops on L, T, D, S; general additions or removals to/from L (not at the right end); all methods on T; if x in L, if x in T; most methods on L; all shallow copies
O(N log N) - L.sort() in general (but O(N) if L is already nearly sorted or reverse-sorted)
17. Right Data Structure
lists, sets, dicts, tuples
collections - deque, defaultdict, namedtuple
Choose them based on the functionality
search an element in a sequence
append
intersection
remove from middle
dictionary initializations
18. Right Data Structure
my_list = range(n)
n in my_list              # O(n) - scans the whole list
my_list = set(range(n))
n in my_list              # O(1) - a single hash lookup
my_list[start:end] = []   # delete a middle slice from a list
my_deque.rotate(-end)     # deque alternative: rotate the slice to the right end,
for counter in range(end - start):
    my_deque.pop()        # pop the unwanted items off,
my_deque.rotate(start)    # and rotate the rest back into place
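The deque idiom above can be completed into a self-contained sketch: rotate the doomed slice to the right end, pop it off, then rotate back (the start/end values below are arbitrary examples):

```python
from collections import deque

def delete_slice(dq, start, end):
    """Remove dq[start:end] using rotations and right-end pops."""
    dq.rotate(-end)                 # bring elements [start:end] to the right end
    for _ in range(end - start):
        dq.pop()                    # pop the unwanted items off the right
    dq.rotate(start)                # rotate the survivors back into place

dq = deque(range(10))
delete_slice(dq, 3, 6)
print(list(dq))                     # [0, 1, 2, 6, 7, 8, 9]
```

Each rotate is O(min(k, n-k)) and each pop is O(1), which avoids shifting every trailing element as a list slice deletion would.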
19. Right Data Structure
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)
d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

d = {}
for k, v in s:
    d.setdefault(k, []).append(v)
d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
21. Built-ins
- Highly optimized
- Sort a list of tuples by its n-th field
def sortby(somelist, n):
    nlist = [(x[n], x) for x in somelist]
    nlist.sort()
    return [val for (key, val) in nlist]

n = 1
import operator
somelist.sort(key=operator.itemgetter(n))
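For comparison, both versions in one runnable sketch (the sample data is invented): the decorate-sort-undecorate helper, and the modern equivalent that lets sort() do the decoration via key=.

```python
import operator

def sortby(somelist, n):
    # decorate-sort-undecorate: build (key, item) pairs, sort, strip keys
    nlist = [(x[n], x) for x in somelist]
    nlist.sort()
    return [val for (key, val) in nlist]

pairs = [("b", 2), ("a", 3), ("c", 1)]
print(sortby(pairs, 1))                           # [('c', 1), ('b', 2), ('a', 3)]

# modern equivalent: key= computes the sort key once per item, in C
print(sorted(pairs, key=operator.itemgetter(1)))  # [('c', 1), ('b', 2), ('a', 3)]
```

The key= form avoids building the intermediate list of tuples and is the idiomatic spelling today.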
22. String Concatenation
s = ""
for substring in list:
    s += substring

s = "".join(list)

out = "<html>" + head + prologue + query + tail + "</html>"
out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
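A runnable version of the comparison (head, prologue, query and tail are stand-in values):

```python
head, prologue, query, tail = "<head/>", "<p/>", "?q=1", "<footer/>"

# repeated += may copy the growing string each time: quadratic in the worst case
s = ""
for substring in [head, prologue, query, tail]:
    s += substring

# join computes the final size once and copies each piece once: linear
out = "".join(["<html>", head, prologue, query, tail, "</html>"])
print(out)  # <html><head/><p/>?q=1<footer/></html>
```

For a handful of pieces the difference is negligible; join pays off when concatenating many substrings in a loop.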
23. Searching:
using 'in'
O(1) if RHS is a set/dictionary
O(N) if RHS is a string/list/tuple
using 'hasattr'
cheap if the searched attribute exists
slower if it does not (an AttributeError is raised and swallowed internally)
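A small sketch of both points: convert a list to a set once when many membership tests follow, and use hasattr for attribute searches (the data and the Config class are invented):

```python
# O(N) per test: 'in' scans the list element by element
haystack_list = list(range(10000))
print(9999 in haystack_list)       # True, after ~10000 comparisons

# O(1) per test: 'in' hashes the key and probes the table
haystack_set = set(haystack_list)  # one-time O(N) conversion
print(9999 in haystack_set)        # True, constant time

# hasattr() for attribute searches
class Config:
    debug = True

print(hasattr(Config, "debug"))    # True - lookup succeeds immediately
print(hasattr(Config, "verbose"))  # False - AttributeError raised and swallowed
```

The set conversion only pays off when the number of subsequent lookups outweighs the one-time O(N) build cost.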
24. Loops:
list comprehensions
map as a for loop moved to C - a win if the body of the loop is a function call
newlist = []
for word in oldlist:
    newlist.append(word.upper())

newlist = [s.upper() for s in oldlist]
newlist = map(str.upper, oldlist)
(in Python 3, map() returns a lazy iterator - wrap it in list() if a list is needed)
25. Lookups and Local variables:
hoist function references out of loops
accessing local variables vs. global variables
upper = str.upper
newlist = []
append = newlist.append
for word in oldlist:
    append(upper(word))
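Wrapped into a function, the same idea also makes every name a fast local lookup rather than a global one (a sketch; oldlist is example data):

```python
def upper_words(oldlist):
    # hoist the attribute lookups out of the loop; inside a function
    # these bindings are fast locals, not repeated global/attribute lookups
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in oldlist:
        append(upper(word))
    return newlist

print(upper_words(["ham", "spam"]))  # ['HAM', 'SPAM']
```

This micro-optimization only matters in genuinely hot loops; profile first, as the later slides insist.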
26. Dictionaries
Initialization -- try...except
Lookups -- string.maketrans
Regular expressions:
REs are better than writing a loop
Built-in string functions are better than REs
Compiled REs are significantly faster
re.search('^[A-Za-z]+$', source)
x = re.compile('^[A-Za-z]+$').search
x(source)
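The compiled form above, end-to-end (the sample inputs are invented): compile once, bind the bound method to a name, and reuse it.

```python
import re

# compile once, reuse many times; binding .search to a local name
# also skips the attribute lookup on each call
is_alpha = re.compile(r'^[A-Za-z]+$').search

print(bool(is_alpha("hello")))    # True  - all letters
print(bool(is_alpha("hello42")))  # False - digits present, no match (returns None)
```

The module-level re.search caches compiled patterns too, but explicit compilation avoids the cache lookup and documents the reuse.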
27. Imports
avoid import *
import only when required (inside functions)
lazy imports
exec and eval
better to avoid
if unavoidable, compile once and evaluate the code object
28. Summary on loop optimization (extracted from an essay by Guido)
only optimize when there is a proven speed bottleneck
small is beautiful
use intrinsic operations
avoid calling functions written in Python in your inner loop
local variables are faster than globals
try to use map(), filter() or reduce() to replace an explicit for loop (map() wins with a built-in function; a for loop wins when the body is inline code)
check your algorithms for quadratic behaviour
and last but not least: collect data. Python's excellent profile module can quickly show the bottleneck in your code
29. Bottlenecks are often unintentional - better not to trust intuition!
The right answer to improve performance
- Use PROFILERS
30. Spot it Right!
Hotspots
Fact vs. fiction (the profiler vs. the programmer's intuition!)
Threads
IO operations
Logging
Encoding and Decoding
Lookups
Rewrite just the hotspots!
Psyco/Pyrex
C extensions
32. timeit
precise performance of small code snippets
the two convenience functions - timeit and repeat
timeit.repeat(stmt[, setup[, timer[, repeat=3[, number=1000000]]]])
timeit.timeit(stmt[, setup[, timer[, number=1000000]]])
can also be used from the command line:
python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
33. timeit
import timeit
timeit.timeit('for i in xrange(10): oct(i)', 'gc.enable()')
1.7195474706909972
timeit.timeit('for i in range(10): oct(i)', 'gc.enable()')
2.1380978155005295
python -m timeit -n1000 -s'x=0' 'x+=1'
1000 loops, best of 3: 0.0166 usec per loop
python -m timeit -n1000 -s'x=0' 'x=x+1'
1000 loops, best of 3: 0.0169 usec per loop
34. timeit
import timeit
python -m timeit "try:" "  str.__nonzero__" "except AttributeError:" "  pass"
1000000 loops, best of 3: 1.53 usec per loop
python -m timeit "try:" "  int.__nonzero__" "except AttributeError:" "  pass"
10000000 loops, best of 3: 0.102 usec per loop
35. timeit
test_timeit.py:
def f():
    try:
        str.__nonzero__
    except AttributeError:
        pass

if __name__ == '__main__':
    f()

python -m timeit -s "from test_timeit import f" "f()"
100000 loops, best of 3: 2.5 usec per loop
36. cProfile/profile
Deterministic profiling
Run-time performance, with statistics
Small snippets bring big changes!
import cProfile
cProfile.run(command[, filename])
python -m cProfile [-o output_file] [-s sort_order] myscript.py