Introduction Basic stuff Notes on memory model Memory profiling tools Summary References
Everything You Always Wanted to Know About
Memory in Python
But Were Afraid to Ask
Piotr Przymus
Nicolaus Copernicus University
EuroPython 2014, Berlin
P. Przymus 1/31
About Me
Piotr Przymus
PhD student / Research Assistant at Nicolaus Copernicus University.
Interests: databases, GPGPU computing, data mining.
8 years of Python experience.
Some of my Python projects:
Parts of trading platform in turbineam.com (back testing, trading
algorithms)
Mussels bio-monitoring analysis and data mining software.
Simulator of heterogeneous processing environment for evaluation of
database query scheduling algorithms.
Size of objects
Table: Size of different types in bytes
Type                              Python 32 bit               Python 64 bit
int (py-2.7)                      12                          24
long (py-2.7) / int (py-3.3)      14 + 2 · number of digits   30 + 2 · number of digits
float                             16                          24
complex                           24                          32
str (py-2.7) / bytes (py-3.3)     24 + 2 · length             40 + 2 · length
unicode (py-2.7) / str (py-3.3)   28 + (2 or 4) · length      52 + (2 or 4) · length
tuple                             24 + (4 · length)           64 + (8 · length)
DIY – check size of objects
sys.getsizeof(obj)
From documentation
Since Python 2.6
Return the size of an object in bytes. The object can be any type.
All built-in objects will return correct results.
May not be true for third-party extensions as it is implementation
specific.
Calls the object’s __sizeof__ method and adds an additional garbage
collector overhead if the object is managed by the garbage collector.
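A minimal sketch of what sys.getsizeof() does and does not report. The call is shallow: it counts the container itself, not the objects it refers to. The helper total_size() below is a hypothetical name introduced for illustration; it only walks built-in containers and is an approximation, not a full heap measurement.

```python
import sys


def total_size(obj, seen=None):
    """Rough recursive size: getsizeof(obj) plus the objects it contains."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size


data = [[1.5, 2.5], ("a", "b"), {"key": [1, 2, 3]}]
shallow = sys.getsizeof(data)    # the outer list only
deep = total_size(data)          # outer list plus everything reachable
print(shallow, deep)
```

The exact numbers depend on the interpreter version and on a 32 or 64 bit build, so none are quoted here.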
Objects interning – fun example
a = [i % 257 for i in xrange(2**20)]
Listing 1: List of interned integers
b = [1024 + i % 257 for i in xrange(2**20)]
Listing 2: List of integers
Any allocation difference between Listing 1 and Listing 2?
Results measured using psutil:
Listing 1 – (resident=15.1M, virtual=2.3G)
Listing 2 – (resident=39.5M, virtual=2.4G)
Objects interning – explained
Objects and variables – general rule
Objects are allocated on assignment
Variables just point to objects (i.e. they do not hold the memory)
Interning of Objects
This is an exception to the general rule.
Python implementation specific (examples from CPython).
"Often" used objects are preallocated and shared instead of paying for a
costly new allocation.
Mainly a performance optimization.
>>> a = 0
>>> b = 0
>>> a is b, a == b
(True, True)
Listing 5: Interning of Objects
>>> a = 1024
>>> b = 1024
>>> a is b, a == b
(False, True)
Listing 6: Objects allocation
Objects interning – behind the scenes
Warning
This is Python implementation dependent.
This may change in the future.
This is not documented because of the above reasons.
For reference consult the source code.
CPython 2.7 - 3.4
Single instances for:
int – in range [−5, 257)
str / unicode – empty string and all length=1 strings
unicode / str – empty string and all length=1 strings for Latin-1
tuple – empty tuple
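The cached integer range can be probed at runtime rather than taken on trust. This is a sketch under the assumption of a CPython interpreter; is_shared() and shared_int_range() are hypothetical helper names. Values are built from strings so that the compiler cannot merge equal constants, which would hide the effect.

```python
import platform


def is_shared(n):
    """True if two independently created ints with value n are one object."""
    return int(str(n)) is int(str(n))


def shared_int_range(lo=-50, hi=2000):
    """Smallest and largest value in [lo, hi) for which is_shared() holds."""
    shared = [n for n in range(lo, hi) if is_shared(n)]
    return (shared[0], shared[-1]) if shared else None


print(platform.python_implementation(), shared_int_range())
```

On the CPython versions covered by the slide the result should match the range given above; other implementations and future versions may report something else.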
String interning – example
>>> a, b = "strin", "string"
>>> a + 'g' is b  # returns False
>>> intern(a + 'g') is intern(b)  # returns True
>>> a = ["spam %d" % (i % 257)
...      for i in xrange(2**20)]
>>> # memory usage (resident=57.6M, virtual=2.4G)
>>> a = [intern("spam %d" % (i % 257))
...      for i in xrange(2**20)]
>>> # memory usage (resident=14.9M, virtual=2.3G)
Listing 7: String interning
String interning – explained
String interning definition
String interning is a method of storing only one copy of each distinct string
value, which must be immutable.
intern (py-2.x) / sys.intern (py-3.x)
From CPython documentation:
Enter string in the table of “interned” strings.
Return the interned string (string or string copy).
Useful to gain a little performance on dictionary lookup (key
comparisons after hashing can be done by a pointer compare instead of
a string compare).
Names used in programs are automatically interned
Dictionaries used to hold module, class or instance attributes have
interned keys.
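A small Python 3 sketch of the same idea using sys.intern. Instead of measuring process memory, it counts how many distinct string objects a list refers to; count_distinct_objects() is a hypothetical helper added for illustration.

```python
import sys


def count_distinct_objects(strings):
    """Number of different objects (by identity) referenced by the list."""
    return len({id(s) for s in strings})


# Strings built at runtime are normally separate objects, even when equal.
plain = ["spam %d" % (i % 257) for i in range(10000)]
# sys.intern() returns one shared object per distinct value.
interned = [sys.intern("spam %d" % (i % 257)) for i in range(10000)]

print(count_distinct_objects(plain), count_distinct_objects(interned))
```

Both lists compare equal; only the number of underlying objects differs, which is where the memory saving comes from.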
Mutable Containers Memory Allocation Strategy
Plan for growth and shrinkage
Slightly overallocate memory needed by container.
Leave room for growth.
Shrink when overallocation threshold is reached.
Reduce number of expensive function calls:
realloc()
memcpy()
Use optimal layout.
List, Sets, Dictionaries
List allocation – example
Figure: List growth example
List allocation strategy
Represented as fixed-length array of pointers.
Overallocation for list growth (by append)
List size growth: 4, 8, 16, 25, 35, 46, . . .
For large lists the overallocation is less than 12.5%
Due to the memory actions involved, operations:
at end of list are cheap (rare realloc),
in the middle or beginning require memory copy or shift!
Note that for lists of 1, 2 or 5 elements, space is wasted.
List allocation size:
32 bits – 32 + (4 * length)
64 bits – 72 + (8 * length)
Shrinking only when list size < 1/2 of allocated space.
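The overallocation pattern can be observed from Python by watching sys.getsizeof() while appending. This is a sketch assuming CPython; list_growth_points() is a hypothetical helper, and the exact growth steps differ between interpreter versions.

```python
import sys


def list_growth_points(n):
    """Return (length, size_in_bytes) each time an append changes the size."""
    items = []
    last = sys.getsizeof(items)
    points = [(0, last)]
    for _ in range(n):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last:          # a reallocation happened on this append
            points.append((len(items), size))
            last = size
    return points


for length, size in list_growth_points(64):
    print(length, size)
```

Most appends leave the reported size unchanged, which is why appending at the end is cheap on average.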
Overallocation of dictionaries/sets
Represented as fixed-length hash tables.
Overallocation for dict/sets – when 2/3 of capacity is reached.
if number of elements < 50000: quadruple the capacity
else: double the capacity
// dict growth strategy
(mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
// set growth strategy
(so->used > 50000 ? so->used * 2 : so->used * 4);
Dict/Set growth/shrink code
for (newsize = PyDict_MINSIZE;
     newsize <= minused && newsize > 0;
     newsize <<= 1);
4
Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger
than the number of used elements (real elements), i.e. many keys have been deleted.
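The resize steps of a dictionary can be watched in the same way as for lists. This is a sketch assuming CPython; dict_growth_points() is a hypothetical helper. Newer CPython versions use a different dict layout and growth factors than the 2.7 code quoted above, so the step positions will not match those constants.

```python
import sys


def dict_growth_points(n):
    """Return (number_of_keys, size_in_bytes) each time an insert resizes."""
    table = {}
    last = sys.getsizeof(table)
    points = [(0, last)]
    for i in range(n):
        table[i] = None
        size = sys.getsizeof(table)
        if size != last:          # the hash table was reallocated
            points.append((len(table), size))
            last = size
    return points


for used, size in dict_growth_points(1000):
    print(used, size)
```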
Various data representation
# Fields: field1, field2, field3, ..., field8
# Data: "foo 1", "foo 2", "foo 3", ..., "foo 8"
class OldStyleClass:  # only py-2.x
    ...
class NewStyleClass(object):  # default for py-3.x
    ...
class NewStyleClassSlots(object):
    __slots__ = ('field1', 'field2', ...)
    ...
import collections as c
NamedTuple = c.namedtuple('nt', ['field1', ...])
TupleData = ('value1', 'value2', ...)
ListaData = ['value1', 'value2', ...]
DictData = {'field1': 'value1', 'field2': 'value2', ...}
Listing 8: Various data representation
Various data representation – allocated memory
Figure: Allocated memory after creating 100000 objects with 8 fields each (bar chart, axis from 0 MB to 150 MB, Python 2.x vs Python 3.x; bars: OldStyleClass, NewStyleClass, DictData, NamedTuple, TupleData, ListaData, NewStyleClassWithSlots)
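A comparison of this kind can be approximated in Python 3 with the standard tracemalloc module. This is only a sketch: it is not the measurement method used for the figure, the class names PlainClass and SlotsClass and the helpers allocated_bytes() and measure_all() are hypothetical, and old-style classes do not exist in Python 3.

```python
import tracemalloc
from collections import namedtuple

FIELDS = tuple("field%d" % i for i in range(1, 9))
VALUES = ["foo %d" % i for i in range(1, 9)]   # shared by all objects


class PlainClass(object):
    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


class SlotsClass(object):
    __slots__ = FIELDS

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


Record = namedtuple("Record", FIELDS)

FACTORIES = {
    "PlainClass": lambda: PlainClass(*VALUES),
    "SlotsClass": lambda: SlotsClass(*VALUES),
    "namedtuple": lambda: Record(*VALUES),
    "tuple": lambda: tuple(VALUES),
    "list": lambda: list(VALUES),
    "dict": lambda: dict(zip(FIELDS, VALUES)),
}


def allocated_bytes(factory, count):
    """Bytes traced by tracemalloc while `count` objects are kept alive."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    objects = [factory() for _ in range(count)]   # keep them alive
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


def measure_all(count=10000):
    return {name: allocated_bytes(factory, count)
            for name, factory in FACTORIES.items()}


for name, size in sorted(measure_all().items(), key=lambda item: item[1]):
    print("%-12s %8.1f KiB" % (name, size / 1024.0))
```

Because all objects share the same eight value strings, the numbers reflect container overhead only.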
Notes on garbage collector, reference count and cycles
Python garbage collector
Uses reference counting.
Offers cycle detection.
Objects garbage-collected when count goes to 0.
Reference increment, e.g.: object creation, additional aliases, passed to
function
Reference decrement, e.g.: local reference goes out of scope, alias is
destroyed, alias is reassigned
Warning – from documentation
Objects that have __del__() methods and are part of a reference cycle cause
the entire reference cycle to be uncollectable!
Python doesn’t collect such cycles automatically.
It is not possible for Python to guess a safe order in which to run the
__del__() methods.
Note: this applies to CPython before 3.4; since 3.4 (PEP 442) such cycles can be collected.
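The difference between reference counting and cycle detection can be shown with a weak reference as a probe. This is a sketch assuming CPython, where an object is freed as soon as its count reaches 0; Node and both function names are hypothetical.

```python
import gc
import weakref


class Node(object):
    def __init__(self):
        self.other = None


def acyclic_object_freed_immediately():
    """Without a cycle, dropping the last reference frees the object."""
    node = Node()
    probe = weakref.ref(node)
    del node
    return probe() is None


def cycle_needs_collector():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                      # keep the automatic collector out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a       # reference cycle: a -> b -> a
        probe = weakref.ref(a)
        del a, b                      # counts never reach 0 because of the cycle
        alive_before = probe() is not None
        gc.collect()                  # the cycle detector reclaims both nodes
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(acyclic_object_freed_immediately(), cycle_needs_collector())
```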
Tools
psutil
memory_profiler
objgraph
Meliae (could be combined with runsnakerun)
Heapy
Tools – psutil
psutil – A cross-platform process and system utilities module for Python.
import psutil
import os
...
# Inspect the current process.
p = psutil.Process(os.getpid())
pinfo = p.as_dict()
...
# Share of system memory, then resident and virtual sizes in bytes.
print pinfo['memory_percent'],
print pinfo['memory_info'].rss, pinfo['memory_info'].vms
Listing 9: psutil example
Tools – memory_profiler
memory_profiler – a module for monitoring memory usage of a Python
program.
Recommended dependency: psutil.
May work as:
Line-by-line profiler.
Memory usage monitoring (memory in time).
Debugger trigger – setting debugger breakpoints.
memory_profiler – Line-by-line profiler
Preparation
To track particular functions use the @profile decorator.
Running
python -m memory_profiler
Line #    Mem usage    Increment   Line Contents
================================================
    45    9.512 MiB    0.000 MiB   @profile
    46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
    47    9.516 MiB    0.004 MiB       ret = []
    48    9.516 MiB    0.000 MiB       t = "foo %d"
    49  156.449 MiB  146.934 MiB       for i in xrange(times):
    50  156.445 MiB   -0.004 MiB           l = [t % (j + i%8) for j in xrange(8)]
    51  156.449 MiB    0.004 MiB           c = cl(*l)
    52  156.449 MiB    0.000 MiB           ret.append(c)
    53  156.449 MiB    0.000 MiB       return ret
Listing 10: Results
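When memory_profiler is not installed, a rough per-line view is also available from the standard library. The sketch below swaps in tracemalloc (Python 3.4+), which is a different tool with different output: it attributes allocated blocks to source lines instead of sampling process memory. build_rows() and top_allocating_lines() are hypothetical names.

```python
import tracemalloc


def build_rows(times=10000):
    """Allocate many small lists of strings, similar to the profiled loop."""
    template = "foo %d"
    rows = []
    for i in range(times):
        rows.append([template % (j + i % 8) for j in range(8)])
    return rows


def top_allocating_lines(func, limit=3):
    """Run func under tracemalloc and return its result and the top lines."""
    tracemalloc.start()
    result = func()
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()
    stats = snapshot.statistics("lineno")   # sorted, biggest first
    return result, stats[:limit]


rows, top = top_allocating_lines(build_rows)
for stat in top:
    print(stat)
```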
memory_profiler – memory usage monitoring
Preparation
To track particular functions use the @profile decorator.
Running and plotting
mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
mprof plot
Figure: Results
memory_profiler – Debugger trigger
eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
Current memory 20.80 MiB exceeded the maximum of 10.00 MiB
Stepping into the debugger
> /home/eror/uniwerse.py(52)connect()
-> self.adj.append(n)
(Pdb)
Listing 11: Debugger trigger – setting debugger breakpoints.
Tools – objgraph
objgraph – draws Python object reference graphs with graphviz.
import objgraph
x = []
y = [x, [x], dict(x=x)]
objgraph.show_refs([y], filename='sample-graph.png')
objgraph.show_backrefs([x], filename='sample-backref-graph.png')
Listing 12: Tutorial example
Figure: Reference graph
Figure: Back reference graph
Tools – Heapy/Meliae
Heapy
The heap analysis toolset. It can be used to find information about the
objects in the heap and display the information in various ways.
Part of "Guppy-PE – A Python Programming Environment"
Meliae
Python Memory Usage Analyzer
"This project is similar to heapy (in the 'guppy' project), in its attempt
to understand how memory has been allocated."
runsnakerun GUI support.
Tools – Heapy
from guppy import hpy
hp = hpy()
h1 = hp.heap()
l = [range(i) for i in xrange(2**10)]
h2 = hp.heap()
print h2 - h1
Listing 13: Heapy example
Partition of a set of 294937 objects. Total size = 11538088 bytes.
 Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
     0 293899 100  7053576  61    7053576  61 int
     1   1025   0  4481544  39   11535120 100 list
     2      6   0     1680   0   11536800 100 dict (no owner)
     3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
     4      1   0      456   0   11537816 100 types.FrameType
     5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
     6      2   0      128   0   11538088 100 str
Listing 14: Results
Meliae and runsnakerun
from meliae import scanner
scanner.dump_all_objects("representation_meliae.dump")
# In shell: runsnakemem representation_meliae.dump
Listing 15: Meliae example
Figure: Meliae and runsnakerun
malloc() alternatives – libjemalloc and libtcmalloc
Pros:
In some cases using a different malloc() implementation "may" help to
return memory from CPython back to the system.
Cons:
But equally it may work against you.
$ LD_PRELOAD="/usr/lib/libjemalloc.so.1" python int_float_alloc.py
$ LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4" python int_float_alloc.py
Listing 16: Changing memory allocator
malloc() alternatives – libjemalloc and libtcmalloc
Step     malloc          jemalloc         tcmalloc
         res    virt     res    virt      res    virt
step 1   7.4M   46.5M    8.0M   56.9M     9.4M   56.1M
step 2   40.0M  79.1M    41.6M  88.9M     42.5M  89.3M
step 3   16.2M  55.3M    8.2M   88.9M     42.5M  89.3M
step 4   40.0M  84.3M    41.5M  100.9M    51.5M  98.4M
step 5   8.2M   47.3M    8.5M   100.9M    51.5M  98.4M
Other useful tools
Build Python in debug mode (./configure --with-pydebug ...).
Maintains list of all active objects.
Upon exit (or every statement in interactive mode), print all existing
references.
Track total allocation.
valgrind – a programming tool for memory debugging, leak detection,
and profiling. Rather low level.
CPython can cooperate with valgrind (for >= py-2.7, py-3.2)
gdb-heap (gdb extension)
low level, still experimental
can be attached to running processes
may be used with core file
Web applications memory leaks
dowser – cherrypy application that displays sparklines of python object
counts.
dozer – wsgi middleware version of the cherrypy memory leak debugger
(any wsgi application).
Summary
Summary:
Try to better understand the underlying memory model.
Pay attention to hot spots.
Use profiling tools.
"Seek and destroy" – find the root cause of the memory leak and fix it ;)
Quick and sometimes dirty solutions:
Delegate memory-intensive work to another process.
Regularly restart the process.
Go for low-hanging fruit (e.g. __slots__, different allocators).
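Delegating memory-intensive work to another process might look like the following sketch. The large temporary list exists only in a child interpreter, so its memory goes back to the system when the child exits. CHILD_CODE and run_in_child() are hypothetical names; real code would more likely pass data through multiprocessing or a job queue.

```python
import subprocess
import sys

# Code executed by the child interpreter; the big list never exists in the parent.
CHILD_CODE = """
import sys
n = int(sys.argv[1])
data = [i % 257 for i in range(n)]
print(sum(data))
"""


def run_in_child(n):
    """Run the memory-hungry computation in a short-lived child process."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(n)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return int(completed.stdout)


print(run_in_child(2**20))
```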
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask

  • 1. Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask
      Piotr Przymus, Nicolaus Copernicus University
      EuroPython 2014, Berlin
      Outline: Introduction, Basic stuff, Notes on memory model, Memory profiling tools, Summary, References
  • 2. About Me
      Piotr Przymus, PhD student / Research Assistant at Nicolaus Copernicus University.
      Interests: databases, GPGPU computing, data mining.
      8 years of Python experience.
      Some of my Python projects:
        • Parts of the trading platform at turbineam.com (backtesting, trading algorithms).
        • Mussels bio-monitoring analysis and data mining software.
        • Simulator of a heterogeneous processing environment for evaluating database query scheduling algorithms.
  • 3. Size of objects
      Table: Size of different types in bytes

      Type                               32 bit   64 bit   Variable part
      int (py-2.7)                       12       24
      long (py-2.7) / int (py-3.3)       14       30       + 2 · number of digits
      float                              16       24
      complex                            24       32
      str (py-2.7) / bytes (py-3.3)      24       40       + 2 · length
      unicode (py-2.7) / str (py-3.3)    28       52       + (2 or 4) · length
      tuple                              24       64       + 4 · length (32 bit), + 8 · length (64 bit)
  • 4. DIY: check size of objects
      sys.getsizeof(obj), from the documentation:
        • Available since Python 2.6.
        • Returns the size of an object in bytes. The object can be any type.
        • All built-in objects will return correct results.
        • This may not hold for third-party extensions, as it is implementation specific.
        • Calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
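A minimal sketch of `sys.getsizeof` in use, written for Python 3 (the slides mostly target Python 2.7). The sample values and the `per_item` helper variable are illustrative assumptions. Exact numbers depend on the interpreter version and platform, so no expected output is shown. Note that `sys.getsizeof` reports only the memory directly attributed to the object, not the objects it refers to.

```python
import sys

# Per-slot cost of a tuple: compare an empty tuple with a three-item one.
empty_tuple = ()
small_tuple = (1, 2, 3)
per_item = (sys.getsizeof(small_tuple) - sys.getsizeof(empty_tuple)) // len(small_tuple)

# A few built-in objects similar to the rows of the size table.
samples = [0, 2**70, 1.0, 1j, b"", "", (), []]
for obj in samples:
    print(f"{type(obj).__name__:8} {obj!r:26} {sys.getsizeof(obj)} bytes")

# One pointer per slot: typically 4 bytes on 32-bit and 8 bytes on 64-bit builds.
print("bytes per tuple slot:", per_item)
```

The per-slot figure should line up with the "+ 4 · length" and "+ 8 · length" entries for tuples in the size table.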
  • 5. Objects interning: fun example
          a = [i % 257 for i in xrange(2**20)]
      Listing 1: List of interned integers
          b = [1024 + i % 257 for i in xrange(2**20)]
      Listing 2: List of integers
      Any allocation difference between Listing 1 and Listing 2?
      Results measured using psutil:
        • Listing 1: resident=15.1M, virtual=2.3G
        • Listing 2: resident=39.5M, virtual=2.4G
  • 7. Objects interning: explained
      Objects and variables, the general rule:
        • Objects are allocated on assignment.
        • Variables just point to objects (i.e. they do not hold the memory).
      Interning of objects:
        • This is an exception to the general rule.
        • It is Python implementation specific (examples are from CPython).
        • "Often" used objects are preallocated and shared instead of paying for a costly new allocation.
        • It exists mainly as a performance optimization.
          >>> a = 0
          >>> b = 0
          >>> a is b, a == b
          (True, True)
      Listing 5: Interning of objects
          >>> a = 1024
          >>> b = 1024
          >>> a is b, a == b
          (False, True)
      Listing 6: Objects allocation
  • 8. Objects interning: behind the scenes
      Warning:
        • This is Python implementation dependent.
        • This may change in the future.
        • This is not documented because of the above reasons.
        • For reference consult the source code.
      CPython 2.7 - 3.4 keeps single instances for:
        • int: values in the range [−5, 257)
        • str / unicode: empty string and all length=1 strings
        • unicode / str: empty string and all length=1 strings for Latin-1
        • tuple: empty tuple
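The small-integer cache can be probed with a short Python 3 sketch. The helper name `same_object` is hypothetical. Both integers are built at run time from a string, because two equal literals compiled together may end up sharing one constant and hide the effect. The result is CPython-specific and, as the warning says, may change.

```python
import platform


def same_object(value):
    """Return True if two independently created ints share one object."""
    # int(str(...)) forces creation at run time, avoiding compile-time
    # constant sharing between equal literals.
    first = int(str(value))
    second = int(str(value))
    return first is second


print("implementation:", platform.python_implementation())
for value in (-6, -5, 0, 256, 257, 1024):
    print(value, same_object(value))
```

On CPython the values inside [−5, 257) are expected to report a shared object, while values such as 1024 are not.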
  • 9. String interning: example
          >>> a, b = "strin", "string"
          >>> a + 'g' is b  # returns False
          >>> intern(a + 'g') is intern(b)  # returns True
          >>> a = ["spam %d" % (i % 257)
          ...      for i in xrange(2**20)]
          >>> # memory usage (resident=57.6M, virtual=2.4G)
          >>> a = [intern("spam %d" % (i % 257))
          ...      for i in xrange(2**20)]
          >>> # memory usage (resident=14.9M, virtual=2.3G)
      Listing 7: String interning
  • 10. String interning: explained
      String interning definition: string interning is a method of storing only one copy of each distinct string value, which must be immutable.
      intern (py-2.x) / sys.intern (py-3.x), from the CPython documentation:
        • Enters the string in the table of "interned" strings.
        • Returns the interned string (the string itself or a copy).
        • Useful to gain a little performance on dictionary lookup (key comparisons after hashing can be done by a pointer compare instead of a string compare).
        • Names used in programs are automatically interned.
        • Dictionaries used to hold module, class or instance attributes have interned keys.
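A minimal Python 3 sketch of the same idea using `sys.intern`. The strings are assembled at run time so that they typically start out as separate objects; whether they do is an implementation detail, so only the behaviour after interning is relied upon. Variable names are illustrative.

```python
import sys

# Two equal strings built at run time; on typical CPython builds these are
# distinct objects, but the language does not guarantee it.
first = " ".join(["spam", "42"])
second = " ".join(["spam", "42"])

equal_values = first == second

# sys.intern returns the single shared copy for each distinct value.
interned_first = sys.intern(first)
interned_second = sys.intern(second)
shared_after_intern = interned_first is interned_second

print("equal values:", equal_values)
print("same object before intern:", first is second)
print("same object after intern:", shared_after_intern)
```

Keeping only the interned references, as in the list comprehension of Listing 7, is what lets the duplicate copies be freed.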
  • 11. Mutable containers: memory allocation strategy
      Plan for growth and shrinkage:
        • Slightly overallocate the memory needed by the container.
        • Leave room for growth.
        • Shrink when the overallocation threshold is reached.
      Reduce the number of expensive function calls:
        • realloc()
        • memcpy()
      Use an optimal layout.
      Applies to lists, sets and dictionaries.
  • 12. List allocation: example
      Figure: List growth example
  • 13. List allocation strategy
        • Represented as a fixed-length array of pointers.
        • Overallocation for list growth (by append).
        • List size growth: 4, 8, 16, 25, 35, 46, ...
        • For large lists the overallocation is less than 12.5%.
        • Due to the memory actions involved, operations:
            at the end of the list are cheap (rare realloc),
            in the middle or at the beginning require a memory copy or shift!
        • Note that for lists of 1, 2 or 5 elements, space is wasted.
        • List allocation size:
            32 bits: 32 + (4 * length)
            64 bits: 72 + (8 * length)
        • Shrinking happens only when the list size < 1/2 of the allocated space.
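The growth sequence 4, 8, 16, 25, 35, 46, ... can be reproduced with a small simulation. This is a sketch of the over-allocation rule as it appears in CPython 2.7's list resizing code, to the best of my reading; newer CPython releases use a different formula, and the function name `simulate_list_growth` is hypothetical.

```python
def simulate_list_growth(n_appends):
    """Return the successive capacities a list reaches while appending."""
    allocated = 0
    capacities = []
    for newsize in range(1, n_appends + 1):
        if newsize > allocated:
            # Assumed rule: grow by about 1/8 of the new size plus a small
            # constant (3 for tiny lists, 6 otherwise).
            allocated = newsize + (newsize >> 3) + (3 if newsize < 9 else 6)
            capacities.append(allocated)
    return capacities


growth = simulate_list_growth(50)
print(growth)
```

The 1/8 term is where the "less than 12.5%" figure for large lists comes from.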
  • 14. Overallocation of dictionaries/sets
        • Represented as fixed-length hash tables.
        • Overallocation for dicts/sets is triggered when 2/3 of the capacity is reached:
            if number of elements < 50000: quadruple the capacity
            else: double the capacity
          // dict growth strategy
          (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used
          // set growth strategy
          so->used > 50000 ? so->used * 2 : so->used * 4
      Dict/set growth/shrink code:
          for (newsize = PyDict_MINSIZE;
               newsize <= minused && newsize > 0;
               newsize <<= 1);
        • Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger than the number of used elements (real elements), i.e. a lot of keys have been deleted.
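The two C fragments above can be combined into a short Python sketch that computes the table size requested on a resize. It assumes `PyDict_MINSIZE` is 8, as in CPython 2.7; the function names are hypothetical, and the `newsize > 0` overflow guard from the C loop is dropped because Python integers do not overflow.

```python
PYDICT_MINSIZE = 8  # assumed value of PyDict_MINSIZE in CPython 2.7


def requested_size(used):
    """Quadruple small tables, double large ones (threshold 50000)."""
    return (2 if used > 50000 else 4) * used


def new_table_size(used, minsize=PYDICT_MINSIZE):
    """Smallest power-of-two multiple of minsize strictly above the request."""
    minused = requested_size(used)
    newsize = minsize
    while newsize <= minused:
        newsize <<= 1
    return newsize


for used in (6, 22, 60000):
    print(used, "->", new_table_size(used))
```

Under these assumptions a table of 8 slots that receives its sixth entry (more than 2/3 full) jumps straight to 32 slots.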
  • 15. Various data representation
          # Fields: field1, field2, field3, ..., field8
          # Data: "foo 1", "foo 2", "foo 3", ..., "foo 8"
          class OldStyleClass:  # only py-2.x
              ...
          class NewStyleClass(object):  # default for py-3.x
              ...
          class NewStyleClassSlots(object):
              __slots__ = ('field1', 'field2', ...)
              ...
          import collections as c
          NamedTuple = c.namedtuple('nt', ['field1', ...,])

          TupleData = ('value1', 'value2', ....)
          ListaData = ['value1', 'value2', ....]
          DictData = {'field1': 'value1', ....}
      Listing 8: Various data representation
  • 16. Various data representation: allocated memory
      Figure: Allocated memory after creating 100000 objects with 8 fields each. Bar chart on a 0 MB to 150 MB axis comparing Python 2.x and Python 3.x for OldStyleClass, NewStyleClass, DictData, NamedTuple, TupleData, ListaData and NewStyleClassWithSlots.
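Part of the difference between the class-based representations comes from the per-instance `__dict__`. The following Python 3 sketch, with hypothetical class names `PlainRecord` and `SlotsRecord`, shows that a class declaring `__slots__` carries no instance dictionary. The printed sizes are only indicative and vary by interpreter version, so none are quoted here.

```python
import sys

FIELDS = tuple("field%d" % i for i in range(1, 9))


class PlainRecord:
    """Regular class: attributes live in a per-instance __dict__."""

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


class SlotsRecord:
    """Class with __slots__: attributes live in fixed slots, no __dict__."""

    __slots__ = FIELDS

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)


values = ["foo %d" % i for i in range(1, 9)]
plain = PlainRecord(*values)
slotted = SlotsRecord(*values)

plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")

print("plain instance + its dict:", sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))
print("slotted instance:", sys.getsizeof(slotted))
```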
  • 17. Notes on garbage collector, reference count and cycles
      Python garbage collector:
        • Uses reference counting.
        • Offers cycle detection.
        • Objects are garbage-collected when the count goes to 0.
        • Reference increment, e.g.: object creation, additional aliases, passing to a function.
        • Reference decrement, e.g.: a local reference goes out of scope, an alias is destroyed, an alias is reassigned.
      Warning, from the documentation (this applies to Python versions before 3.4):
        • Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable!
        • Python doesn't collect such cycles automatically.
        • It is not possible for Python to guess a safe order in which to run the __del__() methods.
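Reference counting and cycle detection can be observed directly from Python 3. This is a small sketch under the assumption of a CPython interpreter; the class name `Node` and the variable names are illustrative. Absolute values returned by `sys.getrefcount` include temporary references, so only differences are compared.

```python
import gc
import sys
import weakref


class Node:
    """Plain object used to observe reference counts and cycles."""


obj = Node()
base = sys.getrefcount(obj)
alias = obj                      # an additional alias increments the count
after_alias = sys.getrefcount(obj)
del alias                        # destroying the alias decrements it again
after_del = sys.getrefcount(obj)

# Two objects referring to each other never reach a count of zero on their own.
a, b = Node(), Node()
a.other, b.other = b, a
probe = weakref.ref(a)           # lets us check whether `a` is still alive
del a, b
gc.collect()                     # the cycle detector reclaims the pair
cycle_collected = probe() is None

print(base, after_alias, after_del, cycle_collected)
```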
  • 18. Tools
        • psutil
        • memory_profiler
        • objgraph
        • Meliae (could be combined with runsnakerun)
        • Heapy
  • 19. Tools: psutil
      psutil: a cross-platform process and system utilities module for Python.
          import psutil
          import os
          ...
          p = psutil.Process(os.getpid())
          pinfo = p.as_dict()
          ...
          print pinfo['memory_percent'],
          print pinfo['memory_info'].rss, pinfo['memory_info'].vms
      Listing 9: psutil example
  • 20. Tools: memory_profiler
      memory_profiler: a module for monitoring memory usage of a Python program.
      Recommended dependency: psutil.
      May work as:
        • Line-by-line profiler.
        • Memory usage monitoring (memory in time).
        • Debugger trigger (setting debugger breakpoints).
  • 21. memory_profiler: line-by-line profiler
      Preparation: to track particular functions use the profile decorator.
      Running:
          python -m memory_profiler
      Output:
          Line #    Mem usage    Increment   Line Contents
          ================================================
              45    9.512 MiB    0.000 MiB   @profile
              46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
              47    9.516 MiB    0.004 MiB       ret = []
              48    9.516 MiB    0.000 MiB       t = "foo %d"
              49  156.449 MiB  146.934 MiB       for i in xrange(times):
              50  156.445 MiB   -0.004 MiB           l = [t % (j + i%8) for j in xrange(8)]
              51  156.449 MiB    0.004 MiB           c = cl(*l)
              52  156.449 MiB    0.000 MiB           ret.append(c)
              53  156.449 MiB    0.000 MiB       return ret
      Listing 10: Results
  • 22. memory_profiler: memory usage monitoring
      Preparation: to track particular functions use the profile decorator.
      Running and plotting:
          mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
          mprof plot
      Figure: Results
  • 23. memory_profiler: debugger trigger
          eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
          Current memory 20.80 MiB exceeded the maximumof 10.00 MiB
          Stepping into the debugger
          > /home/eror/uniwerse.py(52)connect()
          -> self.adj.append(n)
          (Pdb)
      Listing 11: Debugger trigger (setting debugger breakpoints)
  • 24. Tools: objgraph
      objgraph: draws Python object reference graphs with graphviz.
          import objgraph
          x = []
          y = [x, [x], dict(x=x)]
          objgraph.show_refs([y], filename='sample-graph.png')
          objgraph.show_backrefs([x], filename='sample-backref-graph.png')
      Listing 12: Tutorial example
      Figure: Reference graph
      Figure: Back reference graph
  • 25. Tools: Heapy/Meliae
      Heapy:
        • The heap analysis toolset. It can be used to find information about the objects in the heap and display the information in various ways.
        • Part of "Guppy-PE – A Python Programming Environment".
      Meliae:
        • Python Memory Usage Analyzer.
        • "This project is similar to heapy (in the 'guppy' project), in its attempt to understand how memory has been allocated."
        • runsnakerun GUI support.
  • 26. Tools: Heapy
          from guppy import hpy
          hp = hpy()
          h1 = hp.heap()
          l = [range(i) for i in xrange(2**10)]
          h2 = hp.heap()
          print h2 - h1
      Listing 13: Heapy example
          Partition of a set of 294937 objects. Total size = 11538088 bytes.
           Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
               0 293899 100  7053576  61    7053576  61 int
               1   1025   0  4481544  39   11535120 100 list
               2      6   0     1680   0   11536800 100 dict (no owner)
               3      2   0      560   0   11537360 100 dict of guppy.etc.Glue.Owner
               4      1   0      456   0   11537816 100 types.FrameType
               5      2   0      144   0   11537960 100 guppy.etc.Glue.Owner
               6      2   0      128   0   11538088 100 str
      Listing 14: Results
  • 27. Meliae and runsnakerun
          from meliae import scanner
          scanner.dump_all_objects("representation_meliae.dump")
          # In shell: runsnakemem representation_meliae.dump
      Listing 15: Meliae example
      Figure: Meliae and runsnakerun
  • 28. malloc() alternatives: libjemalloc and libtcmalloc
      Pros: in some cases using a different malloc() implementation "may" help to return memory from CPython back to the system.
      Cons: but equally it may work against you.
          $ LD_PRELOAD="/usr/lib/libjemalloc.so.1" python int_float_alloc.py
          $ LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4" python int_float_alloc.py
      Listing 16: Changing memory allocator
  • 29. malloc() alternatives: libjemalloc and libtcmalloc

      Step      malloc            jemalloc           tcmalloc
                res     virt      res     virt       res     virt
      step 1    7.4M    46.5M     8.0M    56.9M      9.4M    56.1M
      step 2    40.0M   79.1M     41.6M   88.9M      42.5M   89.3M
      step 3    16.2M   55.3M     8.2M    88.9M      42.5M   89.3M
      step 4    40.0M   84.3M     41.5M   100.9M     51.5M   98.4M
      step 5    8.2M    47.3M     8.5M    100.9M     51.5M   98.4M
  • 30. Other useful tools
        • Build Python in debug mode (./configure --with-pydebug ...):
            Maintains a list of all active objects.
            Upon exit (or after every statement in interactive mode), prints all existing references.
            Tracks total allocation.
        • valgrind: a programming tool for memory debugging, leak detection, and profiling. Rather low level.
            CPython can cooperate with valgrind (for >= py-2.7, py-3.2).
        • gdb-heap (gdb extension):
            low level, still experimental,
            can be attached to running processes,
            may be used with a core file.
        • Web applications memory leaks:
            dowser: a cherrypy application that displays sparklines of Python object counts.
            dozer: WSGI middleware version of the cherrypy memory leak debugger (works with any WSGI application).
  • 31. Summary
      Summary:
        • Try to better understand the underlying memory model.
        • Pay attention to hot spots.
        • Use profiling tools.
        • "Seek and destroy": find the root cause of the memory leak and fix it ;)
      Quick and sometimes dirty solutions:
        • Delegate memory-intensive work to another process.
        • Regularly restart the process.
        • Go for low-hanging fruit (e.g. __slots__, different allocators).
  • 32. References
        • Wesley J. Chun, Principal, CyberWeb Consulting, "Python 103... MMMM: Understanding Python's Memory Model, Mutability, Methods"
        • David Malcolm, Red Hat, "Dude – Where's My RAM?" A deep dive into how Python uses memory.
        • Evan Jones, Improving Python's Memory Allocator
        • Alexander Slesarev, Memory reclaiming in Python
        • Source code of Python
        • Tools documentation