Python

Run time profiling with cProfile

Python is distributed with profiling modules. They describe the run time operation of a pure python program, providing a variety of statistics.

The cProfile module is the recommended module. To execute your program under the control of the cProfile module, a simple form is

$ python -m cProfile -s cumulative mypythonscript.py

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      16    0.055    0.003   15.801    0.988 __init__.py:1(<module>)
       1    0.000    0.000   11.113   11.113 __init__.py:35(extract)
     135    7.351    0.054   11.078    0.082 __init__.py:25(iter_extract)
10350736    3.628    0.000    3.628    0.000 {method 'startswith' of 'str' objects}
       1    0.000    0.000    2.422    2.422 pyplot.py:123(show)
       1    0.000    0.000    2.422    2.422 backend_bases.py:69(__call__)
       ...

Each column provides information about time execution of every function calls. -s cumulative orders the result by descending cumulative time.

Note:

You can profile a particular python function such as main()

>>> import profile
>>> profile.run('main()')

Graphical tools to show profiling results

Even if report tools are included in cProfile profiler, it can be interesting to use graphical tools. Most of them work with a stat file that can be generated by cProfile using the -o filepath option.

Below are some of available graphical tools that we tested.

Gpro2Dot

is a python based tool that allows to transform profiling results output into a picture containing the call tree graph (using graphviz). A typical profiling session with python looks like this:

$ python -m cProfile -o output.pstats mypythonscript.py
$ gprof2dot.py -f pstats output.pstats | dot -Tpng -o profiling_results.png
http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png

Each node of the output graph represents a function and has the following layout:

+----------------------------------+
|   function name : module name    |
| total time including sub-calls % |  total time including sub-calls %
|    (self execution time %)       |------------------------------------>
|  total number of self calls      |
+----------------------------------+

Nodes and edges are colored according to the "total time" spent in the functions.

Note:The following small patch let the node color correspond to the execution time and the edge color to the "total time":
diff -r da2b31597c5f gprof2dot.py
--- a/gprof2dot.py      Fri Aug 31 16:38:37 2012 +0200
+++ b/gprof2dot.py      Fri Aug 31 16:40:56 2012 +0200
@@ -2628,6 +2628,7 @@
                 weight = function.weight
             else:
                 weight = 0.0
+            weight = function[TIME_RATIO]

             label = '\n'.join(labels)
             self.node(function.id,
PyProf2CallTree

is a script to help visualizing profiling data with the KCacheGrind graphical calltree analyzer. This is a more interactive solution than Gpro2Dot but it requires to install KCacheGrind. Typical usage:

$ python -m cProfile -o stat.prof mypythonscript.py
$ python pyprof2calltree.py -i stat.prof -k

Profiling data file is opened in KCacheGrind with pyprof2calltree module, whose -k switch automatically opens KCacheGrind.

http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif

There are other tools that are worth testing:

  • RunSnakeRun is an interactive GUI tool which visualizes profile file using square maps:

    $ python -m cProfile -o stat.prof mypythonscript.py
    $ runsnake stat.prof
    
  • pycallgraph generates PNG images of a call tree with the total number of calls:

    $ pycallgraph mypythonscript.py
    
  • lsprofcalltree also use KCacheGrind to display profiling data:

    $ python lsprofcalltree.py -o output.log yourprogram.py
    $ kcachegrind output.log
    

C/C++ extension profiling

For optimization purpose one may have python extensions written in C/C++. For such modules, cProfile will not dig into the corresponding call tree. Dedicated tools must be used (they are most part of Python) to profile a C++ extension from python.

Yep

is a python module dedicated to the profiling of compiled python extension. It uses the google CPU profiler:

$ python -m yep --callgrind mypythonscript.py

Memory Profiler

You may want to control the amount of memory used by a python program. There is an interesting module that fits this need: memory_profiler

You can fetch memory consumption of a program over time using

>>> from memory_profiler import memory_usage
>>> memory_usage(main, (), {})

memory_profiler can also spot lines that consume the most using pdb or IPython.

General purpose Profiling

The Linux perf tool gives access to a wide variety of performance counter subsystems. Using perf, any execution configuration (pure python programs, compiled extensions, subprocess, etc.) may be profiled.

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache-misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots.

You can have information about execution times with:

$ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py

You can have RAM access information using:

$ perf stat -e cache-misses python mypythonscript.py

Be careful about the fact that perf gives the raw value of the hardware counters. So, you need to know exactly what you are looking for and how to interpret these values in the context of your program.

Note that you can use Gpro2Dot to get a more user-friendly output:

$ perf record -g python mypythonscript.py
$ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png

blog entry of

Logilab.org - en