Run time profiling with cProfile

Python ships with profiling modules. They describe the run-time behavior of a pure Python program and provide a variety of statistics.

The cProfile module is the recommended one. To run your program under the control of cProfile (the path of the script to profile goes at the end of the command line), a simple form is:

$ python -m cProfile -s cumulative

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      16    0.055    0.003   15.801    0.988<module>)
       1    0.000    0.000   11.113   11.113
     135    7.351    0.054   11.078    0.082
10350736    3.628    0.000    3.628    0.000 {method 'startswith' of 'str' objects}
       1    0.000    0.000    2.422    2.422
       1    0.000    0.000    2.422    2.422

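Each column reports timing information for one function: ncalls is the number of calls, tottime is the time spent in the function itself (excluding sub-calls), cumtime is the time spent in the function including sub-calls, and each percall column is the preceding time column divided by the number of calls. The -s cumulative option sorts the results by descending cumulative time.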
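The difference between tottime and cumtime can be checked on a toy example. The sketch below is illustrative only: outer and inner are hypothetical functions that do not come from the profiled program above, and it assumes Python 3.9 or later, where pstats.Stats.get_stats_profile() is available.

```python
import cProfile
import pstats


def inner():
    # Hypothetical leaf function that does the actual work.
    return sum(i * i for i in range(200_000))


def outer():
    # Hypothetical caller: nearly all of its time is spent inside inner().
    return inner() + inner()


profiler = cProfile.Profile()
profiler.runcall(outer)

# get_stats_profile() (Python 3.9+) exposes the report columns as attributes.
profiles = pstats.Stats(profiler).get_stats_profile().func_profiles
outer_stats = profiles["outer"]
inner_stats = profiles["inner"]

for name, entry in (("outer", outer_stats), ("inner", inner_stats)):
    print("%s: ncalls=%s tottime=%.3f cumtime=%.3f"
          % (name, entry.ncalls, entry.tottime, entry.cumtime))
```

In such a run, outer should show a small tottime but a cumtime at least as large as that of inner, because cumtime includes the time spent in sub-calls.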


You can also profile a particular Python function, such as main():

>>> import cProfile
>>> cProfile.run('main()')
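When more control is needed, the profiler can be enabled only around the call of interest and the report sorted by cumulative time, as with -s cumulative. This is a minimal sketch under stated assumptions: the main() below is a hypothetical workload standing in for the real entry point of your program.

```python
import cProfile
import io
import pstats


def main():
    # Hypothetical workload standing in for the program's real entry point.
    words = ["spam", "eggs", "ham"] * 50_000
    return sum(1 for word in words if word.startswith("s"))


profiler = cProfile.Profile()
profiler.enable()      # start collecting profiling data
result = main()
profiler.disable()     # stop collecting

# Write the report to a string buffer, sorted by descending cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)  # keep at most 10 entries
report = buffer.getvalue()
print(report)
```

Only the calls made between enable() and disable() appear in the report, which keeps start-up and import costs out of the statistics.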

Graphical tools to show profiling results

Although cProfile includes reporting tools, graphical tools can make the results easier to explore. Most of them work with a stats file, which cProfile can generate using the -o filepath option.
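The same kind of stats file can also be written and read back from Python code. The sketch below is only an illustration: main() is a hypothetical workload, and the file is written to a temporary directory under the name output.pstats so that the example cleans up after itself.

```python
import cProfile
import io
import os
import pstats
import tempfile


def main():
    # Hypothetical workload; replace with the code you want to profile.
    return sorted(str(i) for i in range(50_000))


with tempfile.TemporaryDirectory() as tmpdir:
    stat_path = os.path.join(tmpdir, "output.pstats")

    profiler = cProfile.Profile()
    profiler.runcall(main)
    profiler.dump_stats(stat_path)   # programmatic counterpart of -o filepath
    file_size = os.path.getsize(stat_path)

    # Load the file back, as a report or graphical tool would do.
    buffer = io.StringIO()
    loaded = pstats.Stats(stat_path, stream=buffer)
    loaded.strip_dirs().sort_stats("cumulative").print_stats(5)
    report = buffer.getvalue()

print(report)
```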

Below are some of the available graphical tools that we tested.


Gprof2Dot is a Python-based tool that transforms profiling output into a picture of the call graph (using Graphviz). A typical profiling session with Python looks like this:

$ python -m cProfile -o output.pstats
$ -f pstats output.pstats | dot -Tpng -o profiling_results.png

Each node of the output graph represents a function and has the following layout:

|   function name : module name    |
| total time including sub-calls % |  total time including sub-calls %
|    (self execution time %)       |------------------------------------>
|  total number of self calls      |

Nodes and edges are colored according to the "total time" spent in the functions.

Note: The following small patch makes the node color correspond to the self execution time and the edge color to the "total time":
diff -r da2b31597c5f
--- a/      Fri Aug 31 16:38:37 2012 +0200
+++ b/      Fri Aug 31 16:40:56 2012 +0200
@@ -2628,6 +2628,7 @@
                 weight = function.weight
                 weight = 0.0
+            weight = function[TIME_RATIO]

             label = '\n'.join(labels)

pyprof2calltree is a script that helps visualize profiling data with the KCacheGrind graphical call-tree analyzer. It is more interactive than Gprof2Dot, but it requires KCacheGrind to be installed. Typical usage:

$ python -m cProfile -o
$ python -i -k

The profiling data file is opened in KCacheGrind by the pyprof2calltree module, whose -k switch launches KCacheGrind automatically.

There are other tools that are worth testing:

  • RunSnakeRun is an interactive GUI tool that visualizes a profile file using square maps:

    $ python -m cProfile -o
    $ runsnake
  • pycallgraph generates PNG images of a call tree with the total number of calls:

    $ pycallgraph
  • lsprofcalltree also uses KCacheGrind to display profiling data:

    $ python -o output.log
    $ kcachegrind output.log

C/C++ extension profiling

For optimization purposes, one may have Python extensions written in C/C++. For such modules, cProfile will not dig into the corresponding call tree. Dedicated tools (most of which are not part of the Python distribution) must be used to profile a C++ extension from Python.


yep is a Python module dedicated to profiling compiled Python extensions. It uses the Google CPU profiler:

$ python -m yep --callgrind

Memory Profiler

You may want to monitor the amount of memory used by a Python program. The memory_profiler module fits this need.

You can sample the memory consumption of a program over time using:

>>> from memory_profiler import memory_usage
>>> memory_usage((main, (), {}))

memory_profiler can also report which lines consume the most memory, using pdb or IPython.
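As a rough standard-library counterpart, and explicitly not memory_profiler itself, the tracemalloc module can report the current and peak memory allocated by Python objects while a function runs. The sketch below is only an illustration: main() is a hypothetical workload, and tracemalloc counts Python-level allocations, whereas memory_usage samples the memory of the whole process.

```python
import tracemalloc


def main():
    # Hypothetical workload: temporarily allocates roughly 10 MB of bytes objects.
    data = [bytes(1000) for _ in range(10_000)]
    return len(data)


tracemalloc.start()
result = main()
# Both values are sizes in bytes of the memory blocks traced by tracemalloc.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current: %.1f MiB, peak: %.1f MiB" % (current / 2**20, peak / 2**20))
```

The peak value reflects the temporary list built inside main(), while the current value drops back once that list has been released.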

General purpose Profiling

The Linux perf tool gives access to a wide variety of performance counter subsystems. Using perf, any execution configuration (pure Python programs, compiled extensions, subprocesses, etc.) may be profiled.

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache-misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots.

You can get information about execution time with:

$ perf stat -e cpu-cycles,cpu-clock,task-clock python

You can get information about RAM accesses using:

$ perf stat -e cache-misses python

Be aware that perf reports the raw values of the hardware counters, so you need to know exactly what you are looking for and how to interpret these values in the context of your program.

Note that you can use Gprof2Dot to get a more user-friendly output:

$ perf record -g python
$ perf script | -f perf | dot -Tpng -o output.png