<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Profiling tools (Logilab.org) RSS Feed</title>
    <description></description>
    <link>http://www.logilab.org/blogentry/104898</link>
<item>
<guid isPermaLink="true">http://www.logilab.org/blogentry/104898</guid>
  <title>Profiling tools</title>
  <link>http://www.logilab.org/blogentry/104898</link>
  <description>&lt;div class=&quot;contents topic&quot; id=&quot;contents&quot;&gt;
&lt;p class=&quot;topic-title&quot;&gt;Contents&lt;/p&gt;
&lt;ul class=&quot;simple&quot;&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#python&quot; id=&quot;id3&quot;&gt;Python&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#run-time-profiling-with-cprofile&quot; id=&quot;id4&quot;&gt;Run time profiling with cProfile&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#graphical-tools-to-show-profiling-results&quot; id=&quot;id5&quot;&gt;Graphical tools to show profiling results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#c-c-extension-profiling&quot; id=&quot;id6&quot;&gt;C/C++ extension profiling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#memory-profiler&quot; id=&quot;id7&quot;&gt;Memory Profiler&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#general-purpose-profiling&quot; id=&quot;id8&quot;&gt;General purpose Profiling&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;python&quot;&gt;
&lt;h3&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id3&quot;&gt;Python&lt;/a&gt;&lt;/h3&gt;
&lt;div class=&quot;section&quot; id=&quot;run-time-profiling-with-cprofile&quot;&gt;
&lt;h4&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id4&quot;&gt;Run time profiling with cProfile&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Python is distributed with &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html&quot;&gt;profiling modules&lt;/a&gt;. They describe the run
time operation of a pure python program, providing a variety of
statistics.&lt;/p&gt;
&lt;p&gt;The &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html#module-cProfile&quot;&gt;cProfile&lt;/a&gt; module is the recommended module. To execute your program
under the control of the &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html#module-cProfile&quot;&gt;cProfile&lt;/a&gt; module, a simple form is&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python -m cProfile -s cumulative mypythonscript.py

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      16    0.055    0.003   15.801    0.988 __init__.py:1(&amp;lt;module&amp;gt;)
       1    0.000    0.000   11.113   11.113 __init__.py:35(extract)
     135    7.351    0.054   11.078    0.082 __init__.py:25(iter_extract)
10350736    3.628    0.000    3.628    0.000 {method &#39;startswith&#39; of &#39;str&#39; objects}
       1    0.000    0.000    2.422    2.422 pyplot.py:123(show)
       1    0.000    0.000    2.422    2.422 backend_bases.py:69(__call__)
       ...
&lt;/pre&gt;
&lt;p&gt;Each column provides information about time execution of every function calls.
&lt;tt class=&quot;docutils literal&quot;&gt;&lt;span class=&quot;pre&quot;&gt;-s&lt;/span&gt; cumulative&lt;/tt&gt; orders the result by descending cumulative time.&lt;/p&gt;
&lt;table class=&quot;docutils field-list&quot; frame=&quot;void&quot; rules=&quot;none&quot;&gt;
&lt;col class=&quot;field-name&quot; /&gt;
&lt;col class=&quot;field-body&quot; /&gt;
&lt;tbody valign=&quot;top&quot;&gt;
&lt;tr class=&quot;field&quot;&gt;&lt;th class=&quot;field-name&quot;&gt;Note:&lt;/th&gt;&lt;td class=&quot;field-body&quot;&gt;&lt;p class=&quot;first&quot;&gt;You can profile a particular python function such as &lt;tt class=&quot;docutils literal&quot;&gt;main()&lt;/tt&gt;&lt;/p&gt;
&lt;pre class=&quot;last literal-block&quot;&gt;
&amp;gt;&amp;gt;&amp;gt; import profile
&amp;gt;&amp;gt;&amp;gt; profile.run(&#39;main()&#39;)
&lt;/pre&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;graphical-tools-to-show-profiling-results&quot;&gt;
&lt;h4&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id5&quot;&gt;Graphical tools to show profiling results&lt;/a&gt;&lt;/h4&gt;
&lt;p&gt;Even if report tools are included in &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html#module-cProfile&quot;&gt;cProfile&lt;/a&gt; profiler, it can be
interesting to use graphical tools.  Most of them work with a stat
file that can be generated by &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html#module-cProfile&quot;&gt;cProfile&lt;/a&gt; using the &lt;tt class=&quot;docutils literal&quot;&gt;&lt;span class=&quot;pre&quot;&gt;-o&lt;/span&gt; filepath&lt;/tt&gt;
option.&lt;/p&gt;
&lt;p&gt;Below are some of available graphical tools that we tested.&lt;/p&gt;
&lt;div class=&quot;section&quot; id=&quot;gprof2dot&quot;&gt;
&lt;h5&gt;&lt;a&gt;&lt;a class=&quot;reference&quot; href=&quot;http://code.google.com/p/jrfonseca/wiki/Gprof2Dot&quot;&gt;Gpro2Dot&lt;/a&gt;&lt;/a&gt;&lt;/h5&gt;
&lt;p&gt;is a python based tool that allows to transform profiling results
output into a picture containing the call tree graph (using
&lt;a class=&quot;reference&quot; href=&quot;http://www.graphviz.org/&quot;&gt;graphviz&lt;/a&gt;).  A typical profiling session with python looks like
this:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python -m cProfile -o output.pstats mypythonscript.py
$ gprof2dot.py -f pstats output.pstats | dot -Tpng -o profiling_results.png
&lt;/pre&gt;
&lt;a class=&quot;reference image-reference&quot; href=&quot;http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png&quot;&gt;&lt;img alt=&quot;http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png&quot; src=&quot;http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png&quot; style=&quot;width: 256px;&quot; /&gt;&lt;/a&gt;
&lt;p&gt;Each node of the output graph represents a function and has the following layout:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
+----------------------------------+
|   function name : module name    |
| total time including sub-calls % |  total time including sub-calls %
|    (self execution time %)       |------------------------------------&amp;gt;
|  total number of self calls      |
+----------------------------------+
&lt;/pre&gt;
&lt;p&gt;Nodes and edges are colored according to the &amp;quot;total time&amp;quot; spent in the functions.&lt;/p&gt;
&lt;table class=&quot;docutils field-list&quot; frame=&quot;void&quot; rules=&quot;none&quot;&gt;
&lt;col class=&quot;field-name&quot; /&gt;
&lt;col class=&quot;field-body&quot; /&gt;
&lt;tbody valign=&quot;top&quot;&gt;
&lt;tr class=&quot;field&quot;&gt;&lt;th class=&quot;field-name&quot;&gt;Note:&lt;/th&gt;&lt;td class=&quot;field-body&quot;&gt;The following small patch let the node color correspond to the execution
time and the edge color to the &amp;quot;total time&amp;quot;:&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;gh&quot;&gt;diff -r da2b31597c5f gprof2dot.py&lt;/span&gt;
&lt;span class=&quot;gd&quot;&gt;--- a/gprof2dot.py      Fri Aug 31 16:38:37 2012 +0200&lt;/span&gt;
&lt;span class=&quot;gi&quot;&gt;+++ b/gprof2dot.py      Fri Aug 31 16:40:56 2012 +0200&lt;/span&gt;
&lt;span class=&quot;gu&quot;&gt;@@ -2628,6 +2628,7 @@&lt;/span&gt;
                 weight = function.weight
             else:
                 weight = 0.0
&lt;span class=&quot;gi&quot;&gt;+            weight = function[TIME_RATIO]&lt;/span&gt;

             label = &amp;#39;\n&amp;#39;.join(labels)
             self.node(function.id,
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;pyprof2calltree&quot;&gt;
&lt;h5&gt;&lt;a&gt;&lt;a class=&quot;reference&quot; href=&quot;https://bitbucket.org/ogrisel/pyprof2calltree&quot;&gt;PyProf2CallTree&lt;/a&gt;&lt;/a&gt;&lt;/h5&gt;
&lt;p&gt;is a script to help visualizing profiling data with the &lt;a class=&quot;reference&quot; href=&quot;http://kcachegrind.sourceforge.net/&quot;&gt;KCacheGrind&lt;/a&gt;
graphical calltree analyzer. This is a more interactive solution
than &lt;a class=&quot;reference&quot; href=&quot;http://code.google.com/p/jrfonseca/wiki/Gprof2Dot&quot;&gt;Gpro2Dot&lt;/a&gt; but it requires to install &lt;a class=&quot;reference&quot; href=&quot;http://kcachegrind.sourceforge.net/&quot;&gt;KCacheGrind&lt;/a&gt;.
Typical usage:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python -m cProfile -o stat.prof mypythonscript.py
$ python pyprof2calltree.py -i stat.prof -k
&lt;/pre&gt;
&lt;p&gt;Profiling data file is opened in &lt;a class=&quot;reference&quot; href=&quot;http://kcachegrind.sourceforge.net/&quot;&gt;KCacheGrind&lt;/a&gt; with pyprof2calltree module,
whose &lt;tt class=&quot;docutils literal&quot;&gt;&lt;span class=&quot;pre&quot;&gt;-k&lt;/span&gt;&lt;/tt&gt; switch automatically opens &lt;a class=&quot;reference&quot; href=&quot;http://kcachegrind.sourceforge.net/&quot;&gt;KCacheGrind&lt;/a&gt;.&lt;/p&gt;
&lt;a class=&quot;reference image-reference&quot; href=&quot;http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif&quot;&gt;&lt;img alt=&quot;http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif&quot; src=&quot;http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif&quot; style=&quot;width: 256px;&quot; /&gt;&lt;/a&gt;
&lt;p&gt;There are other tools that are worth testing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p class=&quot;first&quot;&gt;&lt;a class=&quot;reference&quot; href=&quot;http://www.vrplumber.com/programming/runsnakerun/&quot;&gt;RunSnakeRun&lt;/a&gt; is an interactive GUI tool which visualizes profile file using square maps:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python -m cProfile -o stat.prof mypythonscript.py
$ runsnake stat.prof
&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class=&quot;first&quot;&gt;&lt;a class=&quot;reference&quot; href=&quot;http://pycallgraph.slowchop.com/&quot;&gt;pycallgraph&lt;/a&gt; generates PNG images of a call tree with the total number of calls:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ pycallgraph mypythonscript.py
&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class=&quot;first&quot;&gt;&lt;a class=&quot;reference&quot; href=&quot;https://bitbucket.org/pypy/pypy/src/default/pypy/tool/lsprofcalltree.py&quot;&gt;lsprofcalltree&lt;/a&gt; also use &lt;a class=&quot;reference&quot; href=&quot;http://kcachegrind.sourceforge.net/&quot;&gt;KCacheGrind&lt;/a&gt; to display profiling data:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python lsprofcalltree.py -o output.log yourprogram.py
$ kcachegrind output.log
&lt;/pre&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;c-c-extension-profiling&quot;&gt;
&lt;h3&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id6&quot;&gt;C/C++ extension profiling&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For optimization purpose one may have python extensions written in
C/C++. For such modules, &lt;a class=&quot;reference&quot; href=&quot;http://docs.python.org/library/profile.html#module-cProfile&quot;&gt;cProfile&lt;/a&gt; will not dig into the
corresponding call tree. Dedicated tools must be used (they are most
part of Python) to profile a C++ extension from python.&lt;/p&gt;
&lt;p&gt;&lt;a class=&quot;reference&quot; href=&quot;http://pypi.python.org/pypi/yep&quot;&gt;Yep&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;is a python module dedicated to the profiling of compiled python extension. It
uses the &lt;a class=&quot;reference&quot; href=&quot;perftools&amp;lt;http://code.google.com/p/gperftools/?redir=1&quot;&gt;google&lt;/a&gt; CPU profiler:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ python -m yep --callgrind mypythonscript.py
&lt;/pre&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;memory-profiler&quot;&gt;
&lt;h3&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id7&quot;&gt;Memory Profiler&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;You may want to control the amount of memory used by a python program.
There is an interesting module that fits this need: &lt;a class=&quot;reference&quot; href=&quot;https://github.com/fabianp/memory_profiler&quot;&gt;memory_profiler&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You can fetch memory consumption of a program over time using&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
&amp;gt;&amp;gt;&amp;gt; from memory_profiler import memory_usage
&amp;gt;&amp;gt;&amp;gt; memory_usage(main, (), {})
&lt;/pre&gt;
&lt;p&gt;&lt;a class=&quot;reference&quot; href=&quot;https://github.com/fabianp/memory_profiler&quot;&gt;memory_profiler&lt;/a&gt; can also spot lines that consume the most using &lt;a class=&quot;reference&quot; href=&quot;docs.python.org/library/pdb.html&quot;&gt;pdb&lt;/a&gt;
or &lt;a class=&quot;reference&quot; href=&quot;http://ipython.org&quot;&gt;IPython&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;section&quot; id=&quot;general-purpose-profiling&quot;&gt;
&lt;h3&gt;&lt;a class=&quot;toc-backref&quot; href=&quot;http://www.logilab.org/blogentry/104898?vid=rss#id8&quot;&gt;General purpose Profiling&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The Linux &lt;a class=&quot;reference&quot; href=&quot;https://perf.wiki.kernel.org/index.php/Main_Page&quot;&gt;perf&lt;/a&gt; tool gives access to a wide variety of performance counter
subsystems. Using perf, any execution configuration (pure python programs,
compiled extensions, subprocess, etc.) may be profiled.&lt;/p&gt;
&lt;p&gt;Performance counters are CPU hardware registers that count hardware events such
as instructions executed, cache-misses suffered, or branches mispredicted. They
form a basis for profiling applications to trace dynamic control flow and
identify hotspots.&lt;/p&gt;
&lt;p&gt;You can have information about execution times with:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py
&lt;/pre&gt;
&lt;p&gt;You can have RAM access information using:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ perf stat -e cache-misses python mypythonscript.py
&lt;/pre&gt;
&lt;p&gt;Be careful about the fact that &lt;a class=&quot;reference&quot; href=&quot;https://perf.wiki.kernel.org/index.php/Main_Page&quot;&gt;perf&lt;/a&gt; gives the raw value of the
hardware counters. So, you need to know exactly what you are looking for
and how to interpret these values in the context of your program.&lt;/p&gt;
&lt;p&gt;Note that you can use &lt;a class=&quot;reference&quot; href=&quot;http://code.google.com/p/jrfonseca/wiki/Gprof2Dot&quot;&gt;Gpro2Dot&lt;/a&gt; to get a more user-friendly output:&lt;/p&gt;
&lt;pre class=&quot;literal-block&quot;&gt;
$ perf record -g python mypythonscript.py
$ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png
&lt;/pre&gt;
&lt;/div&gt;
</description>
  <dc:date>2013-02-26T09:40-01:00</dc:date>
  <dc:creator>Alain Leufroy</dc:creator>
</item>
  </channel>
</rss>