Blog entries by Alain Leufroy [4]
Python is distributed with profiling modules. They describe the run
time operation of a pure python program, providing a variety of
statistics.
The cProfile module is the recommended module. To execute your program
under the control of the cProfile module, a simple form is
$ python -m cProfile -s cumulative mypythonscript.py
ncalls tottime percall cumtime percall filename:lineno(function)
16 0.055 0.003 15.801 0.988 __init__.py:1(<module>)
1 0.000 0.000 11.113 11.113 __init__.py:35(extract)
135 7.351 0.054 11.078 0.082 __init__.py:25(iter_extract)
10350736 3.628 0.000 3.628 0.000 {method 'startswith' of 'str' objects}
1 0.000 0.000 2.422 2.422 pyplot.py:123(show)
1 0.000 0.000 2.422 2.422 backend_bases.py:69(__call__)
...
Each column gives timing information about the calls to each function.
-s cumulative orders the result by descending cumulative time.
Note: You can also profile a particular python function such as main():
>>> import profile
>>> profile.run('main()')
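As a minimal sketch of the same idea from inside a script, the standard cProfile and pstats modules can be driven directly. This assumes Python 3, and main() here is only a hypothetical stand-in workload, not code from the original program:

```python
import cProfile
import io
import pstats


def main():
    # Hypothetical workload standing in for the real program.
    return sum(i * i for i in range(100000))


# Collect statistics only around the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time, like "-s cumulative" on the command line,
# and keep only the first five entries of the report.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The printed table has the same columns as the command-line output shown earlier.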
For optimization purposes, one may have python extensions written in
C/C++. For such modules, cProfile will not dig into the
corresponding call tree. Dedicated tools must be used (most of them are
not part of Python) to profile a C++ extension from python.
Yep is a python module dedicated to the profiling of compiled python extensions. It
uses the Google CPU profiler:
$ python -m yep --callgrind mypythonscript.py
You may want to control the amount of memory used by a python program.
The memory_profiler module fits this need.
You can sample the memory consumption of a program over time using
>>> from memory_profiler import memory_usage
>>> memory_usage((main, (), {}))
memory_profiler can also report the lines that consume the most memory, from pdb
or IPython.
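If memory_profiler is not installed, the standard tracemalloc module (Python 3.4 and later) gives a rough per-line view. This is a different tool from memory_profiler, shown only as a sketch; main() is again a hypothetical workload:

```python
import tracemalloc


def main():
    # Hypothetical workload: build a fairly large list of strings.
    return [str(i) * 10 for i in range(50000)]


tracemalloc.start()
data = main()
# Memory currently traced and the peak reached since start(), in bytes.
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the traced allocations by source line and show the largest ones.
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)
print("current=%d bytes, peak=%d bytes" % (current, peak))
```

Note that tracemalloc only sees allocations made through Python's allocator, so memory allocated directly by a C extension may not appear.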
The Linux perf tool gives access to a wide variety of performance counter
subsystems. Using perf, any execution configuration (pure python programs,
compiled extensions, subprocesses, etc.) may be profiled.
Performance counters are CPU hardware registers that count hardware events such
as instructions executed, cache-misses suffered, or branches mispredicted. They
form a basis for profiling applications to trace dynamic control flow and
identify hotspots.
You can get information about execution times with:
$ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py
You can get information about RAM accesses using:
$ perf stat -e cache-misses python mypythonscript.py
Be aware that perf gives the raw values of the
hardware counters. So, you need to know exactly what you are looking for
and how to interpret these values in the context of your program.
Note that you can use Gprof2Dot to get a more user-friendly output:
$ perf record -g python mypythonscript.py
$ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png
Here, at last, is release 1.4.0 of hgview.
Besides the usual bugfixes, this release introduces a new text-based user interface built on the urwid library.
Running hgview in a shell, in a terminal, or over an ssh session is now possible! If you are trying not to use X (or to use it less), or you have a geeky mouse-killer window manager such as wmii/dwm/ion/awesome/..., this is for you!
This TUI (Text User Interface!) adopts the main features of the Qt4-based GUI, although only the main view has been implemented for now.
In a nutshell, this interface includes the following features:
- display the revision graph (with working directory as a node, and basic support for the mq extension),
- display the files affected by a selected changeset (with basic support for the bfiles extension)
- display diffs (with syntax highlighting thanks to pygments),
- automatically refresh the displayed revision graph when the repository is being modified (requires pyinotify),
- easy key-based navigation in revisions' history of a repo (same as the GUI),
- a command system for special actions (see help)
A new --interface option is now available to choose the interface:
hgview --interface curses
Or you can set it in the [hgview] section of your ~/.hgrc:
[hgview]
interface = curses # or qt or raw
Then run:
hgview
We'll be working on including other features from the Qt4 interface and making it fully configurable.
We'll also work on bugfixes and new features, so stay tuned! And feel free to file bugs and feature requests.
The EuroScipy2011 conference will be held in Paris at the Ecole Normale Supérieure
from August 25th to 28th and is co-organized and sponsored by INRIA, Logilab and other
companies.
The conference is a cross-disciplinary gathering focused on the
use and development of the Python language in scientific research.
August 25th and 26th are dedicated to tutorial tracks -- basic and advanced tutorials.
August 27th and 28th are dedicated to talks, posters and demo sessions.
Damien Garaud, Vincent Michel and Alain Leufroy (and others) from Logilab will be there.
We will give a talk about an RSS feed aggregator based on Scikits.learn and CubicWeb,
and we will present a poster about LibAster
(a python library for thermomechanical simulation based on Code_Aster).
We're very happy to host the Distutils2 sprint this week in Paris.
The sprint started yesterday with some of Logilab's developers and
other contributors. We'll sprint for 4 days, trying to push the
new python package manager forward.
Let's summarize this first day:
- Boris Feld and Pierre-Yves David worked on the new system for detecting and dispatching data-files.
- Julien Miotte worked on:
  - moving qGitFilterBranch from setuptools to distutils2,
  - testing distutils2 installation and register (see the tutorial),
  - backward compatibility with distutils in setup.py, using
    setup.cfg to fill in the arguments of setup, to help
    users switch to distutils2.
- André Espaze and Alain Leufroy worked on the python script that helps developers build a setup.cfg by recycling their existing setup.py (track).
Join us on IRC at #distutils on irc.freenode.net !
virtualenv, pip and Distribute are three tools that help developers
and packagers. In this short presentation we will see some of virtualenv's
capabilities.
Please keep in mind that everything shown here was done using
Debian Lenny, python 2.5 and virtualenv 1.4.5.
virtualenv builds python sandboxes where you can do
whatever you want as a simple user without jeopardizing your global
environment.
virtualenv allows you to safely:
- install any python packages
- add debug lines everywhere (not only in your scripts)
- switch between python versions
- try your code as if you were an end user
- and so on ...
Preferred way
Just download the virtualenv python script at http://bitbucket.org/ianb/virtualenv/raw/tip/virtualenv.py and call it using python (e.g. python virtualenv.py).
For convenience, we will refer to this script as virtualenv.
Other ways
For Debian (and Ubuntu) addicts, just do:
$ sudo aptitude install python-virtualenv
Fedora users would do:
$ sudo yum install python-virtualenv
And others can install from PyPI (as superuser):
$ pip install virtualenv
or
$ easy_install pip && pip install virtualenv
You could also get the source here.
To work in a python sandbox, do as follows:
$ virtualenv my_py_env
$ source my_py_env/bin/activate
(my_py_env)$ python
"That's all Folks !"
Once you have finished just do:
(my_py_env)$ deactivate
or quit the tty.
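For readers on Python 3.3 or later, the standard library ships a venv module with a similar purpose. It is not virtualenv itself, and the directory layout differs slightly from the one described in this post, but a minimal sketch of creating a sandbox from Python looks like this (the temporary location is only an assumption for the example):

```python
import os
import tempfile
import venv

# Create the sandbox in a throwaway directory so the example leaves
# the current directory untouched.
sandbox = os.path.join(tempfile.mkdtemp(), "my_py_env")

# with_pip=False keeps the example fast and avoids needing ensurepip.
venv.create(sandbox, with_pip=False)

created = sorted(os.listdir(sandbox))
print(created)

# venv records the base interpreter in a pyvenv.cfg file.
has_cfg = os.path.isfile(os.path.join(sandbox, "pyvenv.cfg"))
print(has_cfg)
```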
Let's start again ... more slowly. Consider the following environment:
$ pwd
/home/you/some/where
$ ls
Now create a sandbox called my-sandbox:
$ virtualenv my-sandbox
New python executable in "my-sandbox/bin/python"
Installing setuptools............done.
The output said that you have a new python executable and specific
install tools. Your current directory now looks like:
$ ls -Cl
my-sandbox/ README
$ tree -L 3 my-sandbox
my-sandbox/
|-- bin
|   |-- activate
|   |-- activate_this.py
|   |-- easy_install
|   |-- easy_install-2.5
|   |-- pip
|   `-- python
|-- include
|   `-- python2.5 -> /usr/include/python2.5
`-- lib
    `-- python2.5
        |-- ...
        |-- orig-prefix.txt
        |-- os.py -> /usr/lib/python2.5/os.py
        |-- re.py -> /usr/lib/python2.5/re.py
        |-- ...
        |-- site-packages
        |   |-- easy-install.pth
        |   |-- pip-0.6.3-py2.5.egg
        |   |-- setuptools-0.6c11-py2.5.egg
        |   `-- setuptools.pth
        |-- ...
In addition to the new python executable and the install tools, you
have a whole new python environment containing libraries, a
site-packages/ directory (where your packages will be installed), a bin
directory, ...
Note: virtualenv does not create every file needed to get a whole new python
environment. It uses links to global environment files instead, in
order to save disk space and speed up the sandbox creation.
Therefore, python must already be installed on your system.
At this point you have to activate the sandbox in order to use your custom python.
Once activated, python still has access to the global environment but will look at your sandbox first for python's modules:
$ source my-sandbox/bin/activate
(my-sandbox)$ which python
/home/you/some/where/my-sandbox/bin/python
(my-sandbox)$ echo $PATH
/home/you/some/where/my-sandbox/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
(my-sandbox)$ python -c 'import sys;print sys.prefix;'
/home/you/some/where/my-sandbox
(my-sandbox)$ python -c 'import sys;print "\n".join(sys.path)'
/home/you/some/where/my-sandbox/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg
[...]
/home/you/some/where/my-sandbox
/home/you/personal/PYTHONPATH
/home/you/some/where/my-sandbox/lib/python2.5/
[...]
/usr/lib/python2.5
[...]
/home/you/some/where/my-sandbox/lib/python2.5/site-packages
[...]
/usr/local/lib/python2.5/site-packages
/usr/lib/python2.5/site-packages
[...]
First of all, a (my-sandbox) message is automatically added to your
prompt in order to make it clear that you're using a
python sandbox environment.
Secondly, my-sandbox/bin/ is added to your PATH. So, running
python calls the specific python executable placed in
my-sandbox/bin.
Note: It is possible to improve the sandbox isolation by ignoring the
global paths and your PYTHONPATH (see the Improve isolation
section).
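A script can also check for itself whether it is running inside a sandbox. The sketch below assumes the usual conventions: older virtualenv releases set sys.real_prefix, while Python 3's venv (and recent virtualenv) make sys.base_prefix differ from sys.prefix. The helper name in_sandbox is made up for this example:

```python
import sys


def in_sandbox():
    """Best-effort guess: are we running inside a python sandbox?"""
    # Older virtualenv releases store the original prefix here.
    if hasattr(sys, "real_prefix"):
        return True
    # Python 3: base_prefix points at the global installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


print(sys.prefix)
print(in_sandbox())
```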
It is possible to install any package in the sandbox without
superuser privileges. For instance, we will install the
pylint development revision in the sandbox.
Suppose that you have the pylint stable version already installed in
your global environment:
(my-sandbox)$ deactivate
$ python -c 'from pylint.__pkginfo__ import version;print version'
0.18.0
Once your sandbox is activated, install the development revision of
pylint as an update:
$ source /home/you/some/where/my-sandbox/bin/activate
(my-sandbox)$ pip install -U hg+http://www.logilab.org/hg/pylint#egg=pylint-0.19
The new package and its dependencies are only installed in the sandbox:
(my-sandbox)$ python -c 'import pylint.__pkginfo__ as p;print p.version, p.__file__'
0.19.0 /home/you/some/where/my-sandbox/lib/python2.6/site-packages/pylint/__pkginfo__.pyc
(my-sandbox)$ deactivate
$ python -c 'import pylint.__pkginfo__ as p;print p.version, p.__file__'
0.18.0 /usr/lib/pymodules/python2.6/pylint/__pkginfo__.pyc
You can safely make any change to the new pylint code or to other
sandboxed packages because your global environment remains
unchanged.
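The same check can be written as a small reusable helper instead of a one-liner. This is only a sketch for Python 3: the function name where is made up here, and the standard json module stands in for pylint so that the example runs anywhere:

```python
import importlib


def where(module_name):
    """Return (version, file) for an importable module.

    version is None when the module does not define __version__.
    """
    module = importlib.import_module(module_name)
    version = getattr(module, "__version__", None)
    path = getattr(module, "__file__", None)
    return version, path


# "json" is used only because it is always available; inside the
# sandbox you would pass the name of the package you just installed.
version, path = where("json")
print(version, path)
```

Running it once inside the sandbox and once outside shows which copy of a package each interpreter picks up.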
As said before, the sys.path of your sandboxed python still references the global
system paths. You can however hide them:
- either by using the --no-site-packages option, which does not give
the sandbox access to the global site-packages directory,
- or by changing your PYTHONPATH in my-sandbox/bin/activate
in the same way as for PATH (see tips).
$ virtualenv --no-site-packages closedPy
$ sed -i '9i PYTHONPATH="$_OLD_PYTHON_PATH"
9i export PYTHONPATH
9i unset _OLD_PYTHON_PATH
40i _OLD_PYTHON_PATH="$PYTHONPATH"
40i PYTHONPATH="."
40i export PYTHONPATH' closedPy/bin/activate
$ source closedPy/bin/activate
(closedPy)$ python -c 'import sys; print "\n".join(sys.path)'
/home/you/some/where/closedPy/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg
/home/you/some/where/closedPy
/home/you/some/where/closedPy/lib/python2.5
/home/you/some/where/closedPy/lib/python2.5/plat-linux2
/home/you/some/where/closedPy/lib/python2.5/lib-tk
/home/you/some/where/closedPy/lib/python2.5/lib-dynload
/usr/lib/python2.5
/usr/lib64/python2.5
/usr/lib/python2.5/lib-tk
/home/you/some/where/closedPy/lib/python2.5/site-packages
(closedPy)$ deactivate
This way, you'll get an even more isolated sandbox, just
as with a brand new python environment.
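To see at a glance which entries of sys.path still point outside a sandbox, a small filter is enough. The sketch below assumes POSIX-style paths; the function name leaking_paths is made up, and the sample list is a shortened copy of the closedPy listing above:

```python
import posixpath


def leaking_paths(paths, prefix):
    """Return the entries of paths that are not located under prefix."""
    prefix = posixpath.normpath(prefix)
    leaks = []
    for entry in paths:
        if not entry:
            # An empty entry means "current directory"; skip it.
            continue
        norm = posixpath.normpath(entry)
        if norm != prefix and not norm.startswith(prefix + "/"):
            leaks.append(entry)
    return leaks


prefix = "/home/you/some/where/closedPy"
sample = [
    prefix + "/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg",
    prefix,
    prefix + "/lib/python2.5",
    "/usr/lib/python2.5",
    "/usr/lib64/python2.5",
    prefix + "/lib/python2.5/site-packages",
]
leaks = leaking_paths(sample, prefix)
print(leaks)
```

On a live interpreter, leaking_paths(sys.path, sys.prefix) applies the same test to the real search path.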
It is possible to dedicate a sandbox to a
particular version of python by using the --python=PYTHON_EXE option,
which specifies the interpreter used to create the sandbox
(the default is the interpreter that virtualenv was installed with, e.g. /usr/bin/python):
$ virtualenv --python=python2.4 pyver24
$ source pyver24/bin/activate
(pyver24)$ python -V
Python 2.4.6
(pyver24)$ deactivate
$ virtualenv --python=python2.5 pyver25
$ source pyver25/bin/activate
(pyver25)$ python -V
Python 2.5.2
(pyver25)$ deactivate
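When several sandboxes coexist, it can be handy to ask an interpreter for its version from a script. A minimal sketch, assuming Python 3 for the script itself; the helper name interpreter_version is made up, and sys.executable stands in for a path such as pyver25/bin/python:

```python
import subprocess
import sys


def interpreter_version(python_exe):
    """Return the version string reported by `python_exe -V`."""
    # Older interpreters print the version on stderr, newer ones on
    # stdout, so both streams are merged.
    result = subprocess.run(
        [python_exe, "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


version = interpreter_version(sys.executable)
print(version)
```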
To distribute your sandbox, you must use the --relocatable option,
which makes an existing sandbox relocatable:
it fixes up scripts and makes all .pth files relative.
This option should be run just before you distribute the sandbox (each
time you have changed something in it).
An important point is that the host system should be similar to your own.
Add these functions to your .bashrc to make virtualenv easier to use
and to automate the creation and activation steps.
# Print the absolute path of "$1".
rel2abs() {
    # from http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2005-01/0206.html
    [ "$#" -eq 1 ] || return 1
    ls -Ld -- "$1" > /dev/null || return
    dir=$(dirname -- "$1" && echo .) || return
    dir=$(cd -P -- "${dir%??}" && pwd -P && echo .) || return
    dir=${dir%??}
    file=$(basename -- "$1" && echo .) || return
    file=${file%??}
    case $dir in
        /) printf '%s\n' "/$file";;
        /*) printf '%s\n' "$dir/$file";;
        *) return 1;;
    esac
    return 0
}

# Activate the sandbox that contains PATH (or the current directory).
function activate(){
    if [[ "$1" == "--help" ]]; then
        echo -e "usage: activate PATH\n"
        echo -e "Activate the sandbox where PATH points inside of.\n"
        return
    fi
    if [[ "$1" == '' ]]; then
        local target=$(pwd)
    else
        local target=$(rel2abs "$1")
    fi
    # Walk up the directory tree until a bin/activate script is found.
    until [[ "$target" == '/' ]]; do
        if test -e "$target/bin/activate"; then
            source "$target/bin/activate"
            echo "$target sandbox activated"
            return
        fi
        target=$(dirname "$target")
    done
    echo 'no sandbox found'
}

# Create and activate a highly isolated sandbox.
function mksandbox(){
    if [[ "$1" == "--help" ]]; then
        echo -e "usage: mksandbox NAME\n"
        echo -e "Create and activate a highly isolated sandbox named NAME.\n"
        return
    fi
    local name='sandbox'
    if [[ "$1" != "" ]]; then
        name="$1"
    fi
    # Test "$name" rather than "$1" so that the default name is checked too.
    if [[ -e "$name/bin/activate" ]]; then
        echo "$name is already a sandbox"
        return
    fi
    virtualenv --no-site-packages --clear --distribute "$name"
    sed -i '9i PYTHONPATH="$_OLD_PYTHON_PATH"
9i export PYTHONPATH
9i unset _OLD_PYTHON_PATH
40i _OLD_PYTHON_PATH="$PYTHONPATH"
40i PYTHONPATH="."
40i export PYTHONPATH' "$name/bin/activate"
    activate "$name"
}
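The lookup performed by the activate function can also be sketched in python, which may be easier to reuse from other tools. This is an illustration only: the function name find_sandbox is made up, and unlike the shell version it also checks the filesystem root before giving up:

```python
import os
import tempfile


def find_sandbox(start):
    """Walk up from start and return the first directory that contains
    bin/activate, or None when the filesystem root is reached."""
    target = os.path.abspath(start)
    while True:
        if os.path.exists(os.path.join(target, "bin", "activate")):
            return target
        parent = os.path.dirname(target)
        if parent == target:
            # dirname() no longer changes the path: this is the root.
            return None
        target = parent


# Small demonstration in a throwaway directory tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "bin"))
open(os.path.join(root, "bin", "activate"), "w").close()
nested = os.path.join(root, "src", "pkg")
os.makedirs(nested)

found = find_sandbox(nested)
print(found == root)
```

A python process cannot activate a sandbox in its parent shell, so this only locates the directory; sourcing bin/activate still has to happen in the shell.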
Note: The virtualenv-commands and virtualenvwrapper projects add some very
interesting features to virtualenv. Keep an eye on them if you need more advanced features than the ones above.
I find virtualenv irreplaceable for testing new configurations or
working on projects with different dependencies. Moreover, I use it to
learn about other python projects, to see exactly how my project interacts
with its dependencies (during debugging) or to test the end-user
experience.
All of this can be done without virtualenv, but not in such an easy and secure way.
I will continue the series by introducing other useful projects
that enhance your productivity: pip and Distribute. See you
soon.