blog entries created by Alain Leufroy

PyConFr

2013/10/28 by Alain Leufroy
http://www.pycon.fr/2013_static/pyconfr/images/banner.png

Logilab attended the annual gathering of Pythonistas of all kinds: the PYCONFR conference organized by the AFPy, which took place this year at the University of Strasbourg.

If you were not there, here is a small overview of what you missed, keeping in mind that the program was packed.

Where does packaging stand?

Our friends from Unlish presented the current state of Python package distribution.

After a general overview of PyPI, they described the latest changes that have improved the availability of Python packages.

The most important news concerned Wheel, which is now the recommended format for providing precompiled binaries. No more setuptools .egg files! This should bring a smile to more than one package maintainer or system administrator.

Wheel is a distribution file format. This clear and concise format is described in PEP427. It aims to simplify building packages for the distributions of your favorite OSes.

Recent versions of the pip installer can handle Wheel packages, which are compatible with the installation system described in PEP376. For now, however, you have to explicitly tell pip to use these files when they are available, with the --use-wheel option.

You thus get the advantages of pip (clear and simple dependency management, freeze, uninstallation, etc.) together with those of a precompiled package distribution (fast and simple installation, no development environment required, etc.).

Wheel packages take Python implementations and their ABIs into account. You can therefore easily provide (and sign) Wheel packages for specific versions of Jython, Pypy, IronPython, etc.

$ python setup.py bdist_wheel
$ pypy setup.py bdist_wheel

This does not exempt you from distributing the sources of your package ;)

$ python setup.py sdist
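The implementation and ABI tags mentioned above appear directly in the wheel file name defined by PEP427 (name, version, python tag, abi tag, platform tag, separated by dashes). A minimal sketch of splitting such a name (the file name below is hypothetical, and optional build tags are ignored):

```python
# Split a wheel file name into the tags defined by PEP427.
# Minimal sketch: optional build tags are not handled.
def parse_wheel_name(filename):
    name, version, python_tag, abi_tag, platform_tag = \
        filename[:-len(".whl")].split("-")
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}

# hypothetical file name, for illustration only
tags = parse_wheel_name("demo-1.0-cp27-none-linux_x86_64.whl")
print(tags["python"], tags["abi"])  # → cp27 none
```

This is how pip decides whether a given wheel matches your interpreter and platform.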

Python in Mercurial

http://www.selenic.com/hg-logo/logo-droplets-50.png

Pierre-Yves David and Alexis Métaireau gave a short reminder of the really great things in Mercurial, such as revsets and templates.

The core of their presentation was about the use of Python to write Mercurial.

According to its author, Mercurial exists today thanks to Python. Indeed, Python allowed Matt Mackall to write a proof of concept in barely two weeks -- he had no more time to dedicate to it, so an implementation in C was out of the question.

Remember that before changing the implementation language, it is always worth questioning the algorithms in use. We saw a few examples of Python optimizations that sped up Mercurial, and a few tricks to work around the slowness encountered with the CPython interpreter (lazy imports, low-level access, etc.).
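The lazy-import trick can be sketched in a few lines of plain Python (a simplified illustration of the idea, not Mercurial's actual demandimport machinery):

```python
import importlib

class LazyModule:
    """Defer the real import until an attribute is first needed."""
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        if self._module is None:  # import only on first use
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

json = LazyModule("json")    # nothing imported yet: startup stays fast
print(json.dumps({"a": 1}))  # → {"a": 1}
```

Deferring imports like this keeps command start-up time low, which matters a lot for a CLI tool invoked many times.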

The other notable advantages of using Python come from its flexibility. Thanks to it, Mercurial extensions can change Mercurial's internal behavior. For example, largefiles and watchman greatly improve the handling of large files and the updating of repository information.

Hy, Lisp on Python

http://docs.hylang.org/en/latest/_images/hy_logo-smaller.png

Julien Danjou presented an implementation of Lisp based on the Python VM. Indeed, Python can be seen as a subset of Lisp.

Hy parses a script written in a Lisp dialect and converts it into a regular Python syntax tree, which is then executed by the Python interpreter.

[Python] .py -(parse)---> AST -(compile)-> .pyc -(run)-> python vm
                      /
[Lisp]   .hy -(parse)/
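The right-hand side of this pipeline is plain Python machinery: an AST compiled to bytecode and run by the VM. A tiny sketch using the standard ast module (this is not Hy's code, just the part of the chain that Hy plugs into):

```python
import ast

# Build the AST that a front end like Hy would emit,
# then hand it to the regular AST -> bytecode -> VM chain.
tree = ast.parse("result = 1 + 2")          # source -> AST
bytecode = compile(tree, "<demo>", "exec")  # AST -> code object
namespace = {}
exec(bytecode, namespace)                   # code object -> python vm
print(namespace["result"])  # → 3
```

Any front end that can produce such an AST gets Python's runtime for free, which is exactly what Hy does.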

tip

hy2py lets you see the Python equivalent of a Lisp script.

There is thus strong interoperability between what is implemented in Hy and what is implemented in Python. No trouble importing other Python modules, whatever they are.

Hy supports almost every version of Python and many interpreters, notably pypy.

Many Common Lisp functions are available, and Hy gets close to Clojure for class definitions.

For those interested in Hy, note that a few small things are still missing:

  • cons cells are under discussion
  • you will have to do without macroexpand to help you with your macros
  • not all Common Lisp functions are available yet
  • the Lisp dialect currently requires mixing [...] and (...), but this should change
  • Hy is not present at runtime, so there are necessarily some limitations

Python for Robotics

There was a very nice presentation by a team that regularly takes part in the French robotics championships.

They use a board based on an ARM SoC, on which they run GNU/Linux and a Python interpreter (2.7).

They wrote a few low-level routines in C/C++ for maximum reactivity. Apart from that, everything else is in Python, notably the algorithms that manage their robots' strategy.

Python makes their life much easier thanks to fast prototyping, the speed at which they can fix their code (especially given the lack of sleep during the competition), and its flexibility for running simulations beforehand, analyzing logs, etc.

A Python in the house

http://hackspark.fr/skin/frontend/base/default/images/logo3d_hackspark_small.png

There was also a presentation of a young but already functional home automation project, Hack'Spark! The short live demonstration of the system was quite impressive ;)

And for less than 100 euros, you can switch on the lights in your home from a web interface! Personally, I'm getting started this month ;)

The Kivy Graphical Framework

http://kivy.org/logos/kivy-logo-black-64.png

Kivy is written entirely in Python/Cython and uses OpenGL. It therefore has very good support on recent machines (Linux, BSD, MacOS, Android, iOS, RPi, etc.), and it has nothing to envy the other frameworks.

Kivy seems particularly convenient for carrying out projects on mobile platforms such as mobile phones and tablets (Android and iOS).

Moreover, among the tools shipped with Kivy, you will find a few things to simplify your development:

  • PyJNIus uses the JNI interface of the Java VM (via Cython). It acts as a proxy to Java classes and thus gives you access to the whole Android API.
  • PyObjus is the counterpart of PyJNIus for Objective-C on iOS.
  • Plyer tries to gather PyJNIus and PyObjus under a common, higher-level API, which lets you write the code only once for both platforms.
  • Buildozer helps compile projects for Android in a simpler way than with Python for Android.

We got a live presentation of the concepts and of how to put them into practice. I have a feeling this is going to make my life easier!


Going to EuroScipy2013

2013/09/04 by Alain Leufroy

The EuroScipy2013 conference was held in Brussels at the Université libre de Bruxelles.

http://www.logilab.org/file/175984/raw/logo-807286783.png

As usual, the first two days were dedicated to tutorials while the last two were dedicated to scientific presentations and general Python-related talks. The meeting was extended by one more day of sprint sessions, during which enthusiasts could help free software projects, namely sage, vispy and scipy.

Jérôme and I had the great opportunity to represent Logilab during the scientific tracks and the sprint day. We enjoyed many talks about scientific applications using python. We're not going to describe the whole conference. Visit the conference website if you want the complete list of talks. In this article we will try to focus on the ones we found the most interesting.

First of all, the keynote by Cameron Neylon about network-ready research was very interesting. He presented some graphs about the impact of group work on solving complex problems. They revealed that there is a critical network size at which the effectiveness of solving a problem drastically increases. He pointed out that the "friction" of source code accessibility limits the "getting help" variable. Open sourcing software may be the best way to reduce this "friction", while unit testing and continuous integration are facilitators. And, in general, process reproducibility is very important, not only in computational research. Retrieving experimental settings, metadata, and the process environment is vital. We agree with this, as we experience it every day in our work. That is why we encourage open source licenses and develop a collaborative platform providing distributed simulation traceability and reproducibility, Simulagora (in French).

Ian Ozsvald's talk dealt with key points and tips from his own experience of growing a business based on open source and Python, as well as mistakes to avoid (e.g. not checking beforehand that there are paying customers interested in what you want to develop). His talk was comprehensive and covered a wide panel of situations.

http://vispy.org/_static/img/logo.png

We got a very nice presentation of a young but interesting visualization tool: Vispy. It is 6 months old and its first public release came out in early August. It is the result of a merge of four separate libraries, oriented toward interactive visualisation (vs. static figure generation for Matplotlib) and using OpenGL on GPUs to avoid CPU overload. A demonstration with large datasets showed vispy displaying millions of points in real time at 40 frames per second. During the talk we got interesting information about OpenGL features, compared with Matplotlib's Agg (Anti-Grain Geometry) backend, which renders on the CPU.

We also got to learn about cartopy, an open source Python library originally written for weather and climate science. It provides a useful and simple API to manipulate cartographic mappings.

Distributed computing was a hot topic, and many talks were related to this theme.

https://www.openstack.org/themes/openstack/images/openstack-logo-preview-full-color.png

Gael Varoquaux reminded us of the key problems with "biggish data" and the key points for processing it successfully. I think some of his recommendations are useful in general, like "choose simple solutions", "fail gracefully" and "make it easy to debug". For big data processing, when I/O is the limiting constraint, first try to split the problem into random fractions of the data, then run the algorithms and aggregate the results to circumvent this limit. He also presented mini-batch processing, which takes a bunch of observations at a time (a memory usage/vectorization trade-off), and joblib.Parallel, which makes I/O faster using compression (CPUs are faster than disk access).
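The split-then-aggregate idea can be sketched in pure Python (a toy example computing a mean, not code from the talk):

```python
import random

# Compute a global mean by processing random fractions of the data
# independently, then aggregating the partial results.
data = list(range(1000000))
random.seed(0)
random.shuffle(data)  # random fractions, as suggested in the talk

chunk_size = 100000
partials = []  # (sum, count) for each fraction
for start in range(0, len(data), chunk_size):
    chunk = data[start:start + chunk_size]
    partials.append((sum(chunk), len(chunk)))

total = sum(s for s, _ in partials)
count = sum(c for _, c in partials)
print(total / count)  # → 499999.5
```

Each fraction fits in memory and could be processed by a separate worker; only the small (sum, count) pairs need to be aggregated.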

Benoit Da Mota talked about shared memory in parallel computing, and Antonio Messina gave us a quick overview of how to build a computing cluster with Elasticluster, using OpenStack/Slurm/ansible. He demonstrated starting and stopping a cluster on OpenStack: once all VMs are started, ansible configures them as hosts of the cluster, and new VMs can be created and added to the cluster on the fly thanks to a command line interface.

We also got a keynote by Peter Wang (from Continuum Analytics) about the future of data analysis with Python. As a physics PhD, I loved his metaphor of giving mass to data. He tried to explain the pain scientists feel when using databases.

https://scikits.appspot.com/static/images/scipyshiny_small.png

After the conference we took part in the numpy/scipy sprint, organized by Ralph Gommers and Pauli Virtanen. Eighteen people were trying to close issues of various difficulty levels, and we got a quick tutorial on how easy it is to contribute. The easiest path is to fork the project from its github page onto your own github account (you can create one for free), so that your patch submission later becomes a simple "Pull Request" (PR). Clone your scipy fork locally and make a new branch (git checkout -b <newbranch>) to tackle one specific issue. Once your patch is ready, commit it locally, push it to your github repository and, from the github interface, choose "Pull Request". You will be able to add something to your commit message before your PR is sent and looked at by the project's lead developers. For example, using "gh-XXXX" in your commit message will automatically add a link to issue no. XXXX. Here is the list of open issues for scipy; you can filter them, e.g. displaying only the ones considered easy to fix :D

For more information: Contributing to SciPy.


Logilab at PyConFR 2012 - report

2012/10/09 by Alain Leufroy
http://awesomeness.openphoto.me/custom/201209/4ed140-pycon3--1-of-37-_870x550.jpg

Logilab attended the PyConFR conference that took place in Paris two weeks ago.

We started with a pylint sprint, coordinated by Boris Feld, where quite a few volunteers came by to hunt bugs or add new features. Thanks to everyone!

For those who do not know it yet, pylint is a handy utility hosted in our forge. It is a very powerful static analysis tool for Python code that helps improve and maintain code quality.

Then, after the sponsors' talks, where you could have seen Olivier, we attended some truly excellent tutorials and presentations. There were hands-on presentations covering, among other things, testing, scikit-learn and tools to manage services (Cornice, Circus). There was also feedback on the CPython development process, on community-driven development and on a supercomputer. We even got to make music with Python and to do some embedded programming with the Raspberry Pi and Arduino!

With Pierre-Yves, we gave two introductory tutorials on the Mercurial distributed version control system. The first tutorial covered the basics with practical cases. During the second one, initially planned as a direct follow-up to the first, we ended up covering more advanced usages that solve everyday problems very efficiently, such as querying repositories or automatically hunting regressions by bisection. You can find the slides together with the exercises.

Pierre-Yves presented an important new Mercurial feature: obsolescence. It makes it possible to build history-editing tools in complete safety! Among these tools, Pierre-Yves wrote the mutable-history extension, which offers a host of very handy commands.

The presentation is available as a PDF and can be viewed online on slideshare. We will soon put the video online.

http://www.logilab.org/file/107770?vid=download

If the topic interests you and you missed this presentation, Pierre-Yves will talk about it again at OSDC.

For those who want more, Tarek Ziadé has made photos of the conference available here.


Profiling tools

2012/09/07 by Alain Leufroy

Python

Run time profiling with cProfile

Python is distributed with profiling modules. They describe the run time operation of a pure python program, providing a variety of statistics.

The cProfile module is the recommended module. To execute your program under the control of the cProfile module, a simple form is

$ python -m cProfile -s cumulative mypythonscript.py

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      16    0.055    0.003   15.801    0.988 __init__.py:1(<module>)
       1    0.000    0.000   11.113   11.113 __init__.py:35(extract)
     135    7.351    0.054   11.078    0.082 __init__.py:25(iter_extract)
10350736    3.628    0.000    3.628    0.000 {method 'startswith' of 'str' objects}
       1    0.000    0.000    2.422    2.422 pyplot.py:123(show)
       1    0.000    0.000    2.422    2.422 backend_bases.py:69(__call__)
       ...

Each column provides information about the execution time of every function call. -s cumulative sorts the results by descending cumulative time.

Note:

You can profile a particular python function such as main()

>>> import profile
>>> profile.run('main()')
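If you need more control, for instance profiling a single call site and sorting the report yourself, cProfile can also be driven programmatically together with pstats. A minimal sketch, where work() is a hypothetical stand-in for your own function:

```python
import cProfile
import io
import pstats

def work():
    # hypothetical workload standing in for your main()
    return sum(i * i for i in range(10000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# sort by cumulative time, like "-s cumulative" on the command line
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

This is handy when you want to keep profiling inside a test or a long-running process instead of wrapping the whole script.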

Graphical tools to show profiling results

Even though reporting tools are included in the cProfile profiler, it can be interesting to use graphical tools. Most of them work with a stats file that can be generated by cProfile using the -o filepath option.

Below are some of the available graphical tools that we tested.

Gprof2Dot

is a python based tool that transforms the profiling results into a picture containing the call tree graph (using graphviz). A typical profiling session with python looks like this:

$ python -m cProfile -o output.pstats mypythonscript.py
$ gprof2dot.py -f pstats output.pstats | dot -Tpng -o profiling_results.png
http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png

Each node of the output graph represents a function and has the following layout:

+----------------------------------+
|   function name : module name    |
| total time including sub-calls % |  total time including sub-calls %
|    (self execution time %)       |------------------------------------>
|  total number of self calls      |
+----------------------------------+

Nodes and edges are colored according to the "total time" spent in the functions.

Note: The following small patch makes the node color correspond to the execution time and the edge color to the "total time":
diff -r da2b31597c5f gprof2dot.py
--- a/gprof2dot.py      Fri Aug 31 16:38:37 2012 +0200
+++ b/gprof2dot.py      Fri Aug 31 16:40:56 2012 +0200
@@ -2628,6 +2628,7 @@
                 weight = function.weight
             else:
                 weight = 0.0
+            weight = function[TIME_RATIO]

             label = '\n'.join(labels)
             self.node(function.id,
PyProf2CallTree

is a script that helps visualize profiling data with the KCacheGrind graphical call tree analyzer. This is a more interactive solution than Gprof2Dot, but it requires installing KCacheGrind. Typical usage:

$ python -m cProfile -o stat.prof mypythonscript.py
$ python pyprof2calltree.py -i stat.prof -k

The profiling data file is opened in KCacheGrind thanks to the pyprof2calltree module, whose -k switch automatically opens KCacheGrind.

http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif

There are other tools that are worth testing:

  • RunSnakeRun is an interactive GUI tool which visualizes a profile file as a square map:

    $ python -m cProfile -o stat.prof mypythonscript.py
    $ runsnake stat.prof
    
  • pycallgraph generates PNG images of a call tree with the total number of calls:

    $ pycallgraph mypythonscript.py
    
  • lsprofcalltree also uses KCacheGrind to display profiling data:

    $ python lsprofcalltree.py -o output.log yourprogram.py
    $ kcachegrind output.log
    

C/C++ extension profiling

For optimization purposes one may have Python extensions written in C/C++. For such modules, cProfile will not dig into the corresponding call tree. Dedicated tools, for the most part external to Python, must be used to profile a C/C++ extension from Python.

Yep

is a python module dedicated to profiling compiled python extensions. It uses the google CPU profiler:

$ python -m yep --callgrind mypythonscript.py

Memory Profiler

You may want to control the amount of memory used by a python program. There is an interesting module that fits this need: memory_profiler

You can fetch the memory consumption of a program over time using:

>>> from memory_profiler import memory_usage
>>> memory_usage((main, (), {}))

memory_profiler can also spot, using pdb or IPython, the lines that consume the most memory.

General purpose Profiling

The Linux perf tool gives access to a wide variety of performance counter subsystems. Using perf, any execution configuration (pure python programs, compiled extensions, subprocess, etc.) may be profiled.

Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache-misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots.

You can have information about execution times with:

$ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py

You can have RAM access information using:

$ perf stat -e cache-misses python mypythonscript.py

Be careful: perf gives the raw values of the hardware counters. You therefore need to know exactly what you are looking for and how to interpret these values in the context of your program.

Note that you can use Gprof2Dot to get a more user-friendly output:

$ perf record -g python mypythonscript.py
$ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png

Text mode makes it into hgview 1.4.0

2011/10/06 by Alain Leufroy

Here is, at last, the release of version 1.4.0 of hgview.

http://www.logilab.org/image/77974?vid=download

Small description

Besides the classic bugfixes this release introduces a new text based user interface thanks to the urwid library.

Running hgview in a shell, in a terminal, over an ssh session is now possible! If you are trying not to use X (or to use it less), or have a geeky mouse-killer window manager such as wmii/dwm/ion/awesome/..., this is for you!

This TUI (Text User Interface!) adopts the principal features of the Qt4 based GUI, although only the main view has been implemented for now.

In a nutshell, this interface includes the following features:

  • display the revision graph (with working directory as a node, and basic support for the mq extension),
  • display the files affected by a selected changeset (with basic support for the bfiles extension)
  • display diffs (with syntax highlighting thanks to pygments),
  • automatically refresh the displayed revision graph when the repository is being modified (requires pyinotify),
  • easy key-based navigation in revisions' history of a repo (same as the GUI),
  • a command system for special actions (see help)

Installation

There are packages for debian and ubuntu in Logilab's debian repository.

Note: you have to install the hgview-curses package to get the text based interface.

Or you can simply clone our Mercurial repository:

hg clone http://hg.logilab.org/hgview

(more on the hgview home page)

Running the text based interface

A new --interface option is now available to choose the interface:

hgview --interface curses

Or you can fix it in the [hgview] section of your ~/.hgrc:

[hgview]
interface = curses # or qt or raw

Then run:

hgview

What's next

We'll be working on including other features from the Qt4 interface and making it fully configurable.

We'll also work on bugfixes and new features, so stay tuned! And feel free to file bugs and feature requests.


EuroSciPy'11 - Annual European Conference for Scientists using Python.

2011/08/24 by Alain Leufroy
http://www.logilab.org/image/9852?vid=download

The EuroScipy2011 conference will be held in Paris at the Ecole Normale Supérieure from August 25th to 28th and is co-organized and sponsored by INRIA, Logilab and other companies.

The conference is a cross-disciplinary gathering dedicated to the use and development of the Python language in scientific research.

August 25th and 26th are dedicated to tutorial tracks -- basic and advanced tutorials. August 27th and 28th are dedicated to talks, posters and demos sessions.

Damien Garaud, Vincent Michel and Alain Leufroy (and others) from Logilab will be there. We will talk about an RSS feed aggregator based on Scikits.learn and CubicWeb, and we have a poster about LibAster (a python library for thermomechanical simulation based on Code_Aster).


Distutils2 Sprint at Logilab (first day)

2011/01/28 by Alain Leufroy

We're very happy to host the Distutils2 sprint this week in Paris.

The sprint started yesterday with some of Logilab's developers and other contributors. We'll sprint for 4 days, trying to push forward the new python package manager.

Let's summarize this first day:

  • Boris Feld and Pierre-Yves David worked on the new system for detecting and dispatching data-files.
  • Julien Miotte worked on
    • moving qGitFilterBranch from setuptools to distutils2
    • testing distutils2 installation and register (see the tutorial)
    • backward compatibility with distutils in setup.py, using setup.cfg to fill in the arguments of setup(), in order to help users switch to distutils2.
  • André Espaze and Alain Leufroy worked on the python script that helps developers build a setup.cfg by recycling their existing setup.py (track).

Join us on IRC at #distutils on irc.freenode.net !


Virtualenv - Play safely with a Python

2010/03/26 by Alain Leufroy
http://farm5.static.flickr.com/4031/4255910934_80090f65d7.jpg

virtualenv, pip and Distribute are three tools that help developers and packagers. In this short presentation we will look at some of virtualenv's capabilities.

Please keep in mind that everything below was done using: Debian Lenny, python 2.5 and virtualenv 1.4.5.

Abstract

virtualenv builds python sandboxes where it is possible to do whatever you want as a simple user without putting your global environment in jeopardy.

virtualenv allows you to safely:

  • install any python package
  • add debug lines everywhere (not only in your scripts)
  • switch between python versions
  • try your code as if you were the final user
  • and so on ...

Install and usage

Install

Preferred way

Just download the virtualenv python script at http://bitbucket.org/ianb/virtualenv/raw/tip/virtualenv.py and call it using python (e.g. python virtualenv.py).

For convenience, we will refer to this script as virtualenv.

Other ways

For Debian (and Ubuntu) addicts, just do:

$ sudo aptitude install python-virtualenv

Fedora users would do:

$ sudo yum install python-virtualenv

And others can install from PyPI (as superuser):

$ pip install virtualenv

or

$ easy_install pip && pip install virtualenv

You could also get the source here.

Quick Guide

To work in a python sandbox, do as follows:

$ virtualenv my_py_env
$ source my_py_env/bin/activate
(my_py_env)$ python

"That's all Folks !"

Once you have finished just do:

(my_py_env)$ deactivate

or quit the tty.

What does virtualenv actually do ?

At creation time

Let's start again ... more slowly. Consider the following environment:

$ pwd
/home/you/some/where
$ ls

Now create a sandbox called my-sandbox:

$ virtualenv my-sandbox
New python executable in "my-sandbox/bin/python"
Installing setuptools............done.

The output says that you have a new python executable and specific install tools. Your current directory now looks like:

$ ls -Cl
my-sandbox/ README
$ tree -L 3 my-sandbox
my-sandbox/
|-- bin
|   |-- activate
|   |-- activate_this.py
|   |-- easy_install
|   |-- easy_install-2.5
|   |-- pip
|   `-- python
|-- include
|   `-- python2.5 -> /usr/include/python2.5
`-- lib
    `-- python2.5
        |-- ...
        |-- orig-prefix.txt
        |-- os.py -> /usr/lib/python2.5/os.py
        |-- re.py -> /usr/lib/python2.5/re.py
        |-- ...
        |-- site-packages
        |   |-- easy-install.pth
        |   |-- pip-0.6.3-py2.5.egg
        |   |-- setuptools-0.6c11-py2.5.egg
        |   `-- setuptools.pth
        |-- ...

In addition to the new python executable and the install tools, you have a whole new python environment containing libraries, a site-packages/ directory (where your packages will be installed), a bin directory, ...

Note:
virtualenv does not create every file needed to get a whole new python environment. It uses links to global environment files instead, in order to save disk space and speed up the sandbox creation. Therefore, there must already be a working python environment installed on your system.

At activation time

At this point you have to activate the sandbox in order to use your custom python. Once activated, python still has access to the global environment but will look in your sandbox first for python modules:

$ source my-sandbox/bin/activate
(my-sandbox)$ which python
/home/you/some/where/my-sandbox/bin/python
$ echo $PATH
/home/you/some/where/my-sandbox/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
(pyver)$ python -c 'import sys;print sys.prefix;'
/home/you/some/where/my-sandbox
(pyver)$ python -c 'import sys;print "\n".join(sys.path)'
/home/you/some/where/my-sandbox/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg
[...]
/home/you/some/where/my-sandbox
/home/you/personal/PYTHONPATH
/home/you/some/where/my-sandbox/lib/python2.5/
[...]
/usr/lib/python2.5
[...]
/home/you/some/where/my-sandbox/lib/python2.5/site-packages
[...]
/usr/local/lib/python2.5/site-packages
/usr/lib/python2.5/site-packages
[...]

First of all, a (my-sandbox) message is automatically added to your prompt in order to make it clear that you're using a python sandbox environment.

Secondly, my-sandbox/bin/ is added to your PATH. So, running python calls the specific python executable placed in my-sandbox/bin.
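This PATH-based lookup can be illustrated with a small pure-python simulation (the paths and the tiny resolve() helper below are hypothetical, for illustration only):

```python
import os

def resolve(cmd, path_entries, available):
    # mimic the shell: return the first PATH entry providing cmd
    for directory in path_entries:
        if (directory, cmd) in available:
            return os.path.join(directory, cmd)
    return None

# hypothetical filesystem: two python executables exist
available = {("/usr/bin", "python"),
             ("/home/you/some/where/my-sandbox/bin", "python")}

path_before = ["/usr/bin"]
# activation simply prepends the sandbox bin directory to PATH
path_after = ["/home/you/some/where/my-sandbox/bin"] + path_before

print(resolve("python", path_before, available))  # → /usr/bin/python
print(resolve("python", path_after, available))   # → /home/you/some/where/my-sandbox/bin/python
```

Nothing magical happens at activation time: the sandbox wins simply because its bin directory comes first on PATH.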

Note
It is possible to improve the sandbox isolation by ignoring the global paths and your PYTHONPATH (see Improve isolation section).

Installing packages

It is possible to install any package in the sandbox without any superuser privilege. For instance, we will install the pylint development revision in the sandbox.

Suppose that you have the pylint stable version already installed in your global environment:

(my-sandbox)$ deactivate
$ python -c 'from pylint.__pkginfo__ import version;print version'
0.18.0

Once your sandbox is activated, install the development revision of pylint as an update:

$ source /home/you/some/where/my-sandbox/bin/activate
(my-sandbox)$ pip install -U hg+http://www.logilab.org/hg/pylint#egg=pylint-0.19

The new package and its dependencies are only installed in the sandbox:

(my-sandbox)$ python -c 'import pylint.__pkginfo__ as p;print p.version, p.__file__'
0.19.0 /home/you/some/where/my-sandbox/lib/python2.6/site-packages/pylint/__pkginfo__.pyc
(my-sandbox)$ deactivate
$ python -c 'import pylint.__pkginfo__ as p;print p.version, p.__file__'
0.18.0 /usr/lib/pymodules/python2.6/pylint/__pkginfo__.pyc

You can safely make any change in the new pylint code or in other sandboxed packages, because your global environment is still unchanged.

Useful options

Improve isolation

As said before, your sandboxed python's sys.path still references the global system paths. You can however hide them by:

  • either using the --no-site-packages option, which does not give the sandbox access to the global site-packages directory
  • or changing your PYTHONPATH in my-sandbox/bin/activate in the same way as for PATH (see tips)
$ virtualenv --no-site-packages closedPy
$ sed -i '9i PYTHONPATH="$_OLD_PYTHON_PATH"
      9i export PYTHONPATH
      9i unset _OLD_PYTHON_PATH
      40i _OLD_PYTHON_PATH="$PYTHONPATH"
      40i PYTHONPATH="."
      40i export PYTHONPATH' closedPy/bin/activate
$ source closedPy/bin/activate
(closedPy)$ python -c 'import sys; print "\n".join(sys.path)'
/home/you/some/where/closedPy/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg
/home/you/some/where/closedPy
/home/you/some/where/closedPy/lib/python2.5
/home/you/some/where/closedPy/lib/python2.5/plat-linux2
/home/you/some/where/closedPy/lib/python2.5/lib-tk
/home/you/some/where/closedPy/lib/python2.5/lib-dynload
/usr/lib/python2.5
/usr/lib64/python2.5
/usr/lib/python2.5/lib-tk
/home/you/some/where/closedPy/lib/python2.5/site-packages
$ deactivate

This way, you'll get an even more isolated sandbox, just as with a brand new python environment.

Work with different versions of Python

It is possible to dedicate a sandbox to a particular version of python by using the --python=PYTHON_EXE option, which specifies the python interpreter to use (by default the one virtualenv was installed with, /usr/bin/python):

$ virtualenv --python=python2.4 pyver24
$ source pyver24/bin/activate
(pyver24)$ python -V
Python 2.4.6
$ deactivate
$ virtualenv --python=python2.5 pyver25
$ source pyver25/bin/activate
(pyver25)$ python -V
Python 2.5.2
$ deactivate

Distribute a sandbox

To distribute your sandbox, you must use the --relocatable option, which makes an existing sandbox relocatable: it fixes up the scripts and makes all .pth files relative. This option should be called just before you distribute the sandbox (and again each time you have changed something in it).

An important point is that the host system should be similar to your own.

Tips

Speed up sandbox manipulation

Add these scripts to your .bashrc to help you use virtualenv and to automate the creation and activation processes.

rel2abs() {
#from http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2005-01/0206.html
  [ "$#" -eq 1 ] || return 1
  ls -Ld -- "$1" > /dev/null || return
  dir=$(dirname -- "$1" && echo .) || return
  dir=$(cd -P -- "${dir%??}" && pwd -P && echo .) || return
  dir=${dir%??}
  file=$(basename -- "$1" && echo .) || return
  file=${file%??}
  case $dir in
    /) printf '%s\n' "/$file";;
    /*) printf '%s\n' "$dir/$file";;
    *) return 1;;
  esac
  return 0
}
function activate(){
    if [[ "$1" == "--help" ]]; then
        echo -e "usage: activate PATH\n"
        echo -e "Activate the sandbox where PATH points inside of.\n"
        return
    fi
    if [[ "$1" == '' ]]; then
        local target=$(pwd)
    else
        local target=$(rel2abs "$1")
    fi
    until  [[ "$target" == '/' ]]; do
        if test -e "$target/bin/activate"; then
            source "$target/bin/activate"
            echo "$target sandbox activated"
            return
        fi
        target=$(dirname "$target")
    done
    echo 'no sandbox found'
}
function mksandbox(){
    if [[ "$1" == "--help" ]]; then
        echo -e "usage: mksandbox NAME\n"
        echo -e "Create and activate a highly isolated sandbox named NAME.\n"
        return
    fi
    local name='sandbox'
    if [[ "$1" != "" ]]; then
        name="$1"
    fi
    if [[ -e "$name/bin/activate" ]]; then
        echo "$name is already a sandbox"
        return
    fi
    virtualenv --no-site-packages --clear --distribute "$name"
    sed -i '9i PYTHONPATH="$_OLD_PYTHON_PATH"
            9i export PYTHONPATH
            9i unset _OLD_PYTHON_PATH
           40i _OLD_PYTHON_PATH="$PYTHONPATH"
           40i PYTHONPATH="."
           40i export PYTHONPATH' "$name/bin/activate"
    activate "$name"
}
Note:
The virtualenv-commands and virtualenvwrapper projects add some very interesting features to virtualenv. So, keep an eye on them for more advanced features than the ones above.

Conclusion

I find it irreplaceable for testing new configurations or working on projects with different dependencies. Moreover, I use it to learn about other python projects, to see how exactly my project interacts with its dependencies (during debugging), or to test the final user experience.

All of this stuff can be done without virtualenv but not in such an easy and secure way.

I will continue this series by introducing other useful projects that enhance your productivity: pip and Distribute. See you soon.