Blog entries

  • Coding sprint scikits.learn

    2011/03/22 by Vincent Michel

    We are planning a one-day coding sprint on scikits.learn on April 1st.
    Coming in person or participating remotely on IRC is more than welcome!

    More information can be found on the wiki:

  • Distutils2 Sprint at Logilab (first day)

    2011/01/28 by Alain Leufroy

    We're very happy to host the Distutils2 sprint this week in Paris.

    The sprint started yesterday with some of Logilab's developers and other contributors. We'll sprint for 4 days, trying to pull up the new python package manager.

    Let's summarize this first day:

    • Boris Feld and Pierre-Yves David worked on the new system for detecting and dispatching data-files.
    • Julien Miotte worked on
      • moving qGitFilterBranch from setuptools to distutils2
      • testing distutils2 installation and register (see the tutorial)
      • the backward compatibility to distutils in, using setup.cfg to fill the setup arguments of setup for helping users to switch to distutils2.
    • André Espaze and Alain Leufroy worked on the python script that helps developers build a setup.cfg by recycling their existing (track).

    Join us on IRC at #distutils on !

  • The Python Package Index is not a "Software Distribution"

    2011/01/26 by Pierre-Yves David

    Recent discussions on the #distutils IRC channel and with my Logilab co-workers led me to the following conclusions:

    • The Python Package Index is not a software distribution
    • There is more than one way to distribute python software
    • Distribution packagers are power users and need super cow-powers
    • Users want it to "just work"
    • The Python Package Index is used by many as a software distribution
    • Pypi has a lot of contributions because requirements are low.

    The Python Package Index is not a software distribution

    I would define a software distribution as:

    • Organised group of people
    • Who apply a Unified Quality process
    • To a finite set of software
    • Which includes all its dependencies
    • With a consistent set of versions that work together
    • For a finite set of platforms
    • Managed and installed by dedicated tools.

    Pypi is a public index where:

    • Any python developer
    • Can upload any tarball containing something related
    • To any python package
    • Which might have external dependencies (outside Pypi)
    • The latest version of something is always available disregarding its compatibility with other packages.
    • Binary packages can be provided for any platform but are usually not.
    • There are several tools to install and manage python packages from pypi.

    Pypi is not a software distribution, it is a software index.

    There is more than one way to distribute python software

    It is a long way from the pure source used by the developer to the software installed on the system of the end user.

    First, the source must be extracted from a (D)VCS to make a version tarball, while executing several release-specific actions (e.g. changelog generation from a tracker). Second, the version tarball is used to generate a platform-independent build, while executing several build steps (e.g. Cython compilation into C files or documentation generation). Third, the platform-independent build is used to generate a platform-dependent build, while executing several platform-dependent build steps (e.g. compilation of C extensions). Finally, the platform-dependent build is installed and each file gets dispatched to its proper location during the installation process.

    Pieces of software can be distributed as development snapshots taken from the (D)VCS, version tarballs, source packages, platform-independent packages or platform-dependent packages.

    Distribution packagers are power users and need super cow-powers

    Distribution packagers usually have the necessary infrastructure and skills to build packages from version tarballs. Moreover they might have specific needs that require as much control as possible over the various build steps. For example:

    • Specific help system requiring a custom version of sphinx.
    • Specific security or platform constraints that require a specific version of Cython.

    Users want it to "just work"

    Standard users want it to "just work". They prefer simple and quick ways to install stuff. Build steps done on their machine increase the duration of the installation, add potential new dependencies and may trigger errors. Standard users are very disappointed when an installation fails because an error occurred while building the documentation. Users give up when they have to download extra dependencies and set up a complicated compilation environment.

    Users want as many build steps as possible to be done by someone else. That's why many users choose a distribution that does the job for them (e.g. Ubuntu, Red Hat, Python(x,y)).

    The Python Package Index is used by many as a software distribution

    But there are several situations where users can't rely on their distribution to install python software:

    • There is no distribution available for the platform (Windows, Mac OS X)
    • They want to install a python package outside of their distribution system (to test or because they do not have the credentials to install it system-wide)
    • The software or version they need is not included in the finite set of software included in their distribution.

    When this happens, the user will use Pypi to fetch python packages. To help them, Pypi accepts binary packages of python modules and people have developed dedicated tools that ease installation of packages and their dependencies: pip, easy_install.

    Pip + Pypi provides the tools of a distribution without its consistency. This is better than nothing.

    Pypi has a lot of contributions because requirements are low

    Pypi should contain version tarballs of all known python modules. It is the first purpose of an index. Version tarballs should let distributions and power users perform as many build steps as possible. Pypi will continue to be used as a distribution by people without a better option. Packages provided to these users should require as little as possible to be installed, meaning they either have no build step to perform or have only platform-dependent build steps (that could not be executed by the developer).

    If the upcoming distutils2 provides a way to differentiate platform-dependent build steps from platform-independent ones, python developers will be able to upload three different kinds of packages on Pypi.

    sdist: Pure source version released by upstream, targeted at packagers and power users.
    idist: Platform-independent package with the platform-independent build steps done (Cython, docs). If there is no such build step, the package is the same as sdist.
    bdist: Platform-dependent package with all build steps performed. For packages with no platform-dependent build step, this package is the same as idist.
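
    As a rough illustration of the three kinds of packages, the sketch below lists which distinct packages a project would upload depending on the build steps it has. The function name and its boolean parameters are hypothetical and only serve the example; nothing here is part of distutils2.

```python
def packages_to_upload(has_independent_steps, has_dependent_steps):
    """List the distinct package kinds worth uploading for a project.

    Hypothetical helper illustrating the proposal; not a distutils2 API.
    """
    kinds = ["sdist"]  # the pure source release is always published
    if has_independent_steps:
        # e.g. Cython compilation into C files or documentation generation
        kinds.append("idist")
    if has_dependent_steps:
        # e.g. C extensions compiled for one given platform
        kinds.append("bdist")
    return kinds


# A pure python project without any build step only needs an sdist.
print(packages_to_upload(False, False))
# A Cython project has both kinds of build steps.
print(packages_to_upload(True, True))
```

    When a project has no build step of a given kind, the corresponding package would be identical to the previous one, so the sketch simply leaves it out.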

    (Image under creative commons Card File by-nc-nd by Mr. Ducke / Matt, Thomas Fisher Rare Book Library by bookchen, package! by Beck Gusler, Cheese Factory by James Yu)

  • Fresh release of lutin77, Logilab Unit Test IN fortran 77

    2011/01/11 by Andre Espaze

    I am pleased to announce the 0.2 release of lutin77 for running Fortran 77 tests by using a C compiler as the only dependency. Moreover this very light framework of 97 lines of C code makes a very good demo of Fortran and C interfacing. The next level could be to write it in GAS (GNU Assembler).

    For the overexcited maintainers of legacy code, here comes a screenshot:

    $ cat test_error.f
          subroutine success
          end

          subroutine error
          integer fid
          open(fid, status="old", file="nofile.txt")
          write(fid, *) "Ola"
          end

          subroutine checke
          call check(.true.)
          call check(.false.)
          call abort
          end

          program run
          call runtest("error")
          call runtest("success")
          call runtest("absent")
          call runtest("checke")
          call resume
          end

    Then you can build the framework by:

    $ gcc -Wall -pedantic -c lutin77.c

    And now run your tests:

    $ gfortran -o test_error test_error.f lutin77.o -ldl -rdynamic
    $ ./test_error
      At line 6 of file test_error.f
      Fortran runtime error: File 'nofile.txt' does not exist
      Error with status 512 for the test "error".
      "absent" test not found.
      Failure at check statement number 2.
      Error for the test "checke".
      4 tests run (1 PASSED, 0 FAILED, 3 ERRORS)

    See also the list of test frameworks for Fortran.

  • Distutils2 January Sprint in Paris

    2011/01/07 by Pierre-Yves David

    At Logilab, we have the pleasure of hosting a distutils2 sprint in January. Sprinters are welcome in our Paris office from 9h on the 27th of January to 19h on the 30th of January. This sprint will focus on polishing distutils2 for the next alpha release and on the install/remove scripts.

    Distutils2 is an important project for Python. Every contribution will help to improve the current state of packaging in Python. See the wiki page on for details about participation. If you can't join us in Paris, you can participate on the #distutils channel of the freenode IRC network.

    For additional details, see Tarek Ziadé's original announcement, read the wiki page on or contact us.

  • Accessing data on a virtual machine without network

    2010/12/02 by Andre Espaze

    At Logilab, we work a lot with virtual machines for testing and developing code on customers' architectures. We access virtual machines through the network and copy data with the scp command. However, if the network fails, there is still a way to access your data: mounting a rescue disk on the virtual machine. The following commands use qemu, but the idea could certainly be adapted to other emulators.

    Creating and mounting the rescue disk

    To be able to mount the rescue disk on your system later, you need to use the raw image format (the default with qemu):

    $ qemu-img create data-rescue.img 10M

    Then run your virtual machine with 'data-rescue.img' attached (you need to add a disk storage in virt-manager). Once in your virtual system, you will have to partition and format your new hard disk. As an example with Linux (win32 users will prefer right clicks):

    $ fdisk /dev/sdb
    $ mke2fs -j /dev/sdb1

    Then the new disk can be mounted and used:

    $ mount /dev/sdb1 /media/usb
    $ cp /home/dede/important-customer-code.tar.bz2 /media/usb
    $ umount /media/usb

    You can then stop your virtual machine.

    Getting back data from the rescue disk

    You will then have to carry your 'data-rescue.img' to a system where you can mount a file with the 'loop' option. But first we need to find where our partition starts:

    $ fdisk -ul data.img
    You must set cylinders.
    You can do this from the extra functions menu.
    Disk data.img: 0 MB, 0 bytes
    255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Disk identifier: 0x499b18da
    Device Boot      Start         End      Blocks   Id  System
    data.img1           63       16064        8001   83  Linux

    Now we can mount the partition and get back our code:

    $ mkdir /media/rescue
    $ mount -o loop,offset=$((63 * 512)) data-rescue.img /media/rescue/
    $ ls /media/rescue/
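
    The offset passed to mount is simply the start sector of the partition multiplied by the sector size, both read from the fdisk output. A minimal sketch of that arithmetic follows; the function and constant names are only illustrative.

```python
SECTOR_SIZE = 512  # bytes, from the "Units = sectors of 1 * 512" line


def loop_offset(start_sector, sector_size=SECTOR_SIZE):
    """Return the byte offset of a partition inside a raw disk image."""
    return start_sector * sector_size


# The partition listed by fdisk starts at sector 63.
print("mount -o loop,offset=%d data-rescue.img /media/rescue/" % loop_offset(63))
```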

  • Thoughts on the python3 conversion workflow

    2010/11/30 by Emile Anclin


    The 2to3 script is a very useful tool. We can simply run it over the whole code base and end up with python3 compatible code whilst keeping a python2 code base. To make our code python3 compatible, we do (or did) two things:

    • small python2 compatible modifications of our source code
    • run 2to3 over our code base to generate a python3 compatible version

    However, we not only want to have one python3 compatible version, but also to keep developing our software. Hence, we want to be able to easily test it for both python2 and python3. Furthermore, if we use patches to get nice commits, this starts to get quite messy. Let's consider this in the case of Pylint. Indeed, the workflow described before proved to be unsatisfying.

    • I have two repositories, one for python2, one for python3. On the python3 side, I run 2to3 and store the modifications in a patch or a commit.

    • Whenever I implement a fix or a functionality on either side, I have to test if it still works on the other side; but as the 2to3 modifications are often quite heavy, directly creating patches on one side and applying them on the other side won't work most of the time.

    • Now say, I implement something in my python2 base and hold it in a patch or commit it. I can then pull it to my python3 repo:

      • running 2to3 on all Pylint is quite slow: around 30 sec for Pylint without the tests, and around 2 min with the tests. (I'd rather not imagine how long it would take for say CubicWeb).

      • even if I have all my 2to3 modifications in a patch, it takes 5-6 sec to "qpush" or "qpop" them all. Committing the 2to3 changes instead and using:

        hg pull -u --rebase

        is not much faster. If I don't use --rebase, I will have merges on each pull up. Furthermore, we often have either a patch application failure, merge conflict or end up with something which is not python3 compatible (like a newly introduced "except Error, exc").

    • So quite often, I will have to fix it with:

      hg revert -r REV <broken_files>
      2to3 -nw <broken_files>
      hg qref # or hg resolve -m; hg rebase -c
    • Suppose that 2to3 transition worked fine, or that we fixed it. I run my tests with python3 and see it does not work; so I modify the patch: it all starts again; and the new patch or the patch modification will create a new head in my python3 repo...
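
    Spotting the broken files after a pull could be sketched as a plain syntax check run with the python3 interpreter. This is only an assumption about how one might automate the step; the function names are hypothetical, and a file that compiles may of course still fail at run time.

```python
def compiles_here(source, filename="<string>"):
    """Tell whether `source` is valid syntax for the running interpreter."""
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


def broken_files(paths):
    """Return the paths whose content does not compile."""
    broken = []
    for path in paths:
        with open(path, "rb") as stream:
            if not compiles_here(stream.read(), path):
                broken.append(path)
    return broken


# Under python3, a python2 print statement is reported as broken.
print(compiles_here('print "hello"\n'))
```

    The list returned by broken_files could then be handed to the hg revert and 2to3 -nw commands.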

    2to3 Fixers

    Considering all that, let's investigate 2to3: it comes with a lot of fixers that can be activated or deactivated. Many of them only fix very rare use cases or things that have been deprecated for years. Moreover, every enabled fixer has to match its patterns against the code, so the more we remove, the faster 2to3 should be. Depending on the project, most cases will simply not appear, and for the others, we should be able to find other means of disabling them. The lists proposed hereafter are only suggestions: which fixers can actually be disabled, and how, will depend on the code base and other overall considerations.

    python2 compatible

    The following fixers are 2.x compatible and should be run once and for all (and can then be disabled for daily conversion usage):

    • apply
    • execfile (?)
    • exitfunc
    • getcwdu
    • has_key
    • idioms
    • ne
    • nonzero
    • paren
    • repr
    • standarderror
    • sys_exc
    • tuple_params
    • ws_comma
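
    Once these fixers have been applied for good, the daily conversion can skip them with the -x option of 2to3. The sketch below only builds the command line; the function name and the example path are hypothetical.

```python
# Fixers from the list above, assumed to have been applied once and for all.
ALREADY_APPLIED = [
    "apply", "execfile", "exitfunc", "getcwdu", "has_key", "idioms", "ne",
    "nonzero", "paren", "repr", "standarderror", "sys_exc", "tuple_params",
    "ws_comma",
]


def daily_2to3_command(paths, disabled=ALREADY_APPLIED):
    """Build a 2to3 command line that skips the given fixers."""
    command = ["2to3", "-n", "-w"]  # -n: no backups, -w: write files back
    for fixer in disabled:
        command.extend(["-x", fixer])
    command.extend(paths)
    return command


# "pylint/" is only an example path.
print(" ".join(daily_2to3_command(["pylint/"])))
```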


    The following can be handled with imports from a "compat" module, like the logilab.common.compat module, which holds convenient compatible objects:

    • callable
    • exec
    • filter (Wraps filter() usage in a list call)
    • input
    • intern
    • itertools_imports
    • itertools
    • map (Wraps map() in a list call)
    • raw_input
    • reduce
    • zip (Wraps zip() usage in a list call)
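
    To give an idea of what such a module could hold, here is a minimal sketch. It is not the content of logilab.common.compat: the helper names lmap, lfilter and lzip are hypothetical.

```python
import sys

if sys.version_info[0] >= 3:
    from functools import reduce
    from sys import intern
    raw_input = input  # python3's input() behaves like python2's raw_input()
else:
    # python2: re-export the builtins so that callers import them from here
    reduce = reduce
    intern = intern
    raw_input = raw_input


def lmap(function, *iterables):
    """map() returning a list under both python2 and python3."""
    return list(map(function, *iterables))


def lfilter(function, iterable):
    """filter() returning a list under both python2 and python3."""
    return list(filter(function, iterable))


def lzip(*iterables):
    """zip() returning a list under both python2 and python3."""
    return list(zip(*iterables))
```

    Code importing these names from compat gives the corresponding fixers nothing left to rewrite, so they could be disabled.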

    strings and bytes

    Maybe they could also be handled by compat:

    • basestring
    • unicode
    • print

    For print, for example, we could imagine a once-and-for-all custom fixer that would replace it with a convenient echo function (or whatever name you like) defined in compat.
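
    Such an echo function could look like the sketch below. The keyword arguments mirror those of python3's print() by choice; this is an assumption about a possible design, not an existing function of compat.

```python
import io
import sys


def echo(*args, **kwargs):
    """Write the arguments the way python3's print() does.

    Hypothetical helper: the name and keyword arguments are assumptions.
    """
    sep = kwargs.pop("sep", " ")
    end = kwargs.pop("end", "\n")
    stream = kwargs.pop("file", None)
    if stream is None:
        stream = sys.stdout
    if kwargs:
        raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))
    stream.write(sep.join(str(arg) for arg in args) + end)


# Usage under python3: capture the output in memory instead of stdout.
buffer = io.StringIO()
echo("checked", 3, "modules", file=buffer)
```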


    The following issues could probably be fixed manually:

    • dict (it fixes dict iterator methods; it should be possible to have code where we can disable this fixer)
    • import (Detects sibling imports; we could convert them to absolute import)
    • imports, imports2 (renamed modules)
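
    For the dict case, a manual fix mostly means writing dictionary accesses in a form that behaves the same under both versions. The sketch below assumes the dict fixer would otherwise wrap such method calls in list(); the function names are only illustrative.

```python
def sorted_items(mapping):
    """Return the (key, value) pairs as a sorted list, python2 or python3."""
    return sorted(mapping.items())


def keys_snapshot(mapping):
    """Return the keys as a real list, safe to use while mutating mapping."""
    return list(mapping)


options = {"reports": True, "disable": "C0111"}
for name in keys_snapshot(options):
    if options[name] is True:
        del options[name]  # allowed: we iterate over a copy of the keys
print(sorted_items(options))
```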


    These changes seem to be necessary:

    • except
    • long
    • funcattrs
    • future
    • isinstance (Fixes duplicate types in the second argument of isinstance(). For example, isinstance(x, (int, int)) is converted to isinstance(x, (int)))
    • metaclass
    • methodattrs
    • numliterals
    • next
    • raise

    Consider however that a lot of them might never be used in some projects, like long, funcattrs, methodattrs and numliterals or even metaclass. Also, isinstance is probably motivated by long to int and unicode to str conversions and hence might also be somehow avoided.

    don't know

    Can we also fix these with compat?

    • renames
    • throw
    • types
    • urllib
    • xrange
    • xreadlines

    2to3 and Pylint

    Pylint is a special case since its test suite has a lot of bad and deprecated code which should stay there. However, in order to have a reasonable workflow, something must be done to reduce the 1:30 minutes of 2to3 parsing of the tests. Probably nothing can be gained from the above considerations, since most of these cases are supposed to be in the tests, and actually are. Keep in mind that we can expect to be supporting python2 and python3 in parallel for several years.

    After a quick look, we see that 90% of the refactorings of the test/input files only concern print statements; moreover, most of them have nothing to do with the tested functionality. Hence a solution might be to avoid running 2to3 on the test/input directory, since we already have a mechanism to select, depending on the python version, whether a test file should be tested or not. To some extent, astng is a similar case, but its test suite and the whole project are much smaller.
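
    Leaving test/input out of the conversion could be sketched as follows: collect the files to hand to 2to3 while pruning the skipped directory. The function name is hypothetical and the skipped paths are given relative to the project root.

```python
import os


def files_to_convert(root, skipped=(os.path.join("test", "input"),)):
    """Yield the .py files under `root`, pruning the skipped directories.

    `skipped` holds directory paths relative to `root`.
    """
    skipped = set(os.path.normpath(path) for path in skipped)
    for dirpath, dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, root)
        # Prune in place so that os.walk does not descend into them.
        dirnames[:] = [
            name for name in dirnames
            if os.path.normpath(os.path.join(relative, name)) not in skipped
        ]
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

    The resulting paths can be passed to 2to3 instead of the whole directory.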

  • Why we should do server-side JavaScript

    2010/10/20 by Arthur Lutz

    Description of the talk on the Paris Web 2010 website: here.

    Quentin Adam would like to see more server-side JavaScript. One of its main advantages is that data structures no longer need to be translated between several programming languages.

    One obstacle to this adoption is that JavaScript engines do not provide a DOM (that is the browser's job), so there is no jQuery, MooTools or Dojo (high-level JavaScript). As a result, JavaScript developers will have difficulties coding on the server side. Some libraries are starting to take this limited environment into account.

    When doing server-side JavaScript, requests can be treated like websockets, which is advantageous in terms of performance (for example, when the server receives two identical requests, it sends the same response twice once it is ready).

    Here are a few tools that Quentin Adam recommends or mentions:

    • Ape - Ajax Push Engine - puts JavaScript in an Apache module. On the client side, MooTools is used for development.
    • Node.js, widely adopted by the Ruby community. Node.js appeared when V8 emerged. However, it is not very stable and its documentation is not very complete, but there are many "recipes" on the web.
    • CommonJS is a library that has the advantage of being in the process of standardization.
    • Jaxer is a kind of Firefox embedded in an Apache module, which is a bit too heavy, but its existence deserves a mention.

    At Logilab, for the development of CubicWeb, we lean more towards developing asynchronous mechanisms in Twisted, but this talk has the merit of highlighting that using JavaScript is not only about tweaks in the browser.

  • Paris Web 2010 - Typography special


    Continuation of the first day.

    The next day, I attended La typographie comme outil de design (Typography as a design tool, by David Rault), which seems to me an essential awareness-raiser for any web developer. It was an effective and complete introduction to typeface families (Vox-ATypI classification) and the kinds of effects they produce on the reader. Typography should be seen as the equivalent of intonation in speech: the typeface brings another context to the understanding of the text. To conclude, David Rault went through the best-known "web fonts", taking care to give his expert opinion along with juicy historical details.

    The Paris Web organisers had then wisely scheduled La macrotypographie de la page Web (The macrotypography of the web page, by Anne-Sophie Fradier). After some historical explanations on how much the medium matters for the format, several basic techniques were presented, such as using grids to build pages. Grids give creativity a frame and make it easier to respect visual pauses, restoring an essential reading comfort. Line spacing should be generous (140% of the font size), text should be set flush left and ragged right, and the body text should be large enough to avoid untimely font size changes (which may "break" the layout).

    One interesting but often overlooked topic is respecting the baseline when building the vertical flow of text in a document. It is precisely on this principle, and based on this article, that several people at Logilab started implementing "rhythm rules" in the CubicWeb framework during a sprint last May. One last piece of advice to remember from a typographer: always try to "land on your feet" :-)

    A relevant question was asked at the end of the talk about the trend of "fluid designs", that is, layouts computed entirely in proportions rather than fixed in pixels. The answer cannot be absolute, since it essentially depends on the creativity and originality of the site's author; Anne-Sophie Fradier nevertheless recommends keeping control over the width (the height often being imposed by the browser).


    The use of WOFF, the new features brought by CSS3 and the effects made possible by JavaScript will open up a whole new world for text and its layout. We can hope that reading comfort and text legibility will become real quality criteria. After these talks, it seems obvious to me that typography will gradually establish itself as a new skill of tomorrow's web designer.

  • Paris Web 2010 - Text and the web


    I was lucky enough to attend all the talks given at Paris Web on the role of text and typography in today's web.

    The talk Le texte: parent pauvre du web ? (Text: the poor relation of the web?, by Jean-Marc Hardy) recalled the most relevant points on the use of textual elements compared to images.

    Besides the classic example of search engine indexing tools, which today can only use the raw text of a page (much to the dismay of "Flash developers"), other study results were given:

    • textual advertising links (those of Google, for example) are followed 10 times more often than classic banners, which have a click-through rate of 2‰.
    • heat maps show that headings (especially those shorter than 11 characters) remain very structuring for reading and for picking up information, unlike images, which stay blurry for the brain during the first tenths of a second.
    • using explanatory sentences rather than vague infinitives in form buttons reassures users at crucial registration steps.

    A last, surprising counter-example was given about an online shop which, by trying to highlight the shopping cart with a very colourful image, caused the opposite effect: users rejected it because they thought they were seeing an advertisement ;-)

    Jean-Marc Hardy briefly mentioned the role of text in accessibility but preferred to leave this part to other Paris Web speakers (accessibility being in the spotlight this year).

    I would have liked to hear his opinion on the aesthetics often used for so-called Web 2.0 sites, which end up matching his recommendations quite well.

    On the second day, I particularly enjoyed the topics around typography and the rhythm of pages...
