show 207 results

Blog entries

  • Pylint at BayPIGgies

    2009/03/30 by Sandrine Ribeau

    I am pleased to announce that Pylint was presented during a Tools night meeting organized by BayPIGgies on thursday march 26th. This meeting has been recorded and you can enjoy the video.

    One point was missing from the presentation and I'll take the opportunity now to mention it. Flymake, an on-the-fly syntax checker for GNU Emacs which has been discussed, does work in combination with Pylint (please see EmacsWiki for more informations).

    photo by ten safe frogs under creative commons

  • new pylint / astng / common releases

    2009/03/25 by Sylvain Thenault

    I'm pleased to announce releases of pylint 0.18, logilab-astng 0.19 and logilab-common 0.39. All these packages should now be cleanly available through easy install.

    Also, happy pylint users will get:

    • fixed python 2.6 support (pylint/astng tested from 2.4 to 2.6)
    • get source code (and so astng) for zip/egg imports
    • some understanding of the property decorator and of unbound methods
    • some false positives fixed and others minor improvments

    See projects home page and ChangeLog for more information:

    Please report any problem / question to the mailing-list.


  • broken easy_install support

    2009/03/25 by Sylvain Thenault

    I recently understood why easy_install wasn't able to find so many of our packages anymore.

    The problem was due to a recent change on our website. The project page was ajaxified, and since easy_install uses some screenscrapping techniques to get distribution archives, it can not find the files it is looking for.

    To fix this, we should make our tarballs downloadable from PyPI, by using

    python register sdist upload

    instead of the current:

    python register

    Uploading our public python software packages to PyPI will make them easy_installable in a breeze !

  • Pylint and Astng support for the _ast module

    2009/03/19 by Emile Anclin

    Supporting _ast and compiler

    Python 2.5 introduces a new module _ast for Abstract Syntax Tree (AST) representation of python code. This module is quite faster than the compiler.ast representation that logilab-astng (and therefore pylint) used until now and the compiler module was removed in Python 3.0.

    Faster is good, but the representations of python code are quite different in _ast and in compiler : some nodes exist in one AST but not the other and almost all child nodes have different names.

    We had to engage in a big refactoring to use the new _ast module, since we wanted to stay compatible with python version < 2.5, which meant keeping the compiler module support. A lot of work was done to find a common representation for the two different trees. In most cases we used _ast-like representations and names, but in some cases we kept ideas or attribute names of compiler.

    Abstract Syntax Trees

    Let's look at an example to compare both representations. Here is a seamingly harmless snippet of code:

    CODE = """
    if cond:
        del delvar
    elif next:

    Now, compare the respective _ast and compiler representations (nodes are in upper case and their attributes are in lower case).

    compiler representation

        node =
            nodes = [
                tests = [
                    name = 'cond'
                    nodes = [
                        flags = 'OP_DELETE'
                        name = 'delvar'
                    name = 'next'
                    nodes = [

    _ast representation

        body = [
            test =
                id = 'cond'
            body = [
                targets = [
                    id = 'delvar'
            orelse = [
                test =
                    id = 'next'
                body = [
                    nl = True

    Can you spot any differences? I would say, they differ quite a lot... For instance, compiler turns a "elif" statements into a list called 'tests' when _ast treats "elif cond:" as if it were "else:if cond:".

    Tree Rebuilding

    We transform these trees by renaming attributes and nodes, or removing or introducing new ones: with compiler, we remove the Stmt node, introduce a Delete node, and recursively build the If nodes coming from an "elif"; and with _ast, we reintroduce the AssName node. This might be only a temporary step towards full _ast like representation.

    This is done by the TreeRebuilder Visitors, one for each representation, which are respectively in astng._nodes_compiler and astng._ast.

    In the simplest case, the TreeRebuilder method looks like this (_nodes_compiler):

    def visit_list(self, node):
        node.elts = node.nodes
        del node.nodes

    (and nothing to do for _ast).

    So, after doing all this and a lot more, we get the following representation from both input trees:

        body = [
            test =
            body = [
                targets = [
            orelse = [
                test =
                body = [
                    dest =
                    values = [
                orelse = [

    Faster towards Py3k

    Of course, you can imagine these modifications had some API repercussions, and thus required a lot of smaller Pylint modifications. But all was done so that you should see no difference in Pylint's behavior using either python <2.5 or python >=2.5, except that with the _ast module pylint is around two times faster!

    Oh, and we fixed small bugs on the way and maybe introduced a few new ones...

    Finally, it is a major step towards Pylint Py3k!

  • hgview 0.10.2 released

    2009/03/13 by Graziella Toutoungis

    I have the pleasure of announcing that the version hgview 0.10.2 was posted on this site and is available for downloading. In this version we added some new functionalities like :

    • Change the search behavior: the button "Next" will move the focus on the next found searched text.
    • Diff works on the merge node.
    • The command --version shows the current version of hgview
    • Fix a bug when the file's name contains space.

  • New version of projman


    Projman is a project management tool. It reads project descriptions and activity logs to schedule tasks and to generate gantt diagrams.

    The version 0.13.6 fix some gantt diagram generation bugs. Graphics library helper is now Cairo instead of matplotlib. Some examples, according to the new model (use of resources type and roles in tasks) have been added to the project.

    And, of course, the compulsory screenshot.

  • Debian Lenny release date - almost there ?

    2009/02/09 by Arthur Lutz

    Being big fans of debian, we are impatiently awaiting the new stable release of the distribution : lenny. Finding it pretty difficult to find information about when they were expecting to release it, I asked a colleague if he knew. He's a debian developer so I though he might have the info. And he did : according to the debian.devel mailing list we should be having the release for the 14th of February 2009. In other words : in 5 days!

    There's a few geeky emails on the release date if you have time to read the threads.

  • LUTIN77: Logilab Unit Test IN fortran 77

    2009/01/28 by Andre Espaze

    We've just released a new project on : lutin77. It's a test framework for Fortran77.

    The goal of this framework is to make unit tests in fortran 77 by having few dependencies: a POSIX environment with C and fortran 77 compilers. Of course, you can use it for making integration or acceptance tests too. The 0.1 version has just been released here:

    If you are new to the unit tests way of building software, I must admit it lacks examples. For an introduction to the techniques involved, you can have a look at Growing Object-Oriented Software, Guided by Tests even if mocked subroutines will be for later. But remember that if you do not like to write tests, you are probably not writing unit tests.

  • Apycot big version change

    2009/01/26 by Arthur Lutz

    The version convention that we use is pretty straight forward and standard : it's composed of 3 numbers separated by dots. What are the rules to incrementing each on of these numbers ?

    • The last number is a incremented when bugs are corrected
    • The middle number is incremented when stories (functionalities) are implemented to the software
    • The first number is incremented when we have a major change of technology

    Well... if you've been paying attention, apycot just turned 1.0.0, the major change of technology is that it is now integrated to CubicWeb (instead of just generating html files). So for a project in your forge, you describe the apycot configuration for it, and the tests for quality assurance are launched on a regular basis. We're still in the process of stabilizing it (latest right now it 1.0.5), but it already runs on the CubicWeb projects, see the screenshot below :

    You should also know that now apycot has two components : the apycotbot which runs the tests and an cubicweb-apycot which displays the results in cubicweb (download cubicweb-apycot-1.0.5.tar.gz and apycotbot-1.0.5.tar.gz).

  • We're now publishing for Ubuntu aswell

    2009/01/26 by Arthur Lutz

    We've always been big fans of debian here at Logilab. So publishing debian packages for our open source software has always been a priority.

    We're now a bit involved with Ubuntu, work with it on some client projects, have a few Ubuntu machines lying around, and we like it too. So we've decided to publish our packages for Ubuntu as well as for debian.

    In the 0.12.1 version of logilab-devtools we introduced publishing of Ubuntu packages with lgp (Logilab Packaging) - see ticket. Since then, you can add the following Ubuntu source to your Ubuntu system

    deb hardy/

    For now, only hardy is up and running, give us a shout if you want something else!

  • Release of CubicWeb 3.0

    2009/01/05 by Nicolas Chauvat

    As some readers of this blog may be aware of, Logilab has been developing its own framework since 2001. It evolved over the years trying to reach the main goal (managing and publishing data with style) and to incorporate the goods ideas seen in other Python frameworks Logilab developers had used. Now, companies other than Logilab have started providing services for this framework and it is stable enough for the core team to be confident in recommending it to third parties willing to build on it without suffering from the tasmanian devil syndrom.

    CubicWeb version 3.0 was released on the last day of 2008. That's 7 years of research and development and (at least) three rewrites that were needed to get this in shape. Enjoy it at !

  • hgview 0.10.0

    2008/12/30 by Graziella Toutoungis

    I have the pleasure of announcing that the version hgview 0.10.0 was posted on this site and is available for downloading. In this version we added some new functionalities like :

    • The possibility to order all revisions by date or author or description.....
    • Support for localtime.
    • Improve the message header when hg mv is used and fix the author base color
    • Integration of bboissin's fixes

    Finally : We have taken into account older versions. As pointed out by some users, mercurial version 1.1.x wasn't working very well with hgview, so we created patches which have to be applied according to the version of mercurial you are using.

  • Pyreverse : UML Diagrams for Python

    2008/12/23 by Emile Anclin

    Pyreverse analyses Python code and extracts UML class diagrams and package depenndencies. Since september 2008 it has been integrated with Pylint (0.15).


    Pyreverse builds a diagram representation of the source code with:
    • class attributes, if possible with their type
    • class methods
    • inheritance links between classes
    • association links between classes
    • representation of Exceptions and Interfaces

    Generation of UML diagrams with Pyreverse

    The command pyreverse generates the diagrams in all formats that graphviz/dot knows, or in VCG :

    The following command shows what dot knows:

    $ dot -Txxx
    Format: "xxx" not recognized. Use one of: canon cmap cmapx cmapx_np dia dot
    eps fig gd gd2 gif hpgl imap imap_np ismap jpe jpeg jpg mif mp pcl pdf pic
    plain plain-ext png ps ps2 svg svgz tk vml vmlz vrml vtx wbmp xdot xlib

    pyreverse creates by default two diagrams:

    $ pyreverse -o png -p Pyreverse pylint/pyreverse/
    creating diagram packages_Pyreverse.png
    creating diagram classes_Pyreverse.png
    • -o : sets the output format
    • -p name : yields the output files packages_name.png and classes_name.png


    One can modify the output with following options:

    -a N, -A    depth of research for ancestors
    -s N, -S    depth of research for associated classes
    -A, -S      all ancestors, resp. all associated
    -m[yn]      add or remove the module name
    -f MOD      filter the attributes : PUB_ONLY/SPECIAL/OTHER/ALL
    -k          show only the classes (no attributes and methods)
    -b          show 'builtin' objects


    General Vue on a Module

    pyreverse -ASmy -k -o png pyreverse/ -p Main
    [image : classes_Main.png, class diagram with all dependencies]

    full size image

    With these options you can have a quick vue of the dependencies without being lost in endless lists of methods and attributes.

    Detailed Vue on a Module

    pyreverse -c PyreverseCommand -a1 -s1 -f ALL -o png  pyreverse/
    [image : PyreverseCommand.png, pyreverse.diagram.ClassDiagram class diagram with one dependency level]

    module in full size image

    Show all methods and attributes of the class (-f ALL). By default, the class diagram option -c uses the options -A, -S, -my, but here we desactivate them to get a reasonably small image.

    Configuration File

    You can put some options into the file ".pyreverserc" in your home directory.


    --filter-mode=PUB_ONLY --ignore doc --ignore test
    This will exclude documentation and test files in the doc and test directories. Also, we will see only "public" methods.

  • Javascript date support

    2008/11/27 by Adrien Di Mascio

    Coming from the python and mx.DateTime world, the javascript Date object is not really appealing. For me, the most disturbing things are :

    • The year parameter in the Date constructor is always considered as a XXe century year if year < 100. (this goes along with the getYear / getFullYear distinction).
    • The inconsistency between months and days indexes : months indexes starts at 0 whereas days indexes starts at 1.
    • The lack of decent strptime / strftime functions (even basic ones not taking locales into account).

    Recently, I've worked with the great Timeline project which makes an heavy use of dates and I had the need for a very basic strptime implementation. This can by no mean be considered as a comprehensive implementation, but it might help so here it is:

        'Y': new RegExp('^-?[0-9]+'),
        'd': new RegExp('^[0-9]{1,2}'),
        'm': new RegExp('^[0-9]{1,2}'),
        'H': new RegExp('^[0-9]{1,2}'),
        'M': new RegExp('^[0-9]{1,2}')
     * _parseData does the actual parsing job needed by `strptime`
    function _parseDate(datestring, format) {
        var parsed = {};
        for (var i1=0,i2=0;i1<format.length;i1++,i2++) {
        var c1 = format[i1];
        var c2 = datestring[i2];
        if (c1 == '%') {
            c1 = format[++i1];
            var data = _DATE_FORMAT_REGXES[c1].exec(datestring.substring(i2));
            if (!data.length) {
                return null;
            data = data[0];
            i2 += data.length-1;
            var value = parseInt(data, 10);
            if (isNaN(value)) {
                return null;
            parsed[c1] = value;
        if (c1 != c2) {
            return null;
        return parsed;
     * basic implementation of strptime. The only recognized formats
     * defined in _DATE_FORMAT_REGEXES (i.e. %Y, %d, %m, %H, %M)
    function strptime(datestring, format) {
        var parsed = _parseDate(datestring, format);
        if (!parsed) {
        return null;
        // create initial date (!!! year=0 means 1900 !!!)
        var date = new Date(0, 0, 1, 0, 0);
        date.setFullYear(0); // reset to year 0
        if (parsed.Y) {
        if (parsed.m) {
        if (parsed.m < 1 || parsed.m > 12) {
            return null;
        // !!! month indexes start at 0 in javascript !!!
        date.setMonth(parsed.m - 1);
        if (parsed.d) {
        if (parsed.m < 1 || parsed.m > 31) {
            return null;
        if (parsed.H) {
        if (parsed.H < 0 || parsed.H > 23) {
            return null;
        if (parsed.M) {
        if (parsed.M < 0 || parsed.M > 59) {
            return null;
        return date;
    // and now monkey patch the Timeline's parser ...
    /* provide our own custom date parser since the default
     * one only understands iso8601 and gregorian dates
    Timeline.NativeDateUnit.getParser = function(format) {
        if (typeof format == "string") {
        if (format.indexOf('%') != -1) {
            return function(datestring) {
                if (datestring) {
                    return strptime(datestring, format);
                return null;
            format = format.toLowerCase();
        if (format == "iso8601" || format == "iso 8601") {
        return Timeline.DateTime.parseIso8601DateTime;
        return Timeline.DateTime.parseGregorianDateTime;

  • We're open for a chat

    2008/11/25 by Arthur Lutz

    We have a public forum that is accessible both using XMPP (jabber) or IRC.

    Jabber / XMPP

    Our jabber server is

    If you don't have a jabber account, create one on a server such as (here is a list of free jabber services) or use our web based client.

    Once you have a jabber account, come and join us at xmpp://

    If you do not know what jabber is, read the wikipedia page about jabber

    IRC / International Relay Chat

    Connect to irc:// and join #pylint

    If you do not know what irc is, read the wikipedia page about irc.

  • DBpedia 3.2 released

    2008/11/19 by Nicolas Chauvat

    For those interested in the Semantic Web as much as we are at Logilab, the announce of the new DBpedia release is very good news. Version 3.2 is extracted from the October 2008 Wikipedia dumps and provides three mayor improvements: the DBpedia Schema which is a restricted vocabulary extracted from the Wikipedia infoboxes ; RDF links from DBpedia to Freebase, the open-license database providing about a million of things from various domains ; cleaner abstracts without the traces of Wikipedia markup that made them difficult to reuse.

    DBpedia can be downloaded, queried with SPARQL or linked to via the Linked Data interface. See the about page for details.

    It is important to note that ontologies are usually more of a common language for data exchange, meant for broad re-use, which means that they can not enforce too many restrictions. On the opposite, database schemas are more restrictive and allow for more interesting inferences. For example, a database schema may enforce that the Publisher of a Document is a Person, whereas a more general ontology will have to allow for Publisher to be a Person or a Company.

    DBpedia provides its schema and moves forward by adding a mapping from that schema to actual ontologies like UMBEL, OpenCyc and Yago. This enables DBpedia users to infer from facts fetched from different databases, like DBpedia + Freebase + OpenCyc. Moreover 'checking' DBpedia's data against ontologies will help detect mistakes or weirdnesses in Wikipedia's pages. For example, if data extracted from Wikipedia's infoboxes states that "Paris was_born_in New_York", reasoning and consistency checking tools will be able to point out that a person may be born in a city, but not a city, hence the above fact is probably an error and should be reviewed.

    With CubicWeb, one can easily define a schema specific to his domain, then quickly set up a web application and easily publish the content of its database as RDF for a known ontology. In other words, CubicWeb makes almost no difference between a web application and a database accessible thru the web.

  • Using branches in mercurial

    2008/10/14 by Arthur Lutz

    The more we use mercurial to manage our code repositories, the more we enjoy its extended functionalities. Lately we've been playing and using branches which end up being very useful. We also use hgview instead of the built-in "hg view" command. And its latest release supports the branches functionality, you can filter out the branch you want to look at. Update your installation (apt-get upgrade ?) to enjoy this new functionality... or download it.

  • A new way of distributing Python code ?


    On distutils-sig, the question of distutils/setuptools replacing is frequently raised and a lot of effort is made to find what would be the best way to build and distribute python code.

    I don't understand the reason why we have a massive coupling between build and distribution (setuptools and pypi to be more precise) and I'm not convinced about this "global" approach. I hope the python community will examine the possibility to change that and split the problem in two distinct projects.

    One of the most successful ideas of Python is its power in extending other languages. And in fact, that's the major problem to solve for the build area. I'm pretty sure it will take a long time before obtaining a valuable (and widely adopted) solution and this is so complicated that the choice of the building chain should be kept under the responsibility of the upstream maintainers for now (distutils, setuptools, makefile, SCons, ...).

    Concerning the distribution, here are the mandatory features I expect:

    • installing source code managing dependencies with foreign contribution
    • have binary builds without interaction with the primary host system
    • be multi-platform agnostic (Linux, BSD, Windows, Mac, ...)
    • clean upgrade/uninstall
    • kind of sandboxes for testing and development mode
    • no administrator privilege required

    I found the project homepage and was really impressed by the tons of functionalities already available and the other numerous advantages, like:

    • multiple version installation
    • reuse external distribution effort (integrate deb, rpm, ...)
    • digital signatures
    • basic mirroring solution
    • notification about software updates
    • command line oriented but various GUI exist
    • try to follow standards (XDG specifications on

    I'm questioning seriously why this project could not be considered as a clean and build-independent python packages index system. Moreover, 0install has already some build capabilities (see 0compile) but the ultimate reason is that it will largely facilitate migrations when a new python build standard will emerge.


    0install looks like a mature project driven by smart people and already included in modern distributions. I'll definitively give it a try soon.

  • Converting excel files to CSV using and pyuno


    The Task

    I recently received from a customer a fairly large amount of data, organized in dozens of xls documents, each having dozens of sheets. I need to process this, and in order to ease the manipulation of the documents, I'd rather use standard text files in CSV (Comma Separated Values) format. Of course I didn't want to spend hours manually converting each sheet of each file to CSV, so I thought this would be a good time to get my hands in pyUno.

    So I gazed over the documentation, found the Calc page on the wiki, read some sample code and got started.

    The easy bit

    The first few lines I wrote were (all imports are here, though some were actually added later).

    import logging
    import sys
    import os.path as osp
    import os
    import time
    import uno
    def convert_spreadsheet(filename):
    def run():
        for filename in sys.argv[1:]:
    def configure_log():
        logger = logging.getLogger('')
        handler = logging.StreamHandler(sys.stdout)
        format = "%(asctime)s %(levelname)-7s [%(name)s] %(message)s"
    if __name__ == '__main__':

    That was the easy part. In order to write the convert_spreadsheet function, I needed to open the document. And to do that, I need to start

    Starting OOo

    I started by copy-pasting some code I found in another project, which expected to be already started with the -accept option. I changed that code a bit, so that the function would launch soffice with the correct options if it could not contact an existing instance:

    def _uno_init(_try_start=True):
        """init python-uno bridge infrastructure"""
            # Get the uno component context from the PyUNO runtime
            local_context = uno.getComponentContext()
            # Get the local Service Manager
            local_service_manager = local_context.ServiceManager
            # Create the UnoUrlResolver on the Python side.
            local_resolver = local_service_manager.createInstanceWithContext(
                "", local_context)
            # Connect to the running and get its context.
            # XXX make host/port configurable
            context = local_resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
            # Get the ServiceManager object
            service_manager = context.ServiceManager
            # Create the Desktop instance
            desktop = service_manager.createInstance("")
            return service_manager, desktop
        except Exception, exc:
            if exc.__class__.__name__.endswith('NoConnectException') and _try_start:
      'Trying to start UNO server')
                status = os.system('soffice -invisible -accept="socket,host=localhost,port=2002;urp;"')
      'status = %d', status)
                return _uno_init(False)
                logging.exception("UNO server not started, you should fix that now. "
                                  "`soffice \"-accept=socket,host=localhost,port=2002;urp;\"` "
                                  "or maybe `unoconv -l` might suffice")

    Spreadsheet conversion

    Now the easy (sort of, once you start understanding the OOo API): to load a document, use desktop.loadComponentFromURL(). To get the sheets of a Calc document, use document.getSheets() (that one was easy...). To iterate over the sheets, I used a sample from the SpreadsheetCommon page on the wiki.

    Exporting the CSV was a bit more tricky. The function to use is document.storeToURL(). There are two gotchas, however. The first one, is that we need to specify a filter, and to parameterize it correctly. The second one is that the CSV export filter is only able to export the active sheet, so we need to change the active sheet as we iterate over the sheets.

    Parametrizing the export filter

    The parameters are passed in a tuple of PropertyValue uno structures, as the second argument to the storeToURL method. I wrote a helper function which accepts any named arguments and convert them to such a tuple:

    def make_property_array(**kwargs):
        """convert the keyword arguments to a tuple of PropertyValue uno
        array = []
        for name, value in kwargs.iteritems():
            prop = uno.createUnoStruct("")
            prop.Name = name
            prop.Value = value
        return tuple(array)

    Now, what do we put in that array? The answer is in the FilterOptions page of the wiki : The FilterName property is "Text - txt - csv (StarCalc)". We also need to configure the filter by using the FilterOptions property. This is a string of comma separated values

    • ASCII code of field separator
    • ASCII code of text delimiter
    • character set, use 0 for "system character set", 76 seems to be UTF-8
    • number of first line (1-based)
    • Cell format codes for the different columns (optional)

    I used the value "59,34,76,1", meaning I wanted semicolons for separators, and double quotes for text delimiters.

    Here's the code:

    def convert_spreadsheet(filename):
        """load a spreadsheet document, and convert all sheets to
        individual CSV files"""'processing %s', filename)
        url = "file://%s" % osp.abspath(filename)
        export_mask = make_export_mask(url)
        # initialize Uno, get a Desktop object
        service_manager, desktop = _uno_init()
            # load the Document
            document = desktop.loadComponentFromURL(url, "_blank", 0, ())
            controller = document.getCurrentController()
            sheets = document.getSheets()
  'found %d sheets', sheets.getCount())
            # iterate on all the spreadsheets in the document
            enumeration = sheets.createEnumeration()
            while enumeration.hasMoreElements():
                sheet = enumeration.nextElement()
                name = sheet.getName()
      'current sheet name is %s', name)
                outfilename = export_mask % name.replace(' ', '_')
                                    make_property_array(FilterName="Text - txt - csv (StarCalc)",
                                                        FilterOptions="59,34,76,1" ))
    def make_export_mask(url):
        """convert the url of the input document to a mask for the written
        CSV file, with a substitution for the sheet name
        >>> make_export_mask('file:///home/foobar/somedoc.xls')
        components = url.split('.')
        components[-2] += '$%s'
        components[-1] = 'csv'
        return '.'.join(components)

  • qgpibplotter is (hopefully) working

    2008/09/04 by David Douard

    My latest personal project, pygpibtoolkit, holds a simple HPGL plotter trying to emulate the HP7470A GPIB plotter, using the very nice and cheap Prologix USB-GPIB dongle. This tool is (for now) called qgpibplotter (since it is using the Qt4 toolkit).

    Tonight, I took (at last) the time to make it work nicely. Well, nicely with the only device I own which is capable of plotting on the GPIB bus, my HP3562A DSA.

    Now, you just have to press the "Plot" button of your test equipment, and bingo! you can see the plot on your computer.

  • gajim, dbus and wmii

    2008/09/02 by Adrien Di Mascio

    I've been using for a long time a custom version of gajim in order to make it interact with wmii. More precisely, I have, in my wmii status bar, a dedicated log zone where I print notification messages such as new incoming emails or text received from gajim (with different colors if special words were cited, etc.).

    I recently decided to throw away my custom gajim and use python and dbus to achieve the same goal in a cleaner way. A very basic version can be found in the simpled project. As of now, the only way to get the code is trhough mercurial:

    hg clone

    The source file is named In this file, you'll also find a version sending messages to Ion's status bar.

  • Command-line graphical user interfaces

    2008/09/01 by Nicolas Chauvat

    Graphical user interfaces help command discovery, while command-line interfaces help command efficiency. This article tries to explain why. I reached it when reading the list of references from the introduction to Ubiquity, which is the best extension to firefox I have seen so far. I expect to start writing Ubiquity commands soon, since I have already been using extensively the 'keyword shorcut' functionnality of firefox's bookmarks and we have already done work in the area of 'language interaction', as they call it at Mozilla Labs, when working with Narval. Our Logilab Simple Desktop project, aka simpled, also goes in the same direction since it tries to unify different applications into a coherent work environment by defining basic commands and shorcuts that can be applied everywhere and accessing the rest of the functionnalities via a command-line interface.

  • Is the Openmoko freerunner a computer or a phone ?

    2008/08/27 by Nicolas Chauvat

    The Openmoko Freerunner is a computer with embedded GSM, accelerometer and GPS. I got mine last week after waiting for a month for the batch to get from Taiwan to the french company I bought it from. The first thing I had to admit was that some time will pass before it gets confortable to use it as a phone. The current version of the system has many weird things in its user interface and the phone works, but the other end of the call suffers a very unpleasant echo.

    I will try to install Debian, Qtopia and Om2008.8 to compare them. I also want to quickly get Python scripts to run on it and get back to Narval hacking. I had an agent running on a bulky Palm+GPS+radionetwork back in 1999 and I look forward to run on this device the same kind of funny things I was doing in AI research ten years ago.

  • simpled - Simple Desktop project started !

    2008/08/11 by Nicolas Chauvat

    I bought last week a new laptop computer that can drive a 24" LCD monitor, which means I do not need my desktop computer any more. In the process of setting up that new laptop, I did what I have been wanting to do for years without finding the time: spending time on my ion3 config to make it more generic and create a small python setup utility that can regenerate it from a template file and a keyboard layout.

    The simpled project was born!

    If you take a look at the list of pending tickets, you will guess that I am using a limited number of pieces of software during my work day and tried to configure them so that they share common action/shortcuts. This is what simpled is about: given a keyboard layout generate the config files for the common tools so that action/shortcuts are always on the same key.

    I use ion3, xterm+bash, emacs, mutt, firefox, gajim. Common actions are: open, save, close, move up/down/left/right, new frame or tab, close frame or tab, move to previous or next tab, etc.

    I will give news in this blog from time to time and announce it on mailing lists when version 0.1 will be out. If you want to give it a try, get the code from the mercurial repository.

  • Simile-Widgets

    2008/08/07 by Nicolas Chauvat

    While working on knowledge management and semantic web technologies, I came across the Simile project at MIT a few years back. I even had a demo of the Exhibit widget fetching then displaying data from our semantic web application framework back in 2006 at the Web2 track of Solutions Linux in Paris.

    Now that we are using these widgets when implementing web apps for clients, I was happy to see that the projects got a life of their own outside of MIT and became full-fledged free-software projects hosted on Google Code. See Simile-Widgets for more details and expect us to provide a debian package soon unless someone does it first.

    Speaking of Debian, here is a nice demo a the Timeline widget presenting the Debian history.

  • SciPy and TimeSeries

    2008/08/04 by Nicolas Chauvat

    We have been using many different tools for doing statistical analysis with Python, including R, SciPy, specific C++ code, etc. It looks like the growing audience of SciPy is now in movement to have dedicated modules in SciPy (lets call them SciKits). See this thread in SciPy-user mailing-list.

  • Google Custom Search Engine, for Python


    A Google custom search engine for Python has been made available by Gerard Flanagan, indexing:

    Using refinements

    To refine the search to any of the individual sites, you can specify a refinement using the following labels: stdlib, wiki, pypi, thehazeltree

    So, to just search the python wiki, you would enter:

    somesearchterm more:wiki

    and similarly:

    somesearchterm more:stdlib somesearchterm more:pypi somesearchterm more:thehazeltree


    The Hazel Tree is a collection of popular Python texts that I have converted to reStructuredText and put together using Sphinx. It's in a publishable state, but not as polished as I'd like, and since I'll be mostly offline for the next month it will have to remain as it is for the present. However, the search engine is ready now and the clock is ticking on its subscription (one year, renewal depending on success of site), so if it's useful to anyone, it's all yours (and if you use it on your own site a link back to would be appreciated).

  • Python for applied Mathematics

    2008/07/29 by Nicolas Chauvat

    The presentation of Python as a tool for applied mathematics got highlighted at the 2008 annual meeting of the american Society for Industrial and Applied Mathematics (SIAM). For more information, read this blogpost and the slides.

  • ion, dock and screen configuration

    2008/07/04 by David Douard

    I have a laptop I use at work (with a docking station), in the train and at home (with an external display), on which my environment is ion3.

    As I use suspend-to-RAM all the time, I have added some keybindings to automatically reconfigure my screen when I plug/unplug an external display (on the dock as well as direct VGA connection).

    The lua code to paste in your .ion3/cfg_ion.lua for the bindings looks like:

    function autoscreen_on()
            local f = io.popen('/home/david/bin/autoscreen -c', 'r')
          if not f then
          local s = f:read('*a')
    function autoscreen_off()
            local f = io.popen('/home/david/bin/autoscreen -d', 'r')
          if not f then
          local s = f:read('*a')
    defbindings("WMPlex.toplevel", {
        bdoc("Turn on any external display and tell ion to reconfigure itself"),
    defbindings("WMPlex.toplevel", {
        bdoc("Turn off any external display and tell ion to reconfigure itself"),

    It makes use of the following python script (named /home/david/bin/autoscreen in the lua code above):

    #!/usr/bin/env python
    import sys
    import os
    import re
    from subprocess import Popen, PIPE
    import optparse
    parser = optparse.OptionParser("A simple automatic screen configurator (using xrandr)")
    parser.add_option('-c', '--connect', action="store_true",
                      help="configure every connected screens")
    parser.add_option('-d', '--disconnect', action="store_true",
                      help="unconfigure every connected screens other than LVDS (laptop screen)")
    parser.add_option('', '--main-display',
                      help="main display identifier (typically, the laptop LCD screen; defaults to LVDS)")
    options, args = parser.parse_args()
    if int(options.connect) + int(options.disconnect) > 1:
        print "ERROR: only one option -c or -d at a time"
    xrandr = Popen("xrandr", shell=True, bufsize=0, stdout=PIPE)
    connected = re.findall(r'([a-zA-Z0-9-]*) connected', xrandr)
    connected = [c for c in connected if c != options.maindisplay]
    cmd = "xrandr --output %s %s"
    if options.connect or options.disconnect:
        for c in connected:
            if options.connect:
                action = "--auto"
            elif options.disconnect:
                action = "--off"
            p = Popen(cmd % (c, action), shell=True)
            sts = os.waitpid(, 0)

  • We're going to Europython'08

    2008/07/02 by Arthur Lutz


    We've decided to go to Europython this year. We're obviously going to give a talk about the exciting things we're doing with LAX and GoogleAppEngine. We're on wednesday at midday in the alfa room, check out the schedule here. Since we think it's important that these events take place, we're also chipping in and sponsoring the event.

    We hope to see you there. Drop us a note if you want to meet up.

  • Munin Plugins for Zope

    2008/07/01 by Arthur Lutz

    Here at Logilab we find Munin pretty useful. We monitor a lots of machines and a lot of services with it, and it usually gives us pretty useful indicators over time that guide us through to optimizations.

    One of the reasons we adopted this technology is it's modular approach with the plugin architecture. And when we realized we could write plugins in python, we knew we'd like it. After years of using it, we're now actually writing plugins for it. Optimizing zope and zeo servers is not an easy task so we're developping plugins to be able to see the difference between before and after changing things.

    You check out the project here, and download it from the ftp.

  • apycot 0.12.1 released

    2008/06/24 by Arthur Lutz

    After one month of internship at logilab, I'm pleased to announce the 0.12.1 release of apycot.

    for more information read the apycot 0.12.1 release note

    You can also check the new sample configuration.

    Pierre-Yves David

  • Instrumentation of google appengine's datastore.

    2008/06/23 by Sylvain Thenault

    Here is a piece of code I've written which I thought may be useful to some other people...

    You'll find here a simple python module to use with the Google AppEngine SDK to monkey patch the datastore API in order to get an idea of the calls performed by your application.

    To instrument of the datastore, put at the top level of your handler file

    import instrdatastore

    Note that it's important to have this before any other import in your application or in the google package to avoid that some modules will use the unpatched version of datastore functions (and hence calls to those functions wouldn't be considered).

    Then add at the end of your handler function


    The handler file should look like this:

    """my handler file with datastore instrumenting activated"""
    import instrdatastore
    # ... other initialization code
    # main function so this handler module is cached
    def main():
      from wsgiref.handlers import CGIHandler
      from ginco.wsgi.handler import ErudiWSGIApplication
      application = ErudiWSGIApplication(config, vreg=vreg)
    if __name__ == "__main__":

    Now you should see in your logs the number of Get/Put/Delete/Query which has been done during request processing

    2008-06-23 06:59:12 - (root) WARNING: datastore access information
    2008-06-23 06:59:12 - (root) WARNING: nb Get: 2
    2008-06-23 06:59:12 - (root) WARNING: arguments (args, kwargs):
    ((datastore_types.Key.from_path('EGroup', u'key_users', _app=u'winecellar'),), {})
    ((datastore_types.Key.from_path('EUser', u'', _app=u'winecellar'),), {})
    2008-06-23 06:59:12 - (root) WARNING: nb Query: 1
    2008-06-23 06:59:12 - (root) WARNING: arguments (args, kwargs):
    (({'for_user =': None}, 'EProperty'), {})
    2008-06-23 06:59:58 - (root) WARNING: nb Put: 1
    2008-06-23 06:59:58 - (root) WARNING: arguments (args, kwargs):
    (({u'login': None, u'last_usage_time': 1214204398.2022741, u'data': ""},), {})

    I'll probably extend this as the time goes. Also notice you may encounter some problems with the automatic reloading feature of the dev app server when instrumentation is activated, in which case you should simply restart the web server.

  • First version of LAX Book

    2008/06/16 by Arthur Lutz

    Previous documentation was merged into a LAX Book now featuring step-by-step screenshots to get up and running faster.

    Don't we all like screenshots...

    Update: LAX is now included in the CubicWeb semantic web framework.

  • Implementing scalable applications with AppEngine

    2008/06/11 by Nicolas Chauvat

    At Google IO, a large part of the Tools track was dedicated to AppEngine. Brett Slatkin gave a talk titled Building scalable Web Applications with Google AppEngine which focused on optimizing the server part of web apps. As other presenters demonstrated it, like Steve Souders in his talk Even Faster Websites, optimizing the browser part of webapps is not to be neglected either.

    Webscale applications require man-made optimisation

    First of all, I must confess I am used to repeat that "early optimisation is the root of all evil" and "delay commitment until the last responsible time". But reading about AppEngine and listening to the Google IO talks, it appears that the tools we have today ask for human intervention to reach web-scale performance, even when "we" stands for "Google".

    In order for web-scale applications to handle the kind of load they are facing, they must be designed and implemented carefully. As carefully as any application was designed before the exponential growth of PC computation power let us move away from low-level implementation details and made some inefficiencies acceptable as long as the time spent developing was short enough.

    It all depends on the parameters of your cost function, but for web-scale applications, it seems like we have not enough computer-time and can not trade it for human-time.

    Writes are more expensive than reads

    To get a better idea of the work constraints, one should know that a disk seek is about 10ms, which means there will be a maximum of 100 accesses per second. On the other hand, if we need consistent data as opposed to transactional data (the latter implying that data is fetched each time it is asked for), data can be read from disk once then cached. Following reads are done from memory at a rate of about 4GB/sec, which means 4000 accesses per second if entities are around 1MB in size. Result of this back of the envelope approximation is 40 reads equals one write.

    It follows that, although the actual time depends on the size and shape of data, writes are very expensive compared to reads and both are better done in batches to optimise disk access.

    Entity groups in AppEngine

    The AppEngine Datastore was designed with this constraints in mind. Entities are sets of property name/value pairs. Each entity may have a parent. An entity without a parent is the root of a hierarchy called an entity group.

    Entities of the same group are stored on disk close to each other, but two distinct entity groups may be stored on different computers. Read access to entities of the same group is thus faster than read access to entities of different groups.

    Write access is serialized per entity group. As opposed to a traditionnal RDBMS that provides row locking, the datastore only provides entity group locking. Writes to the a single entity group will always happen in sequence, even though changes concern different entities.

    There is no limit to the number of entity groups or to the number of entities per group, but because of the locking strategy, large entity groups will cause high contention and a lot of failed transactions. Since writes are expensive, not thinking about write throughput is a very bad idea when designing an AppEngine application if one want it to scale.

    On the other hand, the parallel nature of the datastore make it scale wide and there is no limit to the number of entity groups that can be written to in parallel, nor to the number of reads that can be done in parallel.

    To understand this design in details, you will have to read about GFS, BigTable and other technologies developed by Google to implement large-scale clustering.

    Example of counters

    Counters are a good example to address when discussing write throughput, because the datastore locking strategy makes writing to global data very expensive.

    Let us assume that we want to display on the main page of a wiki application the total number of comments posted.

    A global counter would serialize all its updates. If 100 users were to add comments at the same time, some of them would have to wait several seconds for their action to complete: one write for the comment, one write for the counter, at most 100 writes per second for the counter and a lot of time lost due to failed transaction that need to be restarted.

    The solution to make the counter scale is to partition it among all entity groups then sum these partial counters when the global value is needed.

    Since chances are low that a given user will write more than one comment at a time, comment entities for a user can be grouped together and a partial counter can be added to the same entity group. Creating a new comment and increasing the partial counter will be done in the same batch.

    When a new request for the main page is received, the counter total is looked up in the cache. If it is not found, all partial counters are fetched and summed up, then the cache is refreshed with a short timeout, for example one minute.

    During the next minute, the counter will be "consistent", read no too far-off, and served extremely fast from the cache.

    Prevent repeated or unneeded work

    To sum things up, when implementing applications on top of AppEngine with web-scale usage as a goal, everything that can be done to save time should be considered. Including the following:

    • importing python modules as late as possible will minimize the python runtime overhead
    • retrieving data that is not going to be used is a waste
    • repeated queries and queries returning large result sets must be avoided
    • when Get() if sufficient, do not spend time on Query()
    • landing pages are traffic intensive and would better use the same query for everyone
    • entity groups have to be designed to match the load and aim at low write contention
    • caching must be used aggressively (it is no surprise that memcache was the first improvement that followed within a month of the AppEngine public release)


    As a conclusion, the interface AppEngine is exhibiting today requires to optimize early, but I would bet that in the years to come, new languages and domain-specific compilers or database engines will take part of that burden off the hands of the developers.

    Did not Yahoo and Google start developping PigLatin and Sawzall to make it easier to write parallel data-processing programs ? The same could happen with describe a data-model in a high-level language and get a tool to optimize it for write contention and web-scale application.

    See Also

    LAX (Logilab App engine eXtension) is a full-featured web application framework running on Google AppEngine developed by Logilab.

  • Google App Engine future directions

    2008/06/09 by Nicolas Chauvat

    Several of us went to San Francisco last week to attend Google IO. As usual with conferences, meeting people was more interesting than listening to most talks. The AppEngine Fireside Chat was a Q&A session that lasted about an hour. Here is what I learned from this session and various chats with AppEngineers.

    1. Google has decided to provide its scalable datastore architecture as a service. At this point, the datastore is the product and the goal it to make it as widely accessible as possible.
    2. The google.appengine.api.datastore API alone would not have made for a very sexy launch. In order to attract more people and lower the bar the beginners would have to jump over, they looked for a higher level programming interface.
    3. Since some people working at Google have been using Django and know it, they reimplemented part of its interface for defining data models. Late in the project, they added GQL because Django-like queries were a bit too difficult. In both case, the goal was to make it easier for external developers to get started.
    4. But Google is not in the business of providing web application frameworks and AppEngineers made explicit that they would not be officially supporting a specific framework or a specific version of a given framework (not even Django 0.96, although there is a django-appengine-helper project on They expect frameworks to be provided by communities of developers.

    My conclusion is twofold:

    • They will be focusing on supporting other languages in AppEngine (I would bet on Java being the next one available) rather than extending Python frameworks support.
    • Anyone is free to join with his own framework and provide support for it, the One True Interface being the one defined by google.appengine.api.datastore, not the one defined by db.model and GQL.

    This is why Logilab published its own framework running on App Engine as free software and is providing support for it: Logilab Appengine eXtension.

  • LAX - Logilab Appengine eXtension is a full-featured web application framework running on AppEngine

    2008/06/09 by Arthur Lutz

    LAX version 0.3.0 was released today, see

    Get a new application running in ten minutes with the install guide and the tutorial:


    Update: LAX is now included in the CubicWeb semantic web framework.

  • Browsers strangeness ...

    2008/06/07 by Adrien Di Mascio

    ... or when inverting two lines of code in your HTML's HEAD can speed up your web page rendering !

    If you have the following HTML page:

        <link rel="stylesheet" type="text/css" href="" />
        <script type="text/javascript">
          var somearray = [1, 2, 3];
        <link rel="stylesheet" type="text/css" href="" />

    Firefox3 [1] will download the CSS sequentially, hence if both CSS get 250ms to download, this page will approximatively appear in more or less half a second.

    Now, if you just move the inline script before the two CSS declarations:

        <script type="text/javascript">
          var somearray = [1, 2, 3];
        <link rel="stylesheet" type="text/css" href="" />
        <link rel="stylesheet" type="text/css" href="" />

    The two CSS files are now downloaded in parallel, and your page now take about half time to render !

    One of the lessons here is that optimizing your website's backend is great and necessary, but is a quite long term and hard job. On the other hand, optimizing the frontend is often easier and pays off immediatly (well, so to speak...). Don't forget that in complex and rich web sites, most of the time can be spent on the client side.

    [1] It seems that Firefox 2 doesn't event try to download CSS in parallel.

    Going further

    Of course, this is quite browser-dependant ! It would be simpler if all browsers behaved the same way but fortunately, there is a very nice tool named cuzillion developed by Steve Souders at Google (formerly Chief performance at Yahoo and developer of Yslow, a firebug's extension which is able to point out performance problems of your site). This tool lets you create web pages online by inserting inline scripts, CSS, images, etc. and then test how long the page takes to be rendered in your browser. You can control the order of the inserted elements as well as customize their properties (how long it shoud take to download, choose another domain to download, if a script is defined with a script tag, an XHR, an iframe, etc.)

  • New apycot release

    2008/06/02 by Arthur Lutz

    After almost 2 years of inactivity, here is a new release of apycot the "Automated Pythonic Code Tester". We use it everyday to maintain our software quality, and we hope this tool can help you as well.

    Admittedly it's not trivial to setup, but once it's running you'll be able to count on it. We're working on getting it to work "out-of-the-box"...

    Here's what's in the ChangeLog :

    2008-05-19 -- 0.11.0
    • updated documentation
    • new pylintrc option for the pyhton_lint checker.
    • Added code to disabled checker with missing required option with the proper ERROR statut
    • removed the catalog option of the xml_valid checker this feature can now be handle with the XML_CATALOG_FILE environement variable (see libxml2 doc for details)
    • moved xml tool from python-xml to lxml
    • new 'hourly' mode for running tests
    • new 'test_activity_report' report
    • pylint checker support new disable_msg and show_categories options (show_categories default to Error and Fatal categories to avoid reports polution)
    • activity option "days" has been renamed to "time" and correspond to a number of day in daily mode but to a number of hour in hourly mode
    • fixed debian_lint and debian_piuparts to actually do something...
    • fixed docutils checker for recent docutils versions
    • dropped python 2.2/2.3 compat (to run apycot itself)
    • added output redirectors to the debian preprocessor to avoid parsing errors
    • can use regular expressions in <pp>_match_* options

  • Flying to Google I/O

    2008/05/27 by Arthur Lutz

    Three of us from Logilab are going to San Francisco to listen, share and discuss at Google I/O.

    It's a two day developer gathering in San Francisco, with various talks about google technologies :

    We're hoping to show and talk about LAX ( which uses Google AppEngine

show 207 results