News from Logilab and our Free Software projects, as well as on topics dear to our hearts (Python, Debian, Linux, the semantic web, scientific computing...)

  • We're now publishing for Ubuntu as well

    2009/01/26 by Arthur Lutz

    We've always been big fans of Debian here at Logilab, so publishing Debian packages for our open source software has always been a priority.

    We're now a bit involved with Ubuntu: we work with it on some client projects, have a few Ubuntu machines lying around, and we like it too. So we've decided to publish our packages for Ubuntu as well as for Debian.

    In version 0.12.1 of logilab-devtools we introduced publishing of Ubuntu packages with lgp (Logilab Packaging) - see ticket. Since then, you can add the following Ubuntu source to your Ubuntu system:

    deb hardy/

    For now, only hardy is up and running; give us a shout if you want something else!

  • Release of CubicWeb 3.0

    2009/01/05 by Nicolas Chauvat

    As some readers of this blog may be aware, Logilab has been developing its own framework since 2001. It evolved over the years, trying to reach its main goal (managing and publishing data with style) and to incorporate the good ideas seen in other Python frameworks Logilab developers had used. Now, companies other than Logilab have started providing services for this framework, and it is stable enough for the core team to be confident in recommending it to third parties willing to build on it without suffering from the Tasmanian devil syndrome.

    CubicWeb version 3.0 was released on the last day of 2008. That's 7 years of research and development and (at least) three rewrites that were needed to get this in shape. Enjoy it at !

  • hgview 0.10.0

    2008/12/30 by Graziella Toutoungis

    I have the pleasure of announcing that hgview 0.10.0 has been posted on this site and is available for download. In this version we added some new features:

    • The ability to order all revisions by date, author or description.
    • Support for localtime.
    • An improved message header when hg mv is used, and a fix for the author base color.
    • Integration of bboissin's fixes.

    Finally, we have taken older versions into account. As pointed out by some users, mercurial version 1.1.x wasn't working very well with hgview, so we created patches which have to be applied according to the version of mercurial you are using.

  • Pyreverse : UML Diagrams for Python

    2008/12/23 by Emile Anclin

    Pyreverse analyses Python code and extracts UML class diagrams and package dependencies. Since September 2008 it has been integrated with Pylint (0.15).


    Pyreverse builds a diagram representation of the source code with:
    • class attributes, if possible with their type
    • class methods
    • inheritance links between classes
    • association links between classes
    • representation of Exceptions and Interfaces

    Generation of UML diagrams with Pyreverse

    The pyreverse command generates the diagrams in any format that graphviz/dot knows, or in VCG.

    The following command shows what dot knows:

    $ dot -Txxx
    Format: "xxx" not recognized. Use one of: canon cmap cmapx cmapx_np dia dot
    eps fig gd gd2 gif hpgl imap imap_np ismap jpe jpeg jpg mif mp pcl pdf pic
    plain plain-ext png ps ps2 svg svgz tk vml vmlz vrml vtx wbmp xdot xlib

    pyreverse creates by default two diagrams:

    $ pyreverse -o png -p Pyreverse pylint/pyreverse/
    creating diagram packages_Pyreverse.png
    creating diagram classes_Pyreverse.png
    • -o : sets the output format
    • -p name : yields the output files packages_name.png and classes_name.png


    One can modify the output with the following options:

    -a N        search depth for ancestors
    -s N        search depth for associated classes
    -A, -S      all ancestors, resp. all associated classes
    -m[yn]      add or remove the module name
    -f MOD      filter the attributes : PUB_ONLY/SPECIAL/OTHER/ALL
    -k          show only the classes (no attributes and methods)
    -b          show 'builtin' objects


    General View of a Module

    pyreverse -ASmy -k -o png pyreverse/ -p Main
    [image : classes_Main.png, class diagram with all dependencies]

    With these options you can get a quick view of the dependencies without being lost in endless lists of methods and attributes.

    Detailed View of a Module

    pyreverse -c PyreverseCommand -a1 -s1 -f ALL -o png  pyreverse/
    [image : PyreverseCommand.png, pyreverse.diagram.ClassDiagram class diagram with one dependency level]

    This shows all methods and attributes of the class (-f ALL). By default, the class diagram option -c uses the options -A, -S and -my, but here we deactivate them (using -a1 and -s1 instead) to get a reasonably small image.

    Configuration File

    You can put some options into the file ".pyreverserc" in your home directory.


    --filter-mode=PUB_ONLY --ignore doc --ignore test
    This will exclude documentation and test files in the doc and test directories. Also, we will see only "public" methods.

  • Javascript date support

    2008/11/27 by Adrien Di Mascio

    Coming from the Python and mx.DateTime world, the JavaScript Date object is not really appealing. For me, the most disturbing things are:

    • The year parameter in the Date constructor is always treated as a 20th-century year if year < 100 (this goes along with the getYear / getFullYear distinction).
    • The inconsistency between month and day indexes: month indexes start at 0 whereas day indexes start at 1.
    • The lack of decent strptime / strftime functions (even basic ones not taking locales into account).

    Recently, I've worked with the great Timeline project, which makes heavy use of dates, and I needed a very basic strptime implementation. This can by no means be considered a comprehensive implementation, but it might help, so here it is:

    /* regexps matching the supported strptime directives */
    var _DATE_FORMAT_REGEXES = {
        'Y': new RegExp('^-?[0-9]+'),
        'd': new RegExp('^[0-9]{1,2}'),
        'm': new RegExp('^[0-9]{1,2}'),
        'H': new RegExp('^[0-9]{1,2}'),
        'M': new RegExp('^[0-9]{1,2}')
    };

    /*
     * _parseDate does the actual parsing job needed by `strptime`
     */
    function _parseDate(datestring, format) {
        var parsed = {};
        for (var i1=0,i2=0;i1<format.length;i1++,i2++) {
            var c1 = format[i1];
            var c2 = datestring[i2];
            if (c1 == '%') {
                c1 = format[++i1];
                var regex = _DATE_FORMAT_REGEXES[c1];
                // unknown directive or no match: give up
                var data = regex ? regex.exec(datestring.substring(i2)) : null;
                if (!data) {
                    return null;
                }
                data = data[0];
                i2 += data.length-1;
                var value = parseInt(data, 10);
                if (isNaN(value)) {
                    return null;
                }
                parsed[c1] = value;
                continue;
            }
            if (c1 != c2) {
                return null;
            }
        }
        return parsed;
    }

    /*
     * basic implementation of strptime. The only recognized formats are
     * those defined in _DATE_FORMAT_REGEXES (i.e. %Y, %d, %m, %H, %M)
     */
    function strptime(datestring, format) {
        var parsed = _parseDate(datestring, format);
        if (!parsed) {
            return null;
        }
        // create initial date (!!! year=0 means 1900 !!!)
        var date = new Date(0, 0, 1, 0, 0);
        date.setFullYear(0); // reset to year 0
        if (parsed.Y) {
            date.setFullYear(parsed.Y);
        }
        if (parsed.m) {
            if (parsed.m < 1 || parsed.m > 12) {
                return null;
            }
            // !!! month indexes start at 0 in javascript !!!
            date.setMonth(parsed.m - 1);
        }
        if (parsed.d) {
            if (parsed.d < 1 || parsed.d > 31) {
                return null;
            }
            date.setDate(parsed.d);
        }
        if (parsed.H) {
            if (parsed.H < 0 || parsed.H > 23) {
                return null;
            }
            date.setHours(parsed.H);
        }
        if (parsed.M) {
            if (parsed.M < 0 || parsed.M > 59) {
                return null;
            }
            date.setMinutes(parsed.M);
        }
        return date;
    }

    // and now monkey patch the Timeline's parser ...
    /* provide our own custom date parser since the default
     * one only understands iso8601 and gregorian dates
     */
    Timeline.NativeDateUnit.getParser = function(format) {
        if (typeof format == "string") {
            if (format.indexOf('%') != -1) {
                return function(datestring) {
                    if (datestring) {
                        return strptime(datestring, format);
                    }
                    return null;
                };
            }
            format = format.toLowerCase();
        }
        if (format == "iso8601" || format == "iso 8601") {
            return Timeline.DateTime.parseIso8601DateTime;
        }
        return Timeline.DateTime.parseGregorianDateTime;
    };
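
    For readers who find the matching loop easier to follow outside the browser, here is a minimal Python 3 sketch of the same directive-by-directive parsing idea. It is a transliteration for illustration only, not part of the Timeline patch; the names `DATE_FORMAT_REGEXES` and `parse_date` are chosen here, and Python itself already ships a full `datetime.strptime`.

```python
import re

# Same five directives as the JavaScript table (names are illustrative).
DATE_FORMAT_REGEXES = {
    'Y': re.compile(r'-?[0-9]+'),
    'd': re.compile(r'[0-9]{1,2}'),
    'm': re.compile(r'[0-9]{1,2}'),
    'H': re.compile(r'[0-9]{1,2}'),
    'M': re.compile(r'[0-9]{1,2}'),
}


def parse_date(datestring, fmt):
    """Return a dict such as {'Y': 2008, 'm': 11}, or None on mismatch."""
    parsed = {}
    i1 = i2 = 0  # i1 walks the format, i2 walks the date string
    while i1 < len(fmt):
        char = fmt[i1]
        if char == '%':
            i1 += 1
            if i1 >= len(fmt):
                return None  # dangling '%' at the end of the format
            regex = DATE_FORMAT_REGEXES.get(fmt[i1])
            if regex is None:
                return None  # unsupported directive
            match = regex.match(datestring, i2)
            if match is None:
                return None
            parsed[fmt[i1]] = int(match.group(0))
            i2 = match.end()
        else:
            # literal characters must match exactly
            if i2 >= len(datestring) or datestring[i2] != char:
                return None
            i2 += 1
        i1 += 1
    return parsed


print(parse_date('2008-11-27 14:05', '%Y-%m-%d %H:%M'))
```

    Like the JavaScript version, this sketch ignores any trailing characters left in the date string once the format is exhausted.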

  • We're open for a chat

    2008/11/25 by Arthur Lutz

    We have a public forum that is accessible via either XMPP (Jabber) or IRC.

    Jabber / XMPP

    Our jabber server is

    If you don't have a jabber account, create one on a server such as (here is a list of free jabber services) or use our web based client.

    Once you have a jabber account, come and join us at xmpp://

    If you do not know what Jabber is, read the Wikipedia page about Jabber.

    IRC / Internet Relay Chat

    Connect to irc:// and join #pylint

    If you do not know what IRC is, read the Wikipedia page about IRC.

  • DBpedia 3.2 released

    2008/11/19 by Nicolas Chauvat

    For those interested in the Semantic Web as much as we are at Logilab, the announcement of the new DBpedia release is very good news. Version 3.2 is extracted from the October 2008 Wikipedia dumps and provides three major improvements: the DBpedia Schema, which is a restricted vocabulary extracted from the Wikipedia infoboxes; RDF links from DBpedia to Freebase, the open-license database providing about a million things from various domains; and cleaner abstracts without the traces of Wikipedia markup that made them difficult to reuse.

    DBpedia can be downloaded, queried with SPARQL or linked to via the Linked Data interface. See the about page for details.
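
    To give an idea of what querying with SPARQL looks like in practice, here is a minimal Python 3 sketch that builds the GET URL for a query. The endpoint address and the `birthPlace` property URI are assumptions used for illustration and may not match the 3.2 release; the helper name `build_sparql_url` is hypothetical. Only the `query` parameter comes from the SPARQL protocol itself.

```python
from urllib.parse import urlencode

# Assumption: address of the public endpoint; check the DBpedia site.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative query; the property and resource URIs are assumptions.
QUERY = """
SELECT ?person WHERE {
    ?person <http://dbpedia.org/ontology/birthPlace>
            <http://dbpedia.org/resource/Paris> .
} LIMIT 10
"""


def build_sparql_url(query, endpoint=DBPEDIA_ENDPOINT):
    """Return a GET URL submitting `query` to a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(QUERY)
print(url)
```

    Fetching that URL (for instance with `urllib.request.urlopen`) should return the result set, assuming the endpoint is reachable.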

    It is important to note that ontologies are usually more of a common language for data exchange, meant for broad re-use, which means that they cannot enforce too many restrictions. Conversely, database schemas are more restrictive and allow for more interesting inferences. For example, a database schema may enforce that the Publisher of a Document is a Person, whereas a more general ontology will have to allow for Publisher to be a Person or a Company.

    DBpedia provides its schema and moves forward by adding a mapping from that schema to actual ontologies like UMBEL, OpenCyc and Yago. This enables DBpedia users to draw inferences from facts fetched from different databases, like DBpedia + Freebase + OpenCyc. Moreover, 'checking' DBpedia's data against ontologies will help detect mistakes or oddities in Wikipedia's pages. For example, if data extracted from Wikipedia's infoboxes states that "Paris was_born_in New_York", reasoning and consistency checking tools will be able to point out that a person may be born in a city, but a city may not, hence the above fact is probably an error and should be reviewed.
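
    The kind of consistency check described above can be sketched in a few lines of Python. This is a toy example: the schema, the type table and the function name `check_fact` are all made up for illustration, and real reasoners work on full ontologies rather than on two dictionaries.

```python
# Hypothetical miniature schema:
# property -> (expected subject type, expected object type)
SCHEMA = {"was_born_in": ("Person", "City")}

# Hypothetical type assertions for a few resources.
TYPES = {
    "Paris": "City",
    "New_York": "City",
    "Victor_Hugo": "Person",
    "Besancon": "City",
}


def check_fact(subject, prop, obj, schema=SCHEMA, types=TYPES):
    """Return a list of problems found; an empty list means consistent."""
    expected_subject, expected_object = schema[prop]
    problems = []
    if types.get(subject) != expected_subject:
        problems.append("%s is a %s, expected a %s"
                        % (subject, types.get(subject), expected_subject))
    if types.get(obj) != expected_object:
        problems.append("%s is a %s, expected a %s"
                        % (obj, types.get(obj), expected_object))
    return problems


print(check_fact("Paris", "was_born_in", "New_York"))
print(check_fact("Victor_Hugo", "was_born_in", "Besancon"))
```

    The first fact is flagged because its subject is a City where a Person is expected, while the second passes.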

    With CubicWeb, one can easily define a schema specific to one's domain, then quickly set up a web application and easily publish the content of its database as RDF for a known ontology. In other words, CubicWeb makes almost no difference between a web application and a database accessible through the web.

  • Using branches in mercurial

    2008/10/14 by Arthur Lutz

    The more we use mercurial to manage our code repositories, the more we enjoy its extended functionalities. Lately we've been playing with and using branches, which turn out to be very useful. We also use hgview instead of the built-in "hg view" command, and its latest release supports branches: you can filter on the branch you want to look at. Update your installation (apt-get upgrade?) to enjoy this new functionality... or download it.

  • A new way of distributing Python code ?


    On distutils-sig, the question of replacing distutils/setuptools is frequently raised, and a lot of effort is being made to find the best way to build and distribute Python code.

    I don't understand why we have such tight coupling between build and distribution (setuptools and pypi, to be more precise), and I'm not convinced by this "global" approach. I hope the Python community will consider changing that and splitting the problem into two distinct projects.

    One of the most successful ideas of Python is its power in extending other languages. And in fact, that's the major problem to solve in the build area. I'm pretty sure it will take a long time to obtain a valuable (and widely adopted) solution, and this is so complicated that the choice of the build chain should remain the responsibility of the upstream maintainers for now (distutils, setuptools, makefile, SCons, ...).

    Concerning the distribution, here are the mandatory features I expect:

    • installing source code managing dependencies with foreign contribution
    • have binary builds without interaction with the primary host system
    • be multi-platform agnostic (Linux, BSD, Windows, Mac, ...)
    • clean upgrade/uninstall
    • kind of sandboxes for testing and development mode
    • no administrator privilege required

    I found the project homepage and was really impressed by the many features already available and the numerous other advantages, like:

    • multiple version installation
    • reuse external distribution effort (integrate deb, rpm, ...)
    • digital signatures
    • basic mirroring solution
    • notification about software updates
    • command line oriented but various GUI exist
    • try to follow standards (XDG specifications on

    I seriously wonder why this project could not be considered as a clean and build-independent Python package index system. Moreover, 0install already has some build capabilities (see 0compile), but the ultimate reason is that it will largely facilitate migrations when a new Python build standard emerges.


    0install looks like a mature project driven by smart people and already included in modern distributions. I'll definitely give it a try soon.

  • Converting Excel files to CSV using OpenOffice.org and pyuno


    The Task

    I recently received a fairly large amount of data from a customer, organized in dozens of xls documents, each having dozens of sheets. I need to process this, and in order to ease the manipulation of the documents, I'd rather use standard text files in CSV (Comma Separated Values) format. Of course I didn't want to spend hours manually converting each sheet of each file to CSV, so I thought this would be a good time to get my hands dirty with pyUno.

    So I skimmed the documentation, found the Calc page on the wiki, read some sample code and got started.

    The easy bit

    The first few lines I wrote were the following (all imports are here, though some were actually added later):

    import logging
    import sys
    import os.path as osp
    import os
    import time
    import uno

    def convert_spreadsheet(filename):
        pass  # written later in this post

    def run():
        for filename in sys.argv[1:]:
            convert_spreadsheet(filename)

    def configure_log():
        logger = logging.getLogger('')
        logger.setLevel(logging.DEBUG)
        handler = logging.StreamHandler(sys.stdout)
        format = "%(asctime)s %(levelname)-7s [%(name)s] %(message)s"
        handler.setFormatter(logging.Formatter(format))
        logger.addHandler(handler)

    if __name__ == '__main__':
        configure_log()
        run()

    That was the easy part. In order to write the convert_spreadsheet function, I needed to open the document. And to do that, I needed to start OpenOffice.org.

    Starting OOo

    I started by copy-pasting some code I found in another project, which expected OpenOffice.org to be already started with the -accept option. I changed that code a bit, so that the function would launch soffice with the correct options if it could not contact an existing instance:

    def _uno_init(_try_start=True):
        """init python-uno bridge infrastructure"""
        try:
            # Get the uno component context from the PyUNO runtime
            local_context = uno.getComponentContext()
            # Get the local Service Manager
            local_service_manager = local_context.ServiceManager
            # Create the UnoUrlResolver on the Python side.
            local_resolver = local_service_manager.createInstanceWithContext(
                "com.sun.star.bridge.UnoUrlResolver", local_context)
            # Connect to the running OpenOffice.org and get its context.
            # XXX make host/port configurable
            context = local_resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
            # Get the ServiceManager object
            service_manager = context.ServiceManager
            # Create the Desktop instance
            desktop = service_manager.createInstance("com.sun.star.frame.Desktop")
            return service_manager, desktop
        except Exception as exc:
            if exc.__class__.__name__.endswith('NoConnectException') and _try_start:
                logging.info('Trying to start UNO server')
                status = os.system('soffice -invisible -accept="socket,host=localhost,port=2002;urp;"')
                # give the server a moment to start listening before retrying
                time.sleep(2)
                logging.info('status = %d', status)
                return _uno_init(False)
            else:
                logging.exception("UNO server not started, you should fix that now. "
                                  "`soffice \"-accept=socket,host=localhost,port=2002;urp;\"` "
                                  "or maybe `unoconv -l` might suffice")
                raise

    Spreadsheet conversion

    Now the easy part (sort of, once you start understanding the OOo API): to load a document, use desktop.loadComponentFromURL(). To get the sheets of a Calc document, use document.getSheets() (that one was easy...). To iterate over the sheets, I used a sample from the SpreadsheetCommon page on the wiki.

    Exporting the CSV was a bit more tricky. The function to use is document.storeToURL(). There are two gotchas, however. The first one is that we need to specify a filter, and to parameterize it correctly. The second one is that the CSV export filter is only able to export the active sheet, so we need to change the active sheet as we iterate over the sheets.

    Parameterizing the export filter

    The parameters are passed in a tuple of PropertyValue uno structures, as the second argument to the storeToURL method. I wrote a helper function which accepts any named arguments and converts them to such a tuple:

    def make_property_array(**kwargs):
        """convert the keyword arguments to a tuple of PropertyValue uno
        structures"""
        array = []
        for name, value in kwargs.items():
            prop = uno.createUnoStruct("com.sun.star.beans.PropertyValue")
            prop.Name = name
            prop.Value = value
            array.append(prop)
        return tuple(array)

    Now, what do we put in that array? The answer is in the FilterOptions page of the wiki: the FilterName property is "Text - txt - csv (StarCalc)". We also need to configure the filter by using the FilterOptions property. This is a string of comma separated values:

    • ASCII code of field separator
    • ASCII code of text delimiter
    • character set, use 0 for "system character set", 76 seems to be UTF-8
    • number of first line (1-based)
    • Cell format codes for the different columns (optional)

    I used the value "59,34,76,1", meaning I wanted semicolons for separators, and double quotes for text delimiters.
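
    Since the first two fields are just ASCII codes, the option string can be derived from the characters themselves rather than typed in by hand. The helper below is a small sketch of that idea; the function name `csv_filter_options` and its parameter names are hypothetical, and the default charset value 76 is simply the one used above.

```python
def csv_filter_options(separator=";", text_delimiter='"',
                       charset=76, first_line=1):
    """Build a FilterOptions string such as "59,34,76,1".

    Fields, in order: ASCII code of the field separator, ASCII code of
    the text delimiter, character set, number of the first line.
    """
    fields = (ord(separator), ord(text_delimiter), charset, first_line)
    return ",".join(str(field) for field in fields)


print(csv_filter_options())
print(csv_filter_options(separator=","))
```

    The first call reproduces the value used in this post; the second shows what a comma-separated export would use under the same assumptions.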

    Here's the code:

    def convert_spreadsheet(filename):
        """load a spreadsheet document, and convert all sheets to
        individual CSV files"""
        logging.info('processing %s', filename)
        url = "file://%s" % osp.abspath(filename)
        export_mask = make_export_mask(url)
        # initialize Uno, get a Desktop object
        service_manager, desktop = _uno_init()
        document = None
        try:
            # load the Document
            document = desktop.loadComponentFromURL(url, "_blank", 0, ())
            controller = document.getCurrentController()
            sheets = document.getSheets()
            logging.info('found %d sheets', sheets.getCount())
            # iterate on all the spreadsheets in the document
            enumeration = sheets.createEnumeration()
            while enumeration.hasMoreElements():
                sheet = enumeration.nextElement()
                name = sheet.getName()
                logging.info('current sheet name is %s', name)
                # the CSV filter only exports the active sheet
                controller.setActiveSheet(sheet)
                outfilename = export_mask % name.replace(' ', '_')
                document.storeToURL(outfilename,
                                    make_property_array(FilterName="Text - txt - csv (StarCalc)",
                                                        FilterOptions="59,34,76,1" ))
        finally:
            # always release the document, even if the export failed
            if document is not None:
                document.close(True)

    def make_export_mask(url):
        """convert the url of the input document to a mask for the written
        CSV file, with a substitution for the sheet name

        >>> make_export_mask('file:///home/foobar/somedoc.xls')
        'file:///home/foobar/somedoc$%s.csv'
        """
        components = url.split('.')
        components[-2] += '$%s'
        components[-1] = 'csv'
        return '.'.join(components)
