show 314 results

Blog entries

  • Javascript date support

    2008/11/27 by Adrien Di Mascio

    Coming from the python and mx.DateTime world, the javascript Date object is not really appealing. For me, the most disturbing things are :

    • The year parameter in the Date constructor is always considered as a XXe century year if year < 100. (this goes along with the getYear / getFullYear distinction).
    • The inconsistency between months and days indexes : months indexes starts at 0 whereas days indexes starts at 1.
    • The lack of decent strptime / strftime functions (even basic ones not taking locales into account).

    Recently, I've worked with the great Timeline project which makes an heavy use of dates and I had the need for a very basic strptime implementation. This can by no mean be considered as a comprehensive implementation, but it might help so here it is:

        'Y': new RegExp('^-?[0-9]+'),
        'd': new RegExp('^[0-9]{1,2}'),
        'm': new RegExp('^[0-9]{1,2}'),
        'H': new RegExp('^[0-9]{1,2}'),
        'M': new RegExp('^[0-9]{1,2}')
     * _parseData does the actual parsing job needed by `strptime`
    function _parseDate(datestring, format) {
        var parsed = {};
        for (var i1=0,i2=0;i1<format.length;i1++,i2++) {
        var c1 = format[i1];
        var c2 = datestring[i2];
        if (c1 == '%') {
            c1 = format[++i1];
            var data = _DATE_FORMAT_REGXES[c1].exec(datestring.substring(i2));
            if (!data.length) {
                return null;
            data = data[0];
            i2 += data.length-1;
            var value = parseInt(data, 10);
            if (isNaN(value)) {
                return null;
            parsed[c1] = value;
        if (c1 != c2) {
            return null;
        return parsed;
     * basic implementation of strptime. The only recognized formats
     * defined in _DATE_FORMAT_REGEXES (i.e. %Y, %d, %m, %H, %M)
    function strptime(datestring, format) {
        var parsed = _parseDate(datestring, format);
        if (!parsed) {
        return null;
        // create initial date (!!! year=0 means 1900 !!!)
        var date = new Date(0, 0, 1, 0, 0);
        date.setFullYear(0); // reset to year 0
        if (parsed.Y) {
        if (parsed.m) {
        if (parsed.m < 1 || parsed.m > 12) {
            return null;
        // !!! month indexes start at 0 in javascript !!!
        date.setMonth(parsed.m - 1);
        if (parsed.d) {
        if (parsed.m < 1 || parsed.m > 31) {
            return null;
        if (parsed.H) {
        if (parsed.H < 0 || parsed.H > 23) {
            return null;
        if (parsed.M) {
        if (parsed.M < 0 || parsed.M > 59) {
            return null;
        return date;
    // and now monkey patch the Timeline's parser ...
    /* provide our own custom date parser since the default
     * one only understands iso8601 and gregorian dates
    Timeline.NativeDateUnit.getParser = function(format) {
        if (typeof format == "string") {
        if (format.indexOf('%') != -1) {
            return function(datestring) {
                if (datestring) {
                    return strptime(datestring, format);
                return null;
            format = format.toLowerCase();
        if (format == "iso8601" || format == "iso 8601") {
        return Timeline.DateTime.parseIso8601DateTime;
        return Timeline.DateTime.parseGregorianDateTime;

  • We're open for a chat

    2008/11/25 by Arthur Lutz

    We have a public forum that is accessible both using XMPP (jabber) or IRC.

    Jabber / XMPP

    Our jabber server is

    If you don't have a jabber account, create one on a server such as (here is a list of free jabber services) or use our web based client.

    Once you have a jabber account, come and join us at xmpp://

    If you do not know what jabber is, read the wikipedia page about jabber

    IRC / International Relay Chat

    Connect to irc:// and join #pylint

    If you do not know what irc is, read the wikipedia page about irc.

  • DBpedia 3.2 released

    2008/11/19 by Nicolas Chauvat

    For those interested in the Semantic Web as much as we are at Logilab, the announce of the new DBpedia release is very good news. Version 3.2 is extracted from the October 2008 Wikipedia dumps and provides three mayor improvements: the DBpedia Schema which is a restricted vocabulary extracted from the Wikipedia infoboxes ; RDF links from DBpedia to Freebase, the open-license database providing about a million of things from various domains ; cleaner abstracts without the traces of Wikipedia markup that made them difficult to reuse.

    DBpedia can be downloaded, queried with SPARQL or linked to via the Linked Data interface. See the about page for details.

    It is important to note that ontologies are usually more of a common language for data exchange, meant for broad re-use, which means that they can not enforce too many restrictions. On the opposite, database schemas are more restrictive and allow for more interesting inferences. For example, a database schema may enforce that the Publisher of a Document is a Person, whereas a more general ontology will have to allow for Publisher to be a Person or a Company.

    DBpedia provides its schema and moves forward by adding a mapping from that schema to actual ontologies like UMBEL, OpenCyc and Yago. This enables DBpedia users to infer from facts fetched from different databases, like DBpedia + Freebase + OpenCyc. Moreover 'checking' DBpedia's data against ontologies will help detect mistakes or weirdnesses in Wikipedia's pages. For example, if data extracted from Wikipedia's infoboxes states that "Paris was_born_in New_York", reasoning and consistency checking tools will be able to point out that a person may be born in a city, but not a city, hence the above fact is probably an error and should be reviewed.

    With CubicWeb, one can easily define a schema specific to his domain, then quickly set up a web application and easily publish the content of its database as RDF for a known ontology. In other words, CubicWeb makes almost no difference between a web application and a database accessible thru the web.

  • Semantic Roundup - infos glanées au NextMediaCamp

    2008/11/12 by Arthur Lutz

    Par manque de temps voici les infos en brut glanées jeudi soir dernier au NextMediaBarCamp :

    Un BarCamp c'est assez rigolo, un peu trop de jeune cadres dynamiques en cravate à mon goût, mais bon. Parmi les trois mini-conférences auxquelles j'ai participé, il y avait une sur le web sémantique. Animée en partie par Fabrice Epelboin qui écrit pour la version française de ReadWriteWeb, j'ai appris des choses. Pour résumer ce que j'y ai compris : le web sémantique il y a deux approches :

    1. soit on pompe le contenu existant et on le transforme en contenu lisible par les machines avec des algorithmes de la mort, comme opencalais le fait (top-down)
    2. soit on écrit nous même en web sémantique avec des microformats ensuite les machines liront de plus en plus le web (bottom-up)

    Dans le deuxième cas la difficulté est de faciliter la tache des rédacteurs du web pour qu'ils puissent facilement publier du contenu en web sémantique. Pour cela ces outils sont mentionnés : Zemanta, Glue (de la société AdaptiveBlue).

    Tout ça m'a fait penser au fait que si CubicWeb publie déjà en microformats, il lui manque une interface d'édition à la hauteur des enjeux. Par exemple lorsque l'on tape un article et que l'application a les coordonnées d'une personne metionnée, il fait automagiquement une relation à cet élément. A creuser...

    Sinon sur les autres confs et le futur des médias, selon les personnes présentes, c'est assez glauque : des medias publicitaires, custom-made pour le bonheur de l'individu, où l'on met des sous dans les agrégateurs de contenu plutôt sur des journalistes de terrain. Pour ceux que ça intéresse, j'ai aussi découvert lors de ce BarCamp un petit film "rigolo" qui traite ce sujet préoccupant.

  • Using branches in mercurial

    2008/10/14 by Arthur Lutz

    The more we use mercurial to manage our code repositories, the more we enjoy its extended functionalities. Lately we've been playing and using branches which end up being very useful. We also use hgview instead of the built-in "hg view" command. And its latest release supports the branches functionality, you can filter out the branch you want to look at. Update your installation (apt-get upgrade ?) to enjoy this new functionality... or download it.

  • A new way of distributing Python code ?


    On distutils-sig, the question of distutils/setuptools replacing is frequently raised and a lot of effort is made to find what would be the best way to build and distribute python code.

    I don't understand the reason why we have a massive coupling between build and distribution (setuptools and pypi to be more precise) and I'm not convinced about this "global" approach. I hope the python community will examine the possibility to change that and split the problem in two distinct projects.

    One of the most successful ideas of Python is its power in extending other languages. And in fact, that's the major problem to solve for the build area. I'm pretty sure it will take a long time before obtaining a valuable (and widely adopted) solution and this is so complicated that the choice of the building chain should be kept under the responsibility of the upstream maintainers for now (distutils, setuptools, makefile, SCons, ...).

    Concerning the distribution, here are the mandatory features I expect:

    • installing source code managing dependencies with foreign contribution
    • have binary builds without interaction with the primary host system
    • be multi-platform agnostic (Linux, BSD, Windows, Mac, ...)
    • clean upgrade/uninstall
    • kind of sandboxes for testing and development mode
    • no administrator privilege required

    I found the project homepage and was really impressed by the tons of functionalities already available and the other numerous advantages, like:

    • multiple version installation
    • reuse external distribution effort (integrate deb, rpm, ...)
    • digital signatures
    • basic mirroring solution
    • notification about software updates
    • command line oriented but various GUI exist
    • try to follow standards (XDG specifications on

    I'm questioning seriously why this project could not be considered as a clean and build-independent python packages index system. Moreover, 0install has already some build capabilities (see 0compile) but the ultimate reason is that it will largely facilitate migrations when a new python build standard will emerge.


    0install looks like a mature project driven by smart people and already included in modern distributions. I'll definitively give it a try soon.

  • Converting excel files to CSV using and pyuno


    The Task

    I recently received from a customer a fairly large amount of data, organized in dozens of xls documents, each having dozens of sheets. I need to process this, and in order to ease the manipulation of the documents, I'd rather use standard text files in CSV (Comma Separated Values) format. Of course I didn't want to spend hours manually converting each sheet of each file to CSV, so I thought this would be a good time to get my hands in pyUno.

    So I gazed over the documentation, found the Calc page on the wiki, read some sample code and got started.

    The easy bit

    The first few lines I wrote were (all imports are here, though some were actually added later).

    import logging
    import sys
    import os.path as osp
    import os
    import time
    import uno
    def convert_spreadsheet(filename):
    def run():
        for filename in sys.argv[1:]:
    def configure_log():
        logger = logging.getLogger('')
        handler = logging.StreamHandler(sys.stdout)
        format = "%(asctime)s %(levelname)-7s [%(name)s] %(message)s"
    if __name__ == '__main__':

    That was the easy part. In order to write the convert_spreadsheet function, I needed to open the document. And to do that, I need to start

    Starting OOo

    I started by copy-pasting some code I found in another project, which expected to be already started with the -accept option. I changed that code a bit, so that the function would launch soffice with the correct options if it could not contact an existing instance:

    def _uno_init(_try_start=True):
        """init python-uno bridge infrastructure"""
            # Get the uno component context from the PyUNO runtime
            local_context = uno.getComponentContext()
            # Get the local Service Manager
            local_service_manager = local_context.ServiceManager
            # Create the UnoUrlResolver on the Python side.
            local_resolver = local_service_manager.createInstanceWithContext(
                "", local_context)
            # Connect to the running and get its context.
            # XXX make host/port configurable
            context = local_resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
            # Get the ServiceManager object
            service_manager = context.ServiceManager
            # Create the Desktop instance
            desktop = service_manager.createInstance("")
            return service_manager, desktop
        except Exception, exc:
            if exc.__class__.__name__.endswith('NoConnectException') and _try_start:
      'Trying to start UNO server')
                status = os.system('soffice -invisible -accept="socket,host=localhost,port=2002;urp;"')
      'status = %d', status)
                return _uno_init(False)
                logging.exception("UNO server not started, you should fix that now. "
                                  "`soffice \"-accept=socket,host=localhost,port=2002;urp;\"` "
                                  "or maybe `unoconv -l` might suffice")

    Spreadsheet conversion

    Now the easy (sort of, once you start understanding the OOo API): to load a document, use desktop.loadComponentFromURL(). To get the sheets of a Calc document, use document.getSheets() (that one was easy...). To iterate over the sheets, I used a sample from the SpreadsheetCommon page on the wiki.

    Exporting the CSV was a bit more tricky. The function to use is document.storeToURL(). There are two gotchas, however. The first one, is that we need to specify a filter, and to parameterize it correctly. The second one is that the CSV export filter is only able to export the active sheet, so we need to change the active sheet as we iterate over the sheets.

    Parametrizing the export filter

    The parameters are passed in a tuple of PropertyValue uno structures, as the second argument to the storeToURL method. I wrote a helper function which accepts any named arguments and convert them to such a tuple:

    def make_property_array(**kwargs):
        """convert the keyword arguments to a tuple of PropertyValue uno
        array = []
        for name, value in kwargs.iteritems():
            prop = uno.createUnoStruct("")
            prop.Name = name
            prop.Value = value
        return tuple(array)

    Now, what do we put in that array? The answer is in the FilterOptions page of the wiki : The FilterName property is "Text - txt - csv (StarCalc)". We also need to configure the filter by using the FilterOptions property. This is a string of comma separated values

    • ASCII code of field separator
    • ASCII code of text delimiter
    • character set, use 0 for "system character set", 76 seems to be UTF-8
    • number of first line (1-based)
    • Cell format codes for the different columns (optional)

    I used the value "59,34,76,1", meaning I wanted semicolons for separators, and double quotes for text delimiters.

    Here's the code:

    def convert_spreadsheet(filename):
        """load a spreadsheet document, and convert all sheets to
        individual CSV files"""'processing %s', filename)
        url = "file://%s" % osp.abspath(filename)
        export_mask = make_export_mask(url)
        # initialize Uno, get a Desktop object
        service_manager, desktop = _uno_init()
            # load the Document
            document = desktop.loadComponentFromURL(url, "_blank", 0, ())
            controller = document.getCurrentController()
            sheets = document.getSheets()
  'found %d sheets', sheets.getCount())
            # iterate on all the spreadsheets in the document
            enumeration = sheets.createEnumeration()
            while enumeration.hasMoreElements():
                sheet = enumeration.nextElement()
                name = sheet.getName()
      'current sheet name is %s', name)
                outfilename = export_mask % name.replace(' ', '_')
                                    make_property_array(FilterName="Text - txt - csv (StarCalc)",
                                                        FilterOptions="59,34,76,1" ))
    def make_export_mask(url):
        """convert the url of the input document to a mask for the written
        CSV file, with a substitution for the sheet name
        >>> make_export_mask('file:///home/foobar/somedoc.xls')
        components = url.split('.')
        components[-2] += '$%s'
        components[-1] = 'csv'
        return '.'.join(components)

  • qgpibplotter is (hopefully) working

    2008/09/04 by David Douard

    My latest personal project, pygpibtoolkit, holds a simple HPGL plotter trying to emulate the HP7470A GPIB plotter, using the very nice and cheap Prologix USB-GPIB dongle. This tool is (for now) called qgpibplotter (since it is using the Qt4 toolkit).

    Tonight, I took (at last) the time to make it work nicely. Well, nicely with the only device I own which is capable of plotting on the GPIB bus, my HP3562A DSA.

    Now, you just have to press the "Plot" button of your test equipment, and bingo! you can see the plot on your computer.

  • gajim, dbus and wmii

    2008/09/02 by Adrien Di Mascio

    I've been using for a long time a custom version of gajim in order to make it interact with wmii. More precisely, I have, in my wmii status bar, a dedicated log zone where I print notification messages such as new incoming emails or text received from gajim (with different colors if special words were cited, etc.).

    I recently decided to throw away my custom gajim and use python and dbus to achieve the same goal in a cleaner way. A very basic version can be found in the simpled project. As of now, the only way to get the code is trhough mercurial:

    hg clone

    The source file is named In this file, you'll also find a version sending messages to Ion's status bar.

  • Command-line graphical user interfaces

    2008/09/01 by Nicolas Chauvat

    Graphical user interfaces help command discovery, while command-line interfaces help command efficiency. This article tries to explain why. I reached it when reading the list of references from the introduction to Ubiquity, which is the best extension to firefox I have seen so far. I expect to start writing Ubiquity commands soon, since I have already been using extensively the 'keyword shorcut' functionnality of firefox's bookmarks and we have already done work in the area of 'language interaction', as they call it at Mozilla Labs, when working with Narval. Our Logilab Simple Desktop project, aka simpled, also goes in the same direction since it tries to unify different applications into a coherent work environment by defining basic commands and shorcuts that can be applied everywhere and accessing the rest of the functionnalities via a command-line interface.

show 314 results