latest blogs

Using branches in mercurial

The more we use Mercurial to manage our code repositories, the more we enjoy its extended functionality. Lately we've been experimenting with branches, which have turned out to be very useful. We also use hgview instead of the built-in "hg view" command, and its latest release supports branches: you can filter the display to show only the branch you want to look at. Update your installation (apt-get upgrade?) to enjoy this new feature, or download it.

A new way of distributing Python code ?

On distutils-sig, the question of replacing distutils/setuptools is raised frequently, and a lot of effort goes into finding the best way to build and distribute Python code.

I don't understand why build and distribution are so tightly coupled (setuptools and PyPI, to be precise), and I'm not convinced by this "global" approach. I hope the Python community will consider changing that and splitting the problem into two separate projects.

One of Python's most successful ideas is its ability to extend other languages. In fact, that is the major problem to solve on the build side. I'm pretty sure it will take a long time to arrive at a valuable (and adopted) solution, and the problem is complicated enough that the choice of build chain should remain the responsibility of the upstream maintainers for now (distutils, setuptools, makefile, SCons, ...).

Concerning the distribution, here are the mandatory features I expect:

  • installing source code and managing dependencies, including foreign contributions
  • building binaries without interacting with the primary host system
  • being platform-agnostic (Linux, BSD, Windows, Mac, ...)
  • clean upgrade/uninstall
  • some kind of sandbox for testing and development mode
  • no administrator privileges required

I found the 0install project homepage (http://0install.net) and was really impressed by the amount of functionality already available and by its numerous other advantages, such as:

  • multiple version installation
  • reuse external distribution effort (integrate deb, rpm, ...)
  • digital signatures
  • basic mirroring solution
  • notification about software updates
  • command-line oriented, but various GUIs exist
  • tries to follow standards (XDG specifications on freedesktop.org)

I seriously wonder why this project could not be considered as a clean, build-independent index system for Python packages. Moreover, 0install already has some build capabilities (see 0compile), but the ultimate reason is that it would greatly ease migrations when a new Python build standard emerges.

Conclusion

0install is a mature project driven by smart people and already present in modern distributions. I'll definitely give it a try soon.

Converting Excel files to CSV using OpenOffice.org and pyuno

The Task

I recently received from a customer a fairly large amount of data, organized in dozens of xls documents, each having dozens of sheets. I needed to process this, and to ease the manipulation of the documents, I preferred to work with standard text files in CSV (Comma Separated Values) format. Of course I didn't want to spend hours manually converting each sheet of each file to CSV, so I thought this would be a good time to get my hands dirty with pyUno.

So I skimmed the documentation, found the Calc page on the OpenOffice.org wiki, read some sample code and got started.

The easy bit

The first few lines I wrote were the following (all imports are here, though some were actually added later):

import logging
import sys
import os.path as osp
import os
import time

import uno

def convert_spreadsheet(filename):
    pass

def run():
    for filename in sys.argv[1:]:
        convert_spreadsheet(filename)

def configure_log():
    logger = logging.getLogger('')
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(sys.stdout)
    logger.addHandler(handler)
    format = "%(asctime)s %(levelname)-7s [%(name)s] %(message)s"
    handler.setFormatter(logging.Formatter(format))

if __name__ == '__main__':
    configure_log()
    run()

That was the easy part. In order to write the convert_spreadsheet function, I needed to open the document. And to do that, I needed to start OpenOffice.org.

Starting OOo

I started by copy-pasting some code I found in another project, which expected OpenOffice.org to be already started with the -accept option. I changed that code a bit, so that the function would launch soffice with the correct options if it could not contact an existing instance:

def _uno_init(_try_start=True):
    """init python-uno bridge infrastructure"""
    try:
        # Get the uno component context from the PyUNO runtime
        local_context = uno.getComponentContext()
        # Get the local Service Manager
        local_service_manager = local_context.ServiceManager
        # Create the UnoUrlResolver on the Python side.
        local_resolver = local_service_manager.createInstanceWithContext(
            "com.sun.star.bridge.UnoUrlResolver", local_context)
        # Connect to the running OpenOffice.org and get its context.
        # XXX make host/port configurable
        context = local_resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
        # Get the ServiceManager object
        service_manager = context.ServiceManager
        # Create the Desktop instance
        desktop = service_manager.createInstance("com.sun.star.frame.Desktop")
        return service_manager, desktop
    except Exception as exc:
        if exc.__class__.__name__.endswith('NoConnectException') and _try_start:
            logging.info('Trying to start UNO server')
            status = os.system('soffice -invisible -accept="socket,host=localhost,port=2002;urp;"')
            time.sleep(2)
            logging.info('status = %d', status)
            return _uno_init(False)
        else:
            logging.exception("UNO server not started, you should fix that now. "
                              "`soffice \"-accept=socket,host=localhost,port=2002;urp;\"` "
                              "or maybe `unoconv -l` might suffice")
            raise
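
The fixed time.sleep(2) in _uno_init assumes that soffice is ready to accept connections after two seconds, which may not hold on a slow or busy machine. A minimal sketch of an alternative, assuming only the standard socket module, is to poll the port until it accepts a TCP connection. The wait_for_port helper and its parameters are hypothetical names introduced for illustration; they are not part of pyuno.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True as soon as a TCP connection to (host, port) succeeds,
    or False if `timeout` seconds elapse first."""
    deadline = time.time() + timeout
    while True:
        try:
            sock = socket.create_connection((host, port), timeout=interval)
        except OSError:
            # nobody is listening yet: give up or retry a bit later
            if time.time() >= deadline:
                return False
            time.sleep(interval)
        else:
            sock.close()
            return True
```

With such a helper, the sleep could be replaced by something like wait_for_port('localhost', 2002) before the recursive call. This only checks that the port is open; it does not prove that the UNO bridge behind it is fully initialized.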

Spreadsheet conversion

Now for the easy part (sort of, once you start understanding the OOo API): to load a document, use desktop.loadComponentFromURL(). To get the sheets of a Calc document, use document.getSheets() (that one was easy...). To iterate over the sheets, I used a sample from the SpreadsheetCommon page on the OpenOffice.org wiki.

Exporting to CSV was a bit trickier. The function to use is document.storeToURL(). There are two gotchas, however. The first is that we need to specify a filter and parameterize it correctly. The second is that the CSV export filter can only export the active sheet, so we need to change the active sheet as we iterate over the sheets.

Parametrizing the export filter

The parameters are passed in a tuple of PropertyValue uno structures, as the second argument to the storeToURL method. I wrote a helper function which accepts any named arguments and converts them to such a tuple:

def make_property_array(**kwargs):
    """convert the keyword arguments to a tuple of PropertyValue uno
    structures"""
    array = []
    for name, value in kwargs.items():
        prop = uno.createUnoStruct("com.sun.star.beans.PropertyValue")
        prop.Name = name
        prop.Value = value
        array.append(prop)
    return tuple(array)

Now, what do we put in that array? The answer is in the FilterOptions page of the wiki: the FilterName property is "Text - txt - csv (StarCalc)". We also need to configure the filter using the FilterOptions property. This is a string of comma-separated values:

  • ASCII code of field separator
  • ASCII code of text delimiter
  • character set, use 0 for "system character set", 76 seems to be UTF-8
  • number of first line (1-based)
  • Cell format codes for the different columns (optional)

I used the value "59,34,76,1", meaning I wanted semicolons for separators, and double quotes for text delimiters.
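
Since the first two fields are character codes, the string can be built from the characters themselves rather than typed by hand. The following is only a sketch: make_filter_options is a hypothetical helper, and its defaults simply reproduce the value used in this post.

```python
def make_filter_options(separator=";", delimiter='"', charset=76, first_line=1):
    """Build the FilterOptions string for the CSV export filter.

    `separator` and `delimiter` are single characters; they are converted
    to their character codes, which is what the filter expects.
    """
    fields = [ord(separator), ord(delimiter), charset, first_line]
    return ",".join(str(field) for field in fields)
```

Calling make_filter_options() gives "59,34,76,1", and make_filter_options(separator=",") gives the comma-separated variant "44,34,76,1".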

Here's the code:

def convert_spreadsheet(filename):
    """load a spreadsheet document, and convert all sheets to
    individual CSV files"""
    logging.info('processing %s', filename)
    url = "file://%s" % osp.abspath(filename)
    export_mask = make_export_mask(url)
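    # initialize Uno, get a Desktop object
    service_manager, desktop = _uno_init()
    # load the Document (outside the try block, so that the finally
    # clause never sees an unbound `document` if loading fails)
    document = desktop.loadComponentFromURL(url, "_blank", 0, ())
    try: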
        controller = document.getCurrentController()
        sheets = document.getSheets()
        logging.info('found %d sheets', sheets.getCount())

        # iterate on all the spreadsheets in the document
        enumeration = sheets.createEnumeration()
        while enumeration.hasMoreElements():
            sheet = enumeration.nextElement()
            name = sheet.getName()
            logging.info('current sheet name is %s', name)
            controller.setActiveSheet(sheet)
            outfilename = export_mask % name.replace(' ', '_')
            document.storeToURL(outfilename,
                                make_property_array(FilterName="Text - txt - csv (StarCalc)",
                                                    FilterOptions="59,34,76,1" ))
    finally:
        document.close(True)

def make_export_mask(url):
    """convert the url of the input document to a mask for the written
    CSV file, with a substitution for the sheet name

    >>> make_export_mask('file:///home/foobar/somedoc.xls')
    'file:///home/foobar/somedoc$%s.csv'
    """

    components = url.split('.')
    components[-2] += '$%s'
    components[-1] = 'csv'
    return '.'.join(components)
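
Once the sheets are exported, the resulting files can be read back with the standard csv module. This is a minimal sketch for Python 3, assuming the files were written with the "59,34,76,1" options used above (semicolon separators, double-quote text delimiters, and UTF-8 if 76 is indeed the UTF-8 character set). read_exported_csv is a hypothetical helper name.

```python
import csv


def read_exported_csv(path):
    """Return the rows of a CSV file exported with FilterOptions
    "59,34,76,1", as a list of lists of strings."""
    # newline="" lets the csv module handle line endings itself
    with open(path, encoding="utf-8", newline="") as stream:
        return list(csv.reader(stream, delimiter=";", quotechar='"'))
```

Each row comes back as a list of strings, with quoted fields unquoted and any semicolons inside quotes preserved.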

qgpibplotter is (hopefully) working

My latest personal project, pygpibtoolkit, holds a simple HPGL plotter trying to emulate the HP7470A GPIB plotter, using the very nice and cheap Prologix USB-GPIB dongle. This tool is (for now) called qgpibplotter (since it is using the Qt4 toolkit).

Tonight, I took (at last) the time to make it work nicely. Well, nicely with the only device I own which is capable of plotting on the GPIB bus, my HP3562A DSA.

Now, you just have to press the "Plot" button of your test equipment, and bingo! you can see the plot on your computer.

gajim, dbus and wmii

For a long time I have been using a custom version of gajim in order to make it interact with wmii. More precisely, I have, in my wmii status bar, a dedicated log zone where I print notification messages such as new incoming emails or text received from gajim (with different colors if special words were cited, etc.).

I recently decided to throw away my custom gajim and use Python and dbus to achieve the same goal in a cleaner way. A very basic version can be found in the simpled project. As of now, the only way to get the code is through mercurial:

hg clone http://www.logilab.org/hg/simpled

The source file is named gajimnotifier.py. In this file, you'll also find a version sending messages to Ion's status bar.

Command-line graphical user interfaces

Graphical user interfaces help command discovery, while command-line interfaces help command efficiency. This article tries to explain why. I came across it while reading the list of references in the introduction to Ubiquity, which is the best Firefox extension I have seen so far. I expect to start writing Ubiquity commands soon, since I already make extensive use of the 'keyword shortcut' functionality of Firefox's bookmarks, and we already did some work in the area of 'language interaction', as they call it at Mozilla Labs, when working on Narval. Our Logilab Simple Desktop project, aka simpled, goes in the same direction: it tries to unify different applications into a coherent work environment by defining basic commands and shortcuts that can be applied everywhere, and by giving access to the rest of the functionality via a command-line interface.

Is the Openmoko Freerunner a computer or a phone ?

The Openmoko Freerunner is a computer with embedded GSM, accelerometer and GPS. I got mine last week, after waiting a month for the batch to get from Taiwan to the French company I bought it from. The first thing I had to admit was that it will take some time before it is comfortable to use as a phone. The current version of the system has many oddities in its user interface, and although the phone works, the person at the other end of the call suffers from a very unpleasant echo.

I will try to install Debian, Qtopia and Om2008.8 to compare them. I also want to quickly get Python scripts to run on it and get back to Narval hacking. I had an agent running on a bulky Palm+GPS+radionetwork back in 1999, and I look forward to running on this device the same kind of fun things I was doing in AI research ten years ago.

simpled - Simple Desktop project started !

Last week I bought a new laptop computer that can drive a 24" LCD monitor, which means I do not need my desktop computer any more. In the process of setting up that new laptop, I did what I had been wanting to do for years without finding the time: work on my ion3 config to make it more generic, and create a small Python setup utility that can regenerate it from a template file and a keyboard layout.

The simpled project was born!

If you take a look at the list of pending tickets, you will guess that I use a limited number of pieces of software during my work day and have tried to configure them so that they share common actions/shortcuts. This is what simpled is about: given a keyboard layout, generate the config files for the common tools so that actions/shortcuts are always on the same key.

I use ion3, xterm+bash, emacs, mutt, firefox, gajim. Common actions are: open, save, close, move up/down/left/right, new frame or tab, close frame or tab, move to previous or next tab, etc.
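
To give an idea of what "regenerating a config from a template and a keyboard layout" can look like, here is a minimal sketch using string.Template from the standard library. Everything in it is hypothetical: the LAYOUT mapping, the TEMPLATE line (written in the style of ion3 key bindings) and the render_bindings function are made up for illustration, and the actual simpled code may work quite differently.

```python
from string import Template

# Hypothetical keyboard layout: logical action -> key.
LAYOUT = {"open": "o", "save": "s", "close": "w"}

# Hypothetical template line, in the style of an ion3 key binding.
TEMPLATE = Template('kpress(META.."$key", "$action()"),')


def render_bindings(layout, template=TEMPLATE):
    """Return one binding line per action, sorted by action name."""
    lines = []
    for action in sorted(layout):
        lines.append(template.substitute(key=layout[action], action=action))
    return "\n".join(lines)
```

Switching to another keyboard layout then only means passing a different mapping; the template, and hence the set of actions, stays the same for every tool.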

I will post news on this blog from time to time and announce it on mailing lists when version 0.1 is out. If you want to give it a try, get the code from the mercurial repository.

Simile-Widgets

While working on knowledge management and semantic web technologies, I came across the Simile project at MIT a few years back. I even had a demo of the Exhibit widget fetching then displaying data from our semantic web application framework back in 2006 at the Web2 track of Solutions Linux in Paris.

Now that we are using these widgets when implementing web apps for clients, I was happy to see that the projects got a life of their own outside of MIT and became full-fledged free-software projects hosted on Google Code. See Simile-Widgets for more details, and expect us to provide a Debian package soon unless someone does it first.

Speaking of Debian, here is a nice demo of the Timeline widget presenting the Debian history.

SciPy and TimeSeries

We have been using many different tools for statistical analysis with Python, including R, SciPy, specific C++ code, etc. It looks like SciPy's growing audience is now moving towards having dedicated modules in SciPy (let's call them SciKits). See this thread on the SciPy-user mailing list.

Google Custom Search Engine, for Python

A Google custom search engine for Python has been made available by Gerard Flanagan, indexing:

Using refinements

To refine the search to any of the individual sites, you can specify a refinement using the following labels: stdlib, wiki, pypi, thehazeltree

So, to just search the python wiki, you would enter:

somesearchterm more:wiki

and similarly:

somesearchterm more:stdlib
somesearchterm more:pypi
somesearchterm more:thehazeltree

About http://thehazeltree.org

The Hazel Tree is a collection of popular Python texts that I have converted to reStructuredText and put together using Sphinx. It's in a publishable state, but not as polished as I'd like, and since I'll be mostly offline for the next month it will have to remain as it is for the present. However, the search engine is ready now and the clock is ticking on its subscription (one year, renewal depending on success of site), so if it's useful to anyone, it's all yours (and if you use it on your own site a link back to http://thehazeltree.org would be appreciated).

Python for applied Mathematics

The presentation of Python as a tool for applied mathematics was highlighted at the 2008 annual meeting of the American Society for Industrial and Applied Mathematics (SIAM). For more information, read this blog post and the slides.

Windows, open files and unit tests

A problem we ran into yesterday: a unit test crashes under Windows after creating an object that keeps files open. The test's tearDown is called, but it fails because Windows refuses to delete open files, and the test framework keeps a reference to the test function so that the call stack can be examined. Under Linux there is no problem (deleting an open file from disk is allowed, so there is no trouble in the tearDown).

A few ideas to work around the problem:

  1. wrap the test in a try...finally, with a del on the object that keeps the files open in the finally clause. Drawback: when the test fails, pdb no longer shows much
  2. instead of cleaning up in tearDown, clean up later, in an atexit handler for example. We need to see what happens if several tests want to write to the same files (I think we would need one temporary directory per test if we want to be able to have several failing tests and examine their data, but this has to be tested to be sure)
  3. put a try...except in tearDown around the deletion of each file, and add the files that cause trouble to a list that will be processed when the program exits (with atexit for example).

It looks like a kludge, but we are dealing with a Windows behaviour over which we have no control (even with Administrator or System privileges, this inability to delete an open file cannot be worked around, as far as I know).

Another, much heavier approach would be to virtualize file creation so as to work in memory (at the very least overriding os.mkdir and the builtin open, and possibly, in the case at hand, the modules that work with zip files). There may be things like that around. Asking on the TIP list may bring some answers (a quick search of the archives turned up nothing).

See also these threads from March 2004 and November 2004 on comp.lang.python.

ion, dock and screen configuration

I have a laptop I use at work (with a docking station), in the train and at home (with an external display), on which my environment is ion3.

As I use suspend-to-RAM all the time, I have added some keybindings to automatically reconfigure my screen when I plug/unplug an external display (on the dock as well as direct VGA connection).

The Lua code to paste into your .ion3/cfg_ion.lua for the bindings looks like:

function autoscreen_on()
    local f = io.popen('/home/david/bin/autoscreen -c', 'r')
    if not f then
        return
    end
    local s = f:read('*a')
    f:close()
    ioncore.restart()
end

function autoscreen_off()
    local f = io.popen('/home/david/bin/autoscreen -d', 'r')
    if not f then
        return
    end
    local s = f:read('*a')
    f:close()
    ioncore.restart()
end

defbindings("WMPlex.toplevel", {
    bdoc("Turn on any external display and tell ion to reconfigure itself"),
    kpress(META.."F10",
           "autoscreen_on()"),
})

defbindings("WMPlex.toplevel", {
    bdoc("Turn off any external display and tell ion to reconfigure itself"),
    kpress(META.."F11",
           "autoscreen_off()"),
})

It makes use of the following Python script (named /home/david/bin/autoscreen in the Lua code above):

#!/usr/bin/env python

import sys
import os
import re
from subprocess import Popen, PIPE
import optparse
parser = optparse.OptionParser("A simple automatic screen configurator (using xrandr)")
parser.add_option('-c', '--connect', action="store_true",
                  dest='connect',
                  default=False,
                  help="configure every connected screen")
parser.add_option('-d', '--disconnect', action="store_true",
                  dest='disconnect',
                  default=False,
                  help="unconfigure every connected screen other than LVDS (laptop screen)")
parser.add_option('', '--main-display',
                  dest='maindisplay',
                  default="LVDS",
                  help="main display identifier (typically, the laptop LCD screen; defaults to LVDS)")

options, args = parser.parse_args()

if options.connect and options.disconnect:
    print("ERROR: only one option -c or -d at a time")
    parser.print_help()
    sys.exit(1)


# universal_newlines=True makes the output a text string on Python 2 and 3
xrandr = Popen(["xrandr"], stdout=PIPE, universal_newlines=True).communicate()[0]

# names of all connected outputs, except the main display
connected = re.findall(r'([a-zA-Z0-9-]*) connected', xrandr)
connected = [c for c in connected if c != options.maindisplay]

cmd = "xrandr --output %s %s"

if options.connect or options.disconnect:
    # exactly one of the two options is set at this point
    if options.connect:
        action = "--auto"
    else:
        action = "--off"
    for c in connected:
        # run xrandr for this output and wait for it to finish
        Popen(cmd % (c, action), shell=True).wait()
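
To see what the regular expression in this script picks up, the parsing step can be isolated in a small function and run on a sample. The SAMPLE text below is made up in the general shape of xrandr output (real output differs between drivers and versions), and external_outputs is a hypothetical name for the two "connected" lines of the script.

```python
import re


def external_outputs(xrandr_output, main_display="LVDS"):
    """Return the names of connected outputs other than `main_display`."""
    connected = re.findall(r'([a-zA-Z0-9-]*) connected', xrandr_output)
    return [name for name in connected if name != main_display]


# Made-up sample, for illustration only.
SAMPLE = """\
Screen 0: minimum 320 x 200, current 1280 x 800, maximum 2560 x 1024
VGA connected 1280x1024+0+0 (normal left inverted right) 376mm x 301mm
LVDS connected 1280x800+0+0 (normal left inverted right) 261mm x 163mm
TV disconnected (normal left inverted right)
"""
```

On this sample, external_outputs(SAMPLE) keeps only VGA: the TV line does not match because "disconnected" is not preceded by a space followed directly by "connected", and LVDS is filtered out as the main display.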
We're going to Europython

Hey,

We've decided to go to Europython this year. We're obviously going to give a talk about the exciting things we're doing with LAX and Google App Engine. We're on Wednesday at midday in the alfa room; check out the schedule here. Since we think it's important that these events take place, we're also chipping in and sponsoring the event.

We hope to see you there. Drop us a note if you want to meet up.
