Blog entries

PostgreSQL on Windows: plpythonu and "specified module could not be found" error

2010/03/22

I recently had to (remotely) debug an issue on Windows involving PostgreSQL and PL/Python. The setup was two very similar computers, with Python 2.5 installed via python(x,y) and PostgreSQL 8.3.8 installed via the binary installer. On the first machine, create language plpythonu; worked like a charm. On the other one, it failed with C:\Program Files\Postgresql\8.3\plpython.dll: specified module could not be found. This error means that the dynamic linker could not find some DLL that plpython.dll needs. Depends.exe showed that plpython.dll looks for python25.dll (the version it was built against in the 8.3.8 installer), and that DLL was present on the machine.

I'll spare you the various things we tried and jump directly to the solution. After much head scratching, it turned out that the first computer had TortoiseHg installed. This caused C:\Program Files\TortoiseHg to be included in the system PATH environment variable, and that directory contains python25.dll. On the other hand, C:\Python25 was only in the user's PATH environment variable on both computers. As the database Windows service runs under a dedicated local account (typically with login postgres), it does not have C:\Python25 in its PATH. On the machine where TortoiseHg was installed, the service found python25.dll in the TortoiseHg directory instead; on the other machine, it found it nowhere. So the solution was to add C:\Python25 to the system PATH.
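To check this kind of problem faster next time, here is a minimal sketch that lists which directories of a given PATH value contain a given DLL. The function name find_dll_on_path is mine, not something from PostgreSQL or Python. It only inspects PATH, which is just one part of the Windows DLL search order, so treat the result as a hint rather than a verdict.

```python
import os


def find_dll_on_path(dll_name, path_value, sep=os.pathsep):
    """Return the PATH directories, in search order, that contain dll_name.

    Hypothetical helper: it only looks at the PATH string it is given,
    not at the other locations the Windows loader also searches.
    """
    hits = []
    for directory in path_value.split(sep):
        # PATH entries are sometimes quoted or padded with spaces
        directory = directory.strip().strip('"')
        if directory and os.path.isfile(os.path.join(directory, dll_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # To be meaningful, this must run under the same account as the service
    print(find_dll_on_path("python25.dll", os.environ.get("PATH", "")))
```

The important point is to run it with the environment of the service account (postgres in our case), not from your own session, since the user PATH and the system PATH differ.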


Rss feeds aggregator based on Scikits.learn and CubicWeb

2011/10/17 by Vincent Michel

During Euroscipy, the Logilab Team presented an original approach for querying news using semantic information: "Rss feeds aggregator based on Scikits.learn and CubicWeb" by Vincent Michel. This work is based on two major pieces of software:

  • CubicWeb, the pythonic semantic web framework, is used to store and query Dbpedia information. CubicWeb is able to reconstruct links from rdf/nt files, and can easily execute complex queries in a database with more than 8 million entities and 75 million links when using a PostgreSQL backend.
  • Scikits.learn is a cutting-edge Python toolbox for machine learning. It provides algorithms that are simple and easy to use.

Based on these tools, we built a pure Python application to query the news:

  • Named entities are extracted from the RSS articles of a few mainstream English-language newspapers (New York Times, Reuters, BBC News, etc.). For each group of words in an article, we check whether a Dbpedia entry has the same label. If so, we create a semantic link between the article and the Dbpedia entry.
  • An occurrence matrix of "RSS Articles" times "Named Entities" is constructed and can be fed to several machine learning algorithms (MeanShift, hierarchical clustering) in order to provide original and informative views of recent events.
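The two steps above can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the code of the application: the articles and labels are made-up samples, the helper names are hypothetical, and matching is a plain case-insensitive comparison of word groups against labels, with no handling of punctuation.

```python
def word_groups(text, max_len=3):
    """Yield every run of 1 to max_len consecutive words, lowercased."""
    words = text.lower().split()
    for size in range(1, max_len + 1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def occurrence_matrix(articles, labels, max_len=3):
    """Build a binary matrix: one row per article, one column per label.

    A cell is 1 when some group of words in the article equals the label.
    """
    wanted = [label.lower() for label in labels]
    matrix = []
    for text in articles:
        groups = set(word_groups(text, max_len))
        matrix.append([1 if label in groups else 0 for label in wanted])
    return matrix


# Made-up sample data, standing in for RSS articles and Dbpedia labels
articles = [
    "Barack Obama met a scientist in Paris",
    "New drug approved in Paris",
]
labels = ["Barack Obama", "Paris", "drug"]
matrix = occurrence_matrix(articles, labels)
```

Each row of such a matrix describes one article by the entities it mentions, which is the kind of input a clustering algorithm expects.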

Moreover, queries may be used jointly with semantic information from Dbpedia:

  • All musical artists in the news:

    DISTINCT Any E, R WHERE E appears_in_rss R, E has_type T, T label "musical artist"
    
  • All living office holders in the news:

    DISTINCT Any E WHERE E appears_in_rss R, E has_type T, T label "office holder", E has_subject C, C label "Living people"
    
  • All news items that mention both Barack Obama and any scientist:

    DISTINCT Any R WHERE E1 label "Barack Obama", E1 appears_in_rss R, E2 appears_in_rss R, E2 has_type T, T label "scientist"
    
  • All news items that mention a drug:

    Any X, R WHERE X appears_in_rss R, X has_type T, T label "drug"
    

Such a tool may be used for informetrics and news analysis. Feel free to download the complete slides of the presentation.


FOSDEM 2013

2013/02/12 by Pierre-Yves David

I was in Brussels for FOSDEM 2013. As with previous editions, there were too many interesting talks and people to see. Here is a summary of what I saw:

In the Mozilla room:

  1. The HTML5 PDF viewer pdfjs is impressive. The PDF specification is really scary, but this full-featured "native" viewer is able to render most of it with very good performance. Have a look at the pdfjs demo!
  2. An overview of the Firefox debug tools, with a specific focus on the Firefox OS emulator in your browser.
  3. Introduction to webl10n: an internationalization format and library used in Firefox OS. A successful mix that results in a format that is idiot-proof enough for a duck to use, that relies on Unicode specifications to handle complex pluralization rules, and that allows cascading translation definitions.
  4. Status of HTML5 video and audio support in Firefox. The topic looks like a real headache but the team seems to be doing really well. Special mention for the reverse demo effect: the speaker expected some formats to be still unsupported, but someone else had apparently implemented them overnight.
  5. Last but not least, I gave a talk about the changeset evolution concept that I'm putting in Mercurial. Thanks go to Feth for asking me his not-scripted-at-all questions during this talk. (slides)

In the PostgreSQL room:

  1. Insightful talk about adding more event triggers to the PostgreSQL engine, and how this may become the perfect way to break your system.
  2. Full update on the capabilities of PostGIS 2.0. The PostGIS suite was already impressive for storing and querying 2D data, but it now has impressive capabilities regarding 3D data.

On Python-related topics:

  • Victor Stinner has started two interesting projects to improve CPython performance. The first one, astoptimizer, breaks some of the language semantics to apply optimisations when compiling to byte code (lookup caching, constant folding, …). The other, registervm, is a full redefinition of how the interpreter handles references in byte code.
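For readers who have never met constant folding, here is a small sketch of the idea using the standard ast module. It is not astoptimizer's code and says nothing about how that project is implemented: the ConstantFolder class and fold_constants function are names I made up, and only +, - and * on numeric literals are handled.

```python
import ast
import operator

# Only a few arithmetic operators are folded in this sketch
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace binary operations on numeric literals by their result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        func = _FOLDABLE.get(type(node.op))
        if (func is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(value=func(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Parse source and return a tree with foldable constants precomputed."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


# 60 * 60 * 24 is computed once at compile time; x stays a runtime lookup
tree = fold_constants("seconds = 60 * 60 * 24 + x")
namespace = {"x": 1}
exec(compile(tree, "<folded>", "exec"), namespace)
```

After the transformation, the assignment adds x to a single precomputed literal instead of performing two multiplications at run time.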

After FOSDEM, I crossed the Channel to attend a Mercurial sprint in London. Expect more on this topic soon.