|
Blog entries
Last spring, at the AgileFrance2010 conference, I presented an experience report on the agile management of the Pylos project. My "client" was kind enough to help prepare the presentation and to come and co-present it.
After a long delay, here are the slides (the text is at the end, along with the speaker notes). Enjoy the read.
Thanks to Christine, and to the conference organizers.
I had a bad case of bug hunting today which took me more than 5 hours to track down (with the help of Adrien in the end).
I was trying to start a CubicWeb instance on my computer, and was encountering some strange pyro error at startup. So I edited some source file to add a pdb.set_trace() statement and restarted the instance, waiting for Python's debugger to kick in. But that did not happen. I was baffled. I first checked for standard problems:
- no pdb.py or pdb.pyc was lying around in my Python sys.path
- the pdb.set_trace function had not been silently redefined
- no other thread was bugging me
- the standard input and output were what they were supposed to be
- I was not able to reproduce the issue on other machines
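Most of the checks in this list can be scripted. Here is a minimal sketch of how that could look; the function name debugger_sanity_report and the layout of the report are made up for illustration, and it only covers the checks that can be automated from inside the process:

```python
import pdb
import sys
import threading


def debugger_sanity_report():
    """Collect a few facts that explain most 'pdb does not stop' surprises."""
    return {
        # Where the pdb module was really imported from (a stray pdb.py
        # on sys.path would show up here).
        "pdb_file": getattr(pdb, "__file__", None),
        # The module in which set_trace is defined; anything other than
        # 'pdb' means something replaced or wrapped it.
        "set_trace_module": getattr(pdb.set_trace, "__module__", None),
        # Whether the standard streams are still the ones the interpreter
        # started with.
        "stdin_is_original": sys.stdin is sys.__stdin__,
        "stdout_is_original": sys.stdout is sys.__stdout__,
        # Names of the threads currently alive.
        "threads": [thread.name for thread in threading.enumerate()],
    }


if __name__ == "__main__":
    for key, value in sorted(debugger_sanity_report().items()):
        print("%s: %r" % (key, value))
```

None of this would have caught the actual culprit described in the rest of this post, but it rules out the usual suspects quickly.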
After triple checking everything and grepping everywhere, I asked a question on StackOverflow before taking a lunch break (if you go there, you'll see the answer). After lunch, no useful answer had come in, so I asked Adrien for help, because two pairs of eyes are better than one in some cases. We dutifully traced the pdb module's code down to the underlying bdb and cmd modules and learned some interesting things on the way down there. Finally, we found out that the Python code frames which should have been identical were not. This discovery caused further bafflement. We looked at the frames, and saw that the class of one of those frames was a psyco-generated wrapper.
It turned out that CubicWeb can use two implementations of the RQL module: one which uses gecode (a C++ library for constraint-based programming) and one which uses logilab.constraint (a pure Python library for constraint solving). The former is the default, but it would not load on my computer, because the gecode library had been replaced by a more recent version during an upgrade. The pure Python implementation tries to use psyco to speed things up. Installing the correct version of libgecode solved the issue. End of story.
When I checked out StackOverflow, Ned Batchelder had provided an answer. I didn't get the satisfaction of answering the question myself...
Once this was figured out, solving the initial pyro issue took 2 minutes...
I recently had to (remotely) debug an issue on Windows involving
PostgreSQL and PL/Python. Basically, two very similar computers, with
Python2.5 installed via python(x,y), PostgreSQL 8.3.8 installed via
the binary installer. On the first machine create language
plpythonu; worked like a charm, and on the other one, it failed with
C:\Program Files\Postgresql\8.3\plpython.dll: specified module
could not be found. This is caused by the dynamic linker not finding
some DLL. Using Depends.exe showed that
plpython.dll looks for python25.dll (the one it was built
against in the 8.3.8 installer), but that the DLL was there.
I'll save the various things we tried and jump directly to the
solution. After much head scratching, it turned out that the first
computer had TortoiseHg installed. This caused C:\Program
Files\TortoiseHg to be included in the System PATH environment
variable, and that directory contains python25.dll. On the other
hand C:\Python25 was in the user's PATH environment variable on both
computers. As the database Windows service runs using a dedicated
local account (typically with login postgres), it would not have
C:\Python25 in its PATH, but if TortoiseHg was there, it would
find the DLL in some other directory. So the solution was to add
C:\Python25 to the system PATH.
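When chasing this kind of problem, it helps to list which directories of a given PATH value actually contain the DLL. The sketch below is one way to do it; find_on_path is a hypothetical helper, not something from PostgreSQL or python(x,y), and it only looks at PATH, whereas the Windows loader also searches a few other locations (the application directory and the system directories, for instance).

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories of a PATH-like string that contain `filename`.

    `path_value` defaults to the PATH of the current process, so the result
    depends on which account runs the script.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries produced by stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print(find_on_path("python25.dll"))
```

To be meaningful here, the script has to run under the same account as the database service, since the user PATH and the system PATH differ.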
As part of an ongoing customer project, I've been learning about the Condor queue management system. It is actually more than a batch queue management system, since it tackles the high-throughput computing problem, but in my current project we're not using the full possibilities of Condor, and the choice was dictated by other considerations outside the scope of this note. The documentation is excellent, and the features of the product are really amazing (pity the project runs on Windows, and we cannot use 90% of these...).
To launch a job on a computer participating in the Condor farm, you just have to write a job file which looks like this:
Universe=vanilla
Executable=$path_to_executable
Arguments=$arguments_to_the_executable
InitialDir=$working_directory
Log=$local_logfile_name
Output=$local_file_for_job_stdout
Error=$local_file_for_job_stderr
Queue
and then run condor_submit my_job_file and use condor_q to monitor the status of your job (queued, running...).
My program generates Condor job files and submits them, and I spent hours yesterday trying to understand why they were all failing: the stderr file contained a message from Python complaining that it could not import site, after which the interpreter exited.
A point which was not clear in the documentation I read (but I probably overlooked it) is that the executable mentioned in the job file is supposed to be a local file on the submission host, which is copied to the computer running the job. In the jobs generated by my code, I was using sys.executable for the Executable field, and a path to the Python script I wanted to run in the Arguments field. This resulted in the Python interpreter being copied to the execution host, where it could not run because it was not able to find the standard files it needs at startup.
Once I figured this out, the fix was easy: I made my program write a batch script which launched the Python script and changed the job to run that script.
UPDATE : I'm told there is a Transfer_executable=False line I could have put in the script to achieve the same thing.
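For what it's worth, here is a minimal sketch of a generator for such job files that includes the Transfer_executable=False line from the update. The function name make_job_file and the .log/.out/.err naming scheme are my own assumptions, I have not checked where Condor expects that line relative to the others, and the approach presumes the executable exists at the same path on every execution host.

```python
def make_job_file(executable, arguments, working_dir, basename):
    """Build the text of a vanilla-universe Condor job file.

    Transfer_executable=False asks Condor to use the executable already
    present on the execution host instead of copying it there.
    """
    lines = [
        "Universe=vanilla",
        "Executable=%s" % executable,
        "Arguments=%s" % arguments,
        "InitialDir=%s" % working_dir,
        "Transfer_executable=False",
        "Log=%s.log" % basename,
        "Output=%s.out" % basename,
        "Error=%s.err" % basename,
        "Queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys
    print(make_job_file(sys.executable, "my_script.py", ".", "my_job"))
```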
(photo by gudi&cris licenced under CC-BY-ND)
I regularly come across code such as:
output = os.popen('diff -u %s %s' % (appl_file, ref_file), 'r')
Code like this might well work on your machine, but it is buggy and will fail (preferably during the demo or once shipped).
Where is the bug?
It is in the use of %s, which can inject any string you want into your command, and also strings you don't want. The problem is that you probably did not check appl_file and ref_file for weird things (spaces, quotes, semicolons...), and the whole string is handed to the shell for interpretation. Putting quotes around the %s in the string will not solve the issue, since the file names may themselves contain quotes.
So what should you do? The answer is "use the subprocess module": subprocess.Popen takes a list of arguments as first parameter, which are passed as-is to the new process creation system call of your platform, and not interpreted by the shell:
import subprocess
# each list item reaches diff as one argument, whatever characters it contains
pipe = subprocess.Popen(['diff', '-u', appl_file, ref_file], stdout=subprocess.PIPE)
output = pipe.stdout
By now, you should have guessed that the shell=True parameter of subprocess.Popen should not be used unless you really really need it (and even then, I encourage you to question that need).
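To see the difference for yourself, here is a small sketch which passes a deliberately nasty file name to a child process as a list item. The helper echo_first_argument and the sample name are made up for the demonstration; the child is simply another Python interpreter which writes back the argument it received.

```python
import subprocess
import sys


def echo_first_argument(value):
    """Run a child Python process which writes its first argument back."""
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    pipe = subprocess.Popen([sys.executable, "-c", child_code, value],
                            stdout=subprocess.PIPE)
    output, _ = pipe.communicate()
    return output.decode()


# Spaces, a semicolon and quotes: everything a shell would choke on.
weird_name = "my file; echo 'oops'.txt"

if __name__ == "__main__":
    # The child receives the name unchanged, as a single argument.
    print(echo_first_argument(weird_name) == weird_name)
```

Since no shell is involved, the semicolon never gets a chance to start a second command.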
After several months at a near standstill, Sylvain was able last night
to publish releases fixing a number of bugs in
pylint and astng ([1] and [2]).
Nevertheless, at Logilab we lack the time to
bring down the pile of open tickets in the pylint
tracker. If you take a look at the Tickets tab, you will find
a large number of pending bugs and indispensable
features (some perhaps a little less so than others...). It
is already possible to contribute by using mercurial to provide
patches, or by reporting bugs (aaaaaaaaaarg! more
tickets!), and some people have started doing so: many thanks to them.
Now, we have been wondering what we could do to
move Pylint forward, and our first ideas are:
- organizing a small sprint of about 3 days
- organizing "ticket killing" days, as is done in
various OSS projects
But for this to be useful, we need your help. So here are
a few questions:
- would you take part in a sprint at Logilab (in Paris,
France), which would let us meet, teach you
lots of things about how Pylint works, and
work together on tickets that would help you in your
work?
- if France is too far away, where would suit you?
- would you be willing to join us on Logilab's jabber
server or on IRC, to take part in a ticket hunt (on a
date to be determined)? If so, how well do you know
the internals of Pylint and astng?
You can answer by commenting on this blog (remember to
register using the link at the top right of this page) or
by writing to sylvain.thenault@logilab.fr. If we get enough
positive answers, we will organize something.
The mkstemp function in the tempfile module returns a tuple of 2 values:
- an OS-level handle to an open file (as would be returned by os.open())
- the absolute pathname of that file.
I often see code using mkstemp only to get the filename to the temporary file, following a pattern such as:
from tempfile import mkstemp
import os

def need_temp_storage():
    _, temp_path = mkstemp()
    os.system('some_commande --output %s' % temp_path)
    file = open(temp_path, 'r')
    data = file.read()
    file.close()
    os.remove(temp_path)
    return data
This seems to work fine, but there is a bug hiding in there. The bug will show up on Linux if you call this function many times in a long-running process, and on the first call on Windows: we have leaked a file descriptor.
The first element of the tuple returned by mkstemp is typically an integer used to refer to a file by the OS. In Python, not closing a file is usually no big deal because the garbage collector will ultimately close the file for you, but here we are not dealing with file objects, but with OS-level handles. The interpreter sees an integer and has no way of knowing that the integer is connected to a file. On Linux, calling the above function repeatedly will eventually exhaust the available file descriptors. The program will stop with:
IOError: [Errno 24] Too many open files: '/tmp/tmpJ6g4Ke'
On Windows, it is not possible to remove a file which is still open (here, through the handle we never closed), and you will get:
Windows Error [Error 32]
Fixing the above function requires closing the file descriptor using os.close():
from tempfile import mkstemp
import os

def need_temp_storage():
    fd, temp_path = mkstemp()
    os.system('some_commande --output %s' % temp_path)
    file = open(temp_path, 'r')
    data = file.read()
    file.close()
    os.close(fd)
    os.remove(temp_path)
    return data
If you need your process to write directly in the temporary file, you don't need to call os.write(fd, data). The function os.fdopen(fd) will return a Python file object using the same file descriptor. Closing that file object will close the OS-level file descriptor.
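As an illustration of the os.fdopen() route, here is a minimal sketch; the function name write_then_read is made up, and the round trip through the file is only there to show that the data reached the disk.

```python
import os
from tempfile import mkstemp


def write_then_read(data):
    """Write `data` to a temporary file through its descriptor, read it back."""
    fd, temp_path = mkstemp()
    try:
        # os.fdopen wraps the descriptor returned by mkstemp in a regular
        # file object: the file is not opened a second time.
        with os.fdopen(fd, 'w') as temp_file:
            temp_file.write(data)
        # Leaving the with block closed the file object, and with it the
        # OS-level descriptor, so nothing is leaked.
        with open(temp_path, 'r') as temp_file:
            return temp_file.read()
    finally:
        os.remove(temp_path)


if __name__ == "__main__":
    print(write_then_read("some data\n"))
```

Since every handle is closed before os.remove() is reached, the removal should also succeed on Windows.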
I recently received from a customer a fairly large amount of data, organized in dozens of xls documents, each having dozens of sheets. I need to process this, and in order to ease the manipulation of the documents, I'd rather use standard text files in CSV (Comma Separated Values) format. Of course I didn't want to spend hours manually converting each sheet of each file to CSV, so I thought this would be a good time to get my hands dirty with pyUno.
So I gazed over the documentation, found the Calc page on the OpenOffice.org wiki, read some sample code and got started.
The first few lines I wrote were (all imports are here, though some were actually added later).
import logging
import sys
import os.path as osp
import os
import time
import uno

def convert_spreadsheet(filename):
    pass

def run():
    for filename in sys.argv[1:]:
        convert_spreadsheet(filename)

def configure_log():
    logger = logging.getLogger('')
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(sys.stdout)
    logger.addHandler(handler)
    format = "%(asctime)s %(levelname)-7s [%(name)s] %(message)s"
    handler.setFormatter(logging.Formatter(format))

if __name__ == '__main__':
    configure_log()
    run()
That was the easy part. In order to write the convert_spreadsheet function, I needed to open the document. And to do that, I needed to start OpenOffice.org.
I started by copy-pasting some code I found in another project, which
expected OpenOffice.org to be already started with the -accept
option. I changed that code a bit, so that the function would launch
soffice with the correct options if it could not contact an existing
instance:
def _uno_init(_try_start=True):
    """init python-uno bridge infrastructure"""
    try:
        # Get the uno component context from the PyUNO runtime
        local_context = uno.getComponentContext()
        # Get the local Service Manager
        local_service_manager = local_context.ServiceManager
        # Create the UnoUrlResolver on the Python side.
        local_resolver = local_service_manager.createInstanceWithContext(
            "com.sun.star.bridge.UnoUrlResolver", local_context)
        # Connect to the running OpenOffice.org and get its context.
        # XXX make host/port configurable
        context = local_resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
        # Get the ServiceManager object
        service_manager = context.ServiceManager
        # Create the Desktop instance
        desktop = service_manager.createInstance("com.sun.star.frame.Desktop")
        return service_manager, desktop
    except Exception as exc:
        if exc.__class__.__name__.endswith('NoConnectException') and _try_start:
            logging.info('Trying to start UNO server')
            status = os.system('soffice -invisible -accept="socket,host=localhost,port=2002;urp;"')
            time.sleep(2)
            logging.info('status = %d', status)
            return _uno_init(False)
        else:
            logging.exception("UNO server not started, you should fix that now. "
                              "`soffice \"-accept=socket,host=localhost,port=2002;urp;\"` "
                              "or maybe `unoconv -l` might suffice")
            raise
Now the easy part (sort of, once you start understanding the OOo API): to
load a document, use desktop.loadComponentFromURL(). To get the sheets
of a Calc document, use document.getSheets() (that one was
easy...). To iterate over the sheets, I used a sample from the
SpreadsheetCommon page on the OpenOffice.org wiki.
Exporting the CSV was a bit more tricky. The function to use is
document.storeToURL(). There are two gotchas, however. The first one
is that we need to specify a filter, and to parameterize it
correctly. The second one is that the CSV export filter is only able
to export the active sheet, so we need to change the active sheet as
we iterate over the sheets.
The parameters are passed in a tuple of PropertyValue uno structures,
as the second argument to the storeToURL method. I wrote a helper
function which accepts any named arguments and converts them to such a
tuple:
def make_property_array(**kwargs):
    """convert the keyword arguments to a tuple of PropertyValue uno
    structures"""
    array = []
    for name, value in kwargs.items():
        prop = uno.createUnoStruct("com.sun.star.beans.PropertyValue")
        prop.Name = name
        prop.Value = value
        array.append(prop)
    return tuple(array)
Now, what do we put in that array? The answer is in the FilterOptions
page of the wiki: the FilterName property is "Text - txt - csv
(StarCalc)". We also need to configure the filter by using the
FilterOptions property. This is a string of comma-separated values:
- ASCII code of field separator
- ASCII code of text delimiter
- character set, use 0 for "system character set", 76 seems to be UTF-8
- number of first line (1-based)
- Cell format codes for the different columns (optional)
I used the value "59,34,76,1", meaning I wanted semicolons for
separators, and double quotes for text delimiters.
Here's the code:
def convert_spreadsheet(filename):
    """load a spreadsheet document, and convert all sheets to
    individual CSV files"""
    logging.info('processing %s', filename)
    url = "file://%s" % osp.abspath(filename)
    export_mask = make_export_mask(url)
    # initialize Uno, get a Desktop object
    service_manager, desktop = _uno_init()
    # load the Document (before the try block, so that the finally
    # clause never refers to an unbound `document` if loading fails)
    document = desktop.loadComponentFromURL(url, "_blank", 0, ())
    try:
        controller = document.getCurrentController()
        sheets = document.getSheets()
        logging.info('found %d sheets', sheets.getCount())
        # iterate on all the spreadsheets in the document
        enumeration = sheets.createEnumeration()
        while enumeration.hasMoreElements():
            sheet = enumeration.nextElement()
            name = sheet.getName()
            logging.info('current sheet name is %s', name)
            # the CSV filter only exports the active sheet
            controller.setActiveSheet(sheet)
            outfilename = export_mask % name.replace(' ', '_')
            document.storeToURL(outfilename,
                                make_property_array(FilterName="Text - txt - csv (StarCalc)",
                                                    FilterOptions="59,34,76,1"))
    finally:
        document.close(True)

def make_export_mask(url):
    """convert the url of the input document to a mask for the written
    CSV file, with a substitution for the sheet name

    >>> make_export_mask('file:///home/foobar/somedoc.xls')
    'file:///home/foobar/somedoc$%s.csv'
    """
    components = url.split('.')
    components[-2] += '$%s'
    components[-1] = 'csv'
    return '.'.join(components)
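The magic "59,34,76,1" string can be derived from the characters it stands for rather than typed by hand. A small sketch, in which the helper name make_csv_filter_options and its default values are mine, and the 76 for UTF-8 is only the value I observed, as mentioned above:

```python
def make_csv_filter_options(separator=';', delimiter='"', charset=76,
                            first_line=1):
    """Build the FilterOptions string for the CSV export filter.

    The first two fields are the ASCII codes of the field separator and
    of the text delimiter, followed by the character set code and the
    (1-based) number of the first line.
    """
    return "%d,%d,%d,%d" % (ord(separator), ord(delimiter),
                            charset, first_line)


if __name__ == "__main__":
    print(make_csv_filter_options())
```

The result can then be passed as the FilterOptions keyword argument of make_property_array.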
A Google custom search engine for Python has been made available by Gerard Flanagan, indexing:
To refine the search to any of the individual sites, you can specify a
refinement using the following labels: stdlib, wiki, pypi,
thehazeltree
So, to just search the python wiki, you would enter:
somesearchterm more:wiki
and similarly:
somesearchterm more:stdlib
somesearchterm more:pypi
somesearchterm more:thehazeltree
The Hazel Tree is a collection of
popular Python texts that I have converted to reStructuredText and put
together using Sphinx. It's in a publishable
state, but not as polished as I'd like, and since I'll be mostly offline for
the next month it will have to remain as it is for the present. However,
the search engine is ready now and the clock is ticking on its subscription (one year, renewal depending on success of site), so if it's useful to anyone, it's all yours (and if you use it on your own site a
link back to http://thehazeltree.org would be appreciated).
A problem I ran into yesterday: a unit test crashes on Windows, after creating an object which keeps files open. The test's tearDown is called, but it crashes because Windows refuses to delete open files, and the test framework keeps a reference to the test function so that the call stack can be examined. On Linux, no problem (deleting an open file from the disk is allowed, so there is no trouble in the tearDown).
A few leads to work around the problem:
- put the test in a try...finally, with a del of the object keeping the files open in the finally clause. Drawback: when the test fails, pdb no longer lets you see much
- instead of cleaning up in tearDown, clean up later, in an atexit handler for example. It remains to be seen how this behaves if several tests want to write to the same files (I think one temporary directory per test would be needed if we want several failing tests whose data can still be examined, but this has to be tested to be sure)
- wrap the removal of each file in a try...except in tearDown, and put the offending files in a list which will be processed when the program exits (with atexit for example).
This looks like a kludge, but we are facing a Windows behaviour over which we have no control (even with Administrator or System privileges, there is no way around this impossibility of deleting an open file, as far as I know).
Another, much heavier, approach would be to virtualize file creation in order to work in memory (at a minimum override os.mkdir and the open builtin, and even, in the case at hand, the modules working with zip files). There may be things like that floating around. Asking on the TIP list might bring some answers (a quick search of the archives turned up nothing).
See also these threads from March 2004 and November 2004 on comp.lang.python.
How can I test if a Python float is "not a number" without depending on numpy? Simple: a NaN value compares unequal to every value, including itself:
def isnan(x):
    return isinstance(x, float) and x != x
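A quick way to convince yourself is to run the function on a few values. The sketch below repeats the definition so that it can be run on its own, and compares it with math.isnan from the standard library (available since Python 2.6), which is an alternative when non-float inputs do not need to be filtered out.

```python
import math


def isnan(x):
    # NaN is the only float value which is not equal to itself.
    return isinstance(x, float) and x != x


nan = float('nan')

if __name__ == "__main__":
    for value in (nan, 1.0, float('inf'), 1, 'nan'):
        print("%r -> %s" % (value, isnan(value)))
    # The standard library function agrees on actual floats.
    print(math.isnan(nan))
```

Unlike math.isnan, this version returns False instead of raising an exception when handed something which is not a number at all, such as a string.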
|