Version 3 of Python is incompatible with the 2.x series. To make pylint usable with Python3, I worked on making the logilab-common library Python3 compatible, since pylint depends on it.
The strategy is to have one source code version, and to use the 2to3 tool for publishing a Python3 compatible version.
The first problem was that we use the pytest runner, which depends on logilab.common.testlib, itself an extension of the unittest module.
Without major modification we could use unittest2 instead of unittest in Python2.6. I thought that the unittest2 module was equivalent to the unittest module in Python3, but then realized I was wrong.
I did not investigate whether other versions of unittest and unittest2 correspond to each other.
What we can see is that the 3.1 version of unittest is different from everything else, whereas the 2.6-unittest2 is equivalent to 3.2-unittest. So, after trying to run pytest on Python3.1, and since there is a backport of unittest2 for Python3.1, it became clear that the best option was to ignore py3.1-unittest and work on Python3.2 and unittest2 directly.
Meanwhile, some work was being done on logilab-common to switch from unittest to unittest2. This was included in logilab.common-0.52.
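A common way to express this kind of switch is an import fallback. The snippet below is only a minimal sketch of that idea, not the code that went into logilab.common; ExampleTest and run_example are hypothetical names used for illustration. It prefers unittest2 when it is installed and falls back to the standard unittest otherwise.

```python
try:
    import unittest2 as unittest  # backport, preferred when it is installed
except ImportError:
    import unittest  # standard library module


class ExampleTest(unittest.TestCase):
    """Hypothetical test case relying on a unittest2-style assertion."""

    def test_membership(self):
        # assertIn is provided by unittest2 and by recent unittest versions
        self.assertIn(2, [1, 2, 3])


def run_example():
    """Run ExampleTest and return the collected TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The rest of the test code only refers to the name unittest, so it does not need to know which implementation was actually imported.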
The -3 option of python2.6 warns about constructs that are incompatible with Python3.
Since I already knew that pytest would work with unittest2, I wanted to know as fast as possible if pytest would run on Python3.x. So I ran all logilab.common tests with "python2.6 -3 bin/pytest" and found a couple of problems that I either quick-fixed or set aside until I knew the real solution.
The 2to3 script (from the lib2to3 library) does its best to transform Python2.x code into Python3 compatible code, but manual work is often needed to handle some cases. For example, file is not treated as a deprecated base class, and calls to raw_input(...) are converted, but uses of raw_input as an instance attribute are not. At times, 2to3 can also be overzealous and make modifications such as:
- for name, local_node in node.items():
+ for name, local_node in list(node.items()):
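To illustrate why this change is usually unnecessary, here is a small sketch with a made-up node dictionary (the values are not taken from the real code). In Python3, items() returns a view, which is enough for a read-only loop; the list() snapshot only matters when the loop body adds or removes keys.

```python
node = {"a": 1, "b": 2}  # hypothetical data, for illustration only

# Read-only loop: iterating over the view directly is fine.
names = sorted(name for name, local_node in node.items())

# Mutating loop: the list() snapshot is required here, because deleting
# keys while iterating over the live view raises a RuntimeError in Python3.
for name, local_node in list(node.items()):
    if local_node > 1:
        del node[name]
```

Since 2to3 cannot tell the two situations apart, it adds the list() call in both, and the superfluous ones can be reverted by hand.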
After a while, I found that the best solution was to adopt the following working procedure:
2to3-2.6 -n -w *py test/*py ureports/*py
Since we are in a Mercurial repository, we don't need backups (-n) and we can write the modifications to the files directly (-w).
I used two repositories when working on logilab.common, one for Python2 and one for Python3, because other tools, like astng and pylint, depend on that library. Setting the PYTHONPATH was enough to get astng and pylint to use the right version.
import __builtin__ as builtins  # 2to3 will transform '__builtin__' to 'builtins'
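For code that has to be imported by both interpreters without a 2to3 pass, a frequently used alternative is a guarded import. This is only a sketch of that alternative, not what logilab.common does; has_raw_input and has_input are illustrative names.

```python
try:
    import __builtin__ as builtins  # Python2 module name
except ImportError:
    import builtins  # Python3 module name

# The same attribute lookups now work under both interpreters.
has_raw_input = hasattr(builtins, "raw_input")  # only defined in Python2
has_input = hasattr(builtins, "input")  # defined in both
```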
The most difficult point is the replacement of str/unicode by bytes/str.
In Python3.x, text is always a unicode string, called just str (the u'' syntax and the unicode type disappear), but everything written to disk has to be converted to bytes with some explicit encoding. In Python3.x, file objects opened in text mode have a defined encoding and automatically transform strings to bytes.
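The following Python3-only sketch shows that automatic conversion. The temporary file and the sample text are assumptions made for the example: the string is written through a text-mode file object with an explicit encoding, then read back in binary mode to see the bytes that actually reached the disk.

```python
import os
import tempfile

text = "caf\xe9"  # four characters, the last one is non-ASCII

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Text mode: the file object encodes str to bytes on write.
    with open(path, "w", encoding="utf-8") as stream:
        stream.write(text)
    # Binary mode: read back the raw bytes stored on disk.
    with open(path, "rb") as stream:
        raw = stream.read()
finally:
    os.remove(path)
```

Here raw holds five bytes for four characters, because UTF-8 encodes the accented letter on two bytes.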
I wrote two functions in logilab.common.compat. One converts str to bytes; the other accepts the encoding argument that 2.x code expects and simply ignores it under 3.x. Additional tests may still be needed to make sure the modifications work as expected.
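Two helpers of this kind could look roughly like the sketch below. The names str_to_bytes and str_encode, and the exact behaviour in each branch, are assumptions made for illustration and may differ from the real logilab.common.compat module.

```python
import sys

if sys.version_info[0] >= 3:
    def str_to_bytes(string):
        """Encode text to bytes (str.encode defaults to UTF-8 in Python3)."""
        return string.encode()

    def str_encode(string, encoding):
        """Ignore `encoding`: text-mode streams encode by themselves."""
        return str(string)
else:
    def str_to_bytes(string):
        """In Python2, str already is a byte string."""
        return str(string)

    def str_encode(string, encoding):
        """Encode unicode objects explicitly, as 2.x callers expect."""
        if isinstance(string, unicode):  # this name only exists in Python2
            return string.encode(encoding)
        return str(string)
```

With helpers like these, calling code can keep passing an encoding everywhere, and the compat module decides whether that argument is still meaningful.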
As a general conclusion, I found no need for sa2to3, although it might be a very good tool. I would instead suggest having a small compat module and keeping only one version of the code, as far as possible. The code base can live either on 2.x or on 3.x, with the (possibly customized) 2to3 or 3to2 scripts used to publish the two different versions.