Python 2.5 introduced a new module, _ast, for the Abstract Syntax Tree (AST) representation of Python code. It is considerably faster than the compiler.ast representation that logilab-astng (and therefore Pylint) has used until now, and the compiler module was removed in Python 3.0.
Faster is good, but the two modules represent Python code quite differently: some nodes exist in one AST but not in the other, and almost all child nodes have different names.
We had to undertake a big refactoring to use the new _ast module, because we wanted to stay compatible with Python versions < 2.5, which meant keeping support for the compiler module. A lot of work went into finding a common representation for the two trees. In most cases we used _ast-like representations and names, but in some cases we kept ideas or attribute names from compiler.
Let's look at an example to compare both representations. Here is a seemingly harmless snippet of code:
    CODE = """
    if cond:
        del delvar
    elif next:
        print
    """
Now compare the compiler and _ast representations, shown in that order (node names start with an upper-case letter, attribute names are in lower case).
    Module
        node = Stmt
            nodes = [
                If
                    tests = [
                        Name
                            name = 'cond'
                        Stmt
                            nodes = [
                                AssName
                                    flags = 'OP_DELETE'
                                    name = 'delvar'
                            ]
                        Name
                            name = 'next'
                        Stmt
                            nodes = [
                                Printnl
                            ]
                    ]
            ]
    Module
        body = [
            If
                test = Name
                    id = 'cond'
                body = [
                    Delete
                        targets = [
                            Name
                                id = 'delvar'
                        ]
                ]
                orelse = [
                    If
                        test = Name
                            id = 'next'
                        body = [
                            Print
                                nl = True
                        ]
                ]
        ]
Can you spot any differences? I would say they differ quite a lot... For instance, compiler gathers the "if" and "elif" branches into a single list called 'tests', whereas _ast treats "elif cond:" as if it were "else: if cond:".
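The nesting on the _ast side can still be observed today. Here is a minimal sketch, assuming a modern Python 3 interpreter, where the standard ast module exposes the same kind of node classes as _ast. Note that `pass` replaces the Python 2 `print` statement, which Python 3 cannot parse:

```python
import ast

# Same shape as the snippet above; `pass` stands in for the Python 2
# `print` statement (an assumption made so the code parses on Python 3).
CODE = """
if cond:
    del delvar
elif next:
    pass
"""

module = ast.parse(CODE)
outer_if = module.body[0]

# The "elif" branch is not a sibling of the first test: it is an If node
# stored as the only statement in the outer If's orelse list.
inner_if = outer_if.orelse[0]

print(type(outer_if).__name__, outer_if.test.id)
print(type(inner_if).__name__, inner_if.test.id)
print(type(outer_if.body[0]).__name__)
```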
We transform these trees by renaming attributes and nodes, and by removing some nodes or introducing new ones. With compiler, we remove the Stmt node, introduce a Delete node, and recursively build the nested If nodes that come from an "elif". With _ast, we reintroduce the AssName node. This might be only a temporary step towards a fully _ast-like representation.
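To give an idea of the recursive "elif" rebuilding, here is a rough sketch rather than the actual astng code. It assumes the compiler-style 'tests' list holds (test, body) pairs; the `If` class and the `nest_tests` helper are hypothetical names invented for this illustration:

```python
class If(object):
    """Hypothetical stand-in for the unified If node (not astng's class)."""

    def __init__(self, test, body, orelse):
        self.test = test
        self.body = body
        self.orelse = orelse


def nest_tests(tests, else_body=None):
    """Turn compiler-style [(test, body), ...] pairs into nested If nodes."""
    (test, body), remaining = tests[0], tests[1:]
    if remaining:
        # Each following "elif" becomes a lone If inside orelse.
        orelse = [nest_tests(remaining, else_body)]
    else:
        # The last branch receives the final "else" body, if there is one.
        orelse = list(else_body or [])
    return If(test, body, orelse)


# Tests and bodies are plain strings here to keep the sketch short.
tree = nest_tests([("cond", ["Delete"]), ("next", ["Print"])])
print(tree.test, tree.orelse[0].test, tree.orelse[0].orelse)
```

The recursion stops at the last (test, body) pair, so a chain of any number of "elif" branches ends up as a chain of If nodes, each nested in the previous one's orelse.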
This is done by the TreeRebuilder visitors, one per representation, found in astng._nodes_compiler and astng._ast respectively.
In the simplest case, the TreeRebuilder method looks like this (_nodes_compiler):
    def visit_list(self, node):
        node.elts = node.nodes
        del node.nodes
(and nothing to do for _ast).
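To show how such a method could be reached, here is a small self-contained sketch of a visitor that dispatches on the node's class name. It is only an illustration: the `List` stand-in class, the `Rebuilder` class and its `visit` dispatch are assumptions, not the real TreeRebuilder.

```python
class List(object):
    """Hypothetical stand-in for a compiler-style List node."""

    def __init__(self, nodes):
        self.nodes = nodes


class Rebuilder(object):
    """Illustrative visitor; not the actual astng TreeRebuilder."""

    def visit(self, node):
        # Look for a visit_<classname> method. Nodes without one are
        # returned untouched, which is the "nothing to do" case.
        name = "visit_" + type(node).__name__.lower()
        method = getattr(self, name, None)
        if method is not None:
            method(node)
        return node

    def visit_list(self, node):
        # Rename the compiler attribute to the _ast-like one.
        node.elts = node.nodes
        del node.nodes


rebuilt = Rebuilder().visit(List(["a", "b"]))
print(rebuilt.elts)
```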
So, after doing all this and a lot more, we get the following representation from both input trees:
    Module()
        body = [
            If()
                test = Name(cond)
                body = [
                    Delete()
                        targets = [
                            DelName(delvar)
                        ]
                ]
                orelse = [
                    If()
                        test = Name(next)
                        body = [
                            Print()
                                dest = None
                                values = [
                                ]
                        ]
                        orelse = [
                        ]
                ]
        ]
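The DelName node in this tree has no direct _ast counterpart: _ast uses a single Name class and records the usage in its ctx field. As a hedged sketch of how the distinction could be recovered (the `unified_name` helper is hypothetical, and the source does not show how astng actually does it), using the standard ast module of Python 3:

```python
import ast


def unified_name(node):
    """Pick a unified node name for an ast.Name, based on its ctx."""
    if isinstance(node.ctx, ast.Store):
        return "AssName"   # the name is being assigned to
    if isinstance(node.ctx, ast.Del):
        return "DelName"   # the name is being deleted
    return "Name"          # plain read access (ast.Load)


module = ast.parse("value = source\ndel value\n")
assign, delete = module.body

names = [
    unified_name(assign.targets[0]),
    unified_name(assign.value),
    unified_name(delete.targets[0]),
]
print(names)
```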
Of course, these modifications had some API repercussions and thus required a lot of smaller Pylint modifications. But everything was done so that you should see no difference in Pylint's behavior whether you use Python < 2.5 or Python >= 2.5, except that with the _ast module Pylint is about twice as fast!
Oh, and we fixed small bugs on the way and maybe introduced a few new ones...
Finally, it is a major step towards Pylint on Py3k!