Supporting _ast and compiler

Python 2.5 introduces a new module, _ast, for the Abstract Syntax Tree (AST) representation of Python code. This module is much faster than the compiler.ast representation that logilab-astng (and therefore pylint) used until now, and the compiler module was removed in Python 3.0.

Faster is good, but the representations of Python code are quite different in _ast and in compiler: some nodes exist in one AST but not the other, and almost all child nodes have different names. We had to engage in a big refactoring to use the new _ast module, since we wanted to stay compatible with Python versions < 2.5, which meant keeping the compiler module support.

A lot of work was done to find a common representation for the two different trees. In most cases we used _ast-like representations and names, but in some cases we kept ideas or attribute names of compiler.

Abstract Syntax Trees

Let's look at an example to compare both representations. Here is a seemingly harmless snippet of code:

CODE = """
if cond:
    del delvar
elif next:
    print
"""
Now, compare the respective _ast and compiler representations (node names are capitalized and their attributes are in lower case).

compiler representation
Module
    node =
    Stmt
        nodes = [
        If
            tests = [
            Name
                name = 'cond'
            Stmt
                nodes = [
                AssName
                    flags = 'OP_DELETE'
                    name = 'delvar'
                ]
            Name
                name = 'next'
            Stmt
                nodes = [
                Printnl
                ]
            ]
_ast representation
Module
    body = [
    If
        test =
        Name
            id = 'cond'
        body = [
        Delete
            targets = [
            Name
                id = 'delvar'
            ]
        ]
        orelse = [
        If
            test =
            Name
                id = 'next'
            body = [
            Print
                nl = True
            ]
        ]
    ]
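As a side note, the _ast tree above can be reproduced with the standard-library ast module, which exposes the same node classes. The sketch below assumes a Python 3 interpreter, where the bare print parses as an ordinary name expression instead of a Print node; the variable names outer_if, inner_if and delete_target are only illustrative and do not come from astng.

```python
import ast

# Same snippet as in the article. On Python 3 the bare `print` is parsed as
# a plain name expression, not as a Print statement.
CODE = """
if cond:
    del delvar
elif next:
    print
"""

tree = ast.parse(CODE)

outer_if = tree.body[0]          # the "if cond:" statement
inner_if = outer_if.orelse[0]    # the "elif next:" branch, nested in orelse
delete_target = outer_if.body[0].targets[0]  # the Name node inside Delete

print(type(outer_if).__name__, outer_if.test.id)
print(type(inner_if).__name__, inner_if.test.id)
print(type(delete_target.ctx).__name__)

# Full dump of the tree, for comparison with the listing above.
print(ast.dump(tree))
```

The "elif" branch shows up as a second If node stored in orelse, and the deleted name carries a Del context. That context is presumably the kind of information a rebuilder can use to tell a deleted name from a plain Name.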
Can you spot any differences? I would say they differ quite a lot... For instance, compiler turns "elif" statements into a list called 'tests', whereas _ast treats "elif cond:" as if it were "else: if cond:".

Tree Rebuilding

We transform these trees by renaming attributes and nodes, and by removing nodes or introducing new ones: with compiler, we remove the Stmt node, introduce a Delete node, and recursively build the If nodes coming from an "elif"; with _ast, we reintroduce the AssName node. This might be only a temporary step towards a full _ast-like representation.

This is done by the TreeRebuilder visitors, one for each representation, which live in astng._nodes_compiler and astng._ast respectively. In the simplest case, the TreeRebuilder method looks like this (_nodes_compiler):

def visit_list(self, node):
    # compiler stores the list items in 'nodes'; rename to the _ast name 'elts'
    node.elts = node.nodes
    del node.nodes

(and there is nothing to do for _ast).

So, after doing all this and a lot more, we get the following representation from both input trees:
Module()
    body = [
    If()
        test =
        Name(cond)
        body = [
        Delete()
            targets = [
            DelName(delvar)
            ]
        ]
        orelse = [
        If()
            test =
            Name(next)
            body = [
            Print()
                dest =
                None
                values = [
                ]
            ]
            orelse = [
            ]
        ]
    ]
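The "elif" handling on the compiler side can be sketched roughly as follows. This is not astng's code: the Node class and the fold_tests helper are hypothetical stand-ins, and the sketch assumes that the flat 'tests' list is available as (condition, body) pairs. It only illustrates the idea of recursively turning that flat list into nested If nodes linked through orelse.

```python
class Node:
    """Tiny stand-in for an AST node (hypothetical, not an astng class)."""

    def __init__(self, kind, **attrs):
        self.kind = kind
        self.__dict__.update(attrs)


def fold_tests(tests, else_body=None):
    """Turn [(cond, body), ...] into nested If nodes linked through orelse."""
    (test, body), rest = tests[0], tests[1:]
    if rest:
        # Remaining "elif" branches become a single nested If in orelse.
        orelse = [fold_tests(rest, else_body)]
    else:
        # Last branch: orelse holds the final "else" body, if there is one.
        orelse = list(else_body or [])
    return Node("If", test=test, body=body, orelse=orelse)


# compiler-style input: one flat list of (condition, body) pairs.
# Strings stand in for real condition and statement nodes.
tests = [
    ("cond", ["Delete(delvar)"]),
    ("next", ["Print()"]),
]

rebuilt = fold_tests(tests)
inner = rebuilt.orelse[0]
print(rebuilt.kind, rebuilt.test, "->", inner.kind, inner.test)
```

The result has the same shape as the common representation above: an outer If whose orelse contains a single inner If with an empty orelse.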
Faster towards Py3k

Of course, you can imagine these modifications had some API repercussions, and thus required a lot of smaller Pylint modifications. But all was done so that you should see no difference in Pylint's behavior using either Python < 2.5 or Python >= 2.5, except that with the _ast module pylint is around two times faster! Oh, and we fixed small bugs on the way and maybe introduced a few new ones...

Finally, it is a major step towards Pylint Py3k!



Comments

2009/03/23 15:20, written by anon

Wow, that is great stuff! Jython doesn't support the compiler module. Given that it is gone from 3.0, Jython is unlikely to ever support it. Much work has been done to support _ast though, which gives me hope that Pylint and family have a good chance of working on Jython soon. I wonder if logilab-astng might be useful for projects outside of the Pylint family that use compiler.ast.

2009/03/23 15:35, written by nchauvat

The hope that it would be used outside of pylint was the motivation for making it a separate module.