The latest release of logilab-astng (0.23), the underlying source code
representation library used by PyLint, provides a new API that may change
the lives of PyLint users in the near future.
It lets you register functions that will be called after a module has
been parsed. While this may sound dumb, it gives you a chance to fix or enhance
the understanding PyLint has of your code.
I see this as a major step towards greatly enhanced code analysis. Today, PyLint
users know that when they run it against code using their favorite framework
(who said CubicWeb? :p ), they should expect a bunch of false positives because
of black magic in the ORM, in decorators, or wherever else. Dynamic code in the
Python standard library can cause false positives in PyLint too.
Let's take a simple example, and see how we can improve things using the new
API. The following code:
import hashlib

def hexmd5(value):
    """return md5 checksum hexadecimal digest of the given value"""
    return hashlib.md5(value).hexdigest()

def hexsha1(value):
    """return sha1 checksum hexadecimal digest of the given value"""
    return hashlib.sha1(value).hexdigest()
gives the following output when analyzed through pylint:
[syt@somewhere ~]$ pylint -E example.py
No config file found, using default configuration
************* Module smarter_astng
E: 5,11:hexmd5: Module 'hashlib' has no 'md5' member
E: 9,11:hexsha1: Module 'hashlib' has no 'sha1' member
However:
[syt@somewhere ~]$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import smarter_astng
>>> smarter_astng.hexmd5('hop')
'5f67b2845b51a17a7751f0d7fd460e70'
>>> smarter_astng.hexsha1('hop')
'cffb6b20e0eef296772f6c1457cdde0049bdfb56'
The code runs fine... Why does pylint bother me then? If we take a look at the
hashlib module, we see that neither sha1 nor md5 is defined there. They are
defined dynamically, depending on whether the OpenSSL library is available, in
order to use the fastest available implementation, using code like:
for __func_name in __always_supported:
    # try them all, some may not work due to the OpenSSL
    # version not supporting that algorithm.
    try:
        globals()[__func_name] = __get_hash(__func_name)
    except ValueError:
        import logging
        logging.exception('code for hash %s was not found.', __func_name)
Honestly I don't blame PyLint for not understanding this kind of magic. The
situation in this particular case could be improved, but that's tedious
work, and there will always be "similar but different" cases that won't be
understood.
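To make the gap concrete, here is a small sketch using only the standard library `ast` module on a current Python 3. It is not how astng works internally; `SOURCE`, `static_names` and `runtime_names` are hypothetical names made up for this illustration. It compares the names a naive static reading of a module finds with the names that exist once the module has actually run:

```python
import ast

# Hypothetical stand-in for a module that defines names the way hashlib does.
SOURCE = '''
def _make_hash(name):
    def constructor(value=b""):
        return (name, value)
    return constructor

for _name in ("md5", "sha1"):
    globals()[_name] = _make_hash(_name)
'''

def static_names(source):
    """Return top-level names bound by plain def/class/assignment/for statements."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, ast.For) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
    return names

def runtime_names(source):
    """Execute the source and return the names that exist afterwards."""
    namespace = {}
    exec(source, namespace)
    return {name for name in namespace if name != "__builtins__"}

seen_statically = static_names(SOURCE)
seen_at_runtime = runtime_names(SOURCE)
missing = sorted(seen_at_runtime - seen_statically)
print(missing)  # ['md5', 'sha1']
```

The names written through `globals()` only show up after execution, which is exactly the kind of thing a static analyzer cannot see without some help.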
The good news is that thanks to the new astng callback, I can help it be
smarter! See the code below:
from logilab.astng import MANAGER, scoped_nodes

def hashlib_transform(module):
    if module.name == 'hashlib':
        # insert a fake, empty class node for each dynamically defined name
        for hashfunc in ('sha1', 'md5'):
            module.locals[hashfunc] = [scoped_nodes.Class(hashfunc, None)]

def register(linter):
    """called when loaded by pylint --load-plugins, register our transformation
    function here
    """
    MANAGER.register_transformer(hashlib_transform)
What's in there?
- A function that will be called with each astng module built during a pylint
execution, i.e. not only the ones that you analyze, but also those accessed for
type inference.
- This transformation function is fairly simple: if the module is the 'hashlib'
module, it will insert into its locals dictionary a fake class node for each
desired name.
- It is registered using the register_transformer method of astng's MANAGER
(the central access point to built syntax trees). This is done in the pylint
plugin API register callback function (called when the module is imported using
'pylint --load-plugins').
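The callback mechanism described in this list can be sketched with a toy registry. This is only an illustration of the pattern, not astng's actual implementation: `ToyModule` and `ToyManager` are hypothetical stand-ins, and the "class nodes" are plain strings.

```python
class ToyModule:
    """Hypothetical stand-in for a built module tree: a name plus a locals dict."""

    def __init__(self, name):
        self.name = name
        self.locals = {}


class ToyManager:
    """Hypothetical stand-in for a central manager; not astng's implementation."""

    def __init__(self):
        self.transformers = []

    def register_transformer(self, func):
        self.transformers.append(func)

    def build(self, name):
        module = ToyModule(name)
        # Every registered callback sees every module that gets built,
        # so each callback has to check the module name itself.
        for transform in self.transformers:
            transform(module)
        return module


def hashlib_transform(module):
    if module.name == 'hashlib':
        for hashfunc in ('sha1', 'md5'):
            module.locals[hashfunc] = ['<fake class %s>' % hashfunc]


manager = ToyManager()
manager.register_transformer(hashlib_transform)
patched = manager.build('hashlib')
untouched = manager.build('os')
print(sorted(patched.locals))  # ['md5', 'sha1']
print(untouched.locals)        # {}
```

Since the callback runs for every module, the name check at the top of the transformation function is what keeps it from touching anything other than hashlib.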
Now let's try it! Suppose I stored the above code in an 'astng_hashlib.py' module
in my PYTHONPATH; I can now run pylint with the plugin activated:
[syt@somewhere ~]$ pylint -E --load-plugins astng_hashlib example.py
No config file found, using default configuration
************* Module smarter_astng
E: 5,11:hexmd5: Instance of 'md5' has no 'hexdigest' member
E: 9,11:hexsha1: Instance of 'sha1' has no 'hexdigest' member
Hmm. We now have a different error :( Pylint grasps that there are md5 and
sha1 classes, but it complains that they don't have a hexdigest method. Indeed,
we didn't give it a clue about that.
We could go on and on, giving it a full representation of hashlib's public
API using the astng nodes API. But that would be painful, trust me. Or we could
do something clever using some higher level astng API:
from logilab.astng import MANAGER
from logilab.astng.builder import ASTNGBuilder

def hashlib_transform(module):
    if module.name == 'hashlib':
        # build an astng Module from a fake implementation of the missing classes
        fake = ASTNGBuilder(MANAGER).string_build('''

class md5(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''

class sha1(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''

''')
        for hashfunc in ('sha1', 'md5'):
            module.locals[hashfunc] = fake.locals[hashfunc]

def register(linter):
    """called when loaded by pylint --load-plugins, register our transformation
    function here
    """
    MANAGER.register_transformer(hashlib_transform)
The idea is to write a fake python implementation only documenting the prototype
of the desired class, and to get an astng from it, using the string_build method of
the astng builder. This method will return a Module node containing the astng
for the given string. It's then easy to replace or insert additional information
into the original module, as you can see in the above example.
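The same "parse a stub, then graft its nodes" idea can be tried out with the standard library alone. In the sketch below, `ast.parse` stands in for astng's string_build, a plain dict stands in for the module's locals, and the helper names `build_stub_locals` and `method_names` are made up for the illustration. The node types are the stdlib's, not astng's, so treat it as an analogy on a current Python 3 rather than as plugin code:

```python
import ast

# Fake implementation that only documents the prototype of each class.
STUB_SOURCE = '''
class md5(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''

class sha1(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''
'''

def build_stub_locals(source):
    """Parse stub source and map each top-level class name to a list of nodes."""
    tree = ast.parse(source)
    return {node.name: [node] for node in tree.body
            if isinstance(node, ast.ClassDef)}

def method_names(class_node):
    """Return the names of the methods defined directly in a class node."""
    return sorted(item.name for item in class_node.body
                  if isinstance(item, ast.FunctionDef))

module_locals = {}  # stand-in for the locals of the real hashlib module tree
fake_locals = build_stub_locals(STUB_SOURCE)
for hashfunc in ('sha1', 'md5'):
    module_locals[hashfunc] = fake_locals[hashfunc]

print(method_names(module_locals['md5'][0]))  # ['__init__', 'hexdigest']
```

Writing the stub as source text keeps the plugin readable: adding another method to the fake class is a one-line change in the string rather than a hand-built tree of nodes.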
Now if I run pylint using the updated plugin:
[syt@somewhere ~]$ pylint -E --load-plugins astng_hashlib example.py
No config file found, using default configuration
No more errors, great!
This fairly simple change could quickly provide great enhancements. We should
probably improve the astng manipulation API now that it's exposed like
that. But we can also easily imagine a code base of such pylint plugins
maintained by each community around a Python library or framework. One could
then use a stack of plugins matching the libraries used by one's software, and
have a greatly enhanced experience of using pylint.
For a start, it would be great if pylint could be shipped with a plugin that
explains all the magic found in the standard library, wouldn't it? Left as an
exercise to the reader!