from
Neal Norwitz <nnorwitz at gmail dot com>
to
Sylvain Thénault <sylvain.thenault at logilab dot fr>
cc
Greg Wilson <gvwilson at cs dot utoronto dot ca>
subject
Re: PyLint / PyChecker unification
date
2005/09/13 21:18
On 9/13/05, Sylvain Thénault <sylvain.thenault@logilab.fr> wrote:
>
> actually pylint take information from various sources:
> - raw python source file (ie stream), to extract some formating
> information lost in the ast
> - ast from the compiler package, which is actually extended before being
> given to pylint to add some information and so ease analysis
> - living objects to get information from extensions modules or regular
> modules without source code available
nice.
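To make sure I understand the three sources, here is a rough sketch of each one, not pylint's actual code. It assumes the standard `tokenize`, `ast`, and `math` modules as stand-ins (the `ast` module here substitutes for the compiler package's tree), and the function names are made up for illustration.

```python
import ast
import io
import math
import tokenize

# Hypothetical sample input, used only for this illustration.
SOURCE = "import math  # needed below\nx = math.sqrt(4)\n"


def comments_from_source(source):
    """Raw source stream: recover comments, which the AST throws away."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


def imported_names_from_ast(source):
    """AST: find plain `import x` statements by walking the tree."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
    return names


def public_callables_from_object(module):
    """Living object: inspect a module that may have no source available."""
    return sorted(
        name
        for name in dir(module)
        if not name.startswith("_") and callable(getattr(module, name))
    )
```

The third function is the only one that can say anything about a module such as `math`, which typically ships without Python source.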
> I agree that this kind of tool should work with other python
> implementation. However I think that being based on the compiler ast to
> do analysis is the right way, and so its probably better to fix it and
> make it work with other python implementation than to start something
> else.
Ideally I agree with you that it would be preferable to fix the
compiler, but I'm not sure how easy it would be to fix. It's
*possible* it would be easier to rewrite. One of my issues is speed.
Another is the lack of testing. IIRC, the reason that the compiler
can't be used in other implementations is that it uses some C
extension module like parser. I don't remember the details.
> Notice that it seems to me that the compiler module is maintained
> and up to date. At least it has been updated for python 2.4 decorator,
> gen comps... I don't know any construct it can't handle (that doesn't
> mean there aren't).
There have been bugs that have come up fairly regularly. There is
currently one that "global a ; a = 5" isn't handled properly.
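For reference, here is one way to see what the correct answer should be for that construct. This is only a sketch: it uses the standard `symtable` module rather than the compiler package, it assumes the statement sits inside a function (the function name `f` and the helper name are made up), and it says nothing about where the actual bug lives.

```python
import symtable

# Assumption: the construct appears inside a function body.
SOURCE = "def f():\n    global a ; a = 5\n"


def declared_global(source, func_name, var_name):
    """Return True if var_name is declared global inside func_name."""
    top = symtable.symtable(source, "<example>", "exec")
    for child in top.get_children():
        if child.get_name() == func_name:
            return child.lookup(var_name).is_declared_global()
    raise LookupError(func_name)
```

A correct compiler should treat the assignment to `a` as a store to the module-level name, which is what `declared_global(SOURCE, "f", "a")` reports.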
But the good news is that this was found by the PyPy guys (who also
have commit access to python). It's quite possible that they will do
a lot of work and make the compiler more robust and speedy. I don't
follow PyPy closely enough to know what's going on. They will be in your
area doing a sprint soon though. I don't remember exactly where.
PyPy is trying to achieve something very difficult. But they've got
very smart people working on it. I don't know if they will succeed,
but there is some overlap and we could wind up helping each other.
> > * Must check stdlib in less than 30 seconds with 100+ warning types
> herr, I can't advertise pylint on this :)
This is a very difficult problem given the current infrastructure
(python's, I don't know about pylint's). BTW, how long does pylint
take for the python 2.3 stdlib without anything cached?
If pylint took 60 seconds, I believe we could optimize. But from what
I remember of pychecker2 it was more like many minutes. And only a
small fraction of the checks were implemented (still probably only
about 20-30). pychecker has over 100 checks.
My box (dual Opteron running at 1.6 GHz with 1 GB) takes about 30
seconds with pychecker. It's ironic that I picked the 30 second
requirement before knowing how long it took. :-) Actually I started
with 10 seconds and thought to myself that was completely impossible.
> > * Must provide caching of analysis (i.e. for packages like wx)
> done
The real reason I added this requirement is that I think it will be
necessary to be fast enough for large projects. I've seen tools
(including pychecker) be thrown out because they don't meet
expectations. One expectation is that the tool will run faster than I
think is reasonable. I know I can't convince people of what is
reasonable for me, so I believe we need to make a better tool.
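To be concrete about what I mean by caching, here is a minimal sketch. It is not how pylint or pychecker does it; the class and function names are invented, and a real tool would persist results to disk and also key on the tool version and options.

```python
import hashlib


class AnalysisCache:
    """In-memory cache of analysis results, keyed by a hash of the source."""

    def __init__(self):
        self._results = {}
        self.misses = 0  # how many times the analysis really had to run

    def check(self, source, analyze):
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = analyze(source)
        return self._results[key]


def count_lines(source):
    """Stand-in for an expensive analysis pass (hypothetical)."""
    return len(source.splitlines())
```

Unchanged files such as the modules of a large package would then cost one hash per run instead of a full analysis.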
> > I would be curious to hear what you would add/change/remove.
>
> I agree with all that list, after that it's only a point about what
> checks should be done. And having a single tool integrating every
> features of both pychecker and pylint would be a great start !
Agreed. From everything I've seen and heard, I don't even think it
would be difficult to arrive at the checks to perform. I think the
only difficulty would be determining which to turn on by default. :-)
After finding enough time to implement them all, of course. :-)
From what I've seen of pylint's architecture, I think it has a lot of
the right elements for success. From a high level, it seems similar
to pychecker2 (never released, but in CVS at SourceForge). I haven't
looked in detail at pylint, so I can't speak more than just
generalities.
The biggest concern I have with pylint is speed (with portability to
other pythons second). I don't know how to address those with the
current compiler. One reason the compiler is slow is because it
allocates so many objects.
Hopefully, I've communicated all my concerns and not beat them to
death. My uncertainty about how to proceed stems partly from my
limited and dated knowledge. Maybe there are things you've accomplished
which would help put me at ease about the compiler. I really would
prefer to use it, but I'm not sure it's possible given my other
requirements.
I would also like to know if you think I'm being unreasonable in any of
the directions I'm pushing. I think I'm mostly taking the attitude of
a difficult developer and trying to lower the bar so they can't come
up with any reasonable excuse to avoid the tool. Maybe there are
other alternatives like not using the compiler and only using the
source in a fast mode that has limited checking abilities, but runs
real fast.
n
