pylint #4683 Non-ASCII characters count double if utf8 encode [resolved]
Lubin Fayolle reported:

    # -*- coding: utf-8 -*-
    """A little script to demonstrate a little bug."""
    print "------------------------------------------------------------------------"
    print "-----------------------------------------------------------------------é"

Pylint returns the following warning:

    myscript.py:6: [C] Line too long (81/80)

where line 6 corresponds to the second call of print. The two lines are the same length though... It seems that 'é' counts double, like many (any?) other non-ASCII characters.
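A minimal sketch of why the count is off by one. It assumes the checker measures the raw bytes of the line rather than its characters, and that the second string holds 71 dashes plus 'é' (which is what the 81/80 figure implies). It is written for Python 3, where `bytes` plays the role of the Python 2 `str` used in the report; the variable names are illustrative only.

```python
# Rebuild the second print line from the report (assumed: 71 dashes + 'é').
line = 'print "' + "-" * 71 + 'é"'

# What a byte-oriented length check sees when the file is saved as UTF-8.
raw = line.encode("utf-8")

char_count = len(line)  # counts characters: 'é' is one character
byte_count = len(raw)   # counts bytes: 'é' is two bytes (0xC3 0xA9) in UTF-8

print(char_count, byte_count)
```

Under these assumptions the character count stays within the 80-column limit while the byte count exceeds it by one, matching the reported message.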
priority | minor |
---|---|
type | bug |
done in | 0.21.0 |
load | 0.200 |
load left | 0.000 |
closed by | <not specified> |
Comments
-
2010/03/03 19:31, written by cmorris
It seems like the solution here would be to convert the line to unicode before calling len. However, to do that, we need to know the encoding of the module. From what I've read, it doesn't seem possible to determine this in general with total accuracy, but we could catch the majority of cases by assuming UTF-8, since it seems to be the most commonly used unicode encoding and is backwards-compatible with ASCII. To catch other encodings, the best solution I can think of would be a command-line option or a field in .pylintrc to specify an encoding. Does this seem worthwhile?
-
2010/03/04 08:00, written by sthenault
You should use the encoding declaration that is mandatory for non-ASCII modules in earlier Python versions (iirc, warning introduced in Python 2.3 / crash with Python 2.4). There is already some code in pylint to detect this encoding.
-
2010/03/29 14:05, written by svetlyak40wt
I've solved this annoying problem. Here is the patch: http://gist.github.com/347854
-
2010/03/29 14:57, written by sthenault
Would you please add a test case to the functional suite?
See test/input and test/messages, or search the mailing-list archives for more details.
-
2010/03/29 16:38, written by svetlyak40wt
Hi Sylvain, I've updated the patch and added tests.
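The approach discussed in the comments (read the module's encoding declaration, decode, then measure) can be sketched as follows. This is not the linked patch and not pylint's own code: the function names `declared_encoding` and `line_lengths`, the regular expression, and the ASCII fallback are all assumptions made for illustration, and the sketch targets Python 3 rather than the Python 2 of the original report.

```python
import re

# Assumed pattern for a "coding" declaration comment, in the spirit of PEP 263.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def declared_encoding(source: bytes, default: str = "ascii") -> str:
    """Return the encoding named in the first two lines, or the default."""
    for raw_line in source.splitlines()[:2]:
        match = CODING_RE.match(raw_line)
        if match:
            return match.group(1).decode("ascii")
    return default


def line_lengths(source: bytes) -> list:
    """Measure each line in characters after decoding with the declared encoding."""
    encoding = declared_encoding(source)
    # errors="replace" keeps the sketch from failing on undecodable bytes;
    # an unknown encoding name would still raise LookupError.
    text = source.decode(encoding, errors="replace")
    return [len(line) for line in text.splitlines()]


sample = b"# -*- coding: utf-8 -*-\nprint('\xc3\xa9')\n"
print(declared_encoding(sample))
print(line_lengths(sample))
```

In Python 3 the standard library's `tokenize.detect_encoding` performs this detection (including BOM handling), so a real checker would probably rely on it instead of a hand-written pattern.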