
Logilab.org - en

News from Logilab and our Free Software projects, as well as on topics dear to our hearts (Python, Debian, Linux, the semantic web, scientific computing...)

  • Fresh release of lutin77, Logilab Unit Test IN fortran 77

    2011/01/11 by Andre Espaze

    I am pleased to announce the 0.2 release of lutin77, for running Fortran 77 tests with a C compiler as the only dependency. Moreover, this very light framework of 97 lines of C code makes a very good demo of Fortran and C interfacing. The next level could be to write it in GAS (GNU Assembler).

    For the over-excited maintainers of legacy code, here comes a screenshot:

    $ cat test_error.f
       subroutine success
       end
    
       subroutine error
       integer fid
       open(fid, status="old", file="nofile.txt")
       write(fid, *) "Ola"
       end
    
       subroutine checke
       call check(.true.)
       call check(.false.)
       call abort
       end
    
       program run
       call runtest("error")
       call runtest("success")
       call runtest("absent")
       call runtest("checke")
       call resume
       end
    

    You can then build the framework with:

    $ gcc -Wall -pedantic -c lutin77.c
    

    And now run your tests:

    $ gfortran -o test_error test_error.f lutin77.o -ldl -rdynamic
    $ ./test_error
      At line 6 of file test_error.f
      Fortran runtime error: File 'nofile.txt' does not exist
      Error with status 512 for the test "error".
    
      "absent" test not found.
    
      Failure at check statement number 2.
      Error for the test "checke".
    
      4 tests run (1 PASSED, 0 FAILED, 3 ERRORS)
    

    See also the list of test frameworks for Fortran.


  • Distutils2 January Sprint in Paris

    2011/01/07 by Pierre-Yves David

    At Logilab, we have the pleasure of hosting a distutils2 sprint in January. Sprinters are welcome in our Paris office from 9h on the 27th of January to 19h on the 30th of January. This sprint will focus on polishing distutils2 for the next alpha release and on the install/remove scripts.

    Distutils2 is an important project for Python. Every contribution will help to improve the current state of packaging in Python. See the wiki page on python.org for details about participation. If you can't join us in Paris, you can participate on the #distutils channel of the freenode IRC network.

    http://guide.python-distribute.org/_images/state_of_packaging.jpg

    For additional details, see Tarek Ziadé's original announcement, read the wiki page on python.org or contact us.


  • Accessing data on a virtual machine without network

    2010/12/02 by Andre Espaze

    At Logilab, we work a lot with virtual machines for testing and developing code on our customers' architectures. We access virtual machines through the network and copy data with the scp command. However, in case of a network failure, there is still a way to access your data: mounting a rescue disk on the virtual machine. The following commands use qemu, but the idea could certainly be adapted to other emulators.

    Creating and mounting the rescue disk

    To be able to mount the rescue disk on your system later, it is necessary to use the raw image format (the default with qemu):

    $ qemu-img create data-rescue.img 10M
    

    Then run your virtual machine with 'data-rescue.img' attached (you need to add a disk storage in virtmanager). Once in your virtual system, you will have to partition and format your new hard disk. As an example with Linux (win32 users will prefer right clicks):

    $ fdisk /dev/sdb
    $ mke2fs -j /dev/sdb1
    

    Then the new disk can be mounted and used:

    $ mount /dev/sdb1 /media/usb
    $ cp /home/dede/important-customer-code.tar.bz2 /media/usb
    $ umount /media/usb
    

    You can then stop your virtual machine.

    Getting back data from the rescue disk

    You will then have to carry your 'data-rescue.img' to a system where you can mount a file with the 'loop' option. But first we need to find where our partition starts:

    $ fdisk -ul data-rescue.img
    You must set cylinders.
    You can do this from the extra functions menu.
    
    Disk data-rescue.img: 0 MB, 0 bytes
    255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Disk identifier: 0x499b18da
    
    Device Boot             Start         End      Blocks   Id  System
    data-rescue.img1           63       16064        8001   83  Linux
    

    Now we can mount the partition and get back our code:

    $ mkdir /media/rescue
    $ mount -o loop,offset=$((63 * 512)) data-rescue.img /media/rescue/
    $ ls /media/rescue/
    important-customer-code.tar.bz2
    

  • Thoughts on the python3 conversion workflow

    2010/11/30 by Emile Anclin

    Python3

    The 2to3 script is a very useful tool. We can just run it over our whole code base and end up with python3 compatible code whilst keeping a python2 code base. To make our code python3 compatible, we do (or did) two things:

    • small python2 compatible modifications of our source code
    • run 2to3 over our code base to generate a python3 compatible version

    However, we not only want to have one python3 compatible version, we also want to keep developing our software. Hence, we want to be able to easily test it with both python2 and python3. Furthermore, if we use patches to get nice commits, this starts to get quite messy. Let's consider this in the case of Pylint, where the workflow described before proved to be unsatisfying.

    • I have two repositories, one for python2, one for python3. On the python3 side, I run 2to3 and store the modifications in a patch or a commit.

    • Whenever I implement a fix or a functionality on either side, I have to test if it still works on the other side; but as the 2to3 modifications are often quite heavy, directly creating patches on one side and applying them on the other side won't work most of the time.

    • Now say I implement something in my python2 base and hold it in a patch or commit it. I can then pull it into my python3 repo:

      • running 2to3 on all Pylint is quite slow: around 30 sec for Pylint without the tests, and around 2 min with the tests. (I'd rather not imagine how long it would take for say CubicWeb).

      • even if I have all my 2to3 modifications in a patch, it takes 5-6 sec to "qpush" or "qpop" them all. Committing the 2to3 changes instead and using:

        hg pull -u --rebase
        

        is not much faster. If I don't use --rebase, I will have merges on each pull. Furthermore, we often end up with either a patch application failure, a merge conflict, or something which is not python3 compatible (like a newly introduced "except Error, exc").

    • So quite often, I will have to fix it with:

      hg revert -r REV <broken_files>
      2to3 -nw <broken_files>
      hg qref # or hg resolve -m; hg rebase -c
      
    • Suppose the 2to3 transition worked fine, or that we fixed it. I run my tests with python3 and see that it does not work; so I modify the patch, and it all starts again; the new patch or the patch modification will create a new head in my python3 repo...

    2to3 Fixers

    Considering all that, let's investigate 2to3: it comes with a lot of fixers that can be activated or deactivated. Now, a lot of them fix only very rare use cases or stuff that has been deprecated for years. On the other hand, the 2to3 fixers work with regular expressions, so the more we remove, the faster 2to3 should be. Depending on the project, most cases will just not appear, and for the others, we should be able to find other means of disabling them. The lists proposed hereafter are just suggestions; which fixers could actually be disabled, and how, will depend on the source base and other overall considerations.

    python2 compatible

    The following fixers are 2.x compatible and should be run once and for all (and can then be disabled for daily conversion usage):

    • apply
    • execfile (?)
    • exitfunc
    • getcwdu
    • has_key
    • idioms
    • ne
    • nonzero
    • paren
    • repr
    • standarderror
    • sys_exec
    • tuple_params
    • ws_comma

    compat

    These can be fixed using imports from a "compat" module like the logilab.common.compat module, which holds convenient compatible objects; a minimal sketch follows the list below.

    • callable
    • exec
    • filter (Wraps filter() usage in a list call)
    • input
    • intern
    • itertools_imports
    • itertools
    • map (Wraps map() in a list call)
    • raw_input
    • reduce
    • zip (Wraps zip() usage in a list call)
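
    As a rough illustration of the idea, such a compat module could look like the sketch below (the names and content here are only an assumption, not the actual logilab.common.compat):

    # compat.py -- hypothetical sketch, not the actual logilab.common.compat
    import sys
    
    if sys.version_info >= (3, 0):
        from functools import reduce    # reduce left the builtins in python3
        raw_input = input               # raw_input was renamed to input
    
        def callable(obj):              # callable() was removed in python 3.0
            return hasattr(obj, '__call__')
    else:
        from __builtin__ import reduce, raw_input, callable
    
    # filter/map/zip return iterators in python3; force lists where the
    # code base really relies on list behaviour
    def lfilter(func, iterable):
        return list(filter(func, iterable))
    
    def lmap(func, *iterables):
        return list(map(func, *iterables))
    
    def lzip(*iterables):
        return list(zip(*iterables))
    

    The rest of the code base then imports these names from compat, and the corresponding fixers can be disabled.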

    strings and bytes

    Maybe they could also be handled by compat:

    • basestring
    • unicode
    • print

    For print, for example, we could think of a once-and-for-all custom fixer that would replace it with a convenient echo function (or whatever name you like) defined in compat, along the lines of the sketch below.
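
    A minimal sketch of such an echo function (the name and behaviour here are only an assumption, not an existing logilab.common.compat API):

    # compat.py (sketch): a print replacement usable from python2 and python3
    import sys
    
    def echo(*args, **kwargs):
        """Poor man's print() function, working identically on python2 and python3."""
        sep = kwargs.get('sep', ' ')
        end = kwargs.get('end', '\n')
        stream = kwargs.get('file', sys.stdout)
        stream.write(sep.join(str(arg) for arg in args) + end)
    

    The custom fixer would then rewrite "print foo, bar" into "echo(foo, bar)".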

    manually

    The following issues could probably be fixed manually:

    • dict (it fixes dict iterator methods; it should be possible to have code where we can disable this fixer)
    • import (Detects sibling imports; we could convert them to absolute imports, as in the sketch after this list)
    • imports, imports2 (renamed modules)
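
    For the sibling imports, the manual fix is straightforward; here is a sketch with hypothetical package and module names:

    # before (implicit relative import, python2 only): utils.py sits next to
    # this module inside the hypothetical package 'mypkg'
    import utils
    
    # after (explicit absolute import, valid on python2 and python3)
    from mypkg import utils
    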

    necessary

    These changes seem to be necessary:

    • except
    • long
    • funcattrs
    • future
    • isinstance (Fixes duplicate types in the second argument of isinstance(). For example, isinstance(x, (int, int)) is converted to isinstance(x, (int)))
    • metaclass
    • methodattrs
    • numliterals
    • next
    • raise

    Consider however that a lot of them might never be needed in some projects, like long, funcattrs, methodattrs and numliterals, or even metaclass. Also, isinstance is probably motivated by the long to int and unicode to str conversions and hence might also be avoided.

    don't know

    Can we also fix these with compat?

    • renames
    • throw
    • types
    • urllib
    • xrange
    • xreadlines

    2to3 and Pylint

    Pylint is a special case since its test suite has a lot of bad and deprecated code which should stay there. However, in order to have a reasonable workflow, it seems that something must be done to reduce the 1:30 minutes of 2to3 parsing of the tests. Probably nothing can be gained from the above considerations, since most such cases should appear in the tests, and actually do. Realise that we can expect to be supporting python2 and python3 in parallel for several years.

    After a quick look, we see that 90% of the refactorings of test/input files only concern print statements; moreover, most of them have nothing to do with the tested functionality. Hence a solution might be to avoid running 2to3 on the test/input directory, since we already have a mechanism to select, depending on the python version, whether a test file should be run or not. To some extent, astng is a similar case, but its test suite and the whole project are much smaller.


  • Notes on making "logilab-common" Py3k-compatible

    2010/09/28 by Emile Anclin

    Version 3 of Python is incompatible with the 2.x series. In order to make pylint usable with Python3, I did some work on making the logilab-common library Python3 compatible, since pylint depends on it.

    The strategy is to have one source code version, and to use the 2to3 tool for publishing a Python3 compatible version.

    Pytest vs. Unittest

    The first problem was that we use the pytest runner, which depends on logilab.common.testlib, which in turn extends the unittest module.

    Without major modifications we could use unittest2 instead of unittest with Python2.6. I thought that the unittest2 module was equivalent to unittest in Python3, but then realized I was wrong:

    • Python3.1/unittest is some strange "forward port" of unittest. Both are a single file, but they must be quite different since the 3.1 version has 1623 lines compared to 875 in 2.6...
    • Python2.x/unittest2 is a python package, backported from the alpha-release of Python3.2/unittest.

    I did not investigate if there are other unittest and unittest2 versions corresponding.

    What we can see is that the 3.1 version of unittest is different from everything else, whereas 2.6-unittest2 is equivalent to 3.2-unittest. So, after trying to run pytest on Python3.1, and since there is a backport of unittest2 for Python3.1, it became clear that the best option is to ignore py3.1-unittest and work with Python3.2 and unittest2 directly.

    Meanwhile, some work was being done on logilab-common to switch from unittest to unittest2. This was included in logilab.common-0.52.

    'python2.6 -3' and 2to3

    The -3 option of python2.6 warns about Python3 incompatible stuff.

    Since I already knew that pytest would work with unittest2, I wanted to know as fast as possible whether pytest would run on Python3.x. So I ran all logilab.common tests with "python2.6 -3 bin/pytest" and found a couple of problems that I quick-fixed or discarded, waiting to know the real solution.

    The 2to3 script (from the lib2to3 library) does its best to transform Python2.x code into Python3 compatible code, but manual work is often needed to handle some cases. For example, file is not considered a deprecated base class; calls to raw_input(...) are handled, but using raw_input as an instance attribute is not; etc. At times, 2to3 can be overzealous, and for example make modifications such as:

    -                for name, local_node in node.items():
    +                for name, local_node in list(node.items()):
    

    Procedure

    After a while, I found that the best solution was to adopt the following working procedure:

    • run the tests with python2.6 -3 and solve the issues that appear.
    • run 2to3 on all that has to be transformed:
    2to3-2.6 -n -w *py test/*py ureports/*py
    

    Since we are in a mercurial repository we don't need backups (-n) and we can write the modifications to the files directly (-w).

    • create a 223.diff patch that will be applied and removed repeatedly.

      Now, we will push and pop this patch (which is much faster than running 2to3), and only regenerate it from time to time to make sure it still works:

    • run "python3.2 bin/pytest -x", to find problems and solutions for crashes and tests that do not work. Note that after some quick fixes on logilab.common.testlib, pytest works quite well, and that we can use the "-x" option. Using Python's Whatsnew_3.0 documentation for hints is quite useful.

    • hg qpop 223.diff

    • write the solution into the 2.x code, convert it into a patch or a commit, and run the tests: some trivial things might not work or not be 2.4 compatible.

    • hg qpush 223.diff

    • repeat the procedure

    I used two repositories when working on logilab.common, one for Python2 and one for Python3, because other tools, like astng and pylint, depend on that library. Setting the PYTHONPATH was enough to get astng and pylint to use the right version.

    Concrete examples

    • We had to remove "os.path.walk" calls, replacing them with "os.walk".

    • The renaming of raw_input to input, __builtin__ to builtins and StringIO to io could easily be resolved by using the improved logilab.common.compat technique: write a python version dependent definition of a variable, function, or class in logilab.common.compat and import it from there.

      For builtins, it is even easier: 2to3 recognizes direct imports, so we can write in compat.py:

    import __builtin__ as builtins # 2to3 will transform '__builtin__' to 'builtins'
    
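      The StringIO case then reduces to a version dependent import in compat.py (a sketch; the real logilab.common.compat may expose it differently):

    import sys
    
    if sys.version_info >= (3, 0):
        from io import StringIO          # the StringIO module became io
    else:
        from StringIO import StringIO
    

      Callers simply write "from logilab.common.compat import StringIO" and never touch the conditional again.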

    The most difficult point is the replacement of str/unicode by bytes/str.

    In Python3.x, we only use unicode strings, now simply called str (the u'' syntax and the unicode type disappear), but everything written to disk has to be converted to bytes, with some explicit encoding. In Python3.x, file descriptors have a defined encoding and will automatically transform the strings to bytes.

    I wrote two functions in logilab.common.compat. One converts str to bytes, and the other simply ignores the encoding on 3.x where it was expected on 2.x; a sketch of what such helpers might look like follows. But there might be a need to write additional tests to make sure the modifications work as expected.
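
    Here is a sketch of what such helpers might look like (the names and signatures are assumptions, not the actual logilab.common.compat functions):

    # compat.py (sketch): str/bytes helpers in the spirit described above
    import sys
    
    if sys.version_info >= (3, 0):
        def str_to_bytes(string, encoding='utf-8'):
            # data written to disk must be explicitly encoded to bytes
            return string.encode(encoding)
    
        def str_encode(string, encoding):
            # on python3 the file object handles the encoding itself,
            # so the encoding argument is simply ignored
            return string
    else:
        def str_to_bytes(string, encoding='utf-8'):
            if isinstance(string, unicode):
                return string.encode(encoding)
            return string
    
        def str_encode(string, encoding):
            if isinstance(string, unicode):
                return string.encode(encoding)
            return string
    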

    Conclusion

    • After less than a week of work, most of the logilab.common tests pass. The biggest remaining problem is the tests for testlib.py. But we can already start working on the Python3 compatibility of astng and finally pylint.
    • Looking at the lib2to3 library, one can see that 2to3 works with regular expressions which reproduce the Python grammar. Hence, it cannot do much code investigation or static inference like astng. I think that using astng, we could improve 2to3 without too much effort.
    • For astng the difficulties are quite different: syntax changes become semantic changes, and we will have to add new types of astng nodes.
    • For testing astng and pylint we will probably have to check the different test examples, a lot of them being code snippets which 2to3 will not parse; they will have to be corrected by hand.

    As a general conclusion, I found no need to use sa2to3, although it might be a very good tool. I would instead suggest having a small compat module and keeping only one version of the code, as far as possible: the code base stays either on 2.x or on 3.x, and the (possibly customized) 2to3 or 3to2 scripts are used to publish the two different versions.


  • SemWeb.Pro - first french Semantic Web conference, Jan 17/18 2011

    2010/09/20 by Nicolas Chauvat

    SemWeb.Pro, the first french conference dedicated to the Semantic Web, will take place in Paris on January 17/18, 2011.

    One day of talks, one day of tutorials.

    Want to grok the Web 3.0? Be there.

    Something you want to share? Call for papers ends on October 15, 2010.

    http://www.semweb.pro/semwebpro.png

  • Discovering logilab-common Part 1 - deprecation module

    2010/09/02 by Stéphanie Marcu

    The logilab-common library contains a lot of utilities which are often overlooked. I will write a series of blog entries to explore the nice features of this library.

    We will begin with the logilab.common.deprecation module which contains utilities to warn users when:

    • a function or a method is deprecated
    • a class has been moved into another module
    • a class has been renamed
    • a callable has been moved to a new module

    deprecated

    When a function or a method is deprecated, you can use the deprecated decorator. It will print a message to warn the user that the function is deprecated.

    The decorator takes two optional arguments:

    • reason: the deprecation message. A good practice is to specify at the beginning of the message, between brackets, the version number from which the function is deprecated. The default message is 'The function "[function name]" is deprecated'.
    • stacklevel: This is the option of the warnings.warn function which is used by the decorator. The default value is 2.

    We have a class Person defined in a file person.py. The get_surname method is deprecated; we must use the get_lastname method instead. For that, we apply the deprecated decorator to the get_surname method.

    from logilab.common.deprecation import deprecated
    
    class Person(object):
    
        def __init__(self, firstname, lastname):
            self._firstname = firstname
            self._lastname = lastname
    
        def get_firstname(self):
            return self._firstname
    
        def get_lastname(self):
            return self._lastname
    
        @deprecated('[1.2] use get_lastname instead')
        def get_surname(self):
            return self.get_lastname()
    
    def create_user(firstname, lastname):
        return Person(firstname, lastname)
    
    if __name__ == '__main__':
        person = create_user('Paul', 'Smith')
        surname = person.get_surname()
    

    When running person.py we have the message below:

    person.py:22: DeprecationWarning: [1.2] use get_lastname instead
    surname = person.get_surname()

    class_moved

    Now we move the class Person into a new_person.py file. In the person.py file, we indicate that the class has been moved:

    from logilab.common.deprecation import class_moved
    import new_person
    Person = class_moved(new_person.Person)
    
    if __name__ == '__main__':
        person = Person('Paul', 'Smith')
    

    When we run the person.py file, we have the following message:

    person.py:6: DeprecationWarning: class Person is now available as new_person.Person
    person = Person('Paul', 'Smith')

    The class_moved function takes one mandatory argument and two optional ones:

    • new_class: this mandatory argument is the new class
    • old_name: this optional argument specifies the old class name. By default it is the same as the new class name. This argument is used in the default printed message.
    • message: with this optional argument, you can specify a custom message

    class_renamed

    The class_renamed function automatically creates a class which fires a DeprecationWarning when instantiated.

    The function takes two mandatory arguments and one optional one:

    • old_name: a string which contains the old class name
    • new_class: the new class
    • message: an optional message. The default one is '[old class name] is deprecated, use [new class name]'

    We now rename the Person class to User in the new_person.py file. Here is the new person.py file:

    from logilab.common.deprecation import class_renamed
    from new_person import User
    
    Person = class_renamed('Person', User)
    
    if __name__ == '__main__':
        person = Person('Paul', 'Smith')
    

    When running person.py, we have the following message:

    person.py:5: DeprecationWarning: Person is deprecated, use User
    person = Person('Paul', 'Smith')

    moved

    The moved function is used to tell that a callable has been moved to a new module. It returns a callable wrapper, so that when the wrapper is called, a warning is printed telling where the object can be found. Then the import is done (and not before) and the actual object is called.

    Note

    The usage is somewhat limited on classes since it will fail if the wrapper is used in a class ancestors list: use the class_moved function instead (which has no lazy import feature though).

    The moved function takes two mandatory parameters:

    • modpath: a string representing the path to the new module
    • objname: the name of the new callable

    In person.py, we will use the create_user function, which is now defined in the new_person.py file:

    from logilab.common.deprecation import moved
    
    create_user = moved('new_person', 'create_user')
    
    if __name__ == '__main__':
        person = create_user('Paul', 'Smith')
    

    When running person.py, we have the following message:

    person.py:4: DeprecationWarning: object create_user has been moved to module new_person
    person = create_user('Paul', 'Smith')

  • pdb.set_trace no longer working: problem solved

    2010/08/12

    I had a bad case of bug hunting today; it took me more than 5 hours to track down (with the help of Adrien in the end).

    I was trying to start a CubicWeb instance on my computer, and was encountering some strange pyro error at startup. So I edited some source file to add a pdb.set_trace() statement and restarted the instance, waiting for Python's debugger to kick in. But that did not happen. I was baffled. I first checked for standard problems:

    • no pdb.py or pdb.pyc was lying around in my Python sys.path
    • the pdb.set_trace function had not been silently redefined
    • no other thread was bugging me
    • the standard input and output were what they were supposed to be
    • I was not able to reproduce the issue on other machines

    After triple checking everything and grepping everywhere, I asked a question on StackOverflow before taking a lunch break (if you go there, you'll see the answer). After lunch, no useful answer had come in, so I asked Adrien for help, because two pairs of eyes are better than one in some cases. We dutifully traced down the pdb module's code to the underlying bdb and cmd modules and learned some interesting things on the way down there. Finally, we found out that the Python code frames which should have been identical were not. This discovery caused further bafflement. We looked at the frames and saw that one of those frames' class was a psyco-generated wrapper.

    It turned out that CubicWeb can use two implementations of the RQL module: one which uses gecode (a C++ library for constraint-based programming) and one which uses logilab.constraint (a pure python library for constraint solving). The former is the default, but it would not load on my computer, because the gecode library had been replaced by a more recent version during an upgrade. The pure python implementation tries to use psyco to speed things up. Installing the correct version of libgecode solved the issue. End of story.

    When I checked out StackOverflow, Ned Batchelder had provided an answer. I didn't get the satisfaction of answering the question myself...

    Once this was figured out, solving the initial pyro issue took 2 minutes...


  • EuroSciPy'10

    2010/07/13 by Adrien Chauve
    http://www.logilab.org/image/9852?vid=download

    The EuroSciPy2010 conference was held in Paris at the Ecole Normale Supérieure from July 8th to 11th and was organized and sponsored by Logilab and other companies.

    July, 8-9: Tutorials

    The first two days were dedicated to tutorials and I had the chance to talk about SciPy with André Espaze, Gaël Varoquaux and Emanuelle Gouillart in the introductory track. This was nice but it was a bit tricky to present SciPy in such a short time while trying to illustrate the material with real and interesting examples. One very nice thing for the introductory track is that all the material was contributed by different speakers and is freely available in a github repository (licensed under CC BY).

    July, 10-11: Scientific track

    The next two days were dedicated to scientific presentations and to why python is such a great tool for developing scientific software and carrying out research.

    Keynotes

    I had a great time listening to the presentations, starting with the two very nice keynotes given by Hans Petter Langtangen and Konrad Hinsen. The latter gave us a very nice summary of what happened in the scientific python world during the past 15 years, what is happening now, and of course what could happen during the next 15 years. Using a crystal ball and a very humorous tone, he made it very clear that the challenge in the coming years will be how to use our hundreds, thousands or even more cores in a bug-free and efficient way. Functional programming may be a very good solution to this challenge as it provides a deterministic way of parallelizing our programs. Konrad also provided some hints about future versions of python that could provide deeper and more efficient support for functional programming, and maybe the addition of an 'async' keyword to handle the computation of a function on another core.

    In fact, the PEP 3148 entitled "Futures - execute computations asynchronously" was accepted just two days ago. This PEP describes the new package called "futures", designed to facilitate the evaluation of callables using threads and processes in future versions of python. A full implementation is already available; a minimal sketch of the API follows.
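
    As a taste of that API, here is a minimal sketch (the module is named concurrent.futures in python >= 3.2; the standalone backport is called futures):

    # minimal sketch of the PEP 3148 executor API
    from concurrent.futures import ProcessPoolExecutor
    
    def simulate(parameter):
        # stand-in for an expensive, embarrassingly parallel computation
        return parameter ** 2
    
    if __name__ == '__main__':
        with ProcessPoolExecutor(max_workers=4) as executor:
            # map() distributes the calls over worker processes and
            # yields the results in input order
            results = list(executor.map(simulate, range(10)))
        print(results)
    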

    Parallelization

    Parallelization was indeed a very popular topic across the presentations, and several solutions for resolving embarrassingly parallel problems were presented.

    • Playdoh: Distributes computations over computers connected to a secure network (see playdoh presentation).

      Distributing the computation of a function over two machines is as simple as:

      import playdoh
      result1, result2 = playdoh.map(fun, [arg1, arg2], _machines = ['machine1.network.com', 'machine2.network.com'])
      
    • Theano: Allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. In particular, it can use the GPU transparently and generate optimized C code (see the theano presentation).

    • joblib: Provides, among other things, helpers for embarrassingly parallel problems, as sketched below. It is built on top of the multiprocessing package introduced in python 2.6 and brings more readable code and easier debugging.
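
      A minimal sketch of the joblib idiom for such a loop:

      # parallelize an embarrassingly parallel loop over two worker processes
      from math import sqrt
      from joblib import Parallel, delayed
      
      results = Parallel(n_jobs=2)(delayed(sqrt)(i) for i in range(10))
      print(results)
      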

    Speed

    Concerning speed, Francesc Alted showed us interesting tools for memory optimization currently used successfully in PyTables 2.2. You can read more details on these kinds of optimizations in EuroSciPy'09 (part 1/2): The Need For Speed.

    SCons

    Last but not least, I talked with Christophe Pradal, who is one of the core developers of OpenAlea. He convinced me that SCons is worth using once you have built a nice extension for it: SConsX. I'm looking forward to testing it.


  • HOWTO install lodgeit pastebin under Debian/Ubuntu

    2010/06/24 by Arthur Lutz

    LodgeIt is a simple open source pastebin... and it's written in Python!

    The installation under debian/ubuntu goes as follows:

    sudo apt-get update
    sudo apt-get -uVf install python-imaging python-sqlalchemy python-jinja2 python-pybabel python-werkzeug python-simplejson
    cd local
    hg clone http://dev.pocoo.org/hg/lodgeit-main
    cd lodgeit-main
    vim manage.py
    

    For debian squeeze, you have to downgrade python-werkzeug, so get the old version of python-werkzeug from snapshot.debian.org at http://snapshot.debian.org/package/python-werkzeug/0.5.1-1/

    wget http://snapshot.debian.org/archive/debian/20090808T041155Z/pool/main/p/python-werkzeug/python-werkzeug_0.5.1-1_all.deb
    

    Modify the dburi and the SECRET_KEY, then launch the application:

    python manage.py runserver
    

    Then off you go to configure your apache or lighttpd.

    An easy (and dirty) way of running it at startup is to add the following command to the www-data crontab:

    @reboot cd /tmp/; nohup /usr/bin/python /usr/local/lodgeit-main/manage.py runserver &
    

    This should of course be done in an init script.

    http://rn0.ru/static/help/advanced_features.png

    Hopefully we'll find some time to package this nice webapp for debian/ubuntu.

