
Logilab.org - en

News from Logilab and our Free Software projects, as well as on topics dear to our hearts (Python, Debian, Linux, the semantic web, scientific computing...)

  • Mini-DebConf Paris 2012

    2012/11/29 by Julien Cristau

    Last weekend, I attended the mini-DebConf organized at EPITA (near Paris) by the French Debian association and sponsored by Logilab.

    http://www.logilab.org/file/112649?vid=download

    The event was a great success, with a rather large number of attendees, including people coming from abroad such as Debian kernel maintainers Ben Hutchings and Maximilian Attems, who talked about their work with Linux.

    Other speakers included Loïc Dachary, who talked about OpenStack and its packaging in Debian, and Josselin Mouette, who presented his work deploying Debian/GNOME desktops in a large enterprise environment at EDF R&D.

    For my part, I gave a talk on Saturday about Debian's release team and the current state of the wheezy (to-be Debian 7.0) release.

    On Sunday, Vladimir Daric and I presented the work we did to migrate a computation cluster from Red Hat to Debian. Attendees had quite a few questions about our use of ZFS on Linux for storage, and of Salt for configuration management and deployment.

    Slides for the talks are available on the mini-DebConf web page (the wheezy state and Debian cluster migration slides are also viewable on SlideShare), and videos will soon be on http://video.debian.net/.

    Now looking forward to next summer's DebConf13 in Switzerland, and hopefully next year's edition of the Paris event.


  • PyLint 0.26 is out

    2012/10/08 by Sylvain Thenault

    I'm very pleased to announce new releases of Pylint and the underlying ASTNG library, respectively 0.26 and 0.24.1. The great news is that both bring a lot of new features and some bug fixes, mostly provided by the community.

    We're still trying to make it easier to contribute to our free software projects at Logilab, so I hope this will continue and we'll get even more contributions in the near future, and an even smarter/faster/whatever pylint!

    For more details, see ChangeLog files or http://www.logilab.org/project/pylint/0.26.0 and http://www.logilab.org/project/logilab-astng/0.24.1

    Many thanks to all those who made this release possible, and enjoy!


  • Profiling tools

    2012/09/07 by Alain Leufroy

    Python

    Run time profiling with cProfile

    Python is distributed with profiling modules. They describe the run time operation of a pure python program, providing a variety of statistics.

    cProfile is the recommended module. A simple way to run your program under its control is:

    $ python -m cProfile -s cumulative mypythonscript.py
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          16    0.055    0.003   15.801    0.988 __init__.py:1(<module>)
           1    0.000    0.000   11.113   11.113 __init__.py:35(extract)
         135    7.351    0.054   11.078    0.082 __init__.py:25(iter_extract)
    10350736    3.628    0.000    3.628    0.000 {method 'startswith' of 'str' objects}
           1    0.000    0.000    2.422    2.422 pyplot.py:123(show)
           1    0.000    0.000    2.422    2.422 backend_bases.py:69(__call__)
           ...
    

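    Each column reports timing information for a function: the number of calls (ncalls), the time spent in the function itself (tottime), and the cumulative time including sub-calls (cumtime), each with its per-call average (percall). The -s cumulative option sorts the results by descending cumulative time.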

    Note:

    You can profile a particular python function such as main()

    >>> import cProfile
    >>> cProfile.run('main()')
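
    The same kind of report can also be produced from inside a program. Below is a minimal sketch, assuming a hypothetical main() function that stands in for the real workload; it uses only the standard cProfile and pstats modules and sorts by cumulative time, as -s cumulative does on the command line.

```python
import cProfile
import io
import pstats


def main():
    # Hypothetical workload standing in for the real program.
    return sum(len(str(i)) for i in range(50_000))


# Profile only the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Render the usual table into a string, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # keep only the first five rows
report = buffer.getvalue()
print(report)
```

    Enabling and disabling the profiler around a single call keeps start-up and import time out of the report, which can make the hot spots easier to read.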
    

    Graphical tools to show profiling results

    Even though cProfile comes with its own reporting tools, graphical tools can be more convenient. Most of them work with a stats file that cProfile generates when given the -o filepath option.

    Below are some of the available graphical tools that we tested.

    Gprof2Dot

    is a Python-based tool that turns profiling output into a picture of the call graph (using graphviz). A typical profiling session with Python looks like this:

    $ python -m cProfile -o output.pstats mypythonscript.py
    $ gprof2dot.py -f pstats output.pstats | dot -Tpng -o profiling_results.png
    
    http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png

    Each node of the output graph represents a function and has the following layout:

    +----------------------------------+
    |   function name : module name    |
    | total time including sub-calls % |  total time including sub-calls %
    |    (self execution time %)       |------------------------------------>
    |  total number of self calls      |
    +----------------------------------+
    

    Nodes and edges are colored according to the "total time" spent in the functions.
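
    To relate the node labels to the raw profiling data, here is a rough sketch that reads a stats file with the standard pstats module and prints, for each function, its total time including sub-calls, its self time and its number of calls. It is not Gprof2Dot's actual code: the busy() and main() functions are hypothetical, and expressing the percentages relative to the total profiled time is an assumption made for illustration.

```python
import cProfile
import os
import pstats
import tempfile


def busy(n):
    # Hypothetical leaf function that does the actual work.
    return sum(i * i for i in range(n))


def main():
    # Hypothetical entry point: most of its time is spent in sub-calls.
    return [busy(20_000) for _ in range(20)]


# Write a stats file, similar to what "python -m cProfile -o" produces.
stats_path = os.path.join(tempfile.mkdtemp(), "output.pstats")
profiler = cProfile.Profile()
profiler.runcall(main)
profiler.dump_stats(stats_path)

# Reload the file and derive the figures shown in each node.
stats = pstats.Stats(stats_path)
total = stats.total_tt or 1e-9  # guard against a zero total on very fast runs

rows = []
for (filename, lineno, name), (cc, nc, tt, ct, callers) in stats.stats.items():
    # ct: time including sub-calls, tt: self time, nc: number of calls.
    rows.append((name, 100.0 * ct / total, 100.0 * tt / total, nc))

rows.sort(key=lambda row: row[1], reverse=True)
for name, total_pct, self_pct, calls in rows[:5]:
    print(f"{name:<12} {total_pct:6.2f}% ({self_pct:6.2f}%) {calls}x")
```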

    Note: The following small patch makes the node color reflect the function's own execution time and the edge color the "total time":
    diff -r da2b31597c5f gprof2dot.py
    --- a/gprof2dot.py      Fri Aug 31 16:38:37 2012 +0200
    +++ b/gprof2dot.py      Fri Aug 31 16:40:56 2012 +0200
    @@ -2628,6 +2628,7 @@
                     weight = function.weight
                 else:
                     weight = 0.0
    +            weight = function[TIME_RATIO]
    
                 label = '\n'.join(labels)
                 self.node(function.id,
    
    PyProf2CallTree

    is a script that helps visualize profiling data with the KCacheGrind graphical call tree analyzer. This is a more interactive solution than Gprof2Dot, but it requires KCacheGrind to be installed. Typical usage:

    $ python -m cProfile -o stat.prof mypythonscript.py
    $ python pyprof2calltree.py -i stat.prof -k
    

    pyprof2calltree converts the profiling data file to the calltree format; its -k switch then opens the result in KCacheGrind automatically.

    http://kcachegrind.sourceforge.net/html/pics/KcgShot3Large.gif

    There are other tools that are worth testing:

    • RunSnakeRun is an interactive GUI tool which visualizes a profile file using square maps:

      $ python -m cProfile -o stat.prof mypythonscript.py
      $ runsnake stat.prof
      
    • pycallgraph generates PNG images of a call tree with the total number of calls:

      $ pycallgraph mypythonscript.py
      
    • lsprofcalltree also uses KCacheGrind to display profiling data:

      $ python lsprofcalltree.py -o output.log yourprogram.py
      $ kcachegrind output.log
      

    C/C++ extension profiling

    For optimization purposes, one may have Python extensions written in C/C++. For such modules, cProfile will not dig into the corresponding call tree. Dedicated tools, which are not part of the standard Python distribution, must be used to profile a C++ extension from Python.

    Yep

    is a Python module dedicated to the profiling of compiled Python extensions. It uses the Google CPU profiler:

    $ python -m yep --callgrind mypythonscript.py
    

    Memory Profiler

    You may want to monitor the amount of memory used by a Python program. There is an interesting module that fits this need: memory_profiler.

    You can fetch the memory consumption of a program over time by passing the function, its positional arguments and its keyword arguments as a tuple:

    >>> from memory_profiler import memory_usage
    >>> memory_usage((main, (), {}))
    

    memory_profiler can also spot the lines that consume the most memory, using pdb or IPython.
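
    If memory_profiler is not installed, a comparable per-line view can be approximated with the standard tracemalloc module. This is a different tool from the one described above, shown here only as a sketch; main() is a hypothetical workload, and tracemalloc only sees allocations made through Python's memory allocator.

```python
import tracemalloc


def main():
    # Hypothetical workload that allocates a noticeable amount of memory.
    return [bytes(1024) for _ in range(2_000)]


tracemalloc.start()
data = main()
snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Group allocations by source line and list the largest ones.
top_lines = snapshot.statistics("lineno")
for stat in top_lines[:3]:
    print(stat)
print(f"current={current / 1024:.0f} KiB, peak={peak / 1024:.0f} KiB")
```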

    General purpose Profiling

    The Linux perf tool gives access to a wide variety of performance counter subsystems. Using perf, any execution configuration (pure python programs, compiled extensions, subprocess, etc.) may be profiled.

    Performance counters are CPU hardware registers that count hardware events such as instructions executed, cache-misses suffered, or branches mispredicted. They form a basis for profiling applications to trace dynamic control flow and identify hotspots.

    You can get information about execution times with:

    $ perf stat -e cpu-cycles,cpu-clock,task-clock python mypythonscript.py
    

    You can get information about memory accesses (cache misses) using:

    $ perf stat -e cache-misses python mypythonscript.py
    

    Keep in mind that perf reports the raw values of the hardware counters, so you need to know exactly what you are looking for and how to interpret these values in the context of your program.

    Note that you can use Gprof2Dot to get a more user-friendly output:

    $ perf record -g python mypythonscript.py
    $ perf script | gprof2dot.py -f perf | dot -Tpng -o output.png
    

  • PyLint 0.25.2 and related projects released

    2012/07/18 by Sylvain Thenault

    I'm pleased to announce the new release of Pylint and related projects (i.e. logilab-astng and logilab-common)!

    By installing PyLint 0.25.2, ASTNG 0.24 and logilab-common 0.58.1, you'll get a bunch of bug fixes and a few new features. Among the hot stuff:

    • PyLint should now work with alternative python implementations such as Jython, and at least go further with PyPy and IronPython (but those have not really been tested, please try it and provide feedback so we can improve their support)
    • the new ASTNG includes a description of dynamic code it is not able to understand. This is handled by a bitbucket hosted project described in another post.

    Many thanks to everyone who contributed to these releases, Torsten Marek / Boris Feld in particular (both sponsored by Google by the way, Torsten as an employee and Boris as a GSoC student).

    Enjoy!


  • Introducing the pylint-brain project

    2012/07/18 by Sylvain Thenault

    Huum, along with the new PyLint release, it's time to introduce the PyLint-Brain project I've recently started.

    Despite its name, PyLint-Brain is actually a collection of extensions for ASTNG, with the goal of making ASTNG smarter (and this directly benefits PyLint) by describing stuff that is too dynamic to be understood automatically (such as functions in the hashlib module, defaultdict, etc.).

    The PyLint-Brain collection of extensions is developed outside of ASTNG itself and hosted on a bitbucket project to ease community involvement and to allow distinct development cycles. Basically, ASTNG will include the PyLint-Brain extensions, but you may use earlier/custom versions by tweaking your PYTHONPATH.

    Take a look at the code, it's fairly easy to contribute new descriptions, and help us make pylint smarter!


  • Debian science sprint and workshop at ESRF

    2012/06/22 by Julien Cristau


    From June 24th to June 26th, the European Synchrotron organises a workshop centered around Debian. On Monday, a number of talks about the use of Debian in scientific facilities will be featured. On Sunday and Tuesday, members of the Debian Science group will meet for a sprint focusing on the upcoming Debian 7.0 release.

    Among the speakers will be Stefano Zacchiroli, the current Debian project leader. Logilab will be present with Nicolas Chauvat at Monday's conference, and Julien Cristau at both the sprint and the conference.

    At the sprint we'll be discussing packaging of scientific libraries such as blas or MPI implementations, and working on polishing other scientific packages, such as python-related ones (including Salome on which we are currently working).


  • A Python dev day at La Cantine. Would like to have more PyCon?

    2012/06/01 by Damien Garaud
    http://www.logilab.org/file/98313?vid=download
    http://www.logilab.org/file/98312?vid=download

    We were at La Cantine in Paris on May 21st 2012 for the "PyCon.us Replay session".

    La Cantine is a coworking space where hackers, artists, students and so on can meet and work. It also organises meetings and conferences about digital culture, computer science, ...

    On May 21st 2012, it hosted a dev day about Python. "Would you like to have more PyCon?" is a French play on words: PyCon sounds like Picon, a French "apéritif" which traditionally accompanies beer. A good thing, because the meeting began at 6:30 PM! Presentations and demonstrations were about some Python projects presented at PyCon 2012 in Santa Clara (California) last March. The original PyCon presentations are accessible on pyvideo.org.

    PDB Introduction

    By Gael Pasgrimaud (@gawel_).

    pdb is the well-known Python debugger. Gael showed us how to easily use this almost-mandatory tool when you develop in Python. As with the gdb debugger, you can stop the execution at a breakpoint, walk up the stack, print the value of local variables or temporarily modify some of them.

    The best way to define a breakpoint in your source code is to write:

    import pdb; pdb.set_trace()
    

    Insert that line where you would like pdb to stop. Then, you can step through the code with the s, c or n commands. See help for more information. Here is the output of the help command in the pdb command-line interpreter:

    (Pdb) help
    
    Documented commands (type help <topic>):
    ========================================
    EOF    bt         cont      enable  jump  pp       run      unt
    a      c          continue  exit    l     q        s        until
    alias  cl         d         h       list  quit     step     up
    args   clear      debug     help    n     r        tbreak   w
    b      commands   disable   ignore  next  restart  u        whatis
    break  condition  down      j       p     return   unalias  where
    
    Miscellaneous help topics:
    ==========================
    exec  pdb
    

    It is also possible to invoke the module pdb when you run a Python script such as:

    $> python -m pdb my_script.py
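
    To see what such a session does without typing at the prompt, the commands can be fed to the debugger from a string. The sketch below is only an illustration: compute() is a hypothetical function, and a pdb.Pdb instance with redirected stdin and stdout replaces the usual interactive pdb.set_trace() call.

```python
import io
import pdb


def compute():
    value = 6 * 7
    debugger.set_trace()  # like pdb.set_trace(), but on our scripted instance
    return value


# Commands that would normally be typed at the (Pdb) prompt:
# print the local variable, then continue the execution.
commands = io.StringIO("p value\nc\n")
transcript = io.StringIO()
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False, nosigint=True)

result = compute()
print(transcript.getvalue())
```

    The transcript holds what an interactive user would have seen: the location where execution stopped, the (Pdb) prompts and the printed value.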
    

    Pyramid

    http://www.logilab.org/file/98311?vid=download

    By Alexis Métaireau (@ametaireau).

    Pyramid is an open source Python web framework from the Pylons Project. It concentrates on providing fast, high-quality solutions to the fundamental problems of creating a web application:

    • the mapping of URLs to code;
    • templating;
    • security and serving static assets.

    The framework lets the programmer choose between different approaches, depending on the simplicity/feature tradeoff needed. Alexis, from the French Mozilla Services team, works with it on a daily basis and seemed happy to use it. He told us that he uses Pyramid more as a Python web library than as a web framework.

    Circus

    http://www.logilab.org/file/98316?vid=download

    By Benoit Chesneau (@benoitc).

    Circus is a process watcher and runner. Multiple processes can be managed and monitored from Python scripts, through an API, or from a command-line interface.

    A very useful web application, called circushttpd, provides a way to monitor and manage Circus through the web. Circus uses zeromq, a well-known tool used at Logilab.

    matplotlib demo

    This session was a well-prepared and funny live demonstration of matplotlib, the Python 2D plotting library, by Julien Tayon. He showed us some quick and easy stuff.

    For instance, how to plot a sine curve in a few lines of code with matplotlib and NumPy:

    import numpy as np
    import matplotlib.pyplot as plt
    
    fig = plt.figure()
    ax = fig.add_subplot(111)
    
    # A simple sine curve.
    ax.plot(np.sin(np.arange(-10., 10., 0.05)))
    # plt.show() keeps the window open when run as a script.
    plt.show()
    

    which gives:

    http://www.logilab.org/file/98315?vid=download

    You can make some fancier plots such as:

    import numpy as np
    import matplotlib.pyplot as plt

    # A sine curve and a fancy cardioid, on a fresh figure with two subplots.
    fig = plt.figure()
    a = np.arange(-5., 5., 0.1)
    ax_sin = fig.add_subplot(211)
    ax_sin.plot(np.sin(a), '^-r', lw=1.5)
    ax_sin.set_title("A sine curve")

    # Cardioid, drawn from its parametric equations.
    ax_cardio = fig.add_subplot(212)
    x = 0.5 * (2. * np.cos(a) - np.cos(2 * a))
    y = 0.5 * (2. * np.sin(a) - np.sin(2 * a))
    ax_cardio.plot(x, y, '-og')
    ax_cardio.grid()
    ax_cardio.set_xlabel(r"$\frac{1}{2} (2 \cos{t} - \cos{2t})$", fontsize=16)
    plt.show()
    

    Note that you can use LaTeX equations, for instance in the X axis label.

    http://www.logilab.org/file/98314?vid=download

    The strength of this plotting library is its gallery of examples, each provided with its source code. See the matplotlib gallery.

    Using Python for robotics

    Dimitri Merejkowsky reviewed how Python can be used to control and program Aldebaran's humanoid robot NAO.

    Wrap up

    Unfortunately, Olivier Grisel, who was supposed to give three interesting presentations, was not there. He was supposed to present:

    • A demo about injecting arbitrary code into a running Python process, and monitoring it, with Pyrasite.
    • Another demo about interactive data analysis with Pandas and the new IPython Notebook.
    • Wrap-up: projects related to distributed computation on clusters: IPython.parallel, picloud and Storm + Umbrella.

    Thanks to La Cantine and the different organisers for this friendly dev day.


  • Mercurial 2.3 sprint, Day 1-2-3

    2012/05/15 by Pierre-Yves David

    I'm now back from Copenhagen, where I attended the mercurial 2.3 sprint with twenty other people. A huge amount of work was done in a very friendly atmosphere.

    Regarding mercurial's core:

    • Bookmark behaviour was improved to get closer to named branch's behaviour.
    • Several performance improvements were made to the branch and head caches. The heads cache refactoring improves rebase performance on huge repositories (thanks to Facebook and Atlassian).
    • The concept I'm working on, Obsolete markers, was a highly discussed subject and is expected to get partly into the core in the near future. Thanks to my employer Logilab for paying me to work on this topic.
    • General code cleanup and lock validation.
    http://www.logilab.org/file/92956?vid=download

    Regarding the bundled extensions:

    • Some fixes were made to progress, which is now closer to getting into mercurial's core.
    • Histedit and keyring extensions are scheduled to be shipped with mercurial.
    • Some old and unmaintained extensions (children, hgtk) are now deprecated.
    • The LargeFile extension got some new features (thanks to the folks from Unity3D)
    • Rebase will use the --detach flag by default in the next release.
    http://www.logilab.org/file/92958?vid=download

    Regarding the project itself:

    http://www.logilab.org/file/92955?vid=download

    Regarding other extensions:

    http://www.logilab.org/file/92959?vid=download

    And I'm probably forgetting some stuff. Special thanks to Unity3D for hosting the sprint and providing power, network and food during these 3 days.


  • Mercurial 2.3 day 0

    2012/05/10 by Pierre-Yves David

    I'm now in Copenhagen to attend the mercurial "2.3" sprint.

    About twenty people are attending, including staff from Atlassian, Facebook, Google and Mozilla.

    I expect code and discussion on various topics, among them:

    • the development process of mercurial itself,
    • performance improvements on huge repositories,
    • integration of Obsolete Markers into mercurial core,
    • improvements on various aspects (merge diff, moving some extensions into core, ...)

    I'm of course very interested in the Obsolete Markers topic. I've been working on an experimental implementation for several months. A handful of people have been using it at Logilab for two months and the feedback is very promising.


  • Debian bug squashing party in Paris

    2012/02/16 by Julien Cristau

    Logilab will be present at the upcoming Debian BSP in Paris this week-end. This event will focus on fixing as many "release critical" bugs as possible, to help with the preparation of the upcoming Debian 7.0 "wheezy" release. It will also provide an opportunity to introduce newcomers to the processes of Debian development and bug fixing, as well as provide an opportunity for contributors in various areas of the project to interact "in real life".

    http://www.logilab.org/file/88881?vid=download

    The current stable release, Debian 6.0 "squeeze", came out in February 2011. The development of "wheezy" is scheduled to freeze in June 2012, for an eventual release later this year.

    Among the things we hope to work on during this BSP is the transition to the latest HDF5 release (1.8.8), which includes API and packaging changes that require changes in dependent packages. With the number of scientific packages relying on HDF5, this is a pretty big change, as tracked in this Debian bug.

