The EuroSciPy 2010 conference was held in Paris at the École Normale Supérieure from July 8th to 11th, organized and sponsored by Logilab and other companies.
The first two days were dedicated to tutorials, and I had the chance to present SciPy with André Espaze, Gaël Varoquaux and Emmanuelle Gouillart in the introductory track. It was enjoyable, though a bit tricky, to cover SciPy in such a short time while illustrating the material with real and interesting examples. A nice touch is that all the material for the introductory track was contributed by different speakers and is freely available in a GitHub repository (licensed under CC BY).
The next two days were dedicated to scientific presentations showing why Python is such a great tool for developing scientific software and carrying out research.
I had a great time listening to the presentations, starting with the two keynotes given by Hans Petter Langtangen and Konrad Hinsen. The latter summarized what has happened in the scientific Python world during the past 15 years, what is happening now and, of course, what could happen during the next 15 years. With a crystal ball and plenty of humor, he made it clear that the challenge of the coming years will be to use our hundreds, thousands or even more cores in a bug-free and efficient way. Functional programming may be a good answer to this challenge, as it provides a deterministic way of parallelizing programs. Konrad also hinted that future versions of Python could offer deeper and more efficient support for functional programming, and perhaps an 'async' keyword to run a function on another core.
In fact, PEP 3148, entitled "futures - execute computations asynchronously", was accepted just two days ago. This PEP describes a new package called "futures", designed to ease the evaluation of callables using threads and processes in future versions of Python. A full implementation is already available.
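For the record, here is a minimal sketch of that API, assuming the concurrent.futures module name and the executor interface described in the PEP (the square function is my own toy example):

import concurrent.futures

def square(x):
    return x * x

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        # submit() returns a Future right away; result() blocks until done
        print(executor.submit(square, 7).result())
        # map() evaluates the callable over the iterable in parallel
        print(list(executor.map(square, range(5))))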
Parallelization was indeed a popular topic across the presentations, and several solutions for tackling embarrassingly parallel problems were presented:
playdoh: Distributes computations over several machines. For instance, distributing the computation of a function over two machines is as simple as:
import playdoh
result1, result2 = playdoh.map(fun, [arg1, arg2],
                               _machines=['machine1.network.com',
                                          'machine2.network.com'])
Theano: Lets you define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. In particular, it can use the GPU transparently and generate optimized C code (see the Theano presentation).
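A minimal sketch in the style of the Theano tutorial (the expression itself is my own example): the expression graph is built symbolically and only turned into fast code when theano.function is called.

import theano
import theano.tensor as T

x = T.dscalar('x')
y = T.dscalar('y')
z = x ** 2 + y                  # symbolic expression, nothing computed yet
f = theano.function([x, y], z)  # compiled to optimized C (or GPU) code
print(f(2.0, 3.0))              # 7.0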
joblib: Provides, among other things, helpers for embarrassingly parallel problems. It is built on top of the multiprocessing package introduced in Python 2.6 and brings more readable code and easier debugging.
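A minimal sketch, adapted from the canonical example in the joblib documentation:

from math import sqrt
from joblib import Parallel, delayed

# evaluate sqrt on ten inputs using two worker processes
results = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
print(results)  # [0.0, 1.0, 2.0, ..., 9.0]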
Concerning speed, Francesc Alted showed us interesting tools for memory optimization that are already used successfully in PyTables 2.2. You can read more about this kind of optimization in EuroSciPy'09 (part 1/2): The Need For Speed.
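The talk's tools are not named here, but PyTables 2.2 ships the Blosc compressor; assuming that is part of what was shown, here is a minimal sketch of enabling it through the PyTables 2.x API (the file and array names are mine):

import numpy as np
import tables

# enable the Blosc compressor for this file's arrays
filters = tables.Filters(complevel=5, complib='blosc')
h5file = tables.openFile('compressed.h5', mode='w')
carray = h5file.createCArray(h5file.root, 'data', tables.Float64Atom(),
                             (1000, 1000), filters=filters)
carray[:] = np.random.rand(1000, 1000)
h5file.close()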