Today I felt like summing up my opinion on a topic that was discussed this year on
the Python mailing lists, at PyCon-FR, at EuroPython and EuroSciPy... packaging
software! Let us discuss the two main use cases.
The first use case is to maintain computer systems in production. A trait of
production systems is that they cannot afford failures and are often deployed
on a large scale, which leaves little room for manually fixing problems. Either
the installation process works or the system fails. Reaching that level of
quality takes a lot of work.
The second use case is to make life easier for software developers and computer
users by letting them try out new pieces of software without
much work.
The first use case has to be addressed as a configuration management
problem. There is no way around it. The best way I know of managing the
configuration of a computer system is called Debian. Its package format and its
tool chain provide a very extensive and efficient set of features for system
development and maintenance. Of course it is not perfect and there are missing
bits and open issues that could be tackled, like the dependencies between
hardware and software. For example, nothing will prevent you from installing on
your Debian system a version of a driver that conflicts with the version of the
chip found in your hardware. That problem could be solved, but I do not think
the Debian project is there yet, and I do not count it as a reason to reject
Debian since I have not seen any competitor at the same level as Debian.
The second use case is kind of a trap, for it concerns most computer users and
most of those users are either convinced the first use case has nothing in
common with their problem or convinced that the solution is easy and requires
little work.
The situation is made more complicated by the fact that most of those users
never had the chance to use a system with proper package management tools. They
simply do not know the difference and do not feel like they are missing anything when
using their system-that-comes-with-a-windowing-system-included.
Since many software developers have never had to maintain computer systems in
production (often considered a lowly sysadmin job) and never developed packages
for computer systems that are maintained in production, they tend to think that
the operating system and their software are perfectly decoupled. They have no
problem trying to create a new layer on top of existing operating systems and
transforming an operating system issue (managing software installation) into a
programming language issue (see CPAN, Python eggs and so many others).
Creating a sub-system specific to a language and hosting it on an operating
system works well as long as the language boundary is not crossed and there is
no competition between the sub-system and the system itself. In the Python
world, distutils, setuptools, eggs and the like more or less work with pure
Python code. They create a square wheel that was made round years ago by
dpkg+apt-get and others, but they help a lot of their users do something they
would not know how to do another way.
A wall is quickly hit though, as the approach becomes overly complex as soon as
they try to depend on things that do not belong to their Python sub-system. What
if your application needs a database? What if your application needs to link to
libraries? What if your application needs to reuse data from or provide data to
other applications? What if your application needs to work on different
architectures?
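To make the library question concrete, here is a minimal sketch of what a Python-level tool can do about a C library it needs: look for it at run time and complain if it is absent. It relies only on `ctypes.util.find_library` from the standard library. The helper name `missing_system_libraries` and the library names are my own illustrative assumptions, not part of distutils or setuptools.

```python
import ctypes.util


def missing_system_libraries(names):
    """Return the library names that could not be located on this system.

    The check can only report a problem; installing the missing library
    remains the job of the operating system's package manager.
    """
    return [name for name in names if ctypes.util.find_library(name) is None]


if __name__ == "__main__":
    # "z" and "sqlite3" are arbitrary examples of C libraries an
    # application might link to; the result depends on the machine.
    print(missing_system_libraries(["z", "sqlite3"]))
```

With dpkg, the same requirement would be declared once in the package's Depends field and resolved at install time instead of being discovered when the application starts.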
Software developers who have never had to maintain computer systems in
production wish these tasks were easy. Unfortunately they are not easy and
cannot be. As I said, there is no way around configuration management for
anyone who wants a stable system. Configuration management requires both project
management work and software development work. One can have a system where
packaging software is less work, but that comes at the price of reduced
stability, functionality and ease of maintenance.
Since neither of the two use cases will disappear any time soon, the only solution
to the problem is to share as much data as possible between the different tools
and let each user decide how to install software on their own computer system.
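As a rough sketch of what sharing data between tools could look like, the snippet below keeps one tool-neutral description of a project and derives from it both setuptools-style keyword arguments and a fragment of a Debian control stanza. Everything here is an assumption made for illustration: the `METADATA` field names, the project and dependency names, and the simplistic `python3-` prefix used to map Python dependencies to Debian package names. A real control file also needs fields such as Architecture and Maintainer.

```python
# Hypothetical tool-neutral project description; field names are illustrative.
METADATA = {
    "name": "example-app",
    "version": "1.0",
    "summary": "An example application",
    "python_requires": ["lxml"],        # dependencies inside the Python world
    "system_requires": ["postgresql"],  # dependencies outside of it
}


def to_setup_kwargs(meta):
    """Build keyword arguments in the style accepted by setuptools.setup().

    System-level dependencies are dropped, since install_requires can only
    name Python distributions.
    """
    return {
        "name": meta["name"],
        "version": meta["version"],
        "description": meta["summary"],
        "install_requires": list(meta["python_requires"]),
    }


def to_debian_control(meta):
    """Build a partial Debian control stanza from the same data.

    The "python3-" prefix is a naive naming assumption, not a general rule.
    """
    depends = ["python3-%s" % dep for dep in meta["python_requires"]]
    depends += meta["system_requires"]
    lines = [
        "Package: %s" % meta["name"],
        "Version: %s" % meta["version"],
        "Depends: %s" % ", ".join(depends),
        "Description: %s" % meta["summary"],
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_setup_kwargs(METADATA))
    print(to_debian_control(METADATA))
```

The point of the sketch is that the Debian side can express the database dependency while the Python side has to drop it, yet both are generated from the same source of truth.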
Some links for further reading on the same topic: