Today was the second day of the 10th anniversary Pylint sprint in Logilab's Toulouse office.
This morning, I started with a presentation on how the inference engine works in astroid (formerly astng).
Then we thought together about how to change its API so that more information can be plugged in during the inference process. The first use case we wanted to address was namedtuple, as explained in http://www.logilab.org/ticket/8796.
We ended up addressing it by:
enhancing the existing transformation feature so one may register a transformation function on any node rather than on a module node only;
being able to specify, on a node instance, a custom inference function to use instead of the default (class) implementation.
We would then be able to customize both the tree structure and the inference process, and thus resolve the cases we were targeting.
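To make the idea of registering a transformation on any node type more concrete, here is a minimal sketch built on the standard library's ast module. It is not the astroid API: TransformRegistry, is_namedtuple_call, attach_fields and the inferred_fields attribute are hypothetical names chosen for illustration only.

```python
import ast
from collections import defaultdict


class TransformRegistry:
    """Hypothetical registry mapping a node class to transformation functions."""

    def __init__(self):
        self._transforms = defaultdict(list)

    def register(self, node_class, transform, predicate=None):
        # A transform can target any node class, not only modules.
        self._transforms[node_class].append((transform, predicate))

    def apply(self, tree):
        # Visit every node and run the transforms registered for its class.
        for node in ast.walk(tree):
            for transform, predicate in self._transforms.get(type(node), ()):
                if predicate is None or predicate(node):
                    transform(node)
        return tree


def is_namedtuple_call(node):
    """Only match calls written as namedtuple(...)."""
    return isinstance(node.func, ast.Name) and node.func.id == "namedtuple"


def attach_fields(node):
    """Store the field names on this node instance (illustrative attribute)."""
    if len(node.args) < 2:
        return
    fields = node.args[1]
    if isinstance(fields, ast.Constant) and isinstance(fields.value, str):
        node.inferred_fields = fields.value.replace(",", " ").split()
    elif isinstance(fields, (ast.List, ast.Tuple)):
        node.inferred_fields = [
            elt.value for elt in fields.elts if isinstance(elt, ast.Constant)
        ]


registry = TransformRegistry()
registry.register(ast.Call, attach_fields, is_namedtuple_call)
tree = registry.apply(ast.parse("Point = namedtuple('Point', 'x y')"))
call = tree.body[0].value
print(call.inferred_fields)
```

The predicate keeps the transform cheap: it runs only on the call nodes that look like a namedtuple definition, and the extra information lives on that single node instance rather than on its class.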
Once this was sufficiently sketched out, everyone picked up their own tasks. Here is a quick summary of what was achieved today:
Anthony resumed the check_messages work and finished it for the simple cases, then started working on a template for the reported text,
Julien and David made a lot of progress on Python 3.3 compatibility, though not enough to get a fully green test suite,
Torsten continued backporting changes from gpylint, all of which had been integrated by the end of the day,
Sylvain implemented the new transformation API and got the namedtuple proof of concept working, and even wrote some documentation! Now this has to be tested on more real-world uses.
So things are going really well, and see you tomorrow for even more improvements to pylint!
Today was the first day of the Pylint sprint we organized, using
Pylint's 10th anniversary as an excuse.
So I (Sylvain) welcomed my fellow Logilab friends David, Anthony
and Julien, as well as Torsten from Google, to Logilab's new Toulouse
office.
After a bit of presentation and talk about Pylint development, we
decided to keep discussions for lunch and dinner and to set up
priorities. We ended up with the following tasks (picked from the pad at
http://piratepad.net/oAvsUoGCAC):
rename astng to move it outside the logilab package,
review of Torsten's gpylint (Google Pylint) patches, as many as
possible (but not all of them, starting with a review of the numerous
internal checks Google has, seeing one by one which ones should be
backported upstream),
enhance the astroid (formerly astng) API to allow more ad-hoc
customization for a better grasp of the magic occurring in e.g. web
frameworks (protocol buffers or SQLAlchemy may also be an
application of this).
Regarding the astng renaming, we decided to go with
astroid, as pointed out by the survey on StellarSurvey.com.
In the afternoon, David and Julien tackled this, while Torsten was
extracting patches from Google code and sending them to Bitbucket as
pull requests, Sylvain was embracing setuptools namespace packages and
Anthony was discovering the code in order to spread the use of the
@check_messages decorator.
Torsten submitted 5 pull requests with code extracted from gpylint;
we reviewed them together and then Torsten used evolve to properly
insert them in the pylint history once the review comments had been
integrated.
Sylvain submitted 2 patches on logilab-common to support both
setuptools namespace packages and pkgutil.extend_path (but
not bare __path__ manipulation).
Anthony discovered various checkers and started adding proper
@check_messages on visit methods.
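As an illustration of the pkgutil.extend_path style of namespace package mentioned above, here is a small self-contained sketch. The package name demo_ns and its two modules are hypothetical; the snippet builds two separate directories that each carry a portion of the same package and shows that both portions can be imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Content of each __init__.py: the pkgutil flavour of a namespace package.
INIT = "from pkgutil import extend_path\n__path__ = extend_path(__path__, __name__)\n"


def make_portion(root, package, module, body):
    """Create <root>/<package>/ with an __init__.py and one module."""
    pkg_dir = Path(root) / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(INIT)
    (pkg_dir / (module + ".py")).write_text(body)


# Two independent install locations, each holding part of "demo_ns".
first = tempfile.mkdtemp()
second = tempfile.mkdtemp()
make_portion(first, "demo_ns", "common", "NAME = 'common'\n")
make_portion(second, "demo_ns", "other", "NAME = 'other'\n")
sys.path[:0] = [first, second]

# extend_path scans sys.path and adds every "demo_ns" directory to __path__,
# so modules from both locations are importable under the same package.
common = importlib.import_module("demo_ns.common")
other = importlib.import_module("demo_ns.other")
print(common.NAME, other.NAME)
```

The setuptools flavour differs only in the line placed in __init__.py; the directory layout with several portions of the same package stays the same.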
After doing some reviews all together, we even had some time to take a
look at Python 3.3 support while writing this summary.
Hopefully, our work in the coming days will be as efficient as on this first day!
This morning, I attended the Rencontres INRIA Industries, whose theme was "Modelling, simulation and high-performance computing".
Among other things, the HPC-PME initiative was presented there; its purpose is to make high-performance computing more accessible to SMEs.
For me, this conference was an opportunity to formalize my thoughts on SMEs and their use of numerical simulation.
The HPC-PME initiative is backed by INRIA, GENCI and OSEO. It offers to support SMEs in order to show them
how high-performance computing can help them innovate, become more competitive and, ultimately, identify
growth drivers. Its actions follow four lines:
training and the sharing of best practices,
access to computing resources, to demonstrate the usefulness of high-performance computing,
support from experts in public research, who transfer their high-performance computing skills,
integration into innovation funding schemes.
The typical support given to an SME consists in identifying with it the areas where numerical simulation
can bring something, sizing the computing needs of these simulations, helping it port its simulation
codes to high-performance computers and, finally, running a typical computation case on these machines.
After two years of work, the HPC-PME initiative has supported 30 SMEs and let them discover how high-performance
computing can help them become more competitive. These SMEs come from very diverse fields (marine hydrology,
electronics, energy management, finance, etc.) and have in common that they focus on a given trade and generally
have no in-house skills in numerical computing.
Having long been convinced that numerical simulation and the smart use of computing power can help
companies become more competitive, I can only be delighted by an initiative such as HPC-PME. However, it does not
always seem to me to fit the needs of an SME wondering whether to turn to numerical simulation.
During the various conversations I have had with SME managers, it appeared to me that their main request is
to be able to run simulations very simply, by providing a data set and then clicking a button. The HPC-PME
initiative aims to demonstrate the value of high-performance computing and to help SMEs acquire the skills
needed to implement HPC solutions. Yet most SMEs:
do not wish to set up an in-house high-performance computing solution (purchase, configuration, maintenance),
do not wish to acquire in-house the skills needed to use a high-performance computing solution,
do not really need the extreme performance offered by the HPC-PME initiative (the Formula 1
of numerical simulation), but rather a solid, simple solution with a reasonable entry cost (a good
high-end car is enough),
do not have a steady demand that would let them amortize an investment in numerical computing.
SMEs generally have strong expertise in their trade; this is what sets them apart. But most of them have
no skills in numerical analysis or computer science. Since they have only a few computations to run,
bringing these skills in-house is superfluous for them.
Based on this observation, I sincerely think that the best way for an SME to get access to numerical computing would be
a platform billed on a pay-per-use basis that lets it:
define a computation case in business terms,
access computing resources on which the computation codes are installed and configured,
launch the computation case in one click on a computing resource offering sufficient performance,
get help from numerical experts who know the computation codes (in case of numerical or modelling problems),
share computation cases under strict and fully controllable security rules
(typically to call on a numerical expert, show results to the end customer or work with a supplier).
After a quick survey, we're officially scheduling the Pylint 10th anniversary sprint from Monday, June 17 to Wednesday, June 19 in Logilab's Toulouse office.
There is still some room available if more people want to come; just drop me a note (sylvain dot thenault at logilab dot fr).
After 10 years of hosting Pylint on our own forge at logilab.org, we've decided to publish version 1.0 and move Pylint and astng development to BitBucket. There have been repository mirrors there for some time, but we now intend to use all BitBucket features, notably pull requests, to handle various development tasks.
There are several reasons behind this. First, using both BitBucket and our own forge is rather cumbersome, for integrators at least. This is mainly because BitBucket doesn't provide support for Mercurial's changeset evolution feature while our forge relies on it. Second, our forge has several usability drawbacks that make it hard to use for newcomers, and we lack the time to be responsive on this. Finally, we think that our quality-control process, as exposed by our forge, is a bit heavy for such community projects and may keep potential contributors away.
All in all, we hope this will help us reach a wider contributor audience as well as more regular maintainers / integrators who are not Logilab employees. And so, bring the best Pylint possible to the Python community!
Logilab.org web pages will be updated to mention this, but kept, as there is still valuable information there (e.g. tickets). We may also keep automatic tests and package building services there.
So, please use https://bitbucket.org/logilab/pylint as the main web site regarding pylint development. Bug reports, feature requests as well as contributions should be submitted there. The same move will be done for Pylint's underlying library, logilab-astng (https://bitbucket.org/logilab/astng). In the process, we also wish to move it out of the 'logilab' python package. It may be a good time to give it another name; if you have any ideas, don't hesitate to share them.
Pylint turning 10 and moving out of its parents is probably a good time to thank Logilab for paying me and some colleagues to create and maintain this project!
In a few weeks, pylint will be 10 years old (0.1 was released on May 19, 2003!).
On this occasion, I would like to release a 1.0. Well, not exactly on that date,
but not too long after would be great. Also, I think it would be a good time
to hold a sprint of a few days to work a bit on this 1.0, but also to meet all together
and talk about pylint's status and future, as more and more contributions come from
outside Logilab (actually mostly Google, which employs Torsten and Martin, the most
active contributors recently).
The first thing to do is to decide on a date and place. Having discussed this a bit with
Torsten, it seems reasonable to target a sprint in June or July.
Due to personal constraints, I would like to host this sprint in Logilab's
Toulouse office.
So, who would like to jump in and sprint to make pylint even better? I've created
a doodle so everyone interested can state their preferences:
http://doodle.com/4uhk26zryis5x7as
Regarding the location, is everybody ok with Toulouse? Other ideas are Paris, or
Florence around EuroPython, or... <add your proposition here>.
We'll talk about the sprint topics later, but there are plenty of exciting ideas
around there.
Please, answer quickly so we can move on. And I hope to see you all there!
At the end of March 2013, Logilab hosted a sprint on the LMGC90 simulation code
in Paris.
LMGC90 is open-source software developed at the LMGC ("Laboratoire de
Mécanique et Génie Civil" -- "Mechanics and Civil Engineering Laboratory") of
the CNRS, in Montpellier, France. LMGC90 is devoted to contact mechanics and is,
thus, able to model large collections of deformable or undeformable physical
objects of various shapes, with numerous interaction laws. LMGC90 also allows
for multiphysics coupling.
The sprint was attended by:
the LMGC, which leads LMGC90 development and aims at constantly improving its
architecture and usability;
the Innovation and Research Department of the SNCF (the French state-owned
railway company), which uses LMGC90 to study railway mechanics, and more
specifically, the ballast;
the LaMSID ("Laboratoire de Mécanique des Structures Industrielles Durables",
"Laboratory for the Mechanics of Ageing Industrial Structures") laboratory
of EDF / CNRS / CEA, which has strong expertise in Code_ASTER
and LMGC90;
Logilab, as the developer, for the SNCF, of a CubicWeb-based platform
dedicated to simulation data and knowledge management.
After a great introduction to LMGC90 by Frédéric Dubois and some preliminary
discussions, teams were quickly constituted around the common areas of interest.
As of the sprint date, LMGC90 is mainly developed in Fortran, but also contains
Python code for two purposes:
Exposing the Fortran functions and subroutines in the LMGC90 core to Python;
this is achieved using Fortran 2003's ISO_C_BINDING module and Swig.
These Python bindings are grouped in a module called ChiPy.
Making it easy to generate input data (so called "DATBOX" files) using Python.
This is done through a module called Pre_LMGC.
The main drawback of this approach is the double modelling of data that this
architecture implies: once in the core and once in Pre_LMGC.
It was decided to build a unique user-level Python layer on top of ChiPy,
that would be able to build the computational problem description and write the
DATBOX input files (currently achieved by using Pre_LMGC), as well as
to drive the simulation and read the OUTBOX result files (currently by using
direct ChiPy calls).
This task was a success: in the short time span available
(half a day, basically), the team managed to build some object types using
ChiPy calls and save them into a DATBOX.
This topic involved importing LMGC90 DATBOX data into the numerical platform
developed by Logilab for the SNCF.
This was achieved using ChiPy as a Python API to the Fortran core to get:
the bodies involved in the computation, along with their materials, behaviour
laws (with their associated parameters), geometries (expressed in terms of
zones);
the interactions between these bodies, along with their interaction laws (and
associated parameters, e.g. friction coefficient) and body pair (each
interaction is defined between two bodies);
the interaction groups, which contain interactions that have the same
interaction law.
There is still a lot of work to be done (notably regarding the loads applied
to the bodies), but this is already a great achievement. This could only have
occurred in a sprint, where every needed expertise is available:
the SNCF experts were there to clarify the import needs and check the overall
direction;
Logilab implemented a data model based on CubicWeb, and imported the data
using the ChiPy bindings developed on-demand by the LMGC core developer team,
using the usual-for-them ISO_C_BINDING/ Swig Fortran wrapping dance.
Logilab undertook the data import; to this end, it asked the LMGC how the
relevant information from LMGC90 can be exposed to Python via the ChiPy API.
The main point of this topic was to replace the in-house DATBOX/OUTBOX textual
format used by LMGC90 to store input and output data, with an open, standard and
efficient format.
Several formats were considered, such as HDF5, MED and NetCDF4.
MED has been ruled out for the moment, because it lacks support for storing
body contact information. HDF5 was finally chosen because of the quality of its
Python libraries, h5py and pytables, and the ease of use that tools like h5fs provide.
Alain Leufroy from Logilab quickly presented h5py and h5fs usage, and the team
started its work, measuring the performance impact of the storage pattern of
LMGC90 data. This was quickly achieved, as the LMGC experts made it easy to
setup tests of various sizes, and as the Logilab developers managed to
understand the concepts and implement the required code in a fast and agile way.
This topic turned out to be more difficult than initially assessed, mainly
because LMGC90 depends on external libraries that are not packaged, which thus had
to be packaged first:
the Matlib linear algebra library, written in C,
the Lapack95 library, which is a Fortran95 interface to the Lapack library.
Logilab kept working on this after the sprint and produced packages that are
currently being tested by the LMGC team. Some changes are expected (for instance,
Python modules should be prefixed with a proper namespace) before the packages can be
submitted for inclusion into Debian. The expertise of Logilab regarding
Debian packaging was of great help for this task. This will hopefully help to
spread the use of LMGC90.
As you may know, Logilab is really fond of Mercurial as a DVCS. Our company
invested a lot into the development of the great evolve extension, which makes
Mercurial a very powerful tool to efficiently manage the team development of
software in a clean fashion.
This is why Logilab presented Mercurial's features and advantages over the
current VCS used to manage LMGC90 sources, namely svn, to the other
participants of the sprint. This was appreciated and will hopefully make
LMGC90 easier to develop and help it spread in the Open Source community.
All in all, this two-day sprint on LMGC90, involving participants from several
industrial and academic institutions has been a great success. A lot of code has
been written but, more importantly, several stepping stones have been laid, such
as:
the general LMGC90 data access architecture, with the Python layer on top of
the LMGC90 core;
the data storage format, namely HDF5.
As a side effect, several other results have also been achieved:
partial LMGC90 data import into the SNCF CubicWeb-based numerical platform,
Debian / Ubuntu packaging of LMGC90 and dependencies.
On a final note, we would like to say that we greatly appreciated the cooperation
between the participants, which we found pleasant and efficient. We look forward
to finding more occasions to work together.
I'm very pleased to announce the release of pylint 0.27 and
logilab-astng 0.24.2. There have been a lot of enhancements and
bug fixes since the latest release, so you're strongly encouraged
to upgrade. Here is a detailed list of changes:
#20693: replace pylint.el by Ian Eure version (patch by J.Kotta)
#105327: add support for --disable=all option and deprecate the
'disable-all' inline directive in favour of 'skip-file' (patch by
A.Fayolle)
#110840: add messages I0020 and I0021 for reporting of suppressed
messages and useless suppression pragmas. (patch by Torsten Marek)
#112728: add warning E0604 for non-string objects in __all__
(patch by Torsten Marek)
#120657: add warning W0110/deprecated-lambda when a map/filter
of a lambda could be a comprehension (patch by Martin Pool)
#113231: logging checker now looks at instances of Logger classes
in addition to the base logging module. (patch by Mike Bryant)
#111799: don't warn about octal escape sequences, but warn about \o,
which is not octal in Python (patch by Martin Pool)
#115580: fix erroneous W0212 (access to protected member) on super call
(patch by Martin Pool)
#110853: fix a crash when an __init__ method in a base class has been
created by assignment rather than direct function definition (patch by
Torsten Marek)
#110838: fix pylint-gui crash when include-ids is activated (patch by
Omega Weapon)
#112667: fix emission of reimport warnings for mixed imports and extend
the testcase (patch by Torsten Marek)
#112698: fix crash related to non-inferable __all__ attributes and
invalid __all__ contents (patch by Torsten Marek)
Python 3 related fixes:
#110213: fix import of checkers broken with python 3.3, causing
"No such message id W0704" breakage
#120635: redefine cmp function used in pylint.reporters
Include full warning id for I0020 and I0021 and make sure to flush
warnings after each module, not at the end of the pylint run.
(patch by Torsten Marek)
Changed the regular expression for inline options so that it must be
preceded by a # (patch by Torsten Marek)
Make dot output for import graph predictable and not depend
on ordering of strings in hashes. (patch by Torsten Marek)
Add hooks for import path setup and move pylint's sys.path
modifications into them. (patch by Torsten Marek)
pylint-brain: more subprocess.Popen faking (see #46273)
#109562 [jython]: java modules have no __doc__, causing crash
#120646 [py3]: fix for python3.3 _ast changes which may cause crash
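As an illustration of the W0110/deprecated-lambda warning listed above, the following sketch shows the kind of map/filter call on a lambda that the message is about, next to the equivalent comprehension. The variable names are made up for the example.

```python
numbers = [1, 2, 3, 4]

# Patterns the warning targets: map/filter applied to a lambda.
squares_via_map = list(map(lambda n: n * n, numbers))
evens_via_filter = list(filter(lambda n: n % 2 == 0, numbers))

# Equivalent comprehensions, usually easier to read.
squares = [n * n for n in numbers]
evens = [n for n in numbers if n % 2 == 0]

print(squares, evens)
```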
I was in Brussels for FOSDEM 2013. As with previous editions, there were too many
interesting talks and people to see. Here is a summary of what I saw:
In the Mozilla room:
The html5 pdf viewer pdfjs is impressive. The PDF specification is really
scary, but this full-featured "native" viewer is able to render most of it
with very good performance. Have a look at the pdfjs demo!
Firefox debug tools overview, with a specific focus on the Firefox OS emulator in
your browser.
Introduction to webl10n: an internationalization format and library used in
Firefox OS. A successful mix that results in a format that is idiot-proof
enough for a duck to use, that relies on Unicode specifications to handle
complex pluralization rules and that allows cascading translation
definitions.
Status of html5 video and audio support in Firefox. The topic looks like a
real headache but the team seems to be doing really well. Special mention
for the reverse demo effect: the speaker expected some formats to be still
unsupported, but someone else apparently implemented them overnight.
Last but not least, I gave a talk about the changeset evolution concept that
I'm putting in Mercurial. Thanks go to Feth for asking me his
not-scripted-at-all questions during this talk. (slides)
Insightful talk about additional event triggers in the postgresql engine and how this may
become the perfect way to break your system.
Full update on the capabilities of postgis 2.0. The postgis suite was already
impressive for storing and querying 2D data, but it now has impressive
capabilities regarding 3D data.
On Python-related topics:
Aldebaran Robotics is currently opening most of its code. It is
a perfect example of the value of Python for implementing high-level
logic.
Victor Stinner has started two interesting projects to improve CPython
performance. The first one, astoptimizer, breaks some of the language
semantics to apply optimisations when compiling to byte code (lookup caching,
constant folding,…). The other, registervm, is a full redefinition of how the interpreter
handles references in byte code.
After FOSDEM, I crossed the Channel to attend a Mercurial sprint in London.
Expect more on this topic soon.
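To give an idea of what constant folding means, here is a minimal sketch using the standard ast module. It is not how astoptimizer is implemented: ConstantFolder is a hypothetical class that only folds +, - and * on numeric literals.

```python
import ast
import operator

# Operators this sketch knows how to fold (an arbitrary, small selection).
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace binary operations on numeric literals with their result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _FOLDABLE.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(value=op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


tree = ConstantFolder().visit(ast.parse("seconds = 60 * 60 * 24 + x"))
folded_source = ast.unparse(ast.fix_missing_locations(tree))
print(folded_source)
```

The multiplication is computed once at compile time, while the addition involving the variable x is left untouched. ast.unparse requires Python 3.9 or later.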
The release candidate of Mercurial 2.5 was released last Sunday.
This new version makes a major change in the way "hidden" changesets are
handled. In 2.4 only hg log (and a few others) would support effectively
hiding "hidden" changesets. Now all hg commands are transparently compatible
with the hidden revision concept. This is a considerable step towards
changeset evolution, the next-generation collaboration technology that I'm
developing for Mercurial.
The week after, I'm crossing the channel to attend the Mercurial 2.6 Sprint
hosted by Facebook London. I expect a lot of discussion about the user
interface and network access of changeset evolution.
Now that I have a working OpenStack cloud at Logilab, I want to provide
my fellow colleagues with a bunch of ready-made images to create instances.
Strangely, there are no really usable ready-made UEC Debian images
available out there. There have been recent efforts made to provide
Debian images on Amazon Market Place, and the toolsuite used to
build these is available as a collection of bash shell scripts from
a github repository. There are also some images for Eucalyptus,
but I have not been able to make them boot properly on my kvm-based
OpenStack install.
So I have tried to build my own set of Debian images to upload in my
glance shop.
A bit of vocabulary may be useful for those not very familiar with
OpenStack or AWS jargon.
When you want to create an instance of an image, i.e. boot a virtual
machine in a cloud, you generally choose from a set of ready-made
system images, then you choose a virtual machine flavor (i.e. a
combination of a number of virtual CPUs, an amount of RAM, and a
hard drive size used as root device). Generally, you have to choose
between tiny (1 CPU, 512MB, no disk), small (1 CPU, 2G of RAM, 20G
of disk), etc.
In the cloud world, an instance is not meant to be persistent. What
is persistent is a volume that can be attached to a running instance.
If you want your instance to be persistent, there are 2 choices:
you can snapshot a running instance and upload it as a new image;
so it is not really a persistent instance, instead, it's the
ability to configure an instance that is then the base for booting
other instances,
or you can boot an instance from a volume (which is the
persistent part of a virtual machine in a cloud).
In the Amazon world, a "standard" image (the one that is instantiated
when creating a new instance) is called an instance store-backed AMI
image, also called a UEC image, and a volume image is called an
EBS-backed AMI image (EBS stands for Elastic Block Storage). So an AMI
image stored in a volume cannot be instantiated; it can be booted
once and only once at a time. But it is persistent. Different usage.
A UEC or AMI image consists of a triplet: a kernel, an init ramdisk
and a root file system image. An EBS-backed image is just the raw
disk image to be booted on a virtualization host (a kvm raw or qcow2
image, etc.)
In OpenStack, when you create an instance from a given image, what
happens depends on the kind of image.
In fact, in OpenStack, one can upload traditional UEC AMI images (you need
to upload the 3 files: the kernel, the initial ramdisk and the root
filesystem as a raw image). But one can also upload bare
images. This kind of image is booted directly by the
virtualization host. So it is some kind of hybrid between a boot from
volume (an EBS-backed boot in the Amazon world) and the traditional
instantiation from a UEC image.
Instantiating a BARE image:
Allows booting a non-Linux system (depends on the virtualization
system, especially true when using kvm virtualization).
Is slower to boot and consumes more resources, since the virtual
machine image must be the size of the required/wanted virtual
machine (but can remain minimal if using a qcow2 image format). If
you use a 10G raw image, then 10G of data will be copied from the
image provider to the virtualization host, and this big file will
be duplicated each time you instantiate this image.
The root filesystem size corresponding to the flavor of the
instance is not honored; the filesystem size is the one of the
BARE images.
Instantiating an AMI image:
Honours the flavor.
Generally allows quicker instance creation process.
Less resource consumption.
Can only boot Linux guests.
If one wants to boot a Windows guest in OpenStack, the only solution
(as far as I know) is to use a BARE image of an installed Windows
system. It works (I have succeeded in doing so), but a minimal Windows
7 install is several GB, so instantiating such a BARE image is very
slow, because the image needs to be uploaded on the virtualization
host.
So I wanted to provide a minimal Debian image in my cloud, and to
provide it as an AMI image so the flavor is honoured, and so the
standard cloud injection mechanisms (like setting up the ssh key to
access the VM) work without having to tweak the rc.local script or use
cloud-init in my guest.
This creates a new virtual machine, launches the Debian installer
directly downloaded from a Debian mirror, and starts the usual Debian
installer in a virtual serial console (I don't like VNC very much).
I then followed the installation procedure. When asked about
partitioning, I chose to create only one primary partition
(i.e. with no swap partition; it won't be necessary here). I also chose
only "Default system" and "SSH server" to be installed.
After the installation process, the VM is rebooted and I log into it (by
SSH or via the console), so I can configure the system a bit.
david@host:~$ ssh root@openstack-squeeze-amd64.vm.logilab.fr
Linux openstack-squeeze-amd64 2.6.32-5-amd64 #1 SMP Sun Sep 23 10:07:46 UTC 2012 x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sun Dec 23 20:14:24 2012 from 192.168.1.34
root@openstack-squeeze-amd64:~# apt-get update
root@openstack-squeeze-amd64:~# apt-get install vim curl parted # install some must-have packages [...]
root@openstack-squeeze-amd64:~# dpkg-reconfigure locales # I like to have fr_FR and en_US in my locales [...]
root@openstack-squeeze-amd64:~# echo virtio_balloon >> /etc/modules
root@openstack-squeeze-amd64:~# echo acpiphp >> /etc/modules
root@openstack-squeeze-amd64:~# update-initramfs -u
root@openstack-squeeze-amd64:~# apt-get clean
root@openstack-squeeze-amd64:~# rm /etc/udev/rules.d/70-persistent-net.rules
root@openstack-squeeze-amd64:~# rm .bash_history
root@openstack-squeeze-amd64:~# poweroff
What we do here is install some packages and do some
configuration. The important part is adding the acpiphp module so
that volume attachment will work in our instances. We also clean a few
things up before shutting the VM down.
Then, as I want a minimal-sized disk image, the filesystem must be
shrunk to its minimal size. I did this as described below, but I think there
are simpler methods to do so.
david@host:~$ fdisk -l openstack-squeeze-amd64.raw # display the partition location in the disk
Disk openstack-squeeze-amd64.raw: 5368 MB, 5368709120 bytes
149 heads, 8 sectors/track, 8796 cylinders, total 10485760 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0001fab7
Device Boot Start End Blocks Id System
debian-squeeze-amd64.raw1 2048 10483711 5240832 83 Linux
david@host:~$ # extract the filesystem from the image
david@host:~$ dd if=openstack-squeeze-amd64.raw of=openstack-squeeze-amd64.ami bs=1024 skip=1024 count=5240832
david@host:~$ losetup /dev/loop1 openstack-squeeze-amd64.ami
david@host:~$ mkdir /tmp/img
david@host:~$ mount /dev/loop1 /tmp/img
david@host:~$ cp /tmp/img/boot/vmlinuz-2.6.32-5-amd64 .
david@host:~$ cp /tmp/img/boot/initrd.img-2.6.32-5-amd64 .
david@host:~$ umount /tmp/img
david@host:~$ e2fsck -f /dev/loop1 # required before a resize
e2fsck 1.42.5 (29-Jul-2012)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/loop1: 26218/327680 files (0.2% non-contiguous), 201812/1310208 blocks
david@host:~$ resize2fs -M /dev/loop1 # minimize the filesystem
resize2fs 1.42.5 (29-Jul-2012)
Resizing the filesystem on /dev/loop1 to 191461 (4k) blocks.
The filesystem on /dev/loop1 is now 191461 blocks long.
david@host:~$ # note the new size ^^^^ and the block size above (4k)
david@host:~$ losetup -d /dev/loop1 # detach the lo device
david@host:~$ dd if=openstack-squeeze-amd64.ami of=debian-squeeze-amd64-reduced.ami bs=4096 count=191461
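The dd arguments used above follow directly from the fdisk and resize2fs output. As a quick sanity check, here is a small Python sketch of that arithmetic; dd_params is a hypothetical helper, not part of any tool.

```python
SECTOR_SIZE = 512  # bytes per sector, as reported by fdisk


def dd_params(start_sector, end_sector, block_size=1024, sector_size=SECTOR_SIZE):
    """Return the (skip, count) values for dd, expressed in block_size units."""
    start_bytes = start_sector * sector_size
    length_bytes = (end_sector - start_sector + 1) * sector_size
    if start_bytes % block_size or length_bytes % block_size:
        raise ValueError("partition is not aligned on the dd block size")
    return start_bytes // block_size, length_bytes // block_size


# Partition boundaries taken from the fdisk output (Start and End columns).
skip, count = dd_params(2048, 10483711)
print(skip, count)

# Size of the minimized filesystem: 191461 blocks of 4k, as told by resize2fs.
reduced_bytes = 191461 * 4096
print(reduced_bytes)
```

The first computation gives the skip and count values passed to the first dd call (with bs=1024), and the second one gives the size in bytes of the reduced image written by the last dd call.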
After all this, you have a kernel image, an init ramdisk file and a
minimized root filesystem image file. So you just have to upload them to
your OpenStack image provider (glance):
david@host:~$ glance add disk_format=aki container_format=aki name="debian-squeeze-uec-x86_64-kernel"\
< vmlinuz-2.6.32-5-amd64
Uploading image 'debian-squeeze-uec-x86_64-kernel'==================================================================================[100%] 24.1M/s, ETA 0h 0m 0s
Added new image with ID: 644e59b8-1503-403f-a4fe-746d4dac2ff8
david@host:~$ glance add disk_format=ari container_format=ari name="debian-squeeze-uec-x86_64-initrd"\
< initrd.img-2.6.32-5-amd64
Uploading image 'debian-squeeze-uec-x86_64-initrd'==================================================================================[100%] 26.7M/s, ETA 0h 0m 0s
Added new image with ID: 6f75f1c9-1e27-4cb0-bbe0-d30defa8285c
david@host:~$ glance add disk_format=ami container_format=ami name="debian-squeeze-uec-x86_64" \
kernel_id=644e59b8-1503-403f-a4fe-746d4dac2ff8 ramdisk_id=6f75f1c9-1e27-4cb0-bbe0-d30defa8285c \
< debian-squeeze-amd64-reduced.ami
Uploading image 'debian-squeeze-uec-x86_64'==================================================================================[100%] 42.1M/s, ETA 0h 0m 0s
Added new image with ID: 4abc09ae-ea34-44c5-8d54-504948e8d1f7
And that's it! I now have a working Debian squeeze image in my cloud.
Nazca is a Python library that aims to
help you align data. But what does “align data” mean? For instance,
you have a list of cities, described by their name and their country, and you
would like to find their URI on dbpedia to get more information about them, such as
their longitude and latitude. With two or three cities, it can be done
by hand, but not with hundreds or thousands of cities.
Nazca provides everything you need to do it.
This blog post introduces how this library works and how it can be used.
Once you have understood the main concepts behind this library, don't hesitate
to try Nazca online!
The alignment process is divided into three main steps:
1. Gather and format the data we want to align.
   In this step, we define two sets called the alignset and the
   targetset. The alignset contains our data, and the
   targetset contains the data to which we would like to link it.
2. Compute the similarity between the items gathered. We compute a distance
   matrix between the two sets according to a given distance.
3. Find the items having a high similarity thanks to the distance matrix.
Now, we have to compute the similarity between each pair of items. For that purpose, the
Levenshtein distance [1], which is well suited to computing the distance between a few words, is used.
Such a function is provided in the nazca.distances module.
The next step is to compute the distance matrix according to the Levenshtein
distance. The result is given in the following table.
              Albert Camus   Guillaume Apollinaire   Victor Hugo
Victor Hugo   6              9                       0
Albert Camus  0              8                       6
The alignment process ends by reading the matrix and declaring that items whose
distance is below a given threshold are identical.
[1] Also called the edit distance, because the distance between two words
is equal to the number of single-character edits required to change one
word into the other.
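To make the three steps concrete, here is a minimal pure-Python sketch of the simple case: a plain character-level Levenshtein distance, the distance matrix, and the thresholding. The function names are mine, not Nazca's, and the values may differ from the table above, since Nazca's own function may treat words differently.

```python
def levenshtein(a, b):
    """Plain character-level edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def distance_matrix(alignset, targetset, distance):
    """One row per alignset item, one column per targetset item."""
    return [[distance(a, t) for t in targetset] for a in alignset]


def matches(matrix, threshold):
    """Yield (row, column) pairs whose distance is at most threshold."""
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value <= threshold:
                yield i, j


alignset = ['Victor Hugo', 'Albert Camus']
targetset = ['Albert Camus', 'Guillaume Apollinaire', 'Victor Hugo']
matrix = distance_matrix(alignset, targetset, levenshtein)
print(list(matches(matrix, threshold=1)))  # [(0, 2), (1, 0)]
```

Each returned pair gives the position of an item in the alignset and the position of its counterpart in the targetset.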
The previous case was simple, because we had only one attribute to align (the
name), but it is common to have several attributes to align, such as the name,
the birth date and the birth city. The steps remain the same, except that
three distance matrices will be computed, and items will be represented as
nested lists. See the following example:
alignset = [['Paul Dupont', '14-08-1991', 'Paris'],
            ['Jacques Dupuis', '06-01-1999', 'Bressuire'],
            ['Michel Edouard', '18-04-1881', 'Nantes']]
targetset = [['Dupond Paul', '14/08/1991', 'Paris'],
             ['Edouard Michel', '18/04/1881', 'Nantes'],
             ['Dupuis Jacques ', '06/01/1999', 'Bressuire'],
             ['Dupont Paul', '01-12-2012', 'Paris']]
In such a case, two distance functions are used: the Levenshtein one for the
name and the city, and a temporal one for the birth date [2].
The cdist function of nazca.distances lets us compute those
matrices.
The next step is gathering those three matrices into a global one, called the
global alignment matrix. Thus we have:
     0      1      2      3
0    1      40304  2715   7780
1    2715   43011  0      5091
2    40304  0      43011  48084
Allowing for some spelling mistakes (for example, Dupont and Dupond are very
close), the matching threshold can be set to 1 or 2. Thus we can see that
item 0 in our alignset is the same as item 0 in the targetset, and that
item 1 in the alignset is the same as item 2 of the targetset: the links can be
made!
It's important to notice that even if item 0 of the alignset and item 3
of the targetset have the same name and the same birthplace, they are
unlikely to be identical because of their very different birth dates.
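As an illustration, here is a small sketch of a temporal distance counted in days and of a global matrix built as the element-wise sum of the per-attribute matrices. The sum is my assumption about how the matrices are combined; it is at least consistent with the 7780 found above, which is exactly the number of days between the two birth dates. The function names are illustrative, not Nazca's.

```python
from datetime import date


def parse_date(text):
    """Parse 'dd-mm-yyyy' or 'dd/mm/yyyy' into a date object."""
    day, month, year = (int(part) for part in text.replace('/', '-').split('-'))
    return date(year, month, day)


def days_between(a, b):
    """Temporal distance: absolute number of days between two dates."""
    return abs((parse_date(a) - parse_date(b)).days)


def global_matrix(matrices):
    """Element-wise sum of several distance matrices of identical shape."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]


# Item 0 of the alignset against item 3 of the targetset:
# same name and city, but very different birth dates.
print(days_between('14-08-1991', '01-12-2012'))  # 7780
```

With such a sum, a large distance on a single attribute is enough to push a pair above the threshold.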
You may have noticed that working with matrices by hand, as I did for the example, is a
little bit tedious. The good news is that Nazca does all this work for you. You just
have to give the sets and distance functions and that's all. Another piece of good news
is that the project comes with the functions needed to build the sets!
Just before we start, we will assume the following imports have been done:
from nazca import dataio as aldio    # Functions for input and output data
from nazca import distances as ald   # Functions to compute the distances
from nazca import normalize as aln   # Functions to normalize data
from nazca import aligner as ala     # Functions to align data
On wikipedia, we can find the Goncourt prize winners, and we
would like to establish a link between the winners and their URI on dbpedia
(let's imagine the Goncourt prize winners category does not exist in dbpedia).
We simply copy/paste the winners list from wikipedia into a file and replace all
the separators (- and ,) by #. So, the beginning of our file is:
1906#Jérôme et Jean Tharaud#Dingley, l'illustre écrivain (Cahiers de la Quinzaine)
When using the high-level functions of this library, each item must have at
least two elements: an identifier (the name, or the URI) and the attribute to
compare. With the previous file, we will use the name (column number 1)
both as identifier (we don't have a URI to use as identifier here) and as attribute to align.
This is expressed in Python with the following code:
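To show what "identifier plus attribute" looks like in practice, here is a standard-library sketch that turns such a #-separated file into nested lists. The helper parse_items and its indexes parameter are illustrative only; this is not Nazca's own loading function.

```python
import csv
import io


def parse_items(lines, indexes, delimiter='#'):
    """Build nested lists keeping only the columns listed in indexes."""
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return [[row[i] for i in indexes] for row in reader if row]


# The sample line from the file above, wrapped in a file-like object.
sample = io.StringIO(
    "1906#Jérôme et Jean Tharaud#Dingley, l'illustre écrivain "
    "(Cahiers de la Quinzaine)\n"
)
# Column 1 (the name) serves both as identifier and as attribute to align.
alignset = parse_items(sample, indexes=[1, 1])
print(alignset)  # [['Jérôme et Jean Tharaud', 'Jérôme et Jean Tharaud']]
```

The same column appears twice in each item: once as the identifier, once as the value that will be compared.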
Now, let's build the targetset thanks to a SPARQL query and the dbpedia
end-point. We ask for the list of the French novelists, described by their URI
and their name in French:
Both functions return nested lists as presented before. Now, we have to define
the distance function to be used for the alignment. This is done thanks to a
Python dictionary where the keys are the columns to work on, and the values are
the treatments to apply.
treatments = {1: {'metric': ald.levenshtein}}  # Use a levenshtein on the name
                                               # (column 1)
Finally, the last thing we have to do is to call the alignall function:
alignments = ala.alignall(alignset, targetset,
                          0.4,          # This is the matching threshold
                          treatments,
                          mode=None,    # We'll discuss this later
                          uniq=True     # Get the best results only
                         )
This function returns an iterator over the alignments found. You can
see the results thanks to the following code:
for a, t in alignments:
    print '%s has been aligned onto %s' % (a, t)
It may be important to apply some pre-treatment to the data to align. For
instance, names can be written in lower or upper case, with extra
characters such as punctuation, or with unwanted information in parentheses, and so on. That
is why we provide some functions to normalize your data. The most useful may
be the simplify() function (see the docstring for more information). So the
treatments can be given as follows:
def remove_after(string, sub):
    """ Remove the text after ``sub`` in ``string``
        >>> remove_after('I like cats and dogs', 'and')
        'I like cats'
        >>> remove_after('I like cats and dogs', '(')
        'I like cats and dogs'
    """
    try:
        return string[:string.lower().index(sub.lower())].strip()
    except ValueError:
        return string

treatments = {1: {'normalization': [lambda x: remove_after(x, '('),
                                    aln.simplify],
                  'metric': ald.levenshtein
                 }
             }
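To see what such a normalization list does, here is a self-contained sketch that applies the functions one after the other. basic_simplify is a hypothetical stand-in that only lowercases and strips punctuation; it is not Nazca's simplify(), and normalize is my own helper, not the library's internal mechanism.

```python
import re


def remove_after(string, sub):
    """Remove the text after ``sub`` in ``string`` (same helper as above)."""
    try:
        return string[:string.lower().index(sub.lower())].strip()
    except ValueError:
        return string


def basic_simplify(string):
    """Hypothetical stand-in for a simplification step: lowercase the
    text and replace punctuation with spaces."""
    cleaned = re.sub(r'[^\w\s]', ' ', string.lower())
    return ' '.join(cleaned.split())


def normalize(value, functions):
    """Apply each normalization function in order."""
    for function in functions:
        value = function(value)
    return value


chain = [lambda x: remove_after(x, '('), basic_simplify]
print(normalize('Marcel Proust (1871-1922)', chain))  # marcel proust
```

The order matters: the parenthesis has to be cut off before the punctuation is removed, otherwise the cut point would be gone.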
The previous case with the Goncourt prize winners was pretty simple because
the number of items was small, and the computation fast. But in a more realistic use
case, the number of items to align may be huge (thousands or millions…). In
such a case it's unthinkable to build the global alignment matrix because it
would be too big and the computation would take (at least...) a few days.
So the idea is to make small groups of possibly similar data in order to compute smaller
matrices (i.e. a divide and conquer approach).
For this purpose, we provide some functions to group/cluster data. We have
functions to group text and numerical data.
Here is the code used; we will explain it below:
targetset = aldio.rqlquery('http://demo.cubicweb.org/geonames',
                           """Any U, N, LONG, LAT WHERE X is Location, X name N,
                              X country C, C name "France", X longitude LONG,
                              X latitude LAT, X population > 1000,
                              X feature_class "P", X cwuri U""",
                           indexes=[0, 1, (2, 3)])
alignset = aldio.sparqlquery('http://dbpedia.inria.fr/sparql',
                             """prefix db-owl: <http://dbpedia.org/ontology/>
                                prefix db-prop: <http://fr.dbpedia.org/property/>
                                select ?ville, ?name, ?long, ?lat where {
                                 ?ville db-owl:country <http://fr.dbpedia.org/resource/France> .
                                 ?ville rdf:type db-owl:PopulatedPlace .
                                 ?ville db-owl:populationTotal ?population .
                                 ?ville foaf:name ?name .
                                 ?ville db-prop:longitude ?long .
                                 ?ville db-prop:latitude ?lat .
                                 FILTER (?population > 1000)
                                }""",
                             indexes=[0, 1, (2, 3)])

treatments = {1: {'normalization': [aln.simplify],
                  'metric': ald.levenshtein,
                  'matrix_normalized': False
                 }
             }
results = ala.alignall(alignset, targetset, 3,
                       treatments=treatments,  # As before
                       indexes=(2, 2),         # On which data build the kdtree
                       mode='kdtree',          # The mode to use
                       uniq=True)              # Return only the best results
Let's explain the code. We have two sets, each containing a list of cities we want
to align: the first column is the identifier, the second is the name of the city,
and the last one is the location of the city (longitude and latitude), gathered into
a single tuple.
In this example, we want to build a kdtree on the couple (longitude, latitude)
to divide our data into a few candidates. This clustering is coarse, and is only
used to reduce the number of potential candidates without losing any possible
finer matches.
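The idea of keeping only nearby candidates can be sketched without any dependency. The example below uses a simple grid of cells instead of a kdtree (a deliberately simpler stand-in, not what Nazca does internally): points are indexed by cell, and only the points in the same or adjacent cells are kept as candidates. All names and coordinates are illustrative.

```python
from collections import defaultdict
from math import floor


def build_grid(points, cell_size):
    """Index point positions by the grid cell they fall into."""
    grid = defaultdict(list)
    for index, (lon, lat) in enumerate(points):
        grid[(floor(lon / cell_size), floor(lat / cell_size))].append(index)
    return grid


def candidates(grid, point, cell_size):
    """Return indexes of points located in the same or adjacent cells."""
    cx, cy = floor(point[0] / cell_size), floor(point[1] / cell_size)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid.get((cx + dx, cy + dy), []))
    return sorted(found)


# Approximate (longitude, latitude) pairs, for illustration only.
targets = [(2.35, 48.85),   # Paris
           (-1.55, 47.22),  # Nantes
           (2.33, 48.87)]   # a point close to Paris
grid = build_grid(targets, cell_size=0.5)
print(candidates(grid, (2.34, 48.86), cell_size=0.5))  # [0, 2]
```

The expensive distance (Levenshtein on the names) then only has to be computed against these few candidates rather than against the whole targetset.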
So, in the next step, we define the treatments to apply.
It is the same as before, but we ask for a non-normalized matrix
(i.e. the real output of the Levenshtein distance).
Then, we call the alignall function. indexes is a tuple giving the
position of the point on which the kdtree must be built, and mode is the mode
used to find neighbours [3].
Finally, uniq asks the function to return only the best
candidate (i.e. the one having the shortest distance below the given threshold).
The function outputs a generator yielding tuples where the first element is the
identifier of the alignset item and the second is the targetset one (it
may take some time before yielding the first tuples, because all the computation
must be done…).
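The behaviour described for uniq can be pictured with a small generator that, for each alignset item, yields only the closest targetset item when it is within the threshold. This is a sketch of the idea under my own naming, with a toy distance to keep it short; it is not Nazca's implementation.

```python
def best_alignments(alignset, targetset, distance, threshold):
    """Yield (alignset id, targetset id) for the closest target of each
    item, provided its distance does not exceed the threshold."""
    for a_id, a_value in alignset:
        best_id, best_distance = None, None
        for t_id, t_value in targetset:
            d = distance(a_value, t_value)
            if best_distance is None or d < best_distance:
                best_id, best_distance = t_id, d
        if best_distance is not None and best_distance <= threshold:
            yield a_id, best_id


def mismatch_count(a, b):
    """Toy distance: differing positions plus the difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


alignset = [('a1', 'Nantes'), ('a2', 'Lyon')]
targetset = [('t1', 'Nantes'), ('t2', 'Nantis'), ('t3', 'Marseille')]
for a_id, t_id in best_alignments(alignset, targetset, mismatch_count, 1):
    print('%s has been aligned onto %s' % (a_id, t_id))
# a1 has been aligned onto t1
```

'Nantis' is also within the threshold for 'Nantes', but only the best candidate is returned; 'Lyon' has no candidate close enough and is left unaligned.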
We have also made a little application of Nazca, using CubicWeb. This application provides a user interface for
Nazca, helping you to choose what you want to align. You can use SPARQL or RQL
queries, as in the previous example, or import your own csv file [4]. Once you
have chosen what you want to align, you can click the Next step button to
customize the treatments you want to apply, just as you did before in Python!
Once done, by clicking Next step again, you start the alignment process. Wait a
little bit, and you can either download the results as a csv or rdf file, or
directly see the results online by choosing the html output.
A while ago, I started installing an OpenStack cluster at
Logilab, so our developers can easily play with any kind of
environment. We are planning to improve our Apycot automatic testing
platform so it can use "elastic power". And so on.
I first tried an Ubuntu Precise based setup, since at that time
Debian packages were not really usable. The setup never reached a point
where it could be released as production ready, because I tried a
too complex and bleeding-edge configuration (involving Quantum,
openvswitch, sheepdog)...
Meanwhile, we ran really short of storage capacity. For now, it
mainly consists of hard drives distributed in our 19" Dell racks
(generally with hardware RAID controllers). So I recently purchased a
low-cost storage bay (SuperMicro SC937 with a 6Gb/s JBOD-only HBA)
with 18 spinning hard drives and 4 SSDs. This storage bay is driven
by ZFS on Linux (tip: the SSD-stored ZIL is a requirement to
get decent performance). This storage setup is still under test for
now.
I also went to the last Mini-DebConf in Paris, where Loic Dachary
presented the status of the OpenStack packaging effort in
Debian. This made me want to give OpenStack another try, using
Wheezy and a somewhat simpler setup. But I could not
imagine not using my new ZFS-based storage as a nova volume
provider. It is not available for now in OpenStack (there is a backend
for Solaris, but not for ZFS on Linux). However, this is Python and, in
fact, the current ISCSIDriver backend needs very little to
make it work with zfs instead of lvm as the "elastic" block-volume
provider and manager.
So, I wrote a custom nova volume driver to handle this. As I don't
want the nova-volume daemon to run on my ZFS SAN, I wrote this backend
by mixing the SanISCSIDriver (which manages the storage system via
SSH) and the standard ISCSIDriver (which uses standard Linux iSCSI
target tools). I'm not very fond of the API of the VolumeDriver
(especially the fact that the ISCSIDriver is responsible for 2 roles:
managing block-level volumes and exporting block-level volumes). This
small design flaw (IMHO) is the reason I had to duplicate some code
(not much but...) to implement my ZFSonLinuxISCSIDriver...
one control node, running in a "normal" libvirt-controlled virtual
machine; it is a Wheezy that runs:
nova-api
nova-cert
nova-network
nova-scheduler
nova-volume
glance
postgresql
OpenStack dashboard
one computing node (Dell R310, Xeon X3480, 32G, Wheezy), which runs:
nova-api
nova-network
nova-compute
ZFS-on-Linux SAN (3x raidz1 pools made of 6 1T drives, 2x
(mirrored) 32G SLC SSDs, 2x 120G MLC SSDs for cache); for now, the storage is
exported to the SAN via one 1G ethernet link.
I mainly followed the Debian HOWTO to set up my private cloud. I
mainly tuned the network settings to match my environment (and the
fact my control node lives in a VM, with VLAN stuff handled by the
host).
I easily got a working setup (I must admit that I think my
previous experiment with OpenStack helped a lot when dealing with
custom configurations... and vocabulary; I'm not sure I would have
succeeded "easily" following the HOWTO, but hey, it is a functional
HOWTO, meaning if you do not follow the instructions because you want
special tunings, don't blame the HOWTO).
Compared to the HOWTO, my nova.conf looks like this (as of today):
[DEFAULT]
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova
root_helper=sudo nova-rootwrap
auth_strategy=keystone
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
sql_connection=postgresql://novacommon:XXX@control.openstack.logilab.fr/nova
## Network config
# A nova-network on each compute node
multi_host=true
# VLAN manager
network_manager=nova.network.manager.VlanManager
vlan_interface=eth1
# My ip
my-ip=172.17.10.2
public_interface=eth0
# Dmz & metadata things
dmz_cidr=169.254.169.254/32
ec2_dmz_host=169.254.169.254
metadata_host=169.254.169.254
## More general things
# The RabbitMQ host
rabbit_host=control.openstack.logilab.fr
## Glance
image_service=nova.image.glance.GlanceImageService
glance_api_servers=control.openstack.logilab.fr:9292
use-syslog=true
ec2_host=control.openstack.logilab.fr
novncproxy_base_url=http://control.openstack.logilab.fr:6080/vnc_auto.html
vncserver_listen=0.0.0.0
vncserver_proxyclient_address=127.0.0.1
I had a bit more work to do to make nova-volume work. First, I got hit
by this nasty bug #695791 which is trivial to fix... when you know
how to fix it (I noticed the bug report after I fixed it by myself).
Then, as I wanted the volumes to be stored and exported by my shiny
new ZFS-on-Linux setup, I had to write my own volume driver, which was
quite easy, since it is Python, and the logic to implement was already
provided by the ISCSIDriver class on the one hand, and by the
SanISCSIDriver on the other hand. So I ended up with this first
implementation. This file should be copied to the nova volume package
directory (nova/volume/zol.py):
# vim: tabstop=4 shiftwidth=4 softtabstop=4

# Copyright 2010 United States Government as represented by the
# Administrator of the National Aeronautics and Space Administration.
# Copyright 2011 Justin Santa Barbara
# Copyright 2012 David DOUARD, LOGILAB S.A.
# All Rights Reserved.
#
#    Licensed under the Apache License, Version 2.0 (the "License"); you may
#    not use this file except in compliance with the License. You may obtain
#    a copy of the License at
#
#         http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
#    WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
#    License for the specific language governing permissions and limitations
#    under the License.
"""
Driver for ZFS-on-Linux-stored volumes.

This is mainly a custom version of the ISCSIDriver that uses ZFS as
volume provider, generally accessed over SSH.
"""

import os

from nova import exception
from nova import flags
from nova import utils
from nova import log as logging
from nova.openstack.common import cfg
from nova.volume.driver import _iscsi_location
from nova.volume import iscsi
from nova.volume.san import SanISCSIDriver


LOG = logging.getLogger(__name__)

san_opts = [
    cfg.StrOpt('san_zfs_command',
               default='/sbin/zfs',
               help='The ZFS command.'),
    ]

FLAGS = flags.FLAGS
FLAGS.register_opts(san_opts)


class ZFSonLinuxISCSIDriver(SanISCSIDriver):
    """Executes commands relating to ZFS-on-Linux-hosted ISCSI volumes.

    Basic setup for a ZoL iSCSI server:

    XXX

    Note that current implementation of ZFS on Linux does not handle:

      zfs allow/unallow

    For now, needs to have root access to the ZFS host. The best is to
    use a ssh key with ssh authorized_keys restriction mechanisms to
    limit root access.

    Make sure you can login using san_login & san_password/san_private_key
    """
    ZFSCMD = FLAGS.san_zfs_command

    _local_execute = utils.execute

    def _getrl(self):
        return self._runlocal

    def _setrl(self, v):
        # Accept the usual string spellings of a boolean flag
        if isinstance(v, basestring):
            v = v.lower() in ('true', 't', '1', 'y', 'yes')
        self._runlocal = v
    run_local = property(_getrl, _setrl)

    def __init__(self):
        super(ZFSonLinuxISCSIDriver, self).__init__()
        self.tgtadm.set_execute(self._execute)
        LOG.info("run local = %s (%s)" % (self.run_local, FLAGS.san_is_local))

    def set_execute(self, execute):
        LOG.debug("override local execute cmd with %s (%s)" %
                  (repr(execute), execute.__module__))
        self._local_execute = execute

    def _execute(self, *cmd, **kwargs):
        if self.run_local:
            LOG.debug("LOCAL execute cmd %s (%s)" % (cmd, kwargs))
            return self._local_execute(*cmd, **kwargs)
        else:
            LOG.debug("SSH execute cmd %s (%s)" % (cmd, kwargs))
            check_exit_code = kwargs.pop('check_exit_code', None)
            command = ' '.join(cmd)
            return self._run_ssh(command, check_exit_code)

    def _create_volume(self, volume_name, sizestr):
        zfs_poolname = self._build_zfs_poolname(volume_name)

        # Create a zfs volume
        cmd = [self.ZFSCMD, 'create']
        if FLAGS.san_thin_provision:
            cmd.append('-s')
        cmd.extend(['-V', sizestr])
        cmd.append(zfs_poolname)
        self._execute(*cmd)

    def _volume_not_present(self, volume_name):
        zfs_poolname = self._build_zfs_poolname(volume_name)
        try:
            out, err = self._execute(self.ZFSCMD, 'list', '-H', zfs_poolname)
            if out.startswith(zfs_poolname):
                return False
        except Exception as e:
            # If the volume isn't present
            return True
        return False

    def create_volume_from_snapshot(self, volume, snapshot):
        """Creates a volume from a snapshot."""
        zfs_snap = self._build_zfs_poolname(snapshot['name'])
        zfs_vol = self._build_zfs_poolname(volume['name'])
        self._execute(self.ZFSCMD, 'clone', zfs_snap, zfs_vol)
        self._execute(self.ZFSCMD, 'promote', zfs_vol)

    def delete_volume(self, volume):
        """Deletes a volume."""
        if self._volume_not_present(volume['name']):
            # If the volume isn't present, then don't attempt to delete
            return True
        zfs_poolname = self._build_zfs_poolname(volume['name'])
        self._execute(self.ZFSCMD, 'destroy', zfs_poolname)

    def create_export(self, context, volume):
        """Creates an export for a logical volume."""
        self._ensure_iscsi_targets(context, volume['host'])

        iscsi_target = self.db.volume_allocate_iscsi_target(context,
                                                            volume['id'],
                                                            volume['host'])
        iscsi_name = "%s%s" % (FLAGS.iscsi_target_prefix, volume['name'])
        volume_path = self.local_path(volume)

        # XXX (ddouard) this code is not robust: does not check for
        # existing iscsi targets on the host (ie. not created by
        # nova), but fixing it require a deep refactoring of the iscsi
        # handling code (which is what have been done in cinder)
        self.tgtadm.new_target(iscsi_name, iscsi_target)
        self.tgtadm.new_logicalunit(iscsi_target, 0, volume_path)

        if FLAGS.iscsi_helper == 'tgtadm':
            lun = 1
        else:
            lun = 0
        if self.run_local:
            iscsi_ip_address = FLAGS.iscsi_ip_address
        else:
            iscsi_ip_address = FLAGS.san_ip
        return {'provider_location': _iscsi_location(
                iscsi_ip_address, iscsi_target, iscsi_name, lun)}

    def remove_export(self, context, volume):
        """Removes an export for a logical volume."""
        try:
            iscsi_target = self.db.volume_get_iscsi_target_num(context,
                                                               volume['id'])
        except exception.NotFound:
            LOG.info(_("Skipping remove_export. No iscsi_target " +
                       "provisioned for volume: %d"), volume['id'])
            return

        try:
            # ietadm show will exit with an error
            # this export has already been removed
            self.tgtadm.show_target(iscsi_target)
        except Exception as e:
            LOG.info(_("Skipping remove_export. No iscsi_target " +
                       "is presently exported for volume: %d"), volume['id'])
            return

        self.tgtadm.delete_logicalunit(iscsi_target, 0)
        self.tgtadm.delete_target(iscsi_target)

    def check_for_export(self, context, volume_id):
        """Make sure volume is exported."""
        tid = self.db.volume_get_iscsi_target_num(context, volume_id)
        try:
            self.tgtadm.show_target(tid)
        except exception.ProcessExecutionError, e:
            # Instances remount read-only in this case.
            # /etc/init.d/iscsitarget restart and rebooting nova-volume
            # is better since ensure_export() works at boot time.
            LOG.error(_("Cannot confirm exported volume "
                        "id:%(volume_id)s.") % locals())
            raise

    def local_path(self, volume):
        zfs_poolname = self._build_zfs_poolname(volume['name'])
        zvoldev = '/dev/zvol/%s' % zfs_poolname
        return zvoldev

    def _build_zfs_poolname(self, volume_name):
        zfs_poolname = '%s%s' % (FLAGS.san_zfs_volume_base, volume_name)
        return zfs_poolname
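The command-building part of the driver can be tried on its own, without nova. Here is a standalone sketch that mirrors _create_volume, local_path and _build_zfs_poolname; the function names, the 'tank/nova/' base and the volume name are made up for the illustration.

```python
def build_zfs_create_cmd(volume_base, volume_name, sizestr,
                         thin_provision=False, zfs_command='/sbin/zfs'):
    """Return the argument list for creating a ZFS volume (zvol)."""
    cmd = [zfs_command, 'create']
    if thin_provision:
        cmd.append('-s')          # sparse volume: no space reservation
    cmd.extend(['-V', sizestr])   # -V makes the dataset a block volume
    cmd.append('%s%s' % (volume_base, volume_name))
    return cmd


def zvol_device_path(volume_base, volume_name):
    """Block device under which ZFS on Linux exposes the volume."""
    return '/dev/zvol/%s%s' % (volume_base, volume_name)


# 'tank/nova/' and 'volume-00000001' are made-up values for illustration.
print(' '.join(build_zfs_create_cmd('tank/nova/', 'volume-00000001', '1G',
                                    thin_provision=True)))
# /sbin/zfs create -s -V 1G tank/nova/volume-00000001
print(zvol_device_path('tank/nova/', 'volume-00000001'))
# /dev/zvol/tank/nova/volume-00000001
```

In the driver, the same argument list is either executed locally or joined into a single string and sent over SSH, depending on run_local.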
To configure my nova-volume instance (which runs on the control node,
since it's only a manager), I added these to my nova.conf file:
Note that the private key (/etc/nova/sankey here) is stored
in clear and that it must be readable by the nova user.
Since this key is stored in clear and gives root access to my ZFS host, I
have somewhat limited this root access by using a custom command wrapper
in the .ssh/authorized_keys file.
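My actual wrapper is not shown here, but the principle can be sketched. When a key is restricted with a command="..." option in authorized_keys, sshd runs that command instead of the requested one and passes the original request in the SSH_ORIGINAL_COMMAND environment variable. The check below is a hypothetical example of what such a wrapper could do; the allow-list is an assumption, not my real configuration.

```python
import shlex

# Hypothetical allow-list: only these executables may be run over SSH.
ALLOWED_COMMANDS = ('/sbin/zfs', 'tgtadm')


def is_allowed(original_command):
    """Return True if the requested command starts with an allowed program."""
    try:
        words = shlex.split(original_command)
    except ValueError:
        return False  # unbalanced quotes and the like
    return bool(words) and words[0] in ALLOWED_COMMANDS


print(is_allowed('/sbin/zfs list -H tank/nova/volume-00000001'))  # True
print(is_allowed('rm -rf /'))  # False
```

A real wrapper would read the request from os.environ['SSH_ORIGINAL_COMMAND'] and then execute the split words directly, without going through a shell, so that shell metacharacters in the request are not interpreted.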
I had to set the iscsi_ip_address (the ip address of the ZFS
host), but I think this is a result of something mistakenly
implemented in my ZFSonLinux driver.
Using this config, I can boot an image, create a volume on my ZFS
storage, and attach it to the running image.
I still have to test things like snapshots, (live?) migration and so on. This is a
very first draft implementation which needs to be refined, improved
and tested.
Besides the fact that it needs more tests, I plan to use salt for my OpenStack
deployment (first to add more compute nodes in my cluster), and on the
other side, I'd like to try the salt-cloud so I have a bunch of
Debian images that "just work" (without the need of porting the
cloud-init Ubuntu package).
On the side of my zol driver, I need to port it to Cinder, but I do not have a Folsom install to test it...
In mid-October, I attended the OSDC 2012 conference in Paris. The goal of
this conference is to let developers from different communities meet in a friendly atmosphere. Indeed, I discovered a number of interesting projects and practices.
On Saturday, I discovered JavaScript tools that emphasize
data models, such as AngularJS or BackBone, a quick presentation of the
Go language, the very promising port of the GCC tools to Windows named
MinGW, as well as the new features of GCC 4.8. The day ended with
lightning talks, of which I will mostly remember the perversity of the secret
operators in Perl and the book Javascript Éloquent, entirely in HTML, which
takes advantage of this to include interactive examples and exercises throughout
its content.
On Sunday morning, I opened the proceedings by presenting my current work in
the Mercurial DVCS: Changeset Evolution (PDF of the presentation). This concept lets developers discover history rewriting in a
simple and safe way. Advanced users, for their part, get access to
work and review processes as yet unseen in the DVCS world. My presentation was followed by an introduction to
automatic bug discovery using bisection in DVCSs.
The day continued with a presentation of the Haskell language, of the sigmajs visualization library, and of the SPORE specification, which brings a bit of hope to the specification of REST Web services.
We have been using agile methods since Logilab was founded.
We have sometimes taken liberties with the formalism of the well-known methods, adapting our practices to our customers and to our own specifics. Along the way, we developed our own tools geared towards our software development activity
(version control, ticket workflows, continuous integration, etc.).
It is sometimes good to dive back into the theory and to exchange
good practices about agility. That is why we took part in the Nantes stage of the Agile Tour.
Rather than being mere spectators, we presented our agile practices, which are strongly tied to free software, one undeniable advantage of which is that anyone can modify it to fit their needs.
First, by using the CubicWeb web platform, we were able to build a forge whose data model we control. Management processes can therefore be specific, and application data can be tightly integrated. For example, although the software base is the same, the ticket validation workflow on the extranet is not identical to that of our public forges. As another example, the versions delivered on the extranet appear directly in the intranet tool for business tracking and time accounting (CRM/ERP).
Second, we chose mercurial (hg) largely because it is
written in Python, which allowed us to integrate it with our other tools, but also to
contribute to it (cf. evolve).
BDD (Behaviour Driven Development) goes together with high-level
functional tests that can be described using a syntactic formalism often associated with the
Gherkin language. These test scenarios can then be converted into
code and executed. On the Python side, we found behave and lettuce. As with Selenium (web navigation
test scenarios), the difficulty with this kind of test lies more in their
maintenance than in writing them initially.
This high-level language can nevertheless be a communication channel
with a customer who writes tests. To date, we have had several
customers who took the time to write test sheets that we then
"translate" into unit tests. Even if a customer is not necessarily ready to learn Python and its unit tests, they might
be ready to write tests using this formalism.