
Logilab.org - en

News from Logilab and our Free Software projects, as well as on topics dear to our hearts (Python, Debian, Linux, the semantic web, scientific computing...)

Mercurial 2.3 day 0

2012/05/10 by Pierre-Yves David

I'm now in Copenhagen to attend the Mercurial "2.3" sprint.

About twenty people are attending, including staff from Atlassian, Facebook, Google and Mozilla.

I expect code and discussion on various topics, among them:

  • the development process of Mercurial itself,
  • performance improvements on huge repositories,
  • integration of Obsolete Markers into Mercurial core,
  • improvements on various aspects (merge diff, moving some extensions into core, ...).

I'm of course very interested in the Obsolete Markers topic. I've been working on an experimental implementation for several months. A handful of people have been using them at Logilab for two months and the feedback is very promising.


Debian bug squashing party in Paris

2012/02/16 by Julien Cristau

Logilab will be present at the upcoming Debian BSP in Paris this weekend. This event will focus on fixing as many "release critical" bugs as possible, to help with the preparation of the upcoming Debian 7.0 "wheezy" release. It will also be an opportunity to introduce newcomers to the processes of Debian development and bug fixing, and for contributors in various areas of the project to interact "in real life".

http://www.logilab.org/file/88881?vid=download

The current stable release, Debian 6.0 "squeeze", came out in February 2011. The development of "wheezy" is scheduled to freeze in June 2012, for an eventual release later this year.

Among the things we hope to work on during this BSP, the latest HDF5 release (1.8.8) includes API and packaging changes that require some changes in dependent packages. With the number of scientific packages relying on HDF5, this is a pretty big change, as tracked in this Debian bug.


Introduction To Mercurial Phases (Part III)

2012/02/03 by Pierre-Yves David

This is the final part of a series of posts about the new phases feature we implemented for mercurial 2.1. The first part talks about how phases will help mercurial users, the second part explains how to control them. This one explains what people should take care of when upgrading.

Important upgrade note and backward compatibility

Phases do not require any conversion of your repos. Phase information is not stored in changesets. Everybody using a new client will take advantage of phases on any repository they touch.

However, there are some points you need to be aware of regarding interaction between the old world without phases and the new world with phases:

Talking over the wire to a phaseless server using a phased client

As ever, the Mercurial wire protocol (used to communicate through http and ssh) is fully backward compatible [1]. But as old Mercurial versions are not aware of phases, old servers will always be treated as publishing.

Direct file system access to a phaseless repository using a phased client

A new client has no way to determine which parts of the history should be immutable and which parts should not. In order to fail safely, a new client will mark everything as public when no data is available. For example, in the scenario described in part I, if an old version of Mercurial were used to clone and commit, a new version of Mercurial will see the resulting changesets as public and refuse to rebase them.

Note

Some extensions (like mq) may provide smarter logic to set some changesets to the draft or even secret phases.

The phased client will write phase data to the old repo on its first write operation.
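
The fail-safe default described in this section can be pictured with a small Python model. This is only a sketch under stated assumptions: `load_phases` and its arguments are hypothetical names, and real Mercurial stores and computes phase data differently.

```python
# Toy model of the fail-safe rule; not Mercurial's real code or storage format.

def load_phases(nodes, stored=None):
    """Return a {node: phase} mapping for the given changesets.

    `stored` is None for a repository that no phase-aware client has
    written to yet. With no data available, every changeset is treated
    as public (immutable), which is the safe default.
    """
    if stored is None:
        return {node: "public" for node in nodes}
    return dict(stored)


# Changesets committed with an old client, then read by a new client:
old_repo = ["2afbcfd2af83", "f9f14815935d", "bcd4d53319ec"]
phases = load_phases(old_repo)
print(phases["bcd4d53319ec"])  # public, so a rebase of it would be refused
```

The locally committed changesets come out as public even though they were never shared, which is why the next sections explain how to repair such phases.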

Direct file system access to a phased repository using a phaseless client

Everything works fine except that the old client is unable to see or manipulate phases:

  • Changesets added to the repo inherit the phase of their parents, whatever the parents' phase. This could result in new commits being seen as public or pulled content seen as draft or even secret when a newer client uses the repo again!
  • Changesets pushed to a publishing server won't be set public.
  • Secret changesets are exchanged.
  • Old clients are willing to rewrite immutable changesets (as they don't know that they shouldn't).

So, if you actively rewrite your history or use secret changesets, you should ensure that only new clients touch those repositories where the phase matters.

Fixing phase errors

Several situations can result in bad phases in a repository:

  • When upgrading from phaseless to phased Mercurial, the default phases picked may be too restrictive.
  • When you let an old client touch your repository.
  • When you push to a publishing server that should not actually be publishing.

The easiest way to restore a consistent state is to use the phase command. In most cases, changesets marked as public but absent from your real public server should be moved to draft:

hg phase --force --draft 'public() and outgoing()'

If you have multiple public servers, you can pull from the others to retrieve their phase data too.
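
To see what the revset 'public() and outgoing()' selects, here is a set-based sketch in Python. It does not call Mercurial; the function names and the sample data are illustrative assumptions, and phase propagation to descendants is not modeled.

```python
# Set-based model of the revset 'public() and outgoing()'.
# Names and data are illustrative; this does not call Mercurial.

def wrongly_public(local_phases, remote_nodes):
    """Changesets that are public locally but unknown to the real public server."""
    public = {node for node, phase in local_phases.items() if phase == "public"}
    outgoing = set(local_phases) - set(remote_nodes)
    return public & outgoing


def force_draft(local_phases, nodes):
    """Model of `hg phase --force --draft` applied to `nodes`."""
    fixed = dict(local_phases)
    for node in nodes:
        fixed[node] = "draft"
    return fixed


local = {
    "2afbcfd2af83": "public",  # really present on the public server
    "f9f14815935d": "public",  # marked public by mistake
    "bcd4d53319ec": "public",  # marked public by mistake
}
server = {"2afbcfd2af83"}

bad = wrongly_public(local, server)
repaired = force_draft(local, bad)
print(sorted(bad))
```

Only the changesets missing from the server are moved back to draft; history that is really published stays public.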

Conclusion

Mercurial's phases are a simple concept that adds always-on, transparent safety for most users while not preventing advanced ones from doing whatever they want.

Behind this safety-enabling and useful feature, phases introduce in Mercurial code the concept of sharing mutable parts of history. The introduction of this feature paves the way for advanced history rewriting solutions while allowing safe and easy sharing of mutable parts of history. I'll post about those future features shortly.


[1] You can expect the 0.9.0 version of Mercurial to interoperate cleanly with one released 5 years later.

[Images by Crystian Cruz (cc-nd) and C.J. Peters (cc-by-sa)]


Introduction To Mercurial Phases (Part II)

2012/02/02 by Pierre-Yves David

This is the second part of a series of posts about the new phases feature we implemented for Mercurial 2.1. The first part talks about how phases will help Mercurial users; this second part explains how to control them.

Controlling automatic phase movement

Sometimes it may be desirable to push and pull changesets in the draft phase to share unfinished work. Below are some cases:

  • pushing to continuous integration,
  • pushing changesets for review,
  • user has multiple machines,
  • branch clone.

You can disable publishing behavior in a repository configuration file [1]:

[phases]
publish=False

When a repository is set to non-publishing, people push changesets without altering their phase. draft changesets are pushed as draft and public changesets are pushed as public:

celeste@Chessy ~/palace $ hg showconfig phases
phases.publish=False

babar@Chessy ~/palace $ hg log --graph
@  [draft] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

babar@Chessy ~/palace $ hg outgoing ~celeste/palace/
[public] Add wall color (0d1feb1bca54)
[public] Add a table in the kichen (139ead8a540f)
[draft] add a carpet (3c1b19d5d3f5)
babar@Chessy ~/palace $ hg push ~celeste/palace/
pushing to ~celeste/palace/
searching for changes
adding changesets
adding manifests
adding file changes
added 3 changesets with 3 changes to 2 files
babar@Chessy ~/palace $ hg log --graph
@  [draft] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

celeste@Chessy ~/palace $ hg log --graph
o  [draft] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

And pulling gives the phase as in the remote repository:

celeste@Chessy ~/palace $ hg up 139ead8a540f
celeste@Chessy ~/palace $ echo The wall will be decorated with portraits >> bedroom
celeste@Chessy ~/palace $ hg ci -m 'Decorate the wall.'
created new head
celeste@Chessy ~/palace $ hg log --graph
@  [draft] Decorate the wall. (3389164e92a1)
|
| o  [draft] add a carpet (3c1b19d5d3f5)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

---
babar@Chessy ~/palace $ hg pull ~celeste/palace/
pulling from ~celeste/palace/
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files (+1 heads)
babar@Chessy ~/palace $ hg log --graph
@  [draft] Decorate the wall. (3389164e92a1)
|
| o  [draft] add a carpet (3c1b19d5d3f5)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

Phase information is exchanged during pull and push operations. When a changeset exists on both sides but within different phases, its phase is unified to the lowest [2] phase. For instance, if a changeset is draft locally but public remotely, it is set public:

celeste@Chessy ~/palace $ hg push -r 3389164e92a1
pushing to http://hg.celesteville.com/palace
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
celeste@Chessy ~/palace $ hg log --graph
@  [public] Decorate the wall. (3389164e92a1)
|
| o  [draft] add a carpet (3c1b19d5d3f5)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

---
babar@Chessy ~/palace $ hg pull ~celeste/palace/
pulling from ~celeste/palace/
searching for changes
no changes found
babar@Chessy ~/palace $ hg log --graph
@  [public] Decorate the wall. (3389164e92a1)
|
| o  [draft] add a carpet (3c1b19d5d3f5)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

Note

pull is a read-only operation and does not alter phases in remote repositories.
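
The exchange rule described above (unify to the lowest phase, never exchange secret changesets, leave the remote untouched on pull) can be sketched in a few lines of Python. This is a toy model with hypothetical names, not Mercurial's implementation.

```python
# Toy model of phase exchange on pull; names are illustrative.
ORDER = {"public": 0, "draft": 1, "secret": 2}  # public < draft < secret


def unify(local_phase, remote_phase):
    """Return the lowest of two phases."""
    return min(local_phase, remote_phase, key=ORDER.__getitem__)


def pull(local, remote):
    """Return the local phases after a pull; `remote` is never modified."""
    updated = dict(local)
    for node, remote_phase in remote.items():
        if remote_phase == "secret":
            continue  # secret changesets are not exchanged
        updated[node] = unify(updated.get(node, remote_phase), remote_phase)
    return updated


babar = {"3389164e92a1": "draft", "3c1b19d5d3f5": "draft"}
celeste = {"3389164e92a1": "public", "3c1b19d5d3f5": "draft"}

babar = pull(babar, celeste)
print(babar["3389164e92a1"])  # public
```

The changeset that was draft locally but public remotely ends up public, while the draft carpet changeset keeps its phase on both sides.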

You can also control the phase in which a new changeset is committed. If you don't want new changesets to be pushed without explicit consent, update your configuration with:

[phases]
new-commit=secret

You will need to use manual phase movement before you can push them. See the next section for details.

Note

With what has been done so far for 2.1, the "most practical way to make a new commit secret" is to use:

hg commit --config phases.new-commit=secret

[1] You can use this setting in your user hgrc too.
[2] Phases are ordered as follows: public < draft < secret

Manual phase movement

Most phase movements should be automatic and transparent. However, it is still possible to move phases manually using the hg phase command:

babar@Chessy ~/palace $ hg log --graph
@    [draft] merge with Celeste works (f728ef4eba9f)
|\
o |  [draft] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

babar@Chessy ~/palace $ hg phase --public 3c1b19d5d3f5
babar@Chessy ~/palace $ hg log --graph
@    [draft] merge with Celeste works (f728ef4eba9f)
|\
o |  [public] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

Changesets only move to lower [2] phases during normal operation. By default, the phase command enforces this rule:

babar@Chessy ~/palace $ hg phase --draft 3c1b19d5d3f5
no phases changed
babar@Chessy ~/palace $ hg log --graph
@    [draft] merge with Celeste works (f728ef4eba9f)
|\
o |  [public] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

If you are confident in what you are doing, you can still use the --force switch to override this behavior:

Warning

Phases are designed to avoid forcing people to use hg phase --force. If you need to use --force on a regular basis, you are probably doing something wrong. Read the previous section again to see how to configure your environment for automatic phase movement suitable to your needs.

babar@Chessy ~/palace $ hg phase --verbose --force --draft 3c1b19d5d3f5
phase change for 1 changesets
babar@Chessy ~/palace $ hg log --graph
@    [draft] merge with Celeste works (f728ef4eba9f)
|\
o |  [draft] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

Note that a phase defines a consistent set of revisions in your history graph. This means that to have a public (immutable) changeset, all its ancestors need to be immutable too. Once you have a secret (not exchanged) changeset, all its descendants will be secret too.
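
The consistency rule can be modeled on a small graph. The sketch below is a toy in Python, assuming a hypothetical `set_phase` helper and a `parents` mapping; it only illustrates how one phase change drags ancestors or descendants along, and is not how Mercurial implements it.

```python
# Toy model of phase consistency on a history graph; names are illustrative.
ORDER = {"public": 0, "draft": 1, "secret": 2}  # public < draft < secret


def ancestors(node, parents):
    """All ancestors of `node`, excluding `node` itself."""
    seen, stack = set(), [node]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendants(node, parents):
    """All changesets that have `node` among their ancestors."""
    return {n for n in parents if node in ancestors(n, parents)}


def set_phase(phases, parents, node, target, force=False):
    """Return new phases after moving `node` to `target`."""
    updated = dict(phases)
    current, wanted = ORDER[updated[node]], ORDER[target]
    if wanted < current:
        # Moving down: ancestors may not stay in a higher phase.
        for n in {node} | ancestors(node, parents):
            if ORDER[updated[n]] > wanted:
                updated[n] = target
    elif wanted > current and force:
        # Moving up requires force: descendants may not stay lower.
        for n in {node} | descendants(node, parents):
            if ORDER[updated[n]] < wanted:
                updated[n] = target
    return updated


parents = {
    "139ead8a540f": [],
    "3389164e92a1": ["139ead8a540f"],
    "3c1b19d5d3f5": ["139ead8a540f"],
    "f728ef4eba9f": ["3c1b19d5d3f5", "3389164e92a1"],
}
phases = {"139ead8a540f": "public", "3389164e92a1": "public",
          "3c1b19d5d3f5": "draft", "f728ef4eba9f": "draft"}

published = set_phase(phases, parents, "f728ef4eba9f", "public")
refused = set_phase(published, parents, "3c1b19d5d3f5", "draft")
forced = set_phase(published, parents, "3c1b19d5d3f5", "draft", force=True)
```

Publishing the merge also publishes its draft ancestor, and forcing that ancestor back to draft takes the merge with it. Without `force`, the upward move changes nothing.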

This means that changing the phase of a changeset may result in phase movement for other changesets:

babar@Chessy ~/palace $ hg phase -v --public f728ef4eba9f # merge with Celeste works
phase change for 2 changesets
babar@Chessy ~/palace $ hg log --graph
@    [public] merge with Celeste works (f728ef4eba9f)
|\
o |  [public] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

babar@Chessy ~/palace $ hg phase -vf --draft 3c1b19d5d3f5 # add a carpet
phase change for 2 changesets
babar@Chessy ~/palace $ hg log --graph
@    [draft] merge with Celeste works (f728ef4eba9f)
|\
o |  [draft] add a carpet (3c1b19d5d3f5)
| |
| o  [public] Decorate the wall. (3389164e92a1)
|/
o  [public] Add a table in the kichen (139ead8a540f)
|

The next and final post will explain how older mercurial versions interact with newer versions that support phases.

[Images by Jimmy Smith (cc-by-nd) and Cory Doctorow (cc-by-sa)]


Introduction To Mercurial Phases (Part I)

2012/02/02 by Pierre-Yves David
credit: redshirtjosh, http://www.flickr.com/photos/43273828@N06/4111258568/

On behalf of Logilab, I put a lot of effort into including a new core feature named phases in Mercurial 2.1. Phases are a system for tracking which changesets have been or should be shared. This helps to prevent common mistakes when modifying history (for instance, with the mq or rebase extensions). It will transparently benefit all users. This concept is the first step towards simple, safe and powerful mechanisms for rewriting history in Mercurial.

This series of three blog entries will explain:

  1. how phases will help mercurial users,
  2. how one can control them,
  3. how older mercurial versions interact with newer versions that support phases.

Preventing erroneous history rewriting

credit: anita.priks, http://www.flickr.com/photos/46785534@N06/6358218623/

History rewriting is a common practice in DVCS. However, when done the wrong way, it most commonly results in duplicated history. The phase concept aims to make rewriting history safer. For this purpose Mercurial 2.1 introduces a distinction between the "past" part of your history (that is expected to stay there forever) and the "present" part of the history (that you are currently evolving). The old and immutable part is called public and the mutable part of your history is called draft.

Let's see how this happens using a simple scenario.


A new Mercurial user clones a repository:

babar@Chessy ~ $ hg clone http://hg.celesteville.com/palace
requesting all changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 2 files
updating to branch default
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
babar@Chessy ~ $ cd palace
babar@Chessy ~/palace $ hg log --graph
@  changeset:   1:2afbcfd2af83
|  tag:         tip
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:41:56 2012 +0100
|  summary:     We need a kitchen too.
|
o  changeset:   0:898889b143fb
   user:        Celeste the Queen <Celeste@celesteville.com>
   date:        Wed Jan 25 16:39:07 2012 +0100
   summary:     First description of the throne room

The repository already contains some changesets. Our user makes some improvements and commits them:

babar@Chessy ~/palace $ echo The wall shall be Blue >> throne-room
babar@Chessy ~/palace $ hg ci -m 'Add wall color'
babar@Chessy ~/palace $ echo In the middle stands a three meters round table >> kitchen
babar@Chessy ~/palace $ hg ci -m 'Add a table in the kichen'

But when he tries to push new changesets, he discovers that someone else already pushed one:

babar@Chessy ~/palace $ hg push
pushing to http://hg.celesteville.com/palace
searching for changes
abort: push creates new remote head bcd4d53319ec!
(you should pull and merge or use push -f to force)
babar@Chessy ~/palace $ hg pull
pulling from http://hg.celesteville.com/palace
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
babar@Chessy ~/palace $ hg log --graph
o  changeset:   4:0a5b3d7e4e5f
|  tag:         tip
|  parent:      1:2afbcfd2af83
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:58:23 2012 +0100
|  summary:     Some bedroom description.
|
| @  changeset:   3:bcd4d53319ec
| |  user:        Babar the King <babar@celesteville.com>
| |  date:        Wed Jan 25 16:52:02 2012 +0100
| |  summary:     Add a table in the kichen
| |
| o  changeset:   2:f9f14815935d
|/   user:        Babar the King <babar@celesteville.com>
|    date:        Wed Jan 25 16:51:51 2012 +0100
|    summary:     Add wall color
|
o  changeset:   1:2afbcfd2af83
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:41:56 2012 +0100
|  summary:     We need a kitchen too.
|
o  changeset:   0:898889b143fb
   user:        Celeste the Queen <Celeste@celesteville.com>
   date:        Wed Jan 25 16:39:07 2012 +0100
   summary:     First description of the throne room

Note

From here on this scenario becomes very unlikely. Mercurial is simple enough for a new user not to be that confused by such a trivial situation. But we keep the example simple to focus on phases.

Recently, our new user read some hype blog about "rebase" and the benefit of linear history. So, he decides to rewrite his history instead of merging.

Despite reading the wonderful rebase help, our new user makes the wrong decision when it comes to using it. He decides to rebase the remote changeset 0a5b3d7e4e5f:"Some bedroom description." on top of his local changeset.

With previous versions of mercurial, this mistake was allowed and would result in a duplication of the changeset 0a5b3d7e4e5f:"Some bedroom description."

babar@Chessy ~/palace $ hg rebase -s 4 -d 3
babar@Chessy ~/palace $ hg push
pushing to http://hg.celesteville.com/palace
searching for changes
abort: push creates new remote head bcd4d53319ec!
(you should pull and merge or use push -f to force)
babar@Chessy ~/palace $ hg pull
pulling from http://hg.celesteville.com/palace
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
babar@Chessy ~/palace $ hg log --graph
@  changeset:   5:55d9bae1e1cb
|  tag:         tip
|  parent:      3:bcd4d53319ec
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:58:23 2012 +0100
|  summary:     Some bedroom description.
|
| o  changeset:   4:0a5b3d7e4e5f
| |  parent:      1:2afbcfd2af83
| |  user:        Celeste the Queen <Celeste@celesteville.com>
| |  date:        Wed Jan 25 16:58:23 2012 +0100
| |  summary:     Some bedroom description.
| |
o |  changeset:   3:bcd4d53319ec
| |  user:        Babar the King <babar@celesteville.com>
| |  date:        Wed Jan 25 16:52:02 2012 +0100
| |  summary:     Add a table in the kichen
| |
o |  changeset:   2:f9f14815935d
|/   user:        Babar the King <babar@celesteville.com>
|    date:        Wed Jan 25 16:51:51 2012 +0100
|    summary:     Add wall color
|
o  changeset:   1:2afbcfd2af83
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:41:56 2012 +0100
|  summary:     We need a kitchen too.
|
o  changeset:   0:898889b143fb
   user:        Celeste the Queen <Celeste@celesteville.com>
   date:        Wed Jan 25 16:39:07 2012 +0100
   summary:     First description of the throne room

In more complicated setups this is a fairly common mistake, even in big and successful projects and with other DVCSs.

In the new Mercurial version the user won't be able to make this mistake anymore. Trying to rebase the wrong way will result in:

babar@Chessy ~/palace $ hg rebase -s 4 -d 3
abort: can't rebase immutable changeset 0a5b3d7e4e5f
(see hg help phases for details)

The correct rebase still works as expected:

babar@Chessy ~/palace $ hg rebase -s 2 -d 4
babar@Chessy ~/palace $ hg log --graph
@  changeset:   4:139ead8a540f
|  tag:         tip
|  user:        Babar the King <babar@celesteville.com>
|  date:        Wed Jan 25 16:52:02 2012 +0100
|  summary:     Add a table in the kichen
|
o  changeset:   3:0d1feb1bca54
|  user:        Babar the King <babar@celesteville.com>
|  date:        Wed Jan 25 16:51:51 2012 +0100
|  summary:     Add wall color
|
o  changeset:   2:0a5b3d7e4e5f
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:58:23 2012 +0100
|  summary:     Some bedroom description.
|
o  changeset:   1:2afbcfd2af83
|  user:        Celeste the Queen <Celeste@celesteville.com>
|  date:        Wed Jan 25 16:41:56 2012 +0100
|  summary:     We need a kitchen too.
|
o  changeset:   0:898889b143fb
   user:        Celeste the Queen <Celeste@celesteville.com>
   date:        Wed Jan 25 16:39:07 2012 +0100
   summary:     First description of the throne room

What is happening here:

  • Changeset 0a5b3d7e4e5f from Celeste was set to the public phase because it was pulled from the outside. The public phase is immutable.
  • Changesets f9f14815935d and bcd4d53319ec (rebased as 0d1feb1bca54 and 139ead8a540f) have been committed locally and haven't been transmitted from this repository to another. As such, they are still in the draft phase. Unlike the public phase, the draft phase is mutable.
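
The check behind the refused rebase can be sketched as follows. This is a simplified Python model, assuming a hypothetical `check_rebase` helper and a `children` mapping; only the abort message is taken from the output shown above.

```python
# Toy model of the rebase guard; names are illustrative, not Mercurial's API.

class Abort(Exception):
    """Raised when a history rewrite would touch immutable changesets."""


def check_rebase(source, children, phases):
    """Return the changesets to rebase (source and descendants).

    Raises Abort if any of them is in the public phase.
    """
    to_rebase, todo = set(), [source]
    while todo:
        node = todo.pop()
        if node in to_rebase:
            continue
        if phases[node] == "public":
            raise Abort("can't rebase immutable changeset %s" % node)
        to_rebase.add(node)
        todo.extend(children.get(node, ()))
    return to_rebase


children = {
    "2afbcfd2af83": ["f9f14815935d", "0a5b3d7e4e5f"],
    "f9f14815935d": ["bcd4d53319ec"],
}
phases = {
    "2afbcfd2af83": "public",
    "0a5b3d7e4e5f": "public",  # pulled from the server
    "f9f14815935d": "draft",   # committed locally
    "bcd4d53319ec": "draft",   # committed locally
}

try:
    check_rebase("0a5b3d7e4e5f", children, phases)
except Abort as exc:
    message = str(exc)

allowed = check_rebase("f9f14815935d", children, phases)
```

Rebasing the pulled changeset is refused, while rebasing the two local draft changesets is allowed.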

Let's watch the whole action in slow motion, paying attention to phases:

babar@Chessy ~ $ cat >> ~/.hgrc << EOF
[ui]
username=Babar the King <babar@celesteville.com>
logtemplate='[{phase}] {desc} ({node|short})\\n'
EOF

First, changesets cloned from a public server are public:

babar@Chessy ~ $ hg clone --quiet http://hg.celesteville.com/palace
babar@Chessy ~ $ cd palace
babar@Chessy ~/palace $ hg log --graph
@  [public] We need a kitchen too. (2afbcfd2af83)
|
o  [public] First description of the throne room (898889b143fb)

Second, new changesets committed locally are in the draft phase:

babar@Chessy ~/palace $ echo The wall shall be Blue >> throne-room
babar@Chessy ~/palace $ hg ci -m 'Add wall color'
babar@Chessy ~/palace $ echo In the middle stand a three meters round table >> kitchen
babar@Chessy ~/palace $ hg ci -m 'Add a table in the kichen'
babar@Chessy ~/palace $ hg log --graph
@  [draft] Add a table in the kichen (bcd4d53319ec)
|
o  [draft] Add wall color (f9f14815935d)
|
o  [public] We need a kitchen too. (2afbcfd2af83)
|
o  [public] First description of the throne room (898889b143fb)

Third, changesets pulled from a public server are public:

babar@Chessy ~/palace $ hg pull --quiet
babar@Chessy ~/palace $ hg log --graph
o  [public] Some bedroom description. (0a5b3d7e4e5f)
|
| @  [draft] Add a table in the kichen (bcd4d53319ec)
| |
| o  [draft] Add wall color (f9f14815935d)
|/
o  [public] We need a kitchen too. (2afbcfd2af83)
|
o  [public] First description of the throne room (898889b143fb)

Note

rebase preserves the phase of rebased changesets:

babar@Chessy ~/palace $ hg rebase -s 2 -d 4
babar@Chessy ~/palace $ hg log --graph
@  [draft] Add a table in the kichen (139ead8a540f)
|
o  [draft] Add wall color (0d1feb1bca54)
|
o  [public] Some bedroom description. (0a5b3d7e4e5f)
|
o  [public] We need a kitchen too. (2afbcfd2af83)
|
o  [public] First description of the throne room (898889b143fb)

Finally, once pushed to the public server, changesets are set to the public (immutable) phase:

babar@Chessy ~/palace $ hg push
pushing to http://hg.celesteville.com/palace
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 2 files
babar@Chessy ~/palace $ hg log --graph
@  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|
o  [public] Some bedroom description. (0a5b3d7e4e5f)
|
o  [public] We need a kitchen too. (2afbcfd2af83)
|
o  [public] First description of the throne room (898889b143fb)

To summarize:

  • Changesets exchanged with the outside are public and immutable.
  • Changesets committed locally are draft until exchanged with the outside.
  • As a user, you should not worry about phases. Phases move transparently.
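
The three rules in this summary fit in a tiny state model. The Python sketch below assumes a publishing server only, and `ToyRepo` with its methods is a hypothetical name chosen for illustration.

```python
# Toy model of automatic phase movement with a publishing server.

class ToyRepo:
    def __init__(self):
        self.phases = {}  # node -> phase

    def commit(self, node):
        # New local changesets start in the draft phase.
        self.phases[node] = "draft"

    def pull_from_publishing(self, nodes):
        # Anything received from a publishing server is public.
        for node in nodes:
            self.phases[node] = "public"

    def push_to_publishing(self):
        # Pushed draft changesets become public (immutable).
        pushed = [n for n, p in self.phases.items() if p == "draft"]
        for node in pushed:
            self.phases[node] = "public"
        return pushed


repo = ToyRepo()
repo.pull_from_publishing(["898889b143fb", "2afbcfd2af83"])  # clone
repo.commit("0d1feb1bca54")
repo.commit("139ead8a540f")
before_push = dict(repo.phases)
pushed = repo.push_to_publishing()
```

Before the push the two local commits are draft; afterwards every changeset in the repository is public, with no explicit phase command involved.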

Preventing premature exchange of history

credit: Richard Elzey, http://www.flickr.com/photos/elzey/3516256055/

The public phase prevents users from accidentally rewriting public history. It's a good step forward, but phases can go further: they can prevent you from accidentally making history public in the first place.

For this purpose, a third phase is available, the secret phase. To explain it, I'll use the mq extension which is nicely integrated with this secret phase:

Our fellow user enables the mq extension:

babar@Chessy ~/palace $ vim ~/.hgrc
babar@Chessy ~/palace $ cat ~/.hgrc
[ui]
username=Babar the King <babar@celesteville.com>
[extensions]
# enable the mq extension included with Mercurial
hgext.mq=
[mq]
# Enable secret phase integration.
# This integration is off by default for backward compatibility.
secret=true

New patches (not general commits) are now created as secret:

babar@Chessy ~/palace $ echo A red carpet on the floor. >> throne-room
babar@Chessy ~/palace $ hg qnew -m 'add a carpet' carpet.diff
babar@Chessy ~/palace $ hg log --graph
@  [secret] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

This secret changeset is excluded from outgoing and push:

babar@Chessy ~/palace $ hg outgoing
comparing with http://hg.celesteville.com/palace
searching for changes
no changes found (ignored 1 secret changesets)
babar@Chessy ~/palace $ hg push
pushing to http://hg.celesteville.com/palace
searching for changes
no changes found (ignored 1 secret changesets)

And other users do not see it:

celeste@Chessy ~/palace $ hg incoming ~babar/palace/
comparing with ~babar/palace
searching for changes
[public] Add wall color (0d1feb1bca54)
[public] Add a table in the kichen (139ead8a540f)
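
The filtering shown in these transcripts can be sketched as a small Python function. It is a toy model with hypothetical names; the only detail borrowed from Mercurial is the "ignored N secret changesets" count seen in the output above.

```python
# Toy model of how secret changesets are left out of outgoing/push.

def outgoing(local_phases, remote_nodes):
    """Return (changesets to send, number of secret changesets ignored)."""
    missing = [n for n in local_phases if n not in remote_nodes]
    sendable = [n for n in missing if local_phases[n] != "secret"]
    return sendable, len(missing) - len(sendable)


babar = {
    "0d1feb1bca54": "public",
    "139ead8a540f": "public",
    "3c1b19d5d3f5": "secret",  # the mq patch
}
server = {"0d1feb1bca54", "139ead8a540f"}

to_send, ignored = outgoing(babar, server)
if not to_send:
    print("no changes found (ignored %d secret changesets)" % ignored)
```

With only the secret patch missing from the server, nothing is sent and one changeset is reported as ignored.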

The mq integration takes care of phase movement for the user. Changesets are made draft by qfinish:

babar@Chessy ~/palace $ hg qfinish .
babar@Chessy ~/palace $ hg log --graph
@  [draft] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

And changesets are made secret again by qimport:

babar@Chessy ~/palace $ hg qimport -r 3c1b19d5d3f5
babar@Chessy ~/palace $ hg log --graph
@  [secret] add a carpet (3c1b19d5d3f5)
|
o  [public] Add a table in the kichen (139ead8a540f)
|
o  [public] Add wall color (0d1feb1bca54)
|

As expected, mq refuses to qimport public changesets:

babar@Chessy ~/palace $ hg qimport -r 139ead8a540f
abort: revision 4 is not mutable

In the next part I'll detail how to control phase movement.


Generating a user interface from a Yams model

2012/01/09 by Nicolas Chauvat

Yams is a pythonic way to describe an entity-relationship model. It is used at the core of the CubicWeb semantic web framework in order to automate lots of things, including the generation and validation of forms. Although we have been using the MVC design pattern to write user interfaces with Qt and Gtk before we started CubicWeb, we never got to reuse Yams. I am on my way to fix this.

Here is the simplest possible example that generates a user interface (using dialog and python-dialog) to input data described by a Yams data model.

First, let's write a function that builds the data model:

def mk_datamodel():
    from yams.buildobjs import EntityType, RelationDefinition, Int, String
    from yams.reader import build_schema_from_namespace

    class Question(EntityType):
        number = Int()
        text = String()

    class Form(EntityType):
        title = String()

    class in_form(RelationDefinition):
        subject = 'Question'
        object = 'Form'
        cardinality = '*1'

    return build_schema_from_namespace(vars().items())

Here is what you get using graphviz or xdot to display the schema of that data model with:

import os
from yams import schema2dot

datamodel = mk_datamodel()
schema2dot.schema2dot(datamodel, '/tmp/toto.dot')
os.system('xdot /tmp/toto.dot')
http://www.logilab.org/file/87002?vid=download

To make a step in the direction of genericity, let's add a class that abstracts the dialog API:

import sys

class InterfaceDialog:
    """Dialog-based Interface"""
    def __init__(self, dlg):
        self.dlg = dlg

    def input_list(self, invite, options):
        # a radiolist needs at least one option to choose from
        assert len(options) != 0, str(invite)
        choice = self.dlg.radiolist(invite, list=options, selected=1)
        if choice is not None:
            return choice.lower()
        else:
            raise Exception('operation cancelled')

    def input_string(self, invite, default):
        # decode the raw input using the terminal encoding
        return self.dlg.inputbox(invite, init=default).decode(sys.stdin.encoding)

And now let's put everything together:

datamodel = mk_datamodel()

import dialog
ui = InterfaceDialog(dialog.Dialog())
ui.dlg.setBackgroundTitle('Dialog Interface with Yams')

objs = []
for entitydef in datamodel.entities():
    # skip final (basic) types such as String and Int
    if entitydef.final:
        continue
    obj = {}
    for attr in entitydef.attribute_definitions():
        if attr[1].type in ('String','Int'):
            obj[str(attr[0])] = ui.input_string('%s.%s' % (entitydef,attr[0]), '')
    # keep what was entered so that the final print shows it
    objs.append(obj)
    try:
        entitydef.check(obj)
    except Exception as exc:
        # display the validation errors to the user
        ui.dlg.scrollbox(str(exc))

print(objs)
http://www.logilab.org/file/87001?vid=download

The result is a program that will prompt the user for the title of a form and the text/number of a question, then enforce the type constraints and display the inconsistencies.
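
To make "enforce the type constraints and display the inconsistencies" concrete without a terminal or Yams installed, here is a library-free Python sketch of the same idea. The `SCHEMA` dictionary and the `check` function are hypothetical stand-ins and do not reflect the Yams API.

```python
# Library-free stand-in for schema-driven validation; not the Yams API.
SCHEMA = {
    "Question": {"number": int, "text": str},
    "Form": {"title": str},
}


def check(entity_type, values):
    """Return {attribute: problem} for values that break the declared types."""
    errors = {}
    for attr, expected in SCHEMA[entity_type].items():
        value = values.get(attr)
        if not isinstance(value, expected):
            errors[attr] = "expected %s, got %r" % (expected.__name__, value)
    return errors


# Everything typed into an input box arrives as text, so an Int
# attribute fails the check until it is converted.
raw = {"number": "42", "text": "Why?"}
errors = check("Question", raw)
converted = dict(raw, number=int(raw["number"]))
```

The raw input is rejected for `number` only, and the converted dictionary passes, which mirrors what the dialog program reports to the user.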

The above is very simple and does very little, but if you read the documentation of Yams and if you think about generating the UI with Gtk or Qt instead of dialog, or if you have used the form mechanism of CubicWeb, you'll understand that this proof of concept opens a door to a lot of possibilities.

I will come back to this topic in a later article and give an example of integrating the above with pigg, a simple MVC library for Gtk, to make the programming of user-interfaces even more declarative and bug-free.


Interesting things seen at the Afpy Computer Camp

2011/11/28 by Pierre-Yves David

This summer I spent three days in Burgundy at the Afpy Computer Camp. This yearly meeting gathers French-speaking Python developers for talking and coding. The main points of this 2011 edition were:

http://www.afpy.org/_public/images/logo_afpy.png

The new IPython 0.11 was shown by Olivier Grisel. This new version contains lots of impressive features like inline figures, asynchronous execution, exportable sessions, and a web-browser-based client. IPython was also presented by its main author Fernando Perez during his keynote talk at EuroSciPy. Since then Logilab got involved with IPython. We contributed to the Debian packaging of IPython dependencies and we joined the discussion about reStructuredText formatting for notebooks.


Tarek Ziade bootstrapped his new Red Barrel project, a small framework to build modern web services with multiple back-ends, including the new socket.io protocol.

Alexis Métaireau and Feth Arezki discovered their common interest in account-tracking applications. The discussion resulted in a first release of I hate money a few months later.

For my part, I spent most of my time working with Boris Feld on the Python Testing Infrastructure, a continuous integration tool to test Python distributions available on PyPI.


The yearly Afpy Computer Camp is an event intended for Python developers, but the Afpy also organizes events for non-Python developers. The next one is tonight in Paris at La Cantine: Vous reprendrez bien un peu de python ? See you tonight?


Python in Finance (and Derivative Analytics)

2011/10/25 by Damien Garaud

The Logilab team attended (and co-organized) EuroSciPy 2011, at the end of August in Paris.

We saw some interesting posters and a presentation dealing with Python in finance and derivative analytics [1].

In order to debunk the idea that "all computation libraries dedicated to financial applications must be written in C/C++ or some other compiled programming language", I would like to introduce a more Pythonic way.

You may know that financial applications such as risk management have in most cases high computational needs. For instance, it can be necessary to quickly perform a large number of Monte Carlo simulations to evaluate an American option in a few seconds.

The Python community provides several reliable and efficient libraries and packages dedicated to numerical computations:

  • the well-known SciPy and NumPy libraries. They provide a complete set of tools to work with matrices, linear algebra operations, singular value decompositions, multivariate regression models, ...
  • scikits is a set of add-on toolkits for SciPy. For instance, there are statistical models in the statsmodels package, a toolkit dedicated to time series manipulation and another one dedicated to numerical optimization;
  • pandas is a recent Python package which provides "fast, flexible, and expressive data structures designed to make working with relational or labeled data both easy and intuitive". pandas uses Cython to improve its performance. Moreover, pandas has been used extensively in production in financial applications;
  • Cython is a way to write C extensions for the Python language. Since you write Cython code in much the same way as you write Python code, it is easy to use. Is it fast? Yes! I compared a simple example from Cython's official documentation, a piece of code which computes the first k prime numbers, with a pure-Python version. In this unofficial benchmark, the Cython code is almost thirty times faster than the pure-Python code. Furthermore, you can use NumPy in Cython code!
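
A minimal sketch of what the pure-Python side of such a comparison could look like, assuming a trial-division approach similar to the primes example in the Cython tutorial. The function name and structure are illustrative; this is not the code that was benchmarked.

```python
def primes(kmax):
    """Return the first `kmax` prime numbers using trial division."""
    found = []
    candidate = 2
    while len(found) < kmax:
        # A candidate is prime when no previously found prime divides it.
        if all(candidate % p != 0 for p in found):
            found.append(candidate)
        candidate += 1
    return found


print(primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The inner loop does nothing but integer arithmetic on a growing list, which is the kind of code where static C types in Cython tend to pay off.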

I believe that, thanks to several useful tools and libraries, Python can be used for numerical computation, even in finance (both research and production). The code is easy to maintain without sacrificing performance.

Note that you can find some other references on Visixion webpages:


Rss feeds aggregator based on Scikits.learn and CubicWeb

2011/10/17 by Vincent Michel

During EuroSciPy, the Logilab team presented an original approach for querying news using semantic information: "Rss feeds aggregator based on Scikits.learn and CubicWeb" by Vincent Michel. This work is based on two major pieces of software:

  • CubicWeb, the pythonic semantic web framework, is used to store and query Dbpedia information. CubicWeb is able to reconstruct links from rdf/nt files, and can easily execute complex queries in a database with more than 8 million entities and 75 million links when using a PostgreSQL backend.
  • Scikits.learn is a cutting-edge Python toolbox for machine learning. It provides algorithms that are simple and easy to use.

Based on these tools, we built a pure Python application to query the news:

  • Named Entities are extracted from RSS articles of a few mainstream English newspapers (New York Times, Reuters, BBC News, etc.). For each group of words in an article, we check if a Dbpedia entry has the same label. If so, we create a semantic link between the article and the Dbpedia entry.
  • An occurrence matrix of "RSS Articles" times "Named Entities" is constructed and can be fed to several machine learning algorithms (MeanShift algorithm, Hierarchical Clustering) in order to provide original and informative views of recent events.
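
The two steps above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the labels and articles are made-up sample data, entity detection is reduced to a case-insensitive substring test, and the function names are hypothetical, not those of the actual application.

```python
def extract_entities(text, known_labels):
    """Return the known labels that occur in `text` (case-insensitive)."""
    lowered = text.lower()
    return {label for label in known_labels if label.lower() in lowered}


def occurrence_matrix(articles, known_labels):
    """Build a binary articles x entities matrix as a list of rows."""
    columns = sorted(known_labels)
    rows = []
    for text in articles:
        present = extract_entities(text, known_labels)
        rows.append([1 if label in present else 0 for label in columns])
    return columns, rows


# Hypothetical data standing in for Dbpedia labels and RSS articles.
labels = {"Barack Obama", "Paris", "NASA"}
articles = [
    "Barack Obama visited Paris on Monday.",
    "NASA announced a new mission.",
]
columns, matrix = occurrence_matrix(articles, labels)
print(columns)  # ['Barack Obama', 'NASA', 'Paris']
print(matrix)   # [[1, 0, 1], [0, 1, 0]]
```

Each row describes one article, so rows can be handed directly to a clustering algorithm that groups articles mentioning similar entities.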

Moreover, queries may be used jointly with semantic information from Dbpedia:

  • All musical artists in the news:

    DISTINCT Any E, R WHERE E appears_in_rss R, E has_type T, T label "musical artist"
    
  • All living office holder persons in the news:

    DISTINCT Any E WHERE E appears_in_rss R, E has_type T, T label "office holder", E has_subject C, C label "Living people"
    
  • All news that talk about Barack Obama and any scientist:

    DISTINCT Any R WHERE E1 label "Barack Obama", E1 appears_in_rss R, E2 appears_in_rss R, E2 has_type T, T label "scientist"
    
  • All news that talk about a drug:

    Any X, R WHERE X appears_in_rss R, X has_type T, T label "drug"
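
To make the meaning of such a query concrete, here is a rough emulation of the last one over in-memory sets of (subject, object) pairs. The data and the function name are invented for the example, and this says nothing about how CubicWeb actually evaluates RQL.

```python
# Hypothetical relations, each stored as a set of (subject, object) pairs.
appears_in_rss = {("aspirin", "article-1"), ("Barack Obama", "article-2")}
has_type = {("aspirin", "type-drug"), ("Barack Obama", "type-office-holder")}
label = {("type-drug", "drug"), ("type-office-holder", "office holder")}


def entities_in_news(type_label):
    """Emulate: Any X, R WHERE X appears_in_rss R, X has_type T, T label <type_label>."""
    # T label "<type_label>"
    wanted_types = {t for (t, name) in label if name == type_label}
    # X has_type T
    typed = {x for (x, t) in has_type if t in wanted_types}
    # X appears_in_rss R
    return sorted((x, r) for (x, r) in appears_in_rss if x in typed)


print(entities_in_news("drug"))  # [('aspirin', 'article-1')]
```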
    

Such a tool may be used for informetrics and news analysis. Feel free to download the complete slides of the presentation.


Helping pylint to understand things it doesn't

2011/10/10 by Sylvain Thenault

The latest release of logilab-astng (0.23), the underlying source code representation library used by PyLint, provides a new API that may change pylint users' life in the near future...

It aims to allow registration of functions that will be called after a module has been parsed. While this sounds dumb, it gives a chance to fix/enhance the understanding PyLint has about your code.

I see this as a major step towards greatly enhanced code analysis, improving the situation where PyLint users know that when running it against code using their favorite framework (who said CubicWeb? :p ), they should expect a bunch of false positives because of black magic in the ORM or in decorators or whatever else. There are also places in the Python standard library where dynamic code can cause false positives in PyLint.

The problem

Let's take a simple example, and see how we can improve things using the new API. The following code:

import hashlib

def hexmd5(value):
    """return md5 checksum hexadecimal digest of the given value"""
    return hashlib.md5(value).hexdigest()

def hexsha1(value):
    """return sha1 checksum hexadecimal digest of the given value"""
    return hashlib.sha1(value).hexdigest()

gives the following output when analyzed through pylint:

[syt@somewhere ~]$ pylint -E example.py
No config file found, using default configuration
************* Module smarter_astng
E:  5,11:hexmd5: Module 'hashlib' has no 'md5' member
E:  9,11:hexsha1: Module 'hashlib' has no 'sha1' member

However:

[syt@somewhere ~]$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import smarter_astng
>>> smarter_astng.hexmd5('hop')
'5f67b2845b51a17a7751f0d7fd460e70'
>>> smarter_astng.hexsha1('hop')
'cffb6b20e0eef296772f6c1457cdde0049bdfb56'

The code runs fine... Why does pylint bother me then? If we take a look at the hashlib module, we see that there is no sha1 or md5 defined in there. They are defined dynamically according to the availability of the OpenSSL library, in order to use the fastest available implementation, using code like:

for __func_name in __always_supported:
    # try them all, some may not work due to the OpenSSL
    # version not supporting that algorithm.
    try:
        globals()[__func_name] = __get_hash(__func_name)
    except ValueError:
        import logging
        logging.exception('code for hash %s was not found.', __func_name)

Honestly I don't blame PyLint for not understanding this kind of magic. The situation in this particular case could be improved, but that's some tedious work, and there will always be "similar but different" cases that won't be understood.
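
The gap between what a static reader sees and what exists at runtime can be reproduced with the standard library alone. The sketch below is a deliberately naive static scan using the stdlib ast module, not what PyLint or astng really do, and the toy module source is made up to mimic the hashlib pattern.

```python
import ast

# Toy module that binds names through globals(), like hashlib does.
SOURCE = '''
def _make_hash(name):
    def constructor(value):
        return (name, value)
    return constructor

for _name in ("md5", "sha1"):
    globals()[_name] = _make_hash(_name)
'''

# Static view: names bound by top-level definitions, assignments or loops.
tree = ast.parse(SOURCE)
static_names = set()
for node in tree.body:
    if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
        static_names.add(node.name)
    elif isinstance(node, ast.Assign):
        for target in node.targets:
            if isinstance(target, ast.Name):
                static_names.add(target.id)
    elif isinstance(node, ast.For) and isinstance(node.target, ast.Name):
        static_names.add(node.target.id)

# Runtime view: actually execute the module body in a fresh namespace.
namespace = {}
exec(SOURCE, namespace)

print("md5" in static_names)  # False: invisible without running the code
print("md5" in namespace)     # True: defined once the loop has run
```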

The solution

The good news is that thanks to the new astng callback, I can help it be smarter! See the code below:

from logilab.astng import MANAGER, scoped_nodes

def hashlib_transform(module):
    if module.name == 'hashlib':
        for hashfunc in ('sha1', 'md5'):
            module.locals[hashfunc] = [scoped_nodes.Class(hashfunc, None)]

def register(linter):
    """called when loaded by pylint --load-plugins, register our transformation
    function here
    """
    MANAGER.register_transformer(hashlib_transform)

What's in there?

  • A function that will be called with each astng module built during a pylint execution, i.e. not only the ones that you analyze, but also those accessed for type inference.
  • This transformation function is fairly simple: if the module is the 'hashlib' module, it will insert into its locals dictionary a fake class node for each desired name.
  • It is registered using the register_transformer method of astng's MANAGER (the central access point to built syntax trees). This is done in the pylint plugin API register callback function (called when the module is loaded using 'pylint --load-plugins').
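
The mechanism boils down to a callback registry applied to every module that gets built. Here is a toy model of it: TransformManager and its build method are hypothetical stand-ins written for this illustration, and a plain dict plays the role of a module node. It is not astng's implementation.

```python
class TransformManager(object):
    """Hypothetical stand-in for a manager that post-processes built modules."""

    def __init__(self):
        self._transformers = []

    def register_transformer(self, func):
        self._transformers.append(func)

    def build(self, name):
        # A plain dict stands in for a parsed module node.
        module = {"name": name, "locals": {}}
        # Every registered callback sees every module that gets built.
        for transform in self._transformers:
            transform(module)
        return module


def hashlib_transform(module):
    if module["name"] == "hashlib":
        for hashfunc in ("sha1", "md5"):
            module["locals"][hashfunc] = ["<fake class %s>" % hashfunc]


manager = TransformManager()
manager.register_transformer(hashlib_transform)
print(sorted(manager.build("hashlib")["locals"]))  # ['md5', 'sha1']
print(manager.build("os")["locals"])               # {}
```

Since the callback runs for every module, the test on the module name is what keeps the transformation from touching anything but hashlib.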

Now let's try it! Suppose I stored the above code in an 'astng_hashlib.py' module in my PYTHONPATH. I can now run pylint with the plugin activated:

[syt@somewhere ~]$ pylint -E --load-plugins astng_hashlib example.py
No config file found, using default configuration
************* Module smarter_astng
E:  5,11:hexmd5: Instance of 'md5' has no 'hexdigest' member
E:  9,11:hexsha1: Instance of 'sha1' has no 'hexdigest' member

Hmm. We now have a different error :( Pylint grasps that there are some md5 and sha1 classes, but it complains that they don't have a hexdigest method. Indeed, we didn't give it a clue about that.

We could continue on and on to give it a full representation of hashlib public API using the astng nodes API. But that would be painful, trust me. Or we could do something clever using some higher level astng API:

from logilab.astng import MANAGER
from logilab.astng.builder import ASTNGBuilder

def hashlib_transform(module):
    if module.name == 'hashlib':
        # build a fake module describing only the prototypes we care about
        fake = ASTNGBuilder(MANAGER).string_build('''

class md5(object):
  def __init__(self, value): pass
  def hexdigest(self):
    return u''

class sha1(object):
  def __init__(self, value): pass
  def hexdigest(self):
    return u''

''')
        # graft the fake class nodes into the real hashlib module
        for hashfunc in ('sha1', 'md5'):
            module.locals[hashfunc] = fake.locals[hashfunc]

def register(linter):
    """called when loaded by pylint --load-plugins, register our transformation
    function here
    """
    MANAGER.register_transformer(hashlib_transform)

The idea is to write a fake python implementation only documenting the prototype of the desired class, and to get an astng from it, using the string_build method of the astng builder. This method will return a Module node containing the astng for the given string. It's then easy to replace or insert additional information into the original module, as you can see in the above example.
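
The same "parse a stub, then graft its definitions" idea can be tried without astng, using the standard ast module. In this sketch the class_methods helper and the module_locals dict are hypothetical; the dict merely plays the role of module.locals, and what gets stored is a list of method names rather than real astng nodes.

```python
import ast

STUB = '''
class md5(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''

class sha1(object):
    def __init__(self, value): pass
    def hexdigest(self):
        return u''
'''


def class_methods(source):
    """Map each top-level class name in `source` to its method names."""
    tree = ast.parse(source)
    return {
        node.name: [item.name for item in node.body
                    if isinstance(item, ast.FunctionDef)]
        for node in tree.body
        if isinstance(node, ast.ClassDef)
    }


# `module_locals` is a hypothetical dict playing the role of module.locals.
module_locals = {}
fake = class_methods(STUB)
for hashfunc in ('sha1', 'md5'):
    module_locals[hashfunc] = fake[hashfunc]

print(module_locals['md5'])  # ['__init__', 'hexdigest']
```

The stub never needs to do anything useful: only the names and signatures matter to a static analyzer.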

Now if I run pylint using the updated plugin:

[syt@somewhere ~]$ pylint -E --load-plugins astng_hashlib example.py
No config file found, using default configuration

No error anymore, great!

What's next?

This fairly simple change could quickly provide great enhancements. We should probably improve the astng manipulation API now that it's exposed like that. But we can also easily imagine a code base of such pylint plugins maintained by the community around each Python library or framework. One could then use a stack of plugins matching the libraries used by one's software, and have a greatly enhanced experience of using pylint.

For a start, it would be great if pylint could be shipped with a plugin that explains all the magic found in the standard library, wouldn't it? Left as an exercise to the reader!