[Python-Dev] Mercurial?

Dirkjan Ochtman dirkjan at ochtman.nl
Sun Apr 5 17:47:12 CEST 2009

On 05/04/2009 17:18, Antoine Pitrou wrote:
> It is a cheap price to pay if there is a significant return for it. In my
> experience using the hg mirror of the py3k branch, I don't remember having had
> to run "annotate" on the trunk to hunt for a change that I'd witnessed in py3k.
> Other developers may have different experiences, though.
> As for the clone time, one of our prominent developers is, IIRC, on a 40 kb/s
> line. Perhaps he wants to step in and say whether cloning the trunk is a painful
> experience for him, or not.

Sure it's painful, but he only has to go through that once, maybe twice.

> The consensus seems to be that it will not happen for a couple of years.

See, I think the point here is that, even though you want the branches
to be clones, you also want them all to be part of the same directed
acyclic graph (that DAG thing I keep nattering on about). That way, you
can later merge any branch back into some other branch (even if it's a
trivial merge that doesn't keep anything from one of the branches). Even
if that doesn't happen for a couple of years, it's nice to be able to do
it then without changing all the hashes (which would mean everybody has
to re-clone).
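
To illustrate why shared history keeps the hashes stable, here is a toy
sketch in Python. It is not hg's actual code; the node_id helper and the
changeset contents are made up for illustration, and it only assumes
that a changeset ID is a SHA-1 over the parent IDs plus the content.

```python
import hashlib

NULL_ID = b"\0" * 20  # stand-in for "no parent" (assumption for this toy model)


def node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Toy changeset ID: SHA-1 over the sorted parent IDs plus the content."""
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


# Shared history up to the split, then one changeset on each branch.
base = node_id(b"last common changeset")
trunk = node_id(b"trunk work", base)
py3k = node_id(b"py3k work", base)

# A later merge just adds a node with two parents; existing IDs stay put.
merge = node_id(b"merge py3k into trunk", trunk, py3k)

# Had py3k been converted without the shared base, the same content would
# get a different ID, and so would every changeset built on top of it.
py3k_unrelated = node_id(b"py3k work")
print(py3k != py3k_unrelated)
```

Because each ID covers its parents, grafting common history in afterwards
would give every descendant a new ID, hence the re-clone.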

For anyone on dial-up, we could for example provide a repository
that just has the changesets up to the split between trunk and py3k. They
can clone that once over the slow line, clone it locally for each branch,
then pull the rest of the respective history into those local clones.
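
As a rough sketch of what that saves, the following Python models the
history as a parent map and computes what each local clone would still
need to pull. The changeset names and the ancestors helper are
hypothetical; this only models the idea, it does not talk to hg.

```python
def ancestors(dag, heads):
    """Return every changeset reachable from the given heads (inclusive)."""
    seen, stack = set(), list(heads)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag[node])
    return seen


# Hypothetical history: each changeset maps to its list of parents.
remote = {
    "c0": [], "c1": ["c0"], "split": ["c1"],
    "trunk1": ["split"], "trunk2": ["trunk1"],
    "py3k1": ["split"], "py3k2": ["py3k1"],
}

# Downloaded once over the slow line, then cloned locally per branch.
local_base = ancestors(remote, ["split"])

# Each local clone only has to pull what is not already in the base.
trunk_missing = ancestors(remote, ["trunk2"]) - local_base
py3k_missing = ancestors(remote, ["py3k2"]) - local_base

print(sorted(trunk_missing))  # ['trunk1', 'trunk2']
print(sorted(py3k_missing))   # ['py3k1', 'py3k2']
```

The common part (c0, c1, split here) crosses the wire once instead of
once per branch.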

If you don't have common history, a few of the niceties of having a 
DAG-based DVCS in the first place go away; that seems like a pity.

> Does the hg protocol compress that well? I would have thought there is already a
> lot of compression in the layout (given that it seems much more efficient than
> some of its competitors).

When used over HTTP, hg uses bundles (which can also be used as separate
files to exchange changesets informally). Bundles contain gzip- or
bzip2-compressed changesets. When communicating over SSH, on the other
hand, hg defaults to uncompressed streams, because the assumption is that
connections can use SSH's compression, which is more efficient.

All of this functions on top of the already quite efficient revlogs that 
make up the basic storage model for hg.
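
If you want a feel for the gzip-versus-bzip2 choice mentioned for
bundles, here is a small Python sketch using the standard zlib and bz2
modules (zlib implements the DEFLATE algorithm that gzip uses). The
payload is invented text, not a real bundle, so the sizes it prints say
nothing about actual hg traffic.

```python
import bz2
import zlib

# Hypothetical payload standing in for a run of changeset data.
payload = b"".join(
    b"changeset %d: fix typo in Lib/module_%d.py\n" % (i, i % 7)
    for i in range(2000)
)

gz = zlib.compress(payload, 9)  # DEFLATE, as used by gzip
bz = bz2.compress(payload, 9)   # bzip2

# Both are lossless: the original bytes come back unchanged.
assert zlib.decompress(gz) == payload
assert bz2.decompress(bz) == payload

print("raw:", len(payload), "deflate:", len(gz), "bzip2:", len(bz))
```

Which of the two comes out smaller depends on the data, so it is worth
trying this on something closer to your own repository.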


