[core-workflow] Help needed: best way to convert hg repos to git?

Petr Viktorin encukou at gmail.com
Mon Feb 15 05:12:00 EST 2016

On 02/14/2016 06:24 PM, Pierre-Yves David wrote:
> On 02/13/2016 02:56 AM, Nick Coghlan wrote:
>> On 13 February 2016 at 11:15, Brett Cannon <brett at python.org> wrote:
>>> I don't remember the story behind cpython-fullhistory, but it's
>>> obviously incomplete since it just stopped post-conversion. You will
>>> need to find someone who knows (I'd ask on python-dev).
>>> Also realize that this will be our fourth VCS (cvs, svn, hg, now
>>> git). This is not going to be a perfect history of the semantic
>>> actions of commits from the beginning of time just due to the fact
>>> that these VCS tools all use different concepts.
>> There's also the fact that prior to the move to SourceForge in 2000,
>> all changes had to be funneled through the half dozen or so people
>> with write access to the CVS tree:
>> https://docs.python.org/3/whatsnew/2.0.html#new-development-process
>> I think it's definitely OK if future code archaeologists need to dig
>> into the SVN repository to get a more complete view of CPython's
>> history.
> I've never met a project that did not regret such a decision at some
> point. Keeping older history is usually valuable. Mercurial has
> powerful enough tools to let you get all the history back together; I
> assume git probably has that power too.
> This is your call, but I strongly recommend taking advantage of this
> migration to put everything back together.

While "putting everything back together" would be great, it doesn't
*have* to block the migration. Git has a command called "git replace"
that lets you do this later.

The Linux kernel (which switched to Git before Git migration tools
existed) has a separate "early history" repo that you can "prepend" to
the main one. Then, in your local copy, it looks like one unbroken
history. Since Git commits are snapshots and not deltas, this works
amazingly well -- it's just telling Git's object retrieval routine to
retrieve <object X> instead of <object Y>. The disadvantage is that it
has to be done in each clone individually, since a default clone does
not fetch replacement refs -- no one can rewrite history for others.

Two commands every future historian would have to run:
    git fetch <url_for_old_history>
    git replace --graft <first_commit_of_new_history> <last_commit_of_old_history>
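
To make that concrete, here is a rough, self-contained sketch of the graft on two throwaway repositories. Everything in it is made up for illustration: the directory names "old", "new" and "clone", the demo identity, and the empty commits standing in for real history. Note that "git replace --graft" takes the commit to rewrite followed by the parent(s) it should appear to have, so the last commit of the old history is passed as the second argument.

```shell
#!/bin/sh
set -eu

# Scratch area; every repository name below is a hypothetical stand-in.
work=$(mktemp -d)
cd "$work"

# Helper: make an empty commit with a fixed identity, so the sketch
# does not depend on the user's own git configuration.
commit() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet --allow-empty -m "$1"
}

# "old" plays the role of the archived early history.
git init --quiet old
( cd old && commit "old 1" && commit "old 2" )

# "new" plays the role of the converted repo, whose history starts fresh.
git init --quiet new
cd new
commit "new 1"
commit "new 2"
before=$(git rev-list --count HEAD)

# Step 1: fetch the old history into this clone (lands in FETCH_HEAD).
git fetch --quiet ../old
old_tip=$(git rev-parse FETCH_HEAD)

# Step 2: make the root commit of the new history appear to have the
# tip of the old history as its parent.
new_root=$(git rev-list --max-parents=0 HEAD)
git replace --graft "$new_root" "$old_tip"
after=$(git rev-list --count HEAD)

# The original objects are untouched; the graft is only a lookup redirect.
raw=$(git --no-replace-objects rev-list --count HEAD)

# A fresh clone does not pick up the replacement on its own.
git clone --quiet . ../clone
in_clone=$(git -C ../clone rev-list --count HEAD)

echo "before=$before after=$after raw=$raw in_clone=$in_clone"
```

In this toy setup the history visible from HEAD grows from two commits to four after the graft, while both "git --no-replace-objects" and the fresh clone still see only the two original commits. "git replace -d <first_commit_of_new_history>" removes the graft again.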
