Mailman 3 Mercurial migration: help needed - Python-Dev


Mercurial migration: help needed

"Martin v. Löwis"

18 Aug 2009

1:42 p.m.

This is a repost from two weeks ago. It didn't get much feedback last time. I still keep trying, reposting to python-list also this time.

In this thread, I'd like to collect things that ought to be done but where Dirkjan has indicated that he would prefer if somebody else did it.

Item 1
------

The first item is build identification. If you want to work on this, please either provide a patch (for trunk and/or py3k), or (if you are a committer) create a subversion branch.

It seems that Barry and I agree that for the maintenance branches, sys.subversion should be frozen, so we need actually two sets of patches: one that removes sys.subversion entirely, and the other that freezes the branch to the respective one, and freezes the subversion revision to None.

The patch should consider what Dirkjan proposes as the branching strategy: clones to separate 2.x and 3.x, as well as for features, and branches with the clones for releases and maintenance (see the PEP for details).

Anybody working on this should have good knowledge of the Python source code, Mercurial, and either autoconf or Visual Studio (preferably both).

Item 2
------

The second item is line conversion hooks. Dj Gilcrease has posted a solution which he considers a hack himself. Mark Hammond has also volunteered, but it seems some volunteer needs to be "in charge", keeping track of a proposed solution until everybody agrees that it is a good solution. It may be that two solutions are necessary: a short-term one, that operates as a hook and has limitations, and a long-term one, that improves the hook system of Mercurial to implement the proper functionality (which then might get shipped with Mercurial in a cross-platform manner).

Regards,
Martin


Dirkjan Ochtman

18 Aug

1:50 p.m.

On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis" wrote:

...

In this thread, I'd like to collect things that ought to be done but where Dirkjan has indicated that he would prefer if somebody else did it.

I think the most important item here is currently the win32text stuff. Mark Hammond said he would work on this; Mark, when do you have time for this? Then I could set apart some time for it as well. Have stalled a bit on the fine-grained branch processing, hope to move that forward tomorrow. Cheers, Dirkjan

Mark Hammond

5:02 p.m.

On 18/08/2009 6:20 PM, Dirkjan Ochtman wrote:

...

On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis" wrote:

...
In this thread, I'd like to collect things that ought to be done but where Dirkjan has indicated that he would prefer if somebody else did it.

I think the most important item here is currently the win32text stuff. Mark Hammond said he would work on this; Mark, when do you have time for this? Then I could set apart some time for it as well.

I can make time, somewhat spasmodically, starting fairly soon. Might I suggest that as a first task I can resurrect my old stale patch, and you can arrange to install win32text locally and start experimenting with how mixed line-endings can work for you. Once we are all playing in the same ballpark I think we should be able to make good progress. I-said-ballpark-yet-I-call-myself-an-aussie? ly, Mark

Dirkjan Ochtman

5:16 p.m.

On Tue, Aug 18, 2009 at 13:32, Mark Hammond wrote:

...

I can make time, somewhat spasmodically, starting fairly soon. Might I suggest that as a first task I can resurrect my old stale patch, and you can arrange to install win32text locally and start experimenting with how mixed line-endings can work for you. Once we are all playing in the same ballpark I think we should be able to make good progress.

Sounds good to me. Cheers, Dirkjan

Brett Cannon

19 Aug

1:16 a.m.

[stripping out python-list and Mark from the CC] On Tue, Aug 18, 2009 at 01:20, Dirkjan Ochtman wrote:

...

On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis" wrote:

...
In this thread, I'd like to collect things that ought to be done but where Dirkjan has indicated that he would prefer if somebody else did it.

I think the most important item here is currently the win32text stuff. Mark Hammond said he would work on this; Mark, when do you have time for this? Then I could set apart some time for it as well.

Have stalled a bit on the fine-grained branch processing, hope to move that forward tomorrow.

Can we possibly get these todo items in the PEP? I keep looking at the PEP out of habit to see what the blockers are and they are not there, at which point I have to dig up Martin's email. -Brett


Dirkjan Ochtman

1:23 a.m.

On Tue, Aug 18, 2009 at 21:46, Brett Cannon wrote:

...

Can we possibly get these todo items in the PEP? I keep looking at the PEP out of habit to see what the blockers are and they are not there, at which point I have to dig up Martin's email.

Will do. Cheers, Dirkjan

Mark Hammond

30 Aug

8:14 a.m.

On 18/08/2009 6:20 PM, Dirkjan Ochtman wrote:

...

On Tue, Aug 18, 2009 at 10:12, "Martin v. Löwis" wrote:

...
In this thread, I'd like to collect things that ought to be done but where Dirkjan has indicated that he would prefer if somebody else did it.

I think the most important item here is currently the win32text stuff. Mark Hammond said he would work on this; Mark, when do you have time for this? Then I could set apart some time for it as well.

Have stalled a bit on the fine-grained branch processing, hope to move that forward tomorrow.

I'm afraid I've also stalled on this task and I need some help to get things moving again.

1) I've stalled on the 'none:' patch I promised to resurrect. While doing this, I re-discovered that the tests for win32text appear to check win32 line endings are used by win32text on *all* platforms, not just Windows. I asked for advice from Dirkjan who referred me to the mercurial-devel list, but my request of slightly over a week ago remains unanswered (http://selenic.com/pipermail/mercurial-devel/2009-August/014873.html) - maybe I just need to be more patient...

Further, Martin's comments in this thread indicate he believes a new extension will be necessary rather than 'fixing' win32text. If this is the direction we take, it may mean the none: patch, which targets the implementation of win32text, is no longer necessary anyway.

2) These same recent discussions about an entirely new extension and no clear indication of our expectations regarding what the tool actually enforces means I'm not sure how to make a start on the more general issue. I also fear that should I try to make a start on this, it will still wind up fruitless - eg, it seems any work targeting win32text specifically would have been wasted, so I'd really like to see a consensus on what needs to be done before attempting to start it.

So in short, I'm still offering to work on this issue - I just don't know what that currently entails.

Thanks,
Mark

Martin Geisler

5:07 p.m.

Mark Hammond writes:

...

1) I've stalled on the 'none:' patch I promised to resurrect. While doing this, I re-discovered that the tests for win32text appear to check win32 line endings are used by win32text on *all* platforms, not just Windows.

I think it is only Patrick Mezard who knows how to run (parts of) the test suite on Windows.

...

I asked for advice from Dirkjan who referred me to the mercurial-devel list, but my request of slightly over a week ago remains unanswered (http://selenic.com/pipermail/mercurial-devel/2009-August/014873.html) - maybe I just need to be more patient...

Oh no, that's usually the wrong tactic :-) I've been too busy for real Mercurial work the last couple of weeks, but you should not feel bad about poking us if you don't get a reply. Or come to the IRC channel (#mercurial on irc.freenode.net) where Dirkjan (djc) and myself (mg) hang out when it's daytime in Europe.

...

Further, Martin's comments in this thread indicate he believes a new extension will be necessary rather than 'fixing' win32text. If this is the direction we take, it may mean the none: patch, which targets the implementation of win32text, is no longer necessary anyway.

I suggested a new extension for two reasons:

* I'm using Linux, and I mentally skip over all extensions that mention "win32"... I guess others do the same, and in this case it's really a shame since converting EOL markers is a cross-platform problem: if someone creates a repository on Windows, I might find it nice to translate the EOL markers into LF on my machine.

  As far as I know, all my tools work correctly with CRLF EOL markers, but I can see the usefulness of such an extension when adding new files (which would default to LF unless I take care).

* A new extension will not have to deal with backwards compatibility issues. That would let us clean up the strange names: I think "cleverencode:" and "cleverdecode:" are quite poor names that convey little meaning (and what's with the colon?). We could instead use the same names as Subversion: "native", "CRLF" and "LF".

The new extension could be named 'convert-eol' or something like that.

...

2) These same recent discussions about an entirely new extension and no clear indication of our expectations regarding what the tool actually enforces means I'm not sure how to make a start on the more general issue.

It would be a folly to require all files in all changesets to use the right EOL markers -- people will be making mistakes offline. The important thing is that they fix them before pushing to a public server. So the extension should do that: either abort commits with the wrong EOL markers or do as Subversion and automatically convert the file in the working copy.

...

I also fear that should I try to make a start on this, it will still wind up fruitless - eg, it seems any work targeting win32text specifically would have been wasted, so I'd really like to see a consensus on what needs to be done before attempting to start it.

As I understand it, what is lacking is that win32text will read the encode/decode settings from a versioned file called <repo>/.hgeol. This means that you can just enable the extension and be done with it, instead of configuring it in every clone.

The <repo>/.hgeol file should contain two sections:

    [repository]
    native = LF

    [patterns]
    Windows.txt = CRLF
    Unix.txt = LF
    Tools/buildbot/** = CRLF
    **.txt = native
    **.py = native
    **.dsp = CRLF

The [repository] setting controls what native is translated into upon commit. The [patterns] section can be translated into safe [decode] / [encode] settings by the extension:

    [encode]
    Windows.txt = to-crlf
    Unix.txt = to-lf
    Tools/buildbot/** = to-crlf
    **.txt = to-lf
    **.py = to-lf
    **.dsp = to-crlf

    [decode]
    Windows.txt = to-crlf
    Unix.txt = to-lf
    Tools/buildbot/** = to-crlf
    **.txt = to-native
    **.py = to-native
    **.dsp = to-crlf

where to-crlf, to-lf, to-native are filters installed by the extension.

I guess your 'none' encode/decode filter patch would be needed if the Unix.txt file were to be stored unchanged in the repository? Instead I imagine that the extension will convert a modified Unix.txt to LF EOL markers automatically (Subversion behaves like that, as far as I can tell from a bit of testing).

That way the repository will contain most files in the format specified as native for it, but selected files are stored using whatever EOLs they like. The result is that someone who has not enabled the extension will get correct files from a checkout. Had we stored the *.dsp files with LF EOLs in the repository (like Subversion does), then using the extension would be mandatory for everybody.

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.
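The translation from [patterns] into [encode] / [decode] settings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `patterns_to_filters` and the `repo_native` parameter are hypothetical, the filter names to-lf, to-crlf and to-native are the ones proposed in the message, and no Mercurial API is involved.

```python
def patterns_to_filters(patterns, repo_native="LF"):
    """Derive encode/decode tables from .hgeol-style [patterns] entries.

    `patterns` maps a file name pattern to "native", "LF" or "CRLF".
    `repo_native` plays the role of the proposed [repository] native setting.
    Hypothetical helper; the filter names follow the proposal in the thread.
    """
    filter_name = {"LF": "to-lf", "CRLF": "to-crlf"}
    encode, decode = {}, {}
    for pattern, mode in patterns.items():
        if mode == "native":
            # On commit, native files are stored in the repository's format;
            # on checkout they are converted to the platform's format.
            encode[pattern] = filter_name[repo_native]
            decode[pattern] = "to-native"
        else:
            # Fixed-style files use the same conversion in both directions.
            encode[pattern] = decode[pattern] = filter_name[mode]
    return encode, decode


example = {"Windows.txt": "CRLF", "Unix.txt": "LF", "**.py": "native"}
encode, decode = patterns_to_filters(example)
```

With `repo_native="LF"`, the native patterns encode to to-lf and decode to to-native, which matches the tables in the message.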

"Martin v. Löwis"

11:29 p.m.

...

I suggested a new extension for two reasons:

* I'm using Linux, and I mentally skip over all extensions that mention "win32"... I guess others do the same, and in this case it's really a shame since converting EOL markers is a cross-platform problem: if someone creates a repository on Windows, I might find it nice to translate the EOL markers into LF on my machine.

As far as I know, all my tools work correctly with CRLF EOL markers, but I can see the usefulness of such an extension when adding new files (which would default to LF unless I take care).

* A new extension will not have to deal with backwards compatibility issues. That would let us clean up the strange names: I think "cleverencode:" and "cleverdecode:" are quite poor names that convey little meaning (and what's with the colon?). We could instead use the same names as Subversion: "native", "CRLF" and "LF".

The new extension could be named 'convert-eol' or something like that.

Thanks for the confirmation - this is also why I think a new extension would be best. FWIW, in Python, most files would be declared native, some CRLF, none LF.

...

...
2) These same recent discussions about an entirely new extension and no clear indication of our expectations regarding what the tool actually enforces means I'm not sure how to make a start on the more general issue.

It would be a folly to require all files in all changesets to use the right EOL markers -- people will be making mistakes offline. The important thing is that they fix them before pushing to a public server.

So the extension should do that: either abort commits with the wrong EOL markers or do as Subversion and automatically convert the file in the working copy.

Maybe I misunderstand: when people use the extension, they cannot possibly make mistakes, right? Because the commit that gets aborted is already the local commit, right?

Of course, it may still be that not all people use the extension. I think this is of concern to Mark (and he would like hg to refuse operation at all if the extension isn't used), but not to me: I would like this to be a feature of hg eventually, in which case I don't need to worry whether hg enforces presence of certain extensions.

If people make commits that break the eol style, we could well refuse to accept them on the server, telling people that they should have used the extension (or that they should have been more careful if they don't use the extension).

I think subversion's behavior wrt. incorrect eol-style is more subtle. In some cases, it will complain about inconsistencies, rather than fixing them automatically.

Regards,
Martin

Martin Geisler

31 Aug

12:07 a.m.

"Martin v. Löwis" writes:

...

...
So the extension should do that: either abort commits with the wrong EOL markers or do as Subversion and automatically convert the file in the working copy.

Maybe I misunderstand: when people use the extension, they cannot possibly make mistakes, right? Because the commit that gets aborted is already the local commit, right?

Of course, it may still be that not all people use the extension.

Exactly, when people use the extension, they won't be able to make bad commits.

...

I think this is of concern to Mark (and he would like hg to refuse operation at all if the extension isn't used), but not to me: I would like this to be a feature of hg eventually, in which case I don't need to worry whether hg enforces presence of certain extensions.

Yes, that would be nice for the future. I don't know if the other Mercurial developers will see this as a big controversy -- Mercurial has so far made very sure to never mutate your files behind your back. Expansion of keywords (like $Id$) is also implemented as an extension.

...

If people make commits that break the eol style, we could well refuse to accept them on the server, telling people that they should have used the extension (or that they should have been more careful if they don't use the extension).

Indeed. Their work will not be lost -- one can always take the final file, convert the line-endings, copy it into a fresh clone and commit that. With more work one could even salvage the intermediate commits, but that is probably not necessary.

...

I think subversion's behavior wrt. incorrect eol-style is more subtle. In some cases, it will complain about inconsistencies, rather than fixing them automatically.

Okay --- I don't have much experience with the svn:eol-style, except that I've read about it in the manual. -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Mark Hammond

5 Sep

6 a.m.

On 30/08/2009 9:37 PM, Martin Geisler wrote:

...

Mark Hammond writes:

...
1) I've stalled on the 'none:' patch I promised to resurrect. While doing this, I re-discovered that the tests for win32text appear to check win32 line endings are used by win32text on *all* platforms, not just Windows.

I think it is only Patrick Mezard who knows how to run (parts of) the test suite on Windows.

...
I asked for advice from Dirkjan who referred me to the mercurial-devel list, but my request of slightly over a week ago remains unanswered (http://selenic.com/pipermail/mercurial-devel/2009-August/014873.html) - maybe I just need to be more patient...

Oh no, that's usually the wrong tactic :-) I've been too busy for real Mercurial work the last couple of weeks, but you should not feel bad about poking us if you don't get a reply. Or come to the IRC channel (#mercurial on irc.freenode.net) where Dirkjan (djc) and myself (mg) hang out when it's daytime in Europe.

To be fair, I did mail Dirkjan directly who referred me to the -develop list, which I did with a CC to him and a private mail asking for some help should the mail fall on deaf ears as I feared it would. There really is only so far I'm willing to poke and prod people when I'm well aware we are all volunteers.

...

...
Further, Martin's comments in this thread indicate he believes a new extension will be necessary rather than 'fixing' win32text. If this is the direction we take, it may mean the none: patch, which targets the implementation of win32text, is no longer necessary anyway.

I suggested a new extension for two reasons:

...

Thanks, and that does indeed sound excellent. However, this is going a fair way beyond the original scope I signed up for. While I was willing to help implement some new features into an existing extension, taking on the design and implementation of an entire new extension is something I'm not willing to undertake. I don't think such an extension should even come from the Python community or it will end up being a python-only extension - or at best, will need to run the gauntlet of 2 bike-shedding sessions from both the Python and hg communities which will waste much time.

What is the hope of an EOL extension which meets our requirements coming directly out of the hg community? If that hope is small, where does that leave us?

Cheers,
Mark

"Martin v. Löwis"

12:45 p.m.

...

What is the hope of an EOL extension which meets our requirements coming directly out of the hg community? If that hope is small, where does that leave us?

As before. I'll repost my request for help, and we stay with subversion meanwhile. Perhaps I'll post it to some mercurial list as well. Regards, Martin

Paul Moore

2:54 p.m.

2009/9/5 "Martin v. Löwis" :

...

...
What is the hope of an EOL extension which meets our requirements coming directly out of the hg community? If that hope is small, where does that leave us?

As before. I'll repost my request for help, and we stay with subversion meanwhile.

Perhaps I'll post it to some mercurial list as well.

Can anyone (re-) post the specification of the proposed extension, to the level that it is currently defined? I'm willing to make an attempt to put together an extension, on the assumption that it'll be easier to refine an implementation than continue discussing possibilities... Paul.

"Martin v. Löwis"

7:48 p.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

...

Can anyone (re-) post the specification of the proposed extension, to the level that it is currently defined?

For reference, here are the original specifications, mine and Martin Geisler's:

http://mail.python.org/pipermail/python-dev/2009-August/090984.html
http://mail.python.org/pipermail/python-dev/2009-August/091453.html

Here is my attempt at summarizing it:

- name of versioned configuration file (in root of tree): .hgeol
- names of conversion modes: native, LF, CRLF

In the configuration file, there is a section [patterns] which maps file name patterns to conversion modes, e.g.

    [patterns]
    **.txt = native
    **.py = native
    **.dsp = CRLF
    **.bat = CRLF
    Tools/bgen/README = native
    Lib/email/test/data/msg_26.txt = CRLF

- Martin Geisler also proposes that there is a section

      [repository]
      native = <conversionmode>

  I personally feel YAGNI; it should only support LF (adding such a feature later may be considered)

Open issues:

- name of extension
- what should happen if the file on disk doesn't have the "expected" line endings, or mixed line endings? E.g. a file declared as native "should" have CRLF on Windows - what if it doesn't, on commit? My proposal: do what svn does (whatever that is).

That's it, AFAICT. Martin Geisler also discussed something that I read as an implementation strategy, by mapping the patterns into the (apparently existing) encode/decode configuration setting.

HTH,
Martin

P.S. If you decide that you will or will not work on it, please let us know.
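To make the summarized file format concrete, here is a minimal sketch of reading the [patterns] section. It assumes the standard library's configparser as a stand-in for whatever config reader a real extension would use, and the names `parse_hgeol`, `SAMPLE` and `VALID_MODES` are hypothetical.

```python
import configparser

# Sample taken from the [patterns] example in the summary above.
SAMPLE = """\
[patterns]
**.txt = native
**.py = native
**.dsp = CRLF
**.bat = CRLF
Tools/bgen/README = native
Lib/email/test/data/msg_26.txt = CRLF
"""

VALID_MODES = {"native", "LF", "CRLF"}


def parse_hgeol(text):
    """Return {pattern: mode} from .hgeol-style text, in file order."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    # Keep option names as written: file names are case-sensitive.
    parser.optionxform = str
    parser.read_string(text)
    if not parser.has_section("patterns"):
        return {}
    patterns = dict(parser["patterns"])
    for pattern, mode in patterns.items():
        if mode not in VALID_MODES:
            raise ValueError(f"unknown conversion mode {mode!r} for {pattern!r}")
    return patterns


patterns = parse_hgeol(SAMPLE)
```

Rejecting unknown modes at parse time is a design choice of this sketch, not part of the specification in the thread.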

Brett Cannon

6 Sep

1:50 a.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

On Sat, Sep 5, 2009 at 07:18, "Martin v. Löwis" wrote:

...

...
Can anyone (re-) post the specification of the proposed extension, to the level that it is currently defined?

For reference, here are the original specification, mine and Martin Geisler's:

http://mail.python.org/pipermail/python-dev/2009-August/090984.html http://mail.python.org/pipermail/python-dev/2009-August/091453.html

Here is my attempt at summarizing it:

- name of versioned configuration file (in root of tree): .hgeol
- names of conversion modes: native, LF, CRLF

In the configuration file, there is a section [patterns] which maps file name patterns to conversion modes, e.g.

    [patterns]
    **.txt = native
    **.py = native
    **.dsp = CRLF
    **.bat = CRLF
    Tools/bgen/README = native
    Lib/email/test/data/msg_26.txt = CRLF

- Martin Geisler also proposes that there is a section

      [repository]
      native = <conversionmode>

  I personally feel YAGNI; it should only support LF (adding such a feature later may be considered)

Do you mean what native is in the repo or what it should be considered on the user's machine? If it's the former then I actually like it as it means a clone doesn't need to do anything special when 'native' matches what is expected in the repo while a commit still does its EOL validation.

I still think we need to have a server-side block which rejects commits that mess up the line-endings so people can fix them. Shouldn't mess up 'blame' as the messed up line-endings should only be from their edits. Otherwise it's just like when Tim used to run reindent.py over the entire repo on occasion.

And as mentioned in another email by Paul, it would be nice to let the user specify what they want 'native' to be on their machine if they happen to be a Windows user who prefers LF.

...

Open issues: - name of extension

StupidLineEndings =)

...

- what should happen if the file on disk doesn't have the "expected" line endings, or mixed line endings? E.g. a file declared as native "should" have CRLF on Windows - what if it doesn't, on commit? My proposal: do what svn does (whatever that is).

Or refuse the commit with a message and tell the user to fix it (if svn doesn't happen to do that). -Brett
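The "refuse the commit" option needs a way to detect wrong or mixed line endings in a file's contents. The following is a minimal sketch of such a check, assuming file contents are available as bytes; `eol_style` and `check_file` are hypothetical names, lone CR characters are deliberately ignored, and wiring this into an actual commit hook is left out.

```python
def eol_style(data: bytes) -> str:
    """Classify line endings as "LF", "CRLF", "mixed" or "none"."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one LF, so subtract to get the bare LF count.
    lf = data.count(b"\n") - crlf
    if crlf and lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf:
        return "LF"
    return "none"


def check_file(data: bytes, declared: str, platform_native: str = "LF") -> bool:
    """Return True if `data` matches its declared mode (native, LF or CRLF).

    `platform_native` is what "native" means on the committer's machine.
    A file without any line ending is accepted under every mode.
    """
    expected = platform_native if declared == "native" else declared
    return eol_style(data) in ("none", expected)


ok = check_file(b"import os\r\n", "native", platform_native="CRLF")
bad = check_file(b"line one\r\nline two\n", "CRLF")
```

A hook built on this would abort the commit and name the offending files whenever `check_file` returns False.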

"Martin v. Löwis"

3:36 a.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

...

...
- Martin Geisler also proposes that there is a section [repository] native = <conversionmode> I personally feel YAGNI; it should only support LF (adding such a feature later may be considered)

Do you mean what native is in the repo or what it should be considered on the user's machine?

The former.

...

If it's the former then I actually like it as it means a clone doesn't need to do anything special when 'native' matches what is expected in the repo while a commit still does its EOL validation.

But the same would be true if the repo format would be always LF: when "native" matches (which would then be on Unix), the extension would *still* have to do nothing but validation.

...

I still think we need to have a server-side block which rejects commits that mess up the line-endings so people can fix them.

Certainly.

...

Shouldn't mess up 'blame' as the messed up line-endings should only be from their edits.

It could be that they had a number of commits that eventually led to the version that they push; this will also push the intermediate versions. So when you then do a blame, it will tell you that the revision was logged as "fix whitespace", rather than "resolve issue #9743".

You are mostly right that the committer name would be the same (except when the committer was pushing some changes pulled from the actual contributor), however, I still see these whitespace-only changes as a complication.

Regards,
Martin

Brett Cannon

3:55 a.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

On Sat, Sep 5, 2009 at 15:06, "Martin v. Löwis" wrote:

...

...
...
- Martin Geisler also proposes that there is a section [repository] native = <conversionmode> I personally feel YAGNI; it should only support LF (adding such a feature later may be considered)

Do you mean what native is in the repo or what it should be considered on the user's machine?

The former.

...
If it's the former then I actually like it as it means a clone doesn't need to do anything special when 'native' matches what is expected in the repo while a commit still does its EOL validation.

But the same would be true if the repo format would be always LF: when "native" matches (which would then be on Unix), the extension would *still* have to do nothing but validation.

Right, but I am just thinking about how we specify in .hgeol what the repository is expected to be as this extension might work out nicely for other projects who prefer CRLF as their repo-native line ending.

...

...
I still think we need to have a server-side block which rejects commits that mess up the line-endings so people can fix them.

Certainly.

...
Shouldn't mess up 'blame' as the messed up line-endings should only be from their edits.

It could be that they had a number of commits that eventually led to the version that they push; this will also push the intermediate versions. So when you then do a blame, it will tell you that the revision was logged as "fix whitespace", rather than "resolve issue #9743".

Yep.

...

You are mostly right that the committer name would be the same (except when the committer was pushing some changes pulled from the actual contributor), however, I still see these whitespace-only changes as a complication.

It's unfortunate, but I see it as a rare occurrence as it would only happen if someone got sloppy. And it should typically get caught client-side before the commit ever occurs, minimizing the whitespace-only commits even more. -Brett

"Martin v. Löwis"

4:05 a.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

...

Right, but I am just thinking about how we specify in .hgeol what the repository is expected to be as this extension might work out nicely for other projects who prefer CRLF as their repo-native line ending.

This is what I refer to as YAGNI. Subversion has LF as the internal storage, and, IIRC, so does CVS. I don't think there is any precedent for wanting something else - and frankly, I can't see how repository storage would matter.

...

...
You are mostly right that the committer name would be the same (except when the committer was pushing some changes pulled from the actual contributor), however, I still see these whitespace-only changes as a complication.

It's unfortunate, but I see it as a rare occurrence as it would only happen if someone got sloppy. And it should typically get caught client-side before the commit ever occurs, minimizing the whitespace-only commits even more.

It would, of course, be possible to ban them altogether, at the expense of users having to replay changes. Regards, Martin

Stephen J. Turnbull

1:11 p.m.

New subject: hgeol extension (Was: Mercurial migration: help needed)

"Martin v. Löwis" writes:

...

This is what I refer to as YAGNI. Subversion has LF as the internal storage, and, IIRC, so does CVS. I don't think there is any precedent for wanting something else - and frankly, I can't see how repository storage would matter.

Well, internally you could use U+2028 LINE SEPARATOR, which would screw up *everybody* if they don't use the converter, since there are probably very few editors that understand U+2028. I've heard that this is what Samba did when converting to Unicode: instead of using UTF-8 they used UTF-16 so that English would be at least as buggy as any other language. Maybe there's somebody who was participating in Samba at that time who knows?

Martin Geisler

5:27 a.m.

New subject: hgeol extension

"Martin v. Löwis" writes:

...

...
Can anyone (re-) post the specification of the proposed extension, to the level that it is currently defined?

For reference, here are the original specification, mine and Martin Geisler's:

http://mail.python.org/pipermail/python-dev/2009-August/090984.html http://mail.python.org/pipermail/python-dev/2009-August/091453.html

Here is my attempt at summarizing it:

- name of versioned configuration file (in root of tree): .hgeol
- names of conversion modes: native, LF, CRLF

In the configuration file, there is a section [patterns] which maps file name patterns to conversion modes, e.g.

    [patterns]
    **.txt = native
    **.py = native
    **.dsp = CRLF
    **.bat = CRLF
    Tools/bgen/README = native
    Lib/email/test/data/msg_26.txt = CRLF

- Martin Geisler also proposes that there is a section

      [repository]
      native = <conversionmode>

  I personally feel YAGNI; it should only support LF (adding such a feature later may be considered)

I don't think it's a good idea to store everything in LF in the repository. Unlike Subversion, you cannot expect all interactions to take place through the "eol-filter" we're implementing. Letting people checkout a useful unfiltered clone would be possible if we know the repository native format and convert back to that.

Anyway, it's a minor detail. More importantly, I've posted a simple, rough extension that does this here:

http://markmail.org/message/yj4so736t4cfdulv

I figured it would be better to discuss the design and implementation on mercurial-devel since there are more Mercurial hackers there. I've CC'ed a bunch of people from this thread to "seed" the discussion -- the rest of you on python-dev are hereby invited to join :-)

http://selenic.com/mailman/listinfo/mercurial-devel

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Antoine Pitrou

5 Sep

5:14 p.m.

Mark Hammond writes:

...

What is the hope of an EOL extension which meets our requirements coming directly out of the hg community? If that hope is small, where does that leave us?

I'm starting to wonder what the problem really is that makes it so Python-specific. If I understood correctly, it's about a couple of files which must be stored using non-Unix line endings, right? (in the PC and PCbuild directories?) These files are hardly modified often and by many people (and even more rarely by non-Windows people), so why not just put a verification hook on the server and let the offending committer repair his mistake manually, if it ever happens? (we can even provide a script to help repairing the EOL mistake, like Tools/reindent.py does for indentation mistakes)
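A repair script of the kind suggested here could be quite small. The sketch below only shows the core conversion on bytes; the name `normalize_eol` is hypothetical, lone CR characters are left untouched, and walking the tree and rewriting files (as Tools/reindent.py does for indentation) is omitted.

```python
def normalize_eol(data: bytes, target: str = "LF") -> bytes:
    """Rewrite all line endings in `data` to `target` ("LF" or "CRLF")."""
    if target not in ("LF", "CRLF"):
        raise ValueError(f"unsupported target {target!r}")
    # First collapse everything to LF so mixed files are handled uniformly.
    unified = data.replace(b"\r\n", b"\n")
    if target == "CRLF":
        return unified.replace(b"\n", b"\r\n")
    return unified


fixed = normalize_eol(b"first\r\nsecond\n", target="CRLF")
```

Collapsing to LF before expanding avoids turning an existing CRLF into CR CR LF.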

"Martin v. Löwis"

6:49 p.m.

...

I'm starting to wonder what the problem really is that makes it so Python-specific. If I understood correctly, it's about a couple of files which must be stored using non-Unix line endings, right? (in the PC and PCbuild directories?)

No. It's about files that must, when checked out on Windows, have CRLF endings, and, when checked out on Unix, have LF endings - i.e. all the .py, .c, .h, and .rst files, plus a couple of others which don't require specific treatment. IOW, it's about the default behavior, and the majority of new files.
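The default behaviour described here amounts to a round trip between the repository and the working copy. The sketch below assumes that the repository stores LF, which is still under debate in this thread. The function names are hypothetical.

```python
LF, CRLF = b"\n", b"\r\n"


def checkin(data):
    """Working copy -> repository: store text with LF only (assumed policy)."""
    return data.replace(CRLF, LF)


def checkout(data, platform_eol):
    """Repository -> working copy: expand LF to the platform's line ending."""
    normalized = data.replace(CRLF, LF)  # tolerate stray CRLF in the store
    return normalized.replace(LF, platform_eol)


stored = b"x = 1\ny = 2\n"
on_windows = checkout(stored, CRLF)   # what a Windows checkout would contain
on_unix = checkout(stored, LF)        # what a Unix checkout would contain
```

Checking in the Windows copy again yields the stored bytes unchanged, which is the property that keeps diffs free of whole-file line-ending noise.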

...

These files are hardly modified often and by many people (and even more rarely by non-Windows people), so why not just put a verification hook on the server and let the offending committer repair his mistake manually, if it ever happens? (we can even provide a script to help repairing the EOL mistake, like Tools/reindent.py does for indentation mistakes)

This was Dirkjan's original proposal, and it is the proposal that brings so much heat into the discussion, claiming that it is a problem of minorities (I do understand that you were unaware that "the problem" is really with the many files, not with the few). In addition, a DVCS brings in another problem dimension: when people push their changes, they have *already* committed them - and perhaps not even they, but a contributor from which they had been pulling changes. The bogus change may have been weeks ago, so the subversion solution (of rejecting the commit to happen) doesn't quite work that well for a DVCS. Regards, Martin

Antoine Pitrou

6:56 p.m.

Le samedi 05 septembre 2009 à 15:19 +0200, "Martin v. Löwis" a écrit :

...

No. It's about files that must, when checked out on Windows, have CRLF endings, and, when checked out on Unix, have LF endings - i.e. all the .py, .c, .h, and .rst files, plus a couple of others which don't require specific treatment.

IOW, it's about the default behavior, and the majority of new files.

Ok, sorry for the misunderstanding and the lost bandwidth.

...

In addition, a DVCS brings in another problem dimension: when people push their changes, they have *already* committed them - and perhaps not even they, but a contributor from which they had been pulling changes. The bogus change may have been weeks ago, so the subversion solution (of rejecting the commit to happen) doesn't quite work that well for a DVCS.

I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems, and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

Martin Geisler

7:57 p.m.

Antoine Pitrou writes:

...

Le samedi 05 septembre 2009 à 15:19 +0200, "Martin v. Löwis" a écrit :

...
In addition, a DVCS brings in another problem dimension: when people push their changes, they have *already* committed them - and perhaps not even they, but a contributor from which they had been pulling changes. The bogus change may have been weeks ago, so the subversion solution (of rejecting the commit to happen) doesn't quite work that well for a DVCS.

I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems, and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

Yes, the server-side hook will have to work like this in order for people to fix mistakes like you just described.

"Martin v. Löwis"

9:50 p.m.

...

...
I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems, and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

Yes, the server-side hook will have to work like this in order for people to fix mistakes like you just described.

Not necessarily. People could also be required to go back and replay all changes. Regards, Martin
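The two policies being weighed here can be compared on a toy model. This is a sketch only and does not use the Mercurial hook API. A push is modelled as a list of changesets, oldest first, each a dict mapping a path to the file's bytes after that changeset. All names are hypothetical, and file deletions are ignored.

```python
def has_crlf(data):
    """Return True if data contains at least one CRLF line ending."""
    return b"\r\n" in data


def check_tip_only(changesets):
    """Return paths whose final content in the push still contains CRLF."""
    final = {}
    for files in changesets:      # oldest first
        final.update(files)       # later changesets override earlier ones
    return sorted(path for path, data in final.items() if has_crlf(data))


def check_each(changesets):
    """Return (index, path) for every changeset that records CRLF content."""
    return [(index, path)
            for index, files in enumerate(changesets)
            for path, data in sorted(files.items())
            if has_crlf(data)]


# A bad commit followed by a local repair commit, pushed together.
push = [{"a.py": b"x = 1\r\n"}, {"a.py": b"x = 1\n"}]
```

With this push, the tip-only check reports nothing and the push would be accepted, while the per-changeset check flags changeset 0 and would force the history to be replayed.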

Martin Geisler

6 Sep

1:04 a.m.

"Martin v. Löwis" writes:

...

...
...
I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems, and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

Yes, the server-side hook will have to work like this in order for people to fix mistakes like you just described.

Not necessarily. People could also be required to go back and replay all changes.

Replaying changes, i.e., editing history is quite easy as long as you have done no merges. So people can indeed fix their mistakes by cleaning up history as long as they have a linear history. Both mq and histedit are available for this:

http://mercurial.selenic.com/wiki/MqExtension
http://mercurial.selenic.com/wiki/HisteditExtension

The problem comes if a small group have been working together on a new feature and have merged changes in from the mainline while doing so. They will then no longer be able to edit past the most recent merge.

Stephen J. Turnbull

5 Sep

8:29 p.m.

Antoine Pitrou writes:

...

...
In addition, a DVCS brings in another problem dimension: when people push their changes, they have *already* committed them - and perhaps not even they, but a contributor from which they had been pulling changes. The bogus change may have been weeks ago, so the subversion solution (of rejecting the commit to happen) doesn't quite work that well for a DVCS.

git has a nice filter-branch command, which would allow you to automatically repair the problem (it works basically by checking out each changeset and rerecording it with the appropriate commands). I know bzr is growing something similar, so presumably it is or will soon be available in hg.

...

I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems, and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

That's not a very good solution. Especially with typical Mercurial workflows[1], it's quite possible that you'll have a number of bogus changesets interleaved with good ones. I don't think recording a repair is satisfactory.

Footnotes:
[1] Note that DVCS means you do *not* have to follow Python workflows in your private branches.

Dirkjan Ochtman

8:57 p.m.

On 05/09/2009 16:59, Stephen J. Turnbull wrote:

...

git has a nice filter-branch command, which would allow you to automatically repair the problem (it works basically by checking out each changeset and rerecording it with the appropriate commands). I know bzr is growing something similar, so presumably it is or will soon be available in hg.

That means you change hashes on the server side, without human feedback. Let's try not to subvert the immutability design that Mercurial tries to encourage. Cheers, Dirkjan

Stephen J. Turnbull

6 Sep

1:26 p.m.

Dirkjan Ochtman writes:

...

On 05/09/2009 16:59, Stephen J. Turnbull wrote:

...
git has a nice filter-branch command, which would allow you to automatically repair the problem (it works basically by checking out each changeset and rerecording it with the appropriate commands). I know bzr is growing something similar, so presumably it is or will soon be available in hg.

That means you change hashes on the server side, without human feedback. Let's try not to subvert the immutability design that Mercurial tries to encourage.

No, I mean the server refuses to accept it and the submitter can fix it easily (with mq or histedit as Martin G points out), then resubmit. In any case Mercurial's notion of immutability is unsustainable in practice, as the plethora of extensions which mutate history testifies.

"Martin v. Löwis"

5 Sep

9:48 p.m.

...

I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems

I would find that unfortunate. It's a fairly irrelevant change, yet it may manage to corrupt the history (hg blame).

...

and push the whole bunch of changesets again (I assume the server-side hook will not examine changesets individually, but only the last of them?).

That is for us to decide. I can see arguments either way. But it shouldn't happen often that the server refuses a push; all errors should already be caught on the clients. Regards, Martin

Dirkjan Ochtman

9:55 p.m.

On Sat, Sep 5, 2009 at 18:18, "Martin v. Löwis" wrote:

...

But it shouldn't happen often that the server refuses a push; all errors should already be caught on the clients.

We could just mandate the same hook code as a commit hook. Cheers, Dirkjan

"Martin v. Löwis"

9:58 p.m.

...

...
But it shouldn't happen often that the server refuses a push; all errors should already be caught on the clients.

We could just mandate the same hook code as a commit hook.

I would be in favor (although, IIUC, "mandate" here would be a social thing, not a technical one). Regards, Martin

Dirkjan Ochtman

10:08 p.m.

On 05/09/2009 18:28, "Martin v. Löwis" wrote:

...

I would be in favor (although, IIUC, "mandate" here would be a social thing, not a technical one).

Right, but that would be the same for the extension. Cheers, Dirkjan

Georg Brandl

10:26 p.m.

Martin v. Löwis schrieb:

...

...
I don't think this problem is really serious. If the push fails, you can just commit (locally) a new changeset that repairs the EOL or indentation problems

I would find that unfortunate. It's a fairly irrelevant change, yet it may manage to corrupt the history (hg blame).

I'm for a per-changeset check as well. In the common case, the client will have the "required" extension, and errors will be caught there. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Terry Reedy

7:56 p.m.

Martin v. Löwis wrote:

...

...
I'm starting to wonder what the problem really is that makes it so Python-specific. If I understood correctly, it's about a couple of files which must be stored using non-Unix line endings, right? (in the PC and PCbuild directories?)

No. It's about files that must, when checked out on Windows, have CRLF endings, and, when checked out on Unix, have LF endings - i.e. all the .py, .c, .h, and .rst files, plus a couple of others which don't require specific treatment.

IOW, it's about the default behavior, and the majority of new files.

FWIW, I had the same impression as Antoine. I am aware that 'stupid'pad requires \r\n, but do IDLE and other editors (on Windows) that people would actually use to create/edit such files require it? I would personally be willing to install a notepad replacement if needed to quickview such files. If essentially all text files need fixed line endings on Windows, then hg really needs this built in. Has it really not been used much on Windows? tjr

Antoine Pitrou

8:39 p.m.

Terry Reedy writes:

...

If essentially all text files need fixed line endings on Windows, then hg really needs this built in. Has it really not been used much on Windows?

Mercurial is used by e.g. Mozilla, which is not really known for poor Windows support (chances are many Firefox developers are Windows-based). I wonder whether they have written their own extension, or if they simply rely on their text editors to do the right thing.

Martin (gzlist)

8:49 p.m.

On 05/09/2009, Antoine Pitrou wrote:

...

Terry Reedy writes:

...
If essentially all text files need fixed line endings on Windows, then hg really needs this built in. Has it really not been used much on Windows?

Mercurial is used by e.g. Mozilla, which is not really known for poor Windows support (chances are many Firefox developers are Windows-based). I wonder whether they have written their own extension, or if they simply rely on their text editors to do the right thing.

Actually, most Firefox developers are mac based. Mozilla isn't a great example of windows integration, they install half-a-unix-system in order to just build under windows, including msys, python 2.5, mercurial, and xemacs. See: https://developer.mozilla.org/en/Windows_Build_Prerequisites Martin

Dirkjan Ochtman

8:56 p.m.

On 05/09/2009 17:09, Antoine Pitrou wrote:

...

Mercurial is used by e.g. Mozilla, which is not really known for poor Windows support (chances are many Firefox developers are Windows-based). I wonder whether they have written their own extension, or if they simply rely on their text editors to do the right thing.

I'm pretty sure they don't have a specific extension for it. I don't know if many of their developers use the win32text extension, but I would guess not (I have been somewhat involved in Mozilla's migration). Cheers, Dirkjan

"Martin v. Löwis"

9:55 p.m.

...

FWIW, I had the same impression as Antoine. I am aware that 'stupid'pad requires \r\n, but do IDLE and other editors (on Windows) that people would actually use to create/edit such files require it? I would personally be willing to install a notepad replacement if needed to quickview such files.

Visual Studio will create files with CRLF endings. Please don't talk people out of using Visual Studio for development. More generally: please accept that it is consensus that this *has* to be fixed, and that it *is* a problem on Windows.

...

If essentially all text files need fixed line endings on Windows, then hg really needs this built in. Has it really not been used much on Windows?

I think that's the case. It's pretty much a Unix-only tool, like most of the other DVCS implementations. FWIW, I tried to check out Mozilla (which is in hg), and the check out would always abort with a timeout. Then I downloaded a bundle that they had produced, and tried to unbundle it. It took all night, but was complete the other morning. Trying to update the checkout would again make me run into http timeouts. I tried the same on Linux, and it completed within a few minutes. So I conclude that, from a certain project size on, hg is unusable on Windows, at least on my office machine, running Windows 7. Regards, Martin

Dirkjan Ochtman

10:32 p.m.

On Sat, Sep 5, 2009 at 18:25, "Martin v. Löwis" wrote:

...

I think that's the case. It's pretty much a Unix-only tool, like most of the other DVCS implementations.

I know a lot of projects use Mercurial on Windows as well, I'm not aware of any big problems with it.

...

FWIW, I tried to check out Mozilla (which is in hg), and the check out would always abort with a timeout. Then I downloaded a bundle that they had produced, and tried to unbundle it. It took all night, but was complete the other morning. Trying to update the checkout would again make me run into http timeouts. I tried the same on Linux, and it completed within a few minutes. So I conclude that, from a certain project size on, hg is unusable on Windows, at least on my office machine, running Windows 7.

That sounds pretty bad. By check out, do you mean the clone (getting data over the wire) or the actual check out (setting up a working directory)? I think I've heard of problems with the clone part before, for them. We're actually working on improving clone size, though it also seems to have to do with network reliability. Cheers, Dirkjan

"Martin v. Löwis"

11:07 p.m.

...

I know a lot of projects use Mercurial on Windows as well, I'm not aware of any big problems with it.

I trust that indeed, there are no big problems for most users. I also trust that the hg developers are, in general, open to incorporating improvements on Windows. I'm still skeptical though whether Mercurial is really usable on Windows. Your statement is a somewhat self-fulfilling prophecy: people who run into big problems early likely aren't going to use it. So lack of reports doesn't really mean there aren't any problems.

...

...
FWIW, I tried to check out Mozilla (which is in hg), and the check out would always abort with a timeout. Then I downloaded a bundle that they had produced, and tried to unbundle it. It took all night, but was complete the other morning. Trying to update the checkout would again make me run into http timeouts. I tried the same on Linux, and it completed within a few minutes. So I conclude that, from a certain project size on, hg is unusable on Windows, at least on my office machine, running Windows 7.

That sounds pretty bad. By check out, do you mean the clone (getting data over the wire) or the actual check out (setting up a working directory)?

Creating the clone. ISTM that it leaves the http connection open while doing stuff locally (or creates multiple of them, and one times out). It starts cloning, and then, after an hour or so, it reports ABORT, and rolls back, for no apparent reason.

...

I think I've heard of problems with the clone part before, for them. We're actually working on improving clone size, though it also seems to have to do with network reliability.

Our institute has generally a really good internet connection; I think hg.mozilla.org does as well. Plus, it worked when doing it on the very same machine on Linux. Regards, Martin

Martin (gzlist)

11:36 p.m.

On 05/09/2009, "Martin v. Löwis" wrote:

...

Creating the clone. ISTM that it leaves the http connection open while doing stuff locally (or creates multiple of them, and one times out).

It starts cloning, and then, after an hour or so, it reports ABORT, and rolls back, for no apparent reason.

I have been tracking mozilla-central, and believe this is a problem with the repo, that started some time after Aug 04 - which is the last log entry I have in my clone. I presumed it was just a problem my end so hadn't got round trying to debug it yet. If it is a general problem, the fact it's been around for about a month without being addressed might indicate how well tested most DVCSes are under windows. Martin

Neil Hodgson

6 Sep

4:26 a.m.

Dirkjan Ochtman:

...

I know a lot of projects use Mercurial on Windows as well, I'm not aware of any big problems with it.

If you have a Windows-only project with CRLF files using Mercurial then there is no line end problem as Mercurial preserves the CRLFs for you. Line end problems occur on mixed projects where both Windows and Unix tools are used. Neil

Paul Moore

5 Sep

10:56 p.m.

2009/9/5 "Martin v. Löwis" :

...

...
FWIW, I had the same impression as Antoine. I am aware that 'stupid'pad requires \r\n, but do IDLE and other editors (on Windows) that people would actually use to create/edit such files require it? I would personally be willing to install a notepad replacement if needed to quickview such files.

Visual Studio will create files with CRLF endings. Please don't talk people out of using Visual Studio for development. More generally: please accept that it is consensus that this *has* to be fixed, and that it *is* a problem on Windows.

(Disclaimer: I have no problem with accepting that the extension is needed - the following is for clarity, in particular to help me understand how the hook will be used)

There are 2 separate questions - (1) what is held in the repository, and (2) what is in the user's workspace. The two clearly interact.

Taking (2) first, there are *some* files (very few, I believe) that require specific line endings (CRLF - Visual Studio build files, is my understanding). There are also tools that require fixed line endings for input (notepad, Visual Studio). Finally, tools create new files with certain line endings by default (pretty much guaranteed to be platform-native, I'd expect). The result is that user workspaces *may* (quite probably, will) contain files with a mixture of line endings if care is not taken.

As regards (1), I assume that for "text" files, a consistent EOL convention (assumed LF) should be used in the repository. It's not clear to me what should be held in the repo for the files requiring specific line endings - my instinct is that they should be treated as "binary" and stored in the repo with the mandated line endings, and checked out unchanged.

So we have the following situation:

- Some "binary" files which should never be converted
- Some "text" files, which must be held in LF form in the repo

My view is that how I store text files in my workspace is entirely up to me (and the capabilities of my tools). So, how files get checked out should not be centrally mandated. (Hmm, that may complicate matters). How files are checked in is crucial, so a setting is required which ensures that each file so marked is converted to LF format on checkin - effectively working like universal newline mode.

So, the issues:

1. Given that the "problematic" tools (notepad and Visual Studio) are Windows tools, we seem to be back to the idea that this extension is only needed by Windows developers. As I understood the consensus to be that the extension should be for all users, I suspect I've missed something.

2. Allowing text files to be checked out in whatever form the user prefers seems complicated. The alternative would likely be to say text files are checked out in "native" form. That works, but would irritate me as I work on Windows, but prefer strongly to use LF line endings (yes, I realise that makes me an oddball...) I'd put up with it if it was the consensus to do this, of course.

3. Is there a realistic possibility that a user could edit one of the CRLF-requiring files with a tool that converts it to LF? If so, is there a need to trap that programmatically (as opposed to simply accepting that this equates to the individual accidentally breaking the build, no worse or better than checking in a C file with a syntax error)?

Is this a fair summary? Paul

"Martin v. Löwis"

11:26 p.m.

...

There are 2 separate questions - (1) what is held in the repository, and (2) what is in the user's workspace. The two clearly interact. [...] As regards (1), I assume that for "text" files, a consistent EOL convention (assumed LF) should be used in the repository.

Correct. I believe Martin (the other one) proposed to make it configurable. I agree that using LF in the repository is sensible. Wrt. "should": there is the debate whether intermediate revisions can deviate from that requirement.

...

It's not clear to me what should be held in the repo for the files requiring specific line endings - my instinct is that they should be treated as "binary" and stored in the repo with the mandated line endings, and checked out unchanged.

I would say so, yes. One consequence of that is that if you change your mind in hgeols, you need to commit all files that now fail to conform. This is what happens with svn, and it may be tricky to implement as you need to commit files that didn't change on disk (say you switch from native to CRLF on a Windows checkout). OTOH, even if you do store all text files in LF in the repo, then you would still have the problem for files that go from unspecified eol-style to a specified one. So changing hgeols is tricky, period.

...

- Some "binary" files which should never be converted - Some "text" files, which must be held in LF form in the repo

My view is that how I store text files in my workspace is entirely up to me (and the capabilities of my tools). So, how files get checked out should not be centrally mandated.

Not by technical means, no. In the developer FAQ, there will be clear directions, and you better follow them - or need to accept the blame if you made a mistake because you ignored them.

...

So, the issues:

1. Given that the "problematic" tools (notepad and Visual Studio) are Windows tools, we seem to be back to the idea that this extension is only needed by Windows developers. As I understood the consensus to be that the extension should be for all users, I suspect I've missed something.

Technically, yes, it is only needed on Windows. The desire to have all users use it comes from the wish that problems with the setup will be detected earlier. E.g. if the extension stops working with a new Mercurial version, and all users use it, there is a larger motivation to fix it for all users.

Things that the extension can do for you on Unix:

- check that the syntax of .hgeols is correct; this may affect Unix users who try to edit it
- check that all text files have consistent line endings, and refuse checkin if they don't. This may become relevant if a Unix text editor tries to edit a CRLF file, and doesn't quite detect that.
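The second check in the list above, consistent line endings, can be sketched in a few lines. The names are hypothetical and this is not code from any posted extension. Counting a lone CR as its own style is an assumption, made so that a stray CR in an otherwise LF file is reported rather than silently accepted.

```python
import re

_EOL = re.compile(rb"\r\n|\r|\n")  # CRLF first so it counts as one ending
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def eol_styles(data):
    """Return the set of line-ending styles that occur in data."""
    return {_NAMES[match.group()] for match in _EOL.finditer(data)}


def is_consistent(data):
    """Return True if data uses at most one line-ending style."""
    return len(eol_styles(data)) <= 1
```

A commit-time check could refuse any text file for which is_consistent returns False and name the styles found, leaving the repair to the committer.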

...

2. Allowing text files to be checked out in whatever form the user prefers seems complicated. The alternative would likely be to say text files are checked out in "native" form. That works, but would irritate me as I work on Windows, but prefer strongly to use LF line endings (yes, I realise that makes me an oddball...) I'd put up with it if it was the consensus to do this, of course.

It's consensus, and it's also what subversion does, and CVS did.

...

3. Is there a realistic possibility that a user could edit one of the CRLF-requiring files with a tool that converts it to LF? If so, is there a need to trap that programmatically (as opposed to simply accepting that this equates to the individual accidentally breaking the build, no worse or better than checking in a C file with a syntax error)?

I think everything you can imagine is also realistic. For the less-realistic cases, it may be better if the commit is refused rather than silently fixing it, since the user operated the system in a surprising way - so he may actually have meant to do it that way. One specific case is recent autoconf, which put a CR character into configure, completely breaking svn's eol handling. Regards, Martin

"Martin v. Löwis"

11:34 p.m.

...

...
2. Allowing text files to be checked out in whatever form the user prefers seems complicated. The alternative would likely be to say text files are checked out in "native" form. That works, but would irritate me as I work on Windows, but prefer strongly to use LF line endings (yes, I realise that makes me an oddball...) I'd put up with it if it was the consensus to do this, of course.

It's consensus, and it's also what subversion does, and CVS did.

Following up to myself: you might want to make it a feature that "native" is configurable, per user or per repo. It should default to CRLF on Windows, but you might want to set it to LF on your system. In that case, the extension would have just its checking functionality. Regards, Martin
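A configurable "native" could be resolved roughly as follows. This is a sketch: the function name and the idea of passing the configured value as a string are assumptions, and no particular configuration file or key is implied.

```python
import os


def resolve_native(configured=None, linesep=os.linesep):
    """Return the bytes that "native" stands for.

    configured is a per-user or per-repo override ("LF" or "CRLF"), or None
    to fall back to the platform default (CRLF on Windows, LF elsewhere).
    """
    names = {"LF": b"\n", "CRLF": b"\r\n"}
    if configured is not None:
        try:
            return names[configured.upper()]
        except KeyError:
            raise ValueError("unknown EOL name: %r" % configured)
    return linesep.encode("ascii")
```

A Windows user who sets the override to LF would get no conversion at all from such an extension, which leaves only its checking functionality, as described above.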

Neil Hodgson

6 Sep

4:31 a.m.

Paul Moore:

...

1. Given that the "problematic" tools (notepad and Visual Studio) are Windows tools, we seem to be back to the idea that this extension is only needed by Windows developers. As I understood the consensus to be that the extension should be for all users, I suspect I've missed something.

Some of the problems come from users on Unix checking in files with CRLF line ends that they have received using some other mechanism such as sharing a disk between Windows and Linux. I was going to point to a bad revision in a bzr housed project I work on but launchpad isn't working currently. What happened was that an OS X user committed a set of changes but with all the files having a different line ending to the repository. The result is that it is no longer easy to track changes before that revision. It also makes a check out larger. It would help in such cases for the commit command on Unix to either automatically change any CRLF line ends to LF for text files (but not files with an explicitly specified line end) or to display a warning. Neil

Stephen J. Turnbull

1:56 p.m.

Paul Moore writes:

...

The result is that user workspaces *may* (quite probably, will) contain files with a mixture of line endings if care is not taken.

Yes. Under your "fixed-EOL-files-are-binary" scheme, this is guaranteed for Unix systems.

...

As regards (1), I assume that for "text" files, a consistent EOL convention (assumed LF) should be used in the repository. It's not clear to me what should be held in the repo for the files requiring specific line endings - my instinct is that they should be treated as "binary" and stored in the repo with the mandated line endings, and checked out unchanged.

Why? Files that require specific line endings are in general used in platform-specific ways. So checking them out with the platform's normal line ending should work fine.

...

My view is that how I store text files in my workspace is entirely up to me (and the capabilities of my tools). So, how files get checked out should not be centrally mandated.

Your tools will be able to work with the native EOL convention, or you wouldn't be able to stand using them. In general the extension should default to checking out with the native convention. If you really want to change that, you can; there's nothing the server can do to mandate what's in your workspaces. The "mandate" here is simply the default extension that Python and/or Mercurial will distribute to help developers avoid having their pushes aborted for incorrect EOLs.

...

So, the issues:

1. Given that the "problematic" tools (notepad and Visual Studio) are Windows tools, we seem to be back to the idea that this extension is only needed by Windows developers. As I understood the consensus to be that the extension should be for all users, I suspect I've missed something.

What you've missed is that developers *of* the Windows port are not necessarily developers *on* Windows. If we treat these as text files and check them out in the native convention everywhere, then it doesn't matter if you edit them on Unix or Windows, when you check them out and build on Windows it Just Works[tm]. I have never heard of a Unix cross-IDE port of Visual Studio....

...

2. Allowing text files to be checked out in whatever form the user prefers seems complicated.

It's not a question of "allow". AIUI, you won't be allowed to push a commit with broken line endings to the public repo. This is too much of a burden to impose given the wayward behavior of existing tools, so an extension will be distributed that does the checking (and any needed conversion) for you. If you don't like that extension, you can change it; it shouldn't be too difficult. Eg:

...

That works, but would irritate me as I work on Windows, but prefer strongly to use LF line endings (yes, I realise that makes me an oddball...) I'd put up with it if it was the consensus to do this, of course.

You don't need to. In that case I would guess that you are at very low risk if you disable the checkout side of the extension.

...

3. Is there a realistic possibility that a user could edit one of the CRLF-requiring files with a tool that converts it to LF?

Yes. It happened occasionally in XEmacs's CVS repository, and caused great consternation among Windows testers.

...

If so, is there a need to trap that programmatically (as opposed to simply accepting that this equates to the individual accidentally breaking the build, no worse or better than checking in a C file with a syntax error)?

It's worse. Until the problem is fixed, the people who need a different EOL convention are working with a file with breakage on every line. Furthermore, that breakage may be quite silent, eg, if you use XEmacs to edit on Windows but Visual Studio to build. With Emacs it's easy enough to change -- if you recognize the breakage.

Dj Gilcrease

19 Aug

3:51 a.m.

On Tue, Aug 18, 2009 at 2:12 AM, "Martin v. Löwis" wrote:

...

The second item is line conversion hooks. Dj Gilcrease has posted a solution which he considers a hack himself. Mark Hammond has also volunteered, but it seems some volunteer needs to be "in charge", keeping track of a proposed solution until everybody agrees that it is a good solution. It may be that two solutions are necessary: a short-term one, that operates as a hook and has limitations, and a long-term one, that improves the hook system of Mercurial to implement the proper functionality (which then might get shipped with Mercurial in a cross-platform manner).

My solution is a hack because the hooks in Mercurial need to be modified to support it properly, I would be happy to help work on this as it is a situation I run into all the time in my own projects. I can never seem to get all the developers to enable the hooks, and one of them always commits with improper line endings =P

Mark Hammond

21 Aug

12:46 p.m.

[Adjusted the CCs...] On 19/08/2009 8:21 AM, Dj Gilcrease wrote:

...

On Tue, Aug 18, 2009 at 2:12 AM, "Martin v. Löwis" wrote:

...
The second item is line conversion hooks. Dj Gilcrease has posted a solution which he considers a hack himself. Mark Hammond has also volunteered, but it seems some volunteer needs to be "in charge", keeping track of a proposed solution until everybody agrees that it is a good solution. It may be that two solutions are necessary: a short-term one, that operates as a hook and has limitations, and a long-term one, that improves the hook system of Mercurial to implement the proper functionality (which then might get shipped with Mercurial in a cross-platform manner).

My solution is a hack because the hooks in Mercurial need to be modified to support it properly, I would be happy to help work on this as it is a situation I run into all the time in my own projects. I can never seem to get all the developers to enable the hooks, and one of them always commits with improper line endings =P

Maybe you can enumerate what you think needs to change in mercurial, then once we have a plan in place it will be clearer who can do what.

I'm resurrecting my patch to support a filter called 'none' (which is turning out to be harder than I thought). Off the top of my head, the following would give us a pretty solid solution:

* Finish my patch for 'none' as a filter, so '**=cleverencode' can be reasonably used (currently you can't specify that specific files should *not* have cleverencode, making it unsuitable in practice without the concept of 'none')

* Add support for versioned 'filter rules' - eg, /.hgfilters or similar.

* This might be pushing my luck, but: add 'defensive' support to core hg for this feature - if /.hgfilters exists, hg should refuse to operate on the working tree unless the win32text extension is enabled.

Note that this last point still leaves win32text optional for hg itself - but if the owner of a repository has explicitly 'opted in' for win32text support, hg can still assist in refusing to screw the tree. The hg user has the option of enabling that extension, declining to use that repository, or arguing with the owner of the repo about use of the feature in the first place.

Is there something I'm missing? Or maybe a better way to have hg enforce a repository's policy while not inflicting pain on hg users who don't want to ever think about windows?

Cheers, Mark
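
To illustrate why a 'none' filter matters, here is a small sketch of rule selection. It is not Mercurial's implementation: the function name, the rule list, and the "first matching pattern wins" ordering are assumptions made for the example, and the patterns use Python's fnmatch syntax rather than Mercurial's own pattern syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list: specific exclusions first, catch-all last.
RULES = [
    ("*.sln", "none"),
    ("*.vcproj", "none"),
    ("*", "cleverencode"),
]


def pick_filter(filename, rules):
    """Return the filter for the first matching pattern, or None for no filter."""
    for pattern, filter_name in rules:
        if fnmatchcase(filename, pattern):
            # 'none' explicitly opts a file out of an otherwise global rule.
            return None if filter_name == "none" else filter_name
    return None
```

Without the 'none' entries, the catch-all rule would apply to every file, which is the limitation described in the first bullet.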

Dirkjan Ochtman

1:18 p.m.

On Fri, Aug 21, 2009 at 09:16, Mark Hammond wrote:

...

I'm resurrecting my patch to support a filter called 'none' (which is turning out to be harder than I thought). Off the top of my head, the following would give us a pretty solid solution:

* Finish my patch for 'none' as a filter, so '**=cleverencode' can be reasonably used (currently you can't specify that specific files should *not* have cleverencode, making it unsuitable in practice without the concept of 'none')

* Add support for versioned 'filter rules' - eg, /.hgfilters or similar.

* This might be pushing my luck, but: add 'defensive' support to core hg for this feature - if /.hgfilters exists, hg should refuse to operate on the working tree unless the win32text extension is enabled.

Sounds great to me. The latter might indeed be hard to get into the core, but seems like a good idea to try. Cheers, Dirkjan

Stephen J. Turnbull

2:20 p.m.

Mark Hammond writes:

...

* Add support for versioned 'filter rules' - eg, /.hgfilters or similar.

* This might be pushing my luck, but: add 'defensive' support to core hg for this feature - if /.hgfilters exists, hg should refuse to operate on the working tree unless the win32text extension is enabled.

The name ".hgfilters" should be changed, then. That's way too generic to be used to "enforce" something as specific as win32text. I can imagine all kinds of things wanting to use rules or filters.

How about a scheme where an extension reserves a filter file for itself in .hgfilters? In this case the win32text filters would live in .hgfilters/win32text, and if that file exists hg checks that the corresponding extension has been enabled, and if not, refuses to run (and tells you that if you really want to override, you rename the file to win32text.disabled and commit).

Note that Bazaar is currently discussing some similar policies. I think the name they have settled on is ".bzrrules". Maybe .hgrules is a better name.

--
________________________________________________________________________
________________________________________________________________________
Q: What are those straight lines? A: "XEmacs rules."

Nick Coghlan

3:27 p.m.

Stephen J. Turnbull wrote:

...

Note that Bazaar is currently discussing some similar policies. I think the name they have settled on is ".bzrrules". Maybe .hgrules is a better name.

So it would be .hgrules/<extensionname>? With the extension then defining the contents of the rule file?

An alternative would be to go one level deeper and have:

.hgrules/required/<extensionname>
.hgrules/optional/<extensionname>

If an extension rule file appeared in the first subdirectory then hg would refuse to operate on the repository without that extension being enabled.

I guess something like that might be nice to have, but the support for negative filtering and versioned rule definitions is all we really need from a python-dev point of view.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
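
The required/optional layout could be checked along the following lines. This is only a sketch of the idea as proposed in this message; nothing like it exists in Mercurial, and the function name and the assumption that the check works from a list of tracked paths are both hypothetical.

```python
REQUIRED_PREFIX = ".hgrules/required/"


def missing_required_extensions(tracked_paths, enabled_extensions):
    """List extensions that have a rule file under required/ but are not enabled."""
    required = {
        path[len(REQUIRED_PREFIX):]
        for path in tracked_paths
        if path.startswith(REQUIRED_PREFIX)
    }
    # Rule files under .hgrules/optional/ never block anything, so they are ignored.
    return sorted(required - set(enabled_extensions))
```

A tool using this would refuse to operate when the returned list is non-empty and proceed normally otherwise.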

Stephen J. Turnbull

6:30 p.m.

Nick Coghlan writes:

...

Stephen J. Turnbull wrote:

...
Note that Bazaar is currently discussing some similar policies. I think the name they have settled on is ".bzrrules". Maybe .hgrules is a better name.

So it would be .hgrules/<extensionname>? With the extension then defining the contents of the rule file?

Yes.

...

An alternative would be to go one level deeper and have:

.hgrules/required/<extensionname>
.hgrules/optional/<extensionname>

I thought briefly about that kind of thing. However, this way would require deciding the semantics of the subdirectories, and while "optional" vs "required" is pretty appealing, how about "required" vs. "requisite"? (As Dave Barry would say, "I am *still* not kidding." See: http://www.kernel.org/pub/linux/libs/pam/Linux-PAM-html/sag-configuration-fi... Of course anything related to Python would do a better job of naming<wink>, but such semantic fine points might very well be important. And yes, there are people who take their VCS as seriously as they take authenticating as root.) So what I thought was that extensions would provide a policy function, which would make such judgments when called. But then I realized I had no clue what the semantics should be, so I didn't mention it.

Dj Gilcrease

7:40 p.m.

On Fri, Aug 21, 2009 at 1:16 AM, Mark Hammond wrote:

...

Maybe you can enumerate what you think needs to change in mercurial, then once we have a plan in place it will be clearer who can do what.

The encode/decode hooks need to be passed the filename they are working on so you can have an ignore list. This is why I consider my method a hack: I am using a precommit hook to do the conversion, since there I am able to find out which file I am working on and make sure it is not in an ignore list. There also needs to be a way to have required and version controlled extensions. This weekend I plan on digging into Mercurial's hook code and doing up a patch so the encode/decode hooks accept the filename they are working on in a backwards compatible way.

...

An alternative would be to go one level deeper and have:

.hgrules/required/<extensionname>
.hgrules/optional/<extensionname>

I like this, though maybe .hgextensions since it would contain versioned rules and the actual required extension. The extra sub directories are not really required IMHO, you just have a hgrc file that works the same as the local hgrc file except it only looks in the .hgextensions directory for the correct extension so for python we could have something like

[extensions]
format_enforcer =
[encode]
**=format_enforcer.cleverencode
[ignore]
*.sln=
...
[hooks]
pretxncommit.crlf = python:format_enforcer.forbidcrlf
pretxncommit.cr = python:format_enforcer.forbidcr
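
The "backwards compatible" filename passing mentioned earlier in this message could look roughly like this. It is a sketch under stated assumptions, not the patch itself: call_filter, encode, legacy_encode, and IGNORE_PATTERNS are all hypothetical names, and the real Mercurial filter machinery is not shown.

```python
import inspect
from fnmatch import fnmatchcase

IGNORE_PATTERNS = ["*.sln", "*.vcproj"]  # hypothetical ignore list


def encode(data, filename, ignore=IGNORE_PATTERNS):
    """Filename-aware filter: leave ignored files alone, convert CRLF to LF otherwise."""
    if any(fnmatchcase(filename, pattern) for pattern in ignore):
        return data
    return data.replace(b"\r\n", b"\n")


def legacy_encode(data):
    """Old-style filter that only ever sees the data."""
    return data.replace(b"\r\n", b"\n")


def call_filter(filter_fn, data, filename):
    """Pass the filename only to filters whose signature accepts it."""
    params = inspect.signature(filter_fn).parameters
    if "filename" in params:
        return filter_fn(data, filename=filename)
    return filter_fn(data)
```

Inspecting the signature is one way to keep existing filters working unchanged; an explicit registration flag would be another.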

Dirkjan Ochtman

7:49 p.m.

On Fri, Aug 21, 2009 at 16:10, Dj Gilcrease wrote:

...

I like this, though maybe .hgextensions since it would contain versioned rules and the actual required extension. The extra sub directories are not really required IMHO, you just have a hgrc file that works the same as the local hgrc file except it only looks in the .hgextensions directory for the correct extension so for python we could have something like

[extensions]
format_enforcer =

Enabling extensions in a versioned file is not going to fly. Cheers, Dirkjan

Dj Gilcrease

8:12 p.m.

On Fri, Aug 21, 2009 at 8:19 AM, Dirkjan Ochtman wrote:

...

Enabling extensions in a versioned file is not going to fly.

any specific reason?

Martin Geisler

22 Aug

4:47 a.m.

Dj Gilcrease writes:

...

On Fri, Aug 21, 2009 at 8:19 AM, Dirkjan Ochtman wrote:

...
Enabling extensions in a versioned file is not going to fly.

any specific reason?

In the general case, you can specify an extension to be enabled by filename:

[extensions]
foo = ~/src/foo

So if I can enable an extension like that on your system, I might be evil and commit a bad extension *and* enable it at the same time.

You might argue that one should then limit which extensions one can enable in a versioned file, but it seems hard to come up with a good mechanism for this. The current "mechanism" is the user's own ~/.hgrc file, which can be seen as a whitelist of extensions he trusts.

An alternative could be the new %include syntax for configuration files, which was introduced in Mercurial 1.3. If you add

%include ../config

to your .hg/hgrc file, the (versioned!) file named 'config' from the root of your repository will be included on the spot. The catch is that you have to add such a line to all your Python clones.

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Dirkjan Ochtman

1:05 p.m.

On Sat, Aug 22, 2009 at 01:17, Martin Geisler wrote:

...

In the general case, you can specify an extension to be enabled by filename:

[extensions]
foo = ~/src/foo

So if I can enable an extension like that on your system, I might be evil and commit a bad extension *and* enable it at the same time.

You might argue that one should then limit which extensions one can enable in a versioned file, but it seems hard to come up with a good mechanism for this. The current "mechanism" is the user's own ~/.hgrc file, which can be seen as a whitelist of extensions he trusts.

Thanks for explaining that bit, Martin. Everyone: Martin is also a hg crew member. It sounds to me like somehow requiring extensions to be enabled (without actually enabling them) would help mitigate the issues somehow, although it's still a distributed system and so clients cannot be trusted (e.g. I might put a win32text stub in there somewhere that does nothing). Cheers, Dirkjan

Stephen J. Turnbull

2:29 p.m.

Dirkjan Ochtman writes:

...

[Clients] cannot be trusted (e.g. I might put a win32text stub in there somewhere that does nothing).

Heck, just edit the .hgrules file, and do a Houdini on any and all handcuffs. Don't trust software, trust people -- but help them avoid thoughtless mistakes.

Nick Coghlan

9:17 p.m.

Stephen J. Turnbull wrote:

...

Dirkjan Ochtman writes:

...
[Clients] cannot be trusted (e.g. I might put a win32text stub in there somewhere that does nothing).

Heck, just edit the .hgrules file, and do a Houdini on any and all handcuffs.

Don't trust software, trust people -- but help them avoid thoughtless mistakes.

Yes, on the client side we're not trying to prevent someone doing the wrong thing deliberately - just nudging them towards doing the right thing so they won't run afoul of the server side checks that will actually *enforce* the line ending rules for the main repository. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Mark Hammond

6:26 a.m.

On 22/08/2009 12:19 AM, Dirkjan Ochtman wrote:

...

On Fri, Aug 21, 2009 at 16:10, Dj Gilcrease wrote:

...
I like this, though maybe .hgextensions since it would contain versioned rules and the actual required extension. The extra sub directories are not really required IMHO, you just have a hgrc file that works the same as the local hgrc file except it only looks in the .hgextensions directory for the correct extension so for python we could have something like

[extensions]
format_enforcer =

Enabling extensions in a versioned file is not going to fly.

I like Stephen and Nick's discussion higher in this thread, but wonder if some middle ground couldn't work. Instead of [extensions], just have a place to list the required extensions. Something like ~/.hgrules having:

[config] # or maybe [rules] ?
required_extensions = win32text, some_pydev_specific_extension
[encode]
{rules for encoding}
[pydev]
some_custom_property_for_our_custom_ext = 1
... etc ...

(Note I am not proposing we need our own pydev_specific_extension, I just included it here to try and show the more general concept)

This way you aren't *enabling* extensions in this versioned file, just listing rules about what extensions must be enabled. From core hg's POV, it doesn't care if the required extensions relate to windows line endings or re-encoding images - it just honours the wishes of the repo owner.

From earlier in the thread, Dirkjan writes:

...

The [concept of hg enforcing required extensions] might indeed be hard to get into the core, but seems like a good idea to try.

From my POV, this would be required in some form or another before such a scheme could actually work. Without it we end up with an improved win32text (good!) but in practice still have the same problems we have discussed in this thread which would make it unsuitable for us who actually try and use it, particularly as a general solution for projects with any kind of windows focus or community. Given you are a core hg committer and well known in the community, would you be willing to start a thread with the hg developers about this issue? If something like this can't get into the core, I will drop any expectations of it becoming a viable general solution for windows focused projects, so would limit the work I am willing to invest to the commitments I've made here. Thanks! Mark
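
The "list, don't enable" idea from this message can be sketched with the standard library's configparser. This is an illustration only: the [config] section and required_extensions key follow the example spelling in this message, which is explicitly not final, and the function names are hypothetical. Mercurial's own config parser is not used here.

```python
import configparser

# Sample versioned rules file, using the placeholder names from the message.
SAMPLE_RULES = """\
[config]
required_extensions = win32text, some_pydev_specific_extension
"""


def required_extensions(rules_text):
    """Parse the comma-separated list of extensions the repository asks for."""
    parser = configparser.ConfigParser()
    parser.read_string(rules_text)
    raw = parser.get("config", "required_extensions", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


def missing_extensions(rules_text, enabled):
    """Return required extensions the user has not enabled in their own config."""
    return [name for name in required_extensions(rules_text) if name not in enabled]
```

Note that nothing here loads or enables code: the versioned file only names extensions, and the set of enabled extensions still comes from the user's own configuration.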

Stephen J. Turnbull

10:16 a.m.

Mark Hammond writes:

...

Something like ~/.hgrules having:

Surely you mean $PROJECTROOT/.hgrules?

...

[config] # or maybe [rules] ?
required_extensions = win32text, some_pydev_specific_extension

[extensions]
required_for_commit = win32text,some_other_ext

That might require a change to hg's ini file semantics if currently it refuses to parse [extension] sections in versioned hgrcs.

Note the change in name: I'm not sure exactly what the semantics should be, but surely we want to allow browsing the repository, branching, etc without enabling any extensions.

...

[Encode] {rules for encoding}

No, there must be a way to indicate that "this is a section for a specific extension". Bare [Encode] will be seen as polluting the global namespace, and will get a lot of pushback, I think.

...

This way you aren't *enabling* extensions in this versioned file,

True, but how many people will just download the extension and enable it? This would open a door to "social engineering". (Personally, *I* am not opposed to it on those grounds, but as devil's advocate I do want to mention that as an argument you might run into.)

...

just listing rules about what extensions must be enabled. From core hg's POV, it doesn't care if the required extensions relate to windows line endings or re-encoding images - it just honours the wishes of the repo owner.

If it refuses the user's request, it should issue a message to the effect of "Please enable win32text, which is required in ."

Mark Hammond

10:32 a.m.

On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:

...

Mark Hammond writes:

...
Something like ~/.hgrules having:

Surely you mean $PROJECTROOT/.hgrules?

Indeed.

...

...
[config] # or maybe [rules] ?
required_extensions = win32text, some_pydev_specific_extension

[extensions]
required_for_commit = win32text,some_other_ext

That might require a change to hg's ini file semantics if currently it refuses to parse [extension] sections in versioned hgrcs.

Yes - I'm not proposing specific names for sections etc - I'm more interested in getting the concepts across, and fully expect the hg guys will have their own opinions and make final decisions on the exact spelling.

...

Note the change in name: I'm not sure exactly what the semantics should be, but surely we want to allow browsing the repository, branching, etc without enabling any extensions.

...
[Encode] {rules for encoding}

No, there must be a way to indicate that "this is a section for a specific extension". Bare [Encode] will be seen as polluting the global namespace, and will get a lot of pushback, I think.

Possibly - although I would expect the existing section names be reused when applied to a versioned file, I'd be more than happy for the hg guys to declare new names are appropriate for this.

...

...
This way you aren't *enabling* extensions in this versioned file,

True, but how many people will just download the extension and enable it?

In the ideal world, exactly as many people who would read the Python developer guide, then download and install the extension based purely on that. IOW, it is Python itself setting the policy, so people need to make their own decisions based on that, regardless of whether the tool enforces it or not.

...

This would open a door to "social engineering". (Personally, *I* am not opposed to it on those grounds, but as devil's advocate I do want to mention that as an argument you might run into.)

...
just listing rules about what extensions must be enabled. From core hg's POV, it doesn't care if the required extensions relate to windows line endings or re-encoding images - it just honours the wishes of the repo owner.

If it refuses the user's request, it should issue a message to the effect of "Please enable win32text, which is required in."

Agreed. Thanks, Mark

Stephen J. Turnbull

2:22 p.m.

Mark Hammond writes:

...

On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:

...

Possibly - although I would expect the existing section names be reused when applied to a versioned file, I'd be more than happy for the hg guys to declare new names are appropriate for this.

If there's already an [Encode] section, that's different. (I don't know the details, I'm not that big a Mercurial fan.) But you'd still need a way to differentiate win32text rules from other encoding rules.

...

...
...
This way you aren't *enabling* extensions in this versioned file,

True, but how many people will just download the extension and enable it?

In the ideal world, exactly as many people who would read the Python developer guide, then download and install the extension based purely on that. IOW, it is Python itself setting the policy, so people need to make their own decisions based on that, regardless of whether the tool enforces it or not.

You're missing the point. I'm not talking about whether it will work for Python, I'm talking about the worry that somebody will post a way cool Python branch and require a private extension, which everybody will just automatically install and enable, which extension then proceeds to phone home to Spammer Haven, Inc. with the contents of your email contact list. That's what I mean by "social engineering," and why I worry about policy pushback from Mercurial HQ. Maybe that's more paranoid than they are.... But it can't hurt your cause to be ready for that kind of worry.

Martin Geisler

3:18 p.m.

"Stephen J. Turnbull" writes:

...

Mark Hammond writes:

...
On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:

...
Possibly - although I would expect the existing section names be reused when applied to a versioned file, I'd be more than happy for the hg guys to declare new names are appropriate for this.

If there's already an [Encode] section, that's different. (I don't know the details, I'm not that big a Mercurial fan.) But you'd still need a way to differentiate win32text rules from other encoding rules.

There is a [decode] and an [encode] section:

http://www.selenic.com/mercurial/hgrc.5.html#decode-encode

The win32text extension works by defining new filters which can then be used like this:

[encode]
** = cleverencode:
[decode]
** = cleverdecode:

(they are "clever" because they skip binary files)
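
As a rough illustration of what "clever" means here, the sketch below converts line endings only when the content does not look binary. It is not the win32text source: the function names are hypothetical, and treating "contains a NUL byte" as the binary test is an assumption made for the example.

```python
def looks_binary(data: bytes) -> bool:
    """Assumed heuristic: any NUL byte marks the content as binary."""
    return b"\0" in data


def clever_encode(data: bytes) -> bytes:
    """Working copy to repository: CRLF becomes LF, binary content is untouched."""
    if looks_binary(data):
        return data
    return data.replace(b"\r\n", b"\n")


def clever_decode(data: bytes) -> bytes:
    """Repository to working copy: LF becomes CRLF, binary content is untouched."""
    if looks_binary(data):
        return data
    # Normalize first so an existing CRLF is not turned into CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

Because the decision is made by inspecting content, a text file that happens to contain a NUL byte would be skipped, which is one reason guessing from content is debated later in the thread.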

...

...
...
True, but how many people will just download the extension and enable it?

In the ideal world, exactly as many people who would read the Python developer guide, then download and install the extension based purely on that. IOW, it is Python itself setting the policy, so people need to make their own decisions based on that, regardless of whether the tool enforces it or not.

You're missing the point. I'm not talking about whether it will work for Python, I'm talking about the worry that somebody will post a way cool Python branch and require a private extension, which everybody will just automatically install and enable, which extension then proceeds to phone home to Spammer Haven, Inc. with the contents of your email contact list. That's what I mean by "social engineering," and why I worry about policy pushback from Mercurial HQ.

Maybe that's more paranoid than they are.... But it can't hurt your cause to be ready for that kind of worry.

Oh, we try to be very paranoid in Mercurial :-) That's why you don't see any support for copying hgrc files when you clone and why hg won't trust hgrc files not owned by you: it should be safe to do

cd ~collegue/src/python
hg tip

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Paul Moore

4:48 p.m.

2009/8/22 Martin Geisler :

...

Oh, we try to be very paranoid in Mercurial :-) That's why you don't see any support for copying hgrc files when you clone and why hg won't trust hgrc files not owned by you: it should be safe to do

cd ~collegue/src/python
hg tip

So, is the implication therefore that there would be resistance to having some way of making a setting which *is* copied on clone, which says that you can't commit in this repository unless you have the following extensions enabled? Or is the fact that it's only saying "you must have an extension called win32text enabled" and not actually enabling code directly, sufficiently secure to make it acceptable? Paul.

Martin Geisler

7:05 p.m.

Paul Moore writes:

...

2009/8/22 Martin Geisler :

...
Oh, we try to be very paranoid in Mercurial :-) That's why you don't see any support for copying hgrc files when you clone and why hg won't trust hgrc files not owned by you: it should be safe to do

cd ~collegue/src/python
hg tip

So, is the implication therefore that there would be resistance to having some way of making a setting which *is* copied on clone, which says that you can't commit in this repository unless you have the following extensions enabled?

It sounds somewhat invasive to forbid commits. Moreover, repository owners should remember that clients can do whatever they want, so this can only be a hint, never a requirement.

I don't think this has been mentioned: When you clone you move history (changesets) only and I'm pretty sure you cannot even read the configuration settings over the "wire protocol". So cloning from a HTTP URL won't copy a setting found in the <repo>/.hg/hgrc file. This implies that the settings should live in a version controlled file. I think that is sensible under all circumstances.

So if the win32text extension (horrible name, I agree... it should have been made more general and called eolconvert or something like that) would just read a configuration file from the repository, then all you should ask people is to enable win32text.

...

Or is the fact that it's only saying "you must have an extension called win32text enabled" and not actually enabling code directly, sufficiently secure to make it acceptable?

It is definitely secure enough to be included. There should be a way to turn off those hints, though: I might want to clone the Python repository and play around with it without enabling win32text. -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

Mark Hammond

23 Aug

4:47 a.m.

On 22/08/2009 6:52 PM, Stephen J. Turnbull wrote:

...

Mark Hammond writes:

...
On 22/08/2009 2:46 PM, Stephen J. Turnbull wrote:

...
Possibly - although I would expect the existing section names be reused when applied to a versioned file, I'd be more than happy for the hg guys to declare new names are appropriate for this.

If there's already an [Encode] section, that's different. (I don't know the details, I'm not that big a Mercurial fan.) But you'd still need a way to differentiate win32text rules from other encoding rules.

As mentioned in my previous post, I'm trying to avoid bike-shedding what the hg guys are better placed to decree. How they choose to spell these options is something for hg to decide, and I doubt my opinion matters enough to bother sharing, let alone advocating.

...

...
...
...
This way you aren't *enabling* extensions in this versioned file,

True, but how many people will just download the extension and enable it?

In the ideal world, exactly as many people who would read the Python developer guide, then download and install the extension based purely on that. IOW, it is Python itself setting the policy, so people need to make their own decisions based on that, regardless of whether the tool enforces it or not.

You're missing the point. I'm not talking about whether it will work for Python, I'm talking about the worry that somebody will post a way cool Python branch and require a private extension, which everybody will just automatically install and enable, which extension then proceeds to phone home to Spammer Haven, Inc. with the contents of your email contact list. That's what I mean by "social engineering," and why I worry about policy pushback from Mercurial HQ.

No, you are missing the point - social engineering doesn't require tool support - tools simply make certain things easier.

...

Maybe that's more paranoid than they are.... But it can't hurt your cause to be ready for that kind of worry.

If this becomes seen as 'my' cause, I suspect it will run out of steam very quickly. I truly hope python-dev, as a community, takes some ownership of this issue or I predict the effort will fizzle out without a workable solution. There seem to be a number of people who agree the status-quo isn't acceptable, so I'm not sure what would happen in that case... Cheers, Mark

"Martin v. Löwis"

12:55 p.m.

...

If this becomes seen as 'my' cause, I suspect it will run out of steam very quickly. I truly hope python-dev, as a community, takes some ownership of this issue

That certainly won't happen. python-dev, as a community, has never ever taken ownership of anything. It's always individuals who take ownership. So you essentially say that you want somebody else (but not you) to take ownership - which, of course, is certainly fine. Hence my call for volunteers.

...

There seem to be a number of people who agree the status-quo isn't acceptable, so I'm not sure what would happen in that case...

My prediction is that it will depend on whether workable code is available by the time a decision is made to migrate. If code is available, then migration will happen (no matter whether the code has an owner); if no code is available, migration will stall. Regards, Martin

Mark Hammond

24 Aug

8:29 a.m.

On 23/08/2009 5:25 PM, "Martin v. Löwis" wrote:

...

...
If this becomes seen as 'my' cause, I suspect it will run out of steam very quickly. I truly hope python-dev, as a community, takes some ownership of this issue

That certainly won't happen. python-dev, as a community, has never ever taken ownership of anything. It's always individuals who take ownership.

I believe ownership of a task and ownership of a cause are somewhat different. In other words, I'm happy to take ownership of a number of tasks relating to this cause, but if the general feeling is that it is my cause rather than *our* cause, then I will probably opt-out - I'm taking these tasks on at this moment purely because I believe it *is* a common cause.

...

So you essentially say that you want somebody else (but not you) to take ownership - which, of course, is certainly fine. Hence my call for volunteers.

Hence my volunteering and the time I am currently spending.

...

...
There seem to be a number of people who agree the status-quo isn't acceptable, so I'm not sure what would happen in that case...

My prediction is that it will depend on whether workable code is available by the time a decision is made to migrate. If code is available, then migration will happen (no matter whether the code has an owner); if no code is available, migration will stall.

Right - I guess we are all still struggling with exactly what "workable code" means in this context. Cheers, Mark

Nick Coghlan

8:50 a.m.

Mark Hammond wrote:

...

On 23/08/2009 5:25 PM, "Martin v. Löwis" wrote:

...
...
If this becomes seen as 'my' cause, I suspect it will run out of steam very quickly. I truly hope python-dev, as a community, takes some ownership of this issue

That certainly won't happen. python-dev, as a community, has never ever taken ownership of anything. It's always individuals who take ownership.

I believe ownership of a task and ownership of a cause are somewhat different.

In other words, I'm happy to take ownership of a number of tasks relating to this cause, but if the general feeling is that it is my cause rather than *our* cause, then I will probably opt-out - I'm taking these tasks on at this moment purely because I believe it *is* a common cause.

If by ownership of the cause you just mean "acceptable handling of line conversions" as being one of the criteria that must be dealt with before the switch to hg actually happens, then I think you have that agreement already. We're not going to accept a regression in line handling from what SVN provides. Your proposed improvements to win32text (possibly in the form of a new extension based on win32text rather than a new version of win32text itself) along with server side enforcement sound like they will meet the need. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Martin Geisler

22 Aug 22 Aug

3:27 p.m.

"Stephen J. Turnbull" writes:

...

Mark Hammond writes:

[extensions]
required_for_commit = win32text,some_other_ext

That might require a change to hg's ini file semantics if currently it refuses to parse [extension] sections in versioned hgrcs.

It doesn't refuse anything like that. When Mercurial starts, it reads these configuration files:

http://www.selenic.com/mercurial/hgrc.5.html#files

Notice that they are all outside the clone's working directory; the closest one is the <repo>/.hg/hgrc file. As I wrote somewhere else in this thread, you can add

%include ../.repo-settings

in your <repo>/.hg/hgrc file, and this will result in <repo>/.repo-settings being loaded (and this file *is* in the working copy and can thus be put under revision control).

--
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.
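To make the path arithmetic in the %include suggestion concrete, here is a small sketch. It assumes, as the message above describes, that an include target is resolved relative to the directory of the file containing the %include line; the function name and the example clone path are made up for illustration.

```python
import posixpath


def resolve_include(hgrc_path, include_target):
    """Resolve an %include target relative to the including file's directory."""
    base = posixpath.dirname(hgrc_path)
    return posixpath.normpath(posixpath.join(base, include_target))


# <repo> is /work/cpython in this made-up example.
print(resolve_include("/work/cpython/.hg/hgrc", "../.repo-settings"))
# prints /work/cpython/.repo-settings
```

The resolved file sits in the working copy rather than under .hg, which is why it can be versioned. Each clone still needs the one-line %include added to its own .hg/hgrc by hand.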

"Martin v. Löwis"

2:39 p.m.

...

From my POV, this would be required in some form or another before such a scheme could actually work. Without it we end up with an improved win32text (good!)

I still think this would be actually bad. Instead, a new extension should be written, with a name that does not have "win32" as a substring, and that has no provision for guessing line breaks by inspecting files. Regards, Martin

Mark Hammond

23 Aug 23 Aug

5:07 a.m.

On 22/08/2009 7:09 PM, "Martin v. Löwis" wrote:

...

...
From my POV, this would be required in some form or another before such a scheme could actually work. Without it we end up with an improved win32text (good!)

I still think this would be actually bad.

Instead, a new extension should be written, with a name that does not have "win32" as a substring, and that has no provision for guessing line breaks by inspecting files.

To be clear, you are suggesting:

* Having hg enforce an extension as required is good.

* Python adopting win32text as that extension would be bad - instead another extension with different semantics (ie, no guessing based on file content) should be used, and enforced, instead.

Or have I misunderstood? Assuming I am correct, I am inclined to agree - win32text may be "good enough" in the short term, but it is far from ideal.

Cheers, Mark

"Martin v. Löwis"

12:46 p.m.

...

...
...
From my POV, this would be required in some form or another before such a scheme could actually work. Without it we end up with an improved win32text (good!)

I still think this would be actually bad.

Instead, a new extension should be written, with a name that does not have "win32" as a substring, and that has no provision for guessing line breaks by inspecting files.

To be clear, you are suggesting:

* Having hg enforce an extension as required is good.

I have no opinion on that.

...

* Python adopting win32text as that extension would be bad - instead another extension with different semantics (ie, no guessing based on file content) should be used, and enforced, instead.

Yes. The functionality being discussed should not be added to win32text.

...

Assuming I am correct, I am inclined to agree - win32text may be "good enough" in the short term, but it is far from ideal.

I also feel that an extension that is inherently platform independent and has a clear specification has much higher chances of becoming a standard feature of Mercurial one day. Regards, Martin

Mark Hammond

22 Aug 22 Aug

6:28 a.m.

On 22/08/2009 12:10 AM, Dj Gilcrease wrote:

...

On Fri, Aug 21, 2009 at 1:16 AM, Mark Hammond wrote:

...
Maybe you can enumerate what you think needs to change in mercurial, then once we have a plan in place it will be clearer who can do what.

The encode/decode hooks need to be passed the filename they are working on so you can have an ignore list. This is why I consider my method a hack: I am using a precommit hook to do the conversion, since there I am able to find out which file I am working on and make sure it is not in an ignore list. There also needs to be a way to have required and version controlled extensions.

I think this is the exact issue my 'none' patch addresses. Your filters can say:

[encode]
*.dsp=none:
**=cleverencode:

The end result should be that anything with 'none:' forms what you call an ignore list. Would that not meet your requirements?

Cheers, Mark
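The intended effect of such a filter list can be sketched as follows. This is only an illustration under stated assumptions: the first matching pattern wins, 'none:' means "leave the data untouched" as proposed in the patch, and 'cleverencode:' converts CRLF to LF unless the data looks binary (contains a NUL byte). The helper names are hypothetical, and Python's fnmatch is used as a stand-in whose wildcard rules differ from Mercurial's own pattern matching.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, filter) pairs mirroring the [encode] section above.
FILTERS = [
    ("*.dsp", "none:"),
    ("**", "cleverencode:"),
]


def pick_filter(filename, filters=FILTERS):
    """Return the filter name of the first pattern that matches, else None."""
    for pattern, name in filters:
        if fnmatchcase(filename, pattern):
            return name
    return None


def encode(filename, data):
    """Apply the selected filter to the file content (bytes)."""
    name = pick_filter(filename)
    if name == "cleverencode:" and b"\0" not in data:
        # Text file: store LF-only line endings.
        return data.replace(b"\r\n", b"\n")
    # 'none:', binary data, or no match: pass the data through unchanged.
    return data


print(encode("python.dsp", b"line\r\n"))
print(encode("Lib/os.py", b"line\r\n"))
```

Because the more specific pattern comes first, the .dsp file keeps its CRLF endings while everything else is normalised, which is the "ignore list" behaviour asked for earlier in the thread.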

Dj Gilcrease

23 Aug 23 Aug

4:05 a.m.

On Fri, Aug 21, 2009 at 6:58 PM, Mark Hammond wrote:

...

[encode]
*.dsp=none:
**=cleverencode:

The end result should be that anything with 'none:' forms what you call an ignore list.

Would that not meet your requirements?

It would, so I guess I'll hold off on digging into the hook code.


77 comments

15 participants

participants (15)

"Martin v. Löwis"
Antoine Pitrou
Brett Cannon
Dirkjan Ochtman
Dj Gilcrease
Georg Brandl
Mark Hammond
Martin (gzlist)
Martin Geisler
Neil Hodgson
Nick Coghlan
Paul Moore
Stephen J. Turnbull
Terry Reedy
