Nick Coghlan wrote:
Mark Hammond wrote:
On 5/08/2009 7:09 PM, Dirkjan Ochtman wrote:
I'm not sure how win32text will provide anything other than performance degradation for non-Windows developers, but if there's functionality to be had, I'm happy to mandate its use on every platform.
I see two practical outcomes of such a mandate:
- line-ending rules are enforced for local checkins, even for Linux
users, even though such 'accidental' inappropriate line-ending checkins should be much rarer there than on Windows.
- practical problems faced by Windows users, including any performance
considerations, are shared by the community and therefore addressed as a community, thereby ensuring all platforms are considered as important as any other.
The main error that enabling win32text everywhere can catch is the use of a *nix client to accidentally corrupt one of the files that is supposed to have \r\n line endings.
It also simplifies the configuration rules in the Python hg FAQ - we would be able to just tell all developers wanting to contribute patches to Python to enable the win32text extension when working with the Python repositories (or clones thereof) without having to worry about what platform they were on.
So it seems to me that the main client-side feature we want is a versioned .hgeols file in the repository that allows files to be explicitly nominated as one of:
- eol=CRLF (i.e. have \r\n line endings in the repository and should be
left that way on the local disk as well - equivalent to SVN eol-style:CRLF)
- eol=LF (i.e. have \n line endings in the repository and should be left
that way on the local disk as well - equivalent to SVN eol-style:LF)
- eol=CR (i.e. have \r line endings in the repository and should be left
that way on the local disk as well - equivalent to SVN eol-style:CR)
- native text (i.e. always stored in the repository with \n line
endings, but uses native line endings on the local disk - equivalent to SVN eol-style:native)
- binary (i.e. always reproduced on disk exactly as they are in the
repository - equivalent to SVN files without eol-style set at all)
The .hgeols file should also allow the repository to define which of the above should be used as the default handling mechanism for text files that are not named in the file (native text, in the specific case of the Python repositories).
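To make the idea concrete, here is a rough sketch of how such a file might be read. The syntax is purely an assumption on my part (one "path = mode" entry per line, with a special "default" entry); nothing like this exists in hg today, and the names parse_hgeols and mode_for are made up for illustration.

```python
# Hypothetical .hgeols syntax: "path = mode" per line, '#' starts a comment,
# and the reserved name "default" sets the handling for unnamed files.
MODES = {"CRLF", "LF", "CR", "native", "binary"}


def parse_hgeols(text):
    """Return (rules, default) parsed from the assumed .hgeols syntax."""
    rules = {}
    default = "binary"  # assumed fallback when the file sets no default
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last '=' so paths containing '=' still work.
        name, sep, mode = line.rpartition("=")
        name, mode = name.strip(), mode.strip()
        if not sep or not name or mode not in MODES:
            raise ValueError("bad .hgeols entry: %r" % raw)
        if name == "default":
            default = mode
        else:
            rules[name] = mode
    return rules, default


def mode_for(path, rules, default):
    """Explicitly named files win; everything else gets the default."""
    return rules.get(path, default)
```

For the Python repositories the file would then contain little more than "default = native" plus a handful of entries for the files that must keep \r\n endings.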
Files which look like binary files (according to the existing win32text heuristics) would be left alone regardless of what the default handling was set to in .hgeols.
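As a sketch of what the conversion itself amounts to: only the "native" mode ever rewrites anything, and binary-looking data is passed through untouched. The NUL-byte test below is my assumption about what the heuristic looks like, and the function names are hypothetical rather than part of win32text.

```python
import os


def looks_binary(data):
    # Assumption: any NUL byte marks the data as binary.
    return b"\0" in data


def to_working(data, mode, native_eol=os.linesep.encode()):
    """Convert repository bytes to what should appear on local disk."""
    if mode != "native" or looks_binary(data):
        return data  # CRLF, LF, CR and binary are reproduced as-is
    return data.replace(b"\n", native_eol)


def to_repo(data, mode, native_eol=os.linesep.encode()):
    """Convert local-disk bytes back to the repository form (\\n only)."""
    if mode != "native" or looks_binary(data):
        return data
    return data.replace(native_eol, b"\n")
```

Passing native_eol explicitly makes it easy to try the Windows behaviour from a *nix box.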
win32text would then be enhanced to check for a .hgeols file before falling back to its existing configuration mechanisms.
The above basically provides the SVN eol-style feature in a more hg-friendly way. Allowing wildcards in the .hgeols files might be nice, but I don't think it is actually required. We really don't have that many files that are affected by this problem (it's just the fact that it is a number greater than zero that is causing the problem).
The server side pre-push hooks for the main Python repositories would be set to reject change sets which didn't meet the above rules. If a patch fails those checks, either the committer can fix it themselves and resubmit, or else send it back to the originator along with a pointer to the section in the dev FAQ that describes the expected client-side configuration.
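The check such a hook would run could look roughly like the following. This is only a sketch of the validation logic; it deliberately does not use the real hg hook API, and eol_ok and rejected_files are names I have invented. It takes a mapping of changed paths to their repository contents and returns the paths that break the rules.

```python
import re

LONE_LF = re.compile(rb"(?<!\r)\n")  # \n not preceded by \r
LONE_CR = re.compile(rb"\r(?!\n)")   # \r not followed by \n


def eol_ok(data, mode):
    """True if repository bytes respect the given eol mode."""
    if mode == "binary":
        return True
    if mode == "CRLF":
        return not LONE_LF.search(data) and not LONE_CR.search(data)
    if mode == "CR":
        return b"\n" not in data
    if mode in ("LF", "native"):
        # Both are stored in the repository with bare \n.
        return b"\r" not in data
    raise ValueError("unknown eol mode: %r" % mode)


def rejected_files(changed, rules, default="native"):
    """Return the sorted paths in `changed` that violate their eol rule."""
    bad = []
    for path, data in changed.items():
        mode = rules.get(path)
        if mode is None:
            # Unnamed files: binary-looking data (assumed: contains NUL)
            # is left alone, everything else gets the default.
            mode = "binary" if b"\0" in data else default
        if not eol_ok(data, mode):
            bad.append(path)
    return sorted(bad)
```

A non-empty result would cause the push to be refused, with the list of offending files included in the message sent back to the committer.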
Instead of just talking about line endings, could each file have a specific 'filetype'? This would define what kind of data it contains, how it's stored in the repository, and what actions to perform for fetching and committing, including any checks:
c_header: C header file; LF in repository; native outside
c_source: C source file; LF in repository; native outside
text: plain text; LF in repository; native outside
crlf_text: plain text; CRLF in repository; CRLF outside
cr_text: plain text; CR in repository; CR outside
lf_text: plain text; LF in repository; LF outside
binary: arbitrary binary data; as-is in repository
This could be expanded in the future to include filetypes for JPEG, etc.
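As a rough illustration, the table above could be carried around as plain data, with the checkout step driven by it. The FileType tuple, the checkout function, and the "native" parameter are all hypothetical names for the sake of the sketch.

```python
from collections import namedtuple

# repo_eol / working_eol are None for data that must never be converted.
FileType = namedtuple("FileType", "description repo_eol working_eol")

FILETYPES = {
    "c_header": FileType("C header file", "LF", "native"),
    "c_source": FileType("C source file", "LF", "native"),
    "text": FileType("plain text", "LF", "native"),
    "crlf_text": FileType("plain text", "CRLF", "CRLF"),
    "cr_text": FileType("plain text", "CR", "CR"),
    "lf_text": FileType("plain text", "LF", "LF"),
    "binary": FileType("arbitrary binary data", None, None),
}

EOL_BYTES = {"LF": b"\n", "CRLF": b"\r\n", "CR": b"\r"}


def checkout(data, filetype, native="LF"):
    """Convert repository bytes to working-copy bytes for a filetype."""
    ft = FILETYPES[filetype]
    if ft.repo_eol is None:
        return data
    target = native if ft.working_eol == "native" else ft.working_eol
    if target == ft.repo_eol:
        return data
    return data.replace(EOL_BYTES[ft.repo_eol], EOL_BYTES[target])
```

Adding a new filetype later would then just mean adding a row, plus whatever extra checks that type needs.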