[Python-Dev] PEP 385: the eol-type issue

MRAB python at mrabarnett.plus.com
Wed Aug 5 15:35:02 CEST 2009

Nick Coghlan wrote:
> Mark Hammond wrote:
>> On 5/08/2009 7:09 PM, Dirkjan Ochtman wrote:
>>> I'm not sure how win32text will provide anything other than
>>> performance degradation for non-Windows developers, but if there's
>>> functionality to be had, I'm happy to mandate its use on every
>>> platform.
>> I see two practical outcomes of such a mandate:
>> * line-ending rules are enforced for local checkins, even for Linux
>> users, even though such 'accidental' inappropriate line-ending checkins
>> should be much rarer than for Windows.
>> * practical problems faced by Windows users, including any performance
>> considerations, are shared by the community and therefore addressed as a
>> community, thereby ensuring all platforms are considered as important as
>> any other.
> The main error that enabling win32text everywhere can catch is the use
> of a *nix client to accidentally corrupt one of the files that is
> supposed to have \r\n line endings.
> It also simplifies the configuration rules in the Python hg FAQ - we
> would be able to just tell all developers wanting to contribute patches
> to Python to enable the win32text extension when working with the Python
> repositories (or clones thereof) without having to worry about what
> platform they were on.
> So it seems to me that the main client-side feature we want is a
> versioned .hgeols file in the repository that allows files to be
> explicitly nominated as one of:
> - eol=CRLF (i.e. have \r\n line endings in the repository and should be
> left that way on the local disk as well - equivalent to SVN eol-style:CRLF)
> - eol=LF (i.e. have \n line endings in the repository and should be left
> that way on the local disk as well - equivalent to SVN eol-style:LF)
> - eol=CR (i.e. have \r line endings in the repository and should be left
> that way on the local disk as well - equivalent to SVN eol-style:CR)
> - native text (i.e. always stored in the repository with \n line
> endings, but uses native line endings on the local disk - equivalent to
> SVN eol-style:native)
> - binary (i.e. always reproduced on disk exactly as they are in the
> repository - equivalent to SVN files without eol-style set at all)
> The .hgeols file should also allow the repository to define which of the
> above should be used as the default handling mechanism for text files
> that are not named in the file (native text, in the specific case of the
> Python repositories).
> Files which look like binary files (according to the existing win32text
> heuristics) would be left alone regardless of what the default handling
> was set to in .hgeols.
> win32text would then be enhanced to check for a .hgeols file before
> falling back to its existing configuration mechanisms.
> The above basically provides the SVN eol-style feature in a more
> hg-friendly way. Allowing wildcards in the .hgeols files might be nice,
> but I don't think it is actually required. We really don't have that
> many files that are affected by this problem (it's just the fact that it
> is a number greater than zero that is causing the problem).
> The server side pre-push hooks for the main Python repositories would be
> set to reject change sets which didn't meet the above rules. If a patch
> fails those checks, either the committer can fix it themselves and
> resubmit, or else send it back to the originator along with a pointer to
> the section in the dev FAQ that describes the expected client-side
> configuration.
Instead of just talking about line endings, could each file have a
specific 'filetype'? This would define what kind of data it contains,
how it's stored in the repository, and what actions to perform for
fetching and committing, including any checks:

     c_header: C header file; LF in repository; native outside

     c_source: C source file; LF in repository; native outside

     text: plain text; LF in repository; native outside

     crlf_text: plain text; CRLF in repository; CRLF outside

     cr_text: plain text; CR in repository; CR outside

     lf_text: plain text; LF in repository; LF outside

     binary: arbitrary binary data; as-is in repository

This could be expanded in the future to include filetypes for JPEG, etc.
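
As a rough illustration of the idea, the table above could be expressed as
a mapping from filetype to a (repository EOL, working-copy EOL) pair. This
is only a sketch: the names FILETYPES, set_eol, to_repository,
to_working_copy and check_commit are made up for this example and are not
part of Mercurial or win32text, and it assumes the file content is handled
as bytes.

```python
import os

LF, CRLF, CR = b"\n", b"\r\n", b"\r"
NATIVE = object()  # sentinel: use the platform's own line ending

# Hypothetical table mirroring the filetype list above:
# name -> (EOL in repository, EOL outside); None means "leave as-is".
FILETYPES = {
    "c_header": (LF, NATIVE),
    "c_source": (LF, NATIVE),
    "text": (LF, NATIVE),
    "crlf_text": (CRLF, CRLF),
    "cr_text": (CR, CR),
    "lf_text": (LF, LF),
    "binary": (None, None),
}


def set_eol(data, eol):
    """Rewrite every line ending in data to eol; None leaves data untouched."""
    if eol is None:
        return data
    # Collapse CRLF and lone CR to LF first, then expand to the target.
    unified = data.replace(CRLF, LF).replace(CR, LF)
    return unified.replace(LF, eol)


def to_repository(filetype, data):
    """Convert working-copy bytes to the form stored in the repository."""
    repo_eol, _ = FILETYPES[filetype]
    return set_eol(data, repo_eol)


def to_working_copy(filetype, data, native_eol=os.linesep.encode("ascii")):
    """Convert repository bytes to the form written to the local disk."""
    _, outside_eol = FILETYPES[filetype]
    if outside_eol is NATIVE:
        outside_eol = native_eol
    return set_eol(data, outside_eol)


def check_commit(filetype, data):
    """Return True if data already has the line endings the repository expects."""
    return to_repository(filetype, data) == data
```

A hook could then call check_commit() on each changed file and reject the
change set if it returns False. Adding a new filetype later (JPEG, say)
would just mean adding another row with (None, None) plus whatever extra
checks are wanted.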
