
We have been discussing whitespace and line endings at the following pull request: http://github.com/numpy/numpy/pull/4 . Chuck suggested we discuss it here on the list.
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg
branding/icons/numpylogoicon.svg
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
David C. suggested that the nsi.in file should not be changed. I suggested adding a .gitattributes file along with the existing .gitignore file in the numpy repo. This would enforce windows line endings for the nsi.in file:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf
alternatively this would disable any attempt to convert line endings:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in -text
I think the former is preferable. But it seems like a good idea to include some git config files in the repo to ensure trailing whitespace is stripped and line endings are appropriate to the numpy project, regardless of what people may have in their ~/.gitconfig file.
Comments?
Darren

2010/10/19 Darren Dale <dsdale24@gmail.com>:
We have been discussing whitespace and line endings at the following pull request: http://github.com/numpy/numpy/pull/4 . Chuck suggested we discuss it here on the list.
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg branding/icons/numpylogoicon.svg tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
David C. suggested that the nsi.in file should not be changed. I suggested adding a .gitattributes file along with the existing .gitignore file in the numpy repo. This would enforce windows line endings for the nsi.in file:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf
alternatively this would disable any attempt to convert line endings:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in -text
This may be useful here:
http://www.bluishcoder.co.nz/2007/09/git-binary-files-and-cherry-picking.htm...
Treating the svgs and the other files as binary will increase the size of diffs, but maybe it's an option?
my 2 cents, fwiw,
Friedrich

On Tue, Oct 19, 2010 at 11:17 AM, Friedrich Romstedt < friedrichromstedt@gmail.com> wrote:
2010/10/19 Darren Dale <dsdale24@gmail.com>:
We have been discussing whitespace and line endings at the following pull request: http://github.com/numpy/numpy/pull/4 . Chuck suggested we discuss it here on the list.
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg branding/icons/numpylogoicon.svg tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
David C. suggested that the nsi.in file should not be changed. I suggested adding a .gitattributes file along with the existing .gitignore file in the numpy repo. This would enforce windows line endings for the nsi.in file:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf
alternatively this would disable any attempt to convert line endings:
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in -text
This may be useful here:
http://www.bluishcoder.co.nz/2007/09/git-binary-files-and-cherry-picking.htm...
Treating svgs and the other files as binary will increase diff's size, but maybe it's an option?
my 2 cents, fwiw,
Well, this post hasn't generated much comment. I think we do need a .gitconfig file for the project so why don't you go ahead and make one and deal with the nsi.in file in the process. The .svg files can have their line endings converted to line feeds which will give us two less files to worry about. Chuck

2010/10/20 Charles R Harris <charlesr.harris@gmail.com>:
[...] I think we do need a .gitconfig file [...]
.gitattributes
so why don't you go ahead and make one and deal with the nsi.in file in the process.
http://github.com/friedrichromstedt/numpy/tree/friedrich-gitattributes-nsis
The .svg files can have their line endings converted to line feeds
Is there anything I've missed, or am I right that there is currently *no* policy at all for how other files are checked in and checked out? We could set up .gitattributes to also force .py files to LF, via ``eol=lf`` for ``*.py``, etc.
Currently I think it works because git does not convert to the system's convention by default. But I have a dim memory of git checking in with LF by default too; I need some pointers here, I guess. IIRC git checks in with LF but does not change anything on check-out?
Due to Darren's config file the .nsi.in file made it with CRLF into the repo. I guess .gitattributes has precedence over ~/.gitconfig.
Pros of *.py in .gitattributes:
* We are independent of the user's global config because we override it. All *.py files will be checked in with LF.
Cons:
* They'll also be checked out with LF, which may be annoying, esp. for Windows users.
Is there any way to force LF on check-in *only*, via .gitattributes?
[...] which will give us two less files to worry about.
I'm not quite sure about that. By analogy with .ps: I once messed up my ps files by converting their line endings, and no tool was able to read them any longer. It seems the line endings are part of the ps specs; maybe the same holds for svg? Not all software has universal eol support. (This was on Windows.)
Darren, the specs you used work due to git's backward compatibility, see http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html. According to that doc, ``eol=...`` is preferable.
Maybe I'll add ``*.dmg -text`` to prevent normalisation of the dmgs I'd like to connect with the numpy commits in my repo. I'll discuss this in another thread later. Since it's not approved so far, I am not adding it right now.
So far,
Friedrich

2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong? Wait, I'll check, but ... Hmm, how do I check the line endings? `file` reported them for the .nsi.in file, but not for the .py ones. I was starting from the assumption that the native Mac OS X form is LF, or even CR, but not CRLF, which is Windows?
Sorry if I wasn't diligent enough or too sure.
Friedrich

On Wed, Oct 20, 2010 at 9:56 AM, Friedrich Romstedt < friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong? Wait, I'll check, but ... Hmm, how do I check the line endings? `file` reported them for the .nsi.in file, but not for the .py ones. I was starting from the assumption that the native Mac OS X form is LF, or even CR, but not CRLF, which is Windows?
Sorry if I wasn't diligent enough or too sure.
You can grep for CR: grep -P '\r' foo . On windows and mac you might need to add -U. For vim ":set list" will show the line endings. Chuck
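For readers without a grep that supports -P, the same check can be sketched in Python. This is only an illustration, not something posted in the thread: the function names `count_line_endings` and `count_file` are hypothetical, and the file is read in binary mode so that no newline translation hides the \r bytes.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF and bare CR bytes in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one LF and one CR, so subtract the pairs
    # to get the terminators that stand alone.
    lone_lf = data.count(b"\n") - crlf
    lone_cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lone_lf, "cr": lone_cr}


def count_file(path: str) -> dict:
    """Read a file in binary mode and count its line endings."""
    with open(path, "rb") as handle:
        return count_line_endings(handle.read())
```

A file that reports only "crlf" counts uses Windows line endings throughout; a mix of "crlf" and "lf" points at inconsistent endings.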

.nsi.in has '$\n\r'. Is this now LFCR or CRLF? doall.py and py3tool.py have '$\n' precisely. I'm on Mac, and \n is at least the vim default on Mac. So yes, confirming that it's already in the repo. What about the svg eols?
Do we need my branch now, or do we want to leave it alone? If we do not apply my commit, the file might get converted when others change it, assuming they have some personal preference about eols. The standard user will not notice the difference. Someone other than me has to think it through for approval. (I thought it through and say it's okay.)
"When deciding what attributes are assigned to a path, git consults $GIT_DIR/info/attributes file (which has the highest precedence), .gitattributes file in the same directory as the path in question, and its parent directories up to the toplevel of the work tree (the further the directory that contains .gitattributes is from the path in question, the lower its precedence). Finally global and system-wide files are considered (they have the lowest precedence)."
I tried to put it in the same dir but it didn't work for me. Maybe my git is ageing.
I used $grep -PU '$\r' and similar; '$\r^' does not work, just for completeness.
The section "End-of-line conversion" in http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html is quite clear. Considering this section, I conclude that we should leave it to the users how they check files in and out, but for files that need specific attributes we define them in .gitattributes. (Alternative to .gitattributes: $GIT_DIR/info/attributes; I don't know if this goes into the repo too.) People are then advised not to attribute files to themselves by just an eol conversion.
Opinions, comments? I know, this topic is unattractive.
http://www.astro.gla.ac.uk/users/labrosse/PORTFOLIO/images/cosmicsexinesslad...
(from http://www.astro.gla.ac.uk/users/labrosse/PORTFOLIO/philo.html) This is somewhere at the other end of the world :-)
Friedrich
2010/10/20 Charles R Harris <charlesr.harris@gmail.com>:
On Wed, Oct 20, 2010 at 9:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong? Wait, I'll check, but ... Hmm, how do I check the line endings? `file` reported them for the .nsi.in file, but not for the .py ones. I was starting from the assumption that the native Mac OS X form is LF, or even CR, but not CRLF, which is Windows?
Sorry if I wasn't diligent enough or too sure.
You can grep for CR: grep -P '\r' foo . On windows and mac you might need to add -U. For vim ":set list" will show the line endings.
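The precedence rule quoted from the gitattributes documentation in this message can be illustrated with a small sketch. It only lists the per-repository attribute files in the order the quoted paragraph describes; the global and system-wide files (lowest precedence) are left out, and the helper name `attribute_files` is hypothetical, not part of git.

```python
from pathlib import PurePosixPath


def attribute_files(path: str, git_dir: str = ".git") -> list:
    """List attribute files that apply to `path`, highest precedence first.

    `path` is relative to the top level of the work tree.
    """
    files = [git_dir + "/info/attributes"]
    directory = PurePosixPath(path).parent
    while True:
        # The closer the directory is to the file, the higher its precedence.
        files.append(str(directory / ".gitattributes"))
        if directory == directory.parent:  # reached the top level (".")
            break
        directory = directory.parent
    return files


for name in attribute_files(
    "tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in"
):
    print(name)
```

Under this reading, a .gitattributes file at the top level of the numpy repo is consulted last among the in-repo files, so a more specific file next to the .nsi.in script would win if both set the same attribute.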

On Wed, Oct 20, 2010 at 11:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong?
Due to my config file... nothing. I simply noticed the already-existing CRLF line endings in the repository.

On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong?
Yes, the file has always used CRLF, and needs to stay that way. David

2010/10/21 David Cournapeau <cournape@gmail.com>:
On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong?
Yes, the file has always used CRLF, and needs to stay that way.
I see, a misunderstanding: I used "made it" in the sense of "succeeded in" :-) So to be clear, I meant that I understood your config file.
Btw, it has \n\r, so it's LFCR and not CRLF as it should be on Windows (ref: de.wikipedia). I checked my understanding of CR/LF and also ran $grep -PU '$\n\r' again.
See also http://de.wikipedia.org/wiki/Zeilenumbruch (German; the en version doesn't have the table). So either:
1) You encoded the file, for whatever reason, with CR and LF swapped
2) It doesn't matter what the order is
3) There is some misunderstanding, once again.
I don't want to start a flame war about such a small thing; maybe it's just my problem that I don't have a clear picture currently.
Please excuse my insistence,
Friedrich

On Thu, Oct 21, 2010 at 8:47 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/21 David Cournapeau <cournape@gmail.com>:
On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong?
Yes, the file has always used CRLF, and needs to stay that way.
I see, misunderstanding, for me I used "made it" in the sense "succeeded in" :-) So to be clear, I meant that I understood your config file.
Btw, it has \n\r, so it's LFCR and not CRLF as it should be on Windows (ref: de.wikipedia). I checked both my understanding of CR/LF as well as used $grep -PU '$\n\r' again.
See also http://de.wikipedia.org/wiki/Zeilenumbruch (german, the en version doesn't have the table). So either: 1) You encoded for whatever reason the file with CR and LF swapped
Nobody encoded the file in a special manner. It just happens to be a file used on windows, by a windows program, and as such should stay in CR/LF format. I am not sure why you say LF and CR are swapped, I don't see it myself, and vim tells me it is in DOS (i.e. CR/LF) format.
2) It doesn't matter what the order is
It does matter. Although text editors are generally smart about line endings, other windows software is not.
cheers,
David

On Thu, Oct 21, 2010 at 9:26 AM, David Cournapeau <cournape@gmail.com> wrote:
On Thu, Oct 21, 2010 at 8:47 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/21 David Cournapeau <cournape@gmail.com>:
On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/20 Darren Dale <dsdale24@gmail.com>:
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Due to Darren's config file the .nsi.in file made it with CRLF into the repo.
Uh, no.
You mean I'm wrong?
Yes, the file has always used CRLF, and needs to stay that way.
I see, misunderstanding, for me I used "made it" in the sense "succeeded in" :-) So to be clear, I meant that I understood your config file.
Btw, it has \n\r, so it's LFCR and not CRLF as it should be on Windows (ref: de.wikipedia). I checked both my understanding of CR/LF as well as used $grep -PU '$\n\r' again.
See also http://de.wikipedia.org/wiki/Zeilenumbruch (german, the en version doesn't have the table). So either: 1) You encoded for whatever reason the file with CR and LF swapped
Nobody encoded the file in a special manner. It just happens to be a file used on windows, by a windows program, and as such should stay in CR/LF format. I am not sure why you say LF and CR are swapped, I don't see it myself, and vim tells me it is in DOS (i.e. CR/LF) format.
2) It doesn't matter what the order is
It does matter. Although text editors are generally smart about line endings, other windows software is not.
I filed a new pull request, http://github.com/numpy/numpy/pull/7 . This should enforce LF on all text files, with the current exception of the nsi.in file, which is CRLF. The svgs have been converted to LF. Additional, confusing reading can be found at http://help.github.com/dealing-with-lineendings/ , http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html . Darren

2010/10/21 Darren Dale <dsdale24@gmail.com>:
I filed a new pull request, http://github.com/numpy/numpy/pull/7 . This should enforce LF on all text files, with the current exception of the nsi.in file, which is CRLF. The svgs have been converted to LF. Additional, confusing reading can be found at http://help.github.com/dealing-with-lineendings/ , http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html .
Hm, I like your pull request more than my own branch, but I think your conclusions might be incorrect.
``* text=auto`` forces git to normalise *all* text files, including the .nsi.in file, to LF *in the repo only*. But it says nothing about how to set eol in the working dir.
``[...].nsi.in eol=crlf`` forces git to check-out the .nsi.in file with CRLF.
At least this is what the gitattributes.html we both seem to use says. So it's perfect: it keeps the repo clean, but allows users to check out how they want.
Friedrich
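The two-step behaviour described in this message can be modelled with a rough sketch. This is a simplified model under stated assumptions, not git's actual implementation: it ignores binary detection and safety checks, and the function names `to_repo` and `to_worktree` are hypothetical.

```python
def to_repo(data: bytes) -> bytes:
    """Model of check-in for a text file: CRLF is normalised to LF."""
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes, eol: str = "lf") -> bytes:
    """Model of check-out: LF is converted to CRLF only if eol is 'crlf'."""
    if eol == "crlf":
        # Normalise first so existing CRLF pairs are not doubled up.
        return to_repo(data).replace(b"\n", b"\r\n")
    return data


working_copy = b"!include \"MUI2.nsh\"\r\n"
stored = to_repo(working_copy)            # LF only inside the repo
checked_out = to_worktree(stored, "crlf")  # CRLF again in the working dir
print(stored, checked_out)
```

In this model the repository copy of the .nsi.in file holds LF, while a check-out with ``eol=crlf`` reproduces the CRLF bytes the Windows tool expects.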

On Thu, Oct 21, 2010 at 4:48 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/21 Darren Dale <dsdale24@gmail.com>:
I filed a new pull request, http://github.com/numpy/numpy/pull/7 . This should enforce LF on all text files, with the current exception of the nsi.in file, which is CRLF. The svgs have been converted to LF. Additional, confusing reading can be found at http://help.github.com/dealing-with-lineendings/ , http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html .
Hm, I like your pull request more than my own branch, but I think your conclusions might be incorrect.
``* text=auto`` forces git to normalise *all* text files, including the .nsi.in file, to LF *in the repo only*. But it says nothing about how to set eol in the working dir.
``[...].nsi.in eol=crlf`` forces git to check-out the .nsi.in file with CRLF.
I see. Thank you for clarifying. It probably is not necessary then to have the exception for the nsi.in file, since git will create files with CRLF eols in the working directory on windows by default. The eols in the working directory can be controlled by the core.eol setting, which defaults to "native". But unless David C gives his blessing, I will leave the pull request as is. Pretty confusing. Darren

Hi Darren, 2010/10/19 Darren Dale <dsdale24@gmail.com>:
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg branding/icons/numpylogoicon.svg tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
Here is an excerpt from git-config:
core.autocrlf
Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.
From git-apply:
``fix`` outputs warnings for a few such errors, and applies the patch after fixing them (strip is a synonym --- the tool used to consider only trailing whitespace characters as errors, and the fix involved stripping them, but modern gits do more).
So I think your "autocrlf=input" makes the .nsi.in file get checked out as LF since it's LF in the repo, and "no output conversion is performed" due to core.autocrlf=input in your .gitconfig.
So the svg changes must come from the 'fix' value for the whitespace action.
I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree?
This whitespace & newline thing is really painful. I suggest you set in your .gitconfig:
[core] autocrlf = true
and in our numpy .gitattributes:
* text=auto
where text=auto is stronger than, and a superset of, autocrlf=true.
I came across this when testing whether text=auto marks any files as changed; it didn't, so everything IS already LF in the repo.
Can you check this please?
I was close to leaving a comment like "asap" on github, but since this is so horribly complicated and error-prone ...
Friedrich

On Wed, Oct 27, 2010 at 8:36 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Hi Darren,
2010/10/19 Darren Dale <dsdale24@gmail.com>:
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg branding/icons/numpylogoicon.svg tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
Here an excerpt from git-config:
core.autocrlf
Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.
From git-apply:
``fix`` outputs warnings for a few such errors, and applies the patch after fixing them (strip is a synonym --- the tool used to consider only trailing whitespace characters as errors, and the fix involved stripping them, but modern gits do more).
So I think your "autocrlf=input" makes the .nsi.in file checked out as LF since it's in LF in the repo, and "no output conversion is performed" due to core.autocrlf=input in your .gitconfig.
So the svg changes must come from the 'fix' value for the whitespace action.
I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree?
"What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors." No mention of EOL conversions there. But yes, I guess we disagree. I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced.
This whitespace & newline thing is really painful, I suggest you set in your .gitconfig:
[core] autocrlf = true
I don't think so: "Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings." I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ?
and in our numpy .gitattributes:
* text=auto
That is already included in the pull request.
while the text=auto is more strong and a superset of autocrlf=true.
I came across this when trying if text=auto marks any files as changed, and it didn't so everything IS already LF in the repo.
Can you check this please?
Check what?
I was near to leaving a comment like "asap" on github, but since this is so horribly complicated and error-prone ...
I'm starting to consider canceling the pull request. Darren

Hi Darren, 2010/10/27 Darren Dale <dsdale24@gmail.com>:
So the svg changes must come from the 'fix' value for the whitespace action.
I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree?
"What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors."
No mention of EOL conversions there. But yes, I guess we disagree. I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced.
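The two default whitespace errors named in the quoted documentation can be sketched as a small checker. This is an illustration only, not git's code: the function name `whitespace_errors` is hypothetical, the labels are borrowed from the core.whitespace option names, and the input is assumed to be a single line with its terminator already removed, so line-ending handling is deliberately out of scope.

```python
import re


def whitespace_errors(line: str) -> list:
    """Return the default whitespace errors found in one line (no newline)."""
    errors = []
    # Trailing blanks, including lines that consist solely of blanks.
    if re.search(r"[ \t]+$", line):
        errors.append("trailing-space")
    # A space immediately followed by a tab inside the initial indent.
    indent = re.match(r"[ \t]*", line).group(0)
    if " \t" in indent:
        errors.append("space-before-tab")
    return errors


for sample in ["x = 1  ", " \tx = 1", "    ", "\tx = 1"]:
    print(repr(sample), whitespace_errors(sample))
```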
I agree. But I just guess that the changes of the svgs in your pull request might be not due to eols but due to whitespace fixes. I think so because in my numpy (current master branch) I cannot see any CRLF there in the repo. Checked with ``* text=auto``, which also affects non-normalised files in the repo. But it might be that the conversion is done silently, although I don't know how to do it like that. So no "changed" showing up implies "no non-LF eol".
This whitespace & newline thing is really painful, I suggest you set in your .gitconfig:
[core] autocrlf = true
I don't think so: "Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings." I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ?
Aha, this is a misunderstanding. Somehow I thought you're working on Windows. Is there then a specific reason not to use CRLF? I mean, you can check it in with LF anyway. The page you mentioned is very brief and like a recipe, not my taste. I want to know what's going on in detail.
and in our numpy .gitattributes:
* text=auto
That is already included in the pull request.
Yes, I know. I meant to leave the line with the eol=crlf alone. All based on the assumption that you're working with crlf anyway, so I might be wrong.
while the text=auto is more strong and a superset of autocrlf=true.
I came across this when trying if text=auto marks any files as changed, and it didn't so everything IS already LF in the repo.
Can you check this please?
Check what?
My conclusions above. We both know that in this subject all conclusions are pretty error-prone ...
I was near to leaving a comment like "asap" on github, but since this is so horribly complicated and error-prone ...
I'm starting to consider canceling the pull request.
At least we should check if it's really what we intend. I understand better now why you wanted to force the .nsi.in file to be crlf at all. From your previous posts, i.e. that it would be the default for Win users anyway, I see now that I should have asked.
To my understanding the strategy should consist of two things:
1) Force LF in the repo. This is independent of the .nsi.in thing, but currently missing in the official branches. We can do that at the same time.
2) Force the .nsi.in file to be crlf in the check-out (and only there) at all times. There is one higher level in $GITDIR, but I think we can ignore that.
To (1): The default Win user would currently check in *newly created* files in CRLF; at least this is what I did with a not-so-recent git some time ago (other repos) .... When I switched to Mac, all my files were marked "changed". afaik git does not do normalisation if you do not tell it to do so. "While git normally leaves file contents alone, it can be configured to normalize line endings to LF in the repository and, optionally, to convert them to CRLF when files are checked out." (http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html) I still do not understand why my files showed up as changed. They're still crlf, I just copied them, and vim tells [dos].
Please also confirm, or show that I'm wrong about, my observation of LFCR in your .nsi.in file instead of CRLF (it's swapped). I checked as written before: http://article.gmane.org/gmane.comp.python.numeric.general/41063. This would also explain how it can make it into the repo (i.e., succeed in :-) and still not be detected by git when ``* text=auto`` is active. Git thinks it's \n since there's no \r before, and does not consider the trailing \r.
Best wishes,
Friedrich
P.S.: Might be worth putting this OL; I believe no one besides us is interested. If you agree.

On Wed, Oct 27, 2010 at 4:31 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
P.S.: Might be worth putting this OL; I believe no one besides us is interested. If you agree.
I'm interested since we also use git (and github) for a Python project which is developed and tested on Linux, Mac OS X and Windows. We haven't done anything special with the repository settings about line endings... but so far it seems to be working with the defaults. Peter

Hi Peter, 2010/10/27 Peter <numpy-discussion@maubp.freeserve.co.uk>:
I'm interested since we also use git (and github) for a Python project which is developed and tested on Linux, Mac OS X and Windows. We haven't done anything special with the repository settings about line endings... but so far it seems to be working with the defaults.
I can only guess, but I bet that you're using vim or some other editor which can handle both. Since Python has universal newline support, it can also read both. Seems that you're lucky that it's Python ..... :-)
You might use od -c on Linux and Mac to check which files were created on Win and which on LF OSes ... supposing that I'm right at all.
Would be a good check of my assumptions.
Friedrich

On Wed, Oct 27, 2010 at 4:49 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Hi Peter,
2010/10/27 Peter <numpy-discussion@maubp.freeserve.co.uk>:
I'm interested since we also use git (and github) for a Python project which is developed and tested on Linux, Mac OS X and Windows. We haven't done anything special with the repository settings about line endings... but so far it seems to be working with the defaults.
I can only guess, but I bet that you're using vim or any other editor which can handle both. Since Python has universal newline support, it can also read both. Seems that you're lucky that it's Python ..... :-)
You might use the od -c on Linux and Mac to check which files were created on Win and which on LF OSes ... supposed that I'm right at all.
Would be a good check of my assumptions.
Friedrich
Why use od when I can use Python ;) (see below)
I've just tested a checkout of our repository on Mac OS X, and found a single Python file with Windows newlines (CRLF) but I'm pretty sure that happened before we moved to git.
Peter
--
#Quick script to check for potential new line issues
#or tab indentation instead of spaces
import os
def check(filename):
    #load in binary mode as I want to see any \r
    handle = open(filename, "rb")
    for line in handle:
        if "\t" in line or "\r" in line:
            print filename, repr(line)
    handle.close()
for dirpath, dirnames, filenames in os.walk("."):
    for f in filenames:
        if f.endswith(".py"):
            check(os.path.join(dirpath,f))
print "Done"

2010/10/27 Peter <numpy-discussion@maubp.freeserve.co.uk>:
Why use od when I can use Python ;) (see below)
I had not come across that trick with 'b'. Good catch :-)
So you mean you don't have that file in the repo?
May it be that all the files have been created on a LF machine except this single file? Because once they are created the editor will maintain the eol convention. vim does at least.
Thanks for your time!
Friedrich

On Wed, Oct 27, 2010 at 5:08 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/10/27 Peter <numpy-discussion@maubp.freeserve.co.uk>:
Why use od when I can use Python ;) (see below)
Did not come around that trick with 'b'. Good catch :-)
So you mean you don't have that file in the repo?
May it be that all the files have been created on a LF machine except this single file? Because once they are created the editor will maintain the eol convention. vim does at least.
Thanks for your time!
Friedrich
There was one Python file in our repository which used Windows/DOS line endings - I believe this would have been created by a developer on Windows back when we used CVS (the repository was later migrated to git). Peter

It **IS** CRLF, I have no idea what I did wrong, but
n0877:nsis_scripts Friedrich$ od -c numpy-superinstaller.nsi.in
0000000 ; - - - - - - - - - - - - - - -
0000020 - - - - - - - - - - - - - - - -
0000040 - \r \n ; I n c l u d e M o d e
0000060 r n U I \r \n \r \n ! i n c l u d
0000100 e " M U I 2 . n s h " \r \n \r \n
I'm not THAT interested in searching for my mistake if no one else sees the spot. od -c seems more robust anyway. I saw more than only double newlines, whatever :-(
Friedrich
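One possible contributor to the earlier LFCR impression, offered only as a guess since the thread never settles it: in a pure CRLF file, every blank line produces the byte sequence \r\n\r\n, which contains \n\r in the middle. A byte-level search for \n\r therefore succeeds even though nothing is swapped. The sketch below uses a sample modelled loosely on the od output above; the name `stray_endings` is hypothetical.

```python
# Sample modelled on the od -c output: every line ends in CRLF,
# and blank lines follow each statement.
sample = b";Include Modern UI\r\n\r\n!include \"MUI2.nsh\"\r\n\r\n"


def stray_endings(data: bytes) -> tuple:
    """Count CR and LF bytes left over once all CRLF pairs are removed."""
    rest = data.replace(b"\r\n", b"")
    return rest.count(b"\r"), rest.count(b"\n")


# The naive substring test matches across the blank line ...
print(b"\n\r" in sample)
# ... while pairing up CRLF first shows that nothing is left over.
print(stray_endings(sample))
```

Removing the CRLF pairs before looking for leftovers avoids the false positive, which is roughly what reading the od -c dump by eye does as well.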

On Wed, Oct 27, 2010 at 11:31 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Hi Darren,
2010/10/27 Darren Dale <dsdale24@gmail.com>:
So the svg changes must come from the 'fix' value for the whitespace action.
I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree?
"What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors."
No mention of EOL conversions there. But yes, I guess we disagree. I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced.
I agree. But I just guess that the changes of the svgs in your pull request might be not due to eols but due to whitespace fixes.
No, it was not. I explicitly checked the svg files before and after, using open("foo.svg").readlines()[0], and saw that the files were CRLF before the commit on my branch, and LF after.
I think so because in my numpy (current master branch) I cannot see any CRLF there in the repo. Checked with ``* text=auto``, which also affects non-normalised files in the repo.
But it might be that the conversion is done silently, although I don't know how to do it like that. So no "changed" showing up implies "no non-LF eol".
This whitespace & newline thing is really painful, I suggest you set in your .gitconfig:
[core] autocrlf = true
I don't think so: "Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings." I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ?
Aha, this is a misunderstanding. Somehow I thought you're working on Windows. Is there then a specific reason not to use CRLF? I mean, you can check it in with LF anyway.
The page you mentioned is very brief and like a recipe, not my taste. I want to know what's going on in detail.
and in our numpy .gitattributes:
* text=auto
That is already included in the pull request.
Yes, I know. I meant to leave the line with the eol=crlf alone. All based on the assumption that you're working with crlf anyway, so might be wrong.
while the text=auto is more strong and a superset of autocrlf=true.
I came across this when trying if text=auto marks any files as changed, and it didn't so everything IS already LF in the repo.
Can you check this please?
Check what?
My conclusions above. We both know that in this subject all conclusions are pretty error-prone ...
I was near to leaving a comment like "asap" on github, but since this is so horribly complicated and error-prone ...
I'm starting to consider canceling the pull request.
At least we should check if it's really what we intend.
I understand now better why at all you wanted to force the .nsi.in file to be crlf. From your previous posts, i.e. that it would be the default for Win users anyway, I see now that I should have asked.
To my understanding the strategy should be two things: 1) LF force in the repo. This is independent from the .nsi.in thing, but missing currently in the official branches. We can do that at the same time. 2) Forcing the .nsi.in file to be crlf in the check-out (and only there) at all times. There is one higher level in $GITDIR, but I think we can ignore that.
To (1): The default Win user would check-in *newly created* files currently in CRLF, at least this is what I did with a not-so-recent git some time ago (other repos) .... When I switched to Mac, all my files were marked "changed". afaik git does not do normalisation if you do not tell it to do so. "While git normally leaves file contents alone, it can be configured to normalize line endings to LF in the repository and, optionally, to convert them to CRLF when files are checked out." (http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html) I still do not understand why my files showed up changed. They're still crlf, I just copied them, and vim tells [dos].
Please also confirm or show that I'm wrong with my observation of LFCR in your .nsi.in file instead of CRLF (it's swapped).
I thought this was already settled. OK, on my whitespace-cleanup branch, I modify .gitattributes to comment out the line about the nsi.in file. I check out the nsi.in file from HEAD, and:

In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';--------------------------------\n'

Then I do git checkout HEAD^ tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in :

In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';--------------------------------\r\n'

CRLF, not LFCR.
I checked as written before. http://article.gmane.org/gmane.comp.python.numeric.general/41063. This would also explain how it can make it into the repo (i.e., succeed in :-) and still be not detected by git when ``* text=auto`` is active. Git thinks it's \n since there's no \r before it, and does not consider the trailing \r.
Best wishes, Friedrich
P.S.: Might be worth putting this OL, I believe no one besides us is interested. If you agree.
I'm losing interest myself. I don't think the issue is so complicated, there just seems to be a lot of confusing misinformation being posted here.

2010/10/27 Darren Dale <dsdale24@gmail.com>:
I'm losing interest myself. I don't think the issue is so complicated, there just seems to be a lot of confusing misinformation being posted here.
I apologise for all misinformation I posted. I always double-check before sending. Believe me or not.

I think the subject is simple but hard to treat because of all those options and their interaction? It's hard for me to get a clear picture at all. I guess that's the reason why the docs are so long. Compared to the simplicity of \n != \r\n.

I will try to get some pseudo code instead of narrative cheap talking, then we can simply check. Guess the code is shorter than the docs are :(

Friedrich

attribute: 'true', 'false', '<value>', ''
attributes: text, eol, core.autocrlf
filters: left alone.
core.safecrlf: left alone.

results.
normalise: True, False
workingdir_fmt: 'lf', 'crlf'

    # Apply text.
    # Can be skipped on check-out.
    if text == 'true':
        normalise = True
    elif text == 'false':
        normalise = False
    elif text == 'auto':
        normalise = file_is_text
    elif text == '':
        if file_is_text and core.autocrlf:
            # depends on is_normalised, in_repo
            # 0 0: True
            # 0 1: False
            # 1 0: N/A (does not occur)
            # 1 1: True
            normalise = not (in_repo and not is_normalised)
            # Except it is unnormalised in repo, do normalise.
        else:
            normalise = False
    # SO we should think about "text" as "normalise_preference"
    # Two levels of indentation get necessarily prosaic ...

    # Define default fmt.
    if core.autocrlf:
        workingdir_fmt = 'crlf'
    else:
        workingdir_fmt = core.eol  # defaults to native

    # Apply eol attribute.
    if eol == 'lf':
        normalise = True
        workingdir_fmt = 'lf'
    elif eol == 'crlf':
        normalise = True
        workingdir_fmt = 'crlf'
    else:
        pass

    # Translation of crlf attribute:
    # crlf == 'true' => text = 'true'
    # crlf == 'false' => text = 'false'
    # crlf == 'input' => eol = 'lf'

CHECKING IN

    if normalise:
        checkin(normalise(file))
    else:
        checkin(file)

CHECKING OUT

    checkout(file, workingdir_fmt)

NOTES

Files are not changed without a checkin.

So I think Darren's last pull req is just perfect. I was just worried about the mechanisms that lead to the changes, but when Darren says they are all okay and sensible then I think they are.

Maybe the pseudo code would be useful for somehow git docs too, I mean the people who use git are coders and not philologists.

I checked the code but I cannot fully exclude bugs, as usual ..

Friedrich
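The pseudo code above can be turned into something that actually runs, which makes it easier to try out combinations. The sketch below only transcribes the rules as written in this message. It has not been checked against git itself, and the function name decide and its parameters are invented for illustration. core.eol's "native" default is passed in explicitly as core_eol, and autocrlf is reduced to a boolean as in the pseudo code.

```python
def decide(text="", eol="", autocrlf=False, core_eol="lf",
           file_is_text=True, in_repo=False, is_normalised=True):
    """Return (normalise, workingdir_fmt) following the pseudo code above.

    text and eol are attribute values; '' means the attribute is unset.
    """
    # Apply the text attribute.
    if text == "true":
        normalise = True
    elif text == "false":
        normalise = False
    elif text == "auto":
        normalise = file_is_text
    elif text == "":
        if file_is_text and autocrlf:
            # Leave files alone that are already unnormalised in the repo.
            normalise = not (in_repo and not is_normalised)
        else:
            normalise = False
    else:
        raise ValueError("unexpected text attribute: %r" % (text,))

    # Default working directory format.
    workingdir_fmt = "crlf" if autocrlf else core_eol

    # The eol attribute overrides both results.
    if eol in ("lf", "crlf"):
        normalise = True
        workingdir_fmt = eol
    return normalise, workingdir_fmt


print(decide(text="auto"))    # the "* text=auto" line
print(decide(eol="crlf"))     # the nsi.in line with eol=crlf
print(decide(autocrlf=True, in_repo=True, is_normalised=False))
```

With these rules, eol=crlf on the nsi.in file wins over whatever autocrlf says, which is the property the pull request relies on.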

On Wed, Oct 27, 2010 at 8:58 AM, Darren Dale <dsdale24@gmail.com> wrote:
On Wed, Oct 27, 2010 at 8:36 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Hi Darren,
2010/10/19 Darren Dale <dsdale24@gmail.com>:
I have the following set in my ~/.gitconfig file:
[apply] whitespace = fix
[core] autocrlf = input
which is attempting to correct some changes in:
branding/icons/numpylogo.svg branding/icons/numpylogoicon.svg tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
Here an excerpt from git-config:
core.autocrlf
Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.
From git-apply:
``fix`` outputs warnings for a few such errors, and applies the patch after fixing them (strip is a synonym --- the tool used to consider only trailing whitespace characters as errors, and the fix involved stripping them, but modern gits do more).
So I think your "autocrlf=input" makes the .nsi.in file checked out as LF since it's in LF in the repo, and "no output conversion is performed" due to core.autocrlf=input in your .gitconfig.
So the svg changes must come from the 'fix' value for the whitespace action.
I don't think it is a good idea to let whitespace be fixed by git and not by your editor :-) Or do you disagree?
"What are considered whitespace errors is controlled by core.whitespace configuration. By default, trailing whitespaces (including lines that solely consist of whitespaces) and a space character that is immediately followed by a tab character inside the initial indent of the line are considered whitespace errors."
No mention of EOL conversions there. But yes, I guess we disagree. I prefer to have git automatically strip any trailing whitespace that I might have accidentally introduced.
This whitespace & newline thing is really painful, I suggest you set in your .gitconfig:
[core] autocrlf = true
I don't think so: "Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings." I don't want CRLF in my working directory. Did you read http://help.github.com/dealing-with-lineendings/ ?
and in our numpy .gitattributes:
* text=auto
That is already included in the pull request.
while the text=auto is more strong and a superset of autocrlf=true.
I came across this when trying if text=auto marks any files as changed, and it didn't so everything IS already LF in the repo.
Can you check this please?
Check what?
I was near to leaving a comment like "asap" on github, but since this is so horribly complicated and error-prone ...
I'm starting to consider canceling the pull request.
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear... Chuck

2010/10/27 Charles R Harris <charlesr.harris@gmail.com>:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
Okay, I'll do that tomorrow (in ~13 hr). I feel responsible. I can try it on Mac OS and Win 7. I'll use recent mingw git for Win 7. Nevertheless I have so much warm fish atm .... :( Friedrich

Hi Chuck, On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, the following dialog appeared:

Configuring line ending conversions
How should Git treat line endings in text files?

x Checkout Windows-style, commit Unix-style line endings
  Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")

o Checkout as-is, commit Unix-style line endings
  Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").

o Checkout as-is, commit as-is
  Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")

This might warrant a very brief mention in the docs, for helping people set up their environment.

It's too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line "* text=auto" ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.

And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".)

To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a bottle of whiskey. But it's not, so I'll just beat my head against my desk until I've forgotten about this whole episode.
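The three choices in the installer dialog can be modelled in a few lines, which may help when reasoning about what ends up in the repository and in the working directory. This is a sketch of the dialog text only, assuming the file has already been classified as text. It ignores .gitattributes, core.safecrlf and git's own text/binary detection, and the function names on_commit and on_checkout are made up.

```python
def on_commit(data, autocrlf):
    """Bytes stored in the repository for a text file, per the dialog text."""
    if autocrlf in ("true", "input"):
        return data.replace(b"\r\n", b"\n")   # commit Unix-style
    return data                               # "false": commit as-is


def on_checkout(data, autocrlf):
    """Bytes written to the working directory for a text file."""
    if autocrlf == "true":
        # Assumption of this sketch: collapse CRLF first so that existing
        # CRLF pairs are not turned into CR CR LF.
        return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return data                               # "input"/"false": as-is


stored = on_commit(b"one\r\ntwo\n", "true")
print(repr(stored))
print(repr(on_checkout(stored, "true")))
print(repr(on_checkout(stored, "input")))
```

Note that the results only make sense when compared as bytes. Reading the working-directory file in text mode can hide the very difference being checked.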

On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
Hi Chuck,
On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, The following dialog appeared:
Configuring line ending conversions How should Git treat line endings in text files?
x Checkout Windows-style, commit Unix-style line endings Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")
o Checkout as-is, commit Unix-style line endings Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").
o Checkout as-is, commit as-is Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")
This might warrant a very brief mention in the docs, for helping people set up their environment. Its too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the
Yes, this would be good information to have in the notes.
repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line "* text=auto" ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.
Might be worth trying in a numpy/.gitconfig just to see what happens. Documentation isn't always complete.
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a
Grepping for CR is tricky. The straightforward way is grep ctrl-v ctrl-m. I'm pretty sure notepad uses CRLF since it doesn't do line breaks for "unix" text.
bottle of whiskey. But it's not, so I'll just beat my head against my desk until I've forgotten about this whole episode.
Oh, don't do that. Someone's got to explore the territory and, as an official old fart, I'm volunteering the younguns. Chuck

On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
Hi Chuck,
On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, The following dialog appeared:
Configuring line ending conversions How should Git treat line endings in text files?
x Checkout Windows-style, commit Unix-style line endings Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")
o Checkout as-is, commit Unix-style line endings Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").
o Checkout as-is, commit as-is Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")
This might warrant a very brief mention in the docs, for helping people set up their environment. Its too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the
Yes, this would be good information to have in the notes.
repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line "* text=auto" ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.
Might be worth trying in a numpy/.gitconfig just to see what happens. Documentation isn't always complete.
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a
maybe just something obvious: Did you read the files in python as binary 'rb' ? I checked two old git checkouts (with a one and a half year old git version), pymvpa and scikits.talkbox and both have files with \r\n as line endings on my windowsXP. I don't think I did anything special, but maybe I had used a GUI interface for those. Josef
Grepping for CR is tricky. The straightforward way is grep ctrl-v ctrl-m. I'm pretty sure notepad uses CRLF since it doesn't do line breaks for "unix" text.
bottle of whiskey. But it's not, so I'll just beat my head against my desk until I've forgotten about this whole episode.
Oh, don't do that. Someone's got to explore the territory and, as an official old fart, I'm volunteering the younguns.
Chuck
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Thu, Oct 28, 2010 at 12:23 PM, <josef.pktd@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a
maybe just something obvious: Did you read the files in python as binary 'rb' ?
No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings.

On Thu, Oct 28, 2010 at 2:40 PM, Darren Dale <dsdale24@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:23 PM, <josef.pktd@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a
maybe just something obvious: Did you read the files in python as binary 'rb' ?
No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings.
The python default for open(filename).read() or open(filename, 'r').read() is to standardize line endings to \n. Josef
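A small self-contained check of the different read modes, for reference. One caveat: this sketch is written for Python 3, where universal newline translation is the default in text mode on every platform and the 'U' flag no longer exists in recent versions. The Python 2 behaviour discussed in this thread differs between platforms. The temporary file and its contents are invented for the example.

```python
import os
import tempfile

raw = b"first\r\nsecond\n"

# Write known bytes to a scratch file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    with open(path, "rb") as f:               # binary: bytes exactly as stored
        as_binary = f.read()
    with open(path, "r", newline="") as f:    # text mode, translation disabled
        untranslated = f.read()
    with open(path, "r") as f:                # text mode, default translation
        translated = f.read()
finally:
    os.remove(path)

print(repr(as_binary))      # b'first\r\nsecond\n'
print(repr(untranslated))   # 'first\r\nsecond\n'
print(repr(translated))     # 'first\nsecond\n'
```

So to see what is really on disk, 'rb' (or newline='' in Python 3) is the safe choice. The default text mode is exactly what hides the \r.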

On Thu, Oct 28, 2010 at 3:23 PM, <josef.pktd@gmail.com> wrote:
On Thu, Oct 28, 2010 at 2:40 PM, Darren Dale <dsdale24@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:23 PM, <josef.pktd@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
And now the bad news: I have not been able to verify that Git respects the autocrlf setting or the eol setting in .gitattributes on my windows 7 computer: I made a new clone and the line endings are LF in the working directory, both on master and in my whitespace-cleanup branch (even the nsi.in file!). ("git config -l" confirms that "core.autocrlf" is "true".) To check my sanity, I tried writing files using wordpad and notepad to confirm that they are at least using CRLF, and they are *not*, according to both python's open() and grep "\r\n". If it were after noon where I live, I would be looking for a
maybe just something obvious: Did you read the files in python as binary 'rb' ?
No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings.
The python default for open(filename).read() or open(filename, 'r').read() is to standardize line endings to \n.
Although, on a mac:

In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';--------------------------------\r\n'

On 10/28/10 1:25 PM, Darren Dale wrote:
No, I did not. You are right, this shows \r\n. Why is it necessary to open them as binary? IIUC (OIDUC), one should use 'rU' to unify line endings.
Although, on a mac:
In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';--------------------------------\r\n'
that's what the 'U' is for. try:

open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in','rU').readlines()[0]

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
Hi Chuck,
On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, The following dialog appeared:
Configuring line ending conversions How should Git treat line endings in text files?
x Checkout Windows-style, commit Unix-style line endings Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")
o Checkout as-is, commit Unix-style line endings Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").
o Checkout as-is, commit as-is Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")
This might warrant a very brief mention in the docs, for helping people set up their environment. Its too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the
Yes, this would be good information to have in the notes.
repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line "* text=auto" ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.
Might be worth trying in a numpy/.gitconfig just to see what happens. Documentation isn't always complete.
Now that I understand the situation a little better, I don't think we would want such a .gitconfig in the repository itself. Most windows users would probably opt for autocrlf=true, but that is definitely not the case for mac and linux users.

I've been testing the changes in the pull request this morning on linux, mac and windows, all using git-1.7.3.1. I made a testing branch from whitespace-cleanup and added two files created on windows: temp.txt and tmp.txt. One of them was added to .gitattributes to preserve the crlf in the repo.

windows: with autocrlf=true, all files in the working directory are crlf. With autocrlf=false, files marked in .gitattributes for crlf do have crlf, the other files are lf. Check.

mac: tested with autocrlf=input. files marked in .gitattributes for crlf have crlf, others are lf. Check.

linux (kubuntu 10.10): tested with autocrlf=input and false. All files in the working directory have lf, even those marked for crlf. This is confusing. I copied temp.txt from windows, verified that it still had crlf endings, and copied it into the working directory. Git warns that crlf will be converted to lf, but attempting a commit yields "nothing to do". I had to do "git add temp.txt", at which point git status tells me the working directory is clean and there is nothing to commit. I'm not too worried about this, it's a situation that is unlikely to ever occur in practice.

I think I have convinced myself that the pull request is satisfactory. Devs should bear in mind, though, that there is a small risk when committing changes to binary files that git will corrupt such a file by incorrectly identifying and converting crlf to lf. Git should warn when line conversions are going to take place, so they can be disabled for a binary file in .gitattributes:

mybinaryfile.dat -text

That is all,
Darren
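To make the binary-file risk concrete, here is a tiny illustration of what a CRLF-to-LF conversion does to data that is not text. The byte values are a hypothetical payload chosen for the example, and normalise_to_lf is a made-up helper that only mimics the replacement. It says nothing about how git decides whether a file is text.

```python
def normalise_to_lf(data):
    """Replace every CRLF pair with LF, as a text normalisation would."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical binary payload in which the bytes 0x0D 0x0A happen to occur.
blob = bytes([0x89, 0x00, 0x0D, 0x0A, 0x1A, 0xFF])
converted = normalise_to_lf(blob)

print(len(blob), len(converted))   # one byte is silently dropped
print(converted == blob)           # the stored file no longer matches
```

A file marked with -text in .gitattributes, like mybinaryfile.dat above, is excluded from such conversions.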

On Sat, Oct 30, 2010 at 7:54 AM, Darren Dale <dsdale24@gmail.com> wrote:
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale <dsdale24@gmail.com> wrote:
Hi Chuck,
On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
I'd like to do something here, but I'm waiting for a consensus and for someone to test things out, maybe with a test repo, to make sure things operate correctly. The documentation isn't that clear...
I am getting ready to test on windows and mac. In the process of upgrading git on windows to 1.7.3.1, The following dialog appeared:
Configuring line ending conversions How should Git treat line endings in text files?
x Checkout Windows-style, commit Unix-style line endings Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")
o Checkout as-is, commit Unix-style line endings Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").
o Checkout as-is, commit as-is Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")
This might warrant a very brief mention in the docs, for helping people set up their environment. Its too bad core.autocrlf cannot be set on a per-project basis in a file that gets committed to the
Yes, this would be good information to have in the notes.
repository. As far as I can tell, it can only be set in ~/.gitconfig or numpy/.git/config. Which is why I suggested adding .gitattributes, which can be committed to the repository, and the line "* text=auto" ensures that EOLs in text files are committed as LF, so we don't have to worry about somebody's config settings having unwanted impact on the repository.
Might be worth trying in a numpy/.gitconfig just to see what happens. Documentation isn't always complete.
Now that I understand the situation a little better, I don't think we would want such a .gitconfig in the repository itself. Most windows users would probably opt for autocrlf=true, but that is definitely not the case for mac and linux users.
I've been testing the changes in the pull request this morning on linux, mac and windows, all using git-1.7.3.1. I made a testing branch from whitespace-cleanup and added two files created on windows: temp.txt and tmp.txt. One of them was added to .gitattributes to preserve the crlf in the repo.
windows: with autocrlf=true, all files in the working directory are crlf. With autocrlf=false, files marked in .gitattributes for crlf do have crlf, the other files are lf. Check.
Good, sounds like windows is safe.
mac: tested with autocrlf=input. files marked in .gitattributes for crlf have crlf, others are lf. Check.
Good.
linux (kubuntu 10.10): tested with autocrlf=input and false. All files in the working directory have lf, even those marked for crlf. This is confusing. I copied temp.txt from windows, verified that it still had crlf endings, and copied it into the working directory. Git warns that crlf will be converted to lf, but attempting a commit yields "nothing to do". I had to do "git add temp.txt", at which point git status tells me the working directory is clean and there is nothing to commit. I'm not too worried about this, its a situation that is unlikely to ever occur in practice.
That is confusing, I would have hoped that Mac and Linux would behave the same way. I'm guessing that git stores text internally with LF and adds CR on checkout on the other platforms but not linux, or at least Ubuntu.
I think I have convinced myself that the pull request is satisfactory.
Devs should bear in mind, though, that there is a small risk when committing changes to binary files that git will corrupt such a file by incorrectly identifying and converting crlf to lf. Git should warn when line conversions are going to take place, so they can be disabled for a binary file in .gitattributes:
mybinaryfile.dat -text
That is all,
Thanks for testing this out, I'll make the commit. I suppose if something goes wrong somewhere down the line we can always fix it up. I'm particularly happy with the information on Windows as that was my main concern. Chuck
participants (7)
- Charles R Harris
- Christopher Barker
- Darren Dale
- David Cournapeau
- Friedrich Romstedt
- josef.pktd@gmail.com
- Peter