Re: [Python-Dev] Running Clang 2.7's static analyzer over the code base

I'll just do a single entry saying that the static analyzer was used and not list the files.
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
On May 3, 2010 4:00 PM, "Antoine Pitrou" solipsis@pitrou.net wrote:
Benjamin Peterson <benjamin <at> python.org> writes:
2010/5/3 Brett Cannon <brett <at> python.org>:
When I check in these changes I will do it file by file, but my
question is
how to handle M...
Indeed. At most a single NEWS entry sounds sufficient.
Regards
Antoine.
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org

Le Mon, 03 May 2010 16:11:53 -0700, Brett Cannon a écrit :
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
I understand the "svn annotate" argument, but we edit code far more often than we annotate it. Also, it seems "svn ann -x -w" would ignore whitespace changes.
Regards
Antoine.

Antoine Pitrou wrote:
Le Mon, 03 May 2010 16:11:53 -0700, Brett Cannon a écrit :
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
I understand the "svn annotate" argument, but we edit code far more often than we annotate it. Also, it seems "svn ann -x -w" would ignore whitespace changes.
Let's not forget to consider what Hg can do, since that represents the future.
regards Steve

On May 03, 2010, at 10:12 PM, Steve Holden wrote:
Antoine Pitrou wrote:
Le Mon, 03 May 2010 16:11:53 -0700, Brett Cannon a écrit :
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
I understand the "svn annotate" argument, but we edit code far more often than we annotate it. Also, it seems "svn ann -x -w" would ignore whitespace changes.
Let's not forget to consider what Hg can do, since that represents the future.
Now would be a good time to convert the C files to 4 space indents. We've only been talking about it for a decade at least. While I'm sure history can be preserved across the svn->hg import, it's enough of a break to bite the bullet and fix the code. I think we only need to convert the py3k branch though.
-Barry

On Mon, May 3, 2010 at 7:34 PM, Barry Warsaw barry@python.org wrote:
Now would be a good time to convert the C files to 4 space indents. We've only been talking about it for a decade at least.
Will changing the indentation of source files to 4 space indents break patches on the bug tracker?
-- Alexandre

Will changing the indentation of source files to 4 space indents break patches on the bug tracker?
Plain patch will choke, but "patch -l" might accept them. I have only read the documentation, though, and don't know whether it does in practice.
Regards, Martin

Steve Holden wrote:
Antoine Pitrou wrote:
Le Mon, 03 May 2010 16:11:53 -0700, Brett Cannon a écrit :
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
I understand the "svn annotate" argument, but we edit code far more often than we annotate it. Also, it seems "svn ann -x -w" would ignore whitespace changes.
Let's not forget to consider what Hg can do, since that represents the future.
I think Mercurial chokes in the light of white space inconsistencies just as much as Subversion. One reason for not converting in the past was also that it would break merging, unless that whitespace normalization is applied to all these branches.
Regards, Martin

Le mardi 04 mai 2010 04:34:13, Barry Warsaw a écrit :
Now would be a good time to convert the C files to 4 space indents. (...) I think we only need to convert the py3k branch though.
It will make the port of patches (commits) between trunk and py3k much harder. Can you explain why do you want to only fix only py3k and not trunk?

Le mardi 04 mai 2010 01:24:37, Antoine Pitrou a écrit :
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
Indentation is painful on some files mixing tabs and spaces (sometimes in the same function!).
I understand the "svn annotate" argument, but we edit code far more often than we annotate it. Also, it seems "svn ann -x -w" would ignore whitespace changes.
I use svn blame to find the commit which insered a line of code. When I get the commit number, I always read the commit to ensure that the patch really insered the line, and it's not a refactoring patch. Refactoring can be about the indentation, but there are a lot of other minor changes: op->ob_size => Py_SIZE(op), (a) => a, { with a new line => { without newline, ...

Le mardi 04 mai 2010 01:24:37, Antoine Pitrou a écrit :
Le Mon, 03 May 2010 16:11:53 -0700, Brett Cannon a écrit :
And in reply to Benjamin's question about the whitespace, I guess it actually doesn't matter. More important to clean up in py3k.
Would it be finally time to standardize all C files on a 4-spaces indentation rule?
Oh, if we decide to reindent all C files, would it be also possible to strip trailing spaces?

On May 04, 2010, at 07:16 AM, Martin v. Löwis wrote:
I think Mercurial chokes in the light of white space inconsistencies just as much as Subversion. One reason for not converting in the past was also that it would break merging, unless that whitespace normalization is applied to all these branches.
Then let's do py3k and release31-maint, or whatever they will be called under hg. Once 2.7 is released and we're on hg, how much back porting of revisions from Python 3 -> 2 is going to happen?
-Barry

Barry Warsaw <barry <at> python.org> writes:
Then let's do py3k and release31-maint, or whatever they will be called under hg. Once 2.7 is released and we're on hg, how much back porting of revisions from Python 3 -> 2 is going to happen?
Probably quite a bit still; all C extension bug fixes for example.
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
Regards
Antoine.

On Tue, May 4, 2010 at 14:41, Antoine Pitrou solipsis@pitrou.net wrote:
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
If the script is robust enough, I can run it as part of the conversion, making sure that all branches get cleaned up...
Cheers,
Dirkjan

Dirkjan Ochtman wrote:
On Tue, May 4, 2010 at 14:41, Antoine Pitrou solipsis@pitrou.net wrote:
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
If the script is robust enough, I can run it as part of the conversion, making sure that all branches get cleaned up...
Could this be done as part of the conversion without affecting the history?

Le mardi 04 mai 2010 14:41:42, Antoine Pitrou a écrit :
Barry Warsaw <barry <at> python.org> writes:
Then let's do py3k and release31-maint, or whatever they will be called under hg. Once 2.7 is released and we're on hg, how much back porting of revisions from Python 3 -> 2 is going to happen?
Probably quite a bit still; all C extension bug fixes for example.
2.7 will be maintained for long time, longer than any other 2.x version.
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
We should also add pre-commit scripts to avoid the reintroduction of tabulations in C (and Python?) files.

On May 4, 2010, at 8:41 AM, Antoine Pitrou wrote:
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
Is it really that simple?
People often do the following:
- use tabs inside strings ("\t" should be used instead); - use tabs for alignment of inline comments on the right side; - mix tabs and spaces for indentation whitespace (continuation lines); - use tabs for alignment inside comments.
A simple replacement of a tab with 4 spaces as you propose does not work on such a code.
To do it properly, you may end up reinventing `indent` (UNIX program).
What if there was a set of `indent` flags defined for PEP-7? It would be a nice tool to check whether a new code complies.
However, running `indent` on an existing code would probably cause non-whitespace changes too as it rearranges the comments and code.
I'm afraid that any other kind of "script + visual pass + post-edit" would incur similar changes if done properly (only more tedious).
So, the question is what bothers developers more:
- old C files with tab indentation, or - a lot of changes in version control to fix them?
Both? :-)
Zvezdan

On Tue, May 4, 2010 at 08:27, Zvezdan Petkovic zvezdan@zope.com wrote:
On May 4, 2010, at 8:41 AM, Antoine Pitrou wrote:
I think we should reindent all 3 branches. Most of the work can probably
be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
I think we should reindent all 3 branches. Most of the work can probably
be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
Is it really that simple?
People often do the following:
- use tabs inside strings ("\t" should be used instead);
- use tabs for alignment of inline comments on the right side;
- mix tabs and spaces for indentation whitespace (continuation lines);
- use tabs for alignment inside comments.
A simple replacement of a tab with 4 spaces as you propose does not work on such a code.
To do it properly, you may end up reinventing `indent` (UNIX program).
What if there was a set of `indent` flags defined for PEP-7? It would be a nice tool to check whether a new code complies.
`make patchcheck` already does this for Python code. Adding to the command to handle C code wouldn't be hard.
-Brett
However, running `indent` on an existing code would probably cause non-whitespace changes too as it rearranges the comments and code.
I'm afraid that any other kind of "script + visual pass + post-edit" would incur similar changes if done properly (only more tedious).
So, the question is what bothers developers more:
- old C files with tab indentation, or
- a lot of changes in version control to fix them?
Both? :-)
Zvezdan
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org

Le Tue, 04 May 2010 11:27:58 -0400, Zvezdan Petkovic a écrit :
A simple replacement of a tab with 4 spaces as you propose does not work on such a code.
Right, I was simplifying a bit.
I've just written a script which does a slightly more elaborate reindentation of C files, but still with the main objective of replacing tabs with 4-spaces indents. It shows good results on most of the source tree. In particular, it handles continuation lines and keeps vertical alignment correct in the majority of cases. This script could probably be applied with little a posteriori manual intervention.
It can be found in trunk/sandbox/untabify/untabify.py.
Regards
Antoine.

Victor Stinner wrote:
We should also add pre-commit scripts to avoid the reintroduction of tabulations in C (and Python?) files.
Python and ReST files are already covered (with reindent.py and reindent-rst.py to fix any files that get mixed up).
"make patchcheck" includes the precommit checks for both of those file types.
It's only C files that haven't been fixed yet (because their whitespace is such a mess overall, whereas the others were already pretty clean even before the precommit hooks were added).
Cheers, Nick.

Zvezdan Petkovic wrote:
So, the question is what bothers developers more:
- old C files with tab indentation, or
- a lot of changes in version control to fix them?
Both?
C files that mix tabs and spaces are actually the main source of pain when editing :)
Cheers, Nick.

Eric Smith wrote:
Dirkjan Ochtman wrote:
On Tue, May 4, 2010 at 14:41, Antoine Pitrou solipsis@pitrou.net wrote:
I think we should reindent all 3 branches. Most of the work can probably be scripted (str.replace("\t", " " * 4)), and then a visual pass is necessary to fix vertical alignments and the like.
If the script is robust enough, I can run it as part of the conversion, making sure that all branches get cleaned up...
Could this be done as part of the conversion without affecting the history?
I don't think that would be a good idea - it's just whitespace, but I think we risk more problems by leaving it out of the history than we do by preserving.
Howver, if we delay fixing the C file indentation until we're already on hg, the merge tools should offer the best chance of being able to apply pre-fix patches and have the software figure out where the whitespace changes need to be accounted for.
If we do it before, or don't record the change in the history at all, then hg won't have any changeset information to work with and will almost certainly get just as confused by the whitespace changes as the basic "patch" command would.
Cheers, Nick.

Le mercredi 05 mai 2010 à 08:41 +1000, Nick Coghlan a écrit :
Howver, if we delay fixing the C file indentation until we're already on hg, the merge tools should offer the best chance of being able to apply pre-fix patches and have the software figure out where the whitespace changes need to be accounted for.
I just tried using "patch -l" after having changed the indentation of a C source file and it worked fine. The only issue being that the patched lines of code have the wrong indentation (they get their indentation from the patch, not from the input file), but it's probably unavoidable.
Regards
Antoine.

Le mercredi 05 mai 2010 00:41:04, Nick Coghlan a écrit :
Howver, if we delay fixing the C file indentation until we're already on hg, the merge tools should offer the best chance of being able to apply pre-fix patches and have the software figure out where the whitespace changes need to be accounted for.
If we wait for Hg to solve all of our problems, the migration from svn to hg will be harder and longer.

Nick Coghlan <ncoghlan <at> gmail.com> writes:
Howver, if we delay fixing the C file indentation until we're already on hg, the merge tools should offer the best chance of being able to apply pre-fix patches and have the software figure out where the whitespace changes need to be accounted for.
For the record, I've added to the untabify script a patch rewriting option ("-p") which reindents all patch hunks for C files containing tabs. It should minimize manual reformatting work with existing patches.
Regards
Antoine.

Le mardi 04 mai 2010 07:12:32, Martin v. Löwis a écrit :
Will changing the indentation of source files to 4 space indents break patches on the bug tracker?
Plain patch will choke, but "patch -l" might accept them.
Tested on posixmodule.c: it works :-)

On Wed, May 5, 2010 at 8:10 AM, Antoine Pitrou solipsis@pitrou.net wrote:
For the record, I've added to the untabify script a patch rewriting option ("-p") which reindents all patch hunks for C files containing tabs. It should minimize manual reformatting work with existing patches.
I just tried '-p' with a local patch that I had created pre-whitespace-fixes. It worked like a charm. Thanks Antoine.
-- Meador
participants (12)
-
"Martin v. Löwis"
-
Alexandre Vassalotti
-
Antoine Pitrou
-
Barry Warsaw
-
Brett Cannon
-
Dirkjan Ochtman
-
Eric Smith
-
Meador Inge
-
Nick Coghlan
-
Steve Holden
-
Victor Stinner
-
Zvezdan Petkovic