Compare source code

John Nagle nagle at
Thu Nov 4 17:48:43 CET 2010

On 10/31/2010 6:52 AM, jf wrote:
> On 31/10/2010 13:10, Martin v. Loewis wrote:
>>> I've a project with tabs and spaces mixed (yes I know it's bad).
>>> I edit each file to remove tabs, but it's so easy to make a mistake.
>>> Do you know of a tool to compare the initial file with the cleaned one to
>>> know whether the algorithms are the same?
>> Tools/scripts/ of the standard Python distribution normalizes
>> white space in source code.
> Great, that saves me time!
> Should I be worried about this comment in "So long as the
> input files get a clean bill of health from, reindent should
> do a good job." ?

    Are both of those tools consistent with the interpretation of mixed 
tabs and spaces in Python 3.x?

    The current CPython parser front end is smart about this.  Tabs
and spaces can be mixed provided that the semantics of the program
do not depend on the width of a tab.  This was not the case in
early 2.x versions.  (When did that go in?)

    The key to doing this right is to compare the whitespace of an
indented line with the line above it.  The longer whitespace string must
start with the shorter whitespace string, whether it's tabs, spaces, or
any combination thereof.  If it does not, the indentation is ambiguous.
For all unambiguous cases, you can then convert tabs to spaces or vice
versa, at any number of spaces per tab, and the semantics of the
program will not change.
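
    The prefix rule above can be sketched in a few lines of Python.
This is only an illustration, not what CPython, "tabnanny", or
"reindent" actually run: the function names are made up here, and the
sketch compares each non-blank line only with the non-blank line before
it, ignoring continuation lines, multi-line strings, and comments.

```python
def leading_whitespace(line):
    """Return the run of spaces and tabs at the start of a line."""
    stripped = line.lstrip(" \t")
    return line[:len(line) - len(stripped)]


def ambiguous_indents(source):
    """Return 1-based numbers of lines whose indentation is ambiguous
    relative to the previous non-blank line (hypothetical helper)."""
    problems = []
    previous = None
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no indentation information
        current = leading_whitespace(line)
        if previous is not None:
            # The longer whitespace string must start with the shorter one.
            shorter, longer = sorted((previous, current), key=len)
            if not longer.startswith(shorter):
                problems.append(number)
        previous = current
    return problems
```

    A tab followed by a line indented with eight spaces is flagged,
because neither whitespace string is a prefix of the other; a tab
followed by a tab plus four spaces is accepted.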

    It's not clear whether "tabnanny" and "reindent" apply the same
prefix rule as CPython.  The documentation doesn't say.  If they don't,
they should, or it should be documented that they're broken.
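
    Whichever tool does the cleanup, one independent check on the
original question is to compare the syntax trees of the two files.
The following is a rough sketch using the standard "ast" module; the
function name is hypothetical, and it assumes both sources parse under
the interpreter running the check (ast.parse raises SyntaxError
otherwise, which includes the ambiguous tab/space case on Python 3).

```python
import ast


def same_program(original_source, cleaned_source):
    """Return True if both sources parse to the same syntax tree.

    ast.dump() omits line and column attributes by default, so a
    change that only affects indentation does not alter the result.
    """
    original_tree = ast.dump(ast.parse(original_source))
    cleaned_tree = ast.dump(ast.parse(cleaned_source))
    return original_tree == cleaned_tree
```

    Tabs inside string literals are part of the tree, so a cleanup that
rewrites them would show up as a difference, which is the desired
result.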

				 John Nagle
