How to ignore white space changes using difflib?

Grant Edwards invalid at invalid
Wed Apr 8 12:11:27 EDT 2009


On 2009-04-08, Grant Edwards <invalid at invalid> wrote:

> I'm trying to use difflib to compare strings ignoring changes
> to white-space (space/tab).  According to the doc page, you
> can do this by specifying a "charjunk" parameter to filter out
> characters:
>
>    charjunk: A function that accepts a character (a string of
>    length 1), and returns if the character is junk, or false if
>    not. The default is module-level function
>    IS_CHARACTER_JUNK(), which filters out whitespace characters
>    (a blank or tab; note: bad idea to include newline in
>    this!).

Apparently "filtering out" characters doesn't mean that
they're ignored when doing the comparison.  (A bit of a "WTF?"
if you ask me).  After some more googling, it appears that I'm
far from the first person who interpreted "filtered out" as
"ignored when comparing lines". I'd submit a fix for the doc
page, but you apparently have to be a lot smarter than me to
figure out what "filters out" means in this context.
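
For anyone else who trips over this, here is a minimal sketch of the
behaviour, assuming difflib.ndiff() is the entry point (the sample
strings are made up for illustration). Passing IS_CHARACTER_JUNK as
charjunk does not make two lines that differ only in spacing compare
equal:

```python
import difflib

# Two hypothetical one-line inputs that differ only in the amount of whitespace.
a = ["foo bar\n"]
b = ["foo   bar\n"]

# IS_CHARACTER_JUNK is ndiff's default charjunk; it is passed explicitly
# here only to make the point obvious.
delta = list(difflib.ndiff(a, b, charjunk=difflib.IS_CHARACTER_JUNK))

# The line is still reported as removed and re-added, not as unchanged.
changed = [d for d in delta if d.startswith(("- ", "+ "))]

print("".join(delta))
```

The line still comes back as a "- " / "+ " pair, so charjunk does
not make ndiff treat the two lines as the same.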

I guess I can collapse all whitespace sequences, do the diff on
the collapsed lines, and then map the results back to the
original lines. :/
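
Roughly what I have in mind, as an untested-in-anger sketch: the
helper names collapse() and diff_ignoring_whitespace() are my own
inventions, not anything in difflib, and "whitespace" is assumed to
mean runs of spaces and tabs only. It diffs the collapsed lines with
SequenceMatcher and uses the opcode indices to report the original
lines:

```python
import difflib
import re


def collapse(line):
    """Replace each run of spaces/tabs with a single space."""
    return re.sub(r"[ \t]+", " ", line)


def diff_ignoring_whitespace(a, b):
    """Yield (marker, original_line) pairs, comparing lines modulo whitespace runs.

    marker is ' ' for lines that match after collapsing, '-' for lines
    only in a, and '+' for lines only in b.
    """
    matcher = difflib.SequenceMatcher(
        None, [collapse(x) for x in a], [collapse(x) for x in b]
    )
    # Opcode indices refer to positions in the collapsed lists, which line
    # up one-to-one with the original lists.
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for line in a[i1:i2]:
                yield " ", line
        else:
            for line in a[i1:i2]:
                yield "-", line
            for line in b[j1:j2]:
                yield "+", line


if __name__ == "__main__":
    old = ["x = 1\n", "y  =\t2\n", "z = 3\n"]
    new = ["x   =   1\n", "y = 2\n", "z = 4\n"]
    for marker, line in diff_ignoring_whitespace(old, new):
        print(marker, line, end="")
```

For lines that match after collapsing, this reports the version from
the first sequence; whether that or the second is the right choice
depends on what you want to show.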

> What am I doing wrong?

Reading the doc page, apparently. ;)

-- 
Grant Edwards                   grante             Yow! Not SENSUOUS ... only
                                  at               "FROLICSOME" ... and in
                               visi.com            need of DENTAL WORK ... in
                                                   PAIN!!!


