How should I compare two txt files separately coming from windows/dos and linux/unix

John Machin sjmachin at lexicon.net
Thu Jun 11 01:08:41 EDT 2009


Chris Rebert <clp2 <at> rebertia.com> writes:

> 
> On Wed, Jun 10, 2009 at 8:11 PM, higer<higerinbeijing <at> gmail.com> wrote:
> > I just want to compare two files, one from Windows and the other from
> > Unix, but I do not want to compare them by reading them line by
> > line. I found the filecmp module, which is used for file and
> > directory comparisons. However, when I test its cmp function with two
> > identical files (one from Unix, one from Windows, with the same
> > content), filecmp.cmp returns False.
> >
> > Later, I found that Windows uses '\r\n' as its newline sequence but Unix
> > uses '\n', so filecmp.cmp treats the files as different and returns False.
> > So, can anyone tell me whether there is an option like IgnoreNewline
> > that ignores the difference in newline sequences between different
> > platforms? If not,I think filecmp may be not a good file comparison
> 
> Nope, there's no such flag. You could run the files through either
> `dos2unix` or `unix2dos` beforehand though, which would solve the
> problem.
> Or you could write the trivial line comparison code yourself and just
> make sure to open the files in Universal Newline mode (add 'U' to the
> `mode` argument to `open()`).
> You could also file a bug (a patch to add newline insensitivity would
> probably be welcome).
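
A rough sketch of the line-comparison route, for what it's worth. The
function name `same_text` is made up for illustration. It assumes Python 3,
where opening a file in text mode with `newline=None` (the default) already
translates '\r\n' and '\r' to '\n' on read, so the 'U' flag is not needed
there. It also assumes both files decode with the platform's default
encoding.

```python
from itertools import zip_longest


def same_text(path_a, path_b):
    """Return True if both files contain the same lines, ignoring newline style.

    Hypothetical helper, not part of filecmp.
    """
    # newline=None turns on universal-newline translation: '\r\n' and '\r'
    # are both read back as '\n'.
    with open(path_a, newline=None) as fa, open(path_b, newline=None) as fb:
        # zip_longest pads the shorter file with None, so files with a
        # different number of lines compare as unequal.
        return all(a == b for a, b in zip_longest(fa, fb))
```

This still iterates line by line internally, but the caller only sees a
single True/False answer, much like filecmp.cmp.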

Or popen diff ...

A /very/ /small/ part of the diff --help output:

  -E  --ignore-tab-expansion  Ignore changes due to tab expansion.
  -b  --ignore-space-change  Ignore changes in the amount of white space.
  -w  --ignore-all-space  Ignore all white space.
  -B  --ignore-blank-lines  Ignore changes whose lines are all blank.
  -I RE  --ignore-matching-lines=RE  Ignore changes whose lines all match RE.
  --strip-trailing-cr  Strip trailing carriage return on input.
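
Something along these lines should do for calling diff from Python. It is
only a sketch: it assumes a GNU diff on the PATH that understands
--strip-trailing-cr, it uses the subprocess module rather than os.popen,
and the names `diff_command` and `same_ignoring_cr` are invented for the
example. The -q (--brief) flag is not in the excerpt above; it just tells
diff to report whether the files differ rather than print the differences.

```python
import subprocess


def diff_command(path_a, path_b):
    """Build the argument list for diff (hypothetical helper)."""
    # -q: only report whether the files differ.
    # --strip-trailing-cr: drop a trailing carriage return from input lines.
    return ["diff", "-q", "--strip-trailing-cr", path_a, path_b]


def same_ignoring_cr(path_a, path_b):
    """Return True if diff finds no differences once trailing CRs are stripped."""
    result = subprocess.run(
        diff_command(path_a, path_b),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # diff exits with 0 when the files match, 1 when they differ,
    # and 2 when something went wrong (e.g. a missing file).
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(result.stderr.decode(errors="replace"))
```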

Cheers,
John






