How should I compare two txt files separately coming from windows/dos and linux/unix
sjmachin at lexicon.net
Thu Jun 11 07:08:41 CEST 2009
Chris Rebert <clp2 <at> rebertia.com> writes:
> On Wed, Jun 10, 2009 at 8:11 PM, higer<higerinbeijing <at> gmail.com> wrote:
> > I just want to compare two files, one from Windows and the other from
> > Unix, but I do not want to compare them by reading them line by
> > line. I found the filecmp module, which is used for file and
> > directory comparisons. However, when I test its cmp function on two
> > identical files (one from Unix, one from Windows, with the same
> > content), filecmp.cmp returns False.
> > Later, I found that Windows uses '\r\n' as the newline marker but Unix
> > uses '\n', so filecmp.cmp thinks they are different and returns False.
> > So, can anyone tell me whether there is any option like IgnoreNewline
> > which can ignore the difference in newline markers between different
> > platforms? If not, I think filecmp may be not a good file comparison
> Nope, there's no such flag. You could run the files through either
> `dos2unix` or `unix2dos` beforehand though, which would solve the
> Or you could write the trivial line comparison code yourself and just
> make sure to open the files in Universal Newline mode (add 'U' to the
> `mode` argument to `open()`).
> You could also file a bug (a patch to add newline insensitivity would
> probably be welcome).
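For what it's worth, the "trivial line comparison" idea might look roughly like the sketch below. It assumes Python 3, where text-mode open() with newline=None already does universal-newline translation (so no 'U' flag is needed); the function name same_text and the utf-8 default are my own choices, not anything from filecmp.

```python
from itertools import zip_longest


def same_text(path_a, path_b, encoding="utf-8"):
    """Return True if two text files match once line endings are normalised.

    Hypothetical helper, not part of filecmp.
    """
    # newline=None (the default) turns '\r\n' and '\r' into '\n' on read.
    with open(path_a, newline=None, encoding=encoding) as fa, \
         open(path_b, newline=None, encoding=encoding) as fb:
        # zip_longest pads the shorter file with None, so files with a
        # different number of lines compare as unequal.
        return all(a == b for a, b in zip_longest(fa, fb))
```

Calling same_text("from_windows.txt", "from_unix.txt") (made-up file names) reads both files lazily, so it stops at the first differing line.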
Or popen diff ...
A /very/ /small/ part of the diff --help output:
  -E     --ignore-tab-expansion       Ignore changes due to tab expansion.
  -b     --ignore-space-change        Ignore changes in the amount of white space.
  -w     --ignore-all-space           Ignore all white space.
  -B     --ignore-blank-lines         Ignore changes whose lines are all blank.
  -I RE  --ignore-matching-lines=RE   Ignore changes whose lines all match RE.
         --strip-trailing-cr          Strip trailing carriage return on input.
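
A rough sketch of driving diff from Python, assuming a GNU diff on the PATH (the --strip-trailing-cr option shown above is not available in every diff implementation) and a Python new enough to have subprocess.run. The function name diff_ignoring_cr is made up for illustration.

```python
import subprocess


def diff_ignoring_cr(path_a, path_b):
    """Return True if diff finds no differences, ignoring trailing CRs.

    Hypothetical helper; assumes GNU diff is installed.
    """
    result = subprocess.run(
        ["diff", "-q", "--strip-trailing-cr", path_a, path_b],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    # diff exits with 0 for "same", 1 for "different", 2 for trouble.
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError("diff failed: " + result.stderr.strip())
```

The -q option only asks whether the files differ; drop it and capture stdout instead if you want the actual differences.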