[Tutor] Diffing two files.

Ertl, John john.ertl at fnmoc.navy.mil
Sat Jan 29 01:10:27 CET 2005


Kent

What I need to do is find what should be common and see if it really is.  I
have two output files.  Each file will have a bunch of system stuff, then the
text of interest, and then a bunch more system stuff.  The system stuff may
be different for each file, but the text of interest will always have a fixed
line in front of it and behind it.

The idea is to get the text of interest (using the known beginning and
ending flags in the text) from each file and then check to make sure the
text of interest is the same in both files. 
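
A rough sketch of that idea in Python follows.  The names extract_section and
sections_match are made up for illustration, and the sketch assumes each flag
sits alone on its own line and matches exactly (as in the simplified example
further down).

```python
def extract_section(lines, begin_flag, end_flag):
    """Return the lines strictly between begin_flag and end_flag.

    `lines` can be any iterable of strings, e.g. an open file.
    Assumption: each flag appears as a complete line of its own.
    """
    section = []
    inside = False
    for line in lines:
        text = line.rstrip("\r\n")  # drop the line ending only
        if not inside:
            if text == begin_flag:
                inside = True
        elif text == end_flag:
            return section
        else:
            section.append(text)
    raise ValueError("beginning or ending flag not found")


def sections_match(path_a, path_b, begin_flag, end_flag):
    """True if the flagged section is identical in both files."""
    with open(path_a) as file_a, open(path_b) as file_b:
        section_a = extract_section(file_a, begin_flag, end_flag)
        section_b = extract_section(file_b, begin_flag, end_flag)
    return section_a == section_b
```

Comparing the two lists with == checks both content and order, which is
probably what is wanted here; sets would ignore order and repeated lines.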

I have not done much text stuff so this is new territory for me.  I will
take a look at difflib.

Thanks again

John Ertl

Simplified example of a text file:

Sldfsdf
Sdfsdfsf
Sdfsdfsdfwefs
Sdcfasdsgerg
Vsadgfasgdbgdfgsdf
-Beginning flag
This
Text
Should be
The
Same in the other file.
-Ending flag
Sdfsdfsdfsd
Sdfsdfsdfasd
Sdfsadfsdf
Sdfsadfasdf
Sdfsdfasd
Sdfasdf
s


-----Original Message-----
From: Kent Johnson [mailto:kent37 at tds.net]
Sent: Friday, January 28, 2005 15:23
Cc: Tutor at python.org
Subject: Re: [Tutor] Diffing two files.

You don't really say what you are trying to accomplish. Do you want to
identify the common text, or find the pieces that differ?

If the common text is always the same and you know it ahead of time, you can
just search the lines of each file to find it.
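
For instance, a minimal sketch of such a search might look like this.  The
name find_block is hypothetical, and it assumes both the file and the known
text are already held as lists of lines without line endings.

```python
def find_block(lines, expected):
    """Return the index where `expected` occurs as consecutive lines.

    Returns -1 if the block is not present in `lines`.
    """
    n = len(expected)
    for start in range(len(lines) - n + 1):
        # Compare a window of n lines against the known text.
        if lines[start:start + n] == expected:
            return start
    return -1
```

Running this against each file and checking that neither result is -1 would
confirm that the known text is present in both.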

If you need to identify the common part, difflib might be useful. There is
an example on this page of finding matching blocks of two sequences:
http://docs.python.org/lib/sequencematcher-examples.html

In your case the sequences will be lists of lines rather than strings (which
are sequences of characters).
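
As an illustration only, here is a sketch of how that could look with lists of
lines.  The helper name common_blocks is made up; SequenceMatcher and
get_matching_blocks() are the difflib calls, and autojunk=False is an
assumption that the junk heuristic is not wanted for short inputs like these.

```python
from difflib import SequenceMatcher


def common_blocks(lines_a, lines_b):
    """Return the runs of lines that difflib reports as matching."""
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    blocks = []
    for match in matcher.get_matching_blocks():
        # The last block returned is always a zero-length sentinel.
        if match.size:
            blocks.append(lines_a[match.a:match.a + match.size])
    return blocks
```

Each returned block is a list of lines that appears in both inputs, in the
order it was found in the first one.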

Kent

Ertl, John wrote:
> All,
>
> I have two text files that should contain a section of text that is the
> same.  Luckily the section of text has a defined beginning and end.  It
> looks like the most straightforward thing would be to read the targeted
> text from each file (only 50 lines or so) into lists and then compare the
> lists.  I would think I could use sets to find a unique list (hopefully
> there would not be anything)...or I could do line by line comparison.
> Any advice on which is the better method?  Should I avoid the list
> comparison approach...is there a built in way of comparing entire files
> instead of dealing explicitly with the lines?
>
> Thanks,
>
> John Ertl
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>
