[Tutor] File Compare

Dave Angel davea at davea.name
Fri Jan 9 07:22:29 CET 2015


On 01/09/2015 12:32 AM, Crusier wrote:
> Hi,
>

Please specify the Python version in any new question.  I'll assume Python 3.4.

Thank you for posting in plain text rather than HTML.  But realize that 
attachments are also a problem for many people, as this forum goes 
through many gateways, some of which don't support attachments.  Just 
make the program and data self-contained in the message, by keeping the 
data small enough to paste in place.

> I am comparing two files, and the program has a problem comparing the
> two files. I am currently using Python 3.3 and Windows 7.
>
> Attached is my code:
>
> import difflib
>
> old_file = open('0105-up.txt','r')
> file_contentsA = old_file.read()
> A = file_contentsA.split(",")
> print(A)
> print()
>
>
> new_file = open('0106-up.txt','r')
> file_contentsB = new_file.read()
> B = file_contentsB.split(",")
> print(B)
> print()
>
> print('\n'.join(difflib.unified_diff(A, B)))
>
> old_file.close()
> new_file.close()
>
> When the result comes out, I have noticed that some of the numbers are
> in both files and the program fails to show the difference. Please help.
>

difflib is doing exactly what it's supposed to.  You passed it two 
lists, each with a single element.  The two elements were different, so 
it showed you both the old and the new version of that element.

Your problem is that you're trying to split on commas, but there are no 
commas in your sample data.  You probably should be splitting on 
whitespace, like Steven showed you on 10/13.

Change split(",") to split() and you've solved one problem.
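
To see why the comma split gives a one-element list, here is a small 
sketch.  The sample text is made up (the real file contents weren't 
posted), so treat the values as an assumption:

```python
# Hypothetical sample data; the real file contents were not shown.
text = "0001 0002 0005\n0011 0012\n"

# There are no commas in the text, so the whole string stays one element.
by_comma = text.split(",")

# With no argument, split() breaks on any run of whitespace,
# including the newlines between lines of the file.
by_space = text.split()

print(len(by_comma))
print(by_space)
```

With data like this, the comma version has length 1, while split() 
gives one list item per number.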

The next problem is that the two files are not in the same order.  
difflib looks for differences in sequences of items by matching up 
identical ones, and reporting the breaks in the matches (roughly).  If 
your data is scrambled between one file and the other, you'll need to 
sort it.

The easiest way to do that is to sort both lists in place:

A.sort()
B.sort()
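
Putting the two fixes together might look roughly like the sketch 
below.  The item values and the names old_items and new_items are 
invented for illustration.  Note that these are strings, so sort() 
orders them as text; that matches numeric order only if the numbers 
all have the same width.

```python
import difflib

# Hypothetical contents of the two files, already read into strings.
old_items = "0005 0001 0003 0002".split()
new_items = "0002 0004 0001 0005".split()

# Sort in place so identical items line up in both sequences.
old_items.sort()
new_items.sort()

# lineterm="" keeps the header lines from ending in a newline,
# which suits joining the output with "\n" afterwards.
diff_lines = list(difflib.unified_diff(old_items, new_items, lineterm=""))
print("\n".join(diff_lines))
```

Lines starting with "-" appear only in the old list, lines starting 
with "+" only in the new one, and lines starting with a space are 
unchanged context.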

The next possible problem is that difflib.unified_diff shows lines of 
context around each difference.  You may or may not like that, and if 
you don't, there are various ways to change it.  But that you'll have to 
judge, and perhaps ask about specifically.
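
One of those ways, if it turns out you want it, is the n parameter of 
unified_diff, which sets the number of context lines (the default is 
3).  A minimal sketch, again with made-up data:

```python
import difflib

# Hypothetical, already sorted lists of items.
old_items = ["0001", "0002", "0003", "0005"]
new_items = ["0001", "0002", "0004", "0005"]

# n=0 asks for no context lines around each difference.
diff_lines = list(
    difflib.unified_diff(old_items, new_items, n=0, lineterm="")
)

# Keep only the added/removed items, dropping the "---"/"+++" file
# headers and the "@@" hunk headers.
changes = [
    line for line in diff_lines
    if line.startswith(("+", "-"))
    and not line.startswith(("+++", "---"))
]
print(changes)
```

The filter on "+++" and "---" assumes no data item itself begins with 
two plus or minus signs, which holds for plain numbers.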



-- 
DaveA

