[Tutor] Comparing lines in two files, writing result into a third file

stuart_clemons@us.ibm.com stuart_clemons@us.ibm.com
Sat Apr 26 10:11:02 2003

Hi Scott:

I just wanted to say thanks again.   I was able to spend time breaking down
the code you provided. (Start with a few lines of code snippet,  add print
out variables, run code, see exactly what was going on, add more code
snippet, print out variables, etc.).

Wow.  Clear, concise and dead-on !  (I'm not worthy !!!).  Extremely
eloquent in its simplicity. This really clears up the problem I had in the
past when I tried to read a file into a dictionary.  This structure worked
perfectly for my immediate problem and I can see that it will work
perfectly for variations of this merge report that I want to provide.

This weekend I hope to look at Danny and Pan's approaches as a learning
exercise.  Danny got me thinking about code efficiency.  I hope to look at
some Python code I wrote about a year ago (that's remarkably still being
used) when I last worked with Python.  I'm still a newbie, but I was
really a newbie then.  I know that that code could be done much more

Anyway, enough rambling.  I really feel like I learned a lot just by asking
one question.  Getting this information (and seeing some success in using
it) has really got me psyched about Python.  Thanks again. This is a great

- Stuart

----- Forwarded by Stuart Clemons/Westford/IBM on 04/26/03 09:37 AM -----
From:    Stuart Clemons
Date:    04/24/03 07:42 AM
To:      Scott Widney <SWidney@ci.las-vegas.nv.us>
cc:      tutor@python.org
Subject: RE: [Tutor] Comparing lines in two files, writing result into a third file

Hi Scott:

Thanks for laying out the dictionary structure for me.  I wanted to use
dictionaries a year ago for something I was working on, but I couldn't get
dictionaries to work for me (it was very frustrating), so I ended up
hacking something else together.   I think that was about the last time I
needed to use Python for anything.

Anyway, I'm going to try using this structure to solve the problem I'm
working on.  I need to produce this "merged" list fairly quickly (like
today) and then on a regular basis.

To Danny and Pan:  Thanks very much for contributing thoughts and code
related to this problem.  Since it looks like I'll have a need to use
Python for the foreseeable future, as a learning exercise, I'm going to try
each of these approaches to this problem.   Most of the work I'll need
Python for is similar to this problem.  (Next up is formatting a dump of a
text log file into a readable report. I think I know how to handle this
one, but if not, as Arnold would say, I'll be back !)
Thanks again.

- Stuart

> Concerning dictionaries, do you think dictionaries is the structure
> to use ? If so, I'll try to spend some time reading up on
> dictionaries.  I do remember having problems reading a file into a
> dictionary when I tried it a year ago or so.

Since you're pressed for time, I can give you a basic script using a

d = {} # Start with an empty dictionary

f1 = open('file1.txt', 'r')
for num in f1.readlines():
    num = num.strip()       # get rid of any nasty newlines
    d[num] = 1              # and populate
f1.close()

f2 = open('file2.txt', 'r')
for num in f2.readlines():
    num = num.strip()                # again with the newlines
    if num in d: d[num] += 1         # <- increment value, or
    else: d[num] = 1                 # <- create a new key
f2.close()

nums = d.keys()
f3 = open('file3.txt', 'w')
for num in nums:
    f3.write(num)          # Here we put the
    if d[num] > 1:         # newlines back, either
        f3.write("*\n")    # <- with
    else:                  # or
        f3.write("\n")     # <- without
f3.close()                 # the asterisk

Should be fairly quick. And it's certainly easier to flash-parse with the
naked eye than a value-packed list comprehension.
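
For anyone reading this thread with a current Python, here is a rough sketch of the same dictionary approach wrapped in a function. It is only one possible reading of the script above, not a replacement for it. The function name merge_with_marks, the temporary directory, and the sample file contents are all made up for illustration.

```python
import os
import tempfile


def merge_with_marks(path1, path2, out_path):
    """Write each distinct line from both files; star lines found in both."""
    counts = {}  # stripped line -> 1 if seen once, more if matched in file 2

    with open(path1) as f1:
        for line in f1:
            counts[line.strip()] = 1          # populate from the first file

    with open(path2) as f2:
        for line in f2:
            key = line.strip()
            counts[key] = counts.get(key, 0) + 1   # increment or create

    with open(out_path, "w") as out:
        for key, seen in counts.items():
            out.write(key + ("*" if seen > 1 else "") + "\n")
    return counts


# Small demonstration with throwaway files (contents are invented).
workdir = tempfile.mkdtemp()
p1 = os.path.join(workdir, "file1.txt")
p2 = os.path.join(workdir, "file2.txt")
p3 = os.path.join(workdir, "file3.txt")
with open(p1, "w") as f:
    f.write("1\n2\n3\n")
with open(p2, "w") as f:
    f.write("2\n3\n4\n")

merge_with_marks(p1, p2, p3)
with open(p3) as f:
    merged = f.read().splitlines()
print(merged)  # ['1', '2*', '3*', '4'] on Python 3.7+, where dicts keep insertion order
```

The with blocks close each file automatically, and counts.get(key, 0) + 1 folds the "increment or create" branch into one line. On older interpreters the output order is not guaranteed, so sort the keys first if the order of the report matters.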