comparing strings

jadedlime jadedlime at hotmail.com
Tue Jul 30 15:13:05 EDT 2002


Hi, I am very new to Python and to programming in general.
I need to take a particular string of data and compare it against a
dictionary, writing all matching records to one file and all the records
that do not match to another file, so they can be dealt with on an
individual basis.

I appreciate any and all help on this problem. Part of the code is included
below. It seems to produce the matches, but not the mismatches.



import string, re, codecs

# create inputs and outputs

input = codecs.open("/home/jadedlime/Julio/julio.work", "r",
                    encoding="ISO-8859-1")
output = codecs.open("sec_try", "w", encoding="ISO-8859-1")
output2 = codecs.open("sec_trynotfound", "w", encoding="ISO-8859-1")

# read and split the lines of the main julio records

whole = input.read()
lines = string.split(whole, "\n")

# create the dictionary with regular expressions built in

dictionary = {"^aarl australian academic & research libraries$" : "10735",
              "^acimed$" : "11225",
              "^adbs: l'association des professionnels de l'information et de la documentation$" : "11715",
              "^alpha 94. strat\351gies d\222alphab\351tisation et de d\351veloppement culturel en milieu rural$" : "12205",
              "^american archivists$" : "13185",
              "^anales de documentaci\363n$" : "14165",
              "^annual review of information science and technology (arist)$" : "14655",
              "^aproximaciones a la traducci\363n$" : "15145",
              "^apuntes$" : "15635",
              "^architectural records conference report$" : "16125",
              "^archivaria$" : "16615",
}


# take the desired field (journal titles) and put it in a list format

journallist = []
for line in lines:
        field = string.split(line, '"')
        journalitems = string.strip(string.lower(field[23]))
        journallist.append(journalitems)

#  compile the dictionary and make it into a list format

dictionarykeys = dictionary.keys()
dictionarylist=[]
for dictionaryexp in dictionarykeys:
        regular = re.compile(dictionaryexp)
        dictionarylist.append(regular)

# run a search that should match all the journal titles in the julio file to
# the ones in the dictionary, if they match send them to a specific file, if
# they do not, send them to another file so adjustments can be made.

for key in dictionarylist:
        for item in journallist:
                if re.search(key, item):
                        output.write("found\t" + item + "\n")
        else:
                output2.write("not found\t" + item + "\n")
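
One possible reason the mismatches never appear: the else in the loop above lines up with the inner for rather than with the if, so Python treats it as a for/else clause. It runs once per pattern after the inner loop finishes, and writes only the last title. A minimal sketch of one way around this is to loop over the titles on the outside and record a title as not found only when no pattern matches it. The names patterns, titles and split_titles are made up for illustration, and the small sample data stands in for the real dictionary and the julio.work records. In the real script the two lists would be written to output and output2.

```python
import re

# Hypothetical sample data standing in for the real dictionary and for
# the journal titles extracted from julio.work.
patterns = {
    "^acimed$": "11225",
    "^apuntes$": "15635",
    "^archivaria$": "16615",
}
titles = ["acimed", "some unknown journal", "archivaria"]

# Compile each pattern once and keep its code next to it.
compiled = [(re.compile(expr), code) for expr, code in patterns.items()]


def split_titles(titles, compiled):
    """Return (found, notfound); a title is found if any pattern matches."""
    found, notfound = [], []
    for title in titles:              # outer loop over titles, not patterns
        for regex, code in compiled:
            if regex.search(title):
                found.append((title, code))
                break                 # stop at the first matching pattern
        else:                         # reached only when no pattern matched
            notfound.append(title)
    return found, notfound


found, notfound = split_titles(titles, compiled)
```

Because every key is anchored with ^ and $, a plain `title in patterns` lookup on unanchored keys might also do the job. It would avoid characters such as the parentheses in "(arist)" being read as regular expression syntax.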





