comparing strings
jadedlime
jadedlime at hotmail.com
Tue Jul 30 15:13:05 EDT 2002
Hi, I am very new to Python and to programming in general.
I need to compare each record in a string of data against a dictionary,
writing all matching records to one file and all records that do not
match to another, so that the mismatches can be dealt with on an
individual basis.
I appreciate any and all help with this problem. Part of the code is
included below. It seems to produce the matches, but not the mismatches.
import string, re, codecs
# create inputs and outputs
input = codecs.open("/home/jadedlime/Julio/julio.work", "r",
                    encoding="ISO-8859-1")
output = codecs.open("sec_try", "w", encoding="ISO-8859-1")
output2 = codecs.open("sec_trynotfound", "w", encoding="ISO-8859-1")
# read and split the lines of the main julio records
whole = input.read()
lines = string.split(whole, "\n")
# create the dictionary with regular expressions built in
dictionary = {"^aarl australian academic & research libraries$" : "10735",
              "^acimed$" : "11225",
              "^adbs: l'association des professionnels de l'information et de la documentation$" : "11715",
              "^alpha 94. strat\351gies d\222alphab\351tisation et de d\351veloppement culturel en milieu rural$" : "12205",
              "^american archivists$" : "13185",
              "^anales de documentaci\363n$" : "14165",
              "^annual review of information science and technology (arist)$" : "14655",
              "^aproximaciones a la traducci\363n$" : "15145",
              "^apuntes$" : "15635",
              "^architectural records conference report$" : "16125",
              "^archivaria$" : "16615",
              }
# take the desired field (journal titles) and put it in a list format
journallist = []
for line in lines:
    field = string.split(line, '"')
    journalitems = string.strip(string.lower(field[23]))
    journallist.append(journalitems)
# compile the dictionary and make it into a list format
dictionarykeys = dictionary.keys()
dictionarylist=[]
for dictionaryexp in dictionarykeys:
    regular = re.compile(dictionaryexp)
    dictionarylist.append(regular)
# run a search that should match all the journal titles in the julio file
# to the ones in the dictionary; if they match, send them to a specific
# file, if they do not, send them to another file so adjustments can be made.
for key in dictionarylist:
    for item in journallist:
        if re.search(key, item):
            output.write("found\t" + item + "\n")
        else:
            output2.write("not found\t" + item + "\n")
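
A note on the loop above: because the outer loop runs over the patterns
and the inner loop over the titles, each title is written to the "not found"
file once for every pattern it fails to match, even when some other pattern
does match it. In addition, the parentheses in "(arist)" form a regex group,
so that pattern does not match a title containing literal parentheses.
Below is a minimal sketch of one possible way around both issues, written in
Python 3 syntax. It assumes an exact match on the lowercased, stripped title
is what is wanted, which is what the ^...$ anchors suggest. The names
title_codes and split_titles and the sample titles are illustrative only and
do not come from the original program.

```python
# Hypothetical sample data: keys are plain titles, with no ^ and $ anchors.
title_codes = {
    "acimed": "11225",
    "apuntes": "15635",
    "annual review of information science and technology (arist)": "14655",
}


def split_titles(titles, known):
    """Return (found, not_found) with exactly one entry per input title."""
    found, not_found = [], []
    for title in titles:                 # loop over the records, not the keys
        key = title.strip().lower()      # same normalisation as the original
        if key in known:                 # one exact lookup per title
            found.append(key)
        else:
            not_found.append(key)
    return found, not_found


# Hypothetical titles standing in for field[23] of each record.
journal_titles = [
    "ACIMED",
    " Apuntes ",
    "unknown journal",
    "Annual Review of Information Science and Technology (ARIST)",
]

found, not_found = split_titles(journal_titles, title_codes)

for title in found:
    print("found\t" + title)
for title in not_found:
    print("not found\t" + title)
```

Each title is classified once, so it ends up in exactly one of the two
lists. If regular expressions are really needed, the same idea applies:
loop over the titles first, test each one against all patterns, and write
"not found" only after every pattern has failed.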