basic python questions
Fredrik Lundh
fredrik at pythonware.com
Sat Nov 18 05:27:50 EST 2006
nateastle at gmail.com wrote:
> I have a simple assignment for school but am unsure where to go. The
> assignment is to read in a text file, split out the words and say which
> line each word appears in alphabetical order. I have the basic outline
> of the program done which is:
looks like an excellent start to me.
> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>         lines = fp.readlines()
>         fp.close()
>     except:
>         raise "Couldn't read input file \"%s\"" % filename
>     dict = {}
>     for line_num in xrange(len(lines)):
>         if lines[line_num] == "": continue
>         words = lines[line_num].split()
>         for word in words:
>             if not dict.has_key(word):
>                 dict[word] = []
>             if line_num+1 not in dict[word]:
>                 dict[word].append(line_num+1)
>     return dict
>
> My question is, how do I easily parse out punctuation marks
it depends a bit how you define the term "word".
if you're using regular text, with a limited set of punctuation
characters, you can simply do e.g.
    word = word.strip(".,!?:;")
    if not word:
        continue
inside the "for word" loop. this won't handle such characters if they
appear inside words, but that's probably good enough for your task.
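
as a minimal sketch of that idea, pulled out of the loop so it can be tried on its own (the helper name clean_words and the sample sentence are made up for illustration, and the code assumes a current Python 3 interpreter):

```python
def clean_words(line, punctuation=".,!?:;"):
    """Split a line on whitespace and strip punctuation from both ends of each word."""
    cleaned = []
    for word in line.split():
        word = word.strip(punctuation)
        if not word:
            continue  # the token consisted of punctuation only
        cleaned.append(word)
    return cleaned


# the apostrophe inside "it's" is kept; only leading/trailing characters go
print(clean_words("Well, it's a test ... really!"))
# → ['Well', "it's", 'a', 'test', 'really']
```

note that strip() only looks at the two ends of the string, which is why punctuation in the middle of a word is left alone.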
another, slightly more advanced approach is to use regular expressions,
such as re.findall(r"\w+", text) to get a list of all alphanumeric "words" in
the text. that'll have other drawbacks (e.g. it'll split up words like
"couldn't" and "cross-reference", unless you tweak the regexp), and is
probably overkill.
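
to make the trade-off concrete, here's a small sketch that compares the plain pattern with one possible tweak. the tweaked pattern and the name WORD_RE are my own illustration, not the only way to do it, and the code assumes Python 3:

```python
import re

text = "Couldn't build the cross-reference, sorry!"

# plain \w+ breaks words at the apostrophe and at the hyphen
simple = re.findall(r"\w+", text)

# one possible tweak: runs of word characters, optionally joined by ' or -
WORD_RE = re.compile(r"\w+(?:['-]\w+)*")
tweaked = WORD_RE.findall(text)

print(simple)   # → ['Couldn', 't', 'build', 'the', 'cross', 'reference', 'sorry']
print(tweaked)  # → ["Couldn't", 'build', 'the', 'cross-reference', 'sorry']
```

the group is non-capturing, (?:...), so that findall returns the whole match instead of just the last group.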
> and how do I sort the list and
how to sort the dictionary when printing the cross-reference, you mean?
just use "sorted" on the dictionary; that'll get you a sorted list
of the keys.
    sorted(dict)
to avoid duplicates and simplify sorting, you probably want to normalize
the case of the words you add to the dictionary, e.g. by converting all
words to lowercase.
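
putting the lowercasing and the sorting together, a sketch could look like this. build_index and format_index are hypothetical helper names, the input is a list of strings instead of a file to keep the example self-contained, and the code assumes Python 3:

```python
def build_index(lines):
    """Map each lowercased word to the list of line numbers it appears on."""
    index = {}
    for line_num, line in enumerate(lines, 1):  # count lines from 1
        for word in line.split():
            word = word.strip(".,!?:;").lower()
            if not word:
                continue
            numbers = index.setdefault(word, [])
            if line_num not in numbers:
                numbers.append(line_num)
    return index


def format_index(index):
    """Return one 'word: line, line' string per word, in alphabetical order."""
    rows = []
    for word in sorted(index):  # sorted() on a dict yields its keys, sorted
        rows.append("%s: %s" % (word, ", ".join(str(n) for n in index[word])))
    return rows


sample = ["The cat sat.", "the dog sat, too!"]
for row in format_index(build_index(sample)):
    print(row)
```

with the case normalized, "The" and "the" end up under a single key, so the sample prints one "the: 1, 2" row instead of two separate entries.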
> is there anything else that I am doing wrong in this code
there's plenty of things that can be tweaked and tuned and written in a
slightly shorter way by an experienced Python programmer, but assuming
that this is a general programming assignment, I don't see anything
seriously "wrong" in your code (just make sure you test it on a file
that doesn't exist before you hand it in).
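
one way to run that test is sketched below. it assumes a current Python 3 interpreter, where raising a plain string is itself an error, so the sketch raises IOError with the same message instead. read_lines and the file name are made up for illustration:

```python
def read_lines(filename):
    """Return the lines of a text file, or raise IOError with a readable message."""
    try:
        fp = open(filename, "r")
    except IOError:
        # raise a real exception object; a bare string cannot be raised here
        raise IOError("Couldn't read input file \"%s\"" % filename)
    try:
        return fp.readlines()
    finally:
        fp.close()


# assumption: no file with this name exists in the current directory
try:
    read_lines("no-such-file-for-this-test.txt")
except IOError as exc:
    message = str(exc)

print(message)
```

catching IOError rather than using a bare "except:" also keeps unrelated errors, such as a typo inside the try block, from being reported as a missing file.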
</F>