basic python questions

Paddy paddy3118 at netscape.net
Sat Nov 18 08:01:07 CET 2006


nateastle at gmail.com wrote:

> I have a simple assignment for school but am unsure where to go. The
> assignment is to read in a text file, split out the words and say which
> line each word appears in alphabetical order. I have the basic outline
> of the program done which is:
>
> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>         lines = fp.readlines()
>         fp.close()
>     except:
>         raise "Couldn't read input file \"%s\"" % filename
>     dict = {}
>     for line_num in xrange(len(lines)):
>         if lines[line_num] == "":  continue
>         words = lines[line_num].split()
>         for word in words:
>             if not dict.has_key(word):
>                 dict[word] = []
>             if line_num+1 not in dict[word]:
>                 dict[word].append(line_num+1)
>     return dict
>
> My question is, how do I easily parse out punction marks and how do I
> sort the list and if there anything else that I am doing wrong in this
> code it would be much help.

Hi,
On first reading, you have a bare except clause that catches all
exceptions. You might want to try your program on a non-existent file
to find out the actual exception you need to trap for that error
message. Do you want the program to continue if you have no input file?
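
A minimal sketch of trapping only the file error, assuming a current
Python 3 interpreter, where a missing file raises FileNotFoundError (a
subclass of OSError; on the Python 2 of this thread the name to catch
would be IOError). The helper name read_lines is hypothetical, and
whether to re-raise or carry on is still your decision:

```python
def read_lines(filename):
    """Return the lines of filename, re-raising file errors with a clearer message."""
    try:
        with open(filename, "r") as fp:
            return fp.readlines()
    except OSError as err:
        # Only file-related failures are caught here; anything else
        # (a typo in a name, a KeyboardInterrupt) still propagates.
        raise OSError("Couldn't read input file %r: %s" % (filename, err))
```

Narrowing the except clause this way means a bug elsewhere in the try
block is no longer reported as a missing file.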

If you have not covered regular expressions (often called REs), then one
way of getting rid of punctuation is to turn the problem on its head:
create a string of all the characters that you consider valid in
words, then go through each input line discarding any character not *in*
the string. Use the doctored line for word extraction.
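
The idea above might be sketched like this. The choice of valid
characters (letters, digits, the apostrophe) is my assumption, as are
the names VALID_CHARS and strip_punctuation. Whitespace is kept in the
valid set so that split() can still find the word boundaries:

```python
import string

# Assumption: letters, digits and the apostrophe make up words;
# whitespace is kept so the words stay separated.
VALID_CHARS = string.ascii_letters + string.digits + "'" + string.whitespace

def strip_punctuation(line):
    """Return line with every character not in VALID_CHARS discarded."""
    kept = []
    for ch in line:
        if ch in VALID_CHARS:
            kept.append(ch)
    return "".join(kept)

words = strip_punctuation("Hello, world! It's 2006.").split()
print(words)
```

One thing to watch: because characters are dropped rather than replaced
by a space, text such as "end.Start" collapses into the single word
"endStart".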

help(sorted) will start you off on sorting in Python. Other
documentation sources have a lot more.
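
As a small illustration, here is sorted applied to a made-up dictionary
shaped like the one Xref returns (the data is hypothetical). Iterating
over a dictionary yields its keys, so sorted(xref) gives the words in
order:

```python
# Hypothetical cross-reference: word -> list of line numbers
xref = {"the": [1, 3], "cat": [1], "Sat": [2]}

# Default ordering compares strings by character code,
# so capitalised words come before lower-case ones.
plain_order = sorted(xref)

# key=str.lower compares the words case-insensitively instead.
folded_order = sorted(xref, key=str.lower)

for word in folded_order:
    print(word, xref[word])
```

Which of the two orderings counts as "alphabetical" for the assignment
is something to check against what your instructor expects.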

P.S. I have not run the code myself.
P.P.S. Where is the function's docstring?
P.P.P.S. You might want to read up on enumerate. It gives another way
to do things when you want an index as well as each item from an
iterable, but remember that the index given starts from zero.
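
A short sketch of enumerate replacing the xrange(len(lines)) loop; the
sample lines are made up, and the + 1 is there because the index starts
from zero while line numbers usually start from one:

```python
# Hypothetical input, standing in for the result of fp.readlines()
lines = ["the cat\n", "sat on\n", "the mat\n"]

xref = {}
for index, line in enumerate(lines):
    line_num = index + 1  # enumerate counts from 0, line numbers from 1
    for word in line.split():
        numbers = xref.setdefault(word, [])
        if line_num not in numbers:
            numbers.append(line_num)

print(xref["the"])
```

Newer Python versions (2.6 onwards) also accept a start value, as in
enumerate(lines, 1), which removes the need for the addition.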

Oh, and welcome to comp.lang.python :-)

- Paddy.



