Mike Driscoll kyosohma at
Fri May 15 16:37:45 CEST 2009

On May 15, 8:58 am, anica_1... at wrote:
> Hello, I'm a student of linguistics and I need to do these exercises. Can
> anybody help me, please?
> Thanks
> ◑ Read in some text from a corpus, tokenize it, and print the list of
> all wh-word types that occur. (wh-words in English are used in
> questions, relative clauses and exclamations: who, which, what, and so
> on.) Print them in order. Are any words duplicated in this list,
> because of the presence of case distinctions or punctuation?

This requires learning file I/O and some string manipulation
techniques. I would probably read each line, split it on whitespace, and
then loop over each word, checking whether it starts with "wh" and adding
the matches to a new list. After reading the file, you'd sort that list to
get the words in the right order.
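
Here is a rough sketch of that approach, not a finished answer. It
assumes Python 3 and uses a made-up sample string instead of a real
corpus file; the function names are my own. Note that a plain "wh"
prefix check is only a crude filter, since it also matches words like
"while" or "white".

```python
import string


def wh_words(text):
    """Return the sorted distinct tokens that start with 'wh' (any case)."""
    found = set()
    for line in text.splitlines():
        for token in line.split():  # split on whitespace
            if token.lower().startswith("wh"):
                found.add(token)
    return sorted(found)


def normalized_wh_words(text):
    """Same as wh_words, but lowercased and with edge punctuation removed."""
    cleaned = {t.strip(string.punctuation).lower() for t in wh_words(text)}
    return sorted(cleaned)


# Made-up sample text; with a real corpus you would pass in the
# contents of the file, e.g. open(some_path).read().
sample = "Who called? I asked who, and which one. What a day!"

print(wh_words(sample))             # raw tokens: 'Who' and 'who,' both appear
print(normalized_wh_words(sample))  # duplicates collapse after cleanup
```

Comparing the two lists answers the last part of the exercise: "Who"
and "who," show up as separate entries until case and punctuation are
normalized.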

> ◑ Create a file consisting of words and (made up) frequencies, where
> each line consists of a word, the space character, and a positive
> integer, e.g. fuzzy 53. Read the file into a Python list using open
> (filename).readlines(). Next, break each line into its two fields
> using split(), and convert the number into an integer using int(). The
> result should be a list of the form: [['fuzzy', 53], ...].
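
The steps in this one map almost directly onto code. A minimal sketch,
assuming Python 3; the function name and the second sample entry are
made up for illustration, and malformed lines are simply skipped:

```python
def parse_frequencies(lines):
    """Turn lines like 'fuzzy 53' into a list like [['fuzzy', 53], ...]."""
    pairs = []
    for line in lines:
        fields = line.split()  # split() also discards the trailing newline
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        word, count = fields
        pairs.append([word, int(count)])
    return pairs


# Stand-in for open(filename).readlines(); the entries are invented.
sample_lines = ["fuzzy 53\n", "wuzzy 7\n", "\n"]

print(parse_frequencies(sample_lines))
```

With a real file you would pass open(filename).readlines() in place of
sample_lines.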

I recommend reading the Python Tutorial:

If you're using Python 2.x, then check out

If you're using 3.0, your primary options are the online docs and
"Programming in Python 3" by Summerfield.

- Mike
