[Tutor] Simple counter to determine frequencies of words in a document

Alan Gauld alan.gauld at btinternet.com
Sat Nov 20 09:57:19 CET 2010


"Josep M. Fontana" <josep.m.fontana at gmail.com> wrote

> The code I started writing to achieve this result can be seen below.
> You will see that first I'm trying to create a dictionary that
> contains the word as the key with the frequency as its value. Later
> on I will transform the dictionary into a text file with the desired
> formatting.

That's the right approach...

> things around and I cannot get it to work as desired. Can anybody
> tell me what's wrong so that I can say "duh" to myself once again?

I'll give some comments:

> ---------------------------
> def countWords(a_list):
>    words = {}
>    for i in range(len(a_list)):
>        item = a_list[i]
>        count = a_list.count(item)
>        words[item] = count
>    return sorted(words.items(), key=lambda item: item[1], reverse=True)

The loop is a bit clunky. It would be clearer just to iterate over
a_list:

for item in a_list:
     words[item] = a_list.count(item)
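Put together, the simplified function might look something like the
sketch below. The name count_words and the sample list are made up
for illustration, and the sorting is left out so the counting is easy
to see on its own.

```python
def count_words(a_list):
    """Map each distinct item in a_list to the number of times it occurs."""
    words = {}
    for item in a_list:
        # count() rescans the whole list for every item, which is fine
        # for small inputs but gets slow on a long document.
        words[item] = a_list.count(item)
    return words


# Hypothetical sample data, already split into words.
sample = ["the", "cat", "and", "the", "hat"]
counts = count_words(sample)
print(counts)
```

Repeated words are simply counted again and overwrite the same key, so
the result is still correct, just not as efficient as it could be.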

And the return value is a list of tuples. write() wants a string, so
you would have to convert the list with str() first, and then you get
a single long line containing its string representation.
Is that what you want?
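If one entry per line is closer to what you are after, something
along these lines might do. The tab separated layout and the name
format_counts are only my guess at the "desired formatting", so
adjust to taste.

```python
def format_counts(pairs):
    """Turn (word, count) tuples into text, one 'word<TAB>count' per line."""
    return "".join("%s\t%d\n" % (word, count) for word, count in pairs)


# Hypothetical data in the shape returned by sorted(words.items(), ...).
pairs = [("the", 2), ("cat", 1)]
text = format_counts(pairs)
print(text)
```

The resulting string can be passed straight to write().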


> with open('output.txt', 'a') as token_freqs:
>    with open('input.txt', 'r') as out_tokens:
>        token_list = countWords(out_tokens.read())
>        token_freqs.write(token_list)

read() returns a single string. Using a for loop on a string will get
you the characters in the string, not the words, so you need to
split() it into words before counting.
Also you probably want to use 'w' mode for your output
file to create a new one each time, otherwise the file will
keep getting bigger every time you run the code.
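Here is a small sketch of both points. The sample text and the
temporary output path are assumptions for the demo, and split() with
no arguments only breaks on whitespace, so punctuation and upper/lower
case would still need handling in real text.

```python
import os
import tempfile

text = "the cat and the hat"  # stands in for out_tokens.read()

# Looping over the string itself yields single characters...
chars = [c for c in text]
# ...whereas split() with no arguments breaks it into words.
tokens = text.split()

# 'w' truncates the file on every run; 'a' would keep appending to it.
path = os.path.join(tempfile.mkdtemp(), "output.txt")  # throwaway location
for run in range(2):
    with open(path, "w") as token_freqs:
        token_freqs.write(" ".join(tokens) + "\n")

with open(path) as f:
    contents = f.read()
print(contents)
```

Even though the file is written twice, it ends up holding a single
line, which is the behaviour you want between runs.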

HTH,


-- 
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/



