[Tutor] Simple counter to determine frequencies of words in a document
Alan Gauld
alan.gauld at btinternet.com
Sat Nov 20 09:57:19 CET 2010
"Josep M. Fontana" <josep.m.fontana at gmail.com> wrote
> The code I started writing to achieve this result can be seen below.
> You will see that first I'm trying to create a dictionary that
> contains the word as the key with the frequency as its value. Later on
> I will transform the dictionary into a text file with the desired
> formatting.
That's the right approach...
> things around and I cannot get it to work as desired. Can anybody tell
> me what's wrong so that I can say "duh" to myself once again?
I'll give some comments:
> ---------------------------
> def countWords(a_list):
>     words = {}
>     for i in range(len(a_list)):
>         item = a_list[i]
>         count = a_list.count(item)
>         words[item] = count
>     return sorted(words.items(), key=lambda item: item[1],
>                   reverse=True)
The loop is a bit clunky. It would be clearer just to iterate over
a_list:

    for item in a_list:
        words[item] = a_list.count(item)
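As an aside, a_list.count(item) rescans the whole list for every item, which gets slow on a long document. A single pass using dict.get is one possible alternative. This is only a sketch, and the name count_words is mine rather than from your code:

```python
def count_words(a_list):
    """Count occurrences of each item in one pass over the list."""
    words = {}
    for item in a_list:
        # get() returns 0 the first time an item is seen
        words[item] = words.get(item, 0) + 1
    # Most frequent first, as in the original return statement
    return sorted(words.items(), key=lambda pair: pair[1], reverse=True)

print(count_words(["a", "b", "a"]))
```

The result has the same shape as before, a list of (word, count) tuples.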
And the return value is a list of tuples. write() expects a string,
so you would have to convert the list first, and a plain str()
conversion gives you a single long line containing the list's
string representation. Is that what you want?
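If one word per line is what you are after, something along these lines might do. It is only a sketch: the function name format_freqs and the tab separator are my own assumptions, since you haven't said what the desired formatting is.

```python
def format_freqs(pairs):
    """Turn a list of (word, count) tuples into text, one pair per line."""
    # Each pair becomes "word<TAB>count"; the tab separator is an assumption.
    return "\n".join("%s\t%d" % (word, count) for word, count in pairs)

sample = [("the", 3), ("cat", 1)]
print(format_freqs(sample))
```

The resulting string can be passed straight to the file's write() method.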
> with open('output.txt', 'a') as token_freqs:
>     with open('input.txt', 'r') as out_tokens:
>         token_list = countWords(out_tokens.read())
>         token_freqs.write(token_list)
read() returns a single string. Using a for loop on a string will get
you the characters in the string, not the words, so you need to break
the string into words (with split(), for example) before counting.
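A rough illustration of the difference, assuming the words are separated by whitespace (the sample text is made up, and punctuation is not handled):

```python
text = "the cat saw the dog"

# Iterating over the string itself yields single characters.
chars = [c for c in text]

# split() with no arguments breaks the string on runs of whitespace.
words = text.split()

print(chars[:3])   # ['t', 'h', 'e']
print(words)       # ['the', 'cat', 'saw', 'the', 'dog']
```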
Also, you probably want to use 'w' mode for your output
file to create a new one each time; otherwise the file will
keep getting bigger every time you run the code.
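A small demonstration of the two modes, in case it helps. The temporary path simply stands in for output.txt so the demo does not touch your real file, and the written line is made-up data:

```python
import os
import tempfile

# A throwaway path standing in for output.txt (assumption for this demo)
path = os.path.join(tempfile.mkdtemp(), "output.txt")

for run in range(2):
    # 'w' truncates the file on each open, so only the last run survives
    with open(path, "w") as token_freqs:
        token_freqs.write("the\t3\n")

with open(path) as f:
    w_result = f.read()

for run in range(2):
    # 'a' appends, so each run adds to what is already there
    with open(path, "a") as token_freqs:
        token_freqs.write("the\t3\n")

with open(path) as f:
    a_result = f.read()

print(repr(w_result))
print(repr(a_result))
```

After the two 'w' runs the file holds one line; the two 'a' runs then grow it to three.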
HTH,
--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/