counting how often the same word appears in a txt file...But my code only prints the last line entry in the txt file

Steven D'Aprano steve+comp.lang.python at
Wed Dec 19 12:03:21 CET 2012

On Wed, 19 Dec 2012 02:45:13 -0800, dgcosgrave wrote:

> Hi Iam just starting out with python...My code below changes the txt
> file into a list and add them to an empty dictionary and print how often
> the word occurs, but it only seems to recognise and print the last entry
> of the txt file. Any help would be great.
> tm =open('ask.txt', 'r')
> dict = {}
> for line in tm:
> 	line = line.strip()
> 	line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
>       line = line.lower()
> 	list = line.split(' ')

Note: you should use descriptive names. Since this is a list of WORDS, a 
much better name would be "words" rather than "list". Also, list is the 
name of a built-in type, and you may run into trouble when you 
accidentally re-use it as a variable name. The same goes for "dict".
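
Here is a rough sketch of the kind of trouble I mean. The function names 
and the sample sentence are made up for illustration; they are not from 
your code. Inside the first function the name "list" refers to your list 
of words, so the built-in is no longer reachable there:

```python
def shadowing_demo(line):
    # Hypothetical helper: re-uses "list" as a variable name.
    list = line.split(' ')       # local "list" now hides the built-in
    try:
        return list("abc")       # tries to call a list object
    except TypeError as err:
        return str(err)          # "'list' object is not callable"


def descriptive_demo(line):
    # Same work with a descriptive name; the built-in stays usable.
    words = line.split(' ')
    return words, list("abc")
```

With a name like "words" the question never comes up.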

Apart from that, so far so good. For each line, you generate a list of 
words. But that's when it goes wrong, because you don't do anything with 
the list of words! The next block of code is *outside* the for-loop, so 
it only runs once the for-loop is done. So it only sees the last list of 
words.
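
A small sketch of that effect, using a made-up list of strings as a 
stand-in for the lines of your file (the names "lines", "seen_inside" and 
"seen_after" are just for illustration):

```python
lines = ["one two", "three four", "five six"]   # stand-in for file lines

seen_inside = []
for line in lines:
    words = line.split(' ')
    seen_inside.append(words)    # inside the loop: runs once per line

# After the loop, "words" still holds whatever the final pass assigned,
# so code placed here only ever sees the last line's words.
seen_after = words
```

Anything that should happen for every line has to be indented so that it 
sits inside the loop body.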

> for word in list:

The problem here is that you lost the indentation. You need to indent the 
"for word in list" (better: "for word in words") so that it starts level 
with the line above it.

> 		if word in dict:
> 			count = dict[word]
> 			count += 1
> 			dict[word] = count

This bit is fine.

> else:
> 	dict[word] = 1

But this fails for the same reason! You have lost the indentation.

A little-known fact: Python for-loops take an "else" block too! It's a 
badly named statement, but sometimes useful. You can write:

for value in values:
    if condition:
        break  # skip to the end of the for...else
else:
    print "We never reached the break statement"
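
If you want to try for...else for yourself, here is a minimal runnable 
sketch. The function name and the idea of searching for a negative number 
are invented for the example; the point is only that the "else" block runs 
when the loop finishes without hitting "break":

```python
def find_first_negative(values):
    # Hypothetical example function, not part of the original program.
    for value in values:
        if value < 0:
            result = "found %d" % value
            break                # leaving early skips the else block
    else:
        # Runs only if the loop ran to completion without a break,
        # including the case where "values" is empty.
        result = "no negative value"
    return result
```

Calling it with [1, -2, 3] takes the "break" path, while [1, 2] falls 
through to the "else" block.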

So by pure accident, you lined up the "else" statement with the for loop, 
instead of what you needed:

for line in tm:
    ... blah blah blah
    for word in words:
        if word in word_counts:  # better name than "dict"
            ... blah blah blah
        else:  # lined up with the "if", not with the "for"
            word_counts[word] = 1

> for word, count in dict.iteritems():
> 	print word + ":" + str(count)

And this bit is okay too.
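
Putting the pieces together, one possible corrected version might look 
like the sketch below. Some assumptions on my part: it is written for 
Python 3 (your code is Python 2, where line.translate(None, ...) and 
iteritems() work as you wrote them), the function name count_words is my 
own invention, and I used split() with no argument, which splits on any 
run of whitespace rather than on single spaces.

```python
import string


def count_words(lines):
    """Count how often each word occurs across all the given lines."""
    # Python 3 spelling of "delete all punctuation characters";
    # string.punctuation is the same set of characters as in the post.
    strip_punctuation = str.maketrans('', '', string.punctuation)
    word_counts = {}
    for line in lines:
        line = line.strip().translate(strip_punctuation).lower()
        words = line.split()       # assumption: split on any whitespace
        for word in words:         # indented, so it runs for every line
            if word in word_counts:
                word_counts[word] += 1
            else:                  # lined up with the "if"
                word_counts[word] = 1
    return word_counts


# Usage with a file, as in the original post:
# with open('ask.txt', 'r') as tm:
#     for word, count in count_words(tm).items():
#         print(word + ":" + str(count))
```

Taking the lines as an argument means you can try the function on a plain 
list of strings before pointing it at the file.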

Good luck!

