Struggling with sorted dict of word lengths and count

Cathy James nambo4jb at gmail.com
Mon Jun 27 13:00:00 EDT 2011


Dear Python Programmers,

I am a Python newbie and I need help with my code. I have done parts of it,
but I can't get what I need: I need to process a text to come up with word
lengths and their frequencies, i.e.:

how many 1-letter words in a text
how many 2-letter words in a text, etc

I believe I am on the right path, but I can't get it right. I hope someone
can shed some light. Code below:

import string
import sys
def word_length(word):
    for p in string.punctuation:
        # replace any punctuation symbol with empty string
        word = word.replace(p, "")
    return len(word)

def fileProcess(filename = open('input_text.txt', 'r')):
    # Need to show word count (ascending order) for each of the word lengths
    # that has been encountered.
    print ("Length \t" + "Count")#print header for all numbers
    freq = {} #empty dict to accumulate word count and word length
    for line in filename:
        # split lines into words and make lower case
        for word in line.lower().split():
            # run function to return length of each word
            wordlen = word_length(word)
            # increment the stored value if there is one, or initialize
            freq[wordlen] = freq.get(wordlen, 0) + 1
        print(word, wordlen, freq[wordlen])

fileProcess()
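
For comparison, here is a minimal sketch of one way the table could be
produced. In the code above, the print call sits inside the loop over lines,
so it reports running values rather than a final table; the sketch instead
tallies everything first and then prints the lengths in ascending order. The
names length_counts, print_table and sample are made up for illustration,
and collections.Counter is swapped in for the plain dict, so treat this as
one possible approach under those assumptions, not the only one.

```python
import string
from collections import Counter

# Translation table that deletes every punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def word_length(word):
    # Length of the word once punctuation has been removed.
    return len(word.translate(_STRIP_PUNCT))

def length_counts(lines):
    # Hypothetical helper: tally how many words there are of each length.
    counts = Counter()
    for line in lines:
        for word in line.lower().split():
            n = word_length(word)
            if n:  # skip tokens that were nothing but punctuation
                counts[n] += 1
    return counts

def print_table(counts):
    # Print after all lines are processed, sorted by word length.
    print("Length\tCount")
    for n in sorted(counts):
        print("%d\t%d" % (n, counts[n]))

# Made-up sample text standing in for the contents of input_text.txt.
sample = ["I am a newbie, and I need help!", "-- so it goes."]
print_table(length_counts(sample))
```

Because length_counts only iterates over lines, an open file object can be
passed in place of the sample list, for example inside a
"with open('input_text.txt') as f:" block.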