[Tutor] Help please

Kengesbayev, Askar askar.kengesbayev at etrade.com
Thu Oct 17 16:21:17 CEST 2013


Ruben,

#1. You can try something like this:

    try:
        with open('my_file.txt') as fp:
            pass
    except IOError as e:
        # The file does not exist or you do not have read permission
        print "Unable to open file"

#2. I would use a regular expression to split the text into words and push them into an array (a list), which you can then manipulate. I am not sure it is the most efficient way, but it should work.
#3. An easy way would be to use a regular expression (the re module) to keep only the words longer than 3 characters.
#4. Once you have the array from #2, you can sort it and print whatever top words you need.
#5. I am not sure of the best way to do this, but you can experiment with the array from #2.
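
To make the suggestions above more concrete, here is a minimal sketch of how they might fit together. It is one possible approach, not the only one. The function names (read_txt, count_words, top_words, unique_by_length) and the word pattern are my own assumptions, not part of the code you were given, and the sketch swaps the hand-built dict for collections.Counter. It uses print() with a single argument so it behaves the same on Python 2 and Python 3.

```python
import os
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally with one inner apostrophe.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def read_txt(filename):
    """#1: accept only .txt files and report files that cannot be read."""
    if os.path.splitext(filename)[1].lower() != '.txt':
        raise ValueError('expected a .txt file, got %r' % filename)
    try:
        with open(filename) as fp:
            return fp.read()
    except IOError:
        # The file does not exist or there is no read permission.
        raise ValueError('unable to open %r' % filename)


def count_words(text, min_len=1):
    """#2/#3: find words with a regex; keep those with at least min_len chars."""
    words = WORD_RE.findall(text.lower())
    return Counter(w for w in words if len(w) >= min_len)


def top_words(counts, start=1, stop=10):
    """#2/#4: return ranks start..stop (1-based, inclusive) as (word, freq)."""
    # Sort by descending frequency, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[start - 1:stop]


def unique_by_length(counts):
    """#5: map word length -> number of distinct words of that length."""
    return Counter(len(word) for word in counts)


# Small demonstration on an in-memory string instead of a real file.
sample = "The cat and the hat. The cat sat; a bat sat."
counts = count_words(sample)
print(top_words(counts, 2, 3))   # [('cat', 2), ('sat', 2)]
```

With the book itself you would call count_words(read_txt('emma.txt'), min_len=4) to keep only the words with more than 3 characters, then top_words(counts, 10, 20) to get ranks 10 through 20.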

Thanks,
Askar

From: Pinedo, Ruben A [mailto:rapinedo at miners.utep.edu]
Sent: Wednesday, October 16, 2013 2:49 PM
To: tutor at python.org
Subject: [Tutor] Help please

I was given this code and I need to modify it so that it will:

#1. Error handling for the files to ensure reading only .txt file
#2. Print a range of top words... ex: print top 10-20 words
#3. Print only the words with > 3 characters
#4. Modify the printing function to print top 1 or 2 or 3 ....
#5. How many unique words are there in the book of length 1, 2, 3 etc

I am fairly new to Python and am completely lost. I looked in my book for how to do number one, but I cannot figure out what to modify and/or delete to add the print selection. This is the code:


import string

def process_file(filename):
    hist = dict()
    # Use a with-block so the file is closed once it has been read
    with open(filename) as fp:
        for line in fp:
            process_line(line, hist)
    return hist

def process_line(line, hist):
    line = line.replace('-', ' ')

    for word in line.split():
        word = word.strip(string.punctuation + string.whitespace)
        word = word.lower()

        hist[word] = hist.get(word, 0) + 1

def common_words(hist):
    t = []
    for key, value in hist.items():
        t.append((value, key))

    t.sort(reverse=True)
    return t

def most_common_words(hist, num=100):
    t = common_words(hist)
    print 'The most common words are:'
    for freq, word in t[:num]:
        print freq, '\t', word

hist = process_file('emma.txt')
print 'Total num of Words:', sum(hist.values())
print 'Total num of Unique Words:', len(hist)
most_common_words(hist, 50)

Any help would be greatly appreciated because I am struggling in this class. Thank you in advance.

Respectfully,

Ruben Pinedo
Computer Information Systems
College of Business Administration
University of Texas at El Paso