Most Effective Way to Build Up a Histogram of Words?

Steve Holden sholden at holdenweb.com
Thu Oct 12 13:07:21 EDT 2000


June Kim wrote:
> 
> 
> Thank you for your clear and clean code.

Mostly Simon Brunning's, in fact.

> The problem, however, is that I might run through several files of a few MB
> each, adding up to tens of megabytes if they were combined into one file.

As long as you are only processing one file at a time, this will probably
be OK.  Don't forget, the table grows with the number of unique words
rather than with the file size.
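
To make that concrete, here is a minimal sketch in present-day Python
(it is not the code from earlier in the thread, and count_words is just
an illustrative name).  It assumes a "word" is anything separated by
whitespace, and it leans on collections.Counter, a convenience that a
plain dictionary could replace.

```python
from collections import Counter


def count_words(lines, counts=None):
    """Add the words found in an iterable of lines to a Counter.

    Assumption: words are whitespace-separated and case is ignored.
    """
    if counts is None:
        counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


# Six thousand words of input, but only five distinct table entries.
text = ["the cat sat on the mat"] * 1000
counts = count_words(text)
print(len(counts), "unique words out of", sum(counts.values()), "total")
```

However long the input gets, the table only gains an entry when a word
it has not seen before turns up.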

> Therefore, doing the sorting all at once might be somewhat unfeasible
> or inefficient. Am I trying to make Python a panacea here? (I know it has
> no snake oil, though.)
> 
I can only repeat, try it and see.  You may be pleasantly surprised.

Enhance the code you have with a loop to iterate over all the files
you want to read, and go have a cup of your favorite beverage :-)
If it works, you can deal with more complex issues later, but you will
have proved its practicality.
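
Something along these lines would do as a first attempt.  Again it is
only a sketch in present-day Python: histogram and report are made-up
names, splitting on whitespace is an assumption about what counts as a
word, and the UTF-8 decoding with errors="replace" is a guess about the
input files.

```python
from collections import Counter


def histogram(paths):
    """Build one word-count table from several files, one line at a time."""
    counts = Counter()
    for path in paths:
        # Only one file is open at once, and only one line is held in memory.
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                counts.update(line.lower().split())
    return counts


def report(counts, top=20):
    """Print the most frequent words; the sort happens once, at the end."""
    for word, n in counts.most_common(top):
        print(f"{n:8d} {word}")
```

Calling report(histogram(list_of_filenames)) sorts only the table of
unique words, not the tens of megabytes of text that went into it.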

> Best Regards,
> June

regards
 Steve
-- 
Helping people meet their information needs with training and technology.
703 967 0887      sholden at bellatlantic.net      http://www.holdenweb.com/




