[Tutor] data structures

Knacktus knacktus at googlemail.com
Thu Dec 2 07:07:04 CET 2010


On 02.12.2010 02:51, Dana wrote:
> Hello,
>
> I'm using Python to extract words from plain text files. I have a list
> of words. Now I would like to convert that list to a dictionary of
> features where the key is the word and the value the number of
> occurrences in a group of files based on the filename (different files
> correspond to different categories). What is the best way to represent
> this data? When I finish I expect to have about 70 unique dictionaries
> with values I plan to use in frequency distributions, etc. Should I use
> globally defined dictionaries?
That depends on what else you want to do with the group of files. If you
expect to perform further operations on the group's data, create a class
so you can add methods alongside the data. I would probably go with a
class.

class FileGroup(object):

    def __init__(self, filenames):
        self.filenames = filenames
        self.word_to_occurrences = {}
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                # do the processing, i.e. update self.word_to_occurrences
                pass

Now you could add other meaningful data and methods to a group of files.
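
To make that concrete, here is a minimal sketch of what the processing and one extra method could look like. It assumes words are separated by whitespace and that case doesn't matter; the use of collections.Counter (Python 2.7+) and the most_common method are additions for illustration, not something the original question prescribes.

```python
from collections import Counter


class FileGroup(object):
    """A group of files with combined word counts (illustrative sketch)."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        # Counter is a dict subclass, so it can be used like a plain dict.
        self.word_to_occurrences = Counter()
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                for line in fi:
                    # Assumption: whitespace-separated words,
                    # compared case-insensitively.
                    self.word_to_occurrences.update(line.lower().split())

    def most_common(self, n):
        """Hypothetical extra method: the n most frequent words."""
        return self.word_to_occurrences.most_common(n)
```

With this in place, FileGroup(["a.txt", "b.txt"]).most_common(10) would give the ten most frequent words across both files (the filenames here are placeholders).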

That said, plain dictionaries can be fine if they are really all you
need. You could write a function to create them.

def create_word_to_occurrences(filenames):
    word_to_occurrences = {}
    for filename in filenames:
        with open(filename) as fi:
            # do the processing, i.e. update word_to_occurrences
            pass
    return word_to_occurrences
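
If you go the function route, you probably don't need 70 global dictionaries either; one dictionary keyed by category can hold them all. A rough sketch, assuming you already have a mapping from category name to its filenames (how you derive the category from a filename is up to you) and that words are separated by whitespace. create_category_counts is a made-up helper name.

```python
from collections import Counter


def create_word_to_occurrences(filenames):
    """Count whitespace-separated words across the given files."""
    word_to_occurrences = Counter()
    for filename in filenames:
        with open(filename) as fi:
            for line in fi:
                word_to_occurrences.update(line.split())
    return word_to_occurrences


def create_category_counts(category_to_filenames):
    """Hypothetical helper: build one word-count dict per category."""
    return dict(
        (category, create_word_to_occurrences(filenames))
        for category, filenames in category_to_filenames.items()
    )
```

The result is used like counts["some_category"]["some_word"], so all the per-category dictionaries live in a single object that can be passed around instead of being looked up as globals.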

But as I said, if in doubt I would go for the class.

>
> Dana


