[Tutor] data structures
Knacktus
knacktus at googlemail.com
Thu Dec 2 07:07:04 CET 2010
On 02.12.2010 02:51, Dana wrote:
> Hello,
>
> I'm using Python to extract words from plain text files. I have a list
> of words. Now I would like to convert that list to a dictionary of
> features where the key is the word and the value the number of
> occurrences in a group of files based on the filename (different files
> correspond to different categories). What is the best way to represent
> this data? When I finish I expect to have about 70 unique dictionaries
> with values I plan to use in frequency distributions, etc. Should I use
> globally defined dictionaries?
That depends on what else you want to do with the group of files. If you
expect to perform operations on a group's data, create a class so that
you can add methods alongside the data. I would probably go with a
class.
class FileGroup(object):

    def __init__(self, filenames):
        self.filenames = filenames
        self.word_to_occurrences = {}
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                # do the processing
                pass
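As a rough sketch, the processing step of such a class could look like this. The tokenizing regex and the most_common method are my own assumptions for illustration, not something taken from your description, and collections.Counter requires Python 2.7 or newer.

```python
import re
from collections import Counter


class FileGroup(object):
    """Word counts for one group (category) of files."""

    def __init__(self, filenames):
        self.filenames = list(filenames)
        # Counter is a dict subclass, so it still behaves like a dictionary.
        self.word_to_occurrences = Counter()
        self._populate_word_to_occurrences()

    def _populate_word_to_occurrences(self):
        for filename in self.filenames:
            with open(filename) as fi:
                for line in fi:
                    # Assumed tokenization: lowercase runs of letters,
                    # digits and apostrophes count as words.
                    words = re.findall(r"[a-z0-9']+", line.lower())
                    self.word_to_occurrences.update(words)

    def most_common(self, n=10):
        # Hypothetical extra method: the n most frequent words in the group.
        return self.word_to_occurrences.most_common(n)
```

Further methods (relative frequencies, comparisons between groups) can be added to the class in the same way without touching the code that uses it.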
Now you can add other meaningful data and methods to a group of files.
That said, plain dictionaries can be fine if they are really all you
need. You could write a function to create them:
def create_word_to_occurrences(filenames):
    word_to_occurrences = {}
    for filename in filenames:
        with open(filename) as fi:
            # do the processing
            pass
    return word_to_occurrences
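Instead of about 70 global dictionaries, one option is a single dictionary that maps each category to its own word counts. The following is only a sketch: the whitespace-based tokenization and the helper create_category_counts are assumptions of mine, and collections.Counter requires Python 2.7 or newer.

```python
from collections import Counter


def create_word_to_occurrences(filenames):
    word_to_occurrences = Counter()
    for filename in filenames:
        with open(filename) as fi:
            for line in fi:
                # Assumed tokenization: lowercase, split on whitespace.
                word_to_occurrences.update(line.lower().split())
    return word_to_occurrences


def create_category_counts(category_to_filenames):
    # Hypothetical helper: one word-count dictionary per category,
    # all held in a single dictionary instead of many globals.
    return {category: create_word_to_occurrences(filenames)
            for category, filenames in category_to_filenames.items()}
```

The result can then be passed around as one object, and a lookup such as counts[category][word] gives the number of occurrences of a word in a category (0 if the word never appeared there).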
But as I said, if in doubt I would go for the class.
>
> Dana