how to organize a module that requires a data file
Larry Bates
larry.bates at websafe.com
Thu Nov 17 15:17:49 EST 2005
Personally, I would do this as a class and pass the path to the
data file as an argument when instantiating it (and maybe try to
help the user by searching for the file if they don't pass one).
Something like:
    class morph:
        def __init__(self, pathtodictionary=None):
            if pathtodictionary is None:
                #
                # Insert code here to see if it is in the current
                # directory and/or look in other directories.
                #
                pass
            try:
                self.fp = open(pathtodictionary, 'r')
            except (IOError, TypeError):
                # TypeError covers a path that is still None here.
                print("unable to locate dictionary at: %s" % pathtodictionary)
            else:
                #
                # Insert code here to load data from .txt file
                #
                self.fp.close()

        def get_stem(self, arg1, arg2):
            #
            # Code for get_stem method
            #
            pass
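The search step left as a comment in __init__ could look roughly like
this. It is only a sketch: find_dictionary and its search_dirs argument
are names made up for illustration, and the default of looking only in
the current directory is an assumption, not a recommendation.

```python
import os


def find_dictionary(filename, search_dirs=None):
    """Return the first existing path for filename, or None if not found."""
    if search_dirs is None:
        # Assumed default: look only in the current working directory.
        search_dirs = [os.getcwd()]
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    # Nothing matched; the caller decides how to report the problem.
    return None
```

The class could call this when pathtodictionary is None and fall back
to the error message if it also returns None.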
The other way I've done this is to have a .INI file that always lives
in the same directory as the class with an entry in it that points me
to where the .txt file lives.
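A minimal sketch of the .INI approach, assuming Python 3's configparser
module (it was called ConfigParser in Python 2). The function name and
the "morph" section and "dictionary" option are hypothetical; use
whatever names suit your file.

```python
import configparser


def read_dictionary_path(ini_path):
    """Return the dictionary path stored in the .INI file at ini_path."""
    parser = configparser.ConfigParser()
    # parser.read() silently skips files it cannot open and returns the
    # list of files it did read, so an empty result means failure.
    if not parser.read(ini_path):
        raise IOError("unable to read configuration file: %s" % ini_path)
    # "morph" and "dictionary" are hypothetical section/option names.
    return parser.get("morph", "dictionary")
```

The matching .INI file would contain a [morph] section with a single
line such as "dictionary = C:\resources\morph_english.flat". The
module can build ini_path from its own __file__ so that the .INI file
is always found next to the code.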
Hope this helps.
-Larry Bates
Steven Bethard wrote:
> Ok, so I have a module that is basically a Python wrapper around a big
> lookup table stored in a text file[1]. The module needs to provide a
> few functions::
>
> get_stem(word, pos, default=None)
> stem_exists(word, pos)
> ...
>
> Because there should only ever be one lookup table, I feel like these
> functions ought to be module globals. That way, you could just do
> something like::
>
> import morph
> assist = morph.get_stem('assistance', 'N')
> ...
>
> My problem is with the text file. Where should I keep it? If I want to
> keep the module simple, I need to be able to identify the location of
> the file at module import time. That way, I can read all the data into
> the appropriate Python structure, and all my module-level functions will
> work immediately after import.
>
> I can only think of a few obvious places where I could find the text
> file at import time -- in the same directory as the module (e.g.
> lib/site-packages), in the user's home directory, or in a directory
> indicated by an environment variable. The first seems weird because the
> text file is large (about 10MB) and I don't really see any other
> packages putting data files into lib/site-packages. The second seems
> weird because it's not a per-user configuration - it's a data file
> shared by all users. And the third seems weird because my
> experience with a configuration depending heavily on environment
> variables is that this is difficult to maintain.
>
> If I don't mind complicating the module functions a bit (e.g. by
> starting each function with "if _lookup_table is not None"), I could
> allow users to specify a location for the file after the module is
> imported, e.g.::
>
> import morph
> morph.setfile(r'C:\resources\morph_english.flat')
> ...
>
> Then all the module-level functions would have to raise Exceptions until
> setfile() was called. I don't like that the user would have to
> configure the module each time they wanted to use it, but perhaps that's
> unavoidable.
>
> Any suggestions? Is there an obvious place to put the text file that
> I'm missing?
>
> Thanks in advance,
>
> STeVe
>
> [1] In case you're curious, the file is a list of words and their
> morphological stems provided by the University of Pennsylvania.