how to organize a module that requires a data file
Steven Bethard
steven.bethard at gmail.com
Thu Nov 17 14:18:51 EST 2005
Ok, so I have a module that is basically a Python wrapper around a big
lookup table stored in a text file[1]. The module needs to provide a
few functions::
    get_stem(word, pos, default=None)
    stem_exists(word, pos)
    ...
Because there should only ever be one lookup table, I feel like these
functions ought to be module globals. That way, you could just do
something like::
    import morph
    assist = morph.get_stem('assistance', 'N')
    ...
My problem is with the text file. Where should I keep it? If I want to
keep the module simple, I need to be able to identify the location of
the file at module import time. That way, I can read all the data into
the appropriate Python structure, and all my module-level functions will
work immediately after import.
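
As a rough sketch of what "read all the data into the appropriate Python structure" could look like: the block below builds a dict keyed by (word, pos) and defines the two module-level functions on top of it. The line format (tab-separated word, part of speech, stem) and the helper name load_table() are assumptions made for illustration; the real file's layout is not described here, and the two sample lines stand in for the actual data.

```python
# Assumption: each data line looks like "word<TAB>pos<TAB>stem".
# The real file format is not specified in this message.

def load_table(lines):
    """Build a {(word, pos): stem} dict from an iterable of text lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, pos, stem = line.split('\t')
        table[(word, pos)] = stem
    return table

# At import time the module would pass an open file here; two made-up
# sample lines are used instead so the sketch runs on its own.
_lookup_table = load_table([
    'assistance\tN\tassist',
    'running\tV\trun',
])

def get_stem(word, pos, default=None):
    """Return the stem for (word, pos), or default if there is none."""
    return _lookup_table.get((word, pos), default)

def stem_exists(word, pos):
    """Return True if the table has an entry for (word, pos)."""
    return (word, pos) in _lookup_table
```

With the table built once at import, every call is a single dict lookup and no function needs to check whether the data has been loaded.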
I can only think of a few obvious places where I could find the text
file at import time -- in the same directory as the module (e.g.
lib/site-packages), in the user's home directory, or in a directory
indicated by an environment variable. The first seems weird because the
text file is large (about 10MB) and I don't really see any other
packages putting data files into lib/site-packages. The second seems
weird because it's not a per-user configuration - it's a data file
shared by all users. And the third seems weird because, in my
experience, a configuration that depends heavily on environment
variables is difficult to maintain.
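
For illustration, the three candidate locations could be checked in a fixed order at import time. This is only a sketch: the environment variable name MORPH_DATA and the helper names are hypothetical, the file name is borrowed from the setfile() example in this message, and the search order (environment variable first, then the module directory, then the home directory) is one arbitrary choice among several.

```python
import os

DATA_FILENAME = 'morph_english.flat'  # name taken from the setfile() example
ENV_VAR = 'MORPH_DATA'                # hypothetical environment variable

def candidate_paths(module_dir, environ=None, home=None):
    """List the places to look for the data file, in priority order."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser('~') if home is None else home
    paths = []
    if environ.get(ENV_VAR):
        # An explicit override, if set, is tried first.
        paths.append(environ[ENV_VAR])
    paths.append(os.path.join(module_dir, DATA_FILENAME))
    paths.append(os.path.join(home, DATA_FILENAME))
    return paths

def find_data_file(module_dir, environ=None, home=None):
    """Return the first candidate path that exists, or None."""
    for path in candidate_paths(module_dir, environ, home):
        if os.path.isfile(path):
            return path
    return None
```

Inside the module itself, module_dir would typically be os.path.dirname(os.path.abspath(__file__)). Passing the directory, environment and home in as arguments just makes the search easy to try out without touching the real environment.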
If I don't mind complicating the module functions a bit (e.g. by
starting each function with "if _lookup_table is not None"), I could
allow users to specify a location for the file after the module is
imported, e.g.::
    import morph
    morph.setfile(r'C:\resources\morph_english.flat')
    ...
Then all the module-level functions would have to raise exceptions until
setfile() was called. I don't like that the user would have to
configure the module each time they wanted to use it, but perhaps that's
unavoidable.
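
A minimal sketch of that setfile() variant, to show how small the per-function overhead could be kept. The tab-separated line format, the private helper _table(), and the choice of RuntimeError are all assumptions for illustration, not part of any existing module.

```python
_lookup_table = None  # stays None until setfile() has been called

def setfile(path):
    """Load the lookup table from path (assumed: word<TAB>pos<TAB>stem)."""
    global _lookup_table
    table = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                word, pos, stem = line.split('\t')
                table[(word, pos)] = stem
    # Assign only after a successful read, so a failed load leaves the
    # module in its previous state.
    _lookup_table = table

def _table():
    """Return the loaded table, or raise if setfile() was never called."""
    if _lookup_table is None:
        raise RuntimeError('morph: call setfile(path) before using the module')
    return _lookup_table

def get_stem(word, pos, default=None):
    return _table().get((word, pos), default)

def stem_exists(word, pos):
    return (word, pos) in _table()
```

Routing every public function through one helper keeps the "has the file been set?" check in a single place instead of repeating it at the top of each function.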
Any suggestions? Is there an obvious place to put the text file that
I'm missing?
Thanks in advance,
STeVe
[1] In case you're curious, the file is a list of words and their
morphological stems provided by the University of Pennsylvania.