the 'right way' to distribute and access data files in a packaged Python module

Diez B. Roggisch deets at nospam.web.de
Mon Feb 2 12:42:07 EST 2009


David Moss wrote:

> Hi,
> 
> I'm the author of netaddr :-
> 
>     http://pypi.python.org/pypi/netaddr/0.6
> 
> For release 0.6 I've added setuptools support so it can be distributed
> as a Python egg package using the easy_install tool.
> 
> In 0.6, I've started bundling some data files from IEEE and IANA with
> the code below the site-packages install path (lib/site-packages/
> netaddr/...). netaddr accesses and parses these files on module load
> to provide various IP and MAC address related information via its API.
> 
> This mechanism works for the setuptools based packages because on
> install they extract to the filesystem and can be accessed using
> something like :-
> 
>     >>> index = open(os.path.join(os.path.dirname(__file__), 'oui.idx'))
> 
> However, setuptools seems to perform some magic for module imports
> which prevents me from accessing these files directly as they are
> bundled inside an egg (zip) file :-(
> 
> Two questions arise out of this situation :-
> 
> 1) is there a better way to distribute the files, i.e. should I be
> using a different more correct path instead of site-packages for data?
> If so, where is this and how do I add it to my setup scripts and code?
> 
> 2) is there an easy (and portable) way for me to dive inside an egg
> file to access the data I require (ugly but workable)? I'm assuming
> I'd need to check for the presence of setuptools available with the
> Python interpreter etc.
> 
> My users and I would be grateful for any help and advice.

setuptools has a zip_safe flag: passing zip_safe=False to setup() tells it
that your package only works when unarchived, so the egg is installed as an
unpacked directory. Alternatively, the pkg_resources module that ships with
setuptools lets you read resources from within a package even if it is
zipped, for example via pkg_resources.resource_stream(). You should
investigate these two options.
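
To illustrate the second option, here is a minimal sketch of reading a data
file through the package's loader instead of through open(). Everything in it
is an assumption made for the demo: the package name demo_pkg, the placeholder
file contents, and the temporary zip archive that stands in for an egg. It
uses the standard library's pkgutil.get_data() so that it runs without
setuptools; with setuptools installed,
pkg_resources.resource_string(package, name) plays the same role.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Write a zip holding a hypothetical package 'demo_pkg' plus one data file."""
    archive = os.path.join(directory, "demo_pkg.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", "")
        # Placeholder contents, not the real IEEE index format.
        zf.writestr("demo_pkg/oui.idx", "placeholder index data\n")
    return archive


def load_index(package="demo_pkg", name="oui.idx"):
    """Return the bytes of a data file shipped inside a package, zipped or not."""
    # pkgutil.get_data asks the package's loader for the file, so it does not
    # depend on the package being unpacked on the filesystem.
    return pkgutil.get_data(package, name)


# Put the archive on sys.path so 'demo_pkg' is imported straight from the zip.
tmp_dir = tempfile.mkdtemp()
sys.path.insert(0, build_demo_archive(tmp_dir))

index_bytes = load_index()
print(index_bytes.decode("ascii"))
```

The same load_index() call works unchanged when the package is installed as a
plain directory, which is why this approach avoids building paths from
__file__.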

Diez


