[Distutils] "Python Package Management Sucks"

Toshio Kuratomi a.badger at gmail.com
Thu Oct 2 00:14:40 CEST 2008


Phillip J. Eby wrote:
> At 11:00 AM 10/1/2008 -0700, Toshio Kuratomi wrote:
>> I have no love for how pkg_resources implements this (including the API)
>> but the idea of retrieving data files, locales, config files, etc from
>> an API is good.  For packages to be coded that conform to the File
>> Hierarchy Standard on Linux, the API (and metadata) needs to be more
>> flexible.
> 
> There's some confusion here.  pkg_resources implements *resource*
> management and *metadata* management...  NOT "file management".
> 
> Resource files and metadata are no more "data" in the FHS sense than
> static data segments in a .so file are; they are simply a more
> convenient way of including such data than having a giant base64 string
> or something like that hardcoded into the program itself.  There is thus
> no relevance to the FHS and absolutely no reason for them to live
> anywhere except within the Python packages they are a part of.
> 
If we can agree on a definition of resource files there's a case to be
made here.  One of the problems, though, is that people use
pkg_resources for things that are data.  Now there could be two reasons
for that:

1) Developers are abusing pkg_resources.
2) Linux distributions disagree with you on what constitutes data vs a
resource.

Let's discuss the definition of resource vs data below (since you made a
good start at it) and we can see which of these it is.

> 
>>   We need to be able to mark locale, config, and data files in
>> the metadata.
> 
> Sure...  and having a standard for specifying that kind of
> application/system-level install stuff is great; it's just entirely
> outside the scope of what eggs are for.
> 
> To be clear, I mean here that a "file" (as opposed to a resource) is
> something that the user is expected to be able to read or copy, or
> modify.  (Whereas a resource is something that is entirely internal to a
> library, and metadata is information *about* the library itself.)
> 
Metadata I haven't even begun to think about yet.  I personally don't
see a huge need to shift it around on the filesystem but someone who's
thought about it longer might find reasons that it belongs in some other
place.

Resources, as I said, need to be defined.  You're saying here that a
resource is something internal to the library.  A "file" is something
that a user can read, copy, or modify.
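To make the "internal to the library" side of that definition concrete, here is a minimal sketch of resource-style access.  It uses the standard library's pkgutil.get_data as a stand-in rather than pkg_resources itself, and the package name "demoapp" and file "defaults.txt" are hypothetical, created on the fly only so the example is self-contained.

```python
import importlib
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package on disk so the example runs anywhere.
# "demoapp" and "defaults.txt" are hypothetical names, not from any real app.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demoapp")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "wb") as f:
    f.write(b"")
with open(os.path.join(pkg_dir, "defaults.txt"), "wb") as f:
    f.write(b"templating=genshi\n")

# Make the new package importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

# Resource-style access: the caller names a package and a path relative
# to it, and never sees or chooses an absolute filesystem location.
resource_bytes = pkgutil.get_data("demoapp", "defaults.txt")
print(resource_bytes)
```

The point of the sketch is that the caller cannot say "look in /etc" or "look in /usr/share": the file is found relative to the package, which is exactly why this style fits internal resources and fits FHS-style data poorly.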

In a typical TurboGears app, the following things are to be found
inside of the app's directory in site-packages:

config/{app.cfg,__init__.py,log.cfg} - These could go in /etc/ as they're
configuration.  However, I've tried to stress to upstream that only
things that configure the TurboGears framework for use with their app
should go in these files (default templating language, identity
controller).  When those things are true, I can see this as being an
"internal resource".  If upstream can't get their act together, it's config.

locale/{message catalogs for various languages} --  These are binary
files that contain strings that the user may see when a message is
given.  These, I think, are data.

templates/*html -- These are templates that the application fills in
with variables intermixed with short bits of code.  These are on the
border between code and data.  The user sees them in a modified form.
The app sometimes executes pieces of them before the user sees them.
Some template languages create Python bytecode from the templates,
others load them and write into them every time.  None of them can be
executed on their own.  All of them have to be loaded by a call to parse
them from a piece of Python code in another file.  None of them are
directly called or invoked.  My leaning is that these are data.

static/{javascript,css,images} -- These are things that are definitely
never executed.  They are served by the webserver verbatim when their
URL is called.  These are certainly data. (Note: I don't believe these
are referenced using the resources API, just via URL.)

So... do you agree on which of these are data and which are resources?
Do you have an idea on how we can prevent application and framework
writers from misusing the resources API to load things that are data?

> 
>>   The build/install tool needs to be able to install those
>> into the filesystem in the proper places for a Linux distro, an egg,
>> etc.  and then we need to be able to call an API to retrieve the
>> specific class of resources or a directory associated with them.
> 
> Agreed...  assuming of course that we're keeping a clear distinction
> between static resources+metadata and actual "data" (e.g. configuration)
> files.
> 
> 
<nod>.  The definition and the distinction are important.
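
As a rough illustration of the kind of API described above, here is a minimal sketch in which files are marked by class (config, locale, data) and a lookup returns the directory for that class under a given install layout.  Everything in it is an assumption made for illustration: the SCHEMES table, the locate() function, and the directory choices are hypothetical and do not correspond to any existing distutils or setuptools interface.

```python
import posixpath

# Hypothetical install schemes mapping each class of file to a directory.
# The paths are illustrative assumptions, not any tool's real layout.
SCHEMES = {
    "fhs": {
        "config": "/etc/{package}",
        "locale": "/usr/share/locale",
        "data": "/usr/share/{package}",
    },
    "egg": {
        "config": "{package_dir}/config",
        "locale": "{package_dir}/locale",
        "data": "{package_dir}",
    },
}


def locate(category, package, package_dir, scheme):
    """Return the directory holding `category` files for `package`.

    `scheme` would, in a real design, be recorded in the package's
    metadata at install time; here the caller passes it explicitly.
    """
    template = SCHEMES[scheme][category]
    return template.format(package=package, package_dir=package_dir)


def locate_file(category, package, package_dir, scheme, name):
    """Join a file name onto the directory for its category."""
    return posixpath.join(locate(category, package, package_dir, scheme), name)


site_dir = "/usr/lib/python/site-packages/myapp"  # hypothetical location
print(locate_file("config", "myapp", site_dir, "fhs", "app.cfg"))
print(locate_file("config", "myapp", site_dir, "egg", "app.cfg"))
```

The application code asks for "the config file named app.cfg" in both cases; only the recorded scheme decides whether that resolves under /etc or inside the package directory.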

-Toshio
