[Distutils] Metadata fields

Amos Latteier amos@digicool.com
Mon Mar 12 00:22:01 2001


Andrew Kuchling wrote:
> Information about a package
> ===========================
> Name
> Version
> Supported Platforms
> Description
> Keywords
> Homepage URL
> Author IDs
> License
> Download link
> Date of release

Thanks for the great start, Andrew! I have a couple of comments.

The distutils currently has both a description and a
long_description. I think that both are useful.

It also has a couple of "derived" fields: contact and
contact_email are set to either the maintainer (if
available) or the author. It also has a fullname field, which
is name-version. I think that we can dispense with these
"derived" fields.

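As a rough sketch of why the derived fields can go: each one can be
computed from the stored fields whenever it is needed. The class and
function names below are made up for illustration and are not the
distutils API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PackageMetadata:
    """Hypothetical container for the stored (non-derived) fields."""
    name: str
    version: str
    author: Optional[str] = None
    author_email: Optional[str] = None
    maintainer: Optional[str] = None
    maintainer_email: Optional[str] = None


def contact(meta):
    # Prefer the maintainer when one is set, otherwise use the author.
    return meta.maintainer or meta.author


def contact_email(meta):
    # Same fallback rule, applied to the email addresses.
    return meta.maintainer_email or meta.author_email


def fullname(meta):
    # name-version, e.g. "example-1.0"
    return f"{meta.name}-{meta.version}"


meta = PackageMetadata(
    name="example",
    version="1.0",
    author="A. Author",
    author_email="author@example.com",
)
print(contact(meta), contact_email(meta), fullname(meta))
```
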
For fields like platforms and license, we should have a list
of possible choices.
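
A minimal sketch of what checking against such lists might look
like. The choice lists here are placeholders that I made up; the
real ones would have to be agreed on.

```python
# Placeholder choice lists (assumptions, not an agreed vocabulary).
ALLOWED_LICENSES = {"GPL", "LGPL", "BSD", "MIT", "Python"}
ALLOWED_PLATFORMS = {"any", "linux", "win32", "macos"}


def check_choices(field, values, allowed):
    """Return one error string per value that is not in the allowed set."""
    return [f"{field}: unknown value {v!r}" for v in values if v not in allowed]


errors = check_choices("license", ["BSD"], ALLOWED_LICENSES)
errors += check_choices("platforms", ["linux", "amiga"], ALLOWED_PLATFORMS)
for line in errors:
    print(line)
```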

Keywords are tricky. Should we have a restricted vocabulary?
I think that a controlled vocabulary is preferable, but
agreeing on a vocabulary is difficult. I suggest that we use
an existing categorization system. For example, CPAN,
SourceForge, and Freshmeat have ways of categorizing software. I'm
sure that there are lots of other examples too.

I wonder about the download link. The distribution packager
may not know what this is, assuming that the software can be
downloaded from the catalog. Maybe this field is set by the
catalog. Or maybe it doesn't belong in the meta-data.

Finally, maybe you should be able to find out which files a
package installs. Maybe this information is not, properly
speaking, meta-data.

> Information about a document
> ============================
> Name
> Author
> Description
> URL of HTML version
> URL of printable version
> URL and format of downloadable version
>   (Any of these URLs can be omitted if not applicable.)
> 
> The "Information about a document" section is only relevant to a
> catalog that includes non-software things such as documentation, and
> can probably be ignored for now.

I agree that we should hold off on this for now. There are
lots of other pieces of information which may be relevant to
documents (for example, the Dublin Core).

> The "Information about an author"
> section makes sense for a CPAN-like system where authors are
> registered as independent entities, but not for one where packages are
> the only entities.  On the other hand, maybe registering developers is
> worth preserving; otherwise you'd have to put your URL and GPG key in
> every single package you maintain, which is kind of annoying.

I think that it's worth having author information separate
from package information. I also think that email address is
a good author ID. This probably means that the catalog
system will be in charge of managing author meta-data, while
package meta-data will be managed by distribution packagers.
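
Roughly, the split could look like the sketch below. The record
layout and function names are hypothetical, and lowercasing the
email address to normalize the ID is a simplifying assumption.

```python
# email address -> author record; maintained by the catalog, not by packagers
authors = {}


def register_author(email, name, url=None, gpg_key=None):
    """Store one record per author, keyed by the lowercased email address."""
    authors[email.lower()] = {"name": name, "url": url, "gpg_key": gpg_key}


def authors_for(package):
    """Look up the registered records for a package's author IDs."""
    ids = (a.lower() for a in package["author_ids"])
    return [authors[a] for a in ids if a in authors]


register_author("Jane@example.com", "Jane Doe", url="http://example.com/~jane")

# Package meta-data carries only the author ID, not the URL or GPG key.
package = {
    "name": "example",
    "version": "1.0",
    "author_ids": ["jane@example.com"],
}
print(authors_for(package))
```

This way the URL and GPG key live in one place, and each package
only has to name its authors by email address.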

Thanks again, Andrew, for offering to write this up as a PEP.

-Amos