[Distutils] A Modest Proposal for "A Database of Installed Packages"

Alexander Michael lxander.m at gmail.com
Thu Apr 10 04:05:01 CEST 2008


On Mon, Apr 7, 2008 at 11:18 PM, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:05 PM 4/7/2008 -0400, Alexander Michael wrote:
>
> > a. I believe that having side-car files that sit alongside
> > packages because they have the same base name makes the database more
> > transparent to the uninitiated.
> >
>
>  I'm not aware that this was ever a stated design goal, nor why it should
> have any priority.  OTOH, files named by distribution would be at least as,
> if not even *more* transparent than package names, so I don't see any
> particular benefit to this.

You're right, I didn't state this in my rationale. I'm suggesting it
*partly* because I thought that subverting the "database" aspect by
inverting the relationship between distributions and packages would help
solve the social problem. Now it could be that a technical solution to a
social problem is suspect, but the other part of why I suggested it this
way was an honest attempt at improving transparency by putting the
metadata up front. I did indeed read PEP 262, and upon reading it decided
it sounded too much like a "system packager" and not enough like a way to
get the missing metadata out of PKG-INFO and into the installed packages.
I appreciate your patience with me. I know I've earned little social
capital in this community and am likely trying too hard to be helpful, but
I'm earnestly trying to be helpful and I have read up a fair bit in
attempting to do so. That said, I think you're probably right that
inverting the distribution/package name relationship is too problematic,
if only because of the redundancy it results in for distributions that
contain more than one package (or worse, a slew of modules).

This pretty much leaves us with the egg-info files I've been reading
about. Since I use setuptools for everything but wxPython (whose WinXP
installer doesn't seem to include them) and these aren't included in the
standard library except for wsgiref, I don't really see these files, even
though I see the code to produce them in distutils. But this is perhaps
beside the point.

> > Just browsing a directory of python
> > packages will allow you to see what's going on. Moving like-names
> > files around manually maintains the integrity and availability of the
> > data.
>
>  Moving anything manually, other than the *entire* directory, will be
> unlikely to retain any form of integrity, so it's best not to give the false
> impression that it would.

I disagree with this. Certainly it decreases in likelihood when the
side-car files are named by distribution and not package, and if the
distribution contains more than one package, but other than that, it
seems pretty easy (e.g. hmm.. maybe I should move the mypkg.pkg-info file
along with the mypkg directory. let's look inside. oh! I see how this
works!)
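
As a rough illustration of the kind of browsing I have in mind, here is a
minimal sketch in Python. It is not part of any existing tool: the
".pkg-info" suffix comes from my example above, and the names
find_sidecars and demo are hypothetical. It only pairs each top-level
package or module in a directory with a like-named side-car file, if one
is present.

```python
import os
import tempfile

# Hypothetical suffix, borrowed from the "mypkg.pkg-info" example above.
SIDECAR_SUFFIX = ".pkg-info"


def find_sidecars(directory):
    """Map each top-level package/module name to its side-car file name.

    The value is None when no matching side-car file sits alongside it.
    """
    entries = set(os.listdir(directory))
    found = {}
    for entry in sorted(entries):
        path = os.path.join(directory, entry)
        if os.path.isdir(path):
            name = entry                    # a package directory
        elif entry.endswith(".py"):
            name = entry[:-len(".py")]      # a single-file module
        else:
            continue                        # side-car files and anything else
        sidecar = name + SIDECAR_SUFFIX
        found[name] = sidecar if sidecar in entries else None
    return found


def demo():
    """Build a throwaway directory and report what find_sidecars sees."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "mypkg"))
        for filename in ("mypkg.pkg-info", "other.py"):
            with open(os.path.join(root, filename), "w") as handle:
                handle.write("")
        return find_sidecars(root)
```

A package whose side-car file was left behind in a manual move shows up
with None, which is about as much "integrity checking" as this scheme
could offer.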

> > I think that having magic entries in an essentially "hidden"
> > directory somewhere will cause all sorts of trouble that could be
> > avoided at the cost of a small bit of duplication.
> >    b. I assume, perhaps incorrectly, that most distributions contain
> > only a single package.
>
>  Very incorrectly, unless you mean a single top-level package.  Odds are
> fairly good that if there's a package, there's probably at least a
> subpackage, too, like perhaps a tests subpackage.

I do mean a single top-level package (but one setup.py); thanks for
clarifying.

> > That said, I do agree that if you are primarily interested in a
> > database of *distributions* (as opposed to *packages*) then something
> > like is proposed in PEP 262 makes more sense (but it would have to be
> > per directory and not site-wide due to the dynamic nature of the
> > python path).
>
>  That's exactly what I want.  The only reason I didn't just implement
> easy_install using a per-directory form of PEP 262 is that I wanted
> something done rather more immediately.  That was years ago, so I can afford
> to be more patient now.  :)

It's ironic how impatience is rewarded! ;)

> > This is a trade-off between putting the metadata up
> > front in an obvious and easy to understand way so that it will
> > hopefully have a better chance of being noticed and maintained, versus
> > tucking it away hidden someplace so that even though it is broken, it
> > doesn't bother anyone until they care enough to fix it. *It is this
> > trade-off that I am exploring with this strawman "counter" proposal to
> > PEP 262.*
> >
>
>  Someone would have to be crazy to maintain this information by hand.  So
> I'd actually consider it an advantage if the file format made this fact
> plain, by using something that's difficult for a human being to maintain,
> like say a pickle.  ;-)  OTOH, it's possible that some system packagers will
> not wish to use Python to generate the files, so using something a bit less
> complex would be a good idea.  The format proposed by PEP 262 isn't really
> that bad of a trade-off in those terms.

What made you think people were rational? :) I do think that being able
to maintain it by hand will aid in transparency and clarity, which Linux
users just love, so maybe it will help win them over to the idea of
Python knowing a little bit about itself. ;)

> > 2. The strawman proposal did not explicitly address how optional
> > add-on tools (like setuptools) might manage namespace packages.
> >
>
>  I think there's some misunderstanding here about the proposal's goals.  If
> the proposal doesn't work for setuptools, it doesn't work, period.
>
>  The entire point is to allow setuptools to do its work without annoying the
> people who don't want to use it.

This is my misunderstanding. I was trying to take setuptools out of the
equation (if only to avoid the social backlash of "trying to get setuptools
into the standard library") by providing a proposal that met the objectives
of my installation-tool-agnostic rationale.

> > 4. Concerns were raised about the performance penalty for using the
> > side-car style files without version numbers possibly not all of which
> > were located at the top-most level of the directory listed in the
> > python path.
> >
> > Any add-on tool that actually used the data would very likely need to
> > build a cache of the data using a more efficient representation,
> >
>
>  This is a misunderstanding of the point I raised.  Floris merely asked why
> there were version numbers in .egg-info files, and I answered him.  That
> doesn't actually have much, if anything, to do with the package database
> proposal.  It's merely how installed distributions' versions can be
> recognized quickly at runtime, not anything to do with how potential
> installation conflicts are handled at installation time.

Apologies for confusing point of fact with objective.
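
For what it's worth, here is how I now understand the point of fact, as a
small Python sketch. The filename pattern (name, version, and an optional
"-pyX.Y" tag, separated by hyphens) is my assumption about typical
.egg-info names, and parse_egg_info_name is a hypothetical helper, not
anything in setuptools. The idea is only that a version can be read off a
directory listing without opening any file.

```python
import re

# Assumed pattern: <name>-<version>[-py<X.Y>].egg-info, with no hyphens
# inside the name or the version. This is an illustration, not a spec.
EGG_INFO_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)(?:-py(?P<pyver>[^-]+))?\.egg-info$"
)


def parse_egg_info_name(filename):
    """Return (name, version, pyver) from an .egg-info filename, or None."""
    match = EGG_INFO_RE.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("version"), match.group("pyver")
```

For example, under that assumed pattern "Example-1.0-py2.5.egg-info" would
yield the name "Example", the version "1.0", and the Python tag "2.5".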

>  easy_install uses eggs for installation simply because it need never worry
> about file ownership conflicts.  There is a direct mapping from a
> distribution to its files: the contents of a zipfile or subdirectory.  This
> also allows for (relatively) straightforward uninstallation.

I actually like zipped eggs (much more than easy_install as a package
manager), but that is beside the point since the BDFL vetoed them.

>  The goal of the proposal, then, is to have a way for easy_install to have
> another way to map from a distribution to its owned files (and vice versa),
> so that eggs are not necessary for normal, single-version installations.

This is where we misunderstood each other and where I've probably gone
astray, as I wasn't trying to propose anything at all for easy_install (it
wasn't in my attempt at a rationale), but a generic common ground that tools
like easy_install (but not exclusively) could use without stepping on
people's toes like easy_install does. I really wanted the proposal to stand
alone from easy_install so that easy_install haters wouldn't have to fear
it, as well as provide some utility for those who don't even use such tools.
But this is probably just tilting at windmills.
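
To make the "common ground" a little more concrete, here is a minimal
sketch in Python of the mapping you describe, from a distribution to its
owned files and back again. Everything in it is an assumption for
illustration: the input is just a dict of distribution names to file
lists, and build_owner_index is a hypothetical name. It says nothing about
how the records would actually be stored on disk.

```python
def build_owner_index(installed):
    """Invert a {distribution: [files]} mapping.

    Returns (owners, conflicts), where owners maps each file to the first
    distribution (in sorted order) that claims it, and conflicts maps each
    contested file to every distribution that claims it.
    """
    owners = {}
    conflicts = {}
    for dist in sorted(installed):
        for path in installed[dist]:
            if path in owners and owners[path] != dist:
                # Second (or later) claimant: record the ownership conflict.
                conflicts.setdefault(path, [owners[path]]).append(dist)
            else:
                owners[path] = dist
    return owners, conflicts
```

Any installation tool could consult such an index before writing a file,
and an uninstaller could walk the forward mapping, without either one
needing to be easy_install.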

Thanks again for your patience. You must be overwhelmed by all of the
opinions and misdirected attempts to help (including mine). I did wish we
could have come up with a setuptools-agnostic support layer for installation
managers suitable for inclusion in the standard library, but for some reason
that doesn't seem to be desired by too many people. Apologies (and thanks
for all your hard work).

