[Distutils] formencode as .egg in Debian ??
Phillip J. Eby
pje at telecommunity.com
Thu Nov 24 00:06:24 CET 2005
At 10:00 PM 11/23/2005 +0100, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
>>I was referring to how the distribution is *installed*. You don't use
>>things directly from a deb file, they have to be installed on the
>>system. When you install an egg, you must use one of the three forms, or
>>the system as a whole will not function.
>
>That depends on whether the "system" (pkg_resources, I assume) is used
>at all. If the project is just a Python library, you can install it
>as a Python package in site-python, not as an egg.
>
>>Eggs that depend on the egg will not be able to find it, nor use any
>>plugins it contains.
>
>Not sure what an egg plugin is, so I cannot comment on that.
>As for other eggs finding the one: In Debian, there normally shouldn't
>be any need to, since there will be also a Debian package providing
>the other project, and then a plain "import" will be sufficient to
>find the Python package.
No, it won't, because... oh never mind. I'll explain again below.
What you seem to keep missing, though, is that eggs and their metadata are
a *feature*, not a bug. The rapid uptake of setuptools by developers
trying to build more powerful frameworks and platforms for Python is
sufficient evidence that they provide useful features that Python
developers desire to have, precisely because they can be used to wrap
non-setuptools based packages without code changes and without reinventing
wheels - either the wheels provided by setuptools, or the wheels provided
by other projects when wrapped by setuptools. Removing the metadata gives
them neither option.
>Of course, any usage of the pkg_resource API would break. One way
>to deal with that is to encourage upstream authors to have a fallback
>mode where they can work without pkg_resource; another is to provide
>a fallback implementation of pkg_resource.
Yes, and while we're at it, let's encourage developers to have fallbacks so
their code can run on Python 1.5.2. Heck, why stop there? Anything that
requires features introduced after Python 1.0 would obviously only be an
impossible attempt to improve upon perfection. For that matter, let's not
have any dependencies on other packages at all! Clearly it would be better
for everybody to write their own modules and not use something written by
some random person on the Internet. :)
All joking aside, one of the central points of having setuptools in the
first place is that it allows people to avoid duplicating code. Code like,
say, the pkg_resources module. This is another example of what I'm calling
a contradiction in terms, because I keep saying that the purpose of all
this is to allow X, and then you propose, "well, do it without X", and I
say, "but X is the whole point! Doing it without X isn't actually doing
'it' because X is what 'it' is." And then you say, "Ah, but what if you do
it with Y?", and so we go round the loop again.
>>So, when I say it is a contradiction in terms to install an egg in a
>>non-egg form, I mean that it is nonsensical to say that you have
>>installed it, because it will be unusable (by other eggs), nonfunctional
>>(by itself), or both.
>
>That makes me not like the egg infrastructure: too many subtle
>dependencies, and you are too much forced into using the structures
>that the setuptools authors came up with.
[boggle] Um, what is Debian but a collection of subtle dependencies forced
into the structures that its authors came up with? Perhaps your point here
is just too subtle for me. :)
>Of course, the pragmatic view is just to bite the bitter pill (is
>this the idiom?)
The idioms are to "bite the bullet", or "swallow the bitter pill". The
former is from the one-time medical practice of biting on a bullet to avoid
screaming during procedures performed without anesthetic. The latter of
course is also a medical idiom, in the sense that a medicine may be bitter
but nonetheless good for one's health. :) In any case, both idioms imply
a desire to get an unpleasant but beneficial task over with, so mixing them
is quite understandable, albeit odd-sounding. :)
> and find some strategy that makes pkg_resource
>work, without any of the drawbacks of setuptools.
Just as I'm trying to help find a way to make Debian be able to provide
something useful for setuptools-based projects, despite the drawbacks of
the current Debian arrangements. ;)
The degree of negativity from the Debian side at the outset of this
conversation (virtually all of it from you) has not been conducive to
making this happen. As a simple matter of practicality, I can't afford to
leave your comments unanswered, not because I feel any need to convince you
personally of anything, but because I don't want to leave anyone else with
the impression that your portrayal of these so-called "drawbacks" is a fair
one. Otherwise, I would have just ignored your comments and focused on
working with the people who seem more interested in finding solutions than
finding ways to declare the problem nonexistent. As it is, I feel
forced to spend time replying to your comments point by point, time that I
could otherwise spend on actually helping to resolve the issues.
If I were to adopt your tone, I would be calling Debian a fragile and
broken system that is unable to deal well with simple matters like editing
a file upon installation, or having multiple versions of a package
installed at the same time. Sure, the limitation might exist, but is it
fair to call Debian fragile or broken because of it? Not a bit! I've
therefore been very careful to describe any such tradeoffs that Debian
makes in neutral terms rather than categorically pejorative ones. I would
prefer that you extend me the same courtesy by not describing every
design tradeoff I make as "non-standard", a "drawback", or "for no good
reason".
(Even though I have referred to the existing Debian policy as "outdated", I
meant it only in the sense that it does not deal explicitly with the issue
of eggs, which is a neutral statement, not a judgment of the condition. It
would be stupid and unreasonable for me to imply that Debian's policy must
be updated to include eggs, as setuptools is alpha software that is very
much still in development. Which is why it isn't me who approached the
Debian developers about this, as opposed to the other way around. However,
once contacted about the matter, I'm certainly going to point out that
ignoring the existence of eggs and their likely rapid increase in
popularity (e.g. TurboGears claims 40,000 eggs served) is also unreasonable.)
>>>I would expect that you can "unegg" a project.
>>
>>For projects that make use of eggs, you expect wrong. Try it with
>>setuptools, and you will find that it is unable to even run its own
>>tests, because the "test" command is registered via an entry point.
>
>I would have to rewrite the code, of course. I do all registration
>that needs to be done in __init__.py
That registration can't be done until a package is imported, so even if you
did the significant patching this would require, your effort will fail as
soon as you bring extensions into the picture, such as buildutils or
SQLObject, as I already explained.
>>Entry points are just one kind of project metadata that can be
>>registered; other projects like Trac and SQLObject have their own kinds
>>of metadata as well. None of this metadata is accessible without the
>>EGG-INFO or .egg-info directory; removing it is like removing the
>>JavaBean metadata or the deployment descriptors from Java jars, rendering
>>the jar useless in many contexts, despite the fact that all the "code" remains.
>
>Sure, *just* removing it would be wrong. I have to replace it with
>Python code.
Which will *never be imported* and will therefore never execute, because
the project it needs to *plug into* won't know it exists. A project "foo"
that extends the functionality of project "bar" can't be statically known
about by project "bar". The dependency is that foo requires bar, but bar
must be able to "discover" at runtime that foo exists.
The idea is that project "bar" can be extensible by other projects, by
providing entry point groups that other projects can add themselves to (via
published metadata). These other projects do not need to be imported; they
are found by their metadata, which describes them as offering entry points
in the "bar"-supplied entry point groups. Thus, new projects like "foo"
can hook in to the infrastructure provided by "bar".
For example, SQLObject and buildutils are project "foo" with respect to
setuptools; setuptools doesn't depend on them, or know about their
existence a priori. But their mere presence on sys.path (or more
precisely, the presence of egg metadata in well-defined locations relative
to sys.path entries) is enough to allow setuptools to find them.
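To make the discovery idea concrete, here is a minimal sketch that finds plugins purely from metadata files, without importing any plugin code. It is not the pkg_resources implementation: the function names, the group name "bar.commands", the project "foo-1.0", and the assumption that each project has an 'entry_points.txt' file in INI style inside its .egg-info directory are all illustrative.

```python
import configparser
import os
import tempfile


def find_entry_points(search_path, group):
    """Collect entry points declared for `group` under each directory in
    `search_path`, by reading metadata files only (nothing is imported)."""
    found = {}
    for directory in search_path:
        if not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if not entry.endswith(".egg-info"):
                continue
            # Assumed layout: <dir>/<project>-<version>.egg-info/entry_points.txt
            metadata_file = os.path.join(directory, entry, "entry_points.txt")
            if not os.path.isfile(metadata_file):
                continue
            parser = configparser.ConfigParser(interpolation=None)
            parser.optionxform = str  # keep entry point names case-sensitive
            parser.read(metadata_file)
            if parser.has_section(group):
                for name, target in parser.items(group):
                    found[name] = target
    return found


def demo():
    """Build a throwaway directory holding one hypothetical plugin project
    "foo" that registers itself in the group offered by project "bar"."""
    with tempfile.TemporaryDirectory() as site_dir:
        egg_info = os.path.join(site_dir, "foo-1.0.egg-info")
        os.mkdir(egg_info)
        with open(os.path.join(egg_info, "entry_points.txt"), "w") as handle:
            handle.write("[bar.commands]\nhello = foo.commands:hello\n")
        return find_entry_points([site_dir], "bar.commands")
```

In this sketch "bar" only needs to know the group name it offers. The module named by each entry (here the hypothetical 'foo.commands') would be imported later, only when the entry point is actually used.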
The "Trac" web-based project management application is an example of
project "bar" - it offers a sophisticated plugin capability to allow people
to customize its database, web interface, and so on. The mere existence of
a plugin project on sys.path, or its presence in the Trac plugins
directory, is sufficient to allow that project's code to be *dynamically
imported* on an as-needed basis whenever a particular notification hook is
invoked.
These things are not practical without some kind of metadata. You cannot
simply replace the metadata with code, because the code has to be imported,
which means that you would have to import every module and package on
sys.path in order to be sure you found all the metadata.
>>The only projects that can be "unegged", then, are ones that no egg
>>project depends on, and which do not themselves depend on any eggs. The
>>number of projects that are not depended on by other projects will be
>>smaller and smaller over time, as will the number that do not depend on
>>other eggs.
>
>Define "depends on". If this is "imports", I don't see a problem with
>unegging the package.
As you said, a false proposition implies any conclusion. It is you who is
assuming "depends on" means "imports". Plugins are the simplest example of
a "depends on" that goes beyond importing.
>>In essence, trying to work around the absence of egg metadata is a
>>bottomless pit, because over time there will be an ever-increasing amount
>>of functionality in the field that is based on the use of metadata.
>
>That is really sad.
Yes, we should all go back to C like real programmers. :) No, wait, then
we would have to deal with all those messy .h files. But who needs
interfaces and metadata like argument types? We should just put the memory
addresses of the functions directly in our code, because then there will be
fewer processing steps and we won't have all those .h files messing up the
place. Plus, that whole concept of a "linker" seems awfully fragile to
me. Who knows what address it might put my code at? Besides, I don't need
a linker if I only use the code that I write, and those people who use
other people's code are obviously just too lazy to write their own or even
copy and paste it. Can you imagine? :)
>>>I would add the complaint:
>>>- it increases sys.path for no good reason.
>>
>>It is only true that it increases the length in the case of the two .egg
>>forms, not the .egg-info form.
>
>Ok, then I think this is what Debian should use.
Great! At least we are making some progress here. For non-setuptools
packages (like ElementTree), it will suffice to place an empty
'projectname-version.egg-info' file or directory in site-packages alongside
the installed package. I will modify setuptools 0.6a9 to parse the version
from the file or directory name, and to accept a file instead of a
directory. (Currently, it requires a PKG-INFO file inside an .egg-info
directory and parses the Version: header from PKG-INFO.)
If Debian adds this metadata marker for its non-setuptools Python packages,
then the Python packages will be "eggs" in the sense that other eggs will
be able to discover them via the pkg_resources API, and thus TurboGears
users will be able to use the Debian-supplied versions of ElementTree and
the like.
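As a rough illustration of what parsing the version from the file or directory name could look like, here is a small sketch. The function name is hypothetical, the version number in the usage note is made up, and it assumes the name portion never contains a '-' (which holds if names are escaped as described in this message). It is not the code setuptools uses.

```python
import os


def parse_egg_info_name(filename):
    """Split 'projectname-version.egg-info' into (projectname, version).

    Returns None when the name does not follow that pattern.
    """
    base = os.path.basename(filename)
    suffix = ".egg-info"
    if not base.endswith(suffix):
        return None
    stem = base[:-len(suffix)]
    # The first '-' separates the project name from the version.
    name, separator, version = stem.partition("-")
    if not separator or not name or not version:
        return None
    return name, version
```

For instance, parse_egg_info_name("ElementTree-1.2.6.egg-info") gives ("ElementTree", "1.2.6"), whether that path is an empty file or a directory.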
Note, however, that the 'projectname-version' string has some precise
escaping rules; the distutils are quite inconsistent about their processing
of names and escaping, so I had to devise more specific rules for
setuptools, because setuptools has to actually *use* the project names and
versions, and parse them out of filenames:
1. The project name in a file or directory name is the setup(name=...)
argument, with all runs of one or more non-alphanumeric characters replaced
with '_'. (Note that this means there is never more than one '_' in a row
in the filename.) So a project like "FooBar Tools" or "FooBar-Tools" would
become "FooBar_Tools" in the filename.
2. The rules for the version are the same as for the name, *except* that
the '.' character is allowed to remain unescaped, and spaces are converted
to '.' before compacting non-alphanumeric runs. So, version '1.2 rc5'
becomes '1.2.rc5', while '1.2-pl5' becomes '1.2_pl5'.
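The two rules above can be sketched in a few lines. The function names are hypothetical and "alphanumeric" is assumed to mean ASCII letters and digits; this follows the rules as stated here rather than quoting the setuptools source.

```python
import re


def escape_name(name):
    """Rule 1: replace each run of non-alphanumeric characters with '_'."""
    return re.sub(r"[^A-Za-z0-9]+", "_", name)


def escape_version(version):
    """Rule 2: convert spaces to '.', then replace each run of characters
    that are neither alphanumeric nor '.' with '_'."""
    version = version.replace(" ", ".")
    return re.sub(r"[^A-Za-z0-9.]+", "_", version)


def egg_info_filename(name, version):
    """Build the 'projectname-version.egg-info' marker name."""
    return "%s-%s.egg-info" % (escape_name(name), escape_version(version))
```

With these definitions, egg_info_filename("FooBar Tools", "1.2 rc5") yields 'FooBar_Tools-1.2.rc5.egg-info', and escape_version("1.2-pl5") yields '1.2_pl5', matching the examples above.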
>>The "no good reason" part is an interesting opinion, although in my view
>>it is rather narrow-minded. Being able to support multi-version
>>importing is a very good reason indeed, as is avoiding the need for a
>>platform-specific package management tool in order to manage Python projects.
>
>I don't see why multi-version support necessarily requires to
>increase sys.path. In the case of eggs, version dependencies are
>expressed explicitly in the code (through require() calls),
Actually, they're expressed in the egg metadata, and the wrappers on a
project's scripts execute the require() calls, so that the code doesn't
have to contain explicit require() calls except for more-dynamic
situations, such as plugins and "optional extra features" that require
additional projects to be present.
> so
>that essentially replace the standard Python import search algorithm.
>Because of that, you could have a default version inside site-packages,
>and additional versions elsewhere, only found when require() is
>called.
That's correct, and setuptools actually supports that scenario, but it
doesn't currently provide tools for creating that arrangement on disk,
since the "default version" you propose would be hard to manage without an
external packaging tool, like Debian. (The proposed addition for 0.6a9
would be to make it possible to install such a thing, for use with external
packaging tools.)
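A toy model of that arrangement, for illustration only: a default version is importable from site-packages with no path changes, and other versions live in a separate directory that is added to the search path only on request. The function signature, the '<project>-<version>.egg' directory naming, and the exact-version matching are simplifying assumptions; the real require() accepts richer requirement specifications.

```python
import os


def require(project, version, default_versions, extra_dir, path):
    """Make `project` at `version` importable.

    `default_versions` maps project names to the version installed in
    site-packages; `path` is a sys.path-like list that is modified in place.
    Returns the directory that was activated, or None if the default
    version already satisfies the request.
    """
    if default_versions.get(project) == version:
        return None  # a plain "import" already finds the default version
    # Assumed layout for non-default versions: <extra_dir>/<project>-<version>.egg
    candidate = os.path.join(extra_dir, "%s-%s.egg" % (project, version))
    if not os.path.isdir(candidate):
        raise LookupError("%s %s is not installed" % (project, version))
    if candidate not in path:
        path.insert(0, candidate)  # searched before the default version
    return candidate
```

The point of the sketch is that the search path grows only when a non-default version is explicitly requested, so the default case costs nothing extra.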
Note that setuptools is in release 0.6a8 at the moment - it is obviously
not a polished product, but it provides enough functionality to be quite
useful to many Python developers. To this point, directly working on
integration with external packaging tools has not been a focus, although I
have always given top priority to responding to questions and requests from
people working on integration with those tools (e.g. the volunteers who
worked on easy_deb and the Gentoo stuff). I can't reasonably learn the
technical details of every packaging system, so it is best to let
volunteers familiar with individual packaging systems tell me what they
need in order to effectively wrap the system.
Up until now, my interactions with such volunteers have been most pleasant
and positive. To my knowledge, it's not usual for packaging system
developers to spew FUD at a project and look for ways to exclude or break
the work of developers who've chosen to use it. I'm therefore more than a
little surprised by some of the attitude I've received. I hope, though,
that we can get past that soon, if only because it means I'll have more
time to work on implementing and documenting whatever the resolution is. ;)