[Distutils] Optional C extensions in packages

Phillip J. Eby pje at telecommunity.com
Fri Feb 2 00:23:20 CET 2007


At 05:32 PM 2/1/2007 -0500, Jim Fulton wrote:
>I'm still worried about the ambiguous case when there are both
>platform-dependent and platform-independent eggs installed.

How would this happen?

I think you're trying to solve a broader problem than the one I'm trying to 
solve, which is that I'd like to make it possible for people who don't have 
working compilers (i.e. mostly Windows users, plus some Mac users and some 
people in virtual hosting environments) to install packages that contain C 
extensions.

In that scenario, you're going to *always* want to use this option to 
suppress optional extensions, because there isn't any way for you to build 
them.  But, you would presumably still want to know about packages that 
*require* their extensions to be built.
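
A rough sketch of the behavior described above, under stated assumptions: 
`Ext`, `plan_build`, and `suppress_optional` are hypothetical names for 
illustration, not existing distutils or easy_install API.  Optional 
extensions are skipped when the option is on, while required ones stay in 
the build plan so that a missing compiler is still reported for them.

```python
from dataclasses import dataclass


@dataclass
class Ext:
    """Hypothetical stand-in for an extension declared in setup.py."""
    name: str
    optional: bool = False


def plan_build(extensions, suppress_optional=False):
    """Split extensions into those to compile and those to skip.

    Required extensions are always kept in the build list, so a failure
    to compile them still surfaces to the user.
    """
    to_build, skipped = [], []
    for ext in extensions:
        if ext.optional and suppress_optional:
            skipped.append(ext.name)
        else:
            to_build.append(ext.name)
    return to_build, skipped


exts = [Ext("_speedups", optional=True), Ext("_core")]
print(plan_build(exts, suppress_optional=True))
```

With the option on, only `_core` is left to compile; with it off, both 
extensions are attempted as usual.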


>I think you were proposing an easy_install option.  This helps when
>someone installs a distribution directly, but doesn't help when a
>distribution is installed as a dependency.

This would be an option to suppress compiling *all* optional C extensions, 
period.


>   It also doesn't help with
>controlling selection of eggs after installation.  And I think it
>doesn't make it easy to change one's mind.  For example, one might
>install an egg with extensions and then install one without
>extensions to debug a problem using the Python debugger.  Would the
>option let them do that?

The idea was that it would be a build-time option.


>Is it possible to control this as part of the requirement specification?
>Perhaps this could be some kind of standard extra?
>
>I'd strongly prefer to be able to control this via the requirements
>mechanism.  I'd like to be able to say that I want or don't want
>extensions as part of a requirement string.

Yeah, I see the benefit of that, certainly.  The problem is that we're 
trying to solve different problems.  I just want to make it *possible* to 
suppress building extensions during easy_install.

I'll give some more thought to what you're asking for.  I have an inkling 
of an idea, but the problems have to do with things like having to actually 
check the egg's contents to see if it meets requirements, and there are 
problems regarding the need to clean up the build/ directory if you change 
what features you build something with.

You see, setuptools has an undocumented 'feature' mechanism (which is still 
used by some PEAK projects) to control the inclusion of various packages, 
extensions, etc.  The main reason this is undocumented is that it turns 
out to be fragile to specify what features to use or not use on the 
command line alone, because some distutils commands just take whatever's 
in the build/ directory as gospel.

Anyway, that feature mechanism could probably be tied in to the 
requirements system, as long as there was a way to wipe the build/ 
directory whenever the features changed between runs of setup.py, and there 
was a way to list the features in the .egg-info, and pkg_resources was 
changed to query a distribution's "features" info when validating a 
requirement that includes "extras".
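
One way the "wipe the build/ directory whenever the features changed" step 
might look, as a minimal sketch.  The stamp file name `features.json` and 
the function `sync_build_dir` are assumptions made up for this example; 
neither distutils nor setuptools defines them.

```python
import json
import shutil
from pathlib import Path

STAMP_NAME = "features.json"  # hypothetical stamp file, not a distutils convention


def sync_build_dir(build_dir, features):
    """Remove build_dir if it was produced with a different feature set.

    Returns True when stale build output was wiped.
    """
    build_dir = Path(build_dir)
    stamp = build_dir / STAMP_NAME
    wanted = sorted(set(features))
    wiped = False
    if build_dir.exists():
        try:
            previous = json.loads(stamp.read_text())
        except (OSError, ValueError):
            previous = None  # missing or unreadable stamp: treat as stale
        if previous != wanted:
            shutil.rmtree(build_dir)
            wiped = True
    # Record the feature set that the next build will use.
    build_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(json.dumps(wanted))
    return wiped
```

A setup script would call something like this before running any build 
command, so that later commands never see output left over from a run 
with different features.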

I'm a little concerned that this will incur additional disk access under 
various circumstances, unless there is some way to statically distinguish 
between extras that denote "features" and ones that indicate additional 
requirements.  Of course, matching a requirement against a distribution 
when the requirement doesn't list any extras will not incur overhead.
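
The overhead concern can be illustrated with a small sketch.  Here 
`satisfies` and `load_features` are hypothetical; `load_features` stands 
in for whatever disk read would fetch a distribution's feature list from 
its .egg-info.  This is not how pkg_resources works today, only one shape 
the check could take.

```python
def satisfies(requested_extras, load_features):
    """Return True if the distribution was built with every requested extra.

    load_features is a zero-argument callable that performs the (costly)
    metadata read; it is only called when extras were actually requested.
    """
    if not requested_extras:
        return True  # fast path: no extras, so no disk access
    built = set(load_features())
    return set(requested_extras) <= built


reads = []


def fake_loader():
    # Stand-in for reading feature metadata from disk.
    reads.append("read")
    return ["speedups"]


print(satisfies([], fake_loader), len(reads))
print(satisfies(["speedups"], fake_loader), len(reads))
```

Note that this sketch treats every requested extra as a feature; it does 
not address the problem of telling features apart from extras that only 
add requirements.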

I guess we could do something like this for 0.7.  One thing that concerns 
me, however, is that it potentially *increases* the number of conflicts and 
the confusion possible regarding a single egg, unless there's a way to 
include the features in the filename.  You can't tell, just by looking at 
an egg, whether it meets your needs.

In contrast, the benefit of my current proposal is that it's intended 
strictly for those circumstances where the eggs are *supposed to be* 
interchangeable except for platform-specificity and performance, and you 
should be able to at least tell from the filename which kind you have.  In 
the case where we allow other choices of features, you would need some kind 
of tool to tell you what features the egg was built with.
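
To illustrate telling the two kinds apart from the filename: egg names 
follow the pattern name-version-pyX.Y[-platform].egg, with the platform 
part present only for platform-specific eggs.  The regular expression 
below is a simplified approximation written for this example (it is not 
the one pkg_resources uses), and `egg_kind` is a hypothetical helper.

```python
import re

# Simplified approximation of the egg filename convention.
EGG_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-py(?P<pyver>[^-]+)"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def egg_kind(filename):
    """Classify an egg as platform-specific or pure-Python by its name."""
    match = EGG_RE.match(filename)
    if match is None:
        raise ValueError("not a recognizable egg filename: %r" % filename)
    return "platform-specific" if match.group("platform") else "pure-Python"


print(egg_kind("PyProtocols-1.0-py2.5.egg"))
print(egg_kind("PyProtocols-1.0-py2.5-win32.egg"))
```

Nothing in the filename records which other features an egg was built 
with, which is why a separate tool would be needed in that case.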

Maybe another possibility is to have *subprojects* instead, where a 
subproject is something built using the same setup.py, but has a distinct 
project name, like "PyProtocols-CExtensions" or "Twisted-Foo".  By default, 
perhaps such a multi-project setup script would run each subproject with 
its own build directory, and dump multiple eggs or source distributions 
into the dist/ directory.  This might take some munging of EasyInstall to 
support picking up the distributions produced when running bdist_egg, 
but it might be doable.

The principal downsides to this approach are the doubling up of eggs 
involved, and the need to keep a precise match of versions between the 
packages.  In particular, if someone installs a new version of a package 
without its C extensions, and the C extensions still exist for an older 
version, it will end up importing the wrong extensions -- and it will be 
hard to tell what happened and why.  The package will just seem broken.

Sigh.  I guess at this point I don't really see a way to do optional 
extensions that doesn't turn into a crazy madhouse of support later.  It 
seems to me that at least the problems with my approach would at most boil 
down to, "how come this thing is so slow?"  :)


