[Distutils] Environment marker expression types

Fri Apr 26 08:05:18 CEST 2013

On Thu, Apr 25, 2013 at 9:50 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Note that == and != don't emit TypeError with non-comparable types -
> if both sides return NotImplemented from the corresponding magic
> methods, then the comparison is just False for == and True for !=.
> It's only the ordering comparisons that now emit TypeError rather than
> attempting to guess an appropriate answer.

Ah.  Ok, good.
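
For readers who want to check the behaviour Nick describes, here is a
minimal sketch, assuming Python 3 and two throwaway classes (A and B
are hypothetical names) that define no comparison methods at all:

```python
class A(object):
    pass


class B(object):
    pass


a, b = A(), B()

# Neither class defines __eq__, so both sides yield NotImplemented and
# Python falls back to an identity check: distinct objects are unequal.
eq_result = (a == b)
ne_result = (a != b)

# Ordering comparisons have no such fallback on Python 3.
try:
    a < b
    ordering_raises = False
except TypeError:
    ordering_raises = True
```

The same holds for built-in types that don't know about each other,
e.g. comparing an int with a str.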

> The main advantage of combining the systems is that it allows the
> extras to *also* participate in the environment marker system (for
> example, only adding certain dependencies if you're on Windows *and*

I think you accidentally a word, there.  ;-)

ISTM that it isn't necessary for 'extra' to be part of environment
markers in order to allow extras to have environment markers; as long
as environment markers are part of requirements, and extras define
requirements, that's more than sufficient.  ISTM that extras in
environment markers is a kludge to avoid special syntax for extras,
and if we're switching to a structured format, there's no real problem
with having a dedicated structure for extras.

(Incidentally, this would also improve performance, since it would
then not be necessary to parse all conditionals in order to find out
what extras are available!)

> While working on the latest PEP 426 update, I've actually been
> pondering the problem of how to handle the environment marker system
> in general. Specifically, I *don't* really want anyone to need to
> check any environment markers in the metadata passed to the
> post-install hook, or even in the metadata recorded in the
> installation database. The installer should be resolving all of those
> at install time. However, then you get the interesting question of
> what happens if you share an installation database across
> heterogeneous machines via a network file system, implying that either
> these things *have* to be evaluated at run time, solely in order to
> accommodate that niche use case, or else we need to constrain that use
> case by saying that it is up to the people setting it up to ensure
> that the environments are *sufficiently* similar for things to keep
> working.

I'd just as soon stick with dynamic evaluation; in fact I've just
implemented that in a setuptools dev branch today.  Basically, I extended the
syntax of requires.txt to allow sections to be named e.g.
"[ssl:python_version in '2.3, 2.4, 2.5']".  That is, "[extra:marker]",
where 'extra' can be blank to designate non-extra conditional
requirements, and an extra can have multiple sections, with or without
conditionals.
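
To make the "[extra:marker]" section naming concrete, here is a rough
sketch of how such headers could be split.  This is not the code in
the setuptools branch; the function names split_section_name and
parse_requires, and the requirement lines in the usage note, are
assumptions made up for illustration.

```python
def split_section_name(header):
    """Split a header like "[ssl:marker]" into (extra, marker).

    Either part is returned as None when it is blank or absent.
    """
    name = header.strip()
    if not (name.startswith("[") and name.endswith("]")):
        raise ValueError("not a section header: %r" % (header,))
    # Only the first colon separates the extra from the marker.
    extra, _, marker = name[1:-1].partition(":")
    return (extra.strip() or None, marker.strip() or None)


def parse_requires(text):
    """Group requirement lines by their (extra, marker) section.

    Lines before any header land under (None, None), i.e. the
    unconditional, non-extra requirements.
    """
    sections = {}
    current = (None, None)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            current = split_section_name(line)
            continue
        sections.setdefault(current, []).append(line)
    return sections
```

With this sketch, "[ssl:python_version in '2.3, 2.4, 2.5']" splits
into the extra "ssl" and the marker "python_version in '2.3, 2.4,
2.5'", while "[:sys.platform=='win32']" gives a blank extra (None)
with a marker, and "[ssl]" gives an extra with no marker.  Because the
result is keyed on the (extra, marker) pair, one extra can own several
sections.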

I'm not implementing this as a speculative feature; I'm doing it
because the setuptools SSL support I'm working on has
platform-specific requirements (wincertstore, ctypes) despite
setuptools *itself* being platform-independent.  This is kind of an
ugly hole in the way dependencies work today: you can't sanely package
a platform-independent package with platform-specific dependencies.
So I need some way of doing this in order to ship SSL support in
setuptools 0.6c12 (and of course 0.7+).

As it happens, the code to implement marker handling was short: under
120 lines, for code that works across 2.3-2.7 and will probably work
unchanged for 3.x.  And the evaluation only needs to take place when
dependencies are actually processed, which is pretty darn infrequent
in most programs.
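
As a rough idea of how small a marker evaluator can be, here is a
sketch built on the modern standard-library ast module (Python 3.8+).
It is emphatically not the setuptools implementation, which has to run
on 2.3; the names evaluate_marker and default_environment, and the
particular set of supported variables and operators, are assumptions
for illustration only.

```python
import ast
import operator
import os
import platform
import sys

# Comparison operators this sketch accepts; anything else is rejected.
_COMPARE = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.In: lambda a, b: a in b,
    ast.NotIn: lambda a, b: a not in b,
}


def default_environment():
    """Marker variables for the running interpreter (a partial set)."""
    return {
        "os.name": os.name,
        "sys.platform": sys.platform,
        "platform.machine": platform.machine(),
        "python_version": "%d.%d" % sys.version_info[:2],
    }


def _value(node, env):
    """Resolve a string literal or a known marker variable."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    if isinstance(node, ast.Name):
        key = node.id
    elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        key = "%s.%s" % (node.value.id, node.attr)
    else:
        raise SyntaxError("unsupported operand in marker")
    if key not in env:
        raise SyntaxError("unknown marker variable: %s" % key)
    return env[key]


def _eval(node, env):
    """Evaluate and/or combinations of single comparisons."""
    if isinstance(node, ast.BoolOp):
        results = (_eval(v, env) for v in node.values)
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is None:
            raise SyntaxError("unsupported comparison in marker")
        return op(_value(node.left, env), _value(node.comparators[0], env))
    raise SyntaxError("unsupported marker expression")


def evaluate_marker(marker, env=None):
    """Return True if the marker holds in env (default: this interpreter)."""
    if env is None:
        env = default_environment()
    tree = ast.parse(marker.strip(), mode="eval")
    return bool(_eval(tree.body, env))
```

Since the expression is walked node by node rather than handed to
eval(), anything outside the small whitelist (function calls,
arithmetic, and so on) is rejected with SyntaxError.  Passing an
explicit env makes it possible to ask what a marker would evaluate to
on some other platform or Python version.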

> 5. The installer evaluates the conditional metadata as appropriate
> before writing the metadata out to the installation database and
> invoking the post-install hook. Post-installation, the top level
> metadata includes both the unconditional metadata *and* any
> conditional metadata that applies to the current system.

ISTM that this does away with some of the benefits of e.g. fat wheels
and the ability to share source across multiple Python versions (as
distros sometimes like to do).  It also seems to have very little
gain, since if you are reading the metadata, you've probably already
imported distlib anyway.  ;-)

In general, I favor the minimum possible transformations between
wheels and on-disk format, simply because it means fewer conditional
paths in APIs that need to support both.  If you're going to resolve
the dependencies in a wheel, you have to parse the conditionals, so
again it doesn't save much to not parse them.

If you want to increase parsing efficiency in the presence of
conditionals, I would suggest nesting the conditionals in individual
fields, or at least using a structure like:

    { 'conditionals': {'requires': [....], 'other-field': [...]}}

This means that you only have to parse the conditions for fields
you're actually accessing.  That keeps parsing to a minimum if a given
query is only looking at unconditional fields.
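
A minimal sketch of that lookup pattern follows.  The layout of each
conditional entry (a dict with 'condition' and 'values' keys) and the
get_field helper are assumptions for illustration; the structure above
only fixes the outer 'conditionals' mapping keyed by field name.

```python
def get_field(metadata, field, evaluate):
    """Return the unconditional values of field plus any conditional
    values whose condition holds.

    evaluate is a callable taking a marker string and returning a
    bool.  Only the conditions attached to the requested field are
    ever passed to it.
    """
    values = list(metadata.get(field, []))
    for entry in metadata.get("conditionals", {}).get(field, []):
        if evaluate(entry["condition"]):
            values.extend(entry["values"])
    return values
```

A query for a field with no conditional entries never calls evaluate
at all, and a query for 'requires' leaves the conditions under
'other-field' unparsed.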

In general, I don't really see performance as a big deal, though: on
an old 1.4GHz Athlon running Python 2.x, I can parse and evaluate the
marker "sys.platform=='win32' and python_version in '2.3, 2.4'" more
than 10,000 times per second, and I haven't yet made any serious
attempt at speeding that up.

Conditionals are not all that common (and don't exist *as*
conditionals in the wild as yet), and actual requirement resolution is
not a high-frequency event, especially in a non-egg or buildout-based
environment.  (Buildout hardcodes dependencies in scripts, and non-egg
environments don't usually evaluate *any* dependencies at runtime, so
only package management tools will pay the price of evaluation.)

But of course, this is all based on setuptools' requires.txt approach,
which doesn't need markers in order to know what extras exist.  OTOH,
if you code extras *only* as markers, then it does indeed create a big
(and wholly unnecessary) can of worms with respect to parsing.

So, my suggestion would be to drop the overloading of extras as
markers, and structure all conditional data such that only the
conditionals needed to satisfy a particular inquiry ever need to be
parsed.  This should optimize the common case (infrequent dependency
parsing and infrequent conditionals in a simple unpacked structure)
while still enabling advanced cases such as cross-version or
cross-platform sharing to work correctly.