[Distutils] PEP 426 moved back to Draft status

Daniel Holth dholth at gmail.com
Fri Mar 10 16:03:20 EST 2017


You lost me a bit at 'extra sets'. FYI it is already possible to depend on
your own extras in another extra.

Extra pseudo code:
spampackage
extra['spam'] = 'spampackage[eggs]'
extra['eggs'] = ...
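
The pseudo code above can be sketched in runnable form, assuming the extras live in a plain dict shaped like setuptools' extras_require. The dependency names ("bacon", "omelette") and the helper expand_extra are hypothetical and only show how a self-referencing extra could be flattened; they are not part of any tool's API.

```python
import re

# Hypothetical extras table for a project named "spampackage".
# The "spam" extra pulls in the project's own "eggs" extra.
PROJECT = "spampackage"
EXTRAS = {
    "spam": ["spampackage[eggs]", "bacon"],
    "eggs": ["omelette>=1.0"],
}

# Matches a requirement on this project's own extras, e.g. "spampackage[eggs]".
_SELF_REF = re.compile(r"^%s\[([^\]]+)\]$" % re.escape(PROJECT))


def expand_extra(name, extras=EXTRAS, _seen=None):
    """Return the flat list of third-party requirements for one extra."""
    seen = set() if _seen is None else _seen
    if name in seen:  # guard against extras that reference each other
        return []
    seen.add(name)
    result = []
    for req in extras[name]:
        match = _SELF_REF.match(req)
        if match:
            # A self-reference: recurse into each extra named in the brackets.
            for inner in match.group(1).split(","):
                result.extend(expand_extra(inner.strip(), extras, seen))
        else:
            result.append(req)
    return result
```

With this table, asking for the "spam" extra yields the "eggs" dependencies followed by the ones "spam" lists directly.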

+1 on extras. The extras feature has the wonderful property that people
understand it. Lots of projects have a 'test' extra instead of
tests_require for example, and you don't have to look up how to install
them.

On Fri, Mar 10, 2017 at 1:14 PM Brett Cannon <brett at python.org> wrote:

On Fri, 10 Mar 2017 at 07:56 Nick Coghlan <ncoghlan at gmail.com> wrote:

On 11 March 2017 at 00:52, Nathaniel Smith <njs at pobox.com> wrote:

On Fri, Mar 10, 2017 at 1:26 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Hi folks,
>
> After a few years of dormancy, I've finally moved the metadata 2.0
> specification back to Draft status:
>
> https://github.com/python/peps/commit/8ae8b612d4ea8b3bf5d8a7b795ae8aec48bbb7a3

We have lots of metadata files in the wild that already claim to be
version 2.0. If you're reviving this I think you might need to change
the version number?


They're mostly in metadata.json files, though. That said, version numbers
are cheap, so I'm happy to skip straight to 3.0 if folks think it makes
more sense.


+1 on jumping.




> Based on our last round of discussion, I've culled a lot of the complexity
> around dependency declarations, cutting it back to just 4 pre-declared
> extras (dev, doc, build, test),

I think we can drop 'build' in favor of pyproject.toml?


No, as that's a human edited input file, not an output file from the sdist
generation process.


Actually all of the pre-declared extras are really relevant for sdists
rather than wheels. Maybe they should all move into pyproject.toml?


Think "static release metadata in an API response from PyPI" for this
particular specification, rather than something you'd necessarily check
into source control.


Or "stuff PyPI has to parse, not you". ;)


That's actually one of the big benefits of doing this after pyproject.toml:
with that taking care of the build system bootstrapping problem, it frees up
pydist.json to be entirely an artifact of the sdist generation process (which
is then copied along to the wheel archives and the installed package as
well).

That said, that's actually an important open question: is pydist.json
always preserved unmodified through the sdist->wheel->install and
sdist->install process?


Is there a reason not to?



There's a lot to be said for treating the file as immutable, and instead
adding *other* metadata files as a component moves through the distribution
process. If so, then it may actually be more appropriate to call the
rendered file "pysdist.json", since it contains the sdist metadata
specifically, rather than arbitrary distribution metadata.


Since this is meant for tool consumption and not human consumption,
breaking the steps into individual files so that they are considered
immutable by tools farther down the toolchain makes sense to me.




> and some reserved extras that can be used to
> say "don't install this, even though you normally would" (self, runtime).

Hmm. While it's not the most urgent problem we face, I really think in
the long run we need to move the extras system to something like:

    https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html

The current extras system is inherently broken with respect to
upgrades, and reified extras would solve this, along with several
other intractable problems (e.g. numpy ABI tracking).

So from that perspective, I'm wary of adding new special case "magic"
to the extras system. Adding conventional names for things like
test-dependencies is fine, that doesn't pose any new obstacles to a
future migration. But adding complexity to the "extras language" like
"*", "self", "runtime", etc. does make it harder to change how extras
work in the future.


Technically the only part of that which the PEP really locks in is barring
the use of "self" and "runtime" as extras names (which needs to be
validated by a check against currently published metadata to see if anyone
is already using them).


Do you have something planned for these names?



'*' is already illegal due to the naming rules, and the '-extra' syntax is
also an illegal name, so neither of those actually impacts the metadata
format, only what installation tools allow. The main purpose of having them
in the PEP is to disallow using those spellings for anything else and
instead reserve them for the purposes described in the PEP.
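
A check along those lines might look like the sketch below. The reserved names come from the discussion above; the naming pattern is a simplified assumption rather than the PEP's actual rule, and the {project: [extras]} input shape and both function names are hypothetical.

```python
import re

# Names the draft reserves; no project may declare these as extras.
RESERVED_EXTRAS = {"self", "runtime"}

# Assumed, simplified naming rule: alphanumeric at both ends, with
# ".", "_" or "-" allowed in between. The real rule lives in the PEP.
_VALID_NAME = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")


def classify_extra(name):
    """Return 'reserved', 'illegal', or 'ok' for a declared extra name."""
    if name.lower() in RESERVED_EXTRAS:
        return "reserved"
    if not _VALID_NAME.match(name):
        return "illegal"
    return "ok"


def find_conflicts(published):
    """Map project name -> sorted reserved extras it already declares.

    `published` is a hypothetical {project: [extra names]} mapping, e.g.
    built from a dump of currently published metadata.
    """
    conflicts = {}
    for project, extras in published.items():
        clashes = sorted(e for e in extras if classify_extra(e) == "reserved")
        if clashes:
            conflicts[project] = clashes
    return conflicts
```

Under the assumed pattern, both '*' and a leading-hyphen name such as '-test' are rejected as ordinary extra names, which is what leaves those spellings free for installers to interpret.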

I'd also be fairly strongly opposed to converting extras from an optional
dependency management system to a "let multiple PyPI packages target the
same site-packages subdirectory" because we already know that's a nightmare
from the Linux distro experience (having a clear "main" package that owns
the parent directory with optional subpackages solves *some* of the
problems, but my main reaction is still "Run awaaay").

It especially isn't needed just to solve the "pip forgets what extras it
installed" problem - that technically doesn't even need a PEP to resolve,
it just needs pip to drop a pip specific file into the PEP 376 dist-info
directory that says what extras to request when doing future upgrades.
Similarly, the import system offers so much flexibility in checking for
optional packages at startup and lying about where imports are coming from
that it would be hard to convince me that installation customisation to use
particular optional dependencies *had* to be done at install time.
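
As a rough sketch of the "drop a file into the dist-info directory" idea: the file name requested-extras.json and all three helpers below are hypothetical (pip does not define them); they only illustrate recording the extras at install time and reading them back when upgrading.

```python
import json
import os

# Hypothetical file name; pip does not define this file.
EXTRAS_FILE = "requested-extras.json"


def record_extras(dist_info_dir, extras):
    """Store the extras requested at install time inside a dist-info dir."""
    path = os.path.join(dist_info_dir, EXTRAS_FILE)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(set(extras)), f)


def load_extras(dist_info_dir):
    """Return the recorded extras, or an empty list if none were recorded."""
    path = os.path.join(dist_info_dir, EXTRAS_FILE)
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return []


def upgrade_requirement(project, dist_info_dir):
    """Build the requirement string an upgrade should ask for."""
    extras = load_extras(dist_info_dir)
    return "%s[%s]" % (project, ",".join(extras)) if extras else project
```

An installer following this approach would call record_extras once after installation and upgrade_requirement before each later upgrade, so the originally requested extras are not forgotten.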


I feel like most of the value we get out of these could be had by just
standardizing the existing convention that packages should have an
explicit "all" extra that includes all the feature-based extras,


That's the first I've heard of that convention, so it may not be as
widespread as you thought it was :)


but
not the special development extras? This also provides flexibility for
cases like, a package where there are two extras that conflict with
each other -- the package authors can pick which one they recommend to
put into "all".


That's actually the main problem I had with '*' - it didn't work anywhere
near as nicely once the semantic dependencies were migrated over to being
part of the extras system.

Repeating the same dependencies under multiple extra names in order to
model pseudo-sets seems error prone and messy to me, though.

So perhaps we should add the notion of "extra_sets" as a first class
entity, where they're named sets of declared extras? And if you don't
declare an "all" set explicitly, you get an implied one that consists of
all your declared extras.


I think that's a tool decision that doesn't tie into the PEP (unless you're
going to ban the use of the name "all").



For migration of existing metadata that uses "all" as a normal extra, the
translation would be:

- declared extras are added to "all" in order until all of the dependencies
in "all" are covered or all declared extras are included
- any dependency in "all" that isn't in another extra gets added to a new
"_all" extra
- "extras" and "extra_sets" are populated accordingly
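
One possible reading of these steps, as a sketch only: the function name is hypothetical, and skipping extras whose dependencies are not all listed in the legacy "all" is an added assumption, so that the resulting set never installs more than the original "all" did.

```python
def migrate_all_extra(extras):
    """Split a legacy "all" extra into declared extras plus an "all" set.

    `extras` maps extra name -> list of dependencies, in declaration order
    (dicts preserve insertion order on Python 3.7+).
    Returns (new_extras, extra_sets).
    """
    new_extras = {k: list(v) for k, v in extras.items() if k != "all"}
    if "all" not in extras:
        return new_extras, {}  # nothing to migrate
    wanted = set(extras["all"])
    members, covered = [], set()
    for name, deps in new_extras.items():
        if covered >= wanted:
            break  # every dependency of "all" is already covered
        # Assumption of this sketch: skip extras that would pull in
        # dependencies the original "all" never listed.
        if set(deps) <= wanted:
            members.append(name)
            covered |= set(deps)
    leftover = [d for d in extras["all"] if d not in covered]
    if leftover:
        # Dependencies only "all" mentioned move to a new "_all" extra.
        new_extras["_all"] = leftover
        members.append("_all")
    return new_extras, {"all": members}
```

For example, a project declaring "test" and "doc" plus an "all" extra that also lists one additional dependency would end up with an "_all" extra holding that one dependency and an "all" set naming "test", "doc" and "_all".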

Tools consuming the metadata would then just need to read "extra_sets" and
expand any named sets before passing the list of extras over to their
existing dependency processing machinery.
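
On the consuming side, the expansion step could be as small as the following sketch. The "extras" and "extra_sets" names follow the proposal above; the function name and the argument shapes (a list of declared extras and a {set name: [extras]} mapping) are assumptions for illustration.

```python
def expand_requested(requested, extras, extra_sets=None):
    """Replace named sets in `requested` with the extras they contain."""
    sets = dict(extra_sets or {})
    # If no "all" set is declared, imply one covering every declared extra.
    sets.setdefault("all", list(extras))
    expanded = []
    for name in requested:
        # A name that is not a set is treated as a plain extra.
        for extra in sets.get(name, [name]):
            if extra not in expanded:  # keep order, drop duplicates
                expanded.append(extra)
    return expanded
```

The returned list contains only ordinary extra names, so it can be handed to existing dependency processing code unchanged.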


If this is meant to be generated by pyproject.toml consumers then I think
it should be up to the build tools to support that concept. Then the build
tools can statically declare the union of some extras to get extra sets
since the information isn't changing once the pydist.json file is generated
(dynamic calculation is only necessary if the value could change between
data generation and consumption).



> I've also deleted a lot of the text related to things that we now don't need
> to worry about until the first few standard metadata extensions are being
> defined.
>
> I think the biggest thing it needs right now is a major editing pass from
> someone that isn't me to help figure out which explanatory sections can be
> culled completely, while still having the specification itself make sense.
>
> From a technical point of view, the main "different from today" piece that
> we have left is the Provides & Obsoleted-By fields, and I'm seriously
> wondering if it might make sense to just delete those entirely for now, and
> reconsider them later as a potential metadata extension.

Overall the vibe I get from the Provides and Obsoleted-By sections is
that these are surprisingly complicated and could really do with their
own PEP, yeah, where the spec will have room to breathe and properly
cover all the details.

In particular, the language in the "provides" spec about how the
interpretation of the metadata depends on whether you get it from a
public index server versus somewhere else makes me really nervous.


Yeah, virtual provides are a security nightmare on a public index server -
distros are only able to get away with it because they maintain relatively
strict control over the package review process.


Experience suggests that splitting up packaging PEPs is basically
never a bad idea, right? :-)


Indeed :)

OK, I'll put them on the chopping block too, under the assumption they may
come back as an extension some day if it ever makes it to the top of
someone's list of "things that bother them enough about Python packaging to
do something about it".


As a general note I guess I should say that I'm still not convinced
that migrating to json is worth the effort, but you've heard those
arguments before and I don't have anything new to add now, so :-).


The main benefit I see will be to empower utility APIs like distlib (and
potentially Warehouse itself) to better hide both the historical and
migratory cruft by translating everything to the PEP 426 format, even if
the source artifact only includes the legacy metadata. Unless the plumbing
actually breaks, nobody other than the plumber cares when it's a mess, as
long as the porcelain is shiny and clean :)

Cheers,
Nick.

P.S. Something I'm getting out of this experience: if you can afford to sit
on your hands for 3-4 years, that's a *really good way* to avoid falling
prey to "second system syndrome" [1] :)

P.P.S. Having no budget to pay anyone else and only limited time and
attention of your own also turns out to make it easier to avoid ;)


Yes, getting to stew on an idea for any length of time lets those random
ideas one gets to properly die when they are bad. ;)