[Distutils] Idea: Using Trove classifiers for platform compatibility warnings

Nick Coghlan ncoghlan at gmail.com
Sat Apr 8 22:27:28 EDT 2017


On 8 April 2017 at 19:29, Paul Moore <p.f.moore at gmail.com> wrote:
> On 8 April 2017 at 03:17, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> The "at least one relevant tag is set" pre-requisite would be to avoid
>> emitting false positives for projects that don't provide any platform
>> compatibility guidance at all.
>
> I agree that there's little incentive at the moment to get classifiers
> right.

I'll also explicitly note that I think this idea counts as a "nice to
have" - in cases where there are real compatibility problems, those
are going to show up at runtime anyway, so what this idea really
provides is a debugging hint that says "Hey, you know that weird
behaviour you're seeing in <environment>? How sure are you that all of
your dependencies actually support that configuration?"

That said, if folks agree that this idea at least seems plausible, one
outcome is that I would abandon the draft "Supported Environments"
section for the "python.constraints" extension in PEP 459:
https://www.python.org/dev/peps/pep-0459/#supported-environments

While programmatic expressions like that are handy for publishers,
they don't convey the difference between "We expect future Python
versions to work" and "We have tested this particular Python version,
and it does appear to work", and they're also fairly hostile to
automated data analysis, since you need to evaluate expressions in a
mini-language rather than just filtering on an appropriately defined
set of metadata tags.

When it comes to the "Programming Language :: Python" classifiers
though, we already give folks quite a bit of flexibility there:

- no tag or the generic unversioned tag to say "No guidance provided"
- the "PL :: Python :: X" tags to say "definitely supports Python X"
without saying which X.Y versions
- the "PL :: Python :: X.Y" tags to say "definitely supports Python X.Y"

And that flexibility provides an opportunity to let publishers make a
trade-off between precision of information provided (down to just
major version, or specifying both major and minor version) and the
level of maintenance effort (with the more precise approach meaning
always having to make a new release to update the compatibility
metadata for new Python feature releases, even when the existing code
works without any changes, but also meaning you get a way to
affirmatively say "Yes, we tested this with the new version, and it
still works").

We also have the "PL :: Python :: X :: Only" tags, but I think that
may be a misguided approach and we'd be better off with a general
notion of tag negation: "Not :: PL :: Python :: X" (so you'd add a
"Not :: Programming Language :: Python :: 2" tag instead of adding a
"Programming Language :: Python :: 3 :: Only" tag)

> So my concern with this proposal would be that it issues the
> warnings to end users, who don't have any direct means of resolving
> the issue (they can of course raise bugs on the projects they find
> with incorrect classifiers).

We need to be clear about the kinds of end users we're considering
here, though: folks using pip (or similar) tools to do their own
install-time software integration, *not* folks consuming pre-built and
pre-integrated components through conda/apt/dnf/msi/etc.

In the latter cases, the redistributor is taking on the task of making
sure their particular combinations work well together, but when we use
pip (et al) directly, that task falls on us as users, and
it's useful when debugging to know whether what we're doing is a
combination that upstream has already thought about (and is hopefully
covering in their CI setup if they have one), or whether we may be
doing something unusual that most other people haven't tried yet.
While this is also useful info for redistributors to know, I was
thinking in PyPI publisher & pip user terms when the idea occurred to
me.

The concept is based at least in part on my experience as a World of
Warcraft player, where there are two main pieces to their
compatibility handling model for UI Add-ons:

1. Add-on authors tag the add-on itself with the most recent version
of the client API that they've tested it against
2. To avoid having your UI break completely every time the client API
changes, the main game client has a simple "Load Out of Date Addons"
check box to let you opt-in to continue to use add-ons that may not
have been updated for the latest changes to the game's runtime API
(while also clearly saying "Don't complain to Blizzard about any UI
bugs you encounter in this unsupported configuration")

Assuming we do pursue this idea (which is still a big assumption at
this point, due to the "potentially nice to have for debugging in some
situations" motivation being a fairly weak one for volunteer efforts),
I think a sensible way to go would be to have the classifier checking
be opt-in initially (e.g. through a "--check-classifiers" option), and
only consider making it the default behaviour if having it available
as a debugging option seems insufficient.

> Furthermore, there's a potential risk
> that projects might see classifiers as implying a level of support
> they are not happy with, and so are reluctant to add classifiers
> "just" to suppress the warning.

From a client UX perspective, something like the approach used for the
`--no-binary` option would seem reasonable:
https://pip.pypa.io/en/stable/reference/pip_install/#cmdoption-no-binary

That is:

* `--check-classifiers :none:` to disable checks entirely
* `--check-classifiers :all:` to check everything
* `--check-classifiers a,b,c,d` to check key packages you really care
about, but ignore others
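Parsing such a value might look roughly like this (a sketch only;
neither the option nor this parser exists in pip today):

```python
def parse_check_classifiers(value):
    """Interpret a hypothetical --check-classifiers value, mirroring
    the special-casing pip already applies for --no-binary."""
    if value == ":none:":
        return set()        # empty set: checks disabled entirely
    if value == ":all:":
        return None         # None: check every package
    return set(value.split(","))  # explicit list of key packages

print(parse_check_classifiers(":none:"))       # set()
print(parse_check_classifiers("numpy,scipy"))  # {'numpy', 'scipy'}
```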

> But without data, the above is just FUD, so I'd suggest we do some
> analysis. I did some spot checks, and it seems that projects might
> typically not set the OS classifier, which alleviates my biggest
> concern (projects stating "POSIX" because that's what they develop on,
> when they actually work fine on Windows) - but proper data would be
> better. Two things I'd like to see:
>
> 1. A breakdown of how many projects actually use the various OS and
> Language classifiers.
> 2. Where projects ship wheels, do the wheels they ship match the
> classifiers they declare?
>
> That should give a good idea of the immediate impact of this proposal.

I think the other thing that research would provide is guidance on
whether it makes more sense to create *new* tags specifically for
compatibility testing reports rather than attempting to define new
semantics for existing tags. The inference from existing tags would
then solely be a migration step where clients and services could
synthesise the new tags based on old metadata (including things like
`Requires-Python:`).

If we went down that path, it might look like this:

1. Two new classifier namespaces specifically for compatibility
assertions: "Compatible" and "Incompatible"
2. Within each, start by defining two subnamespaces based on existing
classifiers:

    Compatible :: Python :: [as for `Programming Language :: Python ::`]
    Compatible :: OS :: [as for `Operating System :: `]
    Incompatible :: Python :: [as for `Programming Language :: Python ::`]
    Incompatible :: OS :: [as for `Operating System :: `]

Within the "Compatible" namespace the ` :: Only` suffix would be a
modifier to strengthen the "Compatible with this" assertion into a
"almost certainly not compatible with any of the other options in this
category" assertion.

One nice aspect of that model is that it would be readily extensible
to other dimensions of compatibility, like "Implementation" (so
projects that know they're tightly coupled to the C API for example
can add "Compatible :: Implementation :: CPython").

The downside is that it would leave the older "for information only"
classifiers as semantically ambiguous and we'd be stuck permanently
with two very similar sets of classifiers.

> (There's not much we can say about source-only distributions, but
> that's OK). The data needed to answer those questions should be
> available - the only way I have of getting it is via the JSON
> interface to PyPI, so I can write a script to collect the information,
> but it might be some time before I can collate it. Is this something
> the BigQuery data we have (which I haven't even looked at myself)
> could answer?
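As a starting point for that kind of collection script, the
per-project JSON interface does expose the declared classifiers (the
endpoint is real; the helper names here are just illustrative):

```python
import json
from urllib.request import urlopen

def fetch_project_json(name):
    """Download a project's metadata from PyPI's JSON API."""
    with urlopen(f"https://pypi.org/pypi/{name}/json") as resp:
        return json.load(resp)

def classifiers_from_json(data):
    """Pull the classifier list out of a JSON API response."""
    return data.get("info", {}).get("classifiers", [])

# Usage (requires network access):
#     data = fetch_project_json("requests")
#     print(classifiers_from_json(data))
```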

Back when Donald and I were working on PEP 440 and ensuring the
normalization scheme covered the vast majority of existing projects,
we had to retrieve all the version info over XML-RPC:
https://github.com/pypa/packaging/blob/master/tasks/check.py

I'm not aware of any subsequent changes on that front, so I don't
believe we currently push the PKG-INFO registration metadata into
BigQuery. However, I do believe we *could* (if Google are amenable), and
if we did, it would make these kinds of research questions much easier
to answer.

Donald, any feedback on how hard it would be to get the current PyPI
project metadata into a queryable format in BQ?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

