On 8 April 2017 at 03:17, Nick Coghlan
PyPI already has a reasonably extensive component tagging system in https://pypi.python.org/pypi?%3Aaction=list_classifiers but we don't really *use* it all that much for programmatic purposes.
That means the incentives for setting tags correctly are weak, since there isn't much pay-off in the form of tooling-intermediated communication of constraints and testing limitations to end users.
What I'm starting to wonder is whether or not it may make sense to start recommending that installation tools emit warnings in the following cases:
1. At least one "Operating System" tag is set, but the tags don't include any that cover the *current* operating system
2. At least one "Programming Language :: Python" tag is set, but the tags don't include any that cover the *current* Python version
The "at least one relevant tag is set" pre-requisite would be to avoid emitting false positives for projects that don't provide any platform compatibility guidance at all.
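To make the two conditions concrete, here is a minimal sketch of what such a check might look like. Nothing here is an existing installer API: the function name compatibility_warnings, the OS_COVERAGE mapping from platform.system() values to classifier prefixes, and the rule that a bare major-version tag such as "3" only counts when no minor-specific tags are given for that major are all assumptions made for illustration.

```python
import platform
import re
import sys

OS_PREFIX = "Operating System :: "
OS_INDEPENDENT = "Operating System :: OS Independent"
PY_PREFIX = "Programming Language :: Python :: "
VERSION_RE = re.compile(r"^(\d+)(?:\.(\d+))?$")

# Assumed mapping from platform.system() to classifier prefixes that "cover" it.
# A real tool would need an agreed, more complete table.
OS_COVERAGE = {
    "Linux": ("Operating System :: POSIX", "Operating System :: Unix"),
    "Darwin": ("Operating System :: MacOS", "Operating System :: POSIX",
               "Operating System :: Unix"),
    "Windows": ("Operating System :: Microsoft",),
}


def compatibility_warnings(classifiers, system=None, version=None):
    """Return warning strings for the two proposed cases (hypothetical helper)."""
    system = system or platform.system()
    major, minor = version or sys.version_info[:2]
    messages = []

    # Case 1: OS tags are set, but none covers the current operating system.
    # Systems missing from OS_COVERAGE are skipped to avoid false positives.
    os_tags = [c for c in classifiers if c.startswith(OS_PREFIX)]
    accepted = OS_COVERAGE.get(system)
    if os_tags and accepted and OS_INDEPENDENT not in os_tags:
        if not any(tag.startswith(accepted) for tag in os_tags):
            messages.append(f"no Operating System classifier covers {system}")

    # Case 2: Python version tags are set, but none covers the current version.
    majors, exact = set(), set()
    for c in classifiers:
        match = VERSION_RE.match(c[len(PY_PREFIX):]) if c.startswith(PY_PREFIX) else None
        if match is None:
            continue  # e.g. "3 :: Only" or "Implementation :: CPython"
        if match.group(2) is None:
            majors.add(int(match.group(1)))
        else:
            exact.add((int(match.group(1)), int(match.group(2))))
    if majors or exact:
        # Assumption: a bare major tag counts only if no minor tags exist for it.
        major_only = major in majors and not any(m == major for m, _ in exact)
        if (major, minor) not in exact and not major_only:
            messages.append(f"no Python classifier covers {major}.{minor}")

    return messages
```

A project with no relevant tags produces no warnings, which is the "at least one relevant tag is set" pre-requisite in action.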
I agree that there's little incentive at the moment to get classifiers right. So my concern with this proposal would be that it issues the warnings to end users, who don't have any direct means of resolving the issue (they can of course raise bugs on the projects they find with incorrect classifiers). Furthermore, there's a potential risk that projects might see classifiers as implying a level of support they are not happy with, and so are reluctant to add classifiers "just" to suppress the warning.

But without data, the above is just FUD, so I'd suggest we do some analysis. I did some spot checks, and it seems that projects might typically not set the OS classifier, which alleviates my biggest concern (projects stating "POSIX" because that's what they develop on, when they actually work fine on Windows) - but proper data would be better. Two things I'd like to see:

1. A breakdown of how many projects actually use the various OS and Language classifiers.
2. Where projects ship wheels, do the wheels they ship match the classifiers they declare?

That should give a good idea of the immediate impact of this proposal. (There's not much we can say about source-only distributions, but that's OK.)

The data needed to answer those questions should be available - the only way I have of getting it is via the JSON interface to PyPI, so I can write a script to collect the information, but it might be some time before I can collate it. Is this something the BigQuery data we have (which I haven't even looked at myself) could answer?

Paul
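
For reference, a rough sketch of the kind of script the JSON interface would allow for the two questions above. It is only an outline under stated assumptions: the helper names (fetch_project, count_classifiers, wheel_tags) are made up, the base URL is a parameter whose default is assumed, and the code relies on the per-project JSON document having an "info" mapping with a "classifiers" list and a "urls" list whose entries carry "packagetype" and "filename". Wheel tags are read from the filename following the PEP 427 naming scheme.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Only OS and Python language classifiers are of interest here.
PREFIXES = ("Operating System ::", "Programming Language :: Python")


def fetch_project(name, base="https://pypi.org/pypi"):
    """Fetch one project's JSON metadata (needs network; base URL is assumed)."""
    with urlopen(f"{base}/{name}/json") as response:
        return json.load(response)


def count_classifiers(projects):
    """Count how many projects declare each OS / Python classifier."""
    counts = Counter()
    for data in projects:
        declared = set(data["info"].get("classifiers") or [])
        counts.update(c for c in declared if c.startswith(PREFIXES))
    return counts


def wheel_tags(data):
    """Return the (python tag, platform tag) pairs of the wheels listed in "urls"."""
    tags = set()
    for entry in data.get("urls", []):
        if entry.get("packagetype") == "bdist_wheel":
            # name-version[-build]-pythontag-abitag-platformtag.whl
            parts = entry["filename"][:-len(".whl")].split("-")
            tags.add((parts[-3], parts[-1]))
    return tags
```

Comparing the output of wheel_tags with the declared classifiers for each project would then need a mapping between platform tags and OS classifiers, which is left open here.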