On Sat, 27 Jan 2018 at 09:16 Nathaniel Smith <njs@pobox.com> wrote:
On Fri, Jan 26, 2018 at 7:11 AM, Pradyun Gedam <pradyunsg@gmail.com> wrote:
> Hello! I hope everyone's had a great start to 2018! :)
>
> A few months back, while working on pip, I had noticed an oddity about
> extras.
>
> Installing a package with extras would not store information about the fact
> that the extras were requested. This means, later, it is not possible to
> know which extra-based optional dependencies of a package have to be
> considered when verifying that the packages are compatible with each other.
> This information is relevant for resolution/validation since without it, it
> is not possible to know which extra requirements to care about.
>
> As an example, installing ``requests[security]`` and then uninstalling
> ``PyOpenSSL`` leaves you in a state where you don't really satisfy what was
> asked for but there's no way to detect that either.

Another important use case is upgrades: if requests[security] v1 just
depends on pyopenssl, and then requests[security] v2 adds a dependency
on certifi, and I do

pip install requests[security] == 1
pip upgrade

then upgrade should give me requests[security] == 2, and thus install
certifi. But this doesn't work if you don't have any record that
'requests[security]' is even installed :-).

Yes! Essentially, whenever a package may be modified, we need this
information to ensure it still satisfies the extra's requirements,
which may themselves change when the base package changes.
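
To make the check concrete, here is a minimal sketch, assuming the
requested extras were recorded somewhere at install time. The function
name and the plain dict/set inputs are hypothetical and only for
illustration; this is not how pip does it today.

```python
def missing_extra_dependencies(requested_extras, extra_requirements, installed):
    """Return {extra: [missing dependency names]} for the recorded extras.

    requested_extras:   extras recorded at install time (hypothetical record)
    extra_requirements: extra name -> dependency names, taken from the
                        currently installed version's metadata
    installed:          names of the distributions currently installed
    """
    missing = {}
    for extra in sorted(requested_extras):
        absent = [dep for dep in extra_requirements.get(extra, [])
                  if dep not in installed]
        if absent:
            missing[extra] = absent
    return missing


# requests[security] v2 needs pyopenssl and certifi, but only pyopenssl
# was installed back when v1 was current.
problems = missing_extra_dependencies(
    requested_extras={"security"},
    extra_requirements={"security": ["pyopenssl", "certifi"]},
    installed={"requests", "pyopenssl"},
)
print(problems)
```

Without the recorded set of requested extras, there is nothing to
iterate over, which is exactly the gap being discussed.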
 
> Thus, obviously, I'm interested in making pip able to store this
> information. As I understand it, this needs to be specified in a PEP
> and/or on PyPUG's specification page.
>
> To that end, here's a seed proposal for the discussion: a new
> `extras-requested.txt` file in the .dist-info directory, storing the extra
> names in a one-per-line format.
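
For reference, the one-per-line file proposed above could be read and
written roughly like this. It is only a sketch: the file name comes
from the proposal, while the helper names, the lowercasing, and the
sorted output are assumptions made for illustration, not part of any
spec.

```python
import tempfile
from pathlib import Path

# File name taken from the proposal; not an existing standard.
EXTRAS_FILE = "extras-requested.txt"


def write_requested_extras(dist_info, extras):
    """Record the requested extras, one name per line."""
    # Assumption: names are lowercased, de-duplicated and sorted so the
    # file content is stable across installs.
    names = sorted({e.strip().lower() for e in extras if e.strip()})
    text = "".join(name + "\n" for name in names)
    (Path(dist_info) / EXTRAS_FILE).write_text(text, encoding="utf-8")


def read_requested_extras(dist_info):
    """Return the recorded extras, or an empty set if none were recorded."""
    path = Path(dist_info) / EXTRAS_FILE
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical .dist-info directory, created just for the demo.
        dist_info = Path(tmp) / "demo-1.0.dist-info"
        dist_info.mkdir()
        write_requested_extras(dist_info, ["security"])
        print(read_requested_extras(dist_info))
```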

I'm going to put in another plug here for my "reified extras" idea:
https://mail.python.org/pipermail/distutils-sig/2015-October/027364.html

Essentially, the idea is to promote extras to full packages --
normally ones that contain no files, just metadata like dependencies,
though that's not a necessary requirement, it's just how we'd
interpret existing extras specifications.

Then installing 'requests[security]' would install the
'requests[security]' package, which depends on both 'requests' and
'pyopenssl', and we have a 'requests[security]-$VERSION.dist-info'
directory recording that we installed it.

I like this. This is how I'm currently modelling extras within the
resolver: each extra is treated as just another requirement that depends
on the base package and on the extra's dependencies. Prof. Justin Cappos
suggested this to me. I imagine this will simplify things somewhere,
because what the resolver consumes would be consistent with what's on
the disk.
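
A rough sketch of that modelling, with hypothetical types and names
(this is not the resolver's actual code). It also assumes the reified
extra pins the base package to its own version, which the thread does
not spell out.

```python
from typing import Dict, List, NamedTuple, Optional


class Requirement(NamedTuple):
    name: str
    pinned_version: Optional[str] = None  # None means "any version"


class Package(NamedTuple):
    name: str
    version: str
    requires: List[Requirement]


def reify_extras(name, version, requires, extras) -> Dict[str, Package]:
    """Expand one distribution into a base package plus one package per extra."""
    packages = {name: Package(name, version, [Requirement(r) for r in requires])}
    for extra, extra_requires in extras.items():
        reified_name = f"{name}[{extra}]"
        # Assumption: the extra depends on exactly the matching base version.
        deps = [Requirement(name, version)]
        deps += [Requirement(r) for r in extra_requires]
        packages[reified_name] = Package(reified_name, version, deps)
    return packages


# requests v2 with a 'security' extra, as in the upgrade example above.
packages = reify_extras(
    "requests", "2", requires=[],
    extras={"security": ["pyopenssl", "certifi"]},
)
for package in packages.values():
    print(package.name, [r.name for r in package.requires])
```

With this shape the resolver never needs a special case for extras:
'requests[security]' is just another node in the dependency graph.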

I think if we go this way, we should probably aim for just something
equivalent to Debian's metapackages for now. The rest of the advanced
features can be brought in at a later stage.

The advantages are:

- it's a simpler way to record the information you want
here, without adding more special cases to dist-info: most code
doesn't even have to know what 'extras' are, just what packages are

- it opens the door to lots of more advanced features, like
'foo[test]' being a package that actually contains foo's tests, or
build variants like 'numpy[mkl]' being numpy built against the MKL
library, or maybe making it possible to track which version of numpy's
ABI different packages use. (The latter two cases need some kind of
provides: support, which is impossible right now because we don't want
to allow random-other-package to say 'provides-dist: cryptography';
but, it would be okay if 'numpy[mkl]' said 'provides-dist: numpy',
because we know 'numpy[mkl]' and 'numpy' are maintained by the same
people.)

I know there's a lot of precedent for this kind of clever use of
metadata-only packages in Debian (e.g. search for "metapackages"), and
I guess the RPM world probably has similar tricks.

-n

--
Nathaniel J. Smith -- https://vorpus.org