Re: [Distutils] RFC 2: PEP 541 - Package Index Name Retention
(copied from an email I erroneously sent to python-ideas@)

I want to address one gap in the PEP regarding reclaiming abandoned names: version reuse.

The problem with reusing names is that existing applications or installations that reference the old name will pick up the new project, unless they pin the version precisely. Even in that case, I foresee issues with version collision, especially if the abandoned project was well-versioned in the same model (semver or otherwise) that the new project uses.

I'm deeply concerned by the idea of installer code suddenly picking up a new project... with possibly different dependencies of its own, either with old or clashing versions. I recognize it's going to be rare, but these incidents will definitely impact the repeatability of builds depending on PyPi.

I think the criteria for reuse of a name must include usage limits; if the package is being downloaded on a steady basis by accounts that can't be shown to belong to known integration systems, reuse should not be allowed.

--
Chris R.
======
Not to be taken literally, internally, or seriously.
Twitter: http://twitter.com/offby1
On Jan 16, 2017, at 11:59 AM, Chris Rose wrote:
> (copied from an email I erroneously sent to python-ideas@)
>
> I think the criteria for reuse of a name must include usage limits; if the package is being downloaded on a steady basis by accounts that can't be shown to belong to known integration systems, reuse should not be allowed.

I agree, which is why the rules for removal of an abandoned project include the following:

* download statistics on the Package Index for the existing package indicate project is not being used;

- Ł
The tricky part there is that "being used" is a tough concept to define.
Over what time period? What amount of downloading counts as "used"?
I believe these concepts need to be made very clear, because the impact of
exploitative replacement is pretty severe if it is made to happen.
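
The two open questions above (which time period, and how much downloading) can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration: the function name `looks_unused`, the 90-day window and the threshold of 10 downloads per day are assumptions chosen for the example, not values proposed in PEP 541 or used by PyPI.

```python
from statistics import mean


def looks_unused(daily_downloads, window_days=90, max_daily_average=10.0):
    """Return True if the recent average download count is at or below a threshold.

    Both `window_days` and `max_daily_average` are hypothetical knobs; the
    thread leaves their values open.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    # Only the most recent `window_days` entries are considered.
    recent = list(daily_downloads)[-window_days:]
    if len(recent) < window_days:
        # Not enough history to judge: err on the side of "still in use".
        return False
    return mean(recent) <= max_daily_average
```

Any policy of this shape has to pick both numbers explicitly, and the answer changes with either of them.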
On Mon, Jan 16, 2017 at 1:18 PM, Chris Rose wrote:
> The tricky part there is that "being used" is a tough concept to define. Over what time period? What amount of downloading counts as "used"?
>
> I believe these concepts need to be made very clear, because the impact of exploitative replacement is pretty severe if it is made to happen.

Would a month during which the old package is made unavailable, but the new owner is not yet given access, be a good compromise?

It would most likely give the old owner (or old users) time to come forward and decide to "revive" the package if necessary; otherwise it gives a really strong signal that, even if there are still a couple of downloads, the reuse really does not break much.

--
M
That depends on policy. I don't want to go too far down the trap of privileging my specific use case, but as a company that vendors *everything* we depend on, our accesses to PyPi for dependencies are pretty rare, which means we might run afoul of these changes when ingesting packages.

I'm going to ask the pointed question: is there actually any serious value to allowing the replacement of a name for anything that was ever in wide usage? Trademark violations notwithstanding -- legal stuff requires some degree of exception to the process -- why should abandonment result in replacement, as long as the existing code has ever been in use?
-- Chris R. ====== Not to be taken literally, internally, or seriously. Twitter: http://twitter.com/offby1
On 01/16/2017 02:02 PM, Chris Rose wrote:
> That depends on policy. I don't want to go too far down the trap of privileging my specific use case, but as a company that vendors *everything* we depend on, our accesses to PyPi for dependencies are pretty rare, which means we might run afoul of these changes when ingesting packages.

If you have everything vendored then you should be able to easily fall back to older versions that you already have available. Maybe run your own PyPI server internally?

> I'm going to ask the pointed question: is there actually any serious value to allowing the replacement of a name for anything that was ever in wide usage?

Possibly not, but with automated downloads to various distributions I suspect it becomes very difficult to tell if packages are actually "being used".

> [...] -- why should abandonment result in replacement, as long as the existing code has ever been in use?

Because PyPI is not an archaeological site? Although, having said that, perhaps there could be a PyPI/archaeological page for packages that have been replaced.

--
~Ethan~
PyPi might not be an archaeological site, but like it or not it *is* a key
part of deployment processes, including those that run headless. I'm
referencing vendoring processes, but the same idea applies when your code
is deployed by any process that includes `pip install` in its steps. While
in an ideal world every user of these packages would host an internal
mirror of the packages they need and rigorously vet them, that's not the
world we live in.
I raise the issue because I believe the bar for taking over an abandoned
name should be nigh-insurmountably high; the risks are in my view severe,
given the way software is built today.
-- Chris R. ====== Not to be taken literally, internally, or seriously. Twitter: http://twitter.com/offby1
On Jan 17, 2017, at 12:25 PM, Chris Rose wrote:
> PyPi might not be an archaeological site, but like it or not it *is* a key part of deployment processes, including those that run headless. I'm referencing vendoring processes, but the same idea applies when your code is deployed by any process that includes `pip install` in its steps. While in an ideal world every user of these packages would host an internal mirror of the packages they need and rigorously vet them, that's not the world we live in.
>
> I raise the issue because I believe the bar for taking over an abandoned name should be nigh-insurmountably high; the risks are in my view severe, given the way software is built today.

If ~nobody is downloading it from PyPI today and ~nobody is releasing new versions to PyPI then re-using the name should have very little effect, even if in the past people had been using it. The only real use case I can think of where this might not be true is that it could break someone’s ability to reproduce a deployment from many years ago. But if you need to reproduce your build from years ago, depending on PyPI for that is not really the smartest bet. Otherwise it feels a lot like we’re in “if a tree falls in the woods, but nobody is around to hear it, does it make a sound?” territory.

One thing we could possibly do is provide the ability, as part of the relinquishing process, to “lock” the old versions that were uploaded so that the new owner can neither delete them nor upload new files for them AND set a “minimum version” for new uploads for that project. This could mean that one could say that foobar < 4.0 is the old project and foobar >= 4.0 is the new project, and existing == pins continue to work. I’m not sure how I feel about that though.

Ultimately, consumers need to either live with these sorts of problems or they need to develop their own solutions for counteracting them (like vendoring) because while this proposed policy *could* cause them these issues, it’s not the only way for them to occur.

Specifically, this only deals with cases where the original author is no longer responsive in some way. If they are responsive, it’s typically not very hard in my experience to convince someone to give up a name they once used and are no longer interested in maintaining. I’ve acted as the middleman for this very arrangement on a number of occasions.

-- Donald Stufft
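
The "lock old versions, set a minimum version" idea above can be sketched as a pair of checks. This is only an illustration under stated assumptions: `parse_release`, `may_upload` and `may_delete` are hypothetical names, not part of PyPI or any existing tool, and version parsing is simplified to plain dotted numbers (no pre-releases, post-releases or epochs).

```python
def parse_release(version):
    """Parse a plain dotted release such as '4.0' into a tuple of ints.

    Simplification (assumption): only numeric dotted versions are handled.
    Trailing zeros are dropped so that '4', '4.0' and '4.0.0' compare equal.
    """
    parts = [int(part) for part in version.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def may_upload(version, locked_versions, minimum_version):
    """New owner may upload only unlocked versions at or above the minimum."""
    release = parse_release(version)
    locked = {parse_release(v) for v in locked_versions}
    if release in locked:
        return False  # files of the old project are frozen
    return release >= parse_release(minimum_version)


def may_delete(version, locked_versions):
    """New owner may never delete a version that belonged to the old project."""
    locked = {parse_release(v) for v in locked_versions}
    return parse_release(version) not in locked


# Hypothetical handover of "foobar": old releases locked, new ones start at 4.0.
old_releases = ["1.0", "2.3", "3.9"]
print(may_upload("3.10", old_releases, "4.0"))
print(may_upload("4.0", old_releases, "4.0"))
print(may_delete("2.3", old_releases))
```

Under such a rule an existing `foobar==2.3` pin keeps resolving to the old files, while unpinned installs would still move to the new project.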
> One thing we could possibly do is provide the ability, as part of the relinquishing process, to “lock” the old versions that were uploaded so that the new owner can neither delete them nor upload new files for them AND set a “minimum version” for new uploads for that project. This could mean that one could say that foobar < 4.0 is the old project and foobar >= 4.0 is the new project, and existing == pins continue to work. I’m not sure how I feel about that though.

Wouldn't that be a case where the version epoch [1] could (should?) be used?

If included in a version identifier, the epoch appears before all other components, separated from the release segment by an exclamation mark:

    E!X.Y  # Version identifier with epoch

If no explicit epoch is given, the implicit epoch is 0.

Most version identifiers will not include an epoch, as an explicit epoch is only needed if a project changes the way it handles version numbering in a way that means the normal version ordering rules will give the wrong answer.

--
M

[1] https://www.python.org/dev/peps/pep-0440/#version-epochs
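
To illustrate how an epoch would separate the old and new projects, here is a minimal sketch of the ordering rule quoted above. It is a simplification under stated assumptions: `parse_version` is a hypothetical helper that only understands an optional epoch plus a dotted numeric release, whereas real installers rely on a complete PEP 440 implementation.

```python
def parse_version(version):
    """Split 'E!X.Y' into (epoch, release); the implicit epoch is 0.

    Simplification (assumption): only an optional epoch and a dotted numeric
    release segment are supported. Trailing zeros are dropped so that
    '4' and '4.0' compare equal.
    """
    epoch_text, separator, release_text = version.partition("!")
    if not separator:
        # No '!' present: the whole string is the release segment.
        epoch_text, release_text = "0", version
    release = [int(part) for part in release_text.split(".")]
    while len(release) > 1 and release[-1] == 0:
        release.pop()
    return int(epoch_text), tuple(release)


# The epoch is compared first, so any epoch-1 release sorts after every
# epoch-0 release, regardless of the release numbers themselves.
versions = ["1!1.0", "3.9", "4.0", "1!0.1"]
print(sorted(versions, key=parse_version))
# prints ['3.9', '4.0', '1!0.1', '1!1.0']
```

In this reading, a new owner releasing under epoch 1 could restart numbering at `1!0.1` and still sort after every release of the abandoned project.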
participants (5)
- Chris Rose
- Donald Stufft
- Ethan Furman
- Matthias Bussonnier
- Łukasz Langa