[Catalog-sig] an immutable mirror of PyPI
faassen at startifact.com
Sat Jul 16 12:54:53 CEST 2011
On 07/16/2011 01:08 AM, Ben Finney wrote:
> Martijn Faassen <faassen at startifact.com> writes:
>> I don't work in a vacuum. I share code with others. This code has
>> dependencies on other code. So how do people obtain this other code?
> By depending on other code, you have a choice to make: you either take
> the maintenance burden on yourself, or you delegate the maintenance
> burden (usually to the developers of that code).
> By delegating the maintenance burden of that code elsewhere, that
> entails delegating the responsibility for future availability of that
There is maintenance burden and there is the package actually existing
for download. When I depend on Foo 1.1, I am not delegating maintenance
burden to the original developer, unless I go and ask questions about
Foo 1.1. The answer can then be: Foo 1.1 is not maintained, sorry. Only
when I am interested in upgrades does the original developer come in again.
I don't see why these two should be the same: the future availability of
an existing release of a package is not identical to continued
development of that code.
>> PyPI, I thought, was among other things a central place where people
>> can download and install packages so that they can resolve
>> dependencies, but you seem to be arguing against doing that.
> I find it strange that I'm defending PyPI in this instance, since I am
> quite sympathetic to complaints that it has poor policies on package
> availability and many other complaints.
> But you seem to expect that PyPI must guarantee that any package version
> ever available will be available forever. That's not reasonable, I
I am not barging in here with expectations. I'm coming in here with use
cases and proposals. It seems my use cases are rejected as goals of
PyPI. In that case I want to get a better understanding of the goals of
You say that the goal of perpetual availability of packages is an
unreasonable goal of PyPI or related services. You don't seem to explain
So I have use cases: I can release code that relies on releases that can
disappear or can be replaced. I think this is bad for repeatability and
security. I'd like to see some improvements made. How would we make
these improvements? I've so far proposed three ideas:
* PyPI not throwing away things after a grace period. Almost universally
* an additional service, a mirror, that offers some repeatability
guarantees. Removal would need to go through channels, implying some
kind of custodianship I think people here are wary about.
* better communication channels: a list of what's been removed, a list
of what's been deprecated. I can then write tools that help me maintain
my projects. It's not the same as the above ideas: old projects can
still break at the whim of people whose code I depend on, but it'll at
least help manage this issue.
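As a rough sketch of the kind of tool the third idea would make possible:
assume PyPI (or a related service) published a list of removed releases as
(name, version) pairs. No such list exists today; its format here, and the
helper names parse_pins and find_removed, are purely hypothetical. A project
with exact version pins could then be checked like this:

```python
def parse_pins(lines):
    """Collect exact 'name==version' pins as (name, version) pairs.

    Lines without an exact '==' pin are ignored; text after '#' is
    treated as a comment. This is a simplification for illustration,
    not a full requirements parser.
    """
    pins = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins.add((name.strip().lower(), version.strip()))
    return pins


def find_removed(pins, removed):
    """Return the pinned releases that appear in the removal list.

    `removed` is the hypothetical published list, as a set of
    (lowercased name, version) pairs.
    """
    return sorted(pins & removed)


if __name__ == "__main__":
    pins = parse_pins(["Foo==1.1", "Bar==2.0  # tested together"])
    removed = {("foo", "1.1")}  # made-up example data
    for name, version in find_removed(pins, removed):
        print("pinned release no longer available: %s %s" % (name, version))
```

This only reports the problem; it does not bring the release back, which is
why it is weaker than the first two ideas.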
But perhaps you have better ideas on how to better help manage this.
I am getting a bit tired of hearing "you can do this yourself", as this
ties into the heart of collaboration, and PyPI if anything is at least
> Instead, you need to choose packages considering whether you trust the
> package to remain available, which is a social issue between you and the
> people developing that work.
> If you think there is a significant risk the people responsible for that
> package will remove a version on which you depend from PyPI, you should
> engage in dialogue with those people to resolve that.
And how exactly am I supposed to read people's minds, possibly years
into the future? I had absolutely no expectation that this would happen
with the release that disappeared over a month ago. The developer one
day just decided to clean up old, unsupported releases. Of course I
contacted the developer after it happened. Several others did too. I
then started thinking about how to reduce this risk in a broader sense.
> I don't think PyPI has any business requiring package developers to keep
> a version available at PyPI beyond when they want it available there.
> The risks inherent in that need to be addressed as a social issue, not a
> technical limitation.
Yes, this is a social issue. But tools can support social issues. If
people tell me to keep my own private mirror, that's a tool solution
too, but not a very social one.
>> At most it's some kind of showcase for packages that people should
>> take into consideration. Taking this point to the extreme, it's
>> *never* something that you can automate downloading from.
> There are points that can be made toward that view; but I don't find
> this specific case (wanting guaranteed availability of every version
> forever at PyPI) supports it.
>> Instead you should be giving a giant tarball of packages to everybody,
>> always, if they use your code at all.
> This is indeed a terrible option, and I lament it whenever I see it.
> I prefer supporting the efforts of those who *do* provide reasonable
> guarantees of package selection and availability and integration
> testing. We call them “operating system distributions”.
The requirements for developers concerning library availability are not
identical to those of users. Operating system distributions focus on a
stable platform for users. Some developers need to develop
cross-platform code. Some developers need to develop different versions
of a project, or different projects that rely on different versions of
dependencies. Some developers need to depend on libraries or library
versions not (yet) available in distributions.
These developers effectively create a stable distribution of
dependencies that they have tested together. It's useful to have tools
to support this and allow these developers to share their code with others.
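One possible shape for such a tool, as a minimal sketch: record a SHA-256
digest for each archive in the tested set, and later check whether each
archive is still present and unchanged. The function names, the status
strings, and the in-memory dictionaries standing in for downloads are all
assumptions made for illustration; only hashlib is a real library call.

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of an archive's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify(archives, recorded):
    """Compare available archives against previously recorded digests.

    archives: dict mapping filename -> bytes currently obtainable
    recorded: dict mapping filename -> hex digest noted at test time
    Returns a dict mapping filename -> "ok", "changed" or "missing".
    """
    report = {}
    for filename, expected in recorded.items():
        if filename not in archives:
            report[filename] = "missing"   # the release disappeared
        elif sha256_of(archives[filename]) != expected:
            report[filename] = "changed"   # the release was replaced
        else:
            report[filename] = "ok"
    return report
```

A "missing" result corresponds to a release that was removed, and "changed"
to one that was replaced under the same name, the two cases that hurt
repeatability and security.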