[Catalog-sig] PyPI mirrors are all up to date

"Martin v. Löwis" martin at v.loewis.de
Tue Apr 17 00:19:07 CEST 2012

On 17.04.2012 00:09, Tarek Ziadé wrote:
> On 4/16/12 11:57 PM, "Martin v. Löwis" wrote:
>>> Maybe a better checksum would be a global hash calculated differently ?
>> Define a protocol, and I present you with an implementation that
>> conforms to the protocol, and still has inconsistent data, and not
>> in a malicious manner, but due to bugs/race conditions/unexpected
>> events. It's pointless.
> if you calculate a checksum with all mirrored files - you can guarantee
> that the bits are the same
> on both side, no ?

How exactly would you calculate that checksum? Would you really require
concatenation of all files? That could take a few hours per change. It
would also raise the question in what order the files ought to be
concatenated.
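
To illustrate the ordering problem, here is a minimal sketch of such a
global checksum. The function name, the file names, and the choice of
SHA-256 are assumptions made for the example only; nothing here is part
of an existing mirroring protocol. It hashes the concatenation of all
file contents, and the result changes as soon as the two sides disagree
on the order.

```python
import hashlib


def global_checksum(files, order):
    """Hash the concatenation of all file contents, in the given order.

    `files` maps a (hypothetical) file name to its bytes; `order` is the
    sequence of names to concatenate. Every byte of every file has to be
    read again whenever anything changes.
    """
    h = hashlib.sha256()
    for name in order:
        h.update(files[name])
    return h.hexdigest()


# Hypothetical mirror content, for illustration only.
files = {"foo-1.2.tar.gz": b"aaa", "foo-1.3.tar.gz": b"bbb"}

by_name = global_checksum(files, sorted(files))
reverse_by_name = global_checksum(files, sorted(files, reverse=True))

# Same files, different order, different checksum.
print(by_name == reverse_by_name)
```

So master and mirror would have to agree on a canonical order before
the value means anything, and the cost still grows with the total size
of the mirror.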

> how can they know if version 1.3 of package foo never made it to the
> mirror they use ?
> They can't. They have to trust the last modified date and make the
> assumption that the mirror
> is fresh enough, for foo 1.3 to be present in both the master and the
> mirror.

How could they do so using your protocol?

> I think the idea of the checksum is to double-check that kind of claim. 
> But maybe that's overkill ?

I think it is overkill, and it doesn't help either.

> maybe the mirroring code should check file by file that everything was
> copied correctly ?

If you also assume malicious mirrors, then you definitely need to check
every file, as specified in


However, if a mirror claims it is up-to-date, and that verification
fails, my recommendation would be to give up in the tool and have
the user submit a bug report, in order to eliminate the mirror from
the mirror list.
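
A rough sketch of what I mean by checking every file and then giving
up. All names here are hypothetical (`verify_mirror`,
`MirrorVerificationError`, the file names), SHA-256 is just a stand-in
digest, and the expected digests are assumed to come from a source the
tool already trusts. It is not the procedure of any particular tool.

```python
import hashlib


class MirrorVerificationError(Exception):
    """Raised when a mirror's content does not match the expected digests."""


def verify_mirror(expected_digests, mirror_files):
    """Check every expected file against the mirror's copy.

    `expected_digests` maps file name -> hex SHA-256 digest (assumed to be
    obtained from a trusted source). `mirror_files` maps file name -> bytes
    as served by the mirror. Stops at the first problem instead of trying
    to repair anything.
    """
    for name, expected in sorted(expected_digests.items()):
        if name not in mirror_files:
            raise MirrorVerificationError(
                f"{name} is missing from the mirror; please report this mirror"
            )
        actual = hashlib.sha256(mirror_files[name]).hexdigest()
        if actual != expected:
            raise MirrorVerificationError(
                f"{name} has digest {actual}, expected {expected}; "
                "please report this mirror"
            )


# Hypothetical data: the master has foo 1.2 and foo 1.3.
master = {"foo-1.2.tar.gz": b"old release", "foo-1.3.tar.gz": b"new release"}
expected = {n: hashlib.sha256(d).hexdigest() for n, d in master.items()}

good_mirror = dict(master)
stale_mirror = {"foo-1.2.tar.gz": b"old release"}  # foo 1.3 never arrived

verify_mirror(expected, good_mirror)  # passes silently

try:
    verify_mirror(expected, stale_mirror)
except MirrorVerificationError as exc:
    print("giving up:", exc)
```

Note that the per-file check catches the missing foo 1.3 directly,
without any global checksum.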

