[Catalog-sig] PyPI mirrors are all up to date
Tarek Ziadé
tarek at ziade.org
Tue Apr 17 00:48:05 CEST 2012
On 4/17/12 12:19 AM, "Martin v. Löwis" wrote:
> On 4/17/12 12:09 AM, Tarek Ziadé wrote:
>> On 4/16/12 11:57 PM, "Martin v. Löwis" wrote:
>>>> Maybe a better checksum would be a global hash calculated differently?
>>> Define a protocol, and I present you with an implementation that
>>> conforms to the protocol, and still has inconsistent data, and not
>>> in a malicious manner, but due to bugs/race conditions/unexpected
>>> events. It's pointless.
>> if you calculate a checksum over all mirrored files, you can guarantee
>> that the bits are the same on both sides, no?
> How exactly would you calculate that checksum?
By calculating a grand hash over the individual file hashes.
> Would you really require
> concatenation of all files?
I did not say that. You are claiming it in a rhetorical question.
> That could take a few hours per change.
Why? You don't calculate the checksum of a file you already have
twice.
Even if you do, calling md5 is very fast.
try it:
$ find mirror -type f | xargs md5
this takes a few seconds at most on the whole mirror
> It
> would also raise the question in what order the files ought to be
> concatenated.
Anything reproducible: a sorted list. In bash I *suspect* the
calculation of the grand hash of the mirror is a one-liner that takes
less than a minute.
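A minimal sketch of such a one-liner, assuming GNU coreutils and GNU xargs, and a hypothetical local mirror rooted at `mirror/` (not the actual PyPI layout): sorting the file list with a fixed locale makes the order reproducible on both sides, and hashing the list of per-file hashes yields one grand hash for the whole tree.

```shell
# Grand hash of a mirror tree (sketch; assumes GNU md5sum/xargs and a
# local mirror under "mirror/").
# - find lists the files, sort with LC_ALL=C fixes the order,
# - md5sum emits one "hash  path" line per file,
# - the final md5sum hashes that list, so any changed bit, added file,
#   or removed file changes the grand hash.
find mirror -type f | LC_ALL=C sort | xargs -d '\n' md5sum | md5sum
```

Both sides must hash the same relative paths for the grand hashes to be comparable, since the per-file lines include the path.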
I am going to stop here anyway, because I don't see the point of
discussing implementation details at this stage: we had barely started
to talk about the idea of a checksum, and that seems to be going
nowhere.
Cheers
Tarek