[Distutils] [Catalog-sig] distribute D.C. sprint tasks
zopyxfilter at googlemail.com
Mon Oct 13 14:10:24 CEST 2008
On 12.10.2008 18:18, Martin v. Löwis wrote:
>>>> Our z3c.pypimirror already performs an incremental update based on
>>>> the information available from the index.html page of the simple
>>>> index and the available md5 hashes. Works like a charm...
>>>>
>>>>
>>> So how does it find out when a release gets made?
>>>
>>>
>> What do you mean by that?
>>
>
> If you only look at
>
> http://pypi.python.org/simple/
>
> then you have no way of finding out what changed. So "the information
> available from the index.html page of the simple index" is not actually
> suitable for building incremental mirroring. What you describe is not
> possible.
>
> I just looked at the z3c.pypimirror source, and found that it isn't
> really incremental: Whenever it mirrors, it looks at *all* index.html
> pages, of each and every package (all 4900 of them, except when you
> restrict the mirror). It then only downloads any new files that may
> have been added/deleted, and it *is* incremental wrt. files. IIUC,
> it is *not* incremental wrt. the package index itself.
>
> Please correct me if I'm wrong (and please correct z3c.pypimirror
> if I'm not :-)
>
Good suggestion. I think we can take the changelog into account easily.
I will have to check this with Daniel Kraft, the original author of the
package.
> Can you please set a specific useragent header, to find out what
> amount of traffic pypimirror produces? Currently, urllib accounts for
> 17% of the requests, excluding requests made through urllib by
> setuptools (which is a separate 18%). It's probably not all of them
> through pypimirror, but of the 64626 requests made through urllib
> yesterday, 41671 originated from zopyx.com.
>
>
Should not be a problem.
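
For illustration, a minimal sketch of sending a specific User-Agent
header with Python's standard urllib. The identifier string and the
helper names (build_request, fetch) are assumptions made for this
example, not what z3c.pypimirror actually sends or defines.

```python
import urllib.request

# Hypothetical identifier; the real tool would choose its own name and version.
MIRROR_USER_AGENT = "z3c.pypimirror/unknown-version"


def build_request(url, user_agent=MIRROR_USER_AGENT):
    """Return a Request that announces itself with a specific User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=MIRROR_USER_AGENT, timeout=30):
    """Fetch url with the custom User-Agent and return the body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a header like this, the server logs can separate the mirror's
requests from generic urllib traffic.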
> For real incremental mirroring, you should retrieve the changelog,
> and access only those package pages that have actually changed since
> the last time you ran the mirror (successfully).
See above.
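
A rough sketch of what a changelog-driven run could look like. It
assumes the index exposes an XML-RPC method changelog(since) that
returns entries whose first field is the package name; the function
names and the mirror_package callback are hypothetical, and this is
not the current z3c.pypimirror code.

```python
import time


def changed_packages(entries):
    """Collect the distinct package names from changelog entries.

    Each entry is assumed to be a sequence whose first item is the
    package name (the remaining fields are ignored here).
    """
    return sorted({entry[0] for entry in entries})


def incremental_sync(last_run, client, mirror_package):
    """Mirror only the packages that changed since last_run.

    last_run is a UTC epoch timestamp of the last successful run.
    client is any object with a changelog(since) method, for example
    an xmlrpc.client.ServerProxy pointed at the index (assumption).
    mirror_package is a callback that refreshes one package page.
    Returns the timestamp to store for the next run.
    """
    started = int(time.time())
    for name in changed_packages(client.changelog(int(last_run))):
        mirror_package(name)
    # Only reached if every package was mirrored without an exception,
    # so a failed run keeps the old timestamp and is retried in full.
    return started
```

The caller would persist the returned timestamp only after a
successful run, so a failed run picks up the same changes next time.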
Andreas