[Catalog-sig] Replacement client for pep381client
Christian Theune
ct at gocept.com
Thu Mar 21 00:59:21 CET 2013
Hi,
as you might be aware, I've done my share of bitching about my mirror
(f.pypi.python.org) breaking.
I picked pep381client apart yesterday and rebuilt it, mostly from the
ground up.
You can find a working version here:
https://bitbucket.org/ctheune/bandersnatch
The focus has been on making it a lot more robust and a lot easier to
repair a mirror when it's known to be broken. To achieve that I:
- refactored the code, trying to make it more intentional, less mechanical
- stopped parsing the simple pages' HTML and made more use of the XML-RPC API
- added Tarek's worker/queue approach for parallelizing it
- kept as little state as possible on the client
- switched from timestamps to serial counters for checking what and how
much to update
- handled locking of concurrent runs more gracefully
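To illustrate how the serial counters and the worker/queue approach fit together, here is a minimal sketch. It is not the actual bandersnatch code: the function names (packages_to_sync, sync_all), the (package_name, serial) changelog shape and the sync_one callback are all assumptions made for the example, and it is written for Python 3.

```python
import queue
import threading


def packages_to_sync(changelog, last_serial):
    """Pick the packages that changed after last_serial.

    changelog is assumed to be an iterable of (package_name, serial)
    pairs; how it is fetched from the index is out of scope here.
    Returns (names, new_serial).
    """
    names = set()
    new_serial = last_serial
    for name, serial in changelog:
        if serial > last_serial:
            names.add(name)
            new_serial = max(new_serial, serial)
    return names, new_serial


def sync_all(names, sync_one, workers=4):
    """Call sync_one(name) for every name from a pool of worker threads.

    sync_one is a hypothetical callback that mirrors one package.
    Returns the set of names whose sync raised an exception.
    """
    todo = queue.Queue()
    for name in names:
        todo.put(name)

    failed = set()
    failed_lock = threading.Lock()

    def worker():
        # Each worker drains the shared queue until it is empty.
        while True:
            try:
                name = todo.get_nowait()
            except queue.Empty:
                return
            try:
                sync_one(name)
            except Exception:
                with failed_lock:
                    failed.add(name)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failed
```

In a sketch like this, the only state the client needs is the last serial it fully processed. One possible policy is to store new_serial only when sync_all returns an empty set, so that a failed run is simply retried from the old serial.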
I think I have a good grasp of what's going on now so that I can keep
maintaining this in the future.
I'm currently re-initializing my own mirror. This basically can be run
in-place by just removing the existing state data and calling my sync
script (bsn-mirror) instead of pep381run with the same parameters.
Tomorrow I'll update the documentation, make it use a config file and
put some lipstick on the main entry point. After that I should be ready
for a release.
If you want to give it a try already, you just do this:
$ hg clone https://bitbucket.org/ctheune/bandersnatch
$ cd bandersnatch
$ virtualenv-2.7 .
$ bin/python bootstrap.py
$ bin/buildout
$ bin/bsn-mirror /my/mirror/path
Cheers,
Christian