On 9 February 2014 11:15, Ernest W. Durbin III <ewdurbin@gmail.com> wrote:
Since the launch of the new infrastructure for PyPI two weeks ago, I've been monitoring overall performance and reliability of PyPI for browsers, uploads, installers, and mirrors. The initial rates will be limited to 5 req/s per IP with bursts of 10 requests allowed. Client requests up to the burst limit will be delayed to maintain a 5 req/s maximum. Any requests past the 10 request burst will receive an HTTP 429 response code per RFC 6585.
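For illustration only, the policy described above (5 req/s sustained, bursts of 10 delayed, anything beyond that answered with 429) can be modelled as a leaky bucket. This is a sketch under assumptions, not PyPI's actual configuration: the function name `simulate`, the choice to count the burst as "at most 10 requests held at once", and the use of status 200 for accepted requests are all made up for the example.

```python
def simulate(arrivals, rate=5.0, burst=10):
    """Model a leaky-bucket limiter for one client IP.

    `arrivals` is a sorted list of request timestamps in seconds.
    Returns one (http_status, delay_seconds) pair per request.
    Assumption: the bucket holds at most `burst` requests at a time.
    """
    level = 0.0   # requests currently held in the bucket
    last = None   # timestamp of the previous arrival
    results = []
    for now in arrivals:
        if last is not None:
            # The bucket drains at `rate` requests per second.
            level = max(0.0, level - (now - last) * rate)
        last = now
        if level >= burst:
            # Bucket is full: reject instead of queueing (RFC 6585, 429).
            results.append((429, 0.0))
            continue
        # Accepted: wait behind whatever is already in the bucket.
        results.append((200, level / rate))
        level += 1
    return results


if __name__ == "__main__":
    # Twelve requests arriving at the same instant from one IP.
    for status, delay in simulate([0.0] * 12):
        print(status, f"{delay:.1f}s")
```

Under this model a client that stays at or below 5 req/s is never delayed, while a client that sends a burst sees its requests spaced out to 5 req/s and then rejected once the bucket is full.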
5/s sounds really low - if the RPCs take less than 200ms to answer (and I sure hope they do), a single-threaded mirroring client (with low latency to PyPI's servers // pipelined requests) can easily hit it. Most folk I know writing API servers aim for response times in the single-digit to low tens of milliseconds... What is the 95th percentile response time for PyPI on these problematic APIs? Can our infrastructure restrict concurrency etc. (e.g. if we have haproxy it can trivially limit by concurrency rather than rate)? That would be IMO a better metric for overload.
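To make the arithmetic behind that concern concrete, here is a small sketch. The helper names `sequential_rate` and `in_flight` are invented for the example, and the 20 ms figure is only an illustrative value in the "low tens of milliseconds" range, not a measured PyPI latency.

```python
def sequential_rate(latency_s):
    """Requests per second from one client that waits for each response."""
    return 1.0 / latency_s


def in_flight(rate, latency_s):
    """Average concurrent requests at a given rate (Little's law)."""
    return rate * latency_s


if __name__ == "__main__":
    print(sequential_rate(0.200))   # 200 ms per call: right at the 5 req/s limit
    print(sequential_rate(0.020))   # 20 ms per call: ten times the limit
    print(in_flight(50.0, 0.020))   # yet only about one request in flight
```

A strictly sequential client never has more than one request outstanding, so a concurrency cap would leave it alone however fast the server answers, whereas a rate cap starts throttling it as soon as responses take less than 200 ms.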
Tuning these parameters will be painless, so if issues arise with mirroring clients we will be very responsive to necessary modifications.
Note that the routes used by installation clients (`/packages` and `/simple`) will remain unaffected, as they are generally served from the CDN and do not have as high an overhead in our backend processes.
This rate-limiting is to be considered an interim solution, as I plan to begin a discussion on some updates to mirroring infrastructure guidelines.
Ok, cool.

-Rob

-- 
Robert Collins <rbtcollins@hp.com>
Distinguished Technologist
HP Converged Cloud