[Catalog-sig] Pypi cdn for hosted packages

Jesse Noller jnoller at gmail.com
Thu Feb 28 12:52:26 CET 2013



On Feb 28, 2013, at 6:48 AM, Giovanni Bajo <rasky at develer.com> wrote:

> On 28 Feb 2013, at 12:18, Donald Stufft <donald.stufft at gmail.com> wrote:
> 
>> On Thursday, February 28, 2013 at 6:13 AM, Jesse Noller wrote:
>>> 
>>> 
>>> On Feb 28, 2013, at 5:41 AM, Donald Stufft <donald.stufft at gmail.com> wrote:
>>> 
>>>> On Thursday, February 28, 2013 at 5:39 AM, Jesse Noller wrote:
>>>>> Thread fork.
>>>>> 
>>>>> Anyway. I know we have at least 1 major rep of a cloud provider on the list, and I have at least one off in my pocket.
>>>>> 
>>>>> I'd like to start discussing (completely ignoring past efforts and discussion which got bogged down) how we can start distributing the package data we host via CDN rather than the mirroring system.
>>>>> 
>>>>> Most of all, we need the code in a pull request to support it ;)
>>>> To be honest with PyPI as an origin you don't really even need to change
>>>> the code. You just drop your CDN in front of PyPI and it'll take care of
>>>> things.
>>>> 
>>>> Code changes are required if you want to store the packages on a cloud
>>>> storage provider.
>>> 
>>> Excellent. Now, the question is do we bother with both (CSP+CDN) or just go the CDN route short term?
>> This is probably a question best asked to Noah. He knows the capabilities of the
>> VM hosts better as far as actual technical requirements. However moving storage
>> to a CSP does mean that scaling PyPI out by launching additional instances is
>> easier. I think he's talked about using gluster or similar as well which would have
>> similar properties (at the expense of the PSF needing to maintain the cluster ofc).
> 
> I don't think you can just "drop the CDN in front of PyPI". It depends on the CDN of course, and how their API works, but usually you need to separate static resources (to be CDN'd) from dynamic resources, make sure the origin serves the static resources with an appropriate caching header, and then rewrite the URLs to go through the CDN (so that PyPI tells everybody the CDN URL instead of the origin URL). Moreover, you probably need special configuration (depending on the CDN) if you need it to go through SSL.
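
The three steps described above (separate static from dynamic resources, serve static ones with a caching header, rewrite their URLs to point at the CDN) could be sketched roughly as follows. This is only an illustration under stated assumptions, not PyPI's code: the hostname cdn.example.org, the /packages/ prefix, the helper names, and the one-year max-age (which assumes an uploaded file never changes) are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for illustration; neither comes from the thread.
CDN_HOST = "cdn.example.org"
STATIC_PREFIX = "/packages/"


def is_static(path):
    """Treat anything under the package-file prefix as a static resource."""
    return path.startswith(STATIC_PREFIX)


def cache_headers(path):
    """Return the caching header the origin would send for this path."""
    if is_static(path):
        # Assumes release files are immutable once uploaded.
        return {"Cache-Control": "public, max-age=31536000"}
    # Dynamic pages must be revalidated with the origin on every request.
    return {"Cache-Control": "no-cache"}


def cdn_url(origin_url):
    """Rewrite a static resource URL to the CDN host; leave others alone."""
    parts = urlsplit(origin_url)
    if not is_static(parts.path):
        return origin_url
    return urlunsplit(
        (parts.scheme, CDN_HOST, parts.path, parts.query, parts.fragment)
    )
```

With these helpers, an index page would advertise cdn_url(link) for each file link it emits, while the origin keeps answering the CDN's own fetches with cache_headers(path). SSL handling is left out because, as noted above, it depends on the CDN.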
> 
> The only CDN that is a drop-in that I know of is Cloudflare, but it requires delegation of name servers which is 1) probably impossible until PyPI is under python.org and 2) I guess we would violate their ToS anyway since they want sites visited by browsers and not file servers (PyPI qualifies for both, but I guess most of the traffic is made through the latter).
> 
> Jesse, if you can give me a pointer to the CDN service you have agreements or discussions with (assuming they have public docs on their API), I can prepare a PyPI pull request.

Let me poke em with a stick. Ideally I'd like the providers to help just get the work done and assist with code changes so this doesn't die on the altar of "no interested volunteers" :)


> -- 
> Giovanni Bajo   ::  rasky at develer.com
> Develer S.r.l.  ::  http://www.develer.com
> 
> My Blog: http://giovanni.bajo.it
> 