[Chicago] threads and xmlrpc?

Fri Jan 30 20:19:59 CET 2009

On Fri, Jan 30, 2009 at 9:45 AM, Tim Gebhardt <tim at gebhardtcomputing.com> wrote:
> The 2 connections per host is defined in the HTTP RFC:
> http://www.faqs.org/rfcs/rfc2068.html
>
> See section 8.1.4.
> The RFC says a single-user client should keep at most 2 connections to
> any server, and a lot of HTTP client libraries obey this.  I know for a
> fact that the .NET web client class does.  I don't know for sure what
> Python does, so I'd hate to comment.
> This is one of the reasons why a lot of HTTP client libraries implement the
> "request" object instances as a factory rather than just instantiate the
> class directly:
>>>> import urllib2
>>>> f = urllib2.urlopen('http://www.python.org/')  # factory call; returns a file-like response object
>>>> print f.read(100)
> Rather than:
>>>> import urllib2
>>>> r = urllib2.Request("http://www.python.org")
>>>> print r.open().read(100)  # hypothetical: urllib2.Request has no open() method

I see.
Looking at this example on threads:
http://www.ibm.com/developerworks/aix/library/au-threadingpython/index.html

this is implemented within the thread. Each thread calls:
url = urllib2.urlopen(host)

Looking at the xmlrpclib examples:
http://docs.python.org/library/xmlrpclib.html
the way I connect is:
pypi = xmlrpclib.ServerProxy(XML_RPC_SERVER)

My question is: is urlopen(xyz) similar to ServerProxy(xyz)? If yes,
then I can call it inside each thread. But if it's not, and I end up
with 5000 active connections, then that won't work.
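
If the goal is to keep the number of simultaneous connections small no
matter how many items are queued, one option is a fixed-size worker pool.
The following is only a sketch under several assumptions: it uses Python 3
(where xmlrpclib is named xmlrpc.client), a throwaway local server standing
in for PyPI, a hypothetical remote method called "shout", and an assumed cap
of 2 workers to match the RFC recommendation quoted above.

```python
import threading
import xmlrpc.client  # Python 3 name for xmlrpclib
from concurrent.futures import ThreadPoolExecutor
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in server on a free local port, so the sketch never contacts PyPI.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda name: name.upper(), "shout")  # hypothetical method
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

MAX_CONNECTIONS = 2  # assumed cap, not something xmlrpc enforces by itself


def fetch(name):
    # One proxy per call. The HTTP connection is opened only when the
    # remote method is invoked, and the pool size bounds how many calls
    # (and therefore connections) are in flight at once.
    with xmlrpc.client.ServerProxy(url) as proxy:
        return proxy.shout(name)


names = ["pkg%d" % i for i in range(20)]  # stands in for the 5000 items
with ThreadPoolExecutor(max_workers=MAX_CONNECTIONS) as pool:
    results = list(pool.map(fetch, names))

server.shutdown()
server.server_close()
print(results[:3])
```

With this shape, 5000 queued items still mean at most MAX_CONNECTIONS
requests at any moment; the rest simply wait their turn in the pool.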

How would I know whether ServerProxy returns an instance of the class
or a request object?
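
One way to check is to ask the object for its type. The sketch below assumes
Python 3, where xmlrpclib is named xmlrpc.client. The URL is a placeholder
rather than a real service, and "some_method" is a hypothetical remote method
name. Constructing the proxy does not open a connection, so nothing is
contacted when this runs.

```python
import xmlrpc.client  # Python 3 name for xmlrpclib

# Placeholder URL (an assumption for illustration); no request is sent here.
XML_RPC_SERVER = "http://localhost:9/RPC2"

proxy = xmlrpc.client.ServerProxy(XML_RPC_SERVER)

# ServerProxy(...) is an ordinary class instantiation, not a factory call
# that performs a request.
print(type(proxy))
is_proxy = isinstance(proxy, xmlrpc.client.ServerProxy)
print(is_proxy)

# Attribute access only builds a callable wrapper; still no network traffic.
# A connection is attempted only when the wrapper is actually called.
method = proxy.some_method  # hypothetical remote method name
print(callable(method))
```

So a ServerProxy can be created cheaply in each thread; what matters for the
connection count is how many remote calls are running at the same time.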

Thanks a lot,
Lucas

> The Java and .NET HTTP client libraries I've used all implement it in a
> similar way because it's easier to set up stuff like connection limits and
> keep-alive.
> In any case, from my python web scraping days with httplib2, I found that I
> would reduce the number of timeouts and request errors if I waited for 1
> second after every request to a particular host.
> -Tim Gebhardt
> tim at gebhardtcomputing.com
>
> On Thu, Jan 29, 2009 at 10:46 PM, Lukasz Szybalski <szybalski at gmail.com>
> wrote:
>>
>> On Thu, Jan 29, 2009 at 9:02 AM, Tim Gebhardt <tim at gebhardtcomputing.com>
>> wrote:
>> > If xmlrpc obeys the HTTP standard connection limit, you're limited to 2
>> > concurrent connections per host.
>>
>> Could you point me to some docs on this? What I am comparing it to is
>> an Apache server, which can handle 100+ requests per second with no
>> problems. With Project Gutenberg we are talking about TB of data. With
>> PyPI we are talking about <1 kb per request and maybe ~3 kb per
>> second. So I think I should be able to achieve bandwidth of about
>> 20 kb/s minimum without anybody noticing any performance hits.
>>
>> I've emailed PyPI, but if there are other things to consider, or if
>> you know why the throughput on xmlrpc is so low, I would be interested
>> to know more.
>>
>> Thanks,
>> Lucas

-- 
How to create python package?
http://lucasmanual.com/mywiki/PythonPaste
Bazaar and Launchpad
http://lucasmanual.com/mywiki/Bazaar