[soc2008-general] urllib asynchronicity

Sebastian Hagen sebastian_hagen at memespace.net
Mon Mar 17 18:03:22 CET 2008


Hello,

my name is Sebastian Hagen, and I'm a CS student at the University of
Rostock. I'm interested in participating in the 2008 GSoC on a
PSF-mentored project. Specifically, I'm interested in the
urllib/urllib2/urlparse cleanup project.

Aside from general API cleanups and integrating the separate modules
into one, I'd like to focus on rewriting the library to support
completely asynchronous operation, allowing the simultaneous use of an
arbitrarily large number of connections in a single thread.
Naturally, this should work with many different kinds of event loop
APIs, including widely-available interfaces like select and poll,
interfaces provided by libraries like Qt or GLib, and similar APIs I've
never even heard of.
I don't intend to write code to interface with each of these, of course,
but the API used by the revised urllib* should make it easy for anyone
to write a wrapper for an event loop API that isn't supported out of
the box.
Implementing this correctly would probably also entail significantly
rewriting httplib and ftplib and extending their APIs, so I consider
this a part of this project.
I'm very much scratching my own itch here; I've sorely missed this
functionality in the past. There are ways to hack around it, but the
ones I could find were not pretty.
Blocking operation should also continue to be supported, of course; it
wouldn't even be necessary to break the existing API (at least not
immediately; in the long term, it might still be desirable to deprecate
the redundant url* modules).
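
To make the idea concrete, here is a minimal sketch of what an event-loop
adapter and a transfer driven by it might look like. Everything named
here (SelectLoop, Transfer, add_reader, remove_reader, run_once, demo) is
hypothetical and made up for illustration; it is not an existing or
proposed urllib API. The sketch assumes that the library only ever talks
to the loop through add_reader/remove_reader, so a Qt or GLib wrapper
would only need to provide those two methods. Local socket pairs stand in
for real HTTP connections.

```python
import select
import socket


class SelectLoop:
    """Hypothetical adapter: the only surface a transfer needs from a loop."""

    def __init__(self):
        self._readers = {}  # socket -> callback

    def add_reader(self, sock, callback):
        self._readers[sock] = callback

    def remove_reader(self, sock):
        self._readers.pop(sock, None)

    def run_once(self, timeout=1.0):
        """Wait for readable sockets once and dispatch their callbacks."""
        if not self._readers:
            return 0
        ready, _, _ = select.select(list(self._readers), [], [], timeout)
        for sock in ready:
            callback = self._readers.get(sock)
            if callback is not None:
                callback(sock)
        return len(ready)


class Transfer:
    """Collects data from one non-blocking socket until the peer closes it."""

    def __init__(self, loop, sock, on_done):
        self._loop = loop
        self._chunks = []
        self._on_done = on_done
        sock.setblocking(False)
        loop.add_reader(sock, self._on_readable)

    def _on_readable(self, sock):
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            return  # spurious wakeup; wait for the next readiness event
        if data:
            self._chunks.append(data)
        else:
            # Empty read means the peer closed: deliver the full payload.
            self._loop.remove_reader(sock)
            sock.close()
            self._on_done(b"".join(self._chunks))


def demo():
    """Run three simulated transfers concurrently in a single thread."""
    loop = SelectLoop()
    results = []
    peers = []
    for i in range(3):
        ours, theirs = socket.socketpair()
        Transfer(loop, ours, results.append)
        peers.append((i, theirs))
    for i, theirs in peers:
        theirs.sendall(b"response %d" % i)
        theirs.close()
    while len(results) < 3:
        loop.run_once()
    return sorted(results)
```

Calling demo() drives all three transfers from one thread without any
blocking reads. A blocking convenience function could be layered on top
by running such a loop internally until a single transfer completes,
which is one way the existing API might be preserved.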

The comments on <http://wiki.python.org/moin/CleanupUrlLibProject>
suggest that there was considerable interest in exactly this kind of
project during 2007's GSoC. However,
<http://svn.python.org/view/python/trunk/Lib/urllib2.py?rev=60648&view=log>
suggests that if any work was done on this, it hasn't made it into mainline.
Is there some specific reason these proposals didn't work out? Is the
project insufficiently ambitious (in terms of expected effort) to be
approved? Is it too ambitious? Or was work done on this that, for some
specific reason, never made it into mainline?

Any hints would be greatly appreciated. Before giving this a try of my
own, I'd really like to know what happened to those who tried before me.

Regards,
Sebastian Hagen


