[Web-SIG] Stuff left to be done on WSGI

John J Lee jjl at pobox.com
Sat Aug 28 19:18:57 CEST 2004

On Fri, 27 Aug 2004, Ian Bicking wrote:
> Phillip J. Eby wrote:
> > wrong.  'web' just seems like a nuisance attractor for all sorts of
> > unproductive bickering on so many levels.
> be in 2.5 under that name.  I was thinking of it like a package for
> various Python web-related modules (the Next Generation; forgoing this
> current generation which is all in the root).

+0 for web as bag-of-modules

Seems uncontroversial, since anybody with a web module has an equal right
to lay claim to a patch of land within it.

OTOH, I've never found the Python practice of sticking all stdlib modules
in the root namespace to be troublesome.  And the reality is that there is
no grand scheme here: people generally do small pieces of work as they
find they need / want to do it.

> Almost all the modules in the root have issues.  Well, let's enumerate...
> webbrowser: this seems like a totally weird module to me


> cgi: ick ick ick.
> cgitb: this is okay.
> urllib: defunct?

It's not about to go away. (especially since Guido wrote it, I think ;-)

Unfortunately, I think there are enough bugs in both urllib and urllib2 that
it's hard to say that either is unconditionally better for all purposes.

> urllib2: surprisingly hard to use in a number of ways.  There was some
> discussion about this early in Web-SIG.  I think the client stuff John
> Lee has done at: http://wwwsearch.sourceforge.net/ is better, and I
> think he's interested in that direction.  Probably not right now, but at
> some point this could well improve on urllib*

This is what I hope to do on urllib2 for 2.5, very roughly in order of
priority.  I guess you're referring above mostly to 3 in this list.  1, 2
and 3 will likely happen, 4, 5, and 6 may or may not.

Help is welcome :-)

1 Add more handlers from ClientCookie: Robot rules, http-equiv, refresh, etc.

2 Add features that are present in urllib but missing from urllib2
  (urlretrieve is the most obvious, and easy to fix).

3 A class bearing some resemblance to mechanize.UserAgent, as we discussed
  here before.  The idea is to avoid having to make a new object each time
  you want to change URL-opener behaviour.

4 Possibly improve proxy and authentication support, if I can be bothered.  I
  think this is probably still quite buggy, despite valuable changes from
  Anthony Baxter and others.

5 Connection caching.

6 HEAD, GET byte range (and maybe something to make resuming downloads as
  easy as possible), conditional GET requests, a function to do file

> DocXMLRPCServer: what a weird module.

Weird indeed.  Never noticed it before.

> HTMLParser: lives in the world between web and XML.  Some of the client
> tools in wwwsearch are very HTML-centric as well.  But it all fits together.
> htmllib: deprecated, I think?  Or HTMLParser?  I don't know what's going
> on here.

As you probably know, htmllib just adds some possibly-convenient bits and
pieces on top of sgmllib.

sgmllib/htmllib is more relaxed about bad HTML than is HTMLParser, so is
certainly worth keeping.

