urllib, urllib2, httplib -- Begging for consolidation?
John J. Lee
jjl at pobox.com
Thu Jun 6 17:55:07 EDT 2002
On Wed, 5 Jun 2002 brueckd at tbye.com wrote:
> On Wed, 5 Jun 2002, John J. Lee wrote:
[...]
> Before I jump in I do want to emphasize that I'm not trying to be too
> critical of httplib/urllib/urllib2 - the OP asked if they should be
> consolidated, and my short response is "yes, and reorganized too". :-)
OK -- but I see no reason to consolidate the implementation. I don't
really think there is any reason to consolidate the interface either
(which I suppose is what you mean by reorganisation), but the
documentation for httplib might include a pointer to urllib{2,}, since as
you point out below it's probably not obvious that that's where the HTTP
redirection stuff is.
[...]
> So, depending on your specific needs, you'd "plug in" to this hierarchy at
> the appropriate level. If all you need is to go fetch some object, you'd
[...]
> The key though is that each level you'd have to reinvent as little as
> possible - you'd build on related work from lower levels.
Almost goes without saying. But again, I think that's what we've got
already.
[...]
> functionality instead of enforced layering). The approach we have today is
> *sort of* like this, except that richer and smarter functionality is being
> added at the *top* of the hierarchy.
The redirection stuff was added between httplib and the OpenerDirector /
urlopen stuff. That's not at the top.
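
To see where the redirection code sits, one can list the handlers that a
default opener wires in. This is only a sketch, and it assumes Python 3,
where urllib2 lives on as urllib.request; the variable names are made up
for illustration and nothing touches the network:

```python
import urllib.request

# build_opener() with no arguments installs the default handler set.
opener = urllib.request.build_opener()

# Each handler is a separate object plugged into the OpenerDirector.
names = [type(h).__name__ for h in opener.handlers]
print(names)
```

The redirect handler shows up as one entry among several, alongside the
plain HTTP handler, rather than as part of the top-level urlopen call.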
> Problem #1 is what makes me throw my hands up in frustration when we talk
> about, e.g., expanding the urllib APIs to have some way to do a HEAD
> request: it doesn't belong on that level of functionality.
I think you mean "it doesn't belong in a module named 'urllib2'"... or at
least you should mean that ;)
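
For what it's worth, a HEAD request can be expressed at the Request level
without changing the generic fetch-a-URL interface. A minimal sketch,
assuming Python 3's urllib.request (the successor to urllib2); HeadRequest
is a hypothetical name, not something the library provides, and the
example URL is a placeholder. No request is actually sent:

```python
import urllib.request


class HeadRequest(urllib.request.Request):
    """Hypothetical helper: a Request whose HTTP method is always HEAD."""

    def get_method(self):
        return "HEAD"


req = HeadRequest("http://www.example.com/")

# Python 3 also accepts the method directly as a keyword argument.
same = urllib.request.Request("http://www.example.com/", method="HEAD")

print(req.get_method(), same.get_method())
```

Either object could then be handed to an opener; the point is only that
the HTTP-specific knowledge stays on the request object.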
> That generic
> interface is for making it trivially easy to fetch the contents associated
> with a URL, independent as much as possible from protocol.
Only part of urllib2 (OpenerDirector etc.) is a generic any-old-URL-scheme
API. The rest (e.g. AbstractHTTPHandler, HTTPRedirectHandler) is
HTTP-specific, and can be used on its own, without going through the
generic stuff.
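
As a rough illustration of using the HTTP-specific part on its own, the
redirect handler can be asked directly what request it would issue for a
given redirect, with no OpenerDirector involved. This sketch assumes
Python 3's urllib.request rather than the urllib2 of the time, and the
URLs are placeholders:

```python
import email.message
import urllib.request

handler = urllib.request.HTTPRedirectHandler()
original = urllib.request.Request("http://www.example.com/old")

# Ask the handler how it would follow a 302 to a new URL.
# In CPython's implementation the fp and headers arguments are only
# consulted when the redirect is refused, so placeholders suffice here.
redirected = handler.redirect_request(
    original,
    None,                       # fp: response file object (unused here)
    302,
    "Found",
    email.message.Message(),    # response headers (empty placeholder)
    "http://www.example.com/new",
)

print(redirected.full_url)
```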
[...]
> Probably 90% of the problem is naming and/or module organization - urllib,
> a module to retrieve the contents given a generic URL, should be just that
> and little else, and protocol-specific knowledge should be accessible from
> protocol-specific modules.
Given how things are now and the need for backwards compatibility, what do
you suggest doing? I think putting a pointer in the httplib docs (if
there's not one there already) is the best that can be done now.
John