urllib, urllib2, httplib -- Begging for consolidation?

Dave Brueck dave at pythonapocrypha.com
Tue Jun 4 22:04:23 CEST 2002

jeremy at alum.mit.edu (Jeremy Hylton) wrote in message news:<b0f083db.0205091346.397e35db at posting.google.com>...
> brueckd at tbye.com wrote in message news:<mailman.1020962597.19539.python-list at python.org>...
> > On 9 May 2002, Paul Boddie wrote:
> > 
> > > For me, what I've mostly been doing with urllib is to connect to
> > > locations and to download files. Indeed, having this functionality in
> > > the standard library is incredibly useful when a server
> > 
> > Right, and I'm not in any way advocating the elimination of this, only
> > that more of the meat of the functionality be built in the corresponding
> > protocol libraries (and not urllib), on top of which is built a thin,
> > general-purpose API that does simple "protocol triage" if you will.
> 
> It's not at all clear that you want all possible logic associated with
> processing HTTP inside httplib.  Rather, specific http-based
> applications may want to use some set of the features.  Thus, I think
> it's helpful to have a separation between the base protocol module and
> a higher level application toolkit.

(this thread was dead, but I ran into a problem today that made me
think of it)

That's fine, but why would so much useful http-specific knowledge live
outside of an http-specific module? If there's some division of the
feature set (base protocol and higher level) it would make sense for
the two modules to be httpcore and httplib (or something) with urllib
built on top.

> That's the rationale for having urllib and urllib2 separate from
> httplib, I believe.  The rationale for having urllib2 in addition to
> urllib is flexibility. urllib requires applications to get exactly one
> set of features; urllib2 lets an application pick the features it
> needs.

Ok, maybe I'm just not understanding how httplib/urllib/urllib2 work
together. For example, what is the correct way to do an HTTP HEAD
request that follows redirects? It's not hard, but it's silly to have
to code it yourself if somebody already did the work for you. Well,
httplib doesn't know how to follow an HTTP redirect, but lo and
behold, urllib does. Unfortunately, there's no way to use it in this
case, because urllib is for opening URLs (the GET is hardcoded and not
easily overridable).
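
For illustration, here is a minimal sketch of what "coding it yourself"
on top of the base protocol module might look like. It assumes modern
Python 3, where httplib is named http.client. The function name
head_follow_redirects, the max_redirects limit and the timeout are
hypothetical choices for this example, not part of any library.

```python
# Sketch only: assumes Python 3, where httplib became http.client.
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_follow_redirects(url, max_redirects=5, timeout=10):
    """Send HEAD requests, following redirects by hand.

    Returns (final_url, status, headers). The name and the
    parameters are hypothetical, chosen for this example.
    """
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn_class = http.client.HTTPSConnection
        elif parts.scheme == "http":
            conn_class = http.client.HTTPConnection
        else:
            raise ValueError("unsupported scheme: %r" % parts.scheme)

        conn = conn_class(parts.hostname, parts.port, timeout=timeout)
        try:
            # Rebuild the request target from path and query string.
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("HEAD", path)
            resp = conn.getresponse()
            status = resp.status
            headers = dict(resp.getheaders())
            location = resp.getheader("Location")
        finally:
            conn.close()

        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the
            # URL that was just requested.
            url = urljoin(url, location)
            continue
        return url, status, headers

    raise RuntimeError("too many redirects")
```

A call such as head_follow_redirects("http://www.python.org/") would
return the final URL, the status code and the response headers without
downloading a body. The redirect loop is exactly the kind of
HTTP-specific logic the paragraph above is about.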

I don't disagree that a layering or division of functionality can be
helpful, I just disagree that the current layering/division makes
sense. I think it's great that opening a URL via urllib will
automatically follow redirects, it's just a shame that you can't take
advantage of that functionality if what you're doing falls outside the
narrow design domain of urllib transactions. :(

