[Web-SIG] So what's missing?
John J Lee
jjl at pobox.com
Sat Oct 25 16:54:06 EDT 2003
On Sat, 25 Oct 2003, Ian Bicking wrote:
> On Saturday, October 25, 2003, at 07:38 AM, John J Lee wrote:
> > It's a minor issue, but it seems nicer to me to have authentication
> > separate if it can easily be separate -- that fits in with the general
> > philosophy of urllib2 that you pick 'n mix the features you want. What
> > are the trivial reasons for it breaking on non-HTTP auth?
> There's a HTTPBasicAuthHandler, but no HTTPSBasicAuthHandler, and
> though the two concepts are orthogonal they are still tied into each
> other. Another option would be to take HTTPS out of the class
> hierarchy, and make SSL a feature of HTTPHandler (and maybe the other
Well, that would break code. And adding an HTTPSBasicAuthHandler is only
five lines or so (even less if you want a class that handles both HTTP and
> The AuthHandlers are a little annoying too, you can't just give them a
> username/password. You have to give them some manager object that can
> be queried for a password for a username/realm/URL. This is a nice
> option to have, but in most cases you don't need that kind of
> generality, and it makes it a lot harder to understand what you need to
> do. username=x, password=y are very easy to understand.
That's just a documentation issue, I think -- and possibly adding some
convenience method. I wrote some docs for this, and I keep asking for
people who seem to be actually using these features to check this
documentation bug, but nobody has yet:
You don't have to provide a password manager object in fact: just let the
HTTPBasicAuthHandler create one for you, and use the add_password method
(which admittedly does require realm and uri as well as username /
password -- perhaps None should act as a wildcard there?).
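
A minimal sketch of that usage, written against Python 3's urllib.request (where the urllib2 classes live today) rather than urllib2 itself. The realm, URL, user name and password are made-up placeholders. The second half shows one existing way to get wildcard-like behaviour, HTTPPasswordMgrWithDefaultRealm, which does mean passing in a manager after all:

```python
import urllib.request

# No password manager is passed in, so the handler creates its own.
auth_handler = urllib.request.HTTPBasicAuthHandler()

# Placeholder credentials: realm, uri, user and passwd are all invented.
auth_handler.add_password(
    realm="Example Realm",
    uri="http://www.example.com/private/",
    user="x",
    passwd="y",
)
opener = urllib.request.build_opener(auth_handler)
# opener.open("http://www.example.com/private/index.html") would now answer
# a Basic auth challenge for that realm with the stored credentials.

# Wildcard-style alternative: this manager falls back to the entry stored
# under realm None when no entry matches the realm the server sends.
default_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
default_mgr.add_password(None, "http://www.example.com/", "x", "y")
wildcard_handler = urllib.request.HTTPBasicAuthHandler(default_mgr)
```

The first form still needs the realm up front, which is the part that username=x, password=y would hide.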
> >> Cookie handling also fits into this, but from the opposite direction
> >> from a URL object, since we are creating something of a user agent.
> >> You'd almost want to do:
> >> ua = UserAgent()
> >> url = web.URL('http://whatever.com')
> >> content = ua.get(url)
> >> Or something like that. I think an explicit agent is called for,
> >> separate from the URLs that it may retrieve. But only when you start
> >> considering cookies and caching.
> > [...]
> > Are you suggesting replacing urllib2, building on top of it, or
> > extending it? urllib2's handlers already get a lot of the
> > 'user-agent' job done. What requirements does caching impose that
> > urllib2 doesn't meet? There's already a CacheFTPHandler.
> I think a URL class would probably build on top of urllib2, but
> would also need some more features. And obviously urllib2 can't go
> anywhere, so we might as well use it.
OK. Does this URL class proposal fit with that path module PEP, do you
think? Somebody mentioned that PEP (it was a PEP, wasn't it...?) before,
but I've forgotten everything about it :-)
> The caching in CacheFTPHandler is connection caching, not result
> caching. HTTP has a wide array of ways to indicate caching, check for
> updates, etc. Enough that it becomes kind of complicated, which is why
> I don't think that fits well into the idea of a URL object (which
> should be quite simple, at least from the outside).
That doesn't answer my question. To repeat: What requirements does
caching impose that *urllib2* doesn't meet? And why do we need a new
UserAgent class when we already have urllib2 and its handlers?
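
For comparison, here is a rough sketch of how thin such an agent could be if it were just an opener plus its handlers. It assumes Python 3's urllib.request and http.cookiejar, not the urllib2 under discussion, and the UserAgent class and its get method are hypothetical names taken from the example quoted above, not an existing API:

```python
import http.cookiejar
import urllib.request


class UserAgent:
    """Hypothetical agent: an opener that shares one cookie jar."""

    def __init__(self):
        self.cookies = http.cookiejar.CookieJar()
        # The opener does the real work; cookies are just another handler.
        self.opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(self.cookies)
        )

    def get(self, url):
        # Fetch the URL and return the response body as bytes.
        with self.opener.open(url) as response:
            return response.read()


ua = UserAgent()
# content = ua.get("http://whatever.com")  # needs network access
```

Everything the wrapper adds is the get convenience method. Result caching is not covered by this sketch.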