[Python-bugs-list] [ python-Bugs-407783 ] urllib2: AbstractHTTPHandler limits flexible client implemen
noreply@sourceforge.net
Mon, 01 Apr 2002 13:54:05 -0800
Bugs item #407783, was opened at 2001-03-12 00:43
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=407783&group_id=5470
Category: Python Library
Group: None
>Status: Closed
Resolution: Postponed
Priority: 4
Submitted By: Bill Bumgarner (bbum)
Assigned to: Jeremy Hylton (jhylton)
Summary: urllib2: AbstractHTTPHandler limits flexible client implemen
Initial Comment:
The implementation of the do_open() method on the AbstractHTTPHandler class contains a couple of "features" that could be considered to be "bugs". In any case, each time I have wanted to use urllib2 for relatively straightforward development of an HTTP client, I have had to effectively replace the HTTPHandler with one that reimplements do_open() (or http_open() in 2.0). Maybe my usage is not the norm-- in any case, the more information, the better...
Specifics (all names in context of Python 2.1):
- AbstractHTTPHandler does not allow for anything but GET or POST requests. GET is the default and POST happens anytime the request object contains data to be passed to the server.
This limitation is the only thing that stands in the way of using the AbstractHTTPHandler *directly* to implement, say, a WebDAV client or to do something like a site sucker that uses the HEAD method to determine if content has changed.
- [this is likely a bug] the method will throw an exception if *any* response is received from the server other than 200. However, HTTP defines that all 2XX responses should be treated as successful.
In any case, there are *a lot* of contexts within which a non-200 response may be treated as a 'success' of some sort or another. Regardless, it is really outside of the scope of the AbstractHTTPHandler's implementation to make the success/failure decision-- it should simply return the same thing regardless of the response status.
- [a bug?] Whenever an exception is raised (a non-200 code is received), the status code and reason (as provided by the server) are both lost.
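The first point above (only GET and POST are possible) can be illustrated with a small sketch. This is not the urllib2 code under discussion: it assumes a Request class that exposes a get_method() hook, as Python 3's urllib.request (the successor of urllib2) does. The MethodRequest name and its http_method parameter are hypothetical.

```python
from urllib.request import Request


class MethodRequest(Request):
    """Request that carries an explicit HTTP method (hypothetical helper)."""

    def __init__(self, url, http_method=None, **kwargs):
        Request.__init__(self, url, **kwargs)
        # Stored under a private name so it does not clash with Request's
        # own attributes.
        self._http_method = http_method

    def get_method(self):
        # Use the explicit method if one was given; otherwise fall back to
        # the default rule (POST when data is present, GET when it is not).
        if self._http_method is not None:
            return self._http_method
        return Request.get_method(self)


# No network access is needed to see which method would be used.
head_request = MethodRequest("http://example.invalid/", http_method="HEAD")
print(head_request.get_method())
```

A handler that consults get_method() when it issues the request could then send HEAD, OPTIONS, or a WebDAV verb without do_open() having to be reimplemented.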
I see that moshez has been primarily responsible for recent changes surrounding this code. I would be happy to contribute to the evolution of the code; please feel free to contact me directly.
----------------------------------------------------------------------
>Comment By: Jeremy Hylton (jhylton)
Date: 2002-04-01 21:54
Message:
Logged In: YES
user_id=31392
I still think this is a useful feature, but I don't have
time to champion it. Since the original poster hasn't
followed up in the last year, I'll just close the report.
----------------------------------------------------------------------
Comment By: Jeremy Hylton (jhylton)
Date: 2001-10-09 18:06
Message:
Logged In: YES
user_id=31392
It's only six or seven months since there was an active
discussion on this bug report. Anyone still interested in
fixing it? I think it's reasonable to try and fix these
issues for 2.2, but I don't have time to implement it all
myself.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-05 18:07
Message:
Logged In: YES
user_id=6380
Unassigning from Moshe -- he doesn't seem to have time
(Moshe, if you're still interested, just change the owner
field back to you).
----------------------------------------------------------------------
Comment By: Moshe Zadka (moshez)
Date: 2001-04-09 14:02
Message:
Logged In: YES
user_id=11645
I'm formally postponing it until the 2.1 release comes out
-- clearly none of this can be considered a bug fix.
----------------------------------------------------------------------
Comment By: Bill Bumgarner (bbum)
Date: 2001-03-20 00:37
Message:
Logged In: YES
user_id=103811
OK-- I can understand that logic (close to beta, etc).
Given the prominence of Python in the WebDAV
community combined with the increasing use of 2xx
(and 1xx) codes, it would be extremely useful to
include-- at the least-- examples of handling such via
the urllib2 modules.
Beyond that, it would be quite helpful to the developers
to expend some amount of engineering effort such that
handling 2xx response codes doesn't require
__getattr__ trickery!
Similarly, breaking out the HTTP raw connection setup
from the method that actually composes and sends the
HTTP request would be helpful in that it would greatly
reduce the amount of code that has to be duplicated
when subclassing the handler to customize handling of
2xx or when specifying methods other than GET/POST.
I.e. most developers will be confused to the point of
being overwhelmed if "how do I customize responses
such that they don't raise" or "how do I send an
OPTIONS or HEAD request" requires figuring out how
to deal with setting up and sending a request via the
much-lower-level-than-urllib2 HTTP API.
----------------------------------------------------------------------
Comment By: Moshe Zadka (moshez)
Date: 2001-03-18 09:22
Message:
Logged In: YES
user_id=11645
None of these can really be classified as "bugs" rather than
functionality enhancement requests, and this is something
I'm not sure I want to do this close to the second beta.
BTW, one thing I'm sure I *don't* want to change -- handling
of 20x codes. If you want to handle 201/206/whatever, then
just handle them. With some __getattr__ trickery, you can
have a class that handles all http_error_20x errors, so this
is *easy* for 3rd party urllib2 extensions to add.
Regarding explicitly determining the command: just put the
command inside the request object, and use it in your
own HTTPHandler/HTTPSHandler. This may be done in the next
version of urllib2 (for 2.2). At that time I might also add
the feature that other encodings (not just
application/x-www-form-urlencoded, but also
multipart/form-data) will be supported.
----------------------------------------------------------------------
Comment By: Jeremy Hylton (jhylton)
Date: 2001-03-16 18:43
Message:
Logged In: YES
user_id=31392
I haven't had any spare cycles to devote to urllib2 this
year. Perhaps Moshe can be of more help in the near term.
Following the 2.1 release, I may have more time.
I never used urllib2 in a situation that produced anything
other than vanilla responses -- 200, 401, etc. I'm not too
surprised to hear that there are problems with 2XX cases.
Can you post some examples of the sorts of things you want
to do? It sounds reasonable in the abstract, but some code
would help. If not in this patch archive, perhaps on
comp.lang.python?
----------------------------------------------------------------------
Comment By: Bill Bumgarner (bbum)
Date: 2001-03-12 03:59
Message:
Logged In: YES
user_id=103811
I realized that the exception throw behaviour is more fundamental to the underlying implementation than may have been indicated in the above description. In particular, throwing an HTTP exception when handling a 401 is key to making the various Authentication Handlers work.
I still feel that the behaviour should be normalized across all requests such that the caller is responsible for determining error conditions or, at the least, has access to the same data in a relatively similar format upon success or failure.
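A minimal sketch of the normalized behaviour described above, written against Python 3's urllib.error and urllib.response rather than the 2.1 code. The names open_normalized, fake_not_found, and fake_created are hypothetical. The sketch leaves exception raising in place, so authentication handlers could still rely on it, and converts both outcomes to the same tuple at the call site.

```python
import io
from urllib.error import HTTPError
from urllib.response import addinfourl


def open_normalized(open_func, request):
    """Return (status, reason, body) for success and HTTP-error responses alike.

    open_func is any callable that behaves like an opener's open() method:
    it returns a response object or raises HTTPError.
    """
    try:
        response = open_func(request)
    except HTTPError as err:
        # HTTPError is itself a response-like object, so the status code,
        # reason, and body are all still available here.
        response = err
    status = response.getcode()
    reason = getattr(response, "reason", None) or getattr(response, "msg", "")
    body = response.read()
    return status, reason, body


# Stand-ins for a real opener, so the example runs without a network.
def fake_not_found(request):
    raise HTTPError("http://example.invalid/", 404, "Not Found", {},
                    io.BytesIO(b"missing"))


def fake_created(request):
    return addinfourl(io.BytesIO(b"created"), {}, "http://example.invalid/", 201)


print(open_normalized(fake_not_found, None))
print(open_normalized(fake_created, None))
```

With a wrapper like this, the code calling the opener decides which status codes count as failures, and it sees the same three fields either way.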
----------------------------------------------------------------------