
urllib2.py, after receiving an HTTP response, decides whether it was an error and raises an exception, or just returns the info.

For example, you call ``urllib2.urlopen("http://www.google.com")``. If you receive 200, it's OK; if you receive 500, you get an exception raised.

How does it decide? The class HTTPErrorProcessor, line 490, actually says:

    class HTTPErrorProcessor(BaseHandler):
        ...
            if code not in (200, 206):
                # it prepares an error response
                ...

Why only 200 and 206? A coworker of mine found this (he was receiving 202, "Accepted"). In RFC 2616 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html) it says about codes "2xx"...

    This class of status code indicates that the client's request was successfully received, understood, and accepted.

I know it's not difficult to work around this (you have to catch all the exceptions and check the code), but I was wondering about the reasoning behind it.

IMHO, "2xx" should not raise an exception. If you also think it's a bug, I can fix it.

Regards,

--
. Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
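A minimal sketch of the workaround mentioned above (catch the exception, then check the code). The helper name `open_accepting_2xx` and its `opener` parameter are hypothetical, introduced only for illustration; the import fallback to `urllib.request` is an assumption so the sketch also runs where `urllib2` is not available. It relies on the fact that `HTTPError` carries the status in its `code` attribute and can itself be used as a response object.

```python
try:
    import urllib2 as urlreq              # the module discussed in this thread
    from urllib2 import HTTPError
except ImportError:
    import urllib.request as urlreq       # assumption: later-Python equivalent
    from urllib.error import HTTPError


def open_accepting_2xx(url, opener=urlreq.urlopen):
    """Open url, treating every 2xx status as success (hypothetical helper)."""
    try:
        return opener(url)
    except HTTPError as err:
        # HTTPError doubles as a response object (code, headers, read()).
        if 200 <= err.code < 300:
            return err
        # Anything outside 2xx is still reported as an error.
        raise
```

Callers that need to tell 200 from 202 can still inspect `.code` on whatever the helper returns.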

Why only 200 and 206?
This kind of question can often be answered through the revision history.

If you do 'svn annotate', you see that the line testing for 206 was last changed in r36262. Comparing that to the previous revision, you see that it previously said

    if r.status == 200:

and that amk changed it with the log message

    [Bug #912845] urllib2 only checks for a 200 return code, but 206 is also legal if a Range: header was supplied. (Actually, should the first 'if' statement be modified to allow any 2xx status code?)

Going to bugs.python.org/912845, you see that the current form was proposed by Ahmed F. (oneofone), apparently because it crashed for him with 206. In 2006, heresiarch asked the same question that you are now asking and that amk asked before.

Going further back, you see that HTTPErrorProcessor (along with the handling of 200) was added by jhylton in r34909, which in turn originates from #852995, where jjlee introduced the handlers in order to support cookie handling. Looking at the change, you see that it is just refactoring; the special-casing of 200 was present before.

In earlier versions, the error handling was done using this block:

    14267 jhylton    if code == 200:
    14267 jhylton        return addinfourl(fp, hdrs, req.get_full_url())
    14267 jhylton    else:
    14267 jhylton        return self.parent.error('http', req, fp, code, msg, hdrs)

You then find that r14267 is the initial revision, checked in with the comment

    # EXPERIMENTAL
    #
    # An extensible library for opening URLs using a variety protocols.
    # Intended as a replacement for urllib.

So it seems that it only tests for 200 and 206 because the experiments never produced a need for anything else.

Regards,
Martin

Martin v. Löwis wrote:
Why only 200 and 206?
Thanks for this detailed explanation. I learned a lot about how to "discover" the history of a piece of code (didn't know about "annotate").

Regarding the codes themselves: as the tests for 200 and 206 came from just needing them, I think there's no reason not to include the rest of the 2xx codes. So, on the basis that the RFC says that "2xx" codes mean that the request succeeded, I think we shouldn't raise an exception.

Right now, it's a bug. Do you think it's safe to fix this, or will it break much code?

Regards,

--
. Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

On Tue, Mar 27, 2007 at 04:12:06PM +0000, Facundo Batista wrote:
(didn't know about "annotate").
It is also known under the name "blame"! ;) Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

On Tue, Mar 27, 2007 at 07:14:35PM +0200, "Martin v. Löwis" wrote:
But "blame" is its official primary name!
See? (-: Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

Right now, it's a bug. Do you think it's safe to fix this or will break much code?
Who am I to judge whether a fix will break much code? Personally, I think it should treat all 2xx responses as success. Callers can then still check the response code themselves if they need to. Regards, Martin
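A sketch of what "treat all 2xx responses as success" could look like as a handler, not necessarily the change that was eventually committed. The class name `Accept2xxProcessor` is hypothetical; the fallback import of `urllib.request` is an assumption so the example also runs where `urllib2` is not available.

```python
try:
    from urllib2 import HTTPErrorProcessor, build_opener         # as in this thread
except ImportError:
    from urllib.request import HTTPErrorProcessor, build_opener  # assumption: later Python


class Accept2xxProcessor(HTTPErrorProcessor):
    """Hypothetical processor: pass any 2xx response through untouched."""

    def http_response(self, request, response):
        if 200 <= response.code < 300:
            # Success class per RFC 2616; the caller may still check the code.
            return response
        # Everything else goes through the stock error handling.
        return HTTPErrorProcessor.http_response(self, request, response)

    https_response = http_response


# build_opener() uses a subclass in place of the default HTTPErrorProcessor.
opener = build_opener(Accept2xxProcessor)
```

With this opener, `opener.open(url)` would return a 202 response instead of raising, and the caller can look at `response.code` to tell the cases apart.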

Martin v. Löwis wrote:
Who am I to judge whether a fix will break much code? Personally, I
Sorry, this was an error. I meant "you" as a plural (in Spanish there are two different words for the singular and plural second person), and wrote it as is; now, re-reading the paragraph, it's confusing. So, you-people-on-the-list, do you think fixing this will be a problem?
think it should treat all 2xx responses as success. Callers can then still check the response code themselves if they need to.
I think the same. If nobody has a conflict with this decision, I'll fix this. Thank you! :) -- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

On Tue, Mar 27, 2007, Facundo Batista wrote:
The proper English word for plural "you" is "y'all". ;-) Except for "all y'all". Isn't English fun? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Need a book? Use your library!

Aahz wrote:
That's not English, it's 'Mer'can. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Recent Ramblings http://holdenweb.blogspot.com

Facundo Batista wrote:
Nobody raised any objection, so I'll fix this in the next few days. Regards, -- . Facundo . Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/

participants (6)
- "Martin v. Löwis"
- Aahz
- Facundo Batista
- Jeremy Hylton
- Oleg Broytmann
- Steve Holden