[New-bugs-announce] [issue2776] urllib2.urlopen() gets confused with path with // in it

Ambarish Malpani report at bugs.python.org
Tue May 6 23:30:16 CEST 2008


New submission from Ambarish Malpani <ambarish at yahoo.com>:

Try the following code:
import urllib
import urllib2

url = 'http://features.us.reuters.com//autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html'

data = urllib.urlopen(url).read()
data2 = urllib2.urlopen(url).read()

Fetching the URL with urllib works fine. With urllib2, the request
is malformed and I get back an HTTP 404.

The request sent in the second case (urllib2) is:
GET //autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html HTTP/1.1\r\n
Accept-Encoding: identity\r\n
Host: autos\r\n
Connection: close\r\n
....

The Host header seems to be taken from the text after the last // in the
URL rather than the first.
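
To make the expected behavior concrete, here is a minimal sketch of how the
URL above should split into host and path, next to a naive "last //" split
that happens to reproduce the bad Host value. The helper
host_after_last_double_slash is hypothetical and written only for
illustration; it is not taken from urllib2 and is not claimed to be what
urllib2 actually does internally. The try/except import is an assumption
added so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2

url = 'http://features.us.reuters.com//autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html'

# Expected split: the host is everything between the first // and the next /.
parts = urlsplit(url)
expected_host = parts.netloc
expected_path = parts.path


def host_after_last_double_slash(u):
    """Hypothetical, for illustration only: take the host from the text
    that follows the LAST '//' in the URL instead of the first."""
    rest = u[u.rfind('//') + 2:]
    return rest.split('/', 1)[0]


print(expected_host)                      # features.us.reuters.com
print(expected_path)                      # //autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html
print(host_after_last_double_slash(url))  # autos
```

If the sketch is right, a correct request would carry
"Host: features.us.reuters.com" while still requesting the path that
begins with //autos.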

----------
components: Extension Modules
messages: 66334
nosy: ambarish
severity: normal
status: open
title: urllib2.urlopen() gets confused with path with // in it
type: behavior
versions: Python 2.5

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2776>
__________________________________

