[Patches] [ python-Patches-649742 ] urllib2.Request's headers are case-sens.

SourceForge.net noreply@sourceforge.net
Tue, 17 Jun 2003 14:53:00 -0700


Patches item #649742, was opened at 2002-12-06 13:26
Message generated for change (Comment added) made by bcannon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=649742&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: John J Lee (jjlee)
Assigned to: Brett Cannon (bcannon)
Summary: urllib2.Request's headers are case-sens.

Initial Comment:
urllib2.Request's headers are case-sensitive.

This is unfortunate if, for example, you add a content-type header
like so:

req = urllib2.Request("http://blah/", data,
                      headers={"Content-Type": "text/ugly"})

because, while urllib2.AbstractHTTPHandler is careful to check not to
add this header if it's already in the Request, it happens to use a
different case convention:

                if not req.headers.has_key('Content-type'):
                    h.putheader('Content-type',
                                'application/x-www-form-urlencoded')

so you get both headers:

Content-Type: text/ugly
Content-type: application/x-www-form-urlencoded

in essentially random order.  The documentation says:

"""Note that there cannot be more than one header with the same name,
and later calls will overwrite previous calls in case the key
collides.  Currently, this is no loss of functionality, since all
headers which have meaning when used more than once have a
(header-specific) way of gaining the same functionality using only one
header."""

RFC 2616 (section 4.2) says:

"""The order in which header fields with the same field-name are
received is therefore significant to the interpretation of the
combined field value, and thus a proxy MUST NOT change the order of
these field values when a message is forwarded."""

The patch fixes this by adding normalisation of header case to
urllib2.Request.  With the patch, you'd get:

Content-type: text/ugly
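
A minimal sketch of the behaviour described above, assuming plain dicts
stand in for Request.headers. It is an illustration only, not the attached
patch; the helper name normalise is hypothetical, and the code is written
so that it also runs on current Python:

```python
def normalise(headers):
    """Return a new dict whose keys are normalised with str.capitalize()."""
    normalised = {}
    for name, value in headers.items():
        normalised[name.capitalize()] = value
    return normalised

# Header supplied by the caller, using a different case convention
# from the one the handler checks for.
user_headers = {"Content-Type": "text/ugly"}

# Case-sensitive check: the default is added although the header exists.
naive = dict(user_headers)
if "Content-type" not in naive:
    naive["Content-type"] = "application/x-www-form-urlencoded"

# Check after normalisation: the caller's header is found, no duplicate.
fixed = normalise(user_headers)
if "Content-type" not in fixed:
    fixed["Content-type"] = "application/x-www-form-urlencoded"

print(sorted(naive))  # two differently cased keys
print(fixed)          # a single Content-type entry
```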


John


----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-06-17 14:53

Message:
Logged In: YES 
user_id=357491

Done as revision 1.51.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-06-17 06:06

Message:
Logged In: YES 
user_id=261020

OpenerDirector.addheaders is another source of 
headers, on top of the ones provided by 
Request.headers and those hard-coded in 
AbstractHTTPHandler.do_open. 
 
These headers should be compared case- 
insensitively, just as the others are.  The patch 
I just attached does this. 
 
Since all the other headers are .capitalize()d, 
this patch also changes the default value of 
addheaders back to "User-agent" (reversing 
patch 599836). 
 
This really needs to be fixed before 2.3 final. 
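
A rough sketch of the case-insensitive comparison described here, assuming
addheaders is a list of (name, value) pairs and that the request's own
header names are already .capitalize()d. The function name
merge_default_headers is hypothetical and this is not the attached patch:

```python
def merge_default_headers(request_headers, addheaders):
    """Add opener-level defaults unless the request already sets them.

    Assumes the keys of request_headers are already capitalize()d.
    """
    merged = dict(request_headers)
    for name, value in addheaders:
        name = name.capitalize()
        if name not in merged:
            merged[name] = value
    return merged

# The request sets its own user agent; the opener default differs in case.
request_headers = {"User-agent": "mine/1.0"}
addheaders = [("User-Agent", "default-agent"), ("Accept", "*/*")]

merged = merge_default_headers(request_headers, addheaders)
print(merged)
```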

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-12 00:30

Message:
Logged In: YES 
user_id=357491

Applied as urllib2.py 1.43.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-11 23:56

Message:
Logged In: YES 
user_id=357491

OK, the idea behind the patch has been approved.  I will apply it some time this week.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-11 16:27

Message:
Logged In: YES 
user_id=357491

OK, I fixed the 'items' calls in my local copy of the file.  I am going to get 
someone to double-check this patch and if they give me the all-clear I will 
apply it.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-05-11 04:19

Message:
Logged In: YES 
user_id=261020

I used items rather than iteritems because that's what 
the rest of the module does, so maybe you want to 
look at the other 5 instances of that if you use 
iteritems. 
 
Otherwise, fine. 
 

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-10 23:06

Message:
Logged In: YES 
user_id=357491

So the patch looks almost perfect.  There are only two things that I would 
change.  One is how you iterate over the dictionary.  It is better to use 
header.iteritems than header.items .  Second, I would not do the 
capitalization in __init__ directly.  Instead, to match the expectation of the 
docs ("headers should be a dictionary, and will be treated as if 
add_header() was called with each key and value as arguments") you should 
just call self.add_header(k, v) in the loop.  This will lower code redundancy 
and if for some reason add_header is changed no one will have to worry 
about changing __init__ at the same time.

But otherwise the patch looks good.  I have uploaded a corrected version of 
the patch.  Have a look and let me know if it works for you.
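
To illustrate the second suggestion, here is a minimal sketch in which
__init__ delegates to add_header instead of capitalizing keys itself.
SketchRequest is a hypothetical stand-in for urllib2.Request that models
only the header handling; it is not the uploaded patch:

```python
class SketchRequest:
    """Hypothetical stand-in for urllib2.Request (headers only)."""

    def __init__(self, url, headers=None):
        self.url = url
        self.headers = {}
        # Treat the dict as if add_header() had been called for each
        # key and value, so normalisation lives in one place.
        for key, value in (headers or {}).items():
            self.add_header(key, value)

    def add_header(self, key, val):
        # Normalise case so later calls overwrite earlier ones.
        self.headers[key.capitalize()] = val


req = SketchRequest("http://blah/", {"Content-Type": "text/ugly"})
req.add_header("CONTENT-TYPE", "text/plain")
print(req.headers)
```

If add_header later changes how it normalises names, __init__ picks up
the change without being edited.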

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-05-10 04:55

Message:
Logged In: YES 
user_id=261020

Patch is attached (no doc changes required). 
 

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-09 16:46

Message:
Logged In: YES 
user_id=357491

OK.  If you can rewrite the patch using capitalize, I will take a look and 
decide whether to apply it or not.

Also, if this will require changes to the docs please also include a patch for 
that.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-05-09 05:41

Message:
Logged In: YES 
user_id=261020

Ooh, look at all those string methods I'd forgotten about. 
 
Yes, good idea, but name.capitalize() would be simpler and minutely 
more conservative (the module already uses that convention), hence 
better. 
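
For reference, a short comparison of the two string methods under
discussion, using a few sample header names chosen for illustration:

```python
names = ["content-type", "Content-Type", "USER-AGENT"]

# capitalize() upper-cases only the first character of the whole string.
capitalized = [n.capitalize() for n in names]

# title() upper-cases the first letter of every run of letters,
# so the part after the hyphen is upper-cased too.
titled = [n.title() for n in names]

print(capitalized)
print(titled)
```

Either method maps differently cased spellings of a name to one key;
they differ only in which spelling they settle on.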
 

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-05-08 18:38

Message:
Logged In: YES 
user_id=357491

Do you think this would also work, John, if instead of having 
normalise_header_case you did name.title()?

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2002-12-08 09:18

Message:
Logged In: YES 
user_id=261020

Here it is.

I swear I did check the box.  I clicked the button twice, though --
I guess SF doesn't like that.


John


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-07 00:43

Message:
Logged In: YES 
user_id=21627

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------
