[issue5468] urlencode does not handle "bytes", and could easily handle alternate encodings

Senthil Kumaran report at bugs.python.org
Fri Jul 2 13:45:27 CEST 2010


Senthil Kumaran <orsenthil at gmail.com> added the comment:

I see no problem in going ahead with the proposed suggestion and the patch.

- I checked RFC 3986, Section 2.5:
http://labs.apache.org/webarch/uri/rfc/rfc3986.html#identifying-data

Relevant line:
When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded.

- quote and quote_plus already do this.
- The point of this bug report is simply that urlencode should provide the same facility for query strings.
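
To make the point concrete, here is a minimal sketch of the behaviour being asked for. It is not the patch attached to this issue; the helper names urlencode_sketch and _encode_component are hypothetical and exist only for illustration. It assumes Python 3's urllib.parse, where quote_plus accepts str (with encoding/errors) or bytes.

```python
from urllib.parse import quote, quote_plus


def _encode_component(value, encoding, errors):
    # Hypothetical helper: percent-encode one key or value.
    if isinstance(value, bytes):
        # bytes are already octets; quote_plus rejects an encoding for them
        return quote_plus(value)
    # Anything else is converted to text, encoded to octets, then quoted
    return quote_plus(str(value), encoding=encoding, errors=errors)


def urlencode_sketch(pairs, encoding="utf-8", errors="strict"):
    # Hypothetical stand-in for urlencode: takes an iterable of
    # (key, value) pairs and joins them into a query string.
    return "&".join(
        _encode_component(k, encoding, errors)
        + "="
        + _encode_component(v, encoding, errors)
        for k, v in pairs
    )


# quote already follows RFC 3986 Section 2.5: UTF-8 first, then percent-encode
print(quote("caf\u00e9"))
# The sketch applies the same rule to query strings, including bytes values
print(urlencode_sketch([("q", "caf\u00e9 au lait"), ("raw", b"\xff\x00")]))
# An alternate encoding can be selected by the caller
print(urlencode_sketch([("q", "caf\u00e9")], encoding="latin-1"))
```

Treating bytes as already-encoded octets and letting the caller choose the encoding for text mirrors what quote and quote_plus do, so query strings would behave consistently with the other quoting functions.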

Jeremy, I shall go ahead with this and make any modifications required.

----------
assignee: jhylton -> orsenthil
nosy: +orsenthil

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5468>
_______________________________________

