[Python-Dev] More on Py3K urllib -- urlencode()
Dan Mahn
dan.mahn at digidescorp.com
Sat Mar 7 15:56:32 CET 2009
After a closer look, I concluded that a bit more work was needed,
though the modifications are still very basic.
Attached is a version of urlencode() which seems to make the most sense
to me.
I wonder how I could officially propose at least some of these
modifications.
- Dan
Bill Janssen wrote:
> Bill Janssen <janssen at parc.com> wrote:
>
>
>> Dan Mahn <dan.mahn at digidescorp.com> wrote:
>>
>>
>>> 3) Regarding the following code fragment in urlencode():
>>>
>>>     k = quote_plus(str(k))
>>>     if isinstance(v, str):
>>>         v = quote_plus(v)
>>>         l.append(k + '=' + v)
>>>     elif isinstance(v, str):
>>>         # is there a reasonable way to convert to ASCII?
>>>         # encode generates a string, but "replace" or "ignore"
>>>         # lose information and "strict" can raise UnicodeError
>>>         v = quote_plus(v.encode("ASCII","replace"))
>>>         l.append(k + '=' + v)
>>>
>>> I don't understand how the "elif" section is invoked, as it uses the
>>> same condition as the "if" section.
>>>
>> This looks like a 2->3 bug; clearly only the second branch should be
>> used in Py3K. And that "replace" is also a bug; it should signal an
>> error on encoding failures. It should probably catch UnicodeError and
>> explain the problem, which is that only Latin-1 values can be passed in
>> the query string. So the encode() to "ASCII" is also a mistake; it
>> should be "ISO-8859-1", and the "replace" should be a "strict", I think.
>>
>
> Sorry! In 3.0.1, this whole thing boils down to
>
> l.append(quote_plus(k) + '=' + quote_plus(v))
>
> Bill
>
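For reference, the two behaviours discussed in the quoted text can be
sketched side by side. This is only an illustration, not the attached
new_urlencode.py and not the stdlib implementation: encode_pair_strict is
a hypothetical helper name, and the choice to re-raise as ValueError is an
assumption. It follows Bill's first suggestion (encode as ISO-8859-1 with
"strict" and explain the failure), while encode_pair_simple mirrors the
3.0.1 one-liner, where quote_plus() on a str percent-encodes its UTF-8
bytes by default in current Python 3 releases.

```python
from urllib.parse import quote_plus


def encode_pair_strict(key, value, encoding="ISO-8859-1"):
    """Hypothetical helper: encode one key/value pair, failing loudly.

    Assumption: values that cannot be represented in `encoding` should
    raise an explanatory error instead of being silently replaced.
    """
    try:
        raw = str(value).encode(encoding, "strict")
    except UnicodeError as exc:
        raise ValueError(
            "query value %r cannot be encoded as %s" % (value, encoding)
        ) from exc
    # quote_plus() accepts bytes and percent-encodes them directly.
    return quote_plus(str(key)) + "=" + quote_plus(raw)


def encode_pair_simple(key, value):
    """Mirror of the 3.0.1 one-liner: quote both sides as str."""
    return quote_plus(key) + "=" + quote_plus(value)


if __name__ == "__main__":
    print(encode_pair_strict("name", "caf\u00e9"))
    print(encode_pair_simple("name", "caf\u00e9"))
```

The two helpers produce different escapes for the same non-ASCII value
because they start from different byte encodings, which is the crux of
the question about which encoding urlencode() should assume.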
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: new_urlencode.py
URL: <http://mail.python.org/pipermail/python-dev/attachments/20090307/5175bcb1/attachment.txt>