[Python-Dev] More on Py3K urllib -- urlencode()
Bill Janssen
janssen at parc.com
Sat Feb 28 23:08:33 CET 2009
Dan Mahn <dan.mahn at digidescorp.com> wrote:
> 3) Regarding the following code fragment in urlencode():
>
>     k = quote_plus(str(k))
>     if isinstance(v, str):
>         v = quote_plus(v)
>         l.append(k + '=' + v)
>     elif isinstance(v, str):
>         # is there a reasonable way to convert to ASCII?
>         # encode generates a string, but "replace" or "ignore"
>         # lose information and "strict" can raise UnicodeError
>         v = quote_plus(v.encode("ASCII","replace"))
>         l.append(k + '=' + v)
>
> I don't understand how the "elif" section is invoked, as it uses the
> same condition as the "if" section.
This looks like a 2->3 conversion bug: in 2.x the two tests distinguished
byte strings from unicode strings, but in Py3K both became
isinstance(v, str), so the elif branch can never run. Clearly only the
second branch's handling should be used in Py3K. The "replace" is also a
bug; encoding failures should signal an error. The code should probably
catch UnicodeError and explain the problem, which is that only Latin-1
values can be passed in the query string. So encoding to "ASCII" is also
a mistake; it should be "ISO-8859-1", and the "replace" should be
"strict", I think.
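
A minimal sketch of that suggestion, not the actual urllib code: the
helper name encode_query_value is hypothetical, and it assumes a current
Python 3 in which quote_plus() accepts bytes. It encodes strictly to
ISO-8859-1 and turns an encoding failure into an explanatory error
instead of silently replacing characters.

```python
from urllib.parse import quote_plus


def encode_query_value(v):
    """Hypothetical helper: quote one str query value as Latin-1 bytes."""
    try:
        # "strict" raises UnicodeEncodeError (a UnicodeError subclass)
        # for any character outside ISO-8859-1, rather than losing data.
        raw = v.encode("ISO-8859-1", "strict")
    except UnicodeError as exc:
        # Explain the problem to the caller, as suggested above.
        raise ValueError(
            "query value %r cannot be encoded as ISO-8859-1; only "
            "Latin-1 values can be passed in the query string" % (v,)
        ) from exc
    # quote_plus() percent-encodes the bytes and maps spaces to '+'.
    return quote_plus(raw)
```

Under these assumptions, a Latin-1 value such as "café" is quoted byte
for byte, while a value containing a character outside Latin-1 (for
example the euro sign) raises ValueError instead of being mangled.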
Bill