[Python-Dev] More on Py3K urllib -- urlencode()

Sat Feb 28 21:28:43 CET 2009

Hi.  I've been using Py3K successfully for a while now, and have some 
questions about urlencode().

1) The docs mention that items sent to urlencode() are quoted using
quote_plus().  However, instances of type "bytes" are not handled the
way quote_plus() handles them, because urlencode() converts its
parameters to strings first, so what gets quoted is the repr of the
bytes object (a leading "b", single quotes, and escaped byte values).
It seems to me that instances of type "bytes" should be passed directly
to quote_plus().  That would complicate the code slightly, but would be
much more intuitive and useful.
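
A minimal sketch of the difference described in (1).  It calls str()
explicitly to mimic the conversion, rather than going through
urlencode() itself, and the sample value is made up for illustration:

```python
from urllib.parse import quote_plus

# Made-up sample value: UTF-8 bytes for "café au lait".
raw = b"caf\xc3\xa9 au lait"

# Bytes passed straight to quote_plus(): each non-safe byte is
# percent-encoded and spaces become "+".
direct = quote_plus(raw)

# Bytes converted with str() first: the repr of the bytes object
# (leading b, quotes, backslash escapes) is what gets quoted.
via_str = quote_plus(str(raw))

print(direct)   # caf%C3%A9+au+lait
print(via_str)
```

The second result starts with "b%27", which is rarely what the caller
wants in a query string.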

2) If urlencode() relies so heavily on quote_plus(), then why doesn't it 
include the extra encoding-related parameters that quote_plus() takes?
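
As a rough sketch of what (2) is asking for, here is a hypothetical
helper (urlencode_with_encoding and _quote are illustrative names, not
part of urllib) that forwards "encoding" and "errors" to quote_plus()
and passes bytes through untouched.  It only handles a sequence of
(key, value) pairs and is not meant as a drop-in replacement:

```python
from urllib.parse import quote_plus


def _quote(value, encoding, errors):
    # Hypothetical helper: bytes go to quote_plus() as-is, because
    # quote_plus() rejects encoding/errors for bytes input.
    if isinstance(value, bytes):
        return quote_plus(value)
    # Everything else is converted to str and encoded as requested.
    return quote_plus(str(value), encoding=encoding, errors=errors)


def urlencode_with_encoding(pairs, encoding="utf-8", errors="strict"):
    # Hypothetical urlencode()-like function for (key, value) pairs.
    parts = []
    for key, value in pairs:
        parts.append(_quote(key, encoding, errors) + "=" +
                     _quote(value, encoding, errors))
    return "&".join(parts)


print(urlencode_with_encoding([("name", "café")], encoding="latin-1"))
```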

3) Regarding the following code fragment in urlencode():

            k = quote_plus(str(k))
            if isinstance(v, str):
                v = quote_plus(v)
                l.append(k + '=' + v)
            elif isinstance(v, str):
                # is there a reasonable way to convert to ASCII?
                # encode generates a string, but "replace" or "ignore"
                # lose information and "strict" can raise UnicodeError
                v = quote_plus(v.encode("ASCII","replace"))
                l.append(k + '=' + v)

I don't understand how the "elif" section is invoked, as it uses the 
same condition as the "if" section.
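
To make the concern concrete, here is a stripped-down sketch with the
same branch structure (classify is a hypothetical name used only for
this illustration):

```python
def classify(v):
    # Same shape as the fragment: both branches test the same condition.
    if isinstance(v, str):
        return "if"
    elif isinstance(v, str):
        # Never reached: any str was already caught by the "if" above.
        return "elif"
    return "neither"


for sample in ("text", b"bytes", 42):
    print(repr(sample), classify(sample))
```

No input makes this return "elif", which is why the second branch in
urlencode() looks like dead code to me.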

Thanks in advance for any thoughts on this issue.  I could submit a 
patch for urlencode() to better explain my ideas if that would be useful.

- Dan