[Python-Dev] urllib.quote and unquote - Unicode issues

Guido van Rossum guido at python.org
Wed Jul 30 22:32:24 CEST 2008


On Wed, Jul 30, 2008 at 12:49 PM, Bill Janssen <janssen at parc.com> wrote:
>> >  unquote() -- takes string, produces bytes or string
>> >
>> >     If optional "encoding" parameter is specified, decodes bytes with
>> >     that encoding and returns string.  Otherwise, returns bytes.
>>
>> The default of returning bytes will break almost all uses. Most code
>> will use the unquoted result as a text string, not as bytes -- e.g. a
>> server has to unquote the values it receives from a form (whether POST
>> or GET), but almost always the unquoted values are text, e.g.
>> someone's name or address, or a draft email message.
>
> I actually do know a lot about the uses of this function...
>
> But:  OK, OK, I yield.  Though I still think this is a bad idea, I'll
> shut up if we can also add "unquote_as_bytes" which returns a byte
> sequence instead of a string.  I'll just change my code to use that.
>
>> (Aside: I dislike functions that have a different return type based on
>> the value of a parameter.)
>
> Fair enough.

I think this is as close to consensus as we can get on this issue. Can
whoever wrote the patch adjust the patch to this outcome? (I think the
only change is to remove the encoding arguments and make separate
functions for bytes.)
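
To illustrate the outcome, here is a minimal sketch of "one function
per return type", assuming the functions that exist in today's
urllib.parse: unquote() always returns str and unquote_to_bytes()
always returns bytes. The name unquote_as_bytes comes from this thread
and appears below only as a hypothetical alias; it is not a standard
library name. Note also that the standard library's unquote() still
accepts an encoding argument, but that argument never changes the
return type.

```python
from urllib.parse import unquote, unquote_to_bytes

# Hypothetical alias using the name proposed in this thread;
# the standard library spells it unquote_to_bytes.
unquote_as_bytes = unquote_to_bytes

quoted = "caf%C3%A9"

# Text-oriented callers (form values, names, addresses) get str.
text = unquote(quoted)

# Callers that need the raw octets ask for bytes explicitly.
raw = unquote_as_bytes(quoted)

# Each function has a single, fixed return type.
assert isinstance(text, str)
assert isinstance(raw, bytes)

# With the default UTF-8 decoding, the two results agree.
assert raw.decode("utf-8") == text
```

Because the return type depends on which function is called rather
than on an argument value, a caller can tell what it will get back
just by reading the call site.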

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

