[Python-Dev] urllib.quote and unquote - Unicode issues
Bill Janssen
janssen at parc.com
Thu Jul 31 09:39:29 CEST 2008
> Guido says:
>
> > Actually, we'd need to look at the various other APIs in Py3k before we can
> > decide whether these should be considered taking or returning bytes or text.
> > It looks like all other APIs in the Py3k version of urllib treat URLs as
> > text.
>
>
> Yes, as I said in the bug tracker, I've groveled over the entire stdlib to
> see how my patch affects the behaviour of dependent code. Aside from a few
> minor bits which assumed octets (and did their own encoding/decoding) (which
> I fixed), all the code assumes strings and is very happy to go on assuming
> this, as long as the URIs are encoded with UTF-8, which they almost
> certainly are.
I'm not sure that's sufficient review, though I agree it's necessary.
The major consumers of quote/unquote are not in the Python standard
library.
> (quote will accept either type, while
> unquote will output a str, there will be a new function unquote_to_bytes
> which outputs a bytes - is everyone happy with that?)
No, so don't ask.
Bill
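
For reference, a minimal sketch of the behaviour proposed in the quoted text (quote accepting either type, unquote returning str, unquote_to_bytes returning bytes). It assumes the functions as they are exposed in Python 3's urllib.parse, which may differ in detail from the patch under discussion here. The sample string and variable names are illustrative only.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

text = "caf\u00e9"  # illustrative non-ASCII input ("café")

# quote accepts either str (encoded as UTF-8 by default) or bytes.
quoted_from_str = quote(text)
quoted_from_bytes = quote(text.encode("utf-8"))

# unquote returns str, decoding the percent-escaped octets as UTF-8 by default.
decoded_text = unquote(quoted_from_str)

# unquote_to_bytes returns the raw octets and makes no encoding assumption.
decoded_bytes = unquote_to_bytes(quoted_from_str)

# A URI quoted with a different encoding does not round-trip through the
# UTF-8 default of unquote; the raw octets are still recoverable as bytes.
latin1_quoted = quote(text, encoding="latin-1")
latin1_as_text = unquote(latin1_quoted)
latin1_as_bytes = unquote_to_bytes(latin1_quoted)
```

The last three lines show the case the "as long as the URIs are encoded with UTF-8" caveat is about: a caller holding a non-UTF-8 URI has to either pass the matching encoding to unquote or work with the bytes result.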