[Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
Baptiste Carvello
baptiste13z at free.fr
Thu Sep 23 10:56:22 CEST 2010
Stephen J. Turnbull wrote:
> What really saves the day here is not that "common encodings just
> don't do that". It's that even in the case where only syntactically
> significant bytes in the representation are URL-encoded, they *are*
> URL-encoded. As long as the parsing library restricts itself to
> treating only wire-format input, you're OK.[1] But once you start
> doing things that involve decoding URL-encoding, you can run into
> trouble.
>
If I understand you correctly, any processing of unquoted bytes is dangerous per
se. If this is true, then perhaps 'unquote' doesn't deserve the criticism it
received in this thread for always returning str. This would in fact be quite
fortunate, as it forces URL processing to happen either on quoted bytes (before
calling 'unquote') or on unquoted str (on the result of 'unquote'), both of
which are safe.
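
A minimal sketch of the hazard discussed above, assuming Python 3's
urllib.parse; the query string and variable names are made up for illustration
and do not come from the thread. It contrasts splitting on the quoted wire
format with splitting after the URL-encoding has been decoded:

```python
from urllib.parse import quote, unquote

# Hypothetical query string: the first value contains a literal '&',
# which is URL-encoded as %26 on the wire.
raw = "name=" + quote("Tom & Jerry", safe="") + "&lang=en"

# Structural processing on the quoted form, then unquote each piece:
# the only '&' present is the real separator.
split_then_unquote = [unquote(part) for part in raw.split("&")]

# Unquote first, then split: the decoded '&' is now indistinguishable
# from the separator, so the value is cut in two.
unquote_then_split = unquote(raw).split("&")

print(split_then_unquote)   # ['name=Tom & Jerry', 'lang=en']
print(unquote_then_split)   # ['name=Tom ', ' Jerry', 'lang=en']
```

The second result shows why the order matters: once the syntactically
significant characters are decoded, the structure of the URL can no longer be
recovered from the data alone.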