[Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)

Baptiste Carvello baptiste13z at free.fr
Thu Sep 23 10:56:22 CEST 2010


Stephen J. Turnbull wrote:
> What really saves the day here is not that "common encodings just
> don't do that".  It's that even in the case where only syntactically
> significant bytes in the representation are URL-encoded, they *are*
> URL-encoded.  As long as the parsing library restricts itself to
> treating only wire-format input, you're OK.[1]  But once you start
> doing things that involve decoding URL-encoding, you can run into
> trouble.
> 
If I understand you correctly, any processing of unquoted bytes is dangerous per
se. If this is true, then perhaps 'unquote' doesn't deserve the criticism it
received in this thread for always returning str. This would in fact be quite
fortunate, as it forces URL processing to happen either on quoted bytes (before
calling 'unquote') or on unquoted str (on the result of 'unquote'), both of
which are safe.
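
To illustrate the kind of trouble Stephen describes, here is a minimal sketch
using urllib.parse.unquote from Python 3. The path below is made up for the
example; it contains a URL-encoded slash (%2F) that is data, not a delimiter.
Splitting the quoted form keeps that distinction, while splitting after
unquoting loses it.

```python
from urllib.parse import unquote

# Hypothetical path: the second segment is the single name "a/b",
# so its slash travels URL-encoded as %2F.
quoted_path = "/files/a%2Fb/report.txt"

# Parse the wire format first, then unquote each segment separately.
safe_segments = [unquote(seg) for seg in quoted_path.split("/")]

# Unquote first, then parse: the decoded slash is now indistinguishable
# from a real path delimiter, so the segment is split in two.
risky_segments = unquote(quoted_path).split("/")

print(safe_segments)   # ['', 'files', 'a/b', 'report.txt']
print(risky_segments)  # ['', 'files', 'a', 'b', 'report.txt']
```

The first variant does its structural parsing before calling 'unquote', which
is the ordering I argue for above; the second shows what goes wrong when the
decoding happens before the parsing.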


