[Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)

Tue Sep 21 23:35:40 CEST 2010

On Wed, Sep 22, 2010 at 1:57 AM, Ian Bicking <ianb at colorstudy.com> wrote:
> All this is unrelated to the question, though -- a separate byte-oriented
> function won't help any case I can think of.  If the programmer is
> implementing something like
> urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())), it's because
> they *want* to get bytes out.  So if it's named urlparse.urlsplit_bytes()
> they'll just use that, with the same corruption.  Since bytes and text don't
> interact well, the choice of bytes in and bytes out will be a deliberate
> one.  *Or*, bytes will unintentionally come through, but that will just
> delay the error a while when the bytes out don't work (e.g.,
> urlparse.urljoin(text_url, urlparse.urlsplit(byte_url).path)).  Delaying the
> error is a little annoying, but a delayed error doesn't lead to mojibake.

Indeed, this line of thinking is what brought me back around to the
polymorphic point of view.
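
To make the delayed-error case concrete, here is a rough sketch of how a
polymorphic API behaves. It assumes the bytes support that urllib.parse
gained in Python 3.2 (not something that existed when this was being
discussed), and the example URLs are made up for illustration:

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical inputs: one text URL, one URL that arrived as raw bytes.
text_url = "http://example.com/base/"
byte_url = b"http://example.com/some/path?q=1"

# Bytes in, bytes out: every component of the result is bytes.
parts = urlsplit(byte_url)
assert isinstance(parts.path, bytes)

# Mixing the bytes result with a text URL fails loudly instead of
# silently producing mojibake.
try:
    urljoin(text_url, parts.path)
except TypeError as exc:
    mixed_error = exc
else:
    mixed_error = None

# The caller has to decode explicitly, choosing the encoding on purpose.
joined = urljoin(text_url, parts.path.decode("ascii"))
```

The error only shows up at the urljoin() call rather than where the bytes
first crept in, which is the "little annoying" delay Ian mentions, but
nothing gets corrupted along the way.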

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia