[issue9873] Allow bytes in some APIs that use string literals internally

Nick Coghlan report at bugs.python.org
Thu Sep 30 00:23:41 CEST 2010


Nick Coghlan <ncoghlan at gmail.com> added the comment:

> I think it's quite misguided. latin1 encoding and decoding is blindingly
> fast (on the order of 1GB/s. here). Unless you have multi-megabyte URLs,
> you won't notice any overhead.

Ah, I didn't know that (although it makes sense now I think about it).
I'll start exploring ideas along those lines then. Having to name all
the literals as I do in the patch is really quite ugly.

A general sketch of such a strategy would be to stick the following
near the start of affected functions:

encode_result = not isinstance(url, str) # or whatever the main
parameter is called
if encode_result:
    url = url.decode('latin-1')
    # decode any other arguments that need it
    # Select the bytes versions of any relevant globals
else:
    # Select the str versions of any relevant globals

Then, at the end, do an encoding step. However, the encoding step may
get a little messy when it comes to the structured data types. For
that, I'll probably take a leaf out of the email6 book and create a
parallel bytes API, with appropriate encode/decode methods to
transform one into the other.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9873>
_______________________________________


More information about the Python-bugs-list mailing list