[issue3982] support .format for bytes
Terry J. Reedy
report at bugs.python.org
Wed Jan 23 00:34:32 CET 2013
Terry J. Reedy added the comment:
>it would probably be reasonable to make these protocols use str objects at the heart, and only convert to bytes after the formatting is done.
I presume this would mean adding 'if py3: out = out.encode()' after the formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some actual numbers:
from timeit import timeit
for size in (0, 100, 1000, 10000, 100000):
    a = 'a' * size  # ASCII-only string of the given length
    print(timeit("a.encode()", "from __main__ import a"))
>>>
0.19305401378265558
0.22193721412302575
0.2783227054755883
0.677596406192696
7.124387897799184
Given timeit's default of number = 1000000 runs, each total in seconds is also the time in microseconds per encoding. Of note: the copying of bytes does not double the total time until the string reaches a few thousand chars. Would protocols be using .format for much more than this?
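For concreteness, here is a rough sketch of the "format as str, encode afterwards" approach applied to a short protocol line. The helper name format_header, the PY3 flag, and the choice of the ascii codec are assumptions made for illustration and do not come from any particular library.

```python
import sys

# Stands in for the 'py3' flag mentioned earlier; an assumption for this sketch.
PY3 = sys.version_info[0] >= 3


def format_header(name, value):
    """Format a header line as text, then convert it to bytes on Python 3.

    Hypothetical helper for illustration only.
    """
    out = "{}: {}\r\n".format(name, value)
    if PY3:
        # ASCII is assumed here; a real protocol would choose its own codec.
        out = out.encode("ascii")
    return out


line = format_header("Content-Length", 42)
print(line)
```

A line like this is a few dozen characters, far below the few-thousand-character range where the copy starts to dominate.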
[If speed is really an issue, we could make binary file/socket write methods unicode implementation aware. They could directly access the ascii (or latin-1) bytes in a unicode object, just as they do with a bytes object, and the extra copy could be skipped.]
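At the Python level, the closest equivalent today is a small wrapper that encodes before writing. The sketch below uses a hypothetical write_text helper (not an existing method) and assumes latin-1 as the codec; the encode() call is exactly the extra copy that an implementation-aware write method could skip.

```python
import io


def write_text(stream, text, encoding="latin-1"):
    """Encode text and write it to a binary stream.

    Hypothetical helper: encode() creates a temporary bytes object,
    which is the extra copy discussed above.
    """
    return stream.write(text.encode(encoding))


buf = io.BytesIO()  # stands in for a binary file or socket file object
written = write_text(buf, "HELO example.org\r\n")
print(written, buf.getvalue())
```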
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3982>
_______________________________________