[issue12805] Optimizations for bytes.join() et. al

Sun Apr 8 19:59:16 CEST 2012

Serhiy Storchaka <storchaka at gmail.com> added the comment:

> Honestly, I don't think there's much point in these optimizations. The first one ("The sequence length and separator length are both 0") is probably useless.

This optimization changes the behavior.

...     def __repr__(self): return 'B(' + super().__repr__() + ')'
... 
>>> B()
B(b'')
>>> B().join([])
b''

With the patch we get B(b'').

Regardless of whether optimization (which is negligible in this case), I don't know whether to consider such side effect desirable or undesirable.

bytes.join need to optimize for the 0- and 1-byte delimiter, but the proposed patch leads to pessimization:

Unpatched:
$ ./python -m timeit -s 'seq=[bytes([i]*100000) for i in range(256)]' 'b"".join(seq)'
10 loops, best of 3: 43.3 msec per loop

Patched:
$ ./python -m timeit -s 'seq=[bytes([i]*100000) for i in range(256)]' 'b" ".join(seq)'
10 loops, best of 3: 70.3 msec per loop

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12805>
_______________________________________