[Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
pmiscml at gmail.com
Wed Jun 8 07:26:45 EDT 2016
On Wed, 8 Jun 2016 14:05:19 +0300
Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 08.06.16 13:37, Paul Sokolovsky wrote:
> >> The obvious way to create the bytes object of length n is b'\0' *
> >> n.
> > That's very inefficient: it requires allocating useless b'\0', then
> > a generic function to repeat arbitrary memory block N times. If
> > there's a talk of Python to not be laughed at for being SLOW, there
> > would rather be efficient ways to deal with blocks of binary data.
> Do you have any evidences for this claim?
Yes, it's written above; let me repeat it: bytes(n) is (or can be) a single
calloc(1, n) underneath, while b"\0" * n is a more complex algorithm.
> $ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
> 1000000 loops, best of 3: 1.32 usec per loop
> $ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
> 1000000 loops, best of 3: 0.858 usec per loop
I don't know how inefficient CPython's bytes(n) is, or how efficient its
repetition is (maybe 1-byte repetitions are optimized into memset()?), but
MicroPython (where bytes(n) is truly a calloc(1, n)) gives the expected results:
$ ./run-bench-tests bench/bytealloc*
3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
11.244s (+237.35%) bench/bytealloc-2-repeat.py
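
For anyone who wants to repeat the comparison on their own interpreter, here is a minimal sketch using the standard timeit module. The helper name compare_zero_fill and the default loop counts are illustrative assumptions, not part of either benchmark quoted above, and the numbers will differ between implementations and builds.

```python
import timeit


def compare_zero_fill(n=10000, number=100000, repeat=3):
    """Return the best per-call time in seconds for each spelling.

    Hypothetical helper for illustration; it only wraps timeit.repeat().
    """
    candidates = {
        "bytes_n": "bytes(n)",    # constructor form
        "repeat": 'b"\\0" * n',   # repetition form, i.e. b"\0" * n
    }
    results = {}
    for label, stmt in candidates.items():
        timings = timeit.repeat(
            stmt, globals={"n": n}, number=number, repeat=repeat
        )
        # Take the fastest run, as the timeit command line does.
        results[label] = min(timings) / number
    return results


if __name__ == "__main__":
    n = 10000
    # Both spellings must produce the same zero-filled object.
    assert bytes(n) == b"\0" * n
    for label, seconds in compare_zero_fill(n).items():
        print(f"{label:8s} {seconds * 1e6:.3f} usec per call")
```

This only measures the two expressions as the interpreter executes them; it says nothing about which C-level allocation path is taken underneath.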
Paul mailto:pmiscml at gmail.com