On Fri, Feb 21, 2020 at 02:42:16PM +0200, Serhiy Storchaka wrote:
21.02.20 10:36, Steven D'Aprano wrote:
On my machine, at least, constructing a bytes object first followed by an array is significantly faster than the alternative:
[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', bytes(500000))"
100 loops, best of 5: 1.71 msec per loop
[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', [0])*500000"
50 loops, best of 5: 7.48 msec per loop
That surprises me and I cannot explain it.
The second one allocates and copies 4 times more memory.
I completely misunderstood what the first would do. I expected it to create an array of 500,000 zeroes, but it only created an array of 125,000. That's nuts! The docstring says:
Return a new array whose items are restricted by typecode, and initialized from the optional initializer value, which must be a list, string or iterable over elements of the appropriate type.
A bytes object is an iterable over integers, so I expected these two to be equivalent:
array('i', bytes(500000))
array('i', [0]*500000)
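A quick check of the difference, as a minimal sketch. It relies on the documented behaviour that a bytes initializer is handed to frombytes() and read as raw machine values, while a list contributes one item per element. The item size of 'i' is platform-dependent (4 bytes on most common builds), so the sketch queries it rather than assuming it:

```python
from array import array

n = 500_000
itemsize = array('i').itemsize  # platform-dependent; 4 on most common builds

# bytes are interpreted as raw machine values, itemsize bytes per item
from_bytes = array('i', bytes(n))

# each list element becomes exactly one array item
from_list = array('i', [0] * n)

print(len(from_bytes))  # n // itemsize items
print(len(from_list))   # n items
```

With a 4-byte int, the first array has a quarter as many items as the second, which matches the 125,000 versus 500,000 reported above.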
I never would have predicted that a bytes iterable and a list iterable behave differently. Oh, you have to read the documentation on the website:
Okay, let's try that again:
[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', bytes(500000*4))"
20 loops, best of 5: 12.6 msec per loop
compared to 7.65 milliseconds for the version using multiplication. That makes more sense to me.
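For anyone who wants to repeat the like-for-like comparison from a script rather than the command line, here is a rough sketch using the timeit module. The function names via_bytes and via_multiply are made up for illustration, and the loop counts are arbitrary. Absolute numbers will differ from machine to machine, so no expected timings are shown:

```python
import timeit
from array import array

n = 500_000
itemsize = array('i').itemsize  # usually 4, but platform-dependent

def via_bytes():
    # allocate n * itemsize zero bytes, then copy them into the array
    return array('i', bytes(n * itemsize))

def via_multiply():
    # build a one-item array and repeat it n times
    return array('i', [0]) * n

# both approaches should produce the same n-item array of zeroes
assert via_bytes() == via_multiply()

for fn in (via_bytes, via_multiply):
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.2f} msec per loop")
```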
Okay, I'm starting to come around to giving array an alternate constructor:
array.zeroes(typecode, size [, *, value=None])
If keyword-only argument value is given and is not None, it is used as the initial value instead of zero.
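To make the proposal concrete, here is a minimal sketch of how such a constructor might behave, written as a free function. This is purely hypothetical: array.zeroes does not exist in the standard library, and the sketch simply builds on the multiplication idiom timed earlier in the thread.

```python
from array import array

def zeroes(typecode, size, *, value=None):
    """Hypothetical stand-in for the proposed array.zeroes constructor."""
    # use zero unless the caller supplies an explicit fill value
    fill = 0 if value is None else value
    # a one-item array repeated `size` times
    return array(typecode, [fill]) * size

print(zeroes('i', 5))             # array('i', [0, 0, 0, 0, 0])
print(zeroes('d', 3, value=1.5))  # array('d', [1.5, 1.5, 1.5])
```

A default fill of 0 only makes sense for the numeric typecodes; a real implementation would have to decide what to do for the Unicode character typecode.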