[Python-ideas] Re: Specify number of items to allocate for array.array() constructor

21 Feb 2020

      On Fri, Feb 21, 2020 at 02:42:16PM +0200, Serhiy Storchaka wrote:
...
21.02.20 10:36, Steven D'Aprano wrote:
...
On my machine, at least, constructing a bytes object first followed by
an array is significantly faster than the alternative:
[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', bytes(500000))"
100 loops, best of 5: 1.71 msec per loop
[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', [0])*500000"
50 loops, best of 5: 7.48 msec per loop
That surprises me and I cannot explain it.
The second one allocates and copies 4 times more memory.
I completely misunderstood what the first would do. I expected it to 
create an array of 500,000 zeroes, but it only created an array of 
125,000. That's nuts! The docstring says:

    Return a new array whose items are restricted by typecode, and
    initialized from the optional initializer value, which must be a
    list, string or iterable over elements of the appropriate type.

A bytes object is an iterable over integers, so I expected these two to 
be equivalent:

    array('i', bytes(500000))
    array('i', [0]*500000)

I never would have predicted that a bytes iterable and a list iterable 
behave differently. Oh, you have to read the documentation on the 
website:

https://docs.python.org/3/library/array.html#array.array
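
For the record, here is a small check of the difference, written as a sketch rather than anything authoritative. A bytes initializer is treated as raw machine values (the way frombytes() treats its argument), so the item count is the byte length divided by the item size. The size of 'i' is platform-dependent, so the sketch reads itemsize instead of assuming 4; the variable names are just for illustration.

```python
from array import array

n = 500000

from_bytes = array('i', bytes(n))   # n zero *bytes*, reinterpreted as ints
from_list = array('i', [0] * n)     # n zero *items*

itemsize = from_bytes.itemsize      # 4 on most platforms, but not guaranteed

# The bytes version holds fewer items: one per itemsize bytes.
assert len(from_bytes) == n // itemsize
assert len(from_list) == n

# To get n items from bytes, the buffer must be n * itemsize bytes long.
same_size = array('i', bytes(n * itemsize))
assert same_size == from_list
```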

Okay, let's try that again:

[steve@ando cpython]$ ./python -m timeit -s "from array import array" "array('i', bytes(500000*4))"
20 loops, best of 5: 12.6 msec per loop

compared to 7.65 milliseconds for the version using multiplication. That 
makes more sense to me.

Okay, I'm starting to come around to giving array an alternate 
constructor:

    array.zeroes(typecode, size [, *, value=None])

If keyword-only argument value is given and is not None, it is used as 
the initial value instead of zero.
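
To make the proposed semantics concrete, here is a rough pure-Python sketch. The function zeroes() below is hypothetical: it is a module-level helper, not an existing array method. It uses the multiplication approach because that was the faster of the two same-size variants timed above, and it fills with integer 0 by default, so it assumes a numeric typecode.

```python
from array import array

def zeroes(typecode, size, *, value=None):
    """Hypothetical helper sketching the proposed array.zeroes().

    Returns an array of `size` items, each equal to `value`,
    or to zero when `value` is None.
    """
    fill = 0 if value is None else value
    # Build a one-item array and repeat it; this avoids creating
    # an intermediate list of `size` Python objects.
    return array(typecode, [fill]) * size

# Usage:
a = zeroes('i', 5)               # five zero ints
b = zeroes('d', 3, value=1.5)    # three floats, each 1.5
```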

-- 
Steven