[Numpy-discussion] How long does it take to create an array?

Keith Goodman kwgoodman at gmail.com
Fri Feb 5 13:26:37 EST 2010


Why is the second method of converting a list of tuples to an array so
much faster?

>> x = range(500)
>> x = [(z,) for z in x] # <-- e.g. output of a sql database
>> x[:5]
   [(0,), (1,), (2,), (3,), (4,)]
>>
>> timeit np.array(x).reshape(-1)  # <-- slow
1000 loops, best of 3: 832 us per loop
>> timeit np.array([z[0] for z in x])
10000 loops, best of 3: 106 us per loop  # <-- fast
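
Before comparing speed, it is worth confirming that both conversions produce the same 1d array. A minimal sketch of that check, assuming the same list of 1-tuples as above (the names a and b are only illustrative):

```python
import numpy as np

# Same input as above: a list of 1-tuples, e.g. rows from a SQL query.
x = [(z,) for z in range(500)]

# Method 1: build a (500, 1) array, then flatten it.
a = np.array(x).reshape(-1)

# Method 2: unpack each tuple in Python, then build a 1d array directly.
b = np.array([z[0] for z in x])

# Both should be 1d arrays holding the same values.
same = a.shape == b.shape and np.array_equal(a, b)
```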

Is it a fixed overhead advantage? Doesn't seem so:

>> x = range(50000)
>> x = [[z] for z in x]
>> timeit np.array(x).reshape(-1)
10 loops, best of 3: 83 ms per loop
>> timeit np.array([z[0] for z in x])
100 loops, best of 3: 9.81 ms per loop

So it is probably faster to make a 1d array and reshape it than to build the 2d array directly:

>> timeit np.array([[1,2], [3,4], [5,6]])
100000 loops, best of 3: 11.8 us per loop
>> timeit np.array([1,2,3,4,5,6]).reshape(-1,2)
100000 loops, best of 3: 6.62 us per loop

Yep.
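
For anyone who wants to rerun these comparisons outside IPython, here is a rough sketch using the standard library timeit module. The helper best_time and its number/repeat defaults are hypothetical choices for illustration, not part of NumPy, and the absolute timings will depend on the machine and NumPy version:

```python
import timeit

import numpy as np


def best_time(func, number=100, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs.

    Hypothetical helper: mimics the "best of 3" figure that IPython's
    timeit reports, using the standard library timeit.repeat.
    """
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


# List of 1-tuples, as in the first example.
x = [(z,) for z in range(500)]

# Nested input converted directly, then flattened.
t_nested = best_time(lambda: np.array(x).reshape(-1))

# Flat Python list built first, then converted.
t_flat = best_time(lambda: np.array([z[0] for z in x]))

# Small 2d case: nested lists versus a flat list plus reshape.
t_2d = best_time(lambda: np.array([[1, 2], [3, 4], [5, 6]]))
t_1d_reshape = best_time(lambda: np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 2))

print("nested -> reshape(-1): %.1f us" % (t_nested * 1e6))
print("flat list            : %.1f us" % (t_flat * 1e6))
print("2d literal           : %.1f us" % (t_2d * 1e6))
print("1d + reshape(-1, 2)  : %.1f us" % (t_1d_reshape * 1e6))
```

The script only reports the measurements; whether the flat-list route still wins is something to check on your own setup.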


