[Numpy-discussion] pre-PEP for making creative forking of NumPy less destructive
nouiz at nouiz.org
Tue May 22 16:07:50 EDT 2012
In the example with NumPy arrays for small arrays, the speed problem is
probably because NumPy has not been optimized for low overhead.
For example, each C function should first check whether the input is a
NumPy array and, only if it is not, jump to a function that makes one.
Currently, in the C function (PyArray_Multiply?) that gets called by
dot(), a C function call is made to check whether the input is a NumPy
array. This is extra overhead for the most frequent expected case,
where the input is already a NumPy array. I'm pretty sure
this happens in many places. In this particular function, there are many
other function calls before calling BLAS, even for the simple cases of
vector x vector, vector x matrix or matrix x matrix dot products.
But this is probably for another thread if people want to discuss it
more. Also, I didn't verify how often the overhead could be lowered
where it isn't needed, so it could be just a few functions that need
this type of optimization.
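
To make the fast-path idea above concrete, here is a minimal sketch in
Python rather than C. Everything in it is hypothetical: SmallArray,
as_small_array() and this dot() are stand-ins for illustration, not
NumPy's actual types or functions. The point is only the shape of the
check: test the exact type inline first, and pay for the conversion call
only when the input is not already an array.

```python
# Hypothetical illustration only: none of these names exist in NumPy.

CONVERSIONS = {"count": 0}  # counts how often the slow path is taken


class SmallArray:
    """Hypothetical stand-in for an array type (not the NumPy ndarray)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]


def as_small_array(obj):
    """Slow path: build a SmallArray from any iterable of numbers."""
    CONVERSIONS["count"] += 1
    return SmallArray(obj)


def dot(a, b):
    """Dot product of two 1-d inputs with an inline type check first."""
    # Fast path: a cheap exact-type test done inline, so inputs that are
    # already arrays never pay for a call to the conversion function.
    if type(a) is not SmallArray:
        a = as_small_array(a)
    if type(b) is not SmallArray:
        b = as_small_array(b)
    if len(a.values) != len(b.values):
        raise ValueError("inputs must have the same length")
    return sum(x * y for x, y in zip(a.values, b.values))
```

With two SmallArray inputs the conversion function is never called; with
a list or tuple it is called once per non-array argument. In C the same
pattern would be an inline check before any call into the conversion
code, but I didn't measure how much that would save.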
For the comparison with the multiple types of arrays on the GPU, I think
the first reason is that people worked in isolation and that they only
implemented the subset of the numpy ndarray they needed. As different
projects/groups need different parts, reusing other people's work was not
Otherwise, I see the problem, but I don't know what to say about it as
I didn't experience it.