Mailman 3 Confusd about the implementation of @TYPE@_pairwise_sum - NumPy-Discussion

30 Sep 2022

      Hello there, I'm here again.

With the help of the previous emails, I finally found out the real C
implementation of np.sum(x), where x is a Numpy array.

I found that there is a "template" function named @TYPE@_pairwise_sum in
numpy/core/src/umath/loops_utils.h.src which is the underlying C code of
np.sum(). I can understand the code but there is still something that I
didn't figure out.

Here is the code link:
https://github.com/numpy/numpy/blob/main/numpy/core/src/umath/loops_utils.h....

The control logic is quite simple, if the size of the array which is going
to be summed up is smaller than 8, then we directly add them up; If the
size is greater than 8 but smaller than PW_BLOCKSIZE, then we sum a block
with 8 accumulators which will utilize the speed-up of AVX. Then if the
size is greater than PW_BLOCKSIZE, we divide the array into two halves and
recursively call function @TYPE@_pairwise_sum to calculate the two halves
respectively.

My confusion lies here: why should we divide the array into two halves and
recursively calculate this? Since we won't assign two threads to compute
the result in parallel, we can't get any profits from "divide and conquer".
Actually we waste some time and memory in calling functions and operations
in heap.

There must be something that I didn't take into consideration. It will be
appreciated if you can point it out! Thanks in advance~

Confusd about the implementation of @TYPE@_pairwise_sum

腾刘

tags

participants (1)