[Numpy-discussion] Problem migrating PDL's index() into NumPy

Wed Mar 17 07:12:27 EDT 2010

Hello,

being quite new to NumPy and having previously used PDL in Perl, I am 
currently migrating one of my PDL projects to NumPy.

Most of the functions can be migrated without problems, and there are 
functions in NumPy that allow me to do things in a much clearer way than 
in PDL. However, I have a problem with the following operation:

There are two 2D arrays with dimensions: A[10000,1000] and B[10000,100]. 
The first dimension of both arrays corresponds to a list of 10000 objects.

For each of the 10000 objects, the array A contains 1000 integer values 
between 0 and 99. Each of these values is an index into the same 
object's row of the array B, which holds 100 values per object.

I need a new array C[10000,1000] filled with values from B in the 
following way:

for x in range(10000):
    for y in range(1000):
        C[x,y] = B[x,A[x,y]]

In Perl's PDL, this can be done with $C = $B->index($A)
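
To make the intended result concrete, here is a small self-contained 
sketch of the same loop on toy-sized arrays (3 objects, 5 values per 
object in B, and 4 lookups per object in A, instead of 10000, 100 and 
1000). The function name lookup_rows_loop and the sample values are made 
up for illustration only; this is just the slow reference version, not a 
proposed solution.

```python
import numpy as np

def lookup_rows_loop(A, B):
    """Slow reference version: C[x, y] = B[x, A[x, y]]."""
    n_objects, n_lookups = A.shape
    # C has the same shape as A and the same dtype as B
    C = np.empty((n_objects, n_lookups), dtype=B.dtype)
    for x in range(n_objects):
        for y in range(n_lookups):
            C[x, y] = B[x, A[x, y]]
    return C

# Toy data (hypothetical values): 3 objects with 5 values each in B
B = np.arange(15).reshape(3, 5) * 10
# 4 lookups per object; every entry is a valid column index into B
A = np.array([[0, 4, 2, 2],
              [1, 1, 3, 0],
              [4, 0, 0, 3]])

C = lookup_rows_loop(A, B)
print(C.shape)
print(C)
```

The result has the same shape as A, which is what I would like to obtain 
for the full-size arrays, but without the two Python loops.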

If I do C = B[A] in NumPy, then I do not get a [10000,1000] 2D array, 
but rather a [10000,1000,1000] 3D array, in which I can find the correct 
values at the following positions:

for x in range(10000):
    for y in range(1000):
        C[x,y,y]

which may seem nice, but it needs 1000 times more memory and very 
probably 1000 times more time to calculate... Impossible with such large 
arrays... :-(

Could anyone help me, please?

Regards,
Miroslav Sedivy