[Numpy-discussion] Status of numeric3 / scipylite / scipy_core
Travis Oliphant
oliphant at ee.byu.edu
Wed Mar 16 21:56:16 EST 2005
I wanted to let people who may be waiting know that now is a good time
to help with numeric3. The CVS version builds (although I'm sure there
are still bugs), and more eyes could help me track them down.
Currently, all that remains for the arrayobject is to implement the
newly defined methods (really it's just a re-organization and
re-inspection of the code in multiarraymodule.c to call it using methods).
I also need to check the multidimensional slicing syntax when mixed with
ellipses and slice objects, so that take can be (functionally) replaced
with multidimensional slicing.
Any input on this would be appreciated.
I'm referring to the fact that I think that a[...,ind,:] should be
equivalent to take(a,ind,axis=-2). But, this necessitates some
re-thinking about what partial indexing returns. What should
a[:,ind1,:,ind2,:] return if a is a five-dimensional array?
Currently, the proposed PEP for partial indexing always gives the result
the broadcast shape of ind1 and ind2 followed by the shape of the
un-indexed subspace. In other words, the un-indexed subspace shape is
always appended to the end of the result shape. I think this is wrong,
at least for the case of one indexing array, because it does not let
a[...,ind,:] be a replacement for take(a,ind,axis=-2).
Is it wrong for more than 1 indexing array?
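To make the single-index-array case concrete, here is a minimal sketch
of the equivalence in question. It assumes a present-day NumPy install
rather than the numeric3 CVS code, and the small array shape and index
values are made up for illustration only.

```python
import numpy as np

# Small stand-in array; shape and index values are illustrative assumptions.
a = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Index array of shape (3, 2) whose values are valid for axis -2 (length 4).
ind = np.array([[0, 3], [1, 1], [2, 0]])

via_take = np.take(a, ind, axis=-2)
via_index = a[..., ind, :]

# The second-to-last axis (length 4) is replaced in place by ind.shape.
print(via_take.shape)                       # (2, 3, 3, 2, 5)
print(np.array_equal(via_take, via_index))  # True
```

In current NumPy the two spellings agree, which is the behavior argued
for here in the one-index-array case.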
To clarify the situation: Suppose X has shape (10,20,30,40,50) and
suppose ind1 and ind2 are both broadcastable to the shape (2,3,4).
Note for reference that take(X,ind1,axis=-2).shape returns
(10,20,30,2,3,4,50)
Now, according to the current proposal:
X[..., ind1, :] will return a (2,3,4,10,20,30,50) array --- I think this
should be changed to return the same as take....
X[ind1, ind1, ind1, ind1, ind1] will return a (2,3,4) array (all
dimensions are indexed) --- O.K.
X[ind1, ind1, ind1, ind1] will return a (2,3,4,50) array
X[ind1, ind1, :, ind1, ind1] will return a (2,3,4,30) array
X[...,ind1,ind1,ind1] returns a (2,3,4,10,20) array --- is this right?
X[:,ind1,:,ind2,:] returns a (2,3,4,10,30,50) array, where
result[i,j,k,:,:,:] = X[:,ind1[i,j,k],:,ind2[i,j,k],:]
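For comparison, the same expressions can be tried in a present-day
NumPy. This is only a sketch: it shows what current NumPy returns, which
is not necessarily what the proposal under discussion specified. The
random data, the index values, and the (3,4) shape chosen for ind2 are
assumptions. Two cases come out differently from the list above: a
single index array is placed where the indexed axis was (matching take),
and adjacent index arrays are also placed in position rather than first.

```python
import numpy as np

rng = np.random.default_rng(0)

# uint8 keeps the five-dimensional array at about 12 MB.
X = rng.integers(0, 256, size=(10, 20, 30, 40, 50), dtype=np.uint8)

# Index values below 10 are valid for every axis of X.
ind1 = rng.integers(0, 10, size=(2, 3, 4))
ind2 = rng.integers(0, 10, size=(3, 4))      # broadcastable to (2, 3, 4)

print(X[..., ind1, :].shape)                 # (10, 20, 30, 2, 3, 4, 50)
print(X[ind1, ind1, ind1, ind1, ind1].shape) # (2, 3, 4)
print(X[ind1, ind1, ind1, ind1].shape)       # (2, 3, 4, 50)
print(X[ind1, ind1, :, ind1, ind1].shape)    # (2, 3, 4, 30)
print(X[..., ind1, ind1, ind1].shape)        # (10, 20, 2, 3, 4)

result = X[:, ind1, :, ind2, :]
print(result.shape)                          # (2, 3, 4, 10, 30, 50)

# Check the element-wise relation for the two-index-array case.
ind2_full = np.broadcast_to(ind2, ind1.shape)
relation_holds = all(
    np.array_equal(result[i, j, k],
                   X[:, ind1[i, j, k], :, ind2_full[i, j, k], :])
    for i, j, k in np.ndindex(ind1.shape)
)
print(relation_holds)                        # True
```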
So, here's the issue (if you are not familiar with the concept of
subspace you can replace the word subspace with "shape tuple" in the
following):
- indexing with multidimensional index arrays under the
numarray-introduced scheme (which seems reasonable to me) creates a
single "global" subspace for all of the index arrays provided (i.e.
there is no implied outer-product).
- When there is a single index array it is unambiguous to replace the
single-axis subspace with the index array subspace: i.e. X[...,ind1,:]
can replace the second-to-last axis shape with the ind1.shape to get a
(10,20,30,2,3,4,50) array.
- Where there is more than one index array, what should replace the
single-axis subspaces that the indexes are referencing? Remember, all
of the single-axis subspaces are being replaced with one "global"
subspace.
The current proposal states that this indexing subspace should be placed
first and the "remaining subspaces" pasted in at the end.
Is this acceptable, or can someone see a problem??
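The placement rule just described (index subspace first, remaining
subspaces appended) can be written down as a small shape calculation.
This is a sketch under stated assumptions, not code from the proposal:
the helper name proposed_result_shape is hypothetical, it accepts only
index arrays and slice objects (no Ellipsis), and it relies on
np.broadcast_shapes, which needs NumPy 1.20 or later.

```python
import numpy as np

def proposed_result_shape(array_shape, index):
    """Result shape under the 'index subspace first' rule.

    Hypothetical helper for illustration only. `index` is a tuple holding
    index arrays and slice objects, one per leading axis of the array.
    """
    index_shapes = []   # shapes of the index arrays
    remaining = []      # lengths of the axes that are only sliced
    for axis_len, item in zip(array_shape, index):
        if isinstance(item, slice):
            remaining.append(len(range(*item.indices(axis_len))))
        else:
            index_shapes.append(np.shape(item))
    # Axes not mentioned in `index` are left un-indexed.
    remaining.extend(array_shape[len(index):])
    # All index arrays share one "global" broadcast subspace.
    subspace = np.broadcast_shapes(*index_shapes)
    return tuple(subspace) + tuple(remaining)

shape = (10, 20, 30, 40, 50)
s = slice(None)
ind1 = np.zeros((2, 3, 4), dtype=int)
ind2 = np.zeros((3, 4), dtype=int)   # broadcastable to (2, 3, 4)

print(proposed_result_shape(shape, (s, s, s, ind1, s)))
# (2, 3, 4, 10, 20, 30, 50)
print(proposed_result_shape(shape, (s, ind1, s, ind2, s)))
# (2, 3, 4, 10, 30, 50)
print(proposed_result_shape(shape, (ind1, ind1, ind1, ind1)))
# (2, 3, 4, 50)
```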
Best regards,
-Travis