I think that we also don't like that, and after doing the original,
somewhat incomplete, implementation using the subarray approach,
I began to feel that implementing it in C (albeit using a different
approach for the code generation) was probably easier and more
elegant than what was done here. So you are very likely to see
it integrated as a regular numeric type, with a more C-based
implementation.
Sounds good. Is development going to take place on the CVS
tree? If so, I could help out by committing changes directly.
2) Also, in your C-API, you have a different pointer to the
imaginary data.
I much prefer the way it is done currently to have complex numbers
represented as an 8-byte, or 16-byte chunk of contiguous memory.
Any reason not to allow both? (The pointer to the real can be
interpreted as either a pointer to 8-byte or 16-byte quantities.)
It is true
that figuring out the imaginary pointer from the real is trivial
so I suppose it really isn't necessary.
I guess the way you've structured the ndarray, it is possible. I figured
some operations might be faster, but perhaps not if you have two pointers
running at the same time, anyway.
Well, the C implementation I was thinking of would only use
one pointer. The API could supply both if some algorithms would
find it useful to just access the imaginary data alone. But as
mentioned, I don't think it is important to include, so we
could easily get rid of it (and probably should).
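As an illustration of why the separate imaginary pointer is redundant, here is a minimal sketch of the contiguous layout for the 16-byte case. It uses only Python's standard struct module, not the C API under discussion, and the names real_at, imag_at, ITEM and HALF are made up for this example.

```python
import struct

# Three complex values stored as contiguous (real, imag) pairs of doubles,
# i.e. one 16-byte chunk per element.
values = [1 + 2j, 3 - 4j, -5.5 + 0.25j]
buf = b"".join(struct.pack("=dd", z.real, z.imag) for z in values)

ITEM = struct.calcsize("=dd")  # bytes per complex element
HALF = struct.calcsize("=d")   # bytes per component


def real_at(data, i):
    """Read the real part of element i (hypothetical helper)."""
    return struct.unpack_from("=d", data, i * ITEM)[0]


def imag_at(data, i):
    """Read the imaginary part: the real part's offset plus HALF."""
    return struct.unpack_from("=d", data, i * ITEM + HALF)[0]


roundtrip = [complex(real_at(buf, i), imag_at(buf, i)) for i in range(len(values))]
print(ITEM, HALF)
print(roundtrip)
```

The imaginary component always sits a fixed half-element past the real one, so a single pointer plus a constant offset is enough to reach either part.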
Index Arrays:
=============
1) For what it's worth, my initial reaction to your indexing
scheme is negative. I would prefer that if
a = [[1,2,3,4],
     [5,6,7,8],
     [9,10,11,12],
     [13,14,15,16]]
then
a[[1,3],[0,3]] returns the sub-matrix:
[[ 4,  6],
 [12, 14]]
i.e. the cross-product of [1,3] x [0,3]. This is the way MATLAB
works. I'm not sure what IDL does.
I'm afraid I don't understand the example. Could you elaborate
a bit more how this is supposed to work? (Or is it possible
there is an error? I would understand it if the result were
[[5, 8],[13,16]] corresponding to the index pairs
[[(1,0),(1,3)],[(3,0),(3,3)]])
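For reference, the cross-product reading can be checked against the 4x4 example with a few lines of plain Python. This is only a sketch on nested lists: cross_index is a hypothetical helper, not part of any array package.

```python
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]


def cross_index(arr, rows, cols):
    """Select arr[r][c] for every (r, c) in rows x cols, as a 2-d list."""
    return [[arr[r][c] for c in cols] for r in rows]


result = cross_index(a, [1, 3], [0, 3])
print(result)  # [[5, 8], [13, 16]]
```

With zero-based indices this gives [[5, 8], [13, 16]], the result expected from the index pairs listed above.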
The idea is to consider indexing with arrays of integers to be a
generalization of slice index notation. Simply interpret the
slice as an array of integers that would be formed by using the
range operator.
For example, I would like to see
a[1:5,1:3] be the same thing as a[[1,2,3,4],[1,2]]
a[1:5,1:3] selects the 2-d subarray consisting of rows 1 to 4 and
columns 1 to 2 (inclusive, with the first row being row 0). In other
words, the indices used to select the elements of a are ordered pairs
taken from the cross-product of the index sets:
[1,2,3,4] x [1,2] = [(1,1), (1,2), (2,1), (2,2),
                     (3,1), (3,2), (4,1), (4,2)]
and these selected elements are structured as a 2-d array of shape (4,2).
Does this make more sense? Indexing would be a natural extension of this
behavior, but allowing sets that can't necessarily be formed from the
range function.
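The proposed equivalence between slices and index lists can be sketched in plain Python as follows. The 6x4 test array b and the helper cross_index are assumptions made for this illustration. Nested lists stand in for a real array type, so the slice is written as a comprehension.

```python
# A 6x4 test array whose entries encode their position: b[r][c] == 10*r + c.
b = [[10 * r + c for c in range(4)] for r in range(6)]


def cross_index(arr, rows, cols):
    """Select arr[r][c] for every (r, c) in rows x cols, as a 2-d list."""
    return [[arr[r][c] for c in cols] for r in rows]


# The slice b[1:5, 1:3], spelled out for nested lists.
by_slice = [row[1:3] for row in b[1:5]]

# The same selection through explicit index lists and through range().
by_lists = cross_index(b, [1, 2, 3, 4], [1, 2])
by_range = cross_index(b, range(1, 5), range(1, 3))

# An index set that no single slice can express (repeats, arbitrary order).
non_range = cross_index(b, [4, 0, 4], [3, 1])

print(by_slice == by_lists == by_range)
print(non_range)
```

Under this reading, a slice is just the special case in which each index list happens to be a range.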
I understand this (but is the example in the first message
consistent with this?). This is certainly a reasonable
interpretation. But if this is the way multiple index arrays
are interpreted, how does one easily specify scattered points
in a multidimensional array? The only other alternative I can
think of is to use some of the dimensions of a multidimensional
index array as indices for each of the dimensions. For example,
if one wanted to index random points in a 2d array, then
supplying an nx2 array would provide a list of n such points.
But I see this as a more limiting way to do this (and there
are often benefits to being able to keep the indices for
different dimensions in separate arrays).
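The two ways of naming scattered points can be compared in a short plain-Python sketch. It assumes that separate index arrays are paired element by element, which is my reading of the scheme being defended here. take_points is a hypothetical helper.

```python
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]


def take_points(arr, rows, cols):
    """Pair the index lists element by element: arr[rows[k]][cols[k]]."""
    return [arr[r][c] for r, c in zip(rows, cols)]


# Separate index lists, one per dimension.
rows = [0, 2, 3]
cols = [3, 1, 0]
scattered = take_points(a, rows, cols)

# The alternative: a single n x 2 list of (row, col) points,
# which has to be split per dimension before use.
points = [[0, 3], [2, 1], [3, 0]]
from_pairs = take_points(a, *zip(*points))

print(scattered)   # [4, 10, 13]
print(from_pairs)  # [4, 10, 13]
```

Both forms select the same elements. Keeping the dimensions in separate lists avoids the unpacking step.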
But I think doing what you would like to do is straightforward
even with the existing implementation. For example, if x is a
2d array we could easily develop a function such that:
x[outer_index_product([1,3,4],[1,5])]
# with a better function name!
The function outer_index_product would return a tuple of two
index arrays each with a shape of 3x2. These arrays
would not take up more space than the original
arrays even though they appear to have a much
larger size (the one dimension is replicated by
use of a 0 stride size so the data buffer is
the same as the original). Would this be acceptable?
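A rough sketch of such a function, using plain nested lists: the name outer_index_product comes from the message above, but this body, the helper take_pairs, and the test array x are assumptions made for illustration. Unlike the zero-stride arrays described above, these nested lists really do replicate their data.

```python
def outer_index_product(rows, cols):
    """Return (row_idx, col_idx), each of shape len(rows) x len(cols)."""
    row_idx = [[r for _ in cols] for r in rows]  # each row index repeated across
    col_idx = [list(cols) for _ in rows]         # the column list repeated down
    return row_idx, col_idx


def take_pairs(arr, row_idx, col_idx):
    """Element-by-element indexing with two equally shaped 2-d index lists."""
    return [[arr[r][c] for r, c in zip(rr, cc)]
            for rr, cc in zip(row_idx, col_idx)]


# A 5x6 test array whose entries encode their position: x[r][c] == 10*r + c.
x = [[10 * r + c for c in range(6)] for r in range(5)]

row_idx, col_idx = outer_index_product([1, 3, 4], [1, 5])
selected = take_pairs(x, row_idx, col_idx)

print(row_idx)   # [[1, 1], [3, 3], [4, 4]]
print(col_idx)   # [[1, 5], [1, 5], [1, 5]]
print(selected)  # [[11, 15], [31, 35], [41, 45]]
```

The result is the 3x2 cross-product selection, obtained purely through element-by-element index pairing.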
In the end, all these indexing behaviors can be provided
by different functions. So it isn't really a question of
which one to have and which not to have. The question is
what is supported by the indexing notation? For us, the
behavior we have implemented is far more useful for our
applications than the one you propose. But perhaps we are
in the minority, so I'd be very interested in hearing which
indexing interpretation is most useful to the general
community.
Why not:
ravel(a)[[9,10,11]] ?
Sure, that would work, especially if ravel doesn't make a copy of
the data (which I presume it does not).
Correct.
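To make the flat-index idea concrete, here is a small plain-Python sketch. It assumes the 4x4 example array from earlier in the thread and row-major order. The list comprehension copies the data, unlike ravel as described above.

```python
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

ncols = len(a[0])

# Row-major flattening, standing in for ravel(a).
flat = [value for row in a for value in row]

flat_indices = [9, 10, 11]
picked = [flat[k] for k in flat_indices]

# A flat index k corresponds to (row, col) == divmod(k, ncols).
positions = [divmod(k, ncols) for k in flat_indices]

print(picked)     # [10, 11, 12]
print(positions)  # [(2, 1), (2, 2), (2, 3)]
```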
Perry