Some of the choices between rank-0 arrays and new scalar types might be resolved by enumerating the properties desired of them.
Most properties of rank-0 arrays could be fixed by consistency requirements alone, using operations that reduce array dimensions.
Let
    a = ones((2,3,4))
    b = sum(a)
    c = sum(b)
    d = sum(c)

Property 1: the shape of an array is a tuple of integers
    a.shape == (2, 3, 4)
    b.shape == (3, 4)
    c.shape == (4,)
    d.shape == ()

Property 2: rank(a) == len(a.shape)
    rank(a) == 3 == len(a.shape)
    rank(b) == 2 == len(b.shape)
    rank(c) == 1 == len(c.shape)
    rank(d) == 0 == len(d.shape)

Property 3: len(a) == a.shape[0]
    len(a) == 2 == a.shape[0]
    len(b) == 3 == b.shape[0]
    len(c) == 4 == c.shape[0]
    len(d) == Exception == d.shape[0]
    # Currently the last is wrong?

Property 4: size(a) == product(a.shape)
    size(a) == 24 == product(a.shape)
    size(b) == 12 == product(b.shape)
    size(c) == 4 == product(c.shape)
    size(d) == 1 == product(d.shape)
    # Currently the last is wrong

Property 5: rank-0 arrays behave as mutable numbers when used as values
    array(2) is similar to 2
    array(2.0) is similar to 2.0
    array(2j) is similar to 2j
    # This is a summary of many concrete properties.

Property 6: Indexing reduces rank. Slicing preserves rank.
    a[:,:,:].shape == (2, 3, 4)
    a[1,:,:].shape == (3, 4)
    a[1,1,:].shape == (4,)
    a[1,1,1].shape == ()

Property 7: Indexing by tuple of ints gives scalar.
    a[1,1,1] == 1
    b[1,1] == 2
    c[1,] == 6
    d[()] == 24
    # So rank-0 array indexed by empty tuple should be scalar.
    # Currently the last is wrong

Property 8: Indexing by tuple of slices gives array.
    a[:,:,:] == ones((2,3,4))
    b[:,:] == ones((3,4)) * 2
    c[:] == ones((4,)) * 6
    d[()] == ones(()) * 24
    # So rank-0 array indexed by empty tuple should be rank-0 array.
    # Currently the last is wrong

Property 9: Indexing as lvalues
    a[1,1,1] = 2
    b[1,1] = 2
    c[1,] = 2
    d[()] = 2

Property 10: Indexing and slicing as lvalues
    a[:,:,:] = ones((2, 3, 4))
    a[1,:,:] = ones((3, 4))
    a[1,1,:] = ones((4,))
    a[1,1,1] = ones(())
    # But the last is wrong.
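Properties 1 to 4 depend only on the shape tuple, so they can be checked without any array library. The following is a minimal sketch in plain Python; the helper names (reduce_first_axis, rank, size, length) are made up for illustration and are not Numeric or numarray functions.

```python
from math import prod  # available in Python 3.8 and later


def reduce_first_axis(shape):
    """Shape left after summing over the first axis (hypothetical helper)."""
    return shape[1:]


def rank(shape):
    # Property 2: the rank is the length of the shape tuple
    return len(shape)


def size(shape):
    # Property 4: the empty product is 1, so the shape () holds one element
    return prod(shape)


def length(shape):
    # Property 3: indexing the empty tuple raises IndexError for rank 0
    return shape[0]


a = (2, 3, 4)
b = reduce_first_axis(a)
c = reduce_first_axis(b)
d = reduce_first_axis(c)

shapes = [a, b, c, d]
ranks = [rank(s) for s in shapes]
sizes = [size(s) for s in shapes]
```

Under this model size(d) is well defined while length(d) fails, which is the asymmetry the properties describe.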
Conclusion 1: rank-0 arrays are equivalent to scalars. See properties 7 and 8.
Conclusion 2: rank-0 arrays are mutable. See property 9.
Conclusion 3: shape(scalar), size(scalar) are all defined, but len(scalar) should not be defined.
See conclusion 1 and properties 1, 2, 3, 4.
Conclusion 4: A missing axis is similar to having dimension 1. See property 4.
Conclusion 5: rank-0 int arrays should be allowed to act as indices. See property 5.
Conclusion 6: rank-0 arrays should not be hashable except by object id. See conclusion 2.
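Conclusions 2 and 6 can be illustrated with a small stand-in class. This is only a sketch: Rank0 is a hypothetical class written for this example, not an existing array type, and it models a single mutable value that compares by content but hashes by object identity.

```python
class Rank0:
    """Hypothetical mutable holder for one value, indexed only by ()."""

    def __init__(self, value):
        self.value = value

    def __getitem__(self, key):
        if key != ():
            raise IndexError("a rank-0 object accepts only the empty tuple")
        return self.value

    def __setitem__(self, key, value):
        if key != ():
            raise IndexError("a rank-0 object accepts only the empty tuple")
        self.value = value

    def __eq__(self, other):
        other_value = other.value if isinstance(other, Rank0) else other
        return self.value == other_value

    # Defining __eq__ disables the default hash, so restore the
    # identity-based one explicitly (Conclusion 6).
    __hash__ = object.__hash__


d = Rank0(24)
table = {d: "found"}
d[()] = 2                 # Property 9: assignment through the empty tuple
still_found = table[d]    # the key survives mutation because the hash is by identity
```

Had the hash been computed from the value, the entry in table would have become unreachable after the assignment, which is the usual reason mutable objects are not hashed by content.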
Discussions:
- These properties correspond to the current implementation quite well, except for a few rough edges.
- Mutable scalars are useful in their own right.
- Is there substantial difference in overhead between rank-0 arrays and scalars?
- How to write literal values? array(1) is too many characters.
- For rank-1 and rank-0 arrays, Python notation distinguishes:
    c[1] vs c[1,]
    d[] vs d[()]
Should these be used to handle semantic difference between indexing and slicing? Should d[] be syntactically allowed?
Hope these observations help.
Huaiyu
Oops, the subject line was somehow cut off. Please use this one if you follow up. - Huaiyu
On Tue, 17 Sep 2002, Huaiyu Zhu wrote:
Some of the choices between rank-0 arrays and new scalar types might be resolved by enumerating the properties desired of them.
These are helpful observations.
Most properties of rank-0 arrays could be fixed by consistency requirements alone, using operations that reduce array dimensions.
Let
    a = ones((2,3,4))
    b = sum(a)
    c = sum(b)
    d = sum(c)

Property 1: the shape of an array is a tuple of integers
    a.shape == (2, 3, 4)
    b.shape == (3, 4)
    c.shape == (4,)
    d.shape == ()

Property 2: rank(a) == len(a.shape)
    rank(a) == 3 == len(a.shape)
    rank(b) == 2 == len(b.shape)
    rank(c) == 1 == len(c.shape)
    rank(d) == 0 == len(d.shape)

Property 3: len(a) == a.shape[0]
    len(a) == 2 == a.shape[0]
    len(b) == 3 == b.shape[0]
    len(c) == 4 == c.shape[0]
    len(d) == Exception == d.shape[0]
    # Currently the last is wrong?

Agreed, but this is because d is an integer and out of Numeric's control. This is a case for returning 0d arrays rather than Python scalars.

Property 4: size(a) == product(a.shape)
    size(a) == 24 == product(a.shape)
    size(b) == 12 == product(b.shape)
    size(c) == 4 == product(c.shape)
    size(d) == 1 == product(d.shape)
    # Currently the last is wrong

I disagree that this is wrong. This works as described for me.

Property 5: rank-0 arrays behave as mutable numbers when used as values
    array(2) is similar to 2
    array(2.0) is similar to 2.0
    array(2j) is similar to 2j
    # This is a summary of many concrete properties.

Property 6: Indexing reduces rank. Slicing preserves rank.
    a[:,:,:].shape == (2, 3, 4)
    a[1,:,:].shape == (3, 4)
    a[1,1,:].shape == (4,)
    a[1,1,1].shape == ()

Property 7: Indexing by tuple of ints gives scalar.
    a[1,1,1] == 1
    b[1,1] == 2
    c[1,] == 6
    d[()] == 24
    # So rank-0 array indexed by empty tuple should be scalar.
    # Currently the last is wrong

Not sure about this property, but interesting.

Property 8: Indexing by tuple of slices gives array.
    a[:,:,:] == ones((2,3,4))
    b[:,:] == ones((3,4)) * 2
    c[:] == ones((4,)) * 6
    d[()] == ones(()) * 24
    # So rank-0 array indexed by empty tuple should be rank-0 array.
    # Currently the last is wrong

Not sure about this one either.

Property 9: Indexing as lvalues
    a[1,1,1] = 2
    b[1,1] = 2
    c[1,] = 2
    d[()] = 2

Property 10: Indexing and slicing as lvalues
    a[:,:,:] = ones((2, 3, 4))
    a[1,:,:] = ones((3, 4))
    a[1,1,:] = ones((4,))
    a[1,1,1] = ones(())
    # But the last is wrong.
Conclusion 1: rank-0 arrays are equivalent to scalars. See properties 7 and 8.
Conclusion 2: rank-0 arrays are mutable. See property 9.
Conclusion 3: shape(scalar), size(scalar) are all defined, but len(scalar) should not be defined.
Why is this? I thought you argued the other way for len(scalar). Of course, one solution is that we could overwrite the len() function and allow it to work for scalars.
See conclusion 1 and properties 1, 2, 3, 4.
Conclusion 4: A missing axis is similar to having dimension 1. See property 4.
Conclusion 5: rank-0 int arrays should be allowed to act as indices. See property 5.
Can't do this for lists and other builtin sequences.
Conclusion 6: rank-0 arrays should not be hashable except by object id. See conclusion 2.
Discussions:
- These properties correspond to the current implementation quite well, except a few rough edges.
- Mutable scalars are useful in their own rights.
- Is there substantial difference in overhead between rank-0 arrays and scalars?
Yes.
- How to write literal values? array(1) is too many characters.
- For rank-1 and rank-0 arrays, Python notation distinguishes:
    c[1] vs c[1,]
    d[] vs d[()]
  Should these be used to handle semantic difference between indexing and slicing? Should d[] be syntactically allowed?
Hope these observations help.
Thanks for the observations.
-Travis O.
On Tue, 17 Sep 2002, Travis Oliphant wrote:
len(d) == Exception == d.shape[0]
# Currently the last is wrong?
Agreed, but this is because d is an integer and out of Numerics control. This is a case for returning 0d arrays rather than Python scalars.
That is one problem. It can be removed by using shape(d).
More fundamentally, though, len(d) == shape(d)[0] == ()[0] => IndexError. I think Konrad made this point a few days back.
size(d) == 1 == product(d.shape)
# Currently the last is wrong
I disagree that this is wrong. This works as described for me.
Right. Change d.shape to shape(d) here (and in several other places).
Why is this? I thought you argued the other way for len(scalar). Of course, one solution is that we could overwrite the len() function and allow it to work for scalars.
Raising an exception is the correct behavior, not a problem to be solved.
Conclusion 5: rank-0 int arrays should be allowed to act as indices. See property 5.
Can't do this for lists and other builtin sequences.
If numarray defines a consistent set of behaviors for integer types that is intuitively understandable, it might not be difficult to persuade core Python to check against an abstract integer type.
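For readers coming to this thread later: core Python did eventually gain such a hook, the __index__ special method (PEP 357, Python 2.5), which lets any object declare that it can serve as an integer index. The sketch below assumes a hypothetical IntCell class standing in for a rank-0 integer array; it is not part of Numeric or numarray.

```python
import operator


class IntCell:
    """Hypothetical mutable integer holder usable as a sequence index."""

    def __init__(self, value):
        self.value = int(value)

    def __index__(self):
        # Called by lists, tuples, strings and slices when indexing
        return self.value


items = ["a", "b", "c", "d"]
i = IntCell(2)

picked = items[i]            # plain indexing of a builtin list
tail = items[i:]             # slicing uses the same hook

i.value = 0                  # the holder is mutable
first = items[i]
as_int = operator.index(i)   # explicit conversion through the same protocol
```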
- Is there substantial difference in overhead between rank-0 arrays and scalars?
Yes.
That would be one major problem.
However, after giving this some more thought, I'm starting to doubt the analogy I made. The problem is that in the end there is still a need to index an array and obtain a good old immutable scalar. So
- What notation should be used for this purpose? We can use c[0] to get immutable scalars and c[0,] for rank-0 arrays / mutable scalars. But what about other ranks? Python does not allow distinctions based on a[1,1,1] versus a[(1,1,1)] or d[] versus d[()].
- This weakens the argument that rank-0 arrays are scalars, since that argument is essentially based on sum(c) and c[0] being of the same type.
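The notational point can be checked directly by looking at what the subscript operator hands to __getitem__. This is a small sketch; KeyRecorder is a made-up class that simply returns the key it receives.

```python
class KeyRecorder:
    """Returns whatever key the subscript syntax passes in."""

    def __getitem__(self, key):
        return key


k = KeyRecorder()

# a[1,1,1] and a[(1,1,1)] arrive as the same tuple, so they cannot be told apart
same = k[1, 1, 1] == k[(1, 1, 1)]

int_key = k[1]        # a bare int
tuple_key = k[1,]     # a one-element tuple, distinguishable from the int
empty_key = k[()]     # the empty tuple

# d[] is rejected by the parser before any method is called
try:
    compile("k[]", "<example>", "eval")
    empty_brackets_allowed = True
except SyntaxError:
    empty_brackets_allowed = False
```

So the int-versus-tuple distinction exists only at rank 1; for higher ranks and for rank 0 the two spellings are either identical or not expressible.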
Huaiyu
Hi,
Until now I have developed all my numerical code using Numeric objects.
I want to use some modules that use numarray. I have tested some conversion functions in Python, e.g.
numarray.array(Numeric_object.tolist())
but it seems a bit slow to me (especially for large arrays such as images). Are there any other ways of doing this conversion? Should I write my own conversion in C? What advice could you give me to make the transition smoother?
Furthermore, I found the web site about numarray/Numeric incompatibilities very interesting. However, I did not quite understand the paragraph:
Numeric arrays have some public attributes. Numarray arrays have none. All changes to an array's state must be made through accessor methods. Specifically, Numeric has the following attributes:
Numeric attribute    numarray accessor method(s)
shape            --> shape() (i.e., specify new shape through method arg instead of assigning to shape attribute)
flat             --> flat()
real             --> real() (set capability not implemented yet, but will be)
imag             --> imag() (ditto)
savespace        --> not used, no equivalent functionality (xxx check this)
since interactively typing a.shape() or a.shape((8,8)) raises an exception (Tuple object not callable). Am I misunderstanding something?
Thanks in advance. Best regards, Aureli
Aureli Soria Frisch wrote:
Hi,
Until now I have developed all my numerical code using Numeric objects.
I want to use some modules that use numarray. I have tested some conversion functions in Python, e.g.
numarray.array(Numeric_object.tolist())
Much faster, but less general, is:
numarray.fromstring(Numeric_object.tostring(), type=XXXX)
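The difference between the two routes can be sketched with the standard library's array module, used here purely as a stand-in since neither Numeric nor numarray is assumed to be installed. The list route creates one Python object per element, while the byte route copies the raw buffer in one step; one plausible reading of "less general" is that the element type must then be supplied by hand and the shape is not carried along.

```python
from array import array

# Stand-in for a large numeric buffer such as image data
source = array("d", [float(i) for i in range(1000)])

# General but slower route: one Python float object per element
via_list = array("d", source.tolist())

# Faster route: copy the raw bytes; the type code "d" must match the source
raw = source.tobytes()
via_bytes = array("d")
via_bytes.frombytes(raw)
```

Both results compare equal to the source here; with a real multidimensional array the shape would have to be restored separately after the byte copy.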
but it seems a bit slow to me (especially for large arrays such as images). Are there any other ways of doing this conversion?
Should I write my own conversion in C?
It depends on how soon you need it to be fast. numarray fromlist/tolist should be implemented in C for the next release, which should occur within a couple of months, perhaps sooner. If you want to write a C extension for numarray now, the best way to do it is to use the Numeric compatibility layer, which essentially uses a subset of the Numeric C-API, and is not likely to change.
What advice could you give me to make the transition smoother?
Furthermore, I found the web site about numarray/Numeric incompatibilities very interesting. However, I did not quite understand the paragraph:
I think that webpage is out of date.
since interactively typing a.shape() or a.shape((8,8)) raises an exception (Tuple object not callable). Am I misunderstanding something?
Actually, for Python 2.2 and up, numarray has all of those attributes in a Numeric-compatible form based on Python properties. So ".shape" gets and sets the shape in numarray, just as in Numeric:

    >>> import numarray
    >>> a = numarray.arange(10)
    >>> a.shape
    (10,)
    >>> a.shape = (2,5)
    >>> a.shape
    (2, 5)
Because of this, the expression a.shape() that you mentioned tries to "call" the shape tuple returned by .shape, resulting in an exception.
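The mechanism can be reproduced with an ordinary Python property. The sketch below uses a hypothetical Grid class written for illustration; it is not numarray's implementation, only a demonstration of why attribute-style access works and why calling the result fails.

```python
class Grid:
    """Hypothetical array-like class with a property-based shape."""

    def __init__(self, n):
        self._data = list(range(n))
        self._shape = (n,)

    @property
    def shape(self):
        return self._shape

    @shape.setter
    def shape(self, new_shape):
        total = 1
        for extent in new_shape:
            total *= extent
        if total != len(self._data):
            raise ValueError("new shape must keep the number of elements")
        self._shape = tuple(new_shape)


a = Grid(10)
before = a.shape      # reads go through the getter
a.shape = (2, 5)      # assignments go through the setter
after = a.shape

# a.shape evaluates to a tuple, so a.shape() tries to call that tuple
try:
    a.shape()
    call_error = None
except TypeError as exc:
    call_error = str(exc)
```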
Todd