Thoughts about zero dimensional arrays vs Python scalars
Travis,
Discussing zero dimensional arrays, the PEP says at one point:
... When ndarray is imported, it will alter the numeric table for python int, float, and complex to behave the same as array objects.
Thus, in the proposed solution, 0-dim arrays would never be returned from calculation, but instead, the equivalent Python Array Scalar Type. Internally, these ArrayScalars can be quickly converted to 0-dim arrays when needed. Each scalar would also have a method to convert to a "standard" Python Type upon request (though this shouldn't be needed often).
I'm not sure I understand this. Does it mean that, after having imported ndarray, "type(1)" evaluates to "ndarray.IntArrType" rather than "int"?
If so, I think this is a dangerous idea. There is one important difference between zero dimensional arrays and Python scalar types, which is not discussed in the PEP: arrays are mutable, Python scalars are immutable.
When Guido introduced in-place operators in Python (+=, *=, etc.), he decided that "i += 1" should be allowed for Python scalars and should mean "i = i + 1". Here you have it: it means something different when i is a mutable zero dimensional array. So, I suspect a tacit re-definition of Python scalars on ndarray import will break some code out there (code that does not deal with arrays at all).
Facing this important difference between arrays and Python scalars, I'm also not sure anymore that advertising zero dimensional arrays as essentially the same as Python scalars is such a good idea. Perhaps it would be better not to try to inherit from Python's number types and all that. Perhaps it would be easier to just say that indexing an array always results in an array and that zero dimensional arrays can be converted into Python scalars. Period.
Ralf
PS: You wrote two questions about zero dimensional arrays vs Python scalars into the PEP. What are your plans for deciding these?
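
[Editorial illustration] The concern about "i += 1" in the message above can be sketched in plain Python. The class `Cell` is a hypothetical stand-in for a mutable zero dimensional array; it is not part of ndarray, Numeric, or the PEP, and only shows how in-place addition differs between an immutable scalar and a mutable object that a second name also refers to.

```python
class Cell:
    """Hypothetical stand-in for a mutable zero-dimensional array."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        # In-place add: mutate this object and return it, which a
        # mutable type is allowed to do.
        self.value += other
        return self


# Immutable Python int: "i += 1" rebinds i to a new object.
i = 1
j = i
i += 1
print(i, j)  # j still refers to the old value

# Mutable stand-in: "a += 1" changes the object that b also refers to.
a = Cell(1)
b = a
a += 1
print(a.value, b.value, a is b)
```

With the int, only the name `i` changes. With the mutable stand-in, the second name `b` sees the update as well, which is the behavioral difference the message warns about.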
I just read the section about "Array Scalars" again and am not sure anymore that I understood the whole idea. When you say "Array Scalar", do you mean a zero dimensional array or is an "Array Scalar" yet another animal?
Ralf
On Sat, 2005-03-19 at 21:06, Ralf Juengling wrote:
Travis,
Discussing zero dimensional arrays, the PEP says at one point:
... When ndarray is imported, it will alter the numeric table for python int, float, and complex to behave the same as array objects.
Thus, in the proposed solution, 0-dim arrays would never be returned from calculation, but instead, the equivalent Python Array Scalar Type. Internally, these ArrayScalars can be quickly converted to 0-dim arrays when needed. Each scalar would also have a method to convert to a "standard" Python Type upon request (though this shouldn't be needed often).
I'm not sure I understand this. Does it mean that, after having imported ndarray, "type(1)" evaluates to "ndarray.IntArrType" rather than "int"?
If so, I think this is a dangerous idea. There is one important difference between zero dimensional arrays and Python scalar types, which is not discussed in the PEP: arrays are mutable, Python scalars are immutable.
When Guido introduced in-place operators in Python, (+=, *=, etc.) he decided that "i += 1" should be allowed for Python scalars and should mean "i = i + 1". Here you have it, it means something different when i is a mutable zero dimensional array. So, I suspect a tacit re-definition of Python scalars on ndarray import will break some code out there (code, that does not deal with arrays at all).
Facing this important difference between arrays and Python scalars, I'm also not sure anymore that advertising zero dimensional arrays as essentially the same as Python scalars is such a good idea. Perhaps it would be better not to try to inherit from Python's number types and all that. Perhaps it would be easier to just say that indexing an array always results in an array and that zero dimensional arrays can be converted into Python scalars. Period.
Ralf
PS: You wrote two questions about zero dimensional arrays vs Python scalars into the PEP. What are your plans for deciding these?
Ralf Juengling wrote:
I just read the section about "Array Scalars" again and am not sure anymore that I understood the whole idea. When you say "Array Scalar", do you mean a zero dimensional array or is an "Array Scalar" yet another animal?
Ralf
It is another type object that is a scalar but "quacks" like an array (has the same methods and attributes)
-Travis
Travis Oliphant wrote:
Ralf Juengling wrote:
I just read the section about "Array Scalars" again and am not sure anymore that I understood the whole idea. When you say "Array Scalar", do you mean a zero dimensional array or is an "Array Scalar" yet another animal?
Ralf
It is another type object that is a scalar but "quacks" like an array (has the same methods and attributes)
... but is, unlike arrays, an immutable type (just like the existing Python scalars).
ralf
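
[Editorial illustration] The idea of a scalar that "quacks" like an array yet stays immutable can be sketched in plain Python. The class `FloatScalar` and its `shape` and `ndim` attributes are hypothetical names chosen for this sketch; this is not the PEP's implementation, only one way such a type could look, assuming it subclasses Python's float.

```python
class FloatScalar(float):
    """Hypothetical array-like scalar: immutable, with array-style attributes."""

    __slots__ = ()  # no per-instance attribute dict, so no extra mutable state

    @property
    def shape(self):
        # A scalar has no dimensions, so its shape is the empty tuple.
        return ()

    @property
    def ndim(self):
        return 0


x = FloatScalar(1.5)
y = x
x += 1.0  # float defines no in-place add, so this rebinds x to a new object

print(x, y)                  # y keeps the original value
print(y.shape, y.ndim)       # array-style attributes are available
print(isinstance(y, float))  # explicit type checks against float still pass
print(type(x).__name__)      # the result of the addition is a plain float
```

In this sketch the sum falls back to a plain float because the arithmetic operators are not overridden; a fuller design would presumably override them so that results keep the array-like attributes.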
Ralf Juengling wrote:
PS: You wrote two questions about zero dimensional arrays vs Python scalars into the PEP. What are your plans for deciding these?
It looks as though a decision has been made. I was among those who favoured abandoning rank-0 arrays; we lost.
To my mind rank-0 arrays add complexity for little benefit and make explanation more difficult.
I don't spot any discussion in the PEP of the pros and cons of the nd == 0 case.
Colin W.
Colin J. Williams wrote:
Ralf Juengling wrote:
PS: You wrote two questions about zero dimensional arrays vs Python scalars into the PEP. What are your plans for deciding these?
It looks as though a decision has been made. I was among those who favoured abandoning rank-0 arrays; we lost.
To my mind rank-0 arrays add complexity for little benefit and make explanation more difficult.
I don't spot any discussion in the PEP of the pros and cons of the nd == 0 case.
A correction! There is, in the PEP::

    Questions

    1) should sequence behavior (i.e. some combination of slicing,
       indexing, and len) be supported for 0-dim arrays?

       Pros: It means that len(a) always works and returns the size
             of the array. Slicing code and indexing code will work
             for any dimension (the 0-dim array is an identity
             element for the operation of slicing)

       Cons: 0-dim arrays are really scalars. They should behave
             like Python scalars which do not allow sequence behavior

    2) should array operations that result in a 0-dim array that is
       the same basic type as one of the Python scalars, return the
       Python scalar instead?

       Pros:

         1) Some cases when Python expects an integer (the most
            dramatic is when slicing and indexing a sequence:
            _PyEval_SliceIndex in ceval.c) it will not try to
            convert it to an integer first before raising an error.
            Therefore it is convenient to have 0-dim arrays that
            are integers converted for you by the array object.

         2) No risk of user confusion by having two types that are
            nearly but not exactly the same and whose separate
            existence can only be explained by the history of
            Python and NumPy development.

         3) No problems with code that does explicit typechecks
            (isinstance(x, float) or type(x) == types.FloatType).
            Although explicit typechecks are considered bad practice
            in general, there are a couple of valid reasons to use
            them.

         4) No creation of a dependency on Numeric in pickle files
            (though this could also be done by a special case in
            the pickling code for arrays)

       Cons: It is difficult to write generic code because scalars
             do not have the same methods and attributes as arrays
             (such as .type or .shape). Also Python scalars have
             different numeric behavior as well.

             This results in a special-case checking that is not
             pleasant. Fundamentally it lets the user believe that
             somehow multidimensional homogeneous arrays are
             something like Python lists (which except for Object
             arrays they are not).

For me and for the end user, the (2) Pros win.
Colin W.
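
[Editorial illustration] Pro 1 under question 2 concerns objects that hold an integer but are rejected as sequence indices. A minimal sketch of that situation in present-day Python follows. It relies on the `__index__` hook, which was added to Python later (PEP 357, Python 2.5) and was not available when this thread was written; the classes `IndexLike` and `NotIndexLike` are hypothetical and only stand in for integer-valued array scalars.

```python
class IndexLike:
    """Hypothetical integer-valued scalar that is not a Python int."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Python calls this hook when the object is used as an index
        # or as a slice bound.
        return self.value


class NotIndexLike:
    """Same data, but without the __index__ hook."""

    def __init__(self, value):
        self.value = value


data = [10, 20, 30, 40]

print(data[IndexLike(2)])   # indexing works through __index__
print(data[:IndexLike(2)])  # so does slicing

try:
    data[NotIndexLike(2)]
except TypeError as exc:
    # Without the hook, the list refuses the object as an index.
    print("TypeError:", exc)
```

The second class shows the failure mode the PEP text describes: the sequence does not attempt a conversion to an integer and raises an error instead.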
Colin J. Williams wrote:
It looks as though a decision has been made. I was among those who favoured abandoning rank-0 arrays, we lost.
I don't understand how you can say this. In what way have rank-0 arrays not been abandoned for the new Array Scalar objects? By the way, these array scalar objects can easily be explained as equivalent to the type hierarchy of current numarray (it is essentially identical --- it's just in C).
To my mind rank-0 arrays add complexity for little benefit and make explanation more difficult.
I don't know what you mean. rank-0 arrays are built into the arrayobject type. Removing them is actually difficult. The easiest thing to do is to return rank-0 arrays whenever the operation allows it. It is the confusion with desiring to use items in an array (which are logically rank-0 arrays) as equivalent to Python scalars that requires the Array Scalars that "bridge the gap" between rank-0 arrays and "regular" Python scalars.
Perhaps you mean that "Array Scalars" add complexity for "little benefit" and not "rank-0 arrays". To address that question: It may add complexity, but it does add benefit (future optimization, array type hierarchy, and a better bridge between the problem of current Python scalars and array-conscious scalars).
This rank-0 problem has been a wart with Numeric for a long time. Most of us long-time users work around it, but heavy users are definitely aware of the problem and a bit annoyed. I think we have finally found a reasonable "compromise" solution in the Array Scalars. Yes, it did take more work to implement (and will take a little more work to maintain --- you need to add methods to the GenericScalar class when you add them to the Array Class), but I can actually see it working.
-Travis
participants (3)
- Colin J. Williams
- Ralf Juengling
- Travis Oliphant