Some comments on the Numeric3 Draft of 1-Mar-05

Colin J. Williams wrote:
I suggest that Numeric3 offers the opportunity to drop the word /rank/ from its lexicon. "rank" has an established usage long before digital computers. See: http://mathworld.wolfram.com/Rank.html
It also has a well-established usage with multi-arrays. http://mathworld.wolfram.com/TensorRank.html
Perhaps some abbreviation for "Dimensions" would be acceptable.
It is also reasonable to say that array([1., 2., 3.]) has 3 dimensions.
Matrix Class
" A default Matrix class will either inherit from or contain the Python class". Surely, almost all of the objects above are to be rooted in "new" style classes. See PEP's 252 and 253 or http://www.python.org/2.2.2/descrintro.html
Sure, but just because inheritance is possible does not entail that it is a good idea.
--
Robert Kern
rkern@ucsd.edu
"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

On 01.03.2005, at 20:08, Colin J. Williams wrote:
Basic Types These are, presumably, intended as the types of the data elements contained in an Array instance. I would see them as sub-types of Array.
Element types as subtypes???
I wonder why there is a need for 30 new types. Python itself has about 30 distinct types. Wouldn't it be more saleable to think in terms of an Array
The Python standard library has hundreds of types, considering that the difference between C types and classes is an implementation detail.
Suppose one has: import numarray.numerictypes as _nt
Then, the editor (PythonWin for example) responds to the entry of "_nt." with a drop down menu offering the available types from which the user can select one.
That sounds interesting, but it looks like this would require specific support from the editor.
I suggest that Numeric3 offers the opportunity to drop the word rank from its lexicon. "rank" has an established usage long before digital computers. See: http://mathworld.wolfram.com/Rank.html
The meaning of "tensor rank" comes very close and was probably the inspiration for the use of this terminology in array system.
Perhaps some abbreviation for "Dimensions" would be acceptable.
The equivalent of "rank" is "number of dimensions", which is a bit long for my taste.
len() seems to be treated as a synonym for the number of dimensions. Currently, in numarray, it follows the usual sequence of sequences approach of Python and returns the number of rows in a two dimensional array.
As it should. The rank is given by len(array.shape), which is pretty much a standard idiom in Numeric code. But I don't see any place in the PEP that proposes something different!
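The difference between len(array) and len(array.shape) can be sketched in pure Python, with nested lists standing in for an array object. The helper `shape` below is hypothetical (it is not part of Numeric or numarray) and assumes the nesting is regular.

```python
def shape(nested):
    """Return the shape of a regularly nested list as a tuple.

    Hypothetical helper for illustration only; it assumes every
    sub-list at a given depth has the same length.
    """
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

rows = len(a)           # sequence-of-sequences view: number of rows
rank = len(shape(a))    # "rank" in the Numeric sense: number of axes

v = [1.0, 2.0, 3.0]
v_rank = len(shape(v))  # one axis, although the vector has three components
```

Here `rows` and `rank` happen to be equal for the 2x3 example, which is perhaps why the two notions are easy to conflate; the vector `v` separates them.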
Rank-0 arrays and Python Scalars
Regarding Rank-0 Question 2. I've already, in effect, answered "yes". I'm sure that a more compelling "Pro" could be written
Three "pro" arguments to be added are:
- No risk of user confusion by having two types that are nearly but not exactly the same and whose separate existence can only be explained by the history of Python and NumPy development.
- No problems with code that does explicit typechecks (isinstance(x, float) or type(x) == types.FloatType). Although explicit typechecks are considered bad practice in general, there are a couple of valid reasons to use them.
- No creation of a dependency on Numeric in pickle files (though this could also be done by a special case in the pickling code for arrays)
The "Con" case is valid but, I suggest, of no great consequence. In my view, the important considerations are (a) the complexity of training the newcomer and (b) whether the added work should be imposed on the generic code writer or the end user. I suggest that the aim should be to make things as easy as possible for the end user.
That is indeed a valid argument.
Mapping Iterator An example could help here. I am puzzled by "slicing syntax does not work in constructors.".
Python allows the colon syntax only inside square brackets. x[a:b] and x[a:b:c] are fine but it is not possible to write iterator(a:b). One could use iterator[a:b] instead, but this is a bit confusing, as it is not the iterator that is being sliced.
Konrad.
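A small sketch of the point about square brackets: the class `SliceCapture` and the function `parses` are hypothetical names introduced here for illustration. The first shows the bracket workaround, the second checks which spellings the parser accepts.

```python
class SliceCapture:
    """Hypothetical helper: hands back whatever index expression it receives."""

    def __getitem__(self, index):
        return index


S = SliceCapture()

sl = S[2:10:3]            # a slice object, obtained through bracket syntax
data = list(range(12))
picked = data[sl]         # selects the same elements as data[2:10:3]


def parses(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


bracket_ok = parses("x[a:b:c]")     # colon inside square brackets
call_ok = parses("iterator(a:b)")   # colon inside a call
```

The workaround has the drawback already noted in the message: `S` itself is not being sliced, so the notation is somewhat misleading.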

konrad.hinsen@laposte.net wrote:
On 01.03.2005, at 20:08, Colin J. Williams wrote:
Basic Types These are, presumably, intended as the types of the data elements contained in an Array instance. I would see them as sub-types of Array.
Element types as subtypes???
Sub-types in the sense that, given an instance a of Array, a.elementType gives us the type of the data elements contained in a.
I wonder why there is a need for 30 new types. Python itself has about 30 distinct types. Wouldn't it be more saleable to think in terms of an Array
The Python standard library has hundreds of types, considering that the difference between C types and classes is an implementation detail.
I was thinking of the objects in the types module.
Suppose one has: import numarray.numerictypes as _nt
Then, the editor (PythonWin for example) responds to the entry of "_nt." with a drop down menu offering the available types from which the user can select one.
That sounds interesting, but it looks like this would require specific support from the editor.
Yes, it is built into Mark Hammond's PythonWin and is a valuable tool. Unfortunately, it is not available for Linux. However, I believe that SciTE and boa-constructor are intended to have the "completion" facility. These open source projects are available both with Linux and Windows.
I suggest that Numeric3 offers the opportunity to drop the word rank from its lexicon. "rank" has an established usage long before digital computers. See: http://mathworld.wolfram.com/Rank.html
The meaning of "tensor rank" comes very close and was probably the inspiration for the use of this terminology in array system.
Yes:
The total number of contravariant and covariant indices of a tensor. The rank of a tensor is independent of the number of dimensions of the space.
I was thinking in terms of linear independence, as with Matrix Rank:
The rank of a matrix or a linear map is the dimension of the range of the matrix or the linear map, corresponding to the number of linearly independent rows or columns of the matrix, or to the number of nonzero singular values of the map.
I guess there has been a tussle between the tensor users and the matrix users for some time.
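The two meanings of "rank" can be compared on one small example. The function `matrix_rank` below is a hypothetical helper written for this note (plain Gaussian elimination with exact fractions), not a routine from Numeric or numarray.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of equal-length rows) by Gaussian elimination.

    Exact Fraction arithmetic is used, so no numerical tolerance is needed.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


a = [[1, 2],
     [2, 4]]      # the second row is twice the first

linear_algebra_rank = matrix_rank(a)  # number of linearly independent rows
number_of_axes = 2                    # rows and columns: "rank" in array usage
```

For this matrix the linear-algebra rank is 1 while the array has two axes, so the same word would give two different numbers for the same object.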
Perhaps some abbreviation for "Dimensions" would be acceptable.
The equivalent of "rank" is "number of dimensions", which is a bit long for my taste.
Perhaps nDim, numDim or dim would be acceptable.
len() seems to be treated as a synonym for the number of dimensions. Currently, in numarray, it follows the usual sequence of sequences approach of Python and returns the number of rows in a two dimensional array.
As it should. The rank is given by len(array.shape), which is pretty much a standard idiom in Numeric code. But I don't see any place in the PEP that proposes something different!
This was probably my misreading of len(T).
Rank-0 arrays and Python Scalars
Regarding Rank-0 Question 2. I've already, in effect, answered "yes". I'm sure that a more compelling "Pro" could be written
Three "pro" arguments to be added are:
- No risk of user confusion by having two types that are nearly but not exactly the same and whose separate existence can only be explained by the history of Python and NumPy development.
Thanks, history has a pull in favour of retaining the current approach.
- No problems with code that does explicit typechecks (isinstance(x, float) or type(x) == types.FloatType). Although explicit typechecks are considered bad practice in general, there are a couple of valid reasons to use them.
I would see this as supporting the conversion to a scalar. For example:
>>> type(type(x))
<type 'type'>
>>> isinstance(x, float)
True
>>> isinstance(x, types.FloatType)
True
- No creation of a dependency on Numeric in pickle files (though this could also be done by a special case in the pickling code for arrays)
The "Con" case is valid but, I suggest, of no great consequence. In my view, the important considerations are (a) the complexity of training the newcomer and (b) whether the added work should be imposed on the generic code writer or the end user. I suggest that the aim should be to make things as easy as possible for the end user.
That is indeed a valid argument.
Mapping Iterator An example could help here. I am puzzled by "slicing syntax does not work in constructors.".
Python allows the colon syntax only inside square brackets. x[a:b] and x[a:b:c] are fine but it is not possible to write iterator(a:b). One could use iterator[a:b] instead, but this is a bit confusing, as it is not the iterator that is being sliced.
Thanks. It would be nice if a:b or a:b:c could return a slice object.
Konrad.
Colin W.

On 02.03.2005, at 18:21, Colin J. Williams wrote:
Sub-types in the sense that, given an instance a of Array, a.elementType gives us the type of the data elements contained in a.
Ah, I see, it's just about how to access the type object. That's not my first worry in design questions. Once you can get the object somehow, you can make it accessible in nearly any way you like.
The Python standard library has hundreds of types, considering that the difference between C types and classes is an implementation detail.
I was thinking of the objects in the types module.
Those are just the built-in types. There are no plans to increase their number.
Yes, it is built into Mark Hammond's PythonWin and is a valuable tool. Unfortunately, it is not available for Linux. However, I believe that SciTE and boa-constructor are intended to have the "completion" facility. These open source projects are available both with Linux and Windows.
The number of Python IDEs seems to be growing all the time - I haven't even heard of those. And I am still using Emacs...
The equivalent of "rank" is "number of dimensions", which is a bit long for my taste.
Perhaps nDim, numDim or dim would be acceptable.
As a variable name, fine. As a pseudo-word in normal language, no. Not for me at least. I like sentences to use real, pronounceable words.
- No problems with code that does explicit typechecks (isinstance(x, float) or type(x) == types.FloatType). Although explicit typechecks are considered bad practice in general, there are a couple of valid reasons to use them.
I would see this as supporting the conversion to a scalar. For example:
But technically it isn't, so some code would cease to work.
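Which typechecks break depends on how the scalar-like object is implemented. The sketch below assumes two hypothetical designs, `Float64Scalar` (a subclass of float) and `Rank0Array` (a wrapper that is not a float); neither name comes from the draft. It is written for Python 3, where the old types.FloatType is simply float.

```python
class Float64Scalar(float):
    """Hypothetical array-scalar type that inherits from Python's float."""


x = Float64Scalar(1.5)

passes_isinstance = isinstance(x, float)  # instances of a subclass pass
passes_exact_check = type(x) is float     # an exact type comparison fails


class Rank0Array:
    """Hypothetical rank-0 array that merely wraps a float value."""

    def __init__(self, value):
        self.value = float(value)

    def __float__(self):
        return self.value


y = Rank0Array(1.5)

wrapper_passes = isinstance(y, float)     # the wrapper is not a float at all
converted = float(y)                      # explicit conversion still works
```

Under these assumptions, code that compares types exactly stops working with either design, while isinstance checks survive only the subclass design.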
Thanks. It would be nice if a:b or a:b:c could return a slice object.
That would be difficult to reconcile with Python syntax because of the use of colons in the block structure of the code. The parser (and the programmers' brains) would have to handle stuff like
if slice == 1:: pass
correctly.
Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr

konrad.hinsen@laposte.net wrote:
On 02.03.2005, at 18:21, Colin J. Williams wrote: [snip]
The Python standard library has hundreds of types, considering that the difference between C types and classes is an implementation detail.
I was thinking of the objects in the types module.
Those are just the built-in types. There are no plans to increase their number.
My understanding was that there was to be a new builtin multiarray/Array class/type which eventually would replace the existing array.ArrayType. Thus, for a time at least, there would be at least one new class/type. In addition, it seemed to be proposed that the new class/type would not just be Array but Array_with_Int32, Array_with_Float64 etc.. I'm not too clear on this latter point but Konrad says that there would not be this multiplicity of basic class/type's.
Yes, it is built into Mark Hammond's PythonWin and is a valuable tool. Unfortunately, it is not available for Linux. However, I believe that SciTE and boa-constructor are intended to have the "completion" facility. These open source projects are available both with Linux and Windows.
The number of Python IDEs seems to be growing all the time - I haven't even heard of those. And I am still using Emacs...
Having spent little time with Unices, I'm not familiar with emacs. Another useful facility with PythonWin is that when one enters a class, function or method, followed by "(", the docstring is presented. This is often helpful. Finally, the PythonWin debug facility provides useful context information. Suppose that f1 calls f2 which calls ... fn and that we have a breakpoint in fn, then the current values in each of these contexts are available in a PythonWin panel.
[snip]
Thanks. It would be nice if a:b or a:b:c could return a slice object.
That would be difficult to reconcile with Python syntax because of the use of colons in the block structure of the code. The parser (and the programmers' brains) would have to handle stuff like
if slice == 1:: pass
correctly.
Konrad.
Yes, that is a problem which is not well resolved by requiring that a slice be terminated with a ")", "]", "}" or a space. One of the difficulties is that the slice is not recognized in the current syntax. We have a "slicing" which ties a slice with a primary, but no "slice". Your earlier suggestion that a slice be [a:b:c] is probably better. Then a slicing would be:
primary slice
which no doubt creates parsing problems. Thomas Wouters proposed a similar structure for a range in PEP 204 (http://python.fyxm.net/peps/pep-0204.html), which was rejected.
Colin W.

Colin J. Williams wrote:
My understanding was that there was to be a new builtin multiarray/Array class/type which eventually would replace the existing array.ArrayType. Thus, for a time at least, there would be at least one new class/type.
The new type will actually be in the standard library. For backwards compatibility we will not be replacing the existing array.ArrayType but providing an additional ndarray.ndarray (or some such name -- the name hasn't been finalized yet).
In addition, it seemed to be proposed that the new class/type would not just be Array but Array_with_Int32, Array_with_Float64 etc.. I'm not too clear on this latter point but Konrad says that there would not be this multiplicity of basic class/type's.
The arrays have always been homogeneous collections of "something". This "something" has been indicated by typecode characters (Numeric) or Python classes (numarray). The proposal is that the "something" that identifies what the homogeneous arrays are collections of will be actual type objects. Some of these type objects are just "organizational types" which help to classify the different kinds of homogeneous arrays. The "leaf-node" types are also the types of new Python scalars that act as a transition layer between ndarrays with their variety of objects and traditional Python bool, int, float, complex, string, and unicode objects which do not "understand" that they could be considered as 0-dimensional arrays.
-Travis
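One way to picture the arrangement described above is a small class hierarchy. Every name below (`generic`, `number`, `integer`, `floating`, `int32`, `float64`, `is_float_element_type`) is an assumption made for this sketch, as is the choice to let the floating-point leaf type inherit from Python's float; the draft may organize things differently.

```python
class generic:
    """Root of the hypothetical element-type hierarchy."""


class number(generic):
    """Organizational type: any numeric element."""


class integer(number):
    """Organizational type: any integer element."""


class floating(number):
    """Organizational type: any floating-point element."""


class int32(integer):
    """Leaf type: identifies arrays of 32-bit integers."""


class float64(floating, float):
    """Leaf type: also the type of a scalar that behaves like a Python float."""


def is_float_element_type(element_type):
    """Classify an element type by asking about its organizational parent."""
    return issubclass(element_type, floating)


s = float64(2.5)   # a scalar of the leaf type
total = s + 1      # arithmetic is inherited from float
```

The organizational types are never instantiated here; they exist only so that questions such as "is this a floating-point array?" can be answered with issubclass.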

Travis Oliphant wrote:
Colin J. Williams wrote:
My understanding was that there was to be a new builtin multiarray/Array class/type which eventually would replace the existing array.ArrayType. Thus, for a time at least, there would be at least one new class/type.
The new type will actually be in the standard library. For backwards compatibility we will not be replacing the existing array.ArrayType but providing an additional ndarray.ndarray (or some such name -- the name hasn't been finalized yet).
In addition, it seemed to be proposed that the new class/type would not just be Array but Array_with_Int32, Array_with_Float64 etc.. I'm not too clear on this latter point but Konrad says that there would not be this multiplicity of basic class/type's.
The arrays have always been homogeneous collections of "something". This "something" has been indicated by typecode characters (Numeric) or Python classes (numarray). The proposal is that the "something" that identifies what the homogeneous arrays are collections of will be actual type objects. Some of these type objects are just "organizational types" which help to classify the different kinds of homogeneous arrays. The "leaf-node" types are also the types of new Python scalars that act as a transition layer between ndarrays with their variety of objects and traditional Python bool, int, float, complex, string, and unicode objects which do not "understand" that they could be considered as 0-dimensional arrays.
Thanks. This clarifies things. These 'somethingTypes' would presumably not be in the standard library but in some module like Numeric3.numerictypes. Colin W.

On Mar 3, 2005, at 17:46, Colin J. Williams wrote:
Those are just the built-in types. There are no plans to increase their number.
My understanding was that there was to be a new builtin multiarray/Array class/type which eventually would replace the existing array.ArrayType. Thus, for a time
Neither the current array type nor the proposed multiarray type are builtin types. They are types defined in modules belonging to the standard library.
Yes, that is a problem which is not well resolved by requiring that a slice be terminated with a ")", "]", "}" or a space. One of the difficulties is that the slice is not recognized in the current syntax. We have a "slicing" which ties a slice with a
It is, but in the form of a standard constructor: slice(a, b, c).
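A brief illustration of the constructor form: slice(a, b, c) builds the same object that the bracket syntax produces, so it can be passed around like any other value. The function `take` is a hypothetical example of a callee that accepts a slice as an ordinary argument.

```python
data = list(range(10))

a, b, c = 1, 8, 2
s = slice(a, b, c)               # explicit constructor, no colon syntax

same = data[s] == data[a:b:c]    # both spellings select the same elements

# slice.indices() resolves omitted bounds for a sequence of a given length.
start, stop, step = slice(None, None, -1).indices(len(data))


def take(seq, s):
    """Hypothetical function that receives a slice as a normal argument."""
    return seq[s]


evens = take(data, slice(0, None, 2))
```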
Thomas Wouters proposed a similar structure for a range in PEP204 (http://python.fyxm.net/peps/pep-0204.html), which was rejected.
We would probably face the same problem: a syntax change must matter to many people to have a chance of being accepted.
Konrad.
--
Konrad Hinsen
Laboratoire Léon Brillouin, CEA Saclay,
91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr
participants (4): Colin J. Williams, konrad.hinsen@laposte.net, Robert Kern, Travis Oliphant