Thoughts on getting "something" in the Python core
To all interested in the future of arrays...
I'm still very committed to Numeric3 as I want to bring the numarray and Numeric people together behind a single array object for scientific computing.
But, I've been thinking about the array protocol and thinking that it would be a good thing if this became universal. One of the ways to make it universal is by having something that follows it in the Python core.
So, what if we proposed for the Python core not something like Numeric3 (which would still exist in scipy.base and be everybody's favorite array :-) ), but a very minimal array object (scaled back even from Numeric) that followed the array protocol and had some C-API associated with it.
This minimal array object would support 5 basic types ('bool', 'integer', 'float', 'complex', 'Object'). (Maybe a void type could be defined and a void "scalar" introduced (which would be the bytes object)). These types correspond to scalars already available in Python and so the whole 0-dim array Python scalar arguments could be ignored.
Math could be done without ufuncs initially (people really needing speed would use scipy.base anyway). But, more people in the Python community would be able to use arrays and get used to them. And we would have a reference array_protocol object so that extension writers could write to it.
I would not try a project like this until after scipy_core is out, but it's an interesting thing to think about. I mainly wanted feedback on the basic concept.
An alternative would be to "add" multidimensionality to the array object already part of Python, fix its reallocating with an exposed buffer problem, and add the array protocol.
-Travis
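To make the proposal concrete, here is a minimal pure-Python sketch of what such a scaled-back array object might look like. It is only an illustration under stated assumptions: the class name `MiniArray`, the `describe()` method, and the row-major flat-list storage are hypothetical and are not taken from Numeric3, scipy.base, or the array protocol itself. Only the five type names come from the message above.

```python
from math import prod  # Python 3.8+

# The five basic types named in the proposal, mapped to Python scalar types.
_TYPES = {
    "bool": bool,
    "integer": int,
    "float": float,
    "complex": complex,
    "Object": object,
}


class MiniArray:
    """Hypothetical N-dimensional array stored as a flat, row-major list."""

    def __init__(self, shape, typename, data=None):
        if typename not in _TYPES:
            raise TypeError(f"unsupported type: {typename!r}")
        self.shape = tuple(shape)
        self.typename = typename
        size = prod(self.shape)
        cast = _TYPES[typename]
        if data is None:
            # Default fill: False, 0, 0.0, 0j, or None for 'Object'.
            fill = None if cast is object else cast()
            self._data = [fill] * size
        else:
            data = list(data)
            if len(data) != size:
                raise ValueError("data length does not match shape")
            self._data = data if cast is object else [cast(x) for x in data]

    def _offset(self, index):
        # Convert an N-dimensional index into a position in the flat list.
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            offset = offset * n + i
        return offset

    def __getitem__(self, index):
        return self._data[self._offset(index)]

    def __setitem__(self, index, value):
        cast = _TYPES[self.typename]
        self._data[self._offset(index)] = value if cast is object else cast(value)

    def describe(self):
        """Hypothetical protocol-style description: shape plus element type."""
        return {"shape": self.shape, "type": self.typename}


a = MiniArray((2, 3), "float", range(6))
a[0, 1] = 7
print(a[1, 2], a[0, 1], a.describe())
```

Because every element is an ordinary Python scalar, indexing never has to return a 0-dimensional array, which is the simplification the message alludes to.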
On 01.04.2005, at 01:51, Travis Oliphant wrote:
So, what if we proposed for the Python core not something like Numeric3 (which would still exist in scipy.base and be everybody's favorite array :-) ), but a very minimal array object (scaled back even from Numeric) that followed the array protocol and had some C-API associated with it.
What would that minimal array object have in common with the full-size one? A subset of both the Python API and the C API? The data layout? Would the full one be a subtype of the minimal one?
I like the idea in principle, but I would like to be sure that it doesn't create additional overhead in the full array or in extension modules that use arrays, in the form of additional type checks and compatibility criteria. Once there is a minimal array type in the core, objects of that type will be circulating and must somehow be handled.
Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25
Fax: +33-1 69 08 82 61
E-Mail: khinsen@cea.fr
Good idea. For many applications such an extension would be 'good enough'.
1) Python code using such arrays should be 100% compatible with numarray/Numeric/scipy. This should be possible if a subset of numarray/Numeric/scipy is used.
2) Extensions written in C should handle such arrays transparently (without unnecessary copying). This should also be possible given a compatible data layout.
Peter
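The "without unnecessary copying" requirement can be illustrated from Python with the standard library alone. The sketch below uses today's `memoryview` and the `array` module as a stand-in for the array protocol discussed in this thread; that substitution is an assumption made for illustration, and the function name `scale_in_place` is hypothetical.

```python
from array import array


def scale_in_place(buf, factor):
    """Multiply every element of a writable buffer of C doubles by factor.

    The data is modified through a view on the caller's memory, so no copy
    of the elements is made.
    """
    with memoryview(buf) as view:
        if view.format != "d" or view.readonly:
            raise TypeError("expected a writable buffer of C doubles")
        for i in range(len(view)):
            view[i] *= factor


values = array("d", [1.0, 2.0, 3.0])
scale_in_place(values, 2.0)
print(values.tolist())  # [2.0, 4.0, 6.0]
```

The consumer only inspects the element format and writability of the exported memory. It never needs to know which concrete array type produced it, which is the kind of transparency point 2 asks for.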
Travis Oliphant wrote:
To all interested in the future of arrays...
I'm still very committed to Numeric3 as I want to bring the numarray and Numeric people together behind a single array object for scientific computing.
Good.
But, I've been thinking about the array protocol and thinking that it would be a good thing if this became universal. One of the ways to make it universal is by having something that follows it in the Python core.
So, what if we proposed for the Python core not something like Numeric3 (which would still exist in scipy.base and be everybody's favorite array :-) ), but a very minimal array object (scaled back even from Numeric) that followed the array protocol and had some C-API associated with it.
I thought that your original Numeric3 proposal was in this direction - a simple multidimensional array class/type which could eventually replace Python's array module. In addition, and separately, there were to be a collection of ufuncs. Later, discussion seemed to drift from the basic Numeric3 towards SciPy.
This minimal array object would support 5 basic types ('bool', 'integer', 'float', 'complex', 'Object'). (Maybe a void type could be defined and a void "scalar" introduced (which would be the bytes object)). These types correspond to scalars already available in Python and so the whole 0-dim array Python scalar arguments could be ignored.
Could this be subclassed so that provision could be made for Int8 (or even Int1)? How would an array of records be handled?
Math could be done without ufuncs initially (people really needing speed would use scipy.base anyway). But, more people in the Python community would be able to use arrays and get used to them. And we would have a reference array_protocol object so that extension writers could write to it.
It would be good if the user could write his/her ufunc in Python.
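As a rough illustration of what a user-written ufunc in Python could mean, the sketch below lifts an ordinary scalar function so that it applies element by element over nested lists, treating scalars as broadcast values. The helper name `make_ufunc` and the use of nested lists in place of a real array type are assumptions for this example only. Nothing here reflects the actual ufunc machinery in Numeric or numarray.

```python
import operator


def make_ufunc(scalar_func):
    """Lift a function of scalars so it applies elementwise to nested lists."""

    def ufunc(*args):
        lists = [a for a in args if isinstance(a, list)]
        if not lists:
            # All arguments are scalars: call the underlying function.
            return scalar_func(*args)
        n = len(lists[0])
        if any(len(a) != n for a in lists):
            raise ValueError("shape mismatch")
        # Recurse one nesting level down; scalars are passed through unchanged.
        return [
            ufunc(*[a[i] if isinstance(a, list) else a for a in args])
            for i in range(n)
        ]

    return ufunc


add = make_ufunc(operator.add)
print(add([[1, 2], [3, 4]], 10))  # [[11, 12], [13, 14]]
```

A pure-Python loop like this is slow compared with compiled ufuncs, which matches the remark above that people who really need speed would use scipy.base.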
I would not try a project like this until after scipy_core is out, but it's an interesting thing to think about. I mainly wanted feedback on the basic concept.
The concept looks good. Regarding timing, it seems better to build the foundation before building the house.
Colin W.
An alternative would be to "add" multidimensionality to the array object already part of Python, fix its reallocating with an exposed buffer problem, and add the array protocol.
-Travis
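The "reallocating with an exposed buffer" issue can be seen from the other side in present-day CPython 3, which refuses to resize an `array.array` while its memory is exported. This is an observation about current Python, not something stated in the thread. The variable names are illustrative.

```python
from array import array

values = array("d", [1.0, 2.0, 3.0])
view = memoryview(values)      # export the array's underlying buffer

try:
    values.append(4.0)         # growing the array could move the exposed memory
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                 # no exports remain
values.append(4.0)             # resizing is allowed again

print(resized_while_exported, len(values))  # False 4
```

Blocking the resize keeps any outstanding view from pointing at freed or moved memory, which is the hazard the sentence above refers to.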
I thought that your original Numeric3 proposal was in this direction - a simple multidimensional array class/type which could eventually replace Python's array module. In addition, and separately, there were to be a collection of ufuncs.
No, that's a misunderstanding. The original Numeric3 was never about "simplifying," because we can't "simplify" and still support the uses that Numeric and numarray have enjoyed. I'm more interested in using something like Numeric and will always install it should it exist. I was interested in getting it into the Python core for standardization. I now believe that "universal" standardization should occur around a "protocol" and perhaps a simple implementation. I'm still interested in a more "local standardization" for numarray and Numeric users (not all Python users), which is the focus of scipy.base (I used to call it Numeric3). In the process we are generating good ideas that can be used for "global standardization" among all Python users. But I can't do it all. I have to keep focused on what I'm doing with the current Numeric arrayobject (and that has never been about "getting rid of functionality").
Later, discussion seemed to drift from the basic Numeric3 towards SciPy.
The context of the problem as I see it intimately involves scipy and the collection of packages surrounding numarray. The small community we have built up was diverging in the creation of external packages. This is what troubled me most deeply. So, there is no Numeric3 separate from the larger issue of "a collection of standard scientific packages" that scipy has tried to be. That is why reference to scipy is made. I see no "drifting" occurring.
There is a separate issue of a good array module for Python. I now see the solution there as being more of a "good array protocol" for Python with a very simple default implementation that is improved by extension modules.
Could this be subclassed so that provision could be made for Int8 (or even Int1)?
I suppose, but this is kind of missing the point, because Numeric3 will support those types. If you need a more advanced array you install scipy.base.
How would an array of records be handled?
By installing a more advanced array.
The concept looks good. Regarding timing, it seems better to build the foundation before building the house.
The problem with your analogy is that the "sprawling mansion in the suburbs" is already built (Numeric has been around for a long time). The question is what kind of housing to build for the city dwellers and what kind of transportation system to establish so people can move back and forth easily.
-Travis
participants (4)
- Colin J. Williams
- konrad.hinsen@laposte.net
- Peter Verveer
- Travis Oliphant