Re: [Numpy-discussion] PEP 209: Multi-dimensional Arrays

Feb. 14, 2001

      ...
Design and Implementation
Some parts of this look a bit imprecise and I don't claim to
understand them. For example:
...
Its relation to the other types is defined when the C-extension
    module for that type is imported.  The corresponding Python code
    is:
> Int32.astype[Real64] = Real64
This says that the Real64 array-type has higher priority than the
    Int32 array-type.
I'd choose a clearer name than "astype" for this, but that's a minor
detail. More important is how this is supposed to work. Suppose that
in Int32 you say that Real64 has higher priority, and in Real64 you
say that Int32 has higher priority. Would this raise an exception, and
if so, when?

Perhaps the coercion question should be treated in a separate PEP that
also covers standard Python types and provides a mechanism that any
type implementer can use. I could think of a number of cases where I
have wished I could define coercions between my own and some other
types properly.
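
To make the question concrete, here is a minimal sketch of one possible
answer: contradictory declarations are rejected at the moment the second
one is registered, i.e. at import time of the second module. The names
CoercionRegistry, CoercionConflict, declare and result_type are
hypothetical and do not appear in the PEP; type names are plain strings
here for brevity.

```python
class CoercionConflict(Exception):
    """Raised when two types each claim the other has higher priority."""


class CoercionRegistry:
    """Hypothetical registry; not part of the PEP."""

    def __init__(self):
        # Set of (lower, higher) pairs: combining the two yields `higher`.
        self._pairs = set()

    def declare(self, lower, higher):
        """Record that `higher` wins when combined with `lower`."""
        if (higher, lower) in self._pairs:
            raise CoercionConflict(
                "%s and %s each claim the other has priority" % (lower, higher))
        self._pairs.add((lower, higher))

    def result_type(self, a, b):
        if a == b:
            return a
        if (a, b) in self._pairs:
            return b
        if (b, a) in self._pairs:
            return a
        raise TypeError("no coercion defined between %s and %s" % (a, b))


registry = CoercionRegistry()
registry.declare("Int32", "Real64")      # like Int32.astype[Real64] = Real64
try:
    registry.declare("Real64", "Int32")  # the contradictory declaration
    conflict_detected = False
except CoercionConflict:
    conflict_detected = True
```

Detecting the conflict at registration time means the error shows up
when the offending module is imported, not later in the middle of some
arithmetic expression. Whether the PEP intends this is exactly what is
unclear.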
...
3.  Array:
This class contains information about the array, such as shape,
    type, endian-ness of the data, etc.  Its operators, '+', '-',
What about the data itself?
...
4.  ArrayView
This class is similar to the Array class except that the reshape
    and flat methods will raise exceptions, since non-contiguous
There are no reshape and flat methods in this proposal...
...
1.  Does slicing syntax default to copy or view behavior?
The default behavior of Python is to return a copy of a sub-list
    or tuple when slicing syntax is used, whereas Numeric 1 returns a
    view into the array.  The choice made for Numeric 1 is apparently
    for reasons of performance: the developers wish to avoid the
Yes, performance was the main reason. But there is another one: if
slicing returns a view, you can make a copy based on it, but if
slicing returns a copy, there's no way to make a view. So if you
change this, you must provide some other way to generate a view, and
please keep the syntax simple (there are many practical cases where a
view is required).
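
The asymmetry can be illustrated with standard Python objects. This is
only an analogy, not Numeric code: memoryview stands in for view
slicing, and bytearray slicing stands in for copy slicing.

```python
buf = bytearray(b"abcdef")

view = memoryview(buf)[1:4]    # a view: shares storage with buf
copy_from_view = bytes(view)   # a copy can always be made from a view
plain_slice = buf[1:4]         # bytearray slicing copies, like list slicing

buf[1] = ord("X")              # modify the original buffer

view_sees_change = (view[0] == ord("X"))
copy_unchanged = (copy_from_view == b"bcd")
slice_unchanged = (plain_slice == bytearray(b"bcd"))
```

Starting from plain_slice there is no operation that recovers a view
onto buf; the link to the original storage is gone.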
...
In this case the performance penalty associated with copy behavior
    can be minimized by implementing copy-on-write.  This scheme has
Indeed, that's what most APL implementations do.
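
For readers unfamiliar with the technique, a minimal copy-on-write
sketch in pure Python follows. The class CowArray and its attributes
are hypothetical, it handles only one-dimensional unit-stride slices,
and it says nothing about how the PEP would implement this in C.

```python
class CowArray:
    """Toy 1-D array: slices share the buffer until someone writes."""

    def __init__(self, buffer, start=0, stop=None, shared=False):
        self._buf = buffer
        self._start = start
        self._stop = len(buffer) if stop is None else stop
        self._shared = shared  # True while another object may use _buf

    def __len__(self):
        return self._stop - self._start

    def _offset(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return self._start + index

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise NotImplementedError("unit-stride slices only")
            stop = max(stop, start)
            self._shared = True  # the parent must now copy before writing too
            return CowArray(self._buf, self._start + start,
                            self._start + stop, shared=True)
        return self._buf[self._offset(index)]

    def __setitem__(self, index, value):
        if self._shared:
            # First write: detach by copying only our own window.
            self._buf = self._buf[self._start:self._stop]
            self._start, self._stop = 0, len(self._buf)
            self._shared = False
        self._buf[self._offset(index)] = value


a = CowArray([0, 1, 2, 3, 4])
s = a[1:4]                      # no data copied yet
shared_before = s._buf is a._buf
s[0] = 99                       # the write triggers the copy
shared_after = s._buf is a._buf
```

The user sees copy semantics throughout (a is untouched by the write to
s), but the cost of the copy is paid only if and when a write happens.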
...
data buffer is made.  View behavior would then be implemented by
    an ArrayView class, whose behavior would be similar to Numeric 1 arrays,
So users would have to write something like

    ArrayView(array, indices)

That looks a bit cumbersome, and any straightforward way to write the
indices is illegal according to the current syntax rules.
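
To illustrate the syntax problem: slice notation is only accepted
inside square brackets, so the indices have to be passed as explicit
slice objects. The ArrayView class below is a hypothetical stand-in
operating on a plain list, not the class described in the PEP.

```python
class ArrayView:
    """Hypothetical stand-in: a window onto a list, given a slice object."""

    def __init__(self, array, indices):
        self.array = array
        # Resolve the slice against the current length of the list.
        self.indices = range(*indices.indices(len(array)))

    def __getitem__(self, i):
        return self.array[self.indices[i]]

    def __setitem__(self, i, value):
        self.array[self.indices[i]] = value


data = [10, 20, 30, 40, 50]

# The natural spelling is rejected by the parser:
try:
    compile("ArrayView(data, 1:4)", "<example>", "eval")
    natural_spelling_ok = True
except SyntaxError:
    natural_spelling_ok = False

# The legal spelling needs an explicit slice object:
v = ArrayView(data, slice(1, 4))
v[0] = 99       # writes through to data[1]
```

Compared with data[1:4], the slice(1, 4) spelling is noticeably
heavier, and it gets worse with several dimensions.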
...
2.  Does item syntax default to copy or view behavior?
If compatibility with lists is a criterion at all, then I'd apply it
consistently and use view semantics. Otherwise let's forget about
lists and discuss 1. and 2. from a purely array-oriented point of
view. And then I'd argue that view semantics is what one needs more
often and should thus be the default for both slicing and item
extraction.
...
3.  How is scalar coercion implemented?
The old discussion again...
...
annoying, particularly for very large arrays.  We prefer that the
    array type trumps the python type for the same type class, namely
That is a completely arbitrary rule from any but the "large array
performance" point of view. And it's against the Principle of Least
Surprise.

Now that we have the PEP procedure for proposing any change
whatsoever, why not lobby for the addition of a float scalar type to
Python, with its own syntax for constants? That looks like the best
solution from everybody's point of view.
...
4.  How is integer division handled?
In a future version of Python, the behavior of integer division
    will change.  The operands will be converted to floats, so the
Has that been decided already?
...
7.  How are numerical errors handled (IEEE floating-point errors in
        particular)?
It is not clear to the proposers (Paul Barrett and Travis
    Oliphant) what is the best or preferred way of handling errors.
    Most of the C functions that do the operation iterate over
    the inner-most (last) dimension of the array.  This dimension
    could contain a thousand or more items having one or more errors
    of differing type, such as divide-by-zero, underflow, and
    overflow.  Additionally, keeping track of these errors may come at
    the expense of performance.  Therefore, we suggest several
    options:
I'd like to add another one:

e. Keep some statistics about the errors that occur during the
   operation, and if at the end the error count is > 0, raise
   an exception containing as much useful information as possible.

I would certainly not want any Python program to *print* anything
unless I have explicitly told it to do so.
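
A minimal sketch of option e. in pure Python, to show what is meant.
The names ArrayOperationError and divide are hypothetical, and a real
implementation would collect the statistics in the C inner loop from
the floating-point status flags instead of catching Python exceptions.

```python
class ArrayOperationError(ArithmeticError):
    """Raised once, after the whole operation, with per-kind error counts."""

    def __init__(self, counts, first_index, result):
        self.counts = counts            # e.g. {"ZeroDivisionError": 2}
        self.first_index = first_index  # index of the first error per kind
        self.result = result            # partial result, None where it failed
        summary = ", ".join("%s: %d" % kv for kv in sorted(counts.items()))
        super().__init__("errors during array operation (%s)" % summary)


def divide(xs, ys):
    """Elementwise xs / ys; errors are counted and reported at the end."""
    counts, first_index, result = {}, {}, []
    for i, (x, y) in enumerate(zip(xs, ys)):
        try:
            result.append(x / y)
        except (ZeroDivisionError, OverflowError) as exc:
            kind = type(exc).__name__
            counts[kind] = counts.get(kind, 0) + 1
            first_index.setdefault(kind, i)
            result.append(None)  # placeholder so the loop can continue
    if counts:
        raise ArrayOperationError(counts, first_index, result)
    return result


try:
    divide([1.0, 2.0, 3.0], [0.0, 1.0, 0.0])
    caught = None
except ArrayOperationError as exc:
    caught = exc
```

Nothing is printed; the caller decides what to do with the exception
and the statistics attached to it.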

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------