[Numpy-discussion] PEP 209: Multi-dimensional Arrays

Rob W. W. Hooft rob at hooft.net
Wed Feb 14 16:17:18 EST 2001

>>>>> "PB" == Paul Barrett <Barrett at stsci.edu> writes:

 PB> By way of bootstrapping, only one predefined type need be known,
 PB> say, Int32.  The operations associated with this type can only be
 PB> Int32 operations, because this is the only type it knows about.
 PB> Yet, we can add another type, say Real64, which has not only
 PB> Real64 operations, BUT also Int32 and Real64 mixed operations,
 PB> since it knows about Int32.  The Real64 type provides the
 PB> necessary information to relate the Int32 and Real64 types.  Let's
 PB> now add a third type, then a fourth, etc., each knowing about its
 PB> predecessor types but not its successors.

 PB> This approach is identical to the way core Python adds new
 PB> classes or C-extension types, so this is nothing new.  The
 PB> current types do not know about the new type, but the new type
 PB> knows about them.  As long as one type knows the relationship
 PB> between the two that is sufficient for the scheme to work.

Yuck. I'm thinking about how long it would take to load the Int256 class,
because it will need to import all other types before defining the 
relations.... [see below for another idea]
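
To make the bootstrapping scheme concrete, here is a minimal Python sketch.
Everything in it (ArrayType, result_type, the "predecessors" argument, and the
rule that a mixed operation yields the newer type) is a hypothetical
illustration, not something taken from the PEP. It shows that only one side
needs to know the relation, and also why a late type such as Int256 must
reference every earlier type when it is defined.

```python
# Hypothetical sketch of the bootstrapping scheme; names are illustrative only.
class ArrayType:
    def __init__(self, name, predecessors=()):
        self.name = name
        # Map "other type" -> "result type of a mixed operation".
        # Assumption: mixing with an older type yields this newer type.
        self.result = {self: self}
        for older in predecessors:
            self.result[older] = self


def result_type(a, b):
    """Return the result type; only one of the two needs to know the other."""
    if b in a.result:
        return a.result[b]
    if a in b.result:
        return b.result[a]
    raise TypeError(f"no relation between {a.name} and {b.name}")


Int32 = ArrayType("Int32")                       # knows only itself
Real64 = ArrayType("Real64", [Int32])            # also knows Int32
Int256 = ArrayType("Int256", [Int32, Real64])    # must know all predecessors
```

With these definitions, result_type(Int32, Real64) and
result_type(Real64, Int32) both resolve through Real64's table, since Int32
was never told about Real64.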

 PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode:
 PB> e.g. 'i', 'f', etc. (for backward compatibility)
 >>  .typecode() is a method now.

 PB> Yes, I propose that it become a settable attribute.

Then it is not backwards compatible anyway, and you could leave it out.

 PB> .size (in bytes): e.g. 4, 8, etc.
 >>  "element size?"

 PB> Yes.

I think it should be called like that in that case. I dnt lk abbrvs.
size could be misread as the size of the total object.

 >> >> add.register('add', (Int32, Int32, Int32), cfunc-add)
 >> Typo: cfunc-add is an expression, not an identifier.

 PB> No, it is a Python object that encompasses and describes a C
 PB> function that adds two Int32 arrays and returns an Int32 array.

I understand that, but in general a "-" in pseudo-code is the
minus operator. I'd write cfunc_add instead.
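
As a rough illustration of the registration call with a valid identifier, the
sketch below keeps the argument order from the quoted line. The UFunc class,
its dispatch table, and the pure-Python cfunc_add (standing in for the object
that wraps a C function) are all assumptions made for the example.

```python
# Hypothetical sketch: a ufunc-like object with a per-signature dispatch table.
class UFunc:
    def __init__(self, name):
        self.name = name
        self.table = {}  # (in1, in2, out) type names -> implementation

    def register(self, name, signature, func):
        if name != self.name:
            raise ValueError(f"expected {self.name!r}, got {name!r}")
        self.table[signature] = func

    def __call__(self, a, b, signature):
        try:
            func = self.table[signature]
        except KeyError:
            raise TypeError(f"no {self.name} registered for {signature}")
        return func(a, b)


def cfunc_add(a, b):
    # Stand-in for a wrapped C function adding two Int32 arrays.
    return [x + y for x, y in zip(a, b)]


Int32 = "Int32"  # placeholder for a real type object
add = UFunc("add")
add.register("add", (Int32, Int32, Int32), cfunc_add)
```

Calling add([1, 2], [3, 4], (Int32, Int32, Int32)) then dispatches to
cfunc_add; an unregistered signature raises TypeError.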

 >> An implementation of a (Int32, Float32, Float32) add is possible
 >> and desirable as mentioned earlier in the document. Which C module
 >> is going to declare such a combination?

Now that I re-think this: would it be possible for the type-loader to check
for each type that it loads whether a cross-type module is available with
a previously loaded type? That way all types can be independent. There would
be an Int32 module knowing only Int32 types, and Float32 only knowing Float32 types.
Then there would be an Int32Float32 type that handles cross-type functions.
When Int32 or Float32 is loaded, the loader can see whether the other has
been loaded earlier, and if it is, load the cross-definitions as well.

Only problem I can think of is functions linking 3 or more types.
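
A small sketch of such a loader, under the assumption that cross-type modules
can be found from an unordered pair of type names. TypeLoader and the
cross_modules mapping are hypothetical, and nothing is actually imported here;
the loader only records which modules it would load.

```python
# Hypothetical sketch: the loader records names instead of importing modules.
class TypeLoader:
    def __init__(self, cross_modules):
        # Maps an unordered pair of type names to a cross-type module name.
        self.cross_modules = cross_modules
        self.loaded_types = []
        self.loaded_cross = []

    def load(self, name):
        # Check the new type against every type loaded earlier.
        for earlier in self.loaded_types:
            cross = self.cross_modules.get(frozenset({earlier, name}))
            if cross is not None:
                self.loaded_cross.append(cross)
        self.loaded_types.append(name)


loader = TypeLoader({frozenset({"Int32", "Float32"}): "Int32Float32"})
loader.load("Int32")    # nothing to cross-load yet
loader.load("Float32")  # Int32 is already present, so Int32Float32 is loaded
```

Because the table is keyed on pairs, the load order does not matter, but a
function linking three or more types has no natural key in this layout.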

 PB> asstring(): create string from array
 >>  Not "tostring" like now?

 PB> This is proposed so as to be a little more consistent with Core
 PB> Python which uses 'from-' and 'as-' prefixes.  But I don't have
 PB> strong opinions either way.

PIL uses tostring as well. Anyway, I understand the buffer interface
is a nicer way to communicate.

 PB> 4.  ArrayView
 PB> This class is similar to the Array class except that the reshape
 PB> and flat methods will raise exceptions, since non-contiguous
 PB> arrays cannot be reshaped or flattened using just pointer and
 PB> step-size information.
 >>  This was completely unclear to me until here. I must say I find
 >> this a strange way of handling things. I haven't looked into
 >> implementation details, but wouldn't it feel more natural if an
 >> Array would just be the "data", and an ArrayView would contain the
 >> dimensions and strides. Completely separated. One would always
 >> need a pair, but more than one ArrayView could use the same Array.

 PB> In my definition, an Array that has no knowledge of its shape and
 PB> type is not an Array, it's a data or character buffer.  An array
 PB> in my definition is a data buffer with information on how that
 PB> buffer is to be mapped, i.e. shape, type, etc.  An ArrayView is
 PB> an Array that shares its data buffer with another Array, but may
 PB> contain a different mapping of that Array, i.e. its shape and type
 PB> are different.

 PB> If this is what you mean, then the answer is "Yes".  This is how
 PB> we intend to implement Arrays and ArrayViews.

No, it is not what I meant. Reading your answer I'd say that I wouldn't
see the need for an Array. We only need a data buffer and an ArrayView.
If there are two parts of the functionality, it is much cleaner to make 
the cut in an orthogonal way.
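
The split I have in mind could look roughly like the sketch below: a plain
data buffer that knows nothing about shape, plus any number of views holding
shape, strides and offset. The ArrayView constructor shown here, with strides
counted in elements rather than bytes, is an assumption for illustration and
not the interface from the PEP.

```python
from array import array


# Hypothetical sketch: the buffer holds data only; views hold the mapping.
class ArrayView:
    def __init__(self, buffer, shape, strides, offset=0):
        self.buffer = buffer    # shared, shape-less data
        self.shape = shape
        self.strides = strides  # in elements, not bytes (assumption)
        self.offset = offset

    def _position(self, index):
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
        return self.offset + sum(i * s for i, s in zip(index, self.strides))

    def __getitem__(self, index):
        return self.buffer[self._position(index)]

    def __setitem__(self, index, value):
        self.buffer[self._position(index)] = value


data = array("i", range(6))                # the data buffer
rows = ArrayView(data, (2, 3), (3, 1))     # 2x3 row-major view
cols = ArrayView(data, (3, 2), (1, 3))     # transposed view, same buffer
```

Here rows[1, 2] and cols[2, 1] address the same element, and an assignment
through one view is visible through the other because only the buffer owns
the data.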

 PB> B = A.V[:10] or A.view[:10] are some possibilities.  B is now an
 PB> ArrayView class.

I hate magic attributes like this. I do not like abbrevs at all. It is
not at all obvious what A.T or A.V mean.

 PB> 2.  Does item syntax default to copy or view behavior?
 >>  view.
 PB> Yet, c[i] can be considered just a shorthand for c[i,:] which
 PB> would imply copy behavior assuming slicing syntax returns a copy.
 >>  If you reason that way, then c is just a shorthand for c[...]
 >> too.

 PB> Yes, that is correct, but that is not how Python currently
 PB> behaves.

Current Python also doesn't treat c[i] as a shorthand for c[i,:] or


Rob Hooft

=====   rob at hooft.net          http://www.hooft.net/people/rob/  =====
=====   R&D, Nonius BV, Delft  http://www.nonius.nl/             =====
===== PGPid 0xFA19277D ========================== Use Linux! =========
