[Numpy-discussion] Re: Simplifying array()
Todd Miller
jmiller at stsci.edu
Fri Jan 14 13:46:13 EST 2005
On Thu, 2005-01-13 at 22:18 +0200, Rory Yorke wrote:
> [Todd]
> > I agree with this myself. Does anyone care if they will no longer be
> > able to construct an array from a file or buffer object using array()
> > rather than fromfile() or NumArray(), respectively? Is a deprecation
> > process necessary to remove them?
>
> There seems to be a majority opinion in favour of deprecation, though
> at least Florian uses the sequence-as-a-buffer feature.
By way of status, I applied and committed both Rory's patches this
morning. Afterward, I added the deprecation warnings for the
frombuffer() and fromfile() cases. frombuffer() is identical to
NumArray(), so I did not add a new function.
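For illustration only, here is a rough sketch of how such deprecation warnings might be wired into a constructor. It is not the committed numarray code: array_sketch and the tuples it returns are hypothetical stand-ins, and it uses modern Python's warnings and io modules rather than anything from the patches.

```python
import io
import warnings


def array_sketch(source):
    """Hypothetical stand-in for array(); not numarray's real code."""
    if isinstance(source, io.IOBase):
        # File objects: steer the caller towards fromfile().
        warnings.warn(
            "array(file) is deprecated; use fromfile() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return ("fromfile", source)
    if isinstance(source, (bytes, bytearray, memoryview)):
        # Buffer objects: steer the caller towards NumArray().
        warnings.warn(
            "array(buffer) is deprecated; use NumArray() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return ("NumArray", source)
    # Anything else is treated as an ordinary sequence, with no warning.
    return ("fromlist", list(source))
```

The message text names the replacement function, which is the "pointer" Rory suggests in the quoted text below.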
> [Colin]
> > I would suggest deprecation on the way to removal. For the
> > newcomer, who is not yet "clued up" some advice on the instantiation
> > of NumArray would help. Currently,
>
> The deprecation warning could include a pointer to NumArray or
> fromfile, as appropriate. I think some of the Python stdlib
> deprecations (doctest?) do exactly this. The NumArray docs do need to
> be fixed, though.
I didn't touch the docs.
> [Colin]
> > Rory leaves in type and typecode. It would be good to eliminate
> > this apparent overlap. Why not deprecate and then drop type? As a
> > compromise, either could be accepted as a NumArray.__init__
> > argument, since it is easy to distinguish between them.
>
> [Perry]
> > Tim is right about this. The rationale was that typecode is
> > inaccurate since types are no longer represented by letter codes
> > (one can still use them for backward compatibility).
>
> Also, the type keyword matches the NumArray type method. It does have
> the downside of clashing with the type builtin, of course.
IMHO, all this discussion about type/typecode is moot because typecode
was added after the fact for Numeric compatibility. It really makes
no sense to take it out now that we're going for interoperability with
scipy. I don't like it much either, but the alternative, being
incompatible, is worse. "typecode" could be factored out into the
numerix layer, but that just makes life confusing; it's best that
numarray works the same whether it's being used with scipy or not.
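As a minimal sketch of keeping both keywords, a constructor could funnel them through one helper. resolve_type is a hypothetical name, not a numarray function, and the string values in the usage note are placeholders for whatever type objects or letter codes are actually passed.

```python
def resolve_type(type=None, typecode=None):
    """Hypothetical helper: accept either keyword, reject conflicts.

    The parameter named 'type' shadows the builtin inside this function,
    which is the clash mentioned in the quoted text above.
    """
    if type is not None and typecode is not None and type != typecode:
        raise ValueError("type and typecode were both given and disagree")
    # Prefer 'type'; fall back to 'typecode' for Numeric-style callers.
    return type if type is not None else typecode
```

With this, resolve_type(type="Int32") and resolve_type(typecode="Int32") give the same answer, so callers written against either spelling behave identically.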
> > It would be good to clarify the acceptable content of a sequence. A
>
> I think this is quite important, though perhaps not too difficult. I
> think any sequence, or nested sequences should be accepted, provided
> that they are "conformally sized" (for lack of a better phrase) and
> that the innermost sequences contain number types. I'll try to word
> this more precisely for the docs.
>
> Note that a NumArray is a sequence, in the sense that it has
> __getitem__ and __len__ methods, and is indexed from 0 upwards.
>
> Strings are also sequences, and Alexander made a comment to the patch
> that array() should handle sequences of strings. Consider Numeric's
> behaviour:
>
> >>> array(["abc",[1,2,3]])
> array([[97, 98, 99],
> [ 1, 2, 3]])
-1 from me. I think we're getting back into "array does too much"
territory.
> I think this needs to be handled in fromlist, which, I think, handles
> fairly general sequences, but not strings.
I think you're right, that's how it could be done.
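A rough pure-Python sketch of that idea, assuming strings are expanded to character codes and that "conformally sized" means every sub-sequence at a given depth has the same shape. to_nested and shape_of are hypothetical helpers for illustration, not numarray's fromlist.

```python
from numbers import Number


def to_nested(seq):
    """Expand strings to lists of character codes, recursively."""
    if isinstance(seq, str):
        return [ord(ch) for ch in seq]
    if isinstance(seq, Number):
        return seq
    return [to_nested(item) for item in seq]


def shape_of(nested):
    """Return the shape of a nested list, or raise if it is ragged."""
    if isinstance(nested, Number):
        return ()
    if len(nested) == 0:
        return (0,)
    shapes = {shape_of(item) for item in nested}
    if len(shapes) != 1:
        raise ValueError("sequences are not conformally sized")
    return (len(nested),) + shapes.pop()


data = to_nested(["abc", [1, 2, 3]])
print(data)            # [[97, 98, 99], [1, 2, 3]]
print(shape_of(data))  # (2, 3)
```

A ragged input such as [[1, 2], [3]] raises ValueError in shape_of rather than being silently padded or truncated.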
> Note that this leads to a different interpretation of array(["abcd"])
> and array("abcd")
>
> According to the above, array(["abcd"]) should return
> array([[97,98,99,100]]) and, since plain strings go straight to
> fromstring, array("abcd") should return array([1684234849]) (probably
> dependent on endianness, what Long is, etc.). Is this acceptable?
I held off consolidating all the new default types to Long. Not having
defaults hasn't been a problem up to now so I'm not sure Numeric
compatibility is such a concern or that Long is really the best
default... although it does make it easier to write doctests.
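The 1684234849 figure quoted above can be checked with the standard struct module, assuming Long means a 4-byte signed integer (an assumption; the result differs if Long is 8 bytes). The names little and big are just for this illustration.

```python
import struct

raw = b"abcd"  # the four bytes 0x61 0x62 0x63 0x64

# Reinterpret the same four bytes as one 4-byte signed integer.
little = struct.unpack("<i", raw)[0]  # little-endian byte order
big = struct.unpack(">i", raw)[0]     # big-endian byte order

print(little)  # 1684234849
print(big)     # 1633837924
```

So the quoted value is the little-endian reading; a big-endian machine would give a different number for the same string.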
Todd