[Python-Dev] Array Enhancements
Scott Gilbert
xscottg@yahoo.com
Mon, 8 Apr 2002 00:52:43 -0700 (PDT)
Thanks for the various replies. As suggested by a few, I'll take this
to the Numarray folks and see where it goes from there.
Just to respond to a few of the points though... I've put all my
responses in one message to wrap things up.
Tim Peters wrote:
>
> Sounds like a PEP to me.
>
My initial response to reading this was a loud "ugh" as I envisioned
red tape swarming around for what I would consider to be a pretty
simple patch. I mean, I just wanted to hack in some new typecodes...
After thinking about things for a while though, I've come to the
conclusion that the builtin Python array module does need a real
reworking. Even though one ships with the standard baseline, it's
getting reinvented again and again.
I hope the Numarray guys are doing a bang-up job with their NDArray
type (I've looked at it briefly, but I don't really understand it
yet...). I suspect that most of the ufuncs and other stuff those guys
are doing are too special purpose to be part of the standard Python
baseline, but I would very much like to see a single usable array type
become the standard. I'd be willing to do PEP grunt work for that.
Tim Peters also wrote:
> > ...
> > *** I really need complex types. And more than the functionality
> > provided by Numeric/Numarray, I need complex integer types.
>
> This will meet resistance, as it's a pile of code of no conceivable
> use to the vast majority of Python users. That is, "code bloat".
> Instead the array type should be subclassable, and extreme
> special-purpose hair like "complex integers" should be supplied by
> extension modules.
It's not that much bloat. It would be a setitem and getitem pair for
each new type.
I'll give you that most people don't need "fixed point complex arrays".
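As a rough illustration of the "one setitem and getitem pair per type" idea, here is a minimal Python sketch (the real array module does this in C). The names TypedArray and DESCRIPTORS and the "ch" typecode for a complex number built from two 16-bit integers are hypothetical, chosen only for this example.

```python
import struct

# Each element type contributes exactly one getter and one setter.
def _get_int16(buf, i):
    return struct.unpack_from("<h", buf, i * 2)[0]

def _set_int16(buf, i, value):
    struct.pack_into("<h", buf, i * 2, value)

def _get_cint16(buf, i):
    # A complex integer is stored as two adjacent int16 values
    # and returned as a (real, imag) tuple.
    return struct.unpack_from("<hh", buf, i * 4)

def _set_cint16(buf, i, value):
    real, imag = value
    struct.pack_into("<hh", buf, i * 4, real, imag)

# Hypothetical typecode table: typecode -> (itemsize, getitem, setitem).
DESCRIPTORS = {
    "h": (2, _get_int16, _set_int16),
    "ch": (4, _get_cint16, _set_cint16),
}

class TypedArray:
    def __init__(self, typecode, length):
        self.typecode = typecode
        self.itemsize, self._get, self._set = DESCRIPTORS[typecode]
        self._length = length
        self._buf = bytearray(self.itemsize * length)  # zero-filled storage

    def __len__(self):
        return self._length

    def _check(self, i):
        if not 0 <= i < self._length:
            raise IndexError("array index out of range")

    def __getitem__(self, i):
        self._check(i)
        return self._get(self._buf, i)

    def __setitem__(self, i, value):
        self._check(i)
        self._set(self._buf, i, value)
```

In this sketch, adding another element type means writing two small functions and one table entry; nothing in TypedArray itself changes.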
Guido van Rossum wrote:
> You'll have to consider: is it important to be able to read pickled
> arrays on previous Python releases, or is that not a requirement? If
> it's not, you should probably add a new pickle code for pickled
> arrays, and do an implementation that writes;
Nope, we ship the version of Python we want them to use with our
applications.
Did you guys really make it possible to unpickle a Unicode string in
versions of Python that were pre Unicode?
I would think new features should only work in new versions...
Guido also wrote:
>
> Ehm, 'u' is already taken (Unicode).
>
That must have snuck in there sometime after 2.2 I guess.
Guido also wrote:
> > *** The ability to construct an array object from an existing C
> > pointer. We get our memory in all kinds of ways (valloc for page
> > aligned DMA transfers, shmem etc...), and it would be nice not to
> > copy in and copy out in some cases.
>
> But then you get into ownership issues. Who owns that memory? Who
> can free it? What if someone calls a method on the array that
> requires the memory to be resized?
>
> But it's a useful thing to be able to do, I agree, and it shouldn't
> be too hard to add a flag that says "I don't own this memory" -- which
> would mean that the buffer can't be resized at all.
I pictured this working like CObjects do where you pass in a destructor
for when the reference count goes to zero. Possibly also passing in a
realloc function. If the realloc function is null, then an exception
is raised when someone tries to resize the array.
This means there would need to be a C visible API for building array
objects around special types of memory though.
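The destructor-plus-optional-realloc scheme can be modeled in a few lines of Python. This is only a sketch of the ownership rules, not a proposed C API: ExternalBuffer and its destructor and realloc parameters are hypothetical names, and a bytearray stands in for memory obtained elsewhere.

```python
class ExternalBuffer:
    """Wraps memory that was allocated by someone else."""

    def __init__(self, memory, destructor=None, realloc=None):
        self._memory = memory
        self._destructor = destructor  # called once when the wrapper goes away
        self._realloc = realloc        # None means "cannot be resized"
        self._released = False

    def __len__(self):
        return len(self._memory)

    def resize(self, new_size):
        # Without a realloc callback the wrapper has no way to grow or
        # shrink the memory, so resizing is an error.
        if self._realloc is None:
            raise BufferError("memory is not resizable")
        self._memory = self._realloc(self._memory, new_size)

    def release(self):
        # Hand the memory back to its owner exactly once.
        if not self._released:
            self._released = True
            if self._destructor is not None:
                self._destructor(self._memory)
            self._memory = b""

    def __del__(self):
        # In CPython this runs when the reference count drops to zero.
        self.release()
```

The explicit release() method is an addition for the sketch, so that cleanup can be triggered deterministically; the idea in the text relies only on the reference count reaching zero.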
Guido also wrote:
> Since arrays are all about compromises that trade flexibility for
> speed and memory footprint, you can't have a one size fits all. :-)
Bahh. I don't think getting a good general purpose Python object that
represents arbitrary C arrays is all that impossible. C arrays just
don't do that much.
Besides, I didn't say "one size fits all", I said "one size fits all my
needs". That "my" is important (at least to me :-)
Guido also wrote:
> > Well if someone authoritative tells me that all of the above is a
> > great idea, I'll start working on a patch and scratch my plans to
> > create a "not in house" xarray module.
>
> It all depends on the quality of the patch. By the time you're done
> you may have completely rewritten the array module, and then the
> question is, wouldn't your own xarray module have been quicker to
> implement, because it doesn't need to preserve backwards
> compatibility?
Yup, I think I would be done with my xarray module by now if I had
written it instead of taking this route. It would also have the
disadvantage that it doesn't play nice with anyone else.
I now think the best bet is to replace the array module with something
flexible enough to:
1) do what it currently does
2) do what the Numarray guys need
3) do what I need
Guido also wrote:
> An alternative might be a separate bit-array implementation: it seems
> that the bit-array won't share much code with the regular array (of
> any flavor), so why not make it a separate type?
Yup. It would be nice if a bitarray was actually the same type, but
having code like:
    if (o->is_bitarray) {
        /* do something */
    } else {
        /* do every other byte addressable type */
    }
is a little ugly.
David Ascher wrote:
> > I just realized that multi-dimensional __getitem__ shouldn't be a
> > big deal. The question is, given the above declaration, what a[0]
> > should return: the same as a[0, 0] or a copy of a[0, 0:20000] or
> > a reference to a[0, 0:20000].
>
> Or a ValueError? In the face of ambiguity, refuse the temptation to
> guess.
>
Yup. I think there should be a base array type that raises a
ValueError or similar, and derived array types can implement slice
references or slice copies as need be.
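A small Python sketch of that split, assuming a two-dimensional array of unsigned bytes: the base type refuses a[i], and two derived types choose copy or reference semantics. The class names and the _row hook are hypothetical, and bounds checking is left out to keep the example short.

```python
class BaseArray2D:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self._data = bytearray(rows * cols)  # row-major, zero-filled

    def _row(self, r):
        # The base type refuses to guess what a[i] should mean.
        raise ValueError("ambiguous index: use a[i, j] on a 2-D array")

    def __getitem__(self, index):
        if isinstance(index, tuple) and len(index) == 2:
            r, c = index
            return self._data[r * self.cols + c]
        return self._row(index)

    def __setitem__(self, index, value):
        r, c = index
        self._data[r * self.cols + c] = value


class CopyArray2D(BaseArray2D):
    def _row(self, r):
        # a[i] returns an independent copy of row i.
        start = r * self.cols
        return bytes(self._data[start:start + self.cols])


class RefArray2D(BaseArray2D):
    def _row(self, r):
        # a[i] returns a view that shares storage with the array.
        start = r * self.cols
        return memoryview(self._data)[start:start + self.cols]
```

With these definitions, a row taken from CopyArray2D is unaffected by later writes to the array, while a row taken from RefArray2D sees them.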
David also wrote:
>
> Why does submitting a patch to arraymodule seem an easier path than
> modifying numarray or numpy to support what's needed? I believe that
> the goals of numarray aren't that different from what Scott is trying
> to do (memory management APIs, etc.).
>
Well, part of my preference for modifying arraymodule.c instead of
Numarray is that I very quickly understood what's going on in
arraymodule.c, and a patch is pretty obvious. Looking at Numarray, I
just don't get it yet. Please take this as a shortcoming in my
abilities. Numarray does appear to be the heir-apparent though, so
I'll give it a better look.
I also assumed that the Numarray folks would play nice with the
standard array module. So if I could get what I wanted out of array,
then I could leverage Numarray when the opportunity arose.
David also wrote:
>
> I'd like to see fewer multi-dimensional array objects, not more...
>
I agree completely. In fact, I'd like to see one official one
distributed with the baseline.
Perry Greenfield wrote:
> [ a whole bunch of interesting things ]
I think I'll try to bring those up on the Numarray list.
Cheers,
-Scott Gilbert