[Numpy-discussion] Numarray header PEP

Todd Miller jmiller at stsci.edu
Thu Jul 1 13:46:07 EDT 2004


On Thu, 2004-07-01 at 15:58, Colin J. Williams wrote:
> Sebastian Haase wrote:
> 
> >On Wednesday 30 June 2004 11:33 pm, gerard.vermeulen at grenoble.cnrs.fr wrote:
> >  
> >
> >>On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote
> >>
> >>    
> >>
> >>>So... you use the "meta" code to provide package specific ordinary
> >>>(not-macro-fied) functions to keep the different versions of the
> >>>Present() and isArray() macros from conflicting.
> >>>
> >>>It would be nice to have a standard approach for using the same
> >>>"extension enhancement code" for both numarray and Numeric.  The PEP
> >>>should really be expanded to provide an example of dual support for one
> >>>complete and real function, guts and all, so people can see the process
> >>>end-to-end;  Something like a simple arrayprint.  That process needs
> >>>to be refined to remove as much tedium and duplication of effort as
> >>>possible.  The idea is to make it as close to providing one
> >>>implementation to support both array packages as possible.  I think it's
> >>>important to illustrate how to partition the extension module into
> >>>separate compilation units which correctly navigate the dual
> >>>implementation mine field in the easiest possible way.
> >>>
> >>>It would also be nice to add some logic to the meta-functions so that
> >>>which array package gets used is configurable.  We did something like
> >>>that for the matplotlib plotting software at the Python level with
> >>>the "numerix" layer, an idea I think we copied from Chaco.  The kind
> >>>of dispatch I think might be good to support configurability looks like
> >>>this:
> >>>
> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result = NULL, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>      return NULL;
> >>>    /* Dispatch on the kind of object we were handed. */
> >>>    switch(PyArray_Which(what)) {
> >>>      case USE_NUMERIC:
> >>>         result = Numeric_whatsThis(what); break;
> >>>      case USE_NUMARRAY:
> >>>         result = Numarray_whatsThis(what); break;
> >>>      case USE_SEQUENCE:
> >>>         result = Sequence_whatsThis(what); break;
> >>>      default:
> >>>         PyErr_SetString(PyExc_TypeError, "unsupported object"); break;
> >>>    }
> >>>    return result;  /* NULL here means an exception is set */
> >>>}
> >>>
> >>>In the above,  I'm picturing a separate .c file for Numeric_whatsThis
> >>>and for Numarray_whatsThis.  It would be nice to streamline that to one
> >>>.c and a process which somehow (simply) produces both functions.
> >>>
> >>>Or, ideally, the above would be done more like this:
> >>>
> >>>PyObject *
> >>>whatsThis(PyObject *dummy, PyObject *args)
> >>>{
> >>>    PyObject *result = NULL, *what = NULL;
> >>>    if (!PyArg_ParseTuple(args, "O", &what))
> >>>       return NULL;
> >>>    /* One Numerix branch covers both numarray and Numeric. */
> >>>    switch(Numerix_Which(what)) {
> >>>       case USE_NUMERIX:
> >>>          result = Numerix_whatsThis(what); break;
> >>>       case USE_SEQUENCE:
> >>>          result = Sequence_whatsThis(what); break;
> >>>       default:
> >>>          PyErr_SetString(PyExc_TypeError, "unsupported object"); break;
> >>>    }
> >>>    return result;  /* NULL here means an exception is set */
> >>>}
> >>>
> >>>Here, a common Numerix implementation supports both numarray and Numeric
> >>>from a single simple .c.  The extension module would do "#include
> >>>numerix/arrayobject.h" and "import_numerix()" and otherwise just call
> >>>PyArray_* functions.
> >>>
> >>>The current stumbling block is that numarray is not binary compatible
> >>>with Numeric... so numerix in C falls apart.  I haven't analyzed
> >>>every symbol and struct to see if it is really feasible... but it
> >>>seems like it is *almost* feasible, at least for typical usage.
> >>>
> >>>So, in a nutshell,  I think the dual implementation support you
> >>>demoed is important and we should work up an example and kick it
> >>>around to make sure it's the best way we can think of doing it.
> >>>Then we should add a section to the PEP describing dual support as well.
> >>>      
> >>>
> >>I would never apply numarray code to Numeric arrays or vice versa. It
> >>looks dangerous and I do not know if it is possible.  The first thing
> >>coming to mind is that numarray and Numeric arrays refer to different type
> >>objects (this is what my pep module uses to differentiate them).  So, even
> >>if numarray and Numeric are binary compatible, any 'alien' code referring
> >>to the 'Python-standard part' of the type objects may lead to surprises. A
> >>PEP proposing hacks will raise eyebrows at least.
> >>
> >>Secondly, most people use Numeric *or* numarray and not both.
> >>
> >>So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out
> >>(NINO) Of course, Numeric or numarray output can be a user option if NINO
> >>does not apply.  (explicit safe conversion between Numeric and numarray is
> >>possible if really needed).
> >>
> >>I'll try to flesh out the demo with real functions in the way you indicated
> >>(going as far as I consider safe).
> >>
> >>The problem of coding the Numeric (or numarray) functions in more than
> >>a single source file also has to be addressed.
> >>
> >>It may take 2 weeks because I am off to a conference next week.
> >>
> >>Regards -- Gerard
> >>    
> >>
> >
> >Hi all,
> >first, I would like to state that I don't understand much of this discussion;
> >so the only comment I wanted to make is that IF this were possible, to make 
> >(C/C++) code that can live with both Numeric and numarray, then I think it 
> >would be used more and more - think: transition phase !! (e.g. someone could 
> >start making the FFTW part  of scipy numarray friendly without having to 
> >switch everything at once [hint ;-)] )
> >
> >These were just my 2 cents.
> >Cheers,
> >Sebastian Haase
> >  
> >
> I feel lower on the understanding tree with respect to what is being 
> proposed in the draft PEP, but would still like to offer my 2 cents 
> worth.  I get the feeling that numarray is being bent out of shape to 
> fit Numeric.

Yes and no.  The numarray team has over time realized the importance of
backward compatibility with the dominant array package, Numeric.  A lot
of people use Numeric now.  We're trying to make it as easy as possible
to use numarray.

> It was my understanding that Numeric had certain weaknesses which made it 
> unacceptable as a Python component and that numarray was intended to 
> provide the same or better functionality within a pythonic framework.

My understanding is that until there is a consensus on an array package,
neither numarray nor Numeric is going into the Python core.  

> numarray has not achieved the expected performance level to date, but 
> progress is being made and I believe that, for larger arrays, numarray 
> has been shown to be superior to Numeric - please correct me if I'm 
> wrong here.

I think that's a fair summary.

> 
> The shock came for me when Todd Miller said:  
>     I looked at this some, and while INCREFing __dict__ may be the right
>     idea, I forgot that there *is no* Python NumArray.__init__ anymore.
> 
> Wasn't it the intent of numarray to work towards the full use of the 
> Python class structure to provide the benefits which it offers?
> 

Ack.  I wasn't trying to start a panic.  The __init__ still exists, as
does __new__; they're just implemented in C.   Sorry if I was unclear.

> The Python class has two constructors and one destructor.

We're mostly on the same page.

> The constructors are __init__ and __new__, the latter only provides the 
> shell of an instance which later has to be initialized.  In version 0.9, 
> which I use, there is no __new__, 

It's there,  but it's not very useful:

>>> import numarray
>>> numarray.NumArray.__new__
<built-in method __new__ of type object at 0x402fc860>
>>> a = numarray.NumArray.__new__(numarray.NumArray)
>>> a.info()
class: <class 'numarray.numarraycore.NumArray'>
shape: ()
strides: ()
byteoffset: 0
bytestride: 0
itemsize: 0
aligned: 1
contiguous: 1
data: None
byteorder: little
byteswap: 0
type: Any

I don't, however, recommend doing this.
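
For anyone unsure about the two-step construction under discussion, here
is a minimal pure-Python sketch.  It is not numarray code: the class
Shell and its shape attribute are made up for illustration.  It only
shows that __new__ hands back a bare instance and that __init__ runs
separately when the class is called in the normal way.

```python
class Shell(object):
    """Hypothetical class for illustration only; not part of numarray."""

    def __new__(cls, *args, **kwargs):
        # Allocate the bare instance; __init__ has not run at this point.
        self = super(Shell, cls).__new__(cls)
        self.shape = ()  # placeholder default, like the empty array above
        return self

    def __init__(self, shape=()):
        # Runs only when the class is called normally, e.g. Shell((2, 3)).
        self.shape = tuple(shape)


bare = Shell.__new__(Shell)  # only __new__ runs; __init__ is skipped
full = Shell((2, 3))         # __new__ runs first, then __init__
```

Calling __new__ directly leaves bare with the placeholder shape, which
mirrors the mostly empty info() output above.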

> but there is a new function which has 
> a functionality similar to that intended for __new__.  Thus, with this 
> change, numarray appears to be moving further away from being pythonic.

Nope.  I'm talking about moving toward better speed with no change in
functionality at the Python level.  I also think maybe we've gotten list
threads crossed here:  the "Numarray header PEP" thread is independent of
(though admittedly related to) the "Speeding up wxPython/numarray" thread.

The Numarray header PEP is about making it easy for packages to write C
extensions which *optionally* support numarray (and now Numeric as
well).  One aspect of the PEP is getting headers included in the Python
core so that extensions can be compiled even when numarray is not
installed.  The other aspect will be illustrating a good technique for
supporting both numarray and Numeric, optionally and with choice, at the
same time.  Such an extension would still run whether numarray,
Numeric, both, or neither is installed.  Gerard V. has already done some
integration of numarray and Numeric with PyQwt so he has a few good
ideas on how to do the "good technique" aspect of the PEP.
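
To give a rough feel for "optionally and with choice", here is a sketch
at the Python level (the PEP itself is about C headers, so this is an
analogy, not the proposed mechanism).  The helper name
which_array_package and the strings it returns are hypothetical; only
the module names numarray and Numeric, and the NumArray class, come from
this thread.  I'm assuming Numeric exposes its array type as ArrayType.

```python
import sys


def which_array_package(obj):
    """Return "numarray", "Numeric", or "sequence" for obj.

    Hypothetical helper for illustration.  It never imports either array
    package itself: if a package was never imported, obj cannot be one
    of its arrays, so the check works with neither package installed.
    """
    numarray = sys.modules.get("numarray")
    if numarray is not None and isinstance(obj, numarray.NumArray):
        return "numarray"
    numeric = sys.modules.get("Numeric")
    # Assumption: Numeric exposes its array type as Numeric.ArrayType.
    if numeric is not None and isinstance(obj, numeric.ArrayType):
        return "Numeric"
    # Anything else is treated as a plain Python sequence.
    return "sequence"
```

A caller could branch on the returned string much like the switch in the
C dispatch quoted earlier, falling back to generic sequence handling
when neither package is present.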

The "Speeding up wxPython/numarray" thread is about improving the
performance of a 50000-point wxPython drawlines call, which is 10x slower
with numarray than with Numeric.  Tim H. and Chris B. have nailed this down
(mostly) to the numarray sequence protocol and destructor, __del__.

Regards,
Todd




