From edcjones at erols.com  Wed Jan  1 20:29:44 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:29:44 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
Message-ID: <3E13C7DA.70906@erols.com>

Perry Greenfield wrote:
 > Edward Jones writes:

 > > I write code using both PIL and numarray. PIL uses strings for
 > > modes and numarray uses (optionally) strings as typecodes. This
 > > causes problems.  One fix is to emit a DeprecationWarning when
 > > string typecodes are used.  Two functions are needed:
 > > StringTypeWarningOn and StringTypeWarningOff.  The default
 > > should be to ignore this warning.
 >
 > I'm not sure I understand. Can you give me an example of problem
 > code or usage? It sounds like you are trying to test the types of
 > PIL and numarray objects in a generic sense. But I'd understand
 > better if you could show an example.

That's what I was thinking (incorrectly). But I don't need to directly 
compare PIL modes with numarray types.

My code never tries to deduce whether an array is a numarray or a PIL 
image from just the natype_or_mode. A module name (MODULE.NUMARRAY, 
MODULE.PIL) must also be given. I do things this way because I might 
want to include other array/image systems. In an earlier version, I had 
a MODULE.IPL for the Intel Image Processing Library.

The code also implements a policy of forbidding string types.

So now all I can say is:

1. UInt8 == 'X' should not raise an exception. It should return False.

3. There needs to be a function that returns True iff arg is a numarray 
type (UInt8, "UInt8", "b", ...).

def IsType(rep):
     from numerictypes import NumericType, typeDict
     return isinstance(rep, NumericType) or typeDict.has_key(rep)
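
A minimal sketch of points 1 and 3, written in present-day Python rather
than the Python 2 used in this thread. The NumericType class, the UInt8
object and the type_dict registry below are stand-ins invented for
illustration; they are not numarray's actual implementation.

```python
class NumericType:
    """Hypothetical stand-in for a numarray type object."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Comparing with a non-type (e.g. a PIL mode string) is simply unequal.
        if not isinstance(other, NumericType):
            return False
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)


UInt8 = NumericType("UInt8")

# Hypothetical registry mapping string representations to type objects.
type_dict = {"UInt8": UInt8, "b": UInt8}


def is_type(rep):
    """Return True iff rep is a type object or a registered string name."""
    if isinstance(rep, NumericType):
        return True
    return isinstance(rep, str) and rep in type_dict
```

With this shape, UInt8 == 'X' evaluates to False instead of raising, and
is_type accepts the type object as well as its string spellings.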


Here is a typical piece of code. "module" can be MODULE.PIL or
MODULE.NUMARRAY.

----
"""General image casting function. Changes the C type of the pixels. 
Information can be lost. The "Convert" functions call C casting 
functions that clip the values. For example, if the input is a UInt16 
and the output is an Int16, any input value greater than 32767 becomes 32767.
"""
def ArrayToArrayCast(arrin, module, natype_or_mode):
     """Converts one array into another. Results are clipped."""
     pars = Parameters(arrin)
     if pars.module == module == MODULE.PIL and \
           pars.mode == natype_or_mode:
         return arrin
     if pars.module == module == MODULE.NUMARRAY and \
                      NA_SameType(pars.natype, natype_or_mode):
         return arrin
     if pars.module == MODULE.NUMARRAY and module == MODULE.NUMARRAY:
         return NA_To_NA_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.PIL and module == MODULE.PIL:
         return PIL_To_PIL_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.NUMARRAY and module == MODULE.PIL:
         return NA_To_PIL_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.PIL and module == MODULE.NUMARRAY:
         return PIL_To_NA_Convert(arrin, natype_or_mode)
----
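
The clipping behaviour described in the docstring above can be sketched
in pure Python as follows. The RANGES table and the clip_cast helper are
hypothetical names for illustration only; the real "Convert" functions
call C code that is not shown in this message.

```python
# Value ranges for a few integer pixel types (standard two's-complement limits).
RANGES = {
    "UInt8": (0, 255),
    "Int16": (-32768, 32767),
    "UInt16": (0, 65535),
}


def clip_cast(values, out_type):
    """Clip each value into the range of out_type instead of wrapping around."""
    lo, hi = RANGES[out_type]
    return [min(max(v, lo), hi) for v in values]


pixels_uint16 = [0, 1000, 32767, 32768, 65535]
print(clip_cast(pixels_uint16, "Int16"))  # prints [0, 1000, 32767, 32767, 32767]
```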


From edcjones at erols.com  Wed Jan  1 20:42:05 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:42:05 2003
Subject: [Numpy-discussion] End of Holidays small comments
Message-ID: <3E13CB14.7040908@erols.com>

node35.html:

     >>> print x.type(), x.real.type()
     D d

should be

     >>> print x.type(), x.real.type()
     numarray type: Complex64 numarray type: Float64

------------------------------------------------

Why use both NUM_C_ARRAY and C_ARRAY?

------------------------------------------------

in  _ndarraymodule.c:

         {"_byteoffset",
          (getter)_ndarray_byteoffset_get,
          (setter)_ndarray_byteoffset_set,
          "shortest seperation between elements in bytes"},
         {"_bytestride",
          (getter)_ndarray_bytestride_get,
          (setter)_ndarray_bytestride_set,
          "shortest seperation between elements in bytes"},

One of the comments is wrong. Also "separation".
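
To make the distinction concrete, here is a small Python sketch. It
assumes the usual meaning of the two attributes (byteoffset is where the
first element starts in the buffer, bytestride is the distance between
consecutive elements); the get_element helper and the sample buffer are
invented for illustration.

```python
import struct

# Six little-endian 16-bit integers packed back to back, after a 4-byte header.
buffer = b"HEAD" + struct.pack("<6h", 10, 11, 12, 13, 14, 15)


def get_element(buf, byteoffset, bytestride, i):
    """Read element i, assuming it lives at byteoffset + i * bytestride."""
    (value,) = struct.unpack_from("<h", buf, byteoffset + i * bytestride)
    return value


# Every element: start after the header, step by the item size (2 bytes).
every = [get_element(buffer, 4, 2, i) for i in range(6)]
# Every other element: same offset, doubled stride.
evens = [get_element(buffer, 4, 4, i) for i in range(3)]
```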

------------------------------------------------

libnumarraymodule.c:

     /* Create an empty array. */
     static PyArrayObject *
     NA_Empty(int ndim, int *shape, NumarrayType type)

node42.html:

     static PyObject* NA_Empty( NumarrayType type, int ndim, ...)

Serious documentation error.

------------------------------------------------

I think NA_New should be

     NA_New(int ndim, int* shape, NumarrayType type, void* buffer)

The current NA_New is useful only when ndim is known at code-writing time.

------------------------------------------------

node39.html:

     Note: the type parameter for a macro is one of the Numarray Numeric
     Data Types, not a NumarrayType enumeration value.

There should be an example of one of the GET/SET macros. How about

     unsigned char n;
     int i;
     ...
     n = NA_GET1(arr, UInt8, i);

------------------------------------------------

It seems that the parameters "aligned" and "writeable" are ignored in 
the source code for NA_NewAll and class NumArray.

------------------------------------------------

I would like to see an "int* strides" parameter added to NA_NewAll, so a
non-contiguous "buffer" can be used.

------------------------------------------------

I suggest NA_Copy(PyObject* arr) which is something like

static PyObject* NA_Copy(PyObject* arr)
{
     PyArrayObject* arr1 = (PyArrayObject*) arr;
     return NA_NewAll(arr1->nd, (long*) arr1->dimensions,
        arr1->descr->type_num, arr1->data, arr1->byteoffset,
        arr1->bytestride, arr1->byteorder, 1, 1);
}


From edcjones at erols.com  Wed Jan  1 20:45:34 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:45:34 2003
Subject: [Numpy-discussion] Slicing API?
Message-ID: <3E13CBC3.6000207@erols.com>

Both in Numeric and now in numarray I have found a need for API 
functions for slicing. Has anyone thought about this?


From jmiller at stsci.edu  Thu Jan  2 06:03:16 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan  2 06:03:16 2003
Subject: [Numpy-discussion] Slicing API?
References: <3E13CBC3.6000207@erols.com>
Message-ID: <3E14481D.9080902@stsci.edu>

Edward C. Jones wrote:

> Both in Numeric and now in numarray I have found a need for API 
> functions for slicing. Has anyone thought about this?
>
Speaking for myself and the numarray C-API, the answer is no.   What API 
do you want?   Can you suggest function prototypes?

Todd


From jmiller at stsci.edu  Thu Jan  2 12:36:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan  2 12:36:53 2003
Subject: [Numpy-discussion] Slicing API?
References: <3E13CBC3.6000207@erols.com> <3E14481D.9080902@stsci.edu> <3E1497E1.1050808@erols.com>
Message-ID: <3E14A435.7040609@stsci.edu>

Edward C. Jones wrote:

> Todd Miller wrote:
>
>> Edward C. Jones wrote:
>>
>>> Both in Numeric and now in numarray I have found a need for API 
>>> functions for slicing. Has anyone thought about this?
>>>
>> Speaking for myself and the numarray C-API, the answer is no.   What 
>> API do you want?   Can you suggest function prototypes? 
>
>
> An API version of  arrout[slices] = arrin[slices]:
>
> static int
> NA_CopySlice(PyArrayObject* arrin, PyArrayObject* arrout,
>     int* startin, int* stepin, int* stopin, int* startout, int* stepout);
>
>
I would suggest something more like the following then:

typedef struct {
    int start, stop, step;
} NumSlice;

static int
NA_CopySlice(PyArrayObject* arrin, int indim, NumSlice *slicein,
    PyArrayObject* arrout,  int outdim, NumSlice *sliceout);

The differences are:

1.  A slice dimension count is added for both input and output arrays. 
 This enables use of partial indices.

2.  Slice values are expressed using the NumSlice typedef/struct rather 
than 3 independent int arrays.

3. The parameter order is shuffled so that input array parameters are 
kept together, and output array parameters are kept together.

But,  I still have these comments:

1.  It looks like it will be cumbersome to use.

2.  We should probably implement it as a callback to Python to avoid 
introducing another set of assignment semantics.  Thus, the 
implementation would really just be building up and executing the calls 
for:  outarr.__setitem__(outslices, inarr.__getitem__(inslices)).

3. The slicing implementation for numarray objects should be optimized 
to C this quarter, if not this month.  So in terms of efficiency, not to 
mention comment 2, this won't buy much.

4. Since Numeric doesn't have this already,  we're probably missing 
something obvious.  
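
The callback idea in comment 2 can be sketched at the Python level as
follows. This is only an illustration on one-dimensional lists: the
copy_slice function is a hypothetical Python analogue of the proposed
NA_CopySlice, and for a multi-dimensional array the single slice would
become a tuple of slices, one per NumSlice.

```python
from collections import namedtuple

# Python counterpart of the proposed C struct; the name mirrors the proposal.
NumSlice = namedtuple("NumSlice", ["start", "stop", "step"])


def copy_slice(arrin, slicein, arrout, sliceout):
    """Emulate arrout[sliceout] = arrin[slicein] through the item protocol."""
    src = slice(slicein.start, slicein.stop, slicein.step)
    dst = slice(sliceout.start, sliceout.stop, sliceout.step)
    # Delegating to __getitem__/__setitem__ reuses the container's own
    # assignment semantics instead of defining a second set.
    arrout.__setitem__(dst, arrin.__getitem__(src))


arrin = list(range(10))
arrout = [0] * 6
copy_slice(arrin, NumSlice(1, 7, 2), arrout, NumSlice(0, 6, 2))
# arrout is now [1, 0, 3, 0, 5, 0]
```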

Comments?  Still interested?

Todd


From jmiller at stsci.edu  Fri Jan  3 09:49:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan  3 09:49:01 2003
Subject: [Numpy-discussion] End of Holidays small comments
References: <3E13CB14.7040908@erols.com>
Message-ID: <3E15CED2.9070402@stsci.edu>

Wow!   This is great feedback.  Thanks Edward.

Edward C. Jones wrote:

> node35.html:
>
>     >>> print x.type(), x.real.type()
>     D d
>
> should be
>
>     >>> print x.type(), x.real.type()
>     numarray type: Complex64 numarray type: Float64

I talked this over with Perry, and think it should behave and be 
documented more like:
 >>> print x.type(), x.real.type()
Complex64  Float64

>
> ------------------------------------------------
>
> Why use both NUM_C_ARRAY and C_ARRAY?

In the context of the defining enumeration,  NUM_C_ARRAY looks correct. 
  Anywhere else,  C_ARRAY is about all I can stand.   C_ARRAY is so 
common that I thought a little irregularity would be tolerable.  Chalk 
it up to tastelessness.

>
> ------------------------------------------------
>
> in  _ndarraymodule.c:
>
>         {"_byteoffset",
>          (getter)_ndarray_byteoffset_get,
>          (setter)_ndarray_byteoffset_set,
>          "shortest seperation between elements in bytes"},
>         {"_bytestride",
>          (getter)_ndarray_bytestride_get,
>          (setter)_ndarray_bytestride_set,
>          "shortest seperation between elements in bytes"},
>
> One of the comments is wrong. Also "separation".

Noted.

>
> ------------------------------------------------
>
> libnumarraymodule.c:
>
>     /* Create an empty array. */
>     static PyArrayObject *
>     NA_Empty(int ndim, int *shape, NumarrayType type)
>
> node42.html:
>
>     static PyObject* NA_Empty( NumarrayType type, int ndim, ...)
>
Noted.

>
> ------------------------------------------------
>
> I think NA_New should be
>
>     NA_New(int ndim, int* shape, NumarrayType type, void* buffer)
>
> The current NA_New is useful only when ndim is known at code-writing 
> time.

NA_New is a  "convenience wrapper" around NA_NewAll,  but I see your point.

How about NA_vNew(),  in the spirit of vprintf?

>
> ------------------------------------------------
>
> node39.html:
>
>     Note: the type parameter for a macro is one of the Numarray Numeric
>     Data Types, not a NumarrayType enumeration value.
>
> There should be an example of one of the GET/SET macros. How about
>
>     unsigned char n;
>     int i;
>     ...
>     n = NA_GET1(arr, UInt8, i);

OK.

>
> ------------------------------------------------
>
> It seems that the parameters "aligned" and "writeable" are ignored in 
> the source code for NA_NewAll and class NumArray.

"aligned" is used.

"writeable" should probably be dropped since it is no longer used.   
Since doing that would break an interface someone might be using,  I'd 
rather not.

>
> ------------------------------------------------
>
> I would like to see an "int* strides" parameter added to NA_NewAll, so a
> non-contiguous "buffer" can be used. 

OK.   How about NA_NewAllWithStrides (or insert a better name here)?

>
> ------------------------------------------------
>
> I suggest NA_Copy(PyObject* arr) which is something like
>
> static PyObject* NA_Copy(PyObject* arr)
> {
>     PyArrayObject* arr1 = arr;
>     return NA_NewAll(arr1->nd, (long*) arr1->dimensions, 

This  ((long *)) doesn't work portably, so I would recommend avoiding it.

>
>        arr1->descr->type_num, arr1->data, arr1->byteoffset,
>        arr1->bytestride, arr1->byteorder, 1, 1);
> }
>
I'll add NA_Copy().


From jmiller at stsci.edu  Fri Jan  3 09:52:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan  3 09:52:02 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
References: <3E13C7DA.70906@erols.com>
Message-ID: <3E15CF75.8080207@stsci.edu>

Edward C. Jones wrote:

> So now all I can say is:
>
> 1. UInt8 == 'X' should not raise an exception. It should return False.

OK.   I'll change numarray to return False.

>
> 3. There needs to be a function that returns True iff arg is a numarray 
> type (UInt8, "UInt8", "b", ...).
>
> def IsType(rep):
>     from numerictypes import typeDict
>     return isinstance(rep, NumericType) or typeDict.has_key(rep)

Sounds good too.  I'll add this to numerictypes.

>
>
Thanks,
Todd


From edcjones at erols.com  Fri Jan  3 16:03:04 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Fri Jan  3 16:03:04 2003
Subject: [Numpy-discussion] Grepping the source
Message-ID: <3E162CCB.7070106@erols.com>

Here is a short program I find useful.

#! /usr/bin/env python

import os, sys, tempfile

"""Greps the numarray source code"""

command = \
"""grep -n "%s" \
   /usr/local/src/numarray-0.4/Include/numarray/arrayobject.h \
    ...
   /usr/local/src/numarray-0.4/Lib/_ufunc.py \
   ...
   /usr/local/src/numarray-0.4/Src/libnumarraymodule.c \
 > %s
"""

if len(sys.argv) != 2:
     raise Exception, 'program requires exactly one argument'

temp = tempfile.mktemp()
try:
     os.system(command % (sys.argv[1], temp))
     f = file(temp, 'r')
     lines = f.read().splitlines()
     f.close()
finally:
     if os.path.exists(temp):
         os.remove(temp)

common = len('/usr/local/src/numarray-0.4/')
d = {}
names = []
for line in lines:
     line = line[common:]
     colonloc = line.index(':')
     name = line[:colonloc]
     text = line[colonloc+1:]
     if not d.has_key(name):
         d[name] = []
         names.append(name)
     d[name].append(text)

for name in names:
     if len(d[name]) == 0:
         continue
     print '%s:' % name
     for text in d[name]:
         print '   %s' % text
     print


From magnus at hetland.org  Fri Jan  3 16:24:04 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Fri Jan  3 16:24:04 2003
Subject: [Numpy-discussion] Grepping the source
In-Reply-To: <3E162CCB.7070106@erols.com>
References: <3E162CCB.7070106@erols.com>
Message-ID: <20030104002342.GA18694@idi.ntnu.no>

Edward C. Jones <edcjones at erols.com>:
[snip]
>     lines = f.read().splitlines()

You could use f.readlines() here... Or you could just use

  for line in open(...):

later, if you're using Python 2.2+
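
As a sketch of the second suggestion in present-day Python: a file
object can be iterated directly, one line at a time. The group_matches
helper and the sample paths below are made up for illustration and are
not part of the posted program.

```python
import io
from collections import defaultdict


def group_matches(lines, prefix=""):
    """Group 'path:lineno:text' grep lines by path, preserving first-seen order."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(prefix):
            line = line[len(prefix):]
        name, _, text = line.partition(":")
        grouped[name].append(text)
    return dict(grouped)


# A file object is iterable line by line, so no readlines() call is needed.
sample = io.StringIO("src/a.c:12:NA_Empty(\nsrc/a.c:40:NA_New(\nsrc/b.py:3:import a\n")
result = group_matches(sample, prefix="src/")
```

Here io.StringIO stands in for an open file; open(path) can be passed to
group_matches in exactly the same way.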

-- 
Magnus Lie Hetland
http://hetland.org


From perry at stsci.edu  Mon Jan  6 16:28:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Mon Jan  6 16:28:05 2003
Subject: [Numpy-discussion] package vs module
Message-ID: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>

Back in December the issue of whether numarray should be a package
or set of modules came up. When I asked about the possibility
of making numarray a package (on the scipy mailing list but I
can't seem to find the thread where it was discussed), I got
only positive comments. The issue needs to be raised here also.

Is there any objection to making numarray package based?
The implications are that 3rd party modules (e.g. FFT)
will be imported as part of the package structure, i.e.,

  import numarray.FFT 

or

  from numarray.FFT import *

instead of 

  import FFT

As usual there are advantages and disadvantages. The advantages
are that we will not have name collisions with existing Numeric
modules (currently we name FFT as FFT2 for this reason). It also
potentially reduces name collision issues in general. Most feel
it is a cleaner way to organize the software (at least based on
the feedback so far).

The main disadvantages I see so far are:

1) One will either have to change import statements in old code
   to match the new style (a pain, but generally changing imports
   is not terribly difficult since they are easy to identify) or
   explicitly add the path to each 3rd party module to Python
   Path (or some equivalent).
2) If numarray were accepted into the Python Standard Library, it
   would be the first case (as far as I can tell) of a standard
   library package where we would expect to add sub modules to
   it (e.g., FFT)). Normally these would not be distributed with
   the standard library, so some general mechanism will be needed
   to allow numarray to find 3rd party packages outside of the
   Python directory structure. For example, I don't think we can
   require having people install FFT in the Standard Library 
   directory structure after Python is installed. Rather, we would
   probably have numarray look for extension modules in a standard
   named site-packages directory (or site-numarray?) or otherwise
   check a numarraypath environmental variable so that
   import numarray.FFT works properly. Perhaps others have ideas
   about how to best handle this.
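
One way the environment variable idea could look, as a rough sketch in
present-day Python. The variable name NUMARRAYPATH, the helper
addon_search_path and the directories are all assumptions made for
illustration; nothing like this exists in numarray today.

```python
import os


def addon_search_path(package_dirs, environ, var="NUMARRAYPATH"):
    """Return package_dirs extended with directories listed in environ[var]."""
    extra = environ.get(var, "")
    # os.pathsep is ':' on POSIX and ';' on Windows, as for PYTHONPATH.
    dirs = [d for d in extra.split(os.pathsep) if d]
    return list(package_dirs) + [d for d in dirs if d not in package_dirs]


env = {"NUMARRAYPATH": os.pathsep.join(["/opt/na-addons", "/home/me/na"])}
path = addon_search_path(["/usr/lib/python/numarray"], env)
```

A package's __init__ could assign such a list to its __path__ attribute,
which is the list Python searches when importing submodules of that
package, so that import numarray.FFT would also look in the add-on
directories.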

Any other issues being overlooked?

Feedback?

Thanks, Perry


From magnus at hetland.org  Mon Jan  6 23:05:02 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Jan  6 23:05:02 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <20030107070426.GC4884@idi.ntnu.no>

Perry Greenfield <perry at stsci.edu>:
>
> Back in December the issue of whether numarray should be a package
> or set of modules came up. When I asked about the possibility
> of making numarray a package (on the scipy mailing list but I
> can't seem to find the thread where it was discussed), I got
> only positive comments. The issue needs to be raised here also.
> 
> Is there any objection to making numarray package based?

I think this seems like a very good and natural thing to do. (Maybe
names like RandomArray2 etc. can be changed too, now... :)

-- 
Magnus Lie Hetland
http://hetland.org


From pearu at cens.ioc.ee  Tue Jan  7 02:22:03 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Tue Jan  7 02:22:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>

On Mon, 6 Jan 2003, Perry Greenfield wrote:

> The main disadvantages I see so far are:
> 
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).
> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT)). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library 
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
> 
> Any other issues being overlooked?

There is one, though not so critical at this point but I will raise
it anyway. In summary, I am +1 for making numarray a package.

The issue is related to import time and memory usage: more extension
modules in a package increase both of them, even if users have no
intention to use these modules. On slower machines this may cause
inconveniences, especially in applications that call Python multiple times
for short tasks containing numarray operations.

Let me repeat: currently this is not a problem with either Numeric
(because it never imports its extension modules) or numarray, and it
will not be until numarray contains a number of extension modules that
presumably are not small.

For a realistic example of this issue consider Scipy (as a sort of upper
bound what numarray may become one day). Scipy contains a linalg module
that is an (almost complete) wrapper to ATLAS/BLAS/LAPACK libraries and
therefore importing the corresponding extension modules can be both time
and memory consuming.  For example, importing scipy to Python may take 2-5
seconds on PII 400MHz, mainly because of loading the linalg extension
modules. This time may be annoying for small but frequent tasks.

I wish Python's import mechanism were a bit smarter or lazier about
loading extension modules that are never used...
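
A package can approximate that kind of laziness by hand. The sketch
below, in present-day Python, defers the real import until the first
attribute access; the LazyModule class is a hypothetical helper, and the
standard colorsys module merely stands in for an expensive extension
module.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only called for attributes not found normally, i.e. the module's own.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


colorsys = LazyModule("colorsys")          # the proxy has not imported anything
pending = colorsys._module is None         # True: import still deferred
rgb = colorsys.hsv_to_rgb(0.0, 0.0, 1.0)   # first use triggers the real import
```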

Pearu


From falted at openlc.org  Tue Jan  7 03:31:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan  7 03:31:07 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <20030107113009.GA2445@openlc.org>

On Mon, Jan 06, 2003 at 07:29:15PM -0500, Perry Greenfield wrote:
> The main disadvantages I see so far are:
> 
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).

I think this should be regarded as a minor annoyance compared with the
advantages of making numarray a package. In addition, the introduction of
numarray as a substitute for Numeric can justify some recoding of existing
applications.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT)). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library 
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
> 

Great. I would be glad to see a package containing the numarray kernel, so
that applications can use its core features, plus a mechanism to add
3rd party packages. In particular, having something similar to site-numarray
to install these packages would be quite neat. In fact, I was considering
including a subset of numarray in the PyTables package (it only needs the
numarray core functionality), but if this reorganization takes place, I
would not need to do that anymore.

> Any other issues being overlooked?

Yeah. If you decide to break numarray into several modules, what would
the granularity of the separation be? My preference is for a reduced core
with basic functionality (to maximize the chances of being included in the
Python Standard Library, but also to allow an easy entry for people who may
wish to use this functionality) and then different, small, 3rd party
packages, but perhaps this is also the most laborious solution.

-- 
Francesc Alted                            PGP KeyID:      0x61C8C11F


From hinsen at cnrs-orleans.fr  Tue Jan  7 03:32:03 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Jan  7 03:32:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <m3vg11fbdw.fsf@chinon.cnrs-orleans.fr>

Perry Greenfield <perry at stsci.edu> writes:

> Back in December the issue of whether numarray should be a package
> potentially reduces name collision issues in general. Most feel
> it is a cleaner way to organize the software (at least based on
> the feedback so far).

I agree. We have discussed converting NumPy into a package a few times
in the past; the major argument against it was compatibility issues.
Numarray will require some changes to import statements anyway, so
this seems the right time to make the change.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT)). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can

If you plan to unbundle FFT etc. from numarray, then I would prefer a
different naming scheme: numarray being just numarray, and some other
package name grouping together the other modules. That is not only a
question of installation, but also of general maintenance and of
clarity for users. I see the Python package system as a tree:
everything inside a package belongs together, is distributed together
and is maintained by the same people.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From paul at pfdubois.com  Tue Jan  7 09:25:06 2003
From: paul at pfdubois.com (paul at pfdubois.com)
Date: Tue Jan  7 09:25:06 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <20030107113009.GA2445@openlc.org>
Message-ID: <3E0D027100007B17@mta8.wss.scd.yahoo.com>

1. I favor the package approach.

2. I don't care if FFT is numarray.FFT or numpy.FFT (i.e., in a separate
place). However, see (3).

3. Extensions built with one version of Python/numarray may not work with
a different version. This means the safer approach is to have all addons
inside the same directory, so that you can blow away just one directory
and be sure that no 'old' packages remain. 

Some new stuff being put into Python also envisions being able to add various
zipped files to the Python path as places to be searched. Perhaps this represents
a packaging opportunity. I haven't paid enough attention to be sure.

While we are on the subject of packaging, the current distribution places
all sorts of extraneous test and installation-related files in the Lib directory.
This makes it harder to work with the source when you are new to it.


From tim.hochberg at ieee.org  Tue Jan  7 09:35:17 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Jan  7 09:35:17 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>
References: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>
Message-ID: <3E1B0FAF.7020607@ieee.org>

Pearu Peterson wrote:

>On Mon, 6 Jan 2003, Perry Greenfield wrote:
>
>  
>
>>The main disadvantages I see so far are:
>>
>>1) One will either have to change import statements in old code
>>   to match the new style (a pain, but generally changing imports
>>   is not terribly difficult since they are easy to identify) or
>>   explicitly add the path to each 3rd party module to Python
>>   Path (or some equivalent).
>>2) If numarray were accepted into the Python Standard Library, it
>>   would be the first case (as far as I can tell) of a standard
>>   library package where we would expect to add sub modules to
>>   it (e.g., FFT)). Normally these would not be distributed with
>>   the standard library, so some general mechanism will be needed
>>   to allow numarray to find 3rd party packages outside of the
>>   Python directory structure. For example, I don't think we can
>>   require having people install FFT in the Standard Library 
>>   directory structure after Python is installed. Rather, we would
>>   probably have numarray look for extension modules in a standard
>>   named site-packages directory (or site-numarray?) or otherwise
>>   check a numarraypath environmental variable so that
>>   import numarray.FFT works properly. Perhaps others have ideas
>>   about how to best handle this.
>>
>>Any other issues being overlooked?
>>    
>>
>
>There is one, though not so critical at this point but I will raise
>it anyway. In summary, I am +1 for making numarray a package.
>
>The issue is related to import time and memory usage: more extension
>modules in a package increase both of them, even if users have no
>intention to use these modules. On slower machines this may cause
>inconveniences, especially in applications that call Python multiple times
>for short tasks containing numarray operation.
>  
>
That's not right, is it? I'm pretty certain that submodules in a package 
are not loaded until explicitly imported. I'm not sure why SciPy is 
slow, maybe the __init__ imports everything? I don't have a copy here so 
I can't check right now.
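
This is easy to check with a throwaway package. The sketch below uses
present-day Python; demo_pkg and its submodule heavy are invented names
created in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package "demo_pkg" with one submodule "heavy".
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")  # an empty __init__ imports nothing
with open(os.path.join(pkg_dir, "heavy.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg
loaded_after_package_import = "demo_pkg.heavy" in sys.modules   # False

import demo_pkg.heavy
loaded_after_explicit_import = "demo_pkg.heavy" in sys.modules  # True
```

If the package's __init__.py itself imported heavy, the first check
would come out True, which is the situation guessed at for SciPy above.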

In any event I'm +1 for putting it in a package unless it interferes 
with it getting into the core. As Paul mentioned keeping it in a zip 
archive would be even cooler once that's an option.

-tim


From falted at openlc.org  Wed Jan  8 13:27:06 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan  8 13:27:06 2003
Subject: [Numpy-discussion] some recarray rework
Message-ID: <20030108212648.GA1309@openlc.org>

Hi,

In the context of optimizing the PyTables support for numarray and recarray
objects I have been playing with the recarray module, and ended up with a
somewhat improved version of it. Roughly, the modifications done are:

- Addition of a cache to quickly access the columns (numarrays) in
  recarrays. This object is a map (dictionary) where keys are the field
  names and values are references to the columns, regarded as numarray
  entities. This dictionary is accessible through the new attribute
  "_fields".

- Addition of an attribute for recarray objects named "_record" which
  points to a special object ("Record2" class) that is aware of
  the "_fields" cache. It can be used to access the different
  rows in recarray objects in an efficient way.

- The "_record" object is callable (it defines the "__call__" method)
  so as to select the recarray row that is active during access to the
  different fields.

Advantages

- Access to rows and columns (fields) in recarray objects is one
  order of magnitude faster (!).

- The new "_fields" and "_record" attributes provide convenient and
  intuitive ways to access the information in recarrays.

- The "_record" attribute supports the "__getattr__" and "__setattr__"
  methods that are very convenient to access fields in a row.

Drawbacks

- The "_record" attribute always points to the same object and you must
  pass it the row over which you want to operate. So, if you want to
  have two different objects pointing to different rows, you can't use
  the "_record" attribute to get them (but you can still use the
  existing Record class by calling the "__getitem__" method
  of a recarray object).

- Two new attributes are added to the already large number of recarray
  variables. However, this new variables has no special space
  requirements as "_record" object has only three scalar variables
  and "_fields" is a dictionary with many entries as fields in
  recarray, which should be not a large amount.
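
The first drawback is easy to trip over, so here is a minimal sketch of it in present-day Python. RowCursor is a hypothetical stand-in for the shared "_record" object, reduced to a single column held in a plain list.

```python
class RowCursor:
    """Hypothetical shared cursor over one column."""

    def __init__(self, column):
        self.column = column
        self.row = 0

    def __call__(self, row):
        self.row = row
        return self          # always the same object

    @property
    def value(self):
        return self.column[self.row]


column = [10, 20, 30]
cursor = RowCursor(column)

first = cursor(0)
second = cursor(2)           # silently moves "first" as well

same_object = first is second
first_value = first.value    # reads row 2, not row 0

# When two rows must be held at once, copy the values out instead of
# keeping the cursor (or fall back to independent Record-like objects).
snapshots = [cursor(row).value for row in (0, 2)]
```

So the shared cursor is fine for a single pass over the rows, but anything that needs to compare or keep several rows should take copies.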

I'm attaching this modified version as well as a testbed program in order to
test its new access methods and improved performance. The output of this
program, run on a 2 GHz Pentium 4 machine, is also included.

Feel free to play with it and/or take/adapt the parts you consider better
suited to the recarray module.

-- 
Francesc Alted                            PGP KeyID:      0x61C8C11F
-------------- next part --------------
import numarray as num
import ndarray as mda
import memory
import chararray
import sys, copy, os, re, types, string

__version__ = '1.0'

class Char:
    """ data type Char class"""
    bytes = 1
    def __repr__(self):
        return "CharType"

CharType = Char()

# translation table to the num data types
numfmt = {'i1':num.Int8, 'u1':num.UInt8, 'i2':num.Int16, 'i4':num.Int32,
          'i8':num.Int64,
          'f4':num.Float32, 'f8':num.Float64,
          'l':num.Bool, 'b':num.Int8, 'u':num.UInt8, 's':num.Int16,
          'i':num.Int32, 'N':num.Int64,
          'f':num.Float32, 'd':num.Float64, 'r':num.Float32,
          'a':CharType,
          'Int8':num.Int8, 'Int16':num.Int16, 'Int32':num.Int32,
          'Int64':num.Int64,
          'UInt8':num.UInt8, 'Float32':num.Float32, 'Float64':num.Float64,
          'Bool':num.Bool}

# the reverse translation table of the above (for numarray only)
revfmt = {num.Int16:'s', num.Int32:'i', num.Int64:'N',
          num.Float32:'r', num.Float64:'d',
          num.Bool:'l', num.Int8:'b', num.UInt8:'u', CharType:'a'}

# TFORM regular expression
format_re = re.compile(r'(?P<repeat>^[0-9]*)(?P<dtype>[A-Za-z0-9.]+)')

def fromrecords (recList, formats=None, names=None):
    """ create a Record Array from a list of records in text form

        The data in the same field can be heterogeneous, they will be promoted
        to the highest data type.  This method is intended for creating
        smaller record arrays.  If used to create large array e.g.

        r=recarray.fromrecords([[2,3.,'abc']]*100000)

        it is slow.

    >>> r=fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r.field('col1')
    array([456,   2])
    >>> r.field('col2')
    CharArray(['dbe', 'de'])
    >>> import cPickle
    >>> print cPickle.loads(cPickle.dumps(r))
    RecArray[ 
    (456, 'dbe', 1.2),
    (2, 'de', 1.3)
    ]
    """

    _shape = len(recList)
    _nfields = len(recList[0])
    for _rec in recList:
        if len(_rec) != _nfields:
            raise ValueError, "inconsistent number of objects in each record"
    arrlist = [0]*_nfields
    for col in range(_nfields):
        tmp = [0]*_shape
        for row in range(_shape):
            tmp[row] = recList[row][col]
        try:
            arrlist[col] = num.array(tmp)
        except:
            try:
                arrlist[col] = chararray.array(tmp)
            except:
                raise ValueError, "inconsistent data at row %d,field %d" % (row, col)
    _array = fromarrays(arrlist, formats=formats, names=names)
    del arrlist
    del tmp
    return _array

def fromarrays (arrayList, formats=None, names=None):
    """ create a Record Array from a list of num/char arrays

    >>> x1=num.array([1,2,3,4])
    >>> x2=chararray.array(['a','dd','xyz','12'])
    >>> x3=num.array([1.1,2,3,4])
    >>> r=fromarrays([x1,x2,x3],names='a,b,c')
    >>> print r[1]
    (2, 'dd', 2.0)
    >>> x1[1]=34
    >>> r.field('a')
    array([1, 2, 3, 4])
    """

    _shape = len(arrayList[0])

    if formats == None:

        # go through each object in the list to see if it is a numarray or
        # chararray and determine the formats
        formats = ''
        for obj in arrayList:
            if isinstance(obj, chararray.CharArray):
                formats += `obj._itemsize` + 'a,'
            elif isinstance(obj, num.NumArray):
                if len(obj._shape) == 1: _repeat = ''
                elif len(obj._shape) == 2: _repeat = `obj._shape[1]`
                else: raise ValueError, "doesn't support numarray more than 2-D"

                formats += _repeat + revfmt[obj._type] + ','
            else:
                raise ValueError, "item in the array list must be numarray or chararray"
        formats=formats[:-1]

    for obj in arrayList:
        if len(obj) != _shape:
            raise ValueError, "array has different lengths"

    _array = RecArray(None, formats=formats, shape=_shape, names=names)

    # populate the record array (make a copy)
    for i in range(len(arrayList)):
        try:
            _array.field(_array._names[i])[:] = arrayList[i]
        except:
            print "Incorrect CharArray format %s, copy unsuccessful." % _array._formats[i]
    return _array

def fromstring (datastring, formats, shape=0, names=None):
    """ create a Record Array from binary data contained in a string"""
    _array = RecArray(chararray._stringToBuffer(datastring), formats, shape, names)
    if mda.product(_array._shape)*_array._itemsize > len(datastring):
        raise ValueError("Insufficient input data.")
    else: return _array

def fromfile(file, formats, shape=-1, names=None):
    """Create an array from binary file data

    If file is a string then that file is opened, else it is assumed
    to be a file object. No options at the moment, all file positioning
    must be done prior to this function call with a file object

    >>> import testdata, sys
    >>> fd=open(testdata.filename)
    >>> fd.seek(2880*2)
    >>> r=fromfile(fd, formats='d,i,5a', shape=3)
    >>> r._byteorder = "big"
    >>> print r[0]
    (5.1000000000000005, 61, 'abcde')
    >>> r._shape
    (3,)
    """

    if isinstance(shape, types.IntType) or isinstance(shape, types.LongType):
        shape = (shape,)
    name = 0
    if isinstance(file, types.StringType):
        name = 1
        file = open(file, 'rb')
    size = os.path.getsize(file.name) - file.tell()

    dummy = array(None, formats=formats, shape=0)
    itemsize = dummy._itemsize

    if shape and itemsize:
        shapesize = mda.product(shape)*itemsize
        if shapesize < 0:
            shape = list(shape)
            shape[ shape.index(-1) ] = size / -shapesize
            shape = tuple(shape)

    nbytes = mda.product(shape)*itemsize

    if nbytes > size:
        raise ValueError(
                "Not enough bytes left in file for specified shape and type")

    # create the array
    _array = RecArray(None, formats=formats, shape=shape, names=names)
    nbytesread = memory.file_readinto(file, _array._data)
    if nbytesread != nbytes:
        raise IOError("Didn't read as many bytes as expected")
    if name:
        file.close()
    return _array

# The test below was factored out of "array" due to platform specific
# floating point formatted results:  e+020 vs. e+20
if sys.platform == "win32":
    _fnumber = "2.5984589414244182e+020"
else:
    _fnumber = "2.5984589414244182e+20"

__test__ = {}
__test__["array_platform_test_workaround"] = """
        >>> r=array('a'*200,'r,3s,5a,i',3)
        >>> print r[0]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        >>> print r[1]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        """ % globals()
del _fnumber

def array(buffer=None, formats=None, shape=0, names=None):
    """This function creates a new instance of a RecArray.

    buffer      specifies the source of the array's initialization data.
                buffer can be: RecArray, list of records in text, list of
                numarray/chararray, None, string, buffer.

    formats     specifies the format definitions of the array's records.

    shape       specifies the array dimensions.

    names       specifies the field names.

    >>> r=array([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r=array('a'*200,'r,3i,5a,s',3)
    >>> r._bytestride
    23
    >>> r._names
    ['c1', 'c2', 'c3', 'c4']
    >>> r._repeats
    [1, 3, 5, 1]
    >>> r._shape
    (3,)
    """

    if (buffer is None) and (formats is None):
        raise ValueError("Must define formats if buffer=None")
    elif buffer is None or isinstance(buffer, types.BufferType):
        return RecArray(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.StringType):
        return fromstring(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.ListType) or isinstance(buffer, types.TupleType):
        if isinstance(buffer[0], num.NumArray) or isinstance(buffer[0], chararray.CharArray):
            return fromarrays(buffer, formats=formats, names=names)
        else:
            return fromrecords(buffer, formats=formats, names=names)
    elif isinstance(buffer, RecArray):
        return buffer.copy()
    elif isinstance(buffer, types.FileType):
        return fromfile(buffer, formats=formats, shape=shape, names=names)
    else:
        raise ValueError("Unknown input type")

def _RecGetType(name):
    """Converts a type repr string into a type."""
    if name == "CharType":
        return CharType
    else:
        return num._getType(name)

class RecArray(mda.NDArray):
    """Record Array Class"""

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=1):

        # names and formats can be either a string with components separated
        # by commas or a list of string values, e.g. ['i4', 'f4'] and 'i4,f4'
        # are equivalent formats

        self._parseFormats(formats)
        self._fieldNames(names)

        itemsize = self._stops[-1] + 1

        if shape != None:
            if type(shape) in [types.IntType, types.LongType]: shape = (shape,)
            elif (type(shape) == types.TupleType and type(shape[0]) in [types.IntType, types.LongType]):
                pass
            else: raise NameError, "Illegal shape %s" % `shape`

        #XXX need to check shape*itemsize == len(buffer)?

        self._shape = shape
        mda.NDArray.__init__(self, self._shape, itemsize, buffer=buffer,
                             byteoffset=byteoffset,
                             bytestride=bytestride,
                             aligned=aligned)
        self._byteorder = byteorder

        # Build the column arrays
        self._fields = self._get_fields()

        # Associate a record object for accessing values in each row
        # in a efficient way (i.e. without creating a new object each time)
        self._record = Record2(self)

    def _parseFormats(self, formats):
        """ Parse the field formats """

        if (type(formats) in [types.ListType, types.TupleType]):
            _fmt = formats[:]           ### make a copy
        elif (type(formats) == types.StringType):
            _fmt = string.split(formats, ',')
        else:
            raise NameError, "illegal input formats %s" % `formats`

        self._nfields = len(_fmt)
        self._repeats = [1] * self._nfields
        self._sizes = [0] * self._nfields
        self._stops = [0] * self._nfields

        # preserve the input for future reference
        self._formats = [''] * self._nfields

        sum = 0
        for i in range(self._nfields):

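            # parse the formats into repeats and formats
            try:
                (_repeat, _dtype) = format_re.match(string.strip(_fmt[i])).groups()
            except AttributeError:
                # match() returned None: fail here instead of continuing
                # with an undefined (or stale) _repeat/_dtype
                raise ValueError, 'format %s is not recognized' % _fmt[i]

            if _repeat == '': _repeat = 1
            else: _repeat = int(_repeat)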
            _fmt[i] = numfmt[_dtype]
            self._repeats[i] = _repeat

            self._sizes[i] = _fmt[i].bytes * _repeat
            sum += self._sizes[i]
            self._stops[i] = sum - 1

            # Unify the appearance of _format, independent of input formats
            self._formats[i] = `_repeat`+revfmt[_fmt[i]]

        self._fmt = _fmt

    def __getstate__(self):
        """returns pickled state dictionary for RecArray"""
        state = mda.NDArray.__getstate__(self)
        state["_fmt"] = map(repr, self._fmt)
        return state
    
    def __setstate__(self, state):
        mda.NDArray.__setstate__(self, state)
        self._fmt = map(_RecGetType, state["_fmt"])

    def _fieldNames(self, names=None):
        """convert input field names into a list and assign to the _names
        attribute """

        if (names):
            if (type(names) in [types.ListType, types.TupleType]):
                pass
            elif (type(names) == types.StringType):
                names = string.split(names, ',')
            else:
                raise NameError, "illegal input names %s" % `names`

            self._names = map(lambda n:string.strip(n), names)
        else: self._names = []

        # if the names are not specified, they will be assigned as "c1, c2,..."
        # if not enough names are specified, they will be assigned as "c[n+1],
        # c[n+2],..." etc. where n is the number of specified names..."
        self._names += map(lambda i: 'c'+`i`, range(len(self._names)+1,self._nfields+1))

    def _get_fields(self):
        """ get a dictionary with fields as numeric arrays """

        # Iterate over all the fields
        fields = {}
        for fieldName in self._names:
            # determine the offset within the record
            indx = index_of(self._names, fieldName)
            _start = self._stops[indx] - self._sizes[indx] + 1

            _shape = self._shape
            _type = self._fmt[indx]
            _buffer = self._data
            _offset = self._byteoffset + _start

            # don't use self._itemsize due to possible slicing
            _stride = self._strides[0]

            _order = self._byteorder

            if isinstance(_type, Char):
                arr = chararray.CharArray(buffer=_buffer, shape=_shape,
                          itemsize=self._repeats[indx], byteoffset=_offset,
                          bytestride=_stride)
            else:
                arr = num.NumArray(shape=_shape, type=_type, buffer=_buffer,
                          byteoffset=_offset, bytestride=_stride,
                          byteorder = _order)

                # modify the _shape and _strides for array elements
                if (self._repeats[indx] > 1):
                    arr._shape = self._shape + (self._repeats[indx],)
                    arr._strides = (self._strides[0], _type.bytes)

            # Put this array as a value in dictionary
            fields[fieldName] = arr

        return fields

    def field(self, fieldName):
        """ get the field data as a numeric array """

        return self._fields[fieldName]
        
    def info(self):
        """display instance's attributes (except _data)"""
        _attrList = dir(self)
        _attrList.remove('_data')
        _attrList.remove('_fmt')
        for attr in _attrList:
            print '%s = %s' % (attr, getattr(self,attr))

    def __str__(self):
        outstr = 'RecArray[ \n'
        for i in self:
            outstr += Record.__str__(i) + ',\n'
        return outstr[:-2] + '\n]'

    ### The following __getitem__ is not in the requirements
    ### and is here for experimental purposes
    def __getitem__(self, key):
        if type(key) == types.TupleType:
            if len(key) == 1:
                return mda.NDArray.__getitem__(self,key[0])
            elif len(key) == 2 and type(key[1]) == types.StringType:
                return mda.NDArray.__getitem__(self,key[0]).field(key[1])
            else:
                raise NameError, "Illegal key %s" % `key`
        return mda.NDArray.__getitem__(self,key)

    def _getitem(self, key):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        return Record(self, row)

    def _setitem(self, key, value):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        for i in range(self._nfields):
            self.field(self._names[i])[row] = value.field(self._names[i])

    def reshape(*value):
        print "Cannot reshape record array."


class Record2:
    """Record2 Class

    This class is similar to Record, except that it is created and
    associated with a recarray when the recarray itself is created.
    When speed in traversing the recarray is required, this approach
    is more convenient than creating a new Record object for each row
    that is visited.

    """

    def __init__(self, input):

        self.__dict__["_array"] = input
        self.__dict__["_fields"] = input._fields
        self.__dict__["_row"] = 0

    def __call__(self, row):
        """ set the row for this record object """
        
        if row < self._array.shape[0]:
            self.__dict__["_row"] = row
            return self
        else:
            return None

    def __getattr__(self, fieldName):
        """ get the field data of the record"""
        
        try:
            return self._fields[fieldName][self._row]
        except:
            (type, value, traceback) = sys.exc_info()
            raise AttributeError, "Error accessing \"%s\" attr.\n %s" % \
                  (fieldName, "Error was: \"%s: %s\"" % (type,value))

    def __setattr__(self, fieldName, value):
        """ set the field data of the record"""

        self._fields[fieldName][self._row] = value

    def __str__(self):
        """ represent the record as a string """
        
        outlist = []
        for name in self._array._names:
            outlist.append(`self._fields[name][self._row]`)
        return "(" + ", ".join(outlist) + ")"

class Record:
    """Record Class"""

    def __init__(self, input, row=0):
        if isinstance(input, types.ListType) or isinstance(input, types.TupleType):
            input = fromrecords([input])
        if isinstance(input, RecArray):
            self.array = input
            self.row = row

    def __getattr__(self, fieldName):
        """ get the field data of the record"""

        #return self.array.field(fieldName)[self.row]
        if fieldName in self.array._names:
            #return self.array.field(fieldName)[self.row]
            return self.array._fields[fieldName][self.row]

    def field(self, fieldName):
        """ get the field data of the record"""

        #return self.array.field(fieldName)[self.row]
        return self.array.field(fieldName)[self.row]

    def __str__(self):
        outstr = '('
        #for i in range(self.array._nfields):
        #    print self.array.field(i)[self.row]
        for name in self.array._names:
            #print self.array.field(name)[self.row]
            #print self.array._fields[name][self.row]
            ### this is not efficient, need to know how to convert N-bytes to each data type
            outstr += `self.array.field(name)[self.row]` + ', '
        return outstr[:-2] + ')'

def index_of(nameList, key):
    """ Get the index of the key in the name list.

        The key can be an integer or string.  If integer, it is the index
        in the list.  If string, the name matching will be case-insensitive and
        trailing blank-insensitive.
    """
    if (type(key) in [types.IntType, types.LongType]):
        indx = key
    elif (type(key) == types.StringType):
        _names = nameList[:]
        for i in range(len(_names)):
            _names[i] = string.lower(_names[i])
        try:
            indx = _names.index(string.strip(string.lower(key)))
        except:
            raise NameError, "Key %s does not exist" % key
    else:
        raise NameError, "Illegal key %s" % `key`

    return indx

def find_duplicate (list):
    """Find duplication in a list, return a list of duplicated elements"""
    dup = []
    for i in range(len(list)):
        if (list[i] in list[i+1:]):
            if (list[i] not in dup):
                dup.append(list[i])
    return dup

def test():
    import doctest, recarray
    return doctest.testmod(recarray)

if __name__ == "__main__":
    test()
-------------- next part --------------
import sys, time
import numarray as num
import chararray
import recarray
import recarray2  # This is my modified version

usage = \
"""usage: %s recordlength
     Set recordlength to 1000 at least to obtain decent figures!
""" % sys.argv[0]

try:
    reclen = int(sys.argv[1])
except:
    print usage
    sys.exit()

delta = 0.000001

# Creation of recarrays objects for test
x1=num.array(num.arange(reclen))
x2=chararray.array(None, itemsize=7, shape=reclen)
x3=num.array(num.arange(reclen,reclen*3,2), num.Float64)
r1=recarray.fromarrays([x1,x2,x3],names='a,b,c')
r2=recarray2.fromarrays([x1,x2,x3],names='a,b,c')

print "recarray shape in test ==>", r2.shape

print "Assignment in recarray modified"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)  # select the row to be changed
    #rec.b = "changed"      # change the "b" field
    rec.c = float(row**2)  # Change the "c" field
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
# NOTE: the label says "b", but the value printed is field "c"
print "Field b on row 2 after re-assign:", r2.field("c")[2]
print

print "Assignment in recarray original"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    #r1.field("b")[row] = "changed"
    r1.field("c")[row] = float(row**2)
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
# NOTE: the label says "b", but the value printed is field "c"
print "Field b on row 2 after re-assign:", r1.field("c")[2]
print

print "Selection in recarray modified"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)
    if rec.a < 3:
        print "This record pass the cut ==>", rec.c, "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print

print "Selection in recarray original"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r1[row]
    if rec.field("a") < 3:
        print "This record pass the cut ==>", rec.field("c"), "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))

-------------- next part --------------
recarray shape in test ==> (10000,)
Assignment in recarray modified
-------------------------------
Assign time: 0.15  Rows/s: 66666
Field b on row 2 after re-assign: 4.0

Assignment in recarray original
-------------------------------
Assign time: 1.24  Rows/s: 8064
Field b on row 2 after re-assign: 4.0

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.18  Rows/s: 55555

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.52  Rows/s: 6578

From falted at openlc.org  Fri Jan 10 09:17:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 10 09:17:05 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
Message-ID: <200301101813.41407.falted@openlc.org>

Hi,

I think there are some data types missing in the recarray module. I can
create recarrays using the fromarrays function with no problems except if I
use UInt16, UInt32 and UInt64.

As these types are well supported by numarray, is there any reason why they
don't appear in the numfmt and revfmt mappings of the recarray module? Is it
safe to add them by hand in the source?
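
To make the question concrete: in the recarray source, fromarrays looks types up in revfmt, and _parseFormats uses both numfmt and revfmt, so any hand edit has to touch both tables consistently. Below is a rough sketch of that bookkeeping in present-day Python. Strings stand in for the numarray type objects, register_format is a hypothetical helper, and the codes "u2", "u4" and "u8" are chosen only for this example, not taken from the module.

```python
def register_format(code, type_name, numfmt, revfmt):
    """Add one format code to both tables without clobbering entries."""
    if code in numfmt:
        raise ValueError("format code %r is already taken" % code)
    numfmt[code] = type_name
    # Keep the existing (canonical) code if the type already has one.
    revfmt.setdefault(type_name, code)


# Strings stand in for numarray type objects such as num.UInt16.
numfmt = {"s": "Int16", "i": "Int32", "N": "Int64", "u": "UInt8"}
revfmt = {"Int16": "s", "Int32": "i", "Int64": "N", "UInt8": "u"}

# Hypothetical codes, used here only for illustration.
for code, type_name in [("u2", "UInt16"), ("u4", "UInt32"), ("u8", "UInt64")]:
    register_format(code, type_name, numfmt, revfmt)

# Every reverse entry must map back to the type it came from.
round_trip_ok = all(numfmt[code] == t for t, code in revfmt.items())
```

If the round-trip check fails after a hand edit, a recarray built from such a column would report a format string that cannot be parsed back into the same type.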

Thanks,

-- 
Francesc Alted


From perry at stsci.edu  Fri Jan 10 10:37:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 10 10:37:02 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
In-Reply-To: <200301101813.41407.falted@openlc.org>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEPLEBAA.perry@stsci.edu>

> Hi,
>
> I think there are some data types missing in the recarray module. I can
> create recarrays using the fromarrays function with no problems
> except if I
> use UInt16, UInt32 and UInt64.
>
> As these types are well supported by numarray, is there any
> reason why they
> don't appear in the numfmt and revfmt mappings of the recarray module? Is
> it safe to add them by hand in the source?
>
> Thanks,
>
> --
> Francesc Alted
>
Good point. We were using this for an I/O library that didn't use
these types so that's why they didn't get in there originally.
But you are right, they should be. Do you want to make the changes?

Thanks, Perry


From costas at malamas.com  Sat Jan 11 01:12:03 2003
From: costas at malamas.com (Costas Malamas)
Date: Sat Jan 11 01:12:03 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
Message-ID: <000701c2b951$74d59880$6e00a8c0@retek.int>

Hello all,

I have been trying to find a package/addon that will provide a sparse array
class to NumPy, or will at least trick NumPy to use a sparse array as a
regular array, to no avail.

By sparse array here, I do not mean a sparse matrix equation solver, but an
array class that accepts a "default value".  In other words, I would like to
instantiate a 1000x1000x1000 (1e9) array that will have at most 5-10%
populated (i.e. non-zero) elements.  The current NumPy will instantiate the
entire 1e9 array, which is a non-starter if you would like to calculate an
expression with say 4-5 arrays.  Instead, I'd like a class that will only
store the populated cells, and return the default value for the others
(ideally, by doing some smart disk I/O to preserve memory).
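
To show the kind of interface I have in mind, here is a minimal dictionary-backed sketch in present-day Python. DefaultSparseArray is a hypothetical name, it only supports element access with a full index tuple, and it does none of the arithmetic or disk I/O mentioned above.

```python
class DefaultSparseArray:
    """N-D array that stores only the cells differing from a default."""

    def __init__(self, shape, default=0.0):
        self.shape = tuple(shape)
        self.default = default
        self._cells = {}              # index tuple -> stored value

    def _check(self, index):
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range: %r" % (index,))
        return tuple(index)

    def __getitem__(self, index):
        return self._cells.get(self._check(index), self.default)

    def __setitem__(self, index, value):
        key = self._check(index)
        if value == self.default:
            # Writing the default frees the cell instead of storing it.
            self._cells.pop(key, None)
        else:
            self._cells[key] = value

    def stored(self):
        return len(self._cells)


a = DefaultSparseArray((1000, 1000, 1000))
a[1, 2, 3] = 5.0
a[4, 5, 6] = 7.0
a[4, 5, 6] = 0.0      # back to the default, so the cell is dropped
```

Memory use grows with the number of populated cells rather than with the nominal 1e9 elements, at the price of a per-cell dictionary overhead.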

I've tried SciPy, Scientific Python, and a few other modules floating
around; none seem to do the trick, yet I can't help but think that this is
not an uncommon setup for a lot of problem domains.  Is there a package out
there?  If there isn't, where should I start looking to create one? From
their description I think SparseLib++ at least would be a good starting
point as a base library.

As a secondary issue, is anyone aware of a package that can handle storage
of such arrays?  netCDF and HDF do not seem to fit the bill; a B-Tree
library seems a more natural fit...

Thanks in advance --any and all input appreciated,

Costas


From ehagemann at comcast.net  Sun Jan 12 15:14:06 2003
From: ehagemann at comcast.net (eric hagemann)
Date: Sun Jan 12 15:14:06 2003
Subject: [Numpy-discussion] questions about array types
Message-ID: <003c01c2ba90$32d015b0$6401a8c0@eric>

Rereading the numeric docs I see the reference to types Float, Float32, Float64 -- which make sense, however I am curious to understand the usefulness of types Float0, Float8 and Float16 which all seem synonyms for Float32.  Was there some thinking that there would be a converter written for 8bit floats?


>>> from Numeric import *
>>> a = array([1,2,3,4],Float32)
>>> fromstring(a.tostring(),Float32)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float)
array([   2.00000047,  512.00012255])  # corrupt, as would be expected
>>> fromstring(a.tostring(),Float0) #seems to convert back as if Float0 == Float32
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float8)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float16)
array([ 1.,  2.,  3.,  4.],'f')
>>>

From oliphant at ee.byu.edu  Mon Jan 13 12:59:04 2003
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Jan 13 12:59:04 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
In-Reply-To: <000701c2b951$74d59880$6e00a8c0@retek.int>
Message-ID: <Pine.LNX.4.33L2.0301131355340.22743-100000@oliphant.ee.byu.edu>

> Hello all,
>
> I have been trying to find a package/addon that will provide a sparse array
> class to NumPy, or will at least trick NumPy to use a sparse array as a
> regular array, to no avail.
>

Sparse arrays are not a common object.  Sparse matrices have many, many
implementations of which I'm sure you're aware.

What you want is a general purpose N-D array that uses some kind of sparse
storage.  I'm not aware of such an object in any other language.  Most of
the time people remap their particular problem so that any sparse arrays
become sparse matrices.  All of the effort is then focused in manipulating
certain classes of sparse matrices.
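
As a rough illustration of that remapping (a sketch in present-day Python, not tied to any particular sparse matrix package): keep the first axis as the matrix row and flatten the remaining axes, row-major, into the column. The function names are hypothetical.

```python
def to_matrix_index(index, shape):
    """Map an N-D index to (row, col) for an equivalent 2-D matrix."""
    row = index[0]
    col = 0
    # Row-major flattening of every axis after the first.
    for i, n in zip(index[1:], shape[1:]):
        col = col * n + i
    return row, col


def from_matrix_index(row, col, shape):
    """Inverse of to_matrix_index."""
    rest = []
    for n in reversed(shape[1:]):
        col, i = divmod(col, n)
        rest.append(i)
    return (row,) + tuple(reversed(rest))


shape = (1000, 1000, 1000)
rc = to_matrix_index((3, 4, 5), shape)
back = from_matrix_index(rc[0], rc[1], shape)
```

With this mapping a 1000x1000x1000 sparse array becomes a 1000 x 1000000 sparse matrix, which existing sparse matrix storage schemes can hold.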

-Travis


From Chris.Barker at noaa.gov  Wed Jan 15 10:21:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 10:21:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
Message-ID: <3E2598CC.DAB8FD8A@noaa.gov>

Hi folks,

I use Numeric and wxPython together a lot (of course I do, I use Numeric
for everything!).

Unfortunately, since wxPython is not Numeric aware, you lose some real
potential performance advantages. For example, I'm now working on
expanding the extensions to graphics device contexts (DCs) so that you
can draw a whole bunch of objects with a single Python call. The idea is
that the looping can be done in C++, rather than Python, saving a lot of
overhead of the loop itself, as well as the Python-wxWindows translation
step.

For drawing thousands of points, the speed-up is substantial. It's less
substantial on more complex objects (rectangles give a factor of two
improvement for ~1000 objects), due to the longer time it takes to draw
the object itself, rather than make the call. 

Anyway, at the moment, Robin Dunn has the wrappers set up so that you
can pass in a NumPy array (or, indeed, any sequence) rather than a list
or tuple of coordinates, but it is faster to use a list than a NumPy
array, because for arrays, it uses the generic PySequence_GetItem call.
If we used the NumPy API directly, it should be faster than using a
list, not slower! This is how a representative section of the code looks
now:


bool      isFastSeq  = PyList_Check(pyPoints) || PyTuple_Check(pyPoints);
.
.
.
                // Get the point coordinates
                if (isFastSeq) {
                    obj = PySequence_Fast_GET_ITEM(pyPoints, i);
                }
                else {
                    obj = PySequence_GetItem(pyPoints, i);
                }

.
.
.

So you can see that if a NumPy array is passed in, PySequence_GetItem
will be used.

What I would like to do is have an isNumPyArray check, and then access
the NumPy array directly in that case.

The tricky part is that Robin does not want to have wxPython require
Numeric. (Oh how I dream of the day that NumArray becomes part of the
standard library!)
How can I check if an Object is a NumPy array (and then use it as such),
without including Numeric during compilation?

I know one option is to have conditional compilation, with a NumPy and
non-NumPy version, but Robin is managing a whole lot of different
versions as it is, and I don't think he wants to deal with twice as many!

Anyone have any ideas?

By the way, you can substitute NumArray for NumPy in this, as it is the
wave of the future, and particularly if it would be easier.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From paul at pfdubois.com  Wed Jan 15 10:50:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Wed Jan 15 10:50:07 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>

If you could do:

try:
    import Numeric
    haveNumeric = 1
except ImportError:
    haveNumeric = 0

in some initialization routine, then you could use this flag.
Alternatively, you could test on the fly (sys.modules is a dictionary
keyed by module name):

import sys
'Numeric' in sys.modules



From jmiller at stsci.edu  Wed Jan 15 10:57:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 10:57:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
 <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <3E25B253.1070108@stsci.edu>

Chris Barker wrote:

>Hi folks,
>
>I use Numeric an wxPython together a lot (of course I do, I use Numeric
>for everything!).
>
>Unfortunately, since wxPython is not Numeric aware, you lose some real
>potential performance advantages. For example, I'm now working on
>expanding the extensions to graphics device contexts (DCs) so that you
>can draw a whole bunch of objects with a single Python call. The idea is
>that the looping can be done in C++, rather than Python, saving a lot of
>overhead of the loop itself, as well as the Python-wxWindows translation
>step.
>
>For drawing thousands of points, the speed-up is substantial. It's less
>substantial on more complex objects (rectangles give a factor of two
>improvement for ~1000 objects), due to the longer time it takes to draw
>the object itself, rather than make the call. 
>
>Anyway, at the moment, Robin Dunn has the wrappers set up so that you
>can pass in a NumPy array (or, indeed, and sequence) rather than a list
>or tuple of coordinates, but it is faster to use a list than a NumPy
>array, because for arrays, it uses the generic PySequence_GetItem call.
>If we used the NumPy API directly, it should be faster than using a
>list, not slower! THis is how a representative section of the code looks
>now:
>
>
>bool      isFastSeq  = PyList_Check(pyPoints) ||
>PyTuple_Check(pyPoints);
>.
>.
>.
>                // Get the point coordinants
>                if (isFastSeq) {
>                    obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>                }
>                else {
>                    obj = PySequence_GetItem(pyPoints, i);
>                }
>
>.
>.
>.
>
>So you can see that if a NumPy array is passed in, PySequence_GetItem
>will be used.
>
>What I would like to do is have an isNumPyArray check, and then access
>the NumPy array directly in that case.
>
>The tricky part is that Robin does not want to have wxPython require
>Numeric. (Oh how I dream of the day that NumArray becomes part of the
>standard library!)
>How can I check if an Object is a NumPy array (and then use it as such),
>without including Numeric during compilation?
>
>I know one option is to have condition compilation, with a NumPy and
>non-Numpy version, but Robin is managing a whole lot of different
>version as it is, and I don't think he wants to deal with twice as many!
>
>Anyone have any ideas?
>
Use the Python C-API and string literals as the basis for the interface. 
 I think the steps are something like this:

1.  Import "Numeric". (PyImport_ImportModule)

2.  Get the module dictionary.    (PyModule_GetDict)

3.  Get "array" out of the dictionary.   (PyDict_GetItemString)

4.  Call "isinstance" on Numeric.array and the object.   
(PyObject_IsInstance)

Similarly:

1. Import "numarray".

2. Get the module dictionary.

3. Get "NumArray" out of the dictionary

4. Call the C-API equivalent of "isinstance" on numarray.NumArray and 
the object.

The first 3 steps of both cases can be initialized once, I think, and 
stored in C static variables to avoid repeated fetches.
If any of the first 3 steps fails, then consider that case failed and 
return False.
If it's not a Numeric array, check to see if it's a numarray.

>
>By the way, you can substitute NumArray for NumPy in this, as it is the
>wave of the future, and particularly if it would be easier.
>
>-Chris
>  
>
Todd


From Chris.Barker at noaa.gov  Wed Jan 15 11:00:05 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 11:00:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
 extension package.
References: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>
Message-ID: <3E25A1E4.5CA8C453@noaa.gov>

Paul F Dubois wrote:
> 
> If you could do:
> try:
>     import Numeric
>     haveNumeric = 1
> except ImportError:
>     haveNumeric = 0
> 
> in some initialization routine, then you could use this flag.
> Alternatively, you could test on the fly
> 'Numeric' in sys.modules

Thanks, but I'm talking about doing this at the C++ level in an
extension package, not at the Python level. This kind of thing is so
much easier in Python, of course!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Wed Jan 15 12:01:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:01:53 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
 <3E2598CC.DAB8FD8A@noaa.gov> <3E25B253.1070108@stsci.edu>
Message-ID: <3E25C182.8080906@stsci.edu>

Todd Miller wrote:

> Chris Barker wrote:
>
>> How can I check if an Object is a NumPy array (and then use it as such),
>> without including Numeric during compilation?
>>
>> I know one option is to have condition compilation, with a NumPy and
>> non-Numpy version, but Robin is managing a whole lot of different
>> version as it is, and I don't think he wants to deal with twice as many!
>>
>> Anyone have any ideas?
>>
> Use the Python C-API and string literals as the basis for the 
> interface. I think the steps are something like this:
>
> 1.  Import "Numeric". (PyImport_ImportModule)
>
> 2.  Get the module dictionary.    (PyModule_GetDict)
>
> 3.  Get "array" out of the dictionary.   (PyDict_GetItemString)
>
> 4.  Call "isinstance" on Numeric.array and the object.   
> (PyObject_IsInstance)
>
> Similarly:
>
> 1. Import "numarray".
>
> 2. Get the module dictionary.
>
> 3. Get "NumArray" out of the dictionary
>
> 4. Call the C-API equivalent of "isinstance" on numarray.NumArray and 
> the object.
>
> The first 3 steps of both cases can be initialized once, I think, and 
> stored in C static variables to avoid repeated fetches. 

On second thought,  just do two functions,  one for Numeric,  one for 
numarray.  

If any of the first 3 steps fail, return False.  Otherwise, return the 
result of the isinstance call.

>
> If it's not a Numeric array,  check to see if it's a numarray. 

My idea to couple these was "not good".  They're not compatible at that 
level anyway.

Since numarray and Numeric are only source level compatible,  C-code can 
be compiled to work with one or the other,  but not both at the same 
time.  It probably makes more sense to just implement for Numeric.  If 
you do want to implement for both,  treat them as separate cases with 
separate recognizer functions and element access code.

But...  It's not clear to me that knowing an object is an array will 
help since getting data elements still has to be done fast,  and that 
seems hard to do without knowing the arrayobject struct.   Keep in mind 
that Numeric and numarray arrays are strided and possibly discontiguous, 
 so there's more to data access than owning a base pointer, as would be 
the case in C.
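
To illustrate the point about strides, here is a small sketch in plain Python
(no Numeric or numarray needed). It assumes the usual strided layout, where
the byte offset of an element is the sum of index times stride over all
dimensions. The names element_offset and read_double are invented for the
example.

```python
import struct

ITEMSIZE = struct.calcsize("d")  # size of one C double in bytes

# A 2x3 block of doubles stored row by row in one flat buffer.
FLAT = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
BUFFER = struct.pack("6d", *FLAT)

# Two different views of the same bytes.
ROW_MAJOR_STRIDES = (3 * ITEMSIZE, ITEMSIZE)    # 2x3 view
TRANSPOSED_STRIDES = (ITEMSIZE, 3 * ITEMSIZE)   # 3x2 view, not contiguous


def element_offset(index, strides, base=0):
    """Byte offset of one element: base + sum(index[k] * strides[k])."""
    return base + sum(i * s for i, s in zip(index, strides))


def read_double(index, strides):
    """Fetch one double from BUFFER through the given strides."""
    return struct.unpack_from("d", BUFFER, element_offset(index, strides))[0]
```

Walking the transposed view row by row does not visit the buffer in order,
which is why a base pointer alone is not enough to read the data.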

Todd


From falted at openlc.org  Wed Jan 15 12:25:27 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:25:27 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25C182.8080906@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu>
Message-ID: <200301152123.45614.falted@openlc.org>

On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
> My idea to couple these was "not good".  They're not compatible at that
> level anyway.
>
> Since numarray and Numeric are only source level compatible,  C-code can
> be compiled to work with one or the other,  but not both at the same
> time.  It probably makes more sense to just implement for Numeric.  If
> you do want to implement for both,  treat them as separate cases with
> separate recognizer functions and element access code.
>
> But...  It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast,  and that
> seems hard to do without knowing the arrayobject struct.   Keep in mind
> that Numeric and numarray arrays are strided and possibly discontiguous,
>  so there's more to data access than owning a base pointer, as would be
> the case in C.

I think you can use the numarray High-Level C API to overcome these
difficulties. For example, by using the calls:

PyArrayObject* NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_OutputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_IoArray(PyObject *numarray, NumarrayType t, int requires)

as documented in the User's Guide, you can get well-behaved (i.e.
contiguous and well-aligned) C arrays (copying them, if needed) from both
numarray and Numeric arrays if you pass C_ARRAY as the value of the
requires parameter.

In fact, I'm using NA_InputArray in PyTables to manage both numarray and
Numeric arrays with good results.

-- 
Francesc Alted


From jmiller at stsci.edu  Wed Jan 15 12:40:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:40:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu>
 <3E25C182.8080906@stsci.edu> <200301152123.45614.falted@openlc.org>
Message-ID: <3E25CA79.40206@stsci.edu>

Francesc Alted wrote:

>On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>  
>
>>But...  It's not clear to me that knowing an object is an array will
>>help since getting data elements still has to be done fast,  and that
>>seems hard to do without knowing the arrayobject struct.   Keep in mind
>>that Numeric and numarray arrays are strided and possibly discontiguous,
>> so there's more to data access than owning a base pointer, as would be
>>the case in C.
>>    
>>
>
>I think you can use the numarray High-Level C API to overcome these
>difficulties. 
>
<snip>

But doesn't using the numarray  C-API require a level of coupling 
(direct knowledge of numarray during compilation) that Chris is trying 
to avoid?

>
>  
>

Todd


From falted at openlc.org  Wed Jan 15 12:59:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:59:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25CA79.40206@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu>
Message-ID: <200301152158.44234.falted@openlc.org>

On Wednesday 15 January 2003 21:54, Todd Miller wrote:
> >I think you can use the numarray High-Level C API to overcome these
> >difficulties.
>
> But doesn't using the numarray  C-API require a level of coupling
> (direct knowledge of numarray during compilation) that Chris is trying
> to avoid?
>

Oops, you are right.

Perhaps this kind of scenario (accessing Numeric and numarray arrays from C)
will become more and more common as people become more aware of
numarray's capabilities and want to integrate it into their extensions. That
reinforces my belief that having a small core with the "glue"
functionality between numarray objects and 3rd party extensions in C (or
SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
Standard Library).

That way, people interested in supporting numarray objects in their
extensions have only to install this small core (or even include it as part
of the extension).

Well, speaking as a disinterested and impartial person ;-)

-- 
Francesc Alted


From Chris.Barker at noaa.gov  Wed Jan 15 13:50:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 13:50:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu> <200301152158.44234.falted@openlc.org>
Message-ID: <3E25C99A.9D5E1888@noaa.gov>

Francesc Alted wrote:

> that having a small core with the "glue"
> functionality between numarray objects and 3rd party extensions in C (or
> SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
> Standard Library).
> 
> That way, people interested in supporting numarray objects in their
> extensions has only to install this small core (or even include it as part
> of the extension).

I think that's a fabulous idea, but I have no idea how hard it would be.
There would still be the problem of keeping versions in sync. If I
distributed my package with the glue code, it would only work on
installations using the same version of Numeric (or NumArray, I suppose).


Thanks to all who have commented on my post. These are some ideas I now
have based on your comments:

> > Use the Python C-API and string literals as the basis for the
> > interface. I think the steps are something like this:
> >
> > 1.  Import "Numeric". (PyImport_ImportModule)
> >
> > 2.  Get the module dictionary.    (PyModule_GetDict)
> >
> > 3.  Get "array" out of the dictionary.   (PyDict_GetItemString)
> >
> > 4.  Call "isinstance" on Numeric.array and the object.
> > (PyObject_IsInstance)

OK, so now I can know, at runtime, whether Numeric has been imported.

> But...  It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast,  and that
> seems hard to do without knowing the arrayobject struct.

Exactly. That's my whole problem. However, I have an idea about this. If
I do the above test, I can now put all the Numeric-specific code into a
conditional, so it would only get called if Numeric were imported. My
idea is that I could make sure Numeric was around at compile time, so I
could use all the Numeric API to access the array data, but it wouldn't
have to be installed at runtime, as none of the Numeric calls would be
executed if Numeric hadn't been imported. Would this work, or would the
system try to load the .dll or .so or whatever even if the calls weren't
executed?

All that being said, Tim Hochberg has mentioned that when he first made
wxPython DCs work with Numeric arrays (sorry I didn't give him credit
before, I had forgotten who did that; thanks, Tim), he did some timing
and discovered that the overhead of the drawing calls was
substantially larger than the overhead of the indexing anyway, so
speeding up that process couldn't make much difference.

My timing indicated something different, but I'm using Linux/wxGTK/X11,
and I think the drawing calls return after the message has been sent to
X, but X may not have completed the actual drawing yet. This means that
I'm not timing the whole process, and if I did, I might not see such a
difference. I did some tests with 100,000 points, and found that I could
see the difference between a list and an array, and the list was about
twice as fast. Drawing rectangles, however, I can't see the difference.

So, I think I'll probably shelve this for the moment, and concentrate on
getting all the drawing shapes supported by DrawXXXList methods.

Thanks for all your input.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From gvermeul at grenoble.cnrs.fr  Wed Jan 15 13:50:05 2003
From: gvermeul at grenoble.cnrs.fr (gvermeul at grenoble.cnrs.fr)
Date: Wed Jan 15 13:50:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
Message-ID: <200301152149.h0FLn6PN032653@grenoble.cnrs.fr>

> Gerard Vermeulen wrote:
> > I just want to point out that PyQwt plots NumPy arrays. I have played
> > a little bit with the Scipy-wxWindows interface, but it is no match
> > for PyQwt (I display x-y data with 16000 points).
> 
> Thanks for the tip, I'll check it out. I think what you have there is
> that the plotting is all done at the C++ level, expecting some kind of
> sequence of data points. That's exactly what I want to address with
> wxPython: being able to pass in a whole sequence and have the looping
> done at the C++ level.
>
Yes, I am using PyArray_ContiguousFromObject() to convert any sequence
into a NumPy array before copying the data into Qwt's double arrays. 
>
> Have you ever tested whether it's faster or slower to plot data passed in
> as a list vs. a NumPy array?
>
I did not test it, but there is certainly more overhead if you pass
a list or a tuple into PyArray_ContiguousFromObject() than a NumPy array.
> 
> How do you access the data in the passed in sequence? Do you use:
> PySequence_GetItem ?
> 
No, see above. The code looks like (in "sip" language, sip is a sort of
swig, but more specialized to C++ and Qt):

    void setData(double *, double *, int);
%MemberCode
    PyObject *xSeq, *ySeq;
    $C *ptr;
    if (sipParseArgs(&sipArgsParsed, sipArgs, "mOO",
                     sipThisObj, sipClass_$C, &ptr, &xSeq, &ySeq)) {
        // Convert each argument to a contiguous array of doubles
        // (each call returns a new reference, or NULL on failure).
        PyArrayObject *x = (PyArrayObject *)
            PyArray_ContiguousFromObject(xSeq, PyArray_DOUBLE, 1, 0);
        if (!(x))
            return 0;
        PyArrayObject *y = (PyArrayObject *)
            PyArray_ContiguousFromObject(ySeq, PyArray_DOUBLE, 1, 0);
        if (!(y)) {
            Py_DECREF(x);   // do not leak x when y cannot be converted
            return 0;
        }
        // Use the length of the shorter of the two arrays.
        int size = (x->dimensions[0] < y->dimensions[0]) ?
            x->dimensions[0] : y->dimensions[0];
        Py_BEGIN_ALLOW_THREADS
        ptr->setData((double*)(x->data), (double*)(y->data), size);
        Py_END_ALLOW_THREADS

        Py_DECREF(x);
        Py_DECREF(y);

        Py_INCREF(Py_None);
        return Py_None;
    }
%End

The setData calls copy the data.
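
The same convert-then-copy idea can be sketched at the Python level, with the
standard array module standing in for PyArray_ContiguousFromObject(). This is
only an analogy under that assumption; as_doubles and paired_data are
hypothetical names, not part of PyQwt or Numeric.

```python
from array import array


def as_doubles(seq):
    """Return seq as a contiguous array of C doubles.

    An array that already holds doubles is passed through untouched;
    any other sequence (list, tuple, ...) is converted, which costs a copy.
    """
    if isinstance(seq, array) and seq.typecode == "d":
        return seq
    return array("d", seq)


def paired_data(xs, ys):
    """Normalize both inputs and trim them to the shorter length."""
    x = as_doubles(xs)
    y = as_doubles(ys)
    size = min(len(x), len(y))
    return x[:size], y[:size]
```

The pass-through branch is where an array input saves work compared with a
list or tuple, which always has to be converted element by element.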
>
> thanks for the tip. Qwt (and PyQwt) look very nice, I may have to
> reconsider using PyQT!
> 

Gerard

>
> -Chris
> 
> 
> 
>  
> > Take a look at http://gerard.vermeulen.free.fr
> > 
> > PyQwt is an addon for PyQt (a Python wrapper for Qt) that knows nothing
> > about NumPy
> > 
> > Maybe it is possible to make a NumPy plot add-on for wxWindows, too.
> > 
> > Gerard
> > 
> > On Wed, Jan 15, 2003 at 09:22:20AM -0800, Chris Barker wrote:
> > > Hi folks,
> > >
> > > I use Numeric an wxPython together a lot (of course I do, I use Numeric
> > > for everything!).
> > >
> > > Unfortunately, since wxPython is not Numeric aware, you lose some real
> > > potential performance advantages. For example, I'm now working on
> > > expanding the extensions to graphics device contexts (DCs) so that you
> > > can draw a whole bunch of objects with a single Python call. The idea is
> > > that the looping can be done in C++, rather than Python, saving a lot of
> > > overhead of the loop itself, as well as the Python-wxWindows translation
> > > step.
> > >
> > > For drawing thousands of points, the speed-up is substantial. It's less
> > > substantial on more complex objects (rectangles give a factor of two
> > > improvement for ~1000 objects), due to the longer time it takes to draw
> > > the object itself, rather than make the call.
> > >
> > > Anyway, at the moment, Robin Dunn has the wrappers set up so that you
> > > can pass in a NumPy array (or, indeed, and sequence) rather than a list
> > > or tuple of coordinates, but it is faster to use a list than a NumPy
> > > array, because for arrays, it uses the generic PySequence_GetItem call.
> > > If we used the NumPy API directly, it should be faster than using a
> > > list, not slower! THis is how a representative section of the code looks
> > > now:
> > >
> > >
> > > bool      isFastSeq  = PyList_Check(pyPoints) ||
> > > PyTuple_Check(pyPoints);
> > > .
> > > .
> > > .
> > >                 // Get the point coordinants
> > >                 if (isFastSeq) {
> > >                     obj = PySequence_Fast_GET_ITEM(pyPoints, i);
> > >                 }
> > >                 else {
> > >                     obj = PySequence_GetItem(pyPoints, i);
> > >                 }
> > >
> > > .
> > > .
> > > .
> > >
> > > So you can see that if a NumPy array is passed in, PySequence_GetItem
> > > will be used.
> > >
> > > What I would like to do is have an isNumPyArray check, and then access
> > > the NumPy array directly in that case.
> > >
> > > The tricky part is that Robin does not want to have wxPython require
> > > Numeric. (Oh how I dream of the day that NumArray becomes part of the
> > > standard library!)
> > > How can I check if an Object is a NumPy array (and then use it as such),
> > > without including Numeric during compilation?
> > >
> > > I know one option is to have condition compilation, with a NumPy and
> > > non-Numpy version, but Robin is managing a whole lot of different
> > > version as it is, and I don't think he wants to deal with twice as many!
> > >
> > > Anyone have any ideas?
> > >
> > > By the way, you can substitute NumArray for NumPy in this, as it is the
> > > wave of the future, and particularly if it would be easier.
> > >
> > > -Chris
> > >
> > >
> > > --
> > > Christopher Barker, Ph.D.
> > > Oceanographer
> > >
> > > NOAA/OR&R/HAZMAT         (206) 526-6959   voice
> > > 7600 Sand Point Way NE   (206) 526-6329   fax
> > > Seattle, WA  98115       (206) 526-6317   main reception
> > >
> > > Chris.Barker at noaa.gov
> > >
> > >
> 
> -- 
> Christopher Barker, Ph.D.
> Oceanographer
>                                     		
> NOAA/OR&R/HAZMAT         (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> 
> Chris.Barker at noaa.gov
> 




From Jack.Jansen at oratrix.com  Wed Jan 15 14:18:05 2003
From: Jack.Jansen at oratrix.com (Jack Jansen)
Date: Wed Jan 15 14:18:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled  extension package.
In-Reply-To: <3E25A1E4.5CA8C453@noaa.gov>
Message-ID: <1D394963-28D7-11D7-AE69-000A27B19B96@oratrix.com>

On Wednesday, Jan 15, 2003, at 19:01 Europe/Amsterdam, Chris Barker 
wrote:

> Paul F Dubois wrote:
>>
>> If you could do:
>> try:
>>     import Numeric
>>     haveNumeric = 1
>> except ImportError:
>>     haveNumeric = 0
>>
>> in some initialization routine, then you could use this flag.
>> Alternatively, you could test on the fly
>> 'Numeric' in sys.modules
>
> Thanks, but I'm talking about doing this at the C++ level in an
> extension package, not at the Python level. This kind of thing is Soo
> much easier in Python, of course!

This can be done, but it is difficult, and you need the cooperation of 
both parties (Numeric and wxPython, in this case). The problem is that 
you need a way to pass C pointers from one extension module to the 
other. One of the pointers you want to pass is the PyTypeObject, so you 
can check that an object passed in from Python is of the correct type. 
Another is the address of some C routine that will get you a C pointer 
to the data. The first one may be visible from Python (so you can get 
at it through normal means) but the second one won't be.

The dirty way to do this (and you should probably avoid this) is to put 
these pointers into Python integers in the supplying module, and put 
them in the module namespace with a funny name 
(__ConvertToCPointerAddress). In wxPython you import Numeric, and if it 
succeeds you look up the funny name, convert the Python integer to a C 
pointer, cross your fingers, and call the address.

A cleaner way to do this is with cobject objects. These are in the 
core, in Objects/cobject.c. Numeric exports a cobject (again named 
__ConvertToCPointerAddress) with the address of the routine as the 
value. But, and this is the nice bit, cobjects can be passed along by 
Python code but can't be fiddled with. And cobject.c even provides a C 
function PyCObject_Import(char *modulename, char *attributename) which 
directly returns you the pointer you're looking for by importing the 
module, looking up the name, checking that it's a cobject and 
extracting the value.

And it even has support for "protocols": cobjects have an extra field 
called the description, again only settable and readable from C. 
Modules that don't know about each other's existence could still decide 
on a common description that would signify that the pointer in the 
cobject has a specific meaning. We could decide here that the C string 
"this pointer is a function that you pass one Python object and that 
returns the data just as Numeric would store it" fits that bill as the 
description, and anyone in the world writing an extension module could 
follow the protocol.
--
- Jack Jansen        <Jack.Jansen at oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -


From Jack.Jansen at oratrix.com  Wed Jan 15 14:34:05 2003
From: Jack.Jansen at oratrix.com (Jack Jansen)
Date: Wed Jan 15 14:34:05 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules
Message-ID: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>

Actually, wrt my previous message on cobjects for communicating between 
extension modules, we can do one better!

This is an idea I've been toying with for the MacPython extension 
types, and I think it's applicable to Numeric too. It goes as follows.

Each Numeric object has an attribute with a well-known name, let's call 
it "__Numeric_C_interface". This is a cobject, and it is shared among 
all Numeric objects of the same type. The value of this cobject is a 
pointer to a C structure with pointers to all the C routines you might 
want to call on the object, basically the PyArray_API structure (I 
think). The descr of the cobject is a string with the version number 
of this particular PyArray_API structure.

An extension module that knows about this protocol and gets passed an 
object that it thinks might be a Numeric array checks whether the object 
has a __Numeric_C_interface attribute. If so, it retrieves it, checks 
that it is a cobject, gets the descriptor and tests it for 
compatibility, and if it is compatible gets the cobject pointer and 
happily calls all the Numeric routines it needs.
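
The lookup described above can be mocked up in Python to show the control
flow. Everything here is a stand-in: Interface plays the role of the cobject,
its description field the version string, and INTERFACE_ATTR, WANTED_VERSION,
FakeArray and get_interface are hypothetical names. A single leading
underscore is used for the attribute because Python mangles double-underscore
names inside class bodies.

```python
from collections import namedtuple

# Stand-in for a cobject: a description plus a table of routines.
Interface = namedtuple("Interface", ["description", "functions"])

INTERFACE_ATTR = "_numeric_c_interface"   # hypothetical well-known name
WANTED_VERSION = "array-api 1"            # hypothetical version string


class FakeArray:
    """Toy array type that publishes the interface on the class."""

    # Shared by every instance of the type, as in the proposal.
    _numeric_c_interface = Interface(
        WANTED_VERSION,
        {
            "size": lambda a: len(a.items),
            "item": lambda a, i: a.items[i],
        },
    )

    def __init__(self, items):
        self.items = list(items)


def get_interface(obj, wanted=WANTED_VERSION):
    """Return the routine table if obj follows the protocol, else None."""
    iface = getattr(obj, INTERFACE_ATTR, None)
    if not isinstance(iface, Interface):   # attribute missing or wrong kind
        return None
    if iface.description != wanted:        # incompatible version
        return None
    return iface.functions
```

A caller never imports the providing module; it only needs to agree on the
attribute name and the description string.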
--
- Jack Jansen        <Jack.Jansen at oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -


From falted at openlc.org  Thu Jan 16 04:00:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Thu Jan 16 04:00:03 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules
In-Reply-To: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
Message-ID: <200301161259.13522.falted@openlc.org>

On Wednesday 15 January 2003 23:33, Jack Jansen wrote:
> Actually, wrt my previous message on cobjects for communicating between
> extension modules, we can do one better!
>
> This is an idea I've been toying with for the MacPython extension
> types, and I think it's applicable to Numeric too. It goes as follows.
>
> Each Numeric object has an attribute with a well-known name, let's call
> it "__Numeric_C_interface". This is a Cobject, and it is shared among
> all Numeric objects of the same type. The value of this C object is a
> pointer to a C structure with pointers to all the C routines you might
> want to call on the object, basically the PyArray_API structure (I
> think). The descr of the C object is a string with the version number
> of this particular PyArray_API structure.
>
> An extension module that knows about this protocol and gets passed an
> object that it thinks might be a Numeric array checks whether the object
> has an __Numeric_C_interface attribute. If so it retrieves it, checks
> that it is a Cobject, gets the descriptor and tests it for
> compatibility and if it is compatible gets the cobject pointer and
> happily calls all the Numeric routines it needs.

That's a nice idea. But I see two drawbacks:

- numarray needs to be reworked to include the Cobject descriptors, although
I don't know whether this would be difficult.

- you still need to have Numeric or numarray installed on the client
machine. This may be the usual case, but what about extensions that want
to use Numeric internally (for a number of reasons, such as better number
representation, a convenient interface to C, etc.) without forcing the user
to install it?

However, designing a small library with a minimalist API (I'm thinking of
something similar to zlib) could be very handy in allowing extensions (and
also native Python modules) to deal with numarray objects.

As I said before, this would require the user to install only this small
library, but it could also be bundled with the application or package.
However, this second alternative can be tricky, as Chris Barker has pointed
out, because of the different numarray versions that will appear in the
future. But IMO a number of factors may alleviate this handicap:

- The numarray data structure should be very stable, as improvements are
normally made at the functionality level.

- The library should provide a minimalistic, high level API that, if it is
well designed, should cope with small modifications in the numarray data
structures. 

- Finally, when such differences have to be accommodated and doing so would
break the current API, that version should be marked as a major release, so
that existing extensions (or whatever software embeds the library) know that
they have to release new versions if they want to support the newest
objects. But, hopefully, that should happen quite infrequently.

Of course, this small library should cope with both numarray and Numeric (at
least, the not too old versions of it) objects. But I think this shouldn't
pose a big problem, as the current numarray API can already do that.

This logical separation between structure and functionality might also lead
to better acceptance among numerical software craftsmen, as they can be
more confident that the API for dealing with numarray objects will remain
quite stable over time.

Well, this is just a thought. I must confess that I'm so interested in this
issue because I really want to support numarray objects in my project, and
I'm just wondering what the best way to do that is without creating too
much of a nuisance for the users. In fact, I'm considering building such a
library myself, but that could be a waste of time if I have to redo it for
every numarray release.

Cheers,

-- 
Francesc Alted


From peter.chang at nottingham.ac.uk  Thu Jan 16 08:47:04 2003
From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk)
Date: Thu Jan 16 08:47:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
  extension package.
In-Reply-To: <3E25C99A.9D5E1888@noaa.gov>
Message-ID: <Pine.LNX.4.44.0301161400450.27474-100000@eexpc1.eee.nott.ac.uk>

On Wed, 15 Jan 2003, Chris Barker wrote:
[...]

> My idea is that I could make sure Numeric was around at compile time, so
> I could use all the Numeric API to access the array data, but it
> wouldn't have to be installed at runtime, as none of the Numeric calls
> would be executed if Numeric hadn't been imported. Would this work, or
> would the system try to load the .dll or .so or whatever even if the
> calls weren't executed?

One way is to import a dynamic library, explicitly, which has glue code to
handle the array objects when you need them.

[...]

> My timing indicated something different, but I'm using Linux/wxGTK/X11,
> and I think the drawing calls return after the message has been sent to
> X, but X may not have completed the actual drawing yet.

That's right. X's communication model between client and server is
asynchronous.

> This means that I'm not timing the whole process, and if I did, I might
> not see such a difference.

You can synchronise the output buffer using XSync(3) and then do the 
timing.

Peter


From Chris.Barker at noaa.gov  Thu Jan 16 09:58:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 09:58:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another 
 compiledextension package.
References: <Pine.LNX.4.44.0301161400450.27474-100000@eexpc1.eee.nott.ac.uk>
Message-ID: <3E26E45F.3C7E2293@noaa.gov>

peter.chang at nottingham.ac.uk wrote:

> You can synchronise the output buffer using XSync(3) and then do the
> timing.

I'd love to try this, but I confess I have no idea how! I'm working with
the *.i files that tell swig what to add when creating wrappers around
wxWindows for Python. wxWindows is using wxGTK, which is using GTK,
which is using Xlib (I think), so I'm pretty far away from X, and I
barely know enough C/C++ to attempt this.

I suppose I could try including Xlib, then calling XSync, but I need to
pass a reference to a display. I have no idea how to get that. 

Any hints?

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From Chris.Barker at noaa.gov  Thu Jan 16 10:33:07 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 10:33:07 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
Message-ID: <3E26EC9D.A0B7D173@noaa.gov>

Jack Jansen wrote:

> An extension module that knows about this protocol and gets passed an
> object that it thinks might be a Numeric array checks whether the object
> has an __Numeric_C_interface attribute. If so it retrieves it, checks
> that it is a Cobject, gets the descriptor and tests it for
> compatibility and if it is compatible gets the cobject pointer and
> happily calls all the Numeric routines it needs.

Wow Jack! Are you single-handedly going to implement all my pet projects that
I'm too stupid to know how to do myself? (The other one was Universal
text file support.)

I can only barely follow what you're suggesting, but I still have a
question about it. It seems that while this would provide a way for an
extension module to identify whether an object was a Numeric array, and
then get a pointer to it, how would it know the API for dealing with the
arrays without the Numeric header file? Or would you have to include
the header file when compiling, but not need the library at runtime
unless it was actually used? That seems a reasonable compromise.

If this would work, I think it's a great idea. Short of including
NumArray with the standard library (which I imagine is at least a couple
of Python releases away), it would be a great solution for folks that
are writing extensions that they want to be able to take advantage of
Numeric when it's there, but not require it.

Do any of the primary Numarray developers think this is a good and
doable idea?

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From peter.chang at nottingham.ac.uk  Thu Jan 16 11:22:03 2003
From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk)
Date: Thu Jan 16 11:22:03 2003
Subject: [Numpy-discussion] Optionally using Numeric in another 
 compiledextension package.
In-Reply-To: <3E26E45F.3C7E2293@noaa.gov>
Message-ID: <Pine.LNX.4.44.0301161804420.27474-100000@eexpc1.eee.nott.ac.uk>

On Thu, 16 Jan 2003, Chris Barker wrote:

> peter.chang at nottingham.ac.uk wrote:
> 
> > You can synchronise the output buffer using XSync(3) and then do the
> > timing.

Oops, that should be XSynchronize(3).

[...]

> I suppose I could try including Xlib, then calling XSync, but I need to
> pass a reference to a display. I have no idea how to get that. 
> 
> Any hints?

wxGetDisplayName() gives the Display name but not a pointer to the display 
structure. So this is not much help.

In gtk+, any program can be called with --sync to aid debugging. I'd guess 
wxWindows may allow you to do the same.

Peter


From jmiller at stsci.edu  Thu Jan 16 12:06:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan 16 12:06:05 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov>
Message-ID: <3E271006.4000607@stsci.edu>

Chris Barker wrote:

>Jack Jansen wrote:
>
>  
>
>>An extension module that knows about this protocol and gets passed an
>>object that it thinks might be a Numeric array checks whether the object
>>has an __Numeric_C_interface attribute. If so it retrieves it, checks
>>that it is a Cobject, gets the descriptor and tests it for
>>compatibility and if it is compatible gets the cobject pointer and
>>happily calls all the Numeric routines it needs.
>>    
>>
>
>Wow Jack! Are you single-handedly going to implement all my pet projects that
>I'm too stupid to know how to do myself? (The other one was Universal
>text file support.)
>
>I can only barely follow what you're suggesting, but I still have a
>question about it. It seems that while this would provide a way for an
>extension module to identify whether an object was a Numeric array, and
>then get a pointer to it, how would it know the API for dealing with the
>arrays without the Numeric header file? Or would you have to include
>the header file when compiling, but not need the library at runtime
>unless it was actually used? That seems a reasonable compromise.
>
>If this would work, I think it's a great idea. Short of including
>NumArray with the standard library (which I imagine is at least a couple
>of Python releases away), it would be a great solution for folks that
>are writing extensions that they want to be able to take advantage of
>Numeric when it's there, but not require it.
>
>Do any of the primary Numarray developers think this is a good and
>doable idea?
>  
>
Roll out the time machine...  it's already done.

As long as you don't define the macros PY_ARRAY_UNIQUE_SYMBOL or 
NO_IMPORT_ARRAY,  any file that includes arrayobject.h gets a static 
copy of PyArray_API.

If the module executes import_array() at an appropriate time,  normally 
module initialization, but not necessarily,  the static PyArray_API gets 
filled in and becomes usable.    The import_array() call is critical; 
 without it,  API calls through the static PyArray_API are calls to NULL 
and segfault.

I think that if Numeric is not present,  and you call import_array(),   
it will fail quietly but leave the Python error status set.   So it 
might make sense to call PyErr_Clear() after doing import_array().  

>-Chris
>
So it sounds like your whole "weak linkage" scheme is plausible now with 
Numeric (maybe even numarray!), as would be a minimal API module.

1.  We discussed yesterday how to determine if an object is a Numeric 
array w/o even compiling with arrayobject.h.   The important idea there 
was that if Numeric is not present,  the "isarray" (or whatever) 
function will return false rather than segfaulting because the API 
pointer isn't filled in.

2. Call API functions in contexts where you know you're looking at 
Numeric arrays, i.e.,  right after isarray().  This creates a guard 
which prevents you from calling API functions when Numeric is not present.

3.  Call import_array() at some time before using the API functions, 
 possibly at module init time, failing quietly and clearing the error in 
installations where Numeric is not installed.
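
The three steps above can be sketched in Python, keeping only the guard
logic and leaving out the C details. This is a rough illustration under
stated assumptions: `load_array_api`, `is_array` and `total_size` are
hypothetical helper names, and the sketch assumes the array package
exposes `ArrayType` and `size` the way the Numeric module does.

```python
import importlib

_array_api = None   # stays None until load_array_api() succeeds


def load_array_api(module_name="Numeric"):
    """Step 3: try to import the array package, failing quietly if absent."""
    global _array_api
    try:
        _array_api = importlib.import_module(module_name)
    except ImportError:
        # Analogous to clearing the error status after a failed import_array().
        _array_api = None
    return _array_api is not None


def is_array(obj):
    """Step 1: safe in every context; False when the package is missing."""
    if _array_api is None:
        return False
    array_type = getattr(_array_api, "ArrayType", None)  # assumed attribute
    return array_type is not None and isinstance(obj, array_type)


def total_size(obj):
    """Step 2: touch the array API only behind the is_array() guard."""
    if is_array(obj):
        return _array_api.size(obj)   # assumed function of the array package
    return len(obj)                   # generic fallback for plain sequences
```

On an installation without the array package, load_array_api() returns
False, is_array() returns False for every object, and total_size()
always takes the generic path.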


Todd


From jmiller at stsci.edu  Fri Jan 17 14:16:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 17 14:16:03 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov>
Message-ID: <3E288068.3070407@stsci.edu>

Take a look at the attached extension module "testlite" which 
demonstrates the technique I evolved from this discussion. As we 
discussed,  this usage pattern enables the construction of an extension 
which will take advantage of numarray if it is there,  but will continue 
to work if the user has not installed numarray.  Here's how it works:

1. I created a new API function,  PyArray_isArray() which is safe to 
call in all contexts.  I defined it as:

 #define PyArray_isArray(o) (PyArray_API && NA_isNumArray(o))

I added NA_isNumArray(o) to the numarray C-API because it was the easy 
way  to do it.

2. Ordinary API functions are safe to call once an object has been 
identified to be a numarray because it implies (locally) that the 
PyArray_API pointer has been initialized.

3. I tried out the standard import_array() code and added some cleanup 
for the case where numarray is not installed.  

The only caveat I see at this point is that you are required to include 
numarray headers in order to use this.  In numarray's case,  this might 
necessitate header updates and/or function call modifications.  The 
numarray C-API should stabilize pretty soon,  but I don't think it's 
quite there yet.

The same approach should apply to Numeric.

This stuff is in numarray CVS now and should be in the next numarray 
release.

Todd


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testlite.c
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030117/b4f79444/attachment.c>

From haase at msg.ucsf.edu  Fri Jan 17 14:25:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Jan 17 14:25:04 2003
Subject: [Numpy-discussion] make C array accessible to python without copy
Message-ID: <03fa01c2be77$4cae4430$3b45da80@rodan>

Hi,
What is the C API to make an array that got allocated,
let's say, by  a = new short[512*512],
accessible to Python as a numarray?

I tried NA_New - but that seems to make a copy.
I would need it to use the original memory space
so that I can "observe" the array from Python WHILE
the underlying C array changes (it's actually a camera image)

Thanks,
Sebastian Haase


From jmiller at stsci.edu  Fri Jan 17 15:17:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 17 15:17:01 2003
Subject: [Numpy-discussion] make C array accessible to python without
 copy
References: <03fa01c2be77$4cae4430$3b45da80@rodan>
Message-ID: <3E288EB1.80107@stsci.edu>

Sebastian Haase wrote:

>Hi,
>What is the C API to make an array that got allocated,
>let's say, by  a = new short[512*512],
>accessible to Python as a numarray?
>
What you want to do is not currently supported well in C.  The way to do 
what you want is:

1.  Create a buffer object from your C++ array.  The buffer object can 
be built such that it refers to the original copy of the data.

2.  Call  back into Python (numarray.NumArray) with your buffer object 
as the buffer parameter.

You can scavenge the code in NA_newAll (Src/newarray.ch) for most of the 
callback.
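
The idea of a buffer object that refers to the original memory can be
illustrated with Python's standard library alone. This is only a sketch
of the no-copy principle, not the numarray callback itself: here an
`array.array` stands in for the C++ frame buffer and a `memoryview`
stands in for the buffer object, and the 2x3 frame size is made up.

```python
from array import array

ROWS, COLS = 2, 3

# Stand-in for the C++ frame buffer (new short[ROWS*COLS]).
frame = array("h", [0] * (ROWS * COLS))

# The "buffer object": it refers to frame's memory, no copy is made.
# The cast goes through bytes because memoryview.cast() requires one side
# of every cast to be a byte format.
view = memoryview(frame).cast("B").cast("h", shape=[ROWS, COLS])

frame[4] = 123        # the "camera" writes a new pixel (row 1, column 1)
pixel = view[1, 1]    # the observer sees the change without any copy
```

Because the view shares the memory, every later write to the frame is
visible through it immediately.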

>I tried NA_New - but that seems to make a copy.
>I would need it to use the original memory space
>so that I can "observe" the array from Python WHILE
>the underlying C array changes (it's actually a camera image)
>
That sounds cool!

>
>Thanks,
>Sebastian Haase
>


From falted at openlc.org  Sat Jan 18 01:23:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 18 01:23:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray
Message-ID: <200301181022.07015.falted@openlc.org>

Hi,

I'm trying to make a C array from a Numeric "c" (Character) typecode array
using the high level call:

NA_InputArray(PyObject *numarray, NumarrayType t, int requires)

with no success.

As I have been able to access all the other types (i.e.
'1','b','s','i','l','f','d') successfully, perhaps character type is not
supported?

In the NumarrayType enum, there is no tChar. I've tried tUInt8 and tAny
as the value for the NumarrayType parameter, but both choices give the same
error:

Traceback (most recent call last):
  File "table-tree2.py", line 77, in ?
    h5file.createArray('/columns', 'name', array(names), "Name column")
  File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in 
createArray
    setattr(group, name, object)
  File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in 
__setattr__
    value._f_putObjectInTree(name, self)
  File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in 
_f_putObjectInTree
    self.create()
  File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in 
create
    self.createArray(self.object, self.title)
  File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, 
in createArray
    array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY)
libnumarray.error: getShape: sequence object nested more than MAXDIM deep.

although I was passing only a Numeric 'c' array with a rather small shape (10,16).

I just want to access the buffer data, and the shape of this object from C
(well, I'm actually using Pyrex, but I think this is not important). Is that
possible by only using numarray C calls?

Thanks,

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 18 08:27:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 18 08:27:04 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
Message-ID: <3E2983C3.7000304@stsci.edu>


Francesc Alted wrote:

>Hi,
>
>I'm trying to make a C array from a Numeric "c" (Character) typecode array
>using the high level call:
>
>NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
>
Unified handling of character arrays and numeric arrays doesn't exist 
yet in numarray.  There is no C-API for the chararray module because we 
haven't needed one.  But CharArrays are NDArrays and have attributes 
stored in PyArrayObjects just like numarrays.

>with no success.
>
>As I have been able to access all the other types (i.e.
>'1','b','s','i','l','f','d') successfully, perhaps character type is not
>supported?
>
>In the NumarrayType enum, there is no tChar. I've tried tUInt8 and tAny
>as the value for the NumarrayType parameter, but both choices give the same
>error:
>
>Traceback (most recent call last):
>  File "table-tree2.py", line 77, in ?
>    h5file.createArray('/columns', 'name', array(names), "Name column")
>  File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in 
>createArray
>    setattr(group, name, object)
>  File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in 
>__setattr__
>    value._f_putObjectInTree(name, self)
>  File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in 
>_f_putObjectInTree
>    self.create()
>  File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in 
>create
>    self.createArray(self.object, self.title)
>  File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, 
>in createArray
>    array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY)
>libnumarray.error: getShape: sequence object nested more than MAXDIM deep.
>
NA_InputArray was intended to accept non-numeric sequences.  It could 
report this better...

>although I was passing only a Numeric 'c' with a rather small shape (10,16).
>
>I just want to access the buffer data, and the shape of this object from C
>(well, I'm actually using Pyrex, but I think this is not important). Is that
>possible by only using numarray C calls?
>
Look at Lib/chararray.py and Src/_chararraymodule.c.

If you can handle using a CharArray or RawCharArray, try:

1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in 
the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.

2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.

3. shape, strides, and itemsize should be directly accessible from the 
PyArrayObject.

CharArray has some extra stripping and padding semantics; these are lazy 
and hence absent without extra care in C.  RawCharArray has none.

CharArrays are really arrays of fixed length strings of bytes.  The 
string length is defined by the array itemsize.
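
To make the roles of byteoffset, strides and itemsize concrete, here is a
small Python sketch of the address arithmetic. It does not use the
numarray C-API; `element_bytes` is a hypothetical helper and the buffer
layout (an 8-byte header followed by a 2x2 array of 4-byte strings) is
invented for the example.

```python
def element_bytes(buf, byteoffset, strides, itemsize, index):
    """Return the raw itemsize bytes of the element at a multi-dimensional index."""
    start = byteoffset + sum(i * s for i, s in zip(index, strides))
    return bytes(buf[start:start + itemsize])


# Hypothetical layout: 8 header bytes, then a 2x2 array of 4-byte strings.
itemsize = 4
buf = b"HEADER.." + b"ab  " + b"cd  " + b"efgh" + b"ij  "
byteoffset = 8
strides = (2 * itemsize, itemsize)     # row-major: 8 bytes per row, 4 per column

raw = element_bytes(buf, byteoffset, strides, itemsize, (1, 1))
stripped = raw.rstrip(b" ")            # CharArray-style stripping, done by hand
```

The raw element keeps its padding, as a RawCharArray element would; the
stripping step has to be applied explicitly when working at this level.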

>Thanks,
>
>  
>
Todd


From falted at openlc.org  Sat Jan 18 10:18:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 18 10:18:02 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
In-Reply-To: <3E2983C3.7000304@stsci.edu>
References: <3E2983C3.7000304@stsci.edu>
Message-ID: <200301181917.29533.falted@openlc.org>

On Saturday 18 January 2003 17:41, Todd Miller wrote:
> >I just want to access the buffer data, and the shape of this object from C
> >(well, I'm actually using Pyrex, but I think this is not important). Is
> > that possible by only using numarray C calls?
>
> Look at Lib/chararray.py and Src/_chararraymodule.c.
>
> If you can handle using a CharArray or RawCharArray, try:
>
> 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in
> the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.
>
> 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.
>
> 3. shape, strides, and itemsize should be directly accessible from the
> PyArrayObject.

Ok. I'll try to do that.

>
> CharArray has some extra stripping and padding semantics; these are lazy
> and hence absent without extra care in C.  RawCharArray has none.
>

By the way, is it safe to assume that CharArray objects are contiguous? What
about RawCharArray? The same question goes for RecArray objects. Or is it
always advisable to check with the iscontiguous() method? In case
these objects can be non-contiguous, I guess there is still no function
like NA_InputArray that works with CharArray or RecArray objects in order to
obtain well-behaved objects. Is that true?

I think it will be possible for me to include support for numarray objects
in the next release of PyTables. Thanks!

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 18 11:57:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 18 11:57:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org>
Message-ID: <3E29B52C.2030602@stsci.edu>

Francesc Alted wrote:

>On Saturday 18 January 2003 17:41, Todd Miller wrote:
>  
>
>>>I just want to access the buffer data, and the shape of this object from C
>>>(well, I'm actually using Pyrex, but I think this is not important). Is
>>>that possible by only using numarray C calls?
>>>      
>>>
>>Look at Lib/chararray.py and Src/_chararraymodule.c.
>>
>>If you can handle using a CharArray or RawCharArray, try:
>>
>>1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in
>>the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.
>>
>>2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.
>>
>>3. shape, strides, and itemsize should be directly accessible from the
>>PyArrayObject.
>>    
>>
>
>Ok. I'll try to do that.
>
>  
>
>>CharArray has some extra stripping and padding semantics; these are lazy
>>and hence absent without extra care in C.  RawCharArray has none.
>>
>>    
>>
>
>By the way, is it safe to assume that CharArray objects are contiguous? What
>about RawCharArray?
>
Mostly no.   Each fixed length element is stored as a contiguous 
sequence of bytes.  Anything goes for the rest,  so you need to look at 
the strides arrays and byteoffset.

>The same question goes for RecArray objects. 
>
No.  It's possible to select every 10th record, for instance, in a 
slice.  I believe the resulting decimated array would be a discontiguous 
view of the original.  

>Or is it always
>advisable to check with the iscontiguous() method?
>
I'm not even certain the method works correctly for chararray and 
recarray.  

I think the portion of chararray that has been written in C considers 
array strides.  recarray is pure Python.  In both cases,  I think I'd just 
forget about contiguity and use the strides arrays.

> In case
>these objects can be non-contiguous, I guess there's still not a function
>like NA_InputArray that works with CharArray or RecArray objects in order to
>obtain well-behaved objects. Is that true?
>
True.  But neither recarray nor chararray really has behavedness 
problems like misalignment, byteswapping, or type conversion.  I think 
contiguity is the only issue, and that is solved just by calling .copy().  
You might argue that records contain byteswapped and misaligned fields.  
I don't have an immediate answer to that.

My preference is to use strides and forget about contiguity,  but you 
could also make contiguous copies simply.  No one I'm aware of has yet 
tried access to misbehaved records in C.
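
Both options can be sketched in a few lines of Python. This is an
illustration of the arithmetic only, not numarray code: `is_contiguous`
and `gather` are hypothetical helpers, and the example takes every
second record of a made-up buffer of ten 4-byte records.

```python
def is_contiguous(shape, strides, itemsize):
    """True if the elements are packed in row-major order with no gaps."""
    expected = itemsize
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim > 1 and stride != expected:
            return False
        expected *= dim
    return True


def gather(buf, byteoffset, shape, strides, itemsize):
    """Copy a 1-D strided view into a new contiguous bytes object."""
    (n,), (step,) = shape, strides
    return b"".join(
        buf[byteoffset + i * step: byteoffset + i * step + itemsize]
        for i in range(n)
    )


records = bytes(range(40))          # ten hypothetical 4-byte records
shape, strides = (5,), (8,)         # the view records[::2]: 5 items, 8 bytes apart

packed = gather(records, 0, shape, strides, 4)
```

Walking the strides directly avoids the copy; gather() corresponds to the
"make a contiguous copy" route.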

>
>I think it will be possible for me to include support for numarray objects
>in the next release of PyTables. 
>
Great!

>Thanks!,
>  
>


From verveer at embl.de  Sun Jan 19 06:39:09 2003
From: verveer at embl.de (verveer at embl.de)
Date: Sun Jan 19 06:39:09 2003
Subject: [Numpy-discussion] numarray bug?
Message-ID: <1042987080.3e2ab8489e640@webmail.EMBL-Heidelberg.DE>

Hi, 
 
The following gives an error: 
 
>>> print numarray.Int8 == numarray.Any 
Traceback (most recent call last): 
  File "<stdin>", line 1, in ? 
  File "/usr/local/lib/python2.2/site-packages/numarray/numerictypes.py", line 
102, in __cmp__ 
    return genericTypeRank.index(self.name) - 
genericTypeRank.index(other.name) 
ValueError: list.index(x): x not in list 
 
A bug? 
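
For what it's worth, an equality test between type objects does not have
to go through the rank list at all. A minimal sketch, assuming a
hypothetical `TypeTag` class (this is not numarray's actual type class)
and an illustrative rank list:

```python
class TypeTag:
    """Hypothetical stand-in for a numarray type object."""

    _rank = ["Bool", "Int8", "Int16", "Int32", "Float32", "Float64"]

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Equality never consults the rank list, so it cannot raise.
        if not isinstance(other, TypeTag):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)

    def __lt__(self, other):
        # Ordering is only defined between ranked types.
        if not isinstance(other, TypeTag):
            return NotImplemented
        try:
            return self._rank.index(self.name) < self._rank.index(other.name)
        except ValueError:
            return NotImplemented


Int8, Any = TypeTag("Int8"), TypeTag("Any")
same = (Int8 == Any)       # False, no exception
```

With this split, comparing a ranked type with an unranked one, or with a
plain string, yields False for equality, while ordering stays restricted
to the ranked types.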
 
Cheers, Peter 
 
-- 
Dr. Peter J. Verveer 
Cell Biology and Cell Biophysics Programme 
EMBL 
Meyerhofstrasse 1 
D-69117 Heidelberg 
Germany 
Tel. : +49 6221 387245 
Fax  : +49 6221 387242 
Email: verveer at embl-heidelberg.de 
 
 
From falted at openlc.org  Mon Jan 20 04:17:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 20 04:17:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
In-Reply-To: <3E29B52C.2030602@stsci.edu>
References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> <3E29B52C.2030602@stsci.edu>
Message-ID: <200301201316.06127.falted@openlc.org>

On Saturday 18 January 2003 21:12, Todd Miller wrote:
> >By the way, is it safe to assume that CharArray objects are contiguous? What
> >about RawCharArray?
>
> Mostly no.   Each fixed length element is stored as a contiguous
> sequence of bytes.  Anything goes for the rest,  so you need to look at
> the strides arrays and byteoffset.
>
> >The same question goes for RecArray objects.
>
> No.  It's possible to select every 10th record, for instance, in a
> slice.  I believe the resulting decimated array would be a discontiguous
> view of the original.
>
> >Or is it always
> >advisable to check with the iscontiguous() method?
>
> I'm not even certain the method works correctly for chararray and
> recarray.

Well, during my tests with numarray 0.4, iscontiguous() seems to work well,
both for chararrays and recarrays.

> In both cases,  I think I'd just forget about
> contiguity and use the strides arrays.

Yeah, but I still want to use the iscontiguous() method just to speed up
the code a bit.

> You might argue that  records contain
> byteswapped and misaligned fields.   I don't have an immediate answer to
> that.

Exactly. I am pondering how to deal with HDF5 objects coming from machines
with a different endianness (misalignment is not a problem in my case) than
the local machine. But I think I can manage that by creating recarray
buffers with the byteorder parameter set appropriately during the HDF5 table
reads. Then all the data can be read correctly, because numarray will
byteswap the data whenever the recarray is accessed.

Moreover, if the object is to be used frequently, I can speed up access
to the recarray by byteswapping the columns (as arrays) using their
byteswap() method. In the future it would be nice to provide a generic
byteswap method for recarrays.
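
The "declare the byte order at read time" idea can be sketched with the
standard struct module. This is not the recarray or HDF5 code, just an
illustration: `read_records` is a hypothetical helper and the record
layout (one 32-bit integer and one 64-bit float per row) is invented.

```python
import struct

# Hypothetical record layout: one 32-bit int and one 64-bit float per row.
FIELDS = "id"          # struct codes: i = int32, d = float64


def read_records(buf, byteorder):
    """Decode packed records written with the given byte order ('big' or 'little')."""
    prefix = ">" if byteorder == "big" else "<"
    rec = struct.Struct(prefix + FIELDS)     # '<' and '>' imply no alignment padding
    return [rec.unpack_from(buf, off) for off in range(0, len(buf), rec.size)]


# Data as a big-endian machine would have written it.
foreign = struct.pack(">id", 1, 0.5) + struct.pack(">id", 2, 1.5)
rows = read_records(foreign, "big")
```

The buffer itself is never rewritten; the values are swapped on every
access, which is why an in-place byteswap pays off for data that is read
often.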

Thanks,

-- 
Francesc Alted


From falted at openlc.org  Mon Jan 20 11:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 20 11:02:02 2003
Subject: [Numpy-discussion] recarray2 re-visited
Message-ID: <200301202000.53584.falted@openlc.org>

Hi,

As I needed a byteswap() method for recarray, after a bit of hacking I've
made one myself. It is based on my own version of recarray, to take
advantage of the _fields cache so as to both speed up and simplify the new
code.

Basically, the new method takes a recarray, checks which columns are
numarray arrays and invokes their byteswap() method if needed. Easy, but
effective. Moreover, _byteswap() and togglebyteorder() methods are provided
to be compatible with the existing methods of NumArray objects.
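
The per-column approach amounts to something like the following sketch.
It is not the code from recarray2.py: it uses the standard `array` module
in place of numarray columns, a plain dict in place of the recarray, and
`byteswap_columns` is a hypothetical name.

```python
from array import array


def byteswap_columns(columns):
    """Byteswap every numeric column in place; leave other columns untouched."""
    for col in columns.values():
        if isinstance(col, array):      # numeric column
            col.byteswap()
    return columns


table = {
    "name": ["a", "b"],                        # string column, not swapped
    "count": array("H", [0x0102, 0x0304]),     # 16-bit unsigned column
}
byteswap_columns(table)
```

Applying the function twice restores the original values, which is the
behaviour one would expect from a byteswap() method.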

As a plus, the recarray __str__ has been modified so that printing takes
the byteorder of the recarray into account, and printing is now faster by a
factor of about 30, which can be handy in some situations.

Do with it whatever you want,

-- 
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2.py
Type: text/x-python
Size: 21435 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030120/b7a180d9/attachment.py>
-------------- next part --------------
recarray shape in test ==> (10000,)
Assignment in recarray original
-------------------------------
Assign time: 1.24  Rows/s: 8064

Assignment in recarray modified
-------------------------------
Assign time: 0.16  Rows/s: 62499  Speed-up: 7.75

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.53  Rows/s: 6535

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.15  Rows/s: 66666  Speed-up: 10.2

Printing in recarray original
------------------------------
Print time: 18.11  Rows/s: 552

Printing in recarray modified
------------------------------
Print time: 0.63  Rows/s: 15872  Speed-up: 28.746

-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2-test.py
Type: text/x-python
Size: 2946 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030120/b7a180d9/attachment-0001.py>

From falted at openlc.org  Tue Jan 21 08:01:13 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 08:01:13 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
Message-ID: <200301211744.55666.falted@openlc.org>

Hi,

Is anybody aware of any function (either in C or Python or a mixture of
both) to easily convert Numerical Python arrays from/to numarray arrays?

I mean, I would like a function that, without having to copy all the data
element by element, is able to copy the data buffer (or even share the same
buffer, if that is possible at all) from one object to the other.

Thanks,

-- 
Francesc Alted


From haase at msg.ucsf.edu  Tue Jan 21 10:41:07 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Tue Jan 21 10:41:07 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
References: <200301211744.55666.falted@openlc.org>
Message-ID: <051501c2c17c$a83e8410$3b45da80@rodan>

Hi,
I think this is actually quite related to my post from Friday:
[Numpy-discussion] make C array accessible to python without copy

So, to reformulate: who actually holds the array data in memory? Or: where
does the memory get allocated, and how many pointers to it exist, and where?
I understood from Todd Miller's answer that there is such a thing as a
"buffer object" that does all the work. So one would just have to take that
and build a "new" numarray or Numeric structure around it (referring to the
subject of this email), or (in the case of my Friday email) just have that
"buffer object" point to a different memory space (one already allocated by
the C program).

Agree? (Did I get it right?)

Sebastian Haase


From falted at openlc.org  Tue Jan 21 11:24:08 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 11:24:08 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
In-Reply-To: <3E2D74A2.40204@stsci.edu>
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu>
Message-ID: <200301212005.30328.falted@openlc.org>

On Tuesday 21 January 2003 17:26, you wrote:
> Francesc Alted wrote:
> >Anybody is aware of any function (either in C or Python or a mixture of
> >both) to easily convert Numerical Python arrays from/to numarray arrays?
>
> I think you should look at numarray.fromlist() and NumArray.tolist().  I
> think fromlist() will work on a nested sequence object,  and hence a
> Numeric array.

Yeah, I knew that, but I was looking for something more optimal.

>
> >I mean, I would like to use such a funtion that, without having to copy
> >element by element all the data, be able to copy the data buffer (or even
> >use the same if possible at all) from one object to the other.
>
> I have not looked at this yet;   it's a very good question.  Note that
> going from numarray to Numeric there are issues with making the buffer
> well-behaved.

I think this should not be too difficult to achieve, and I'll try to explain
why.

When going from numarray to Numeric, numarray already has the NA_InputArray
C-API function, which returns a well-behaved array. But strictly speaking, we
don't even need a well-behaved array (that is too restrictive a condition),
as both Numeric and numarray support discontiguous data. Even the byteorder
should not be a problem because, as Numeric itself has no such property,
we can create a Numeric array that is in native order as the result and
byteswap the numarray object (if needed) before doing the conversion.

So, non-alignment remains the only issue that may cause a buffer copy
during numarray ==> Numeric conversion. Is that correct? If so, is a
workaround possible, i.e. can we still get a Numeric array from a misaligned
numarray object without copying the data?

Regarding the other direction (i.e. Numeric ==> numarray), as
numarray supports discontiguity, misalignment and byteswapped data, this
conversion should not imply a data buffer copy at all.

Once we have a pointer to the data buffer, it is only a matter of
wrapping a Numeric or numarray object around it, getting the metadata from
the original object, and returning the new object as a result.
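
As a rough illustration of the wrapping step, here is a minimal sketch in
modern Python using only the standard library. array.array stands in for the
object that owns the data and memoryview for the second object wrapped around
the same buffer; neither is the Numeric or numarray API. The point is only
that the wrapper picks up the metadata (format, item size, shape) from the
original while sharing its memory.

```python
from array import array

source = array("d", [1.0, 2.0, 3.0])   # the object that owns the data buffer
shared = memoryview(source)             # a second object wrapped around the same memory

# Metadata comes from the original object; no element is copied.
print(shared.format, shared.itemsize, shared.shape)

shared[1] = 20.0                        # write through the wrapper
print(source[1])                        # the owner sees the change, so the buffer is shared
```

The same mechanism also shows the memory management concern: the owner must
stay alive (and unresized) for as long as the wrapper exists.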

All in all, this conversion does not *seem* to be too difficult a task.

Making such conversion functions available (in C, but also with Python
counterparts) might open the door to the co-existence of Numeric and
numarray objects in the same program, and that would ease numarray
deployment in existing Numeric software.

Comments?

-- 
Francesc Alted


From falted at openlc.org  Tue Jan 21 11:24:11 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 11:24:11 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
In-Reply-To: <051501c2c17c$a83e8410$3b45da80@rodan>
References: <200301211744.55666.falted@openlc.org> <051501c2c17c$a83e8410$3b45da80@rodan>
Message-ID: <200301212020.57384.falted@openlc.org>

On Tuesday 21 January 2003 19:41, Sebastian Haase wrote:
> Hi,
> I think this is actually quite related to my post from Friday:
> [Numpy-discussion] make C array accessible to python without copy
>
> -> So, to reformulate: Who hold actually the array data in memory? Or:
> where gets the memory allocated and where/how many pointers to that exist? 
>   I understood the answer that Todd Miller gave, that there is such a thing
> as a "buffer object" that does all the work, so then: one would just have
> to take that and build a "new" numarray or Numeric structure around it 
> (referring to the Subject of this email)   or  (in the case of my
> Friday-email)  just have that "buffer object" point to a different memory
> space (that got already allocated by the C-program) .
>
> Agree ? (Did I get it right?)

Well, so so. I think the buffer object is a property of numarray objects,
not Numeric objects. So, in the numarray ==> Numeric conversion process you
may need to access the internals of the buffer (for example by using the
high level numarray C-API) and manage to obtain a data buffer (in the C
sense, not an object) that can be used to build the Numeric object (with the
help of the numarray object metadata). The opposite way needs something
similar but with inverted roles. See my previous message for a more in-depth
explanation.

I think the conversion (without copying) is not a difficult process, but it
is not quite as easy as that.

Well, I'm just a newcomer to numarray and my opinions about this may
well be completely wrong, of course. Take them with caution!

-- 
Francesc Alted


From paul at pfdubois.com  Tue Jan 21 12:06:34 2003
From: paul at pfdubois.com (paul at pfdubois.com)
Date: Tue Jan 21 12:06:34 2003
Subject: [Numpy-discussion] RE: numarray/Numeric upkeep?
Message-ID: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com>

Here are some of the factors leading to the slow rate of change of Numeric
lately.
a. I changed to a new project and have had a lot of startup learning to
do. My new project uses Numeric but not in as central a way as my old one.
b. I mistakenly thought numarray would be ready sooner so that I was trying
to let it slide.
c. I announced last year, in view of (a), that I was needing to be replaced
as HeadNummie. It would be logical to turn this over to the Numarray people,
but they aren't ready to do it until Numarray is ready, so nothing happened.
d. Except for Travis, most of the other listed Numeric developers aren't
in fact doing patches, releases, etc.
e. Not all patches that are submitted are correct or desirable, historically.
I'm not saying anything about any patches you may have submitted, just pointing
out that applying them requires real work, not just mechanical patching.
In fact the rate of error in patches is quite high and I've learned to be
cautious.
f. Some patches interfere with each other; for example, a patch for making
64 bit machines work right and a patch for some specific bug collided.

I've started to work on the MA for Numarray but I'm not able to do much
work on Numeric right now. This is a place where someone else has to help.


>-- Original Message --
>To: dubois at users.sourceforge.net
>Subject: numarray/Numeric upkeep?
>From: Michael Stone <mbrierst at users.sourceforge.net>
>Cc: <mbrierst at users.sourceforge.net>
>Date: Tue, 21 Jan 2003 11:32:03 -0800
>
>
>
>No one seems to be doing bugfixes for Numeric or numarray.
>Nothing seems to have happened for several months.  Lots of bugs have been
>posted for Numeric, some easily fixable (I submitted one with a patch).
>
>Any idea if either project will become active again anytime soon?


From perry at stsci.edu  Tue Jan 21 12:28:13 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 12:28:13 2003
Subject: [Numpy-discussion] RE: numarray/Numeric upkeep?
In-Reply-To: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEBLECAA.perry@stsci.edu>

Michael Stone wrote:

> >No one seems to be doing bugfixes for Numeric or numarray.
> >Nothing seems to have happened for several months.  Lots of bugs 
> have been
...

It certainly isn't true that nothing has happened for several months
with numarray. On what do you base this belief? While not all bugs
have been fixed, the oldest listed in the numarray bug tracker is
from December. Is there a bug you feel needs urgent attention?

Work is continuing and new releases will be coming out.

As to Paul's comments regarding when numarray will be ready,
my guess is when the following are complete:

- Package reorganization (make numarray a package)
- Optimization for small arrays (making numarray's speed with small arrays
   more comparable with Numeric; this is probably the single largest
   remaining item)
- Porting some well known packages such as MA (which Paul is working on),
   scipy, pyopengl and such to work with numarray. Some of this has been
   started.

There are other smaller things to do as well. But I'm hoping that
we can be done with these in a few months.

Perry


From bazell at comcast.net  Tue Jan 21 12:33:35 2003
From: bazell at comcast.net (Dave Bazell)
Date: Tue Jan 21 12:33:35 2003
Subject: [Numpy-discussion] array operation
Message-ID: <00bd01c2c18c$10ab5000$6401a8c0@DB>

I am trying to see if I can use where() or choose() to do this.  I can't
really figure it out.

I have a 2-d array data where each row is an observation and each column is
an attribute of the observation:

data =
[[.3, .2, 2.3,...]    <- observation 1
 [.7, 1.2, .4...]     <- observation 2
...]]

I have another 1-d array that contains a code for the class of object:

class = [0,1,0,1,1,3,2,0,...]

where class[i] = the class of the ith object in the data array.  Thus,
observation 1 above is class 0, observation 2 is class 1, and so on.

I want to select all objects of a given class from data array.  I can do
this with a loop

for i in range(ndat):
    if class[i] == 0:
        do something
   ....

Is there a way to use where() or choose() to do this?  Would it be more
efficient?

Thanks,

Dave


From perry at stsci.edu  Tue Jan 21 13:02:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 13:02:05 2003
Subject: [Numpy-discussion] array operation
In-Reply-To: <00bd01c2c18c$10ab5000$6401a8c0@DB>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEBMECAA.perry@stsci.edu>

Dave Bazell writes:
> I am trying to see if I can use where() or choose() to do this.  I can't
> really figure it out.
> 
> I have a 2-d array data where each row is an observation and each 
> column is
> an attribute of the observation:
> 
> data =
> [[.3, .2, 2.3,...]    <- observation 1
>  [.7, 1.2, .4...]     <- observation 2
> ...]]
> 
> I have another 1-d array that contains a code for the class of object:
> 
> class = [0,1,0,1,1,3,2,0,...]

Note that using class as a variable name is illegal; it is a reserved
keyword. I use code instead below.
> 
> where class[i] = the class of the ith object in the data array.  Thus,
> observation 1 above is class 0, observation 2 is class 1, and so on.
> 
> I want to select all objects of a given class from data array.  I can do
> this with a loop
> 
I assume you mean you want to select all the rows corresponding to all
the observations where the code for the class corresponding to that
observation equals some particular value.

If so then for numarray this ought to work.

index = nonzero(code==1) # want indices of all the obs where class code = 1
selected_obs = data[index]

(or in one line if you wish: selected_obs = data[nonzero(code==1)]  )
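
To make explicit what the two lines above compute, here is a pure-Python
sketch using plain lists in place of arrays. The nonzero() helper below is a
stand-in written for this example, not the numarray function, and the third
and fourth data rows are made-up values.

```python
def nonzero(mask):
    """Return the indices at which `mask` is true (stand-in for the array function)."""
    return [i for i, flag in enumerate(mask) if flag]

data = [
    [0.3, 0.2, 2.3],   # observation 1
    [0.7, 1.2, 0.4],   # observation 2
    [0.5, 0.1, 0.9],   # made-up row
    [1.1, 0.6, 0.8],   # made-up row
]
code = [0, 1, 0, 1]    # class code of each observation

index = nonzero([c == 1 for c in code])      # indices of the rows with class code 1
selected_obs = [data[i] for i in index]      # the matching rows only
print(index, selected_obs)
```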

> for i in range(ndat):
>     if class == 0:
>         do something
>    ....
> 
> Is there a way to use where() or choose() to do this?  Would it be more
> efficient?
> 
Perry


From Chris.Barker at noaa.gov  Tue Jan 21 14:30:10 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 21 14:30:10 2003
Subject: [Numpy-discussion] array operation
References: <JFEGLNDJEDNOMPPHDEJFOEBMECAA.perry@stsci.edu>
Message-ID: <3E2DC965.9328BCD6@noaa.gov>

Perry Greenfield wrote:

> If so then for numarray this ought to work.
> 
> index = nonzero(code==1) # want indices of all the obs where class code = 1
> selected_obs = data[index]

or, for Numeric, use take():

selected_obs = take(data,nonzero(code == 1),1)

(this will select the columns corresponding to where code == 1, which is
how I read your question)


By the way, choose() and where() do something similar, but give you back an
array that is the same size as the one you started with, with some
(or all) of the elements replaced. take() gives you a smaller array that
is a subset of the original one, which I think is what you want here.
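
The difference can be sketched with plain Python lists (an illustration of
the two behaviours only, not of the Numeric functions themselves; the values
are made up):

```python
code = [0, 1, 0, 1, 1]
values = [10, 11, 12, 13, 14]

# where()-style result: same length as the input, non-matching elements replaced.
replaced = [v if c == 1 else 0 for v, c in zip(values, code)]

# take()-style result: only the matching elements, so the result is shorter.
subset = [v for v, c in zip(values, code) if c == 1]

print(replaced)
print(subset)
```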

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Tue Jan 21 14:39:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jan 21 14:39:04 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray
 arrays?
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> <200301212005.30328.falted@openlc.org>
Message-ID: <3E2DCBDA.1040604@stsci.edu>

Francesc Alted wrote:

>I think this should be not too difficult to achieve and I'll try to explain
>why.
>
>When going from numarray to Numeric, numarray already have NA_InputArray
>C-API function that returns a well-behaved array. But strictly speaking, we
>don't even need a well-behaved array (this is a too restrictive condition)
>as both Numeric and numarray support discontiguous data. Even the byteorder
>should be not a problem, because, as Numeric itself has no such a property,
>we can create a Numeric array that is in native order as the result and
>byteswap the numarray object (if needed) before doing the conversion.
>
In-place byteswapping sounds like a bad idea to me.  What if the array 
is based upon a readonly buffer?  We've just started using these at 
STSCI because a readonly memory map imposes no load on the system swap 
file.  With a read only mapping,  the buffer itself has readonly pages; 
 these cannot be swapped in-place.
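
The read-only case can be sketched in modern Python with the standard
library alone. An immutable bytes object plays the role of the read-only
mapping here; this is an assumption made for illustration, not how numarray
memory maps are implemented.

```python
from array import array

data = bytes([0x01, 0x02, 0x03, 0x04])   # immutable, like a read-only mapping
view = memoryview(data)

try:
    view[0], view[1] = view[1], view[0]  # attempt an in-place swap of the first 16-bit item
    swapped_in_place = True
except TypeError:
    swapped_in_place = False             # read-only memory cannot be modified

# The fallback is a byteswapped copy held in fresh, writable memory.
copy = array("H")
copy.frombytes(data)
copy.byteswap()
print(swapped_in_place, copy.tobytes())
```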

>So, non-alignment remains as the only issue that may cause a buffer copy
>during numarray ==> Numeric conversion. Is that correct?. 
>
I don't think so.

>If yes, it is
>possible to do a workaround about that, i.e. we can still get a Numeric from
>a numarray without copying the data in case of numarray misaligned objects?.
>  
>
I don't see how.  The primary source of misaligned arrays is numerical 
columns in recarrays.  It seems to me that if the data is misaligned, 
 you either have to copy it to someplace else which is aligned,  or 
teach the function which is going to process it how to access it 
byte-wise.  Only the former sounds feasible to me.
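
A small sketch of why a numerical column in a packed record ends up
misaligned, and of the copy-out approach, using the standard struct and array
modules. The record layout is a made-up example, not an actual recarray
format.

```python
import struct
from array import array

record = struct.Struct("<bd")    # packed record: 1-byte int, then an 8-byte double
offset_of_double = 1             # not a multiple of 8, so the double column is misaligned

# Three packed records laid end to end, as in a record array buffer.
buf = b"".join(record.pack(i, i * 1.5) for i in range(3))

# Copy the column out into a fresh array, whose own storage is suitably aligned.
column = array("d", (value for _, value in record.iter_unpack(buf)))
print(record.size, list(column))
```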

>Regarding to going in the other sense (ie. Numeric ==> numarray), as
>numarray supports discontiguity, misalignment and byteswapped data, this
>conversion should not imply a data buffer copy at all. 
>  
>
This sounds correct.  

>Once we have a pointer to the data buffer, it is only a matter of
>wrapping a Numeric or numarray object around it getting this info from the
>original object, and returning the new object as a result.
>
>All in all, this conversion *seems* to be not a too difficult task.
>  
>
It seems straightforward in principle,  but the memory management issues 
seem a little tricky to me.   It's easy to get buffers from numarrays, 
and create numarrays from buffers.  I guess we need a module which does 
the same for Numeric.  

There are two easy ways to "get a buffer" from a Numeric array:

1.  Wrap the Numeric data in a buffer object.
2.  Add support for the buffer API to the Numeric object.

Off hand,  I'm not sure which is better,  although (1) is less intrusive 
to Numeric and I suppose is the place to start.  This should be easy.

But,  I'm not sure how to create a Numeric array from a buffer.  It's 
easy to get the data pointer from a buffer, and to construct a Numeric 
array from a data pointer,   but we also need a way to stash the pointer 
to the buffer object.    I don't like the idea of modifying Numeric's 
PyArrayObject.  

>Making such a conversion functions (in C, but also having Python
>counterparts) available might represent to open the door to a co-existence
>of Numeric and numarray objects in the same program, and that would easy the
>numarray deployment in existing Numeric software.
>
>Comments?
>  
>
All in all,  I think this is a great idea which would really boost 
interoperability.  I wish there was a simpler approach which required no 
modifications to Numeric.

Todd 


From falted at openlc.org  Wed Jan 22 01:53:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 01:53:01 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
Message-ID: <200301221051.57337.falted@openlc.org>

Hi,

I have discovered that the Numeric emulation functions in numarray don't
accept a character typecode as the type parameter.

This is not immediately apparent because the type parameter is of type 'int',
and passing it a 'char' may not be good practice. But the fact is that
Numeric *does* accept charcodes in the type parameter.

For example, this is the normal way to call the PyArray_FromDims function:

arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)

but, in Numeric, this other form also works:

arr = PyArray_FromDims(self.rank, self.dimensions, 'd')

Now, in numarray, if you pass a character to the type parameter, the result
is a "segmentation fault".

Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
handled as types in Numeric. I think something like this should be added to
the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.

Another thing. It seems to me that the NA_New and NA_Empty functions are not
accurately described in the numarray documentation, as the documented forms
differ from the definitions in numarray-0.4/Src/newarray.ch. I hope that the
latter will stay, because I prefer them a lot more than the documented ones :-)

Bye,

-- 
Francesc Alted


From jmiller at stsci.edu  Wed Jan 22 06:52:08 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 22 06:52:08 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation
 functions in numarray
References: <200301221051.57337.falted@openlc.org>
Message-ID: <3E2EAFE9.4060900@stsci.edu>

Francesc Alted wrote:

>Hi,
>
>I have discovered that the Numeric emulation functions in numarray doesn't
>accept a character typecode as type parameter.
>
Interesting.  

>
>This is not immediately apparent because type parameter is of type 'int',
>and passing it a 'char' maybe not a good practice. 
>
I wrote the emulation functions using the manual and intuition rather 
than the existing code.  There will be others like this.

>But the fact is that
>Numeric *do* accept the charcodes in the type parameter. 
>
>  
>
No argument here.  numarray can "always" be more compatible than it is 
"now",  for any value of always or now.  I think the only real way to 
avoid that would be to build Numeric into numarray,  which sounds 
dubious. :)

>For example, this is the normal way to call the PyArray_FromDims function:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)
>
>but, in Numeric, this other manner also works:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, 'd')
>  
>
This was nicely illustrated.

>Now, in numarray, if you pass a character to the type parameter, a
>"segmentation fault" is issued.
>  
>
Decidedly not good.

>Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
>handled as types in Numeric. I think something like this should be added to
>the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.
>
I did a simple implementation of PyArray_DescrFromType trying to add 
support for f2py.

There are 2 real issues with it that I see:

1.  It still doesn't handle character codes.  I think it could handle 
both NumericTypes and character codes without conflict because of the 
way the ASCII character set is laid out.

2. I just added it so that it *could* be called since I think f2py 
needed it.  I didn't call it anywhere from the other compatibility 
functions.
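
The idea in point 1 can be sketched as follows, under the assumption that
the integer type numbers are all smaller than 32 and therefore fall below the
printable ASCII range where typecodes live. The enum values and the table are
hypothetical; the real numarray type numbers may differ.

```python
# Hypothetical type numbers; the real numarray values may differ.
T_INT32, T_FLOAT64 = 6, 11
CHARCODE_TO_TYPE = {"i": T_INT32, "d": T_FLOAT64}   # assumed subset of typecodes

def normalize_type(type_arg):
    """Accept either a small integer type number or an ASCII typecode value."""
    if 0 <= type_arg < 32:           # below printable ASCII: treat as a type number
        return type_arg
    try:
        return CHARCODE_TO_TYPE[chr(type_arg)]
    except KeyError:
        raise ValueError("unknown type code %r" % type_arg)

print(normalize_type(T_FLOAT64), normalize_type(ord("d")))
```

In C the character would arrive as an int directly, which is why ord() is
used here to mimic the call.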

Care to do another patch?  

>Another thing. It seems to me that NA_New and NA_Empty functions are not
>well documented in the numarray documentation as they differ from the
>definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will
>stay, because I prefer them a lot more than the documented ones :-)
>
If you're working from CVS,  the form they're in now was the result of 
someone's detailed comments.

They're still not quite right,  because the interface is written in 
terms of int arrays,  which is not good for LP64 platforms where long is 
really what is needed to avoid creating 2G bottlenecks.  The naming is 
also not consistent and I will want to make it so before release of
numarray-0.5.

>Bye,
>
>  
>
Todd


From falted at openlc.org  Wed Jan 22 09:48:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 09:48:03 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
In-Reply-To: <3E2EAFE9.4060900@stsci.edu>
References: <200301221051.57337.falted@openlc.org> <3E2EAFE9.4060900@stsci.edu>
Message-ID: <200301221846.13358.falted@openlc.org>

On Wednesday 22 January 2003 15:51, Todd Miller wrote:
>
> I did a simple implementation of PyArray_DescrFromType trying to add
> support for f2py.

> There are 2 real issues with it that I see:
>
> 1.  It still doesn't handle character codes.  I think it could handle
> both NumericTypes and character codes without conflict because of the
> way the ASCII character set is layed out.

I think so

>
> 2. I just added it so that it *could* be called since I think f2py
> needed it.  I didn't call it anywhere from the other compatability
> functions.
>

I tried to patch your PyArray_DescrFromType, but nothing changed
because, as you said, none of the compatibility functions call it.

> Care to do another patch?

Well, I've tried to patch the NA_NewAll function in newarray.c:

        typeObject = pNumType[type];
        if (!typeObject) {
           /* Test if it is a Numeric charcode */
           sprintf(strcharcode, "%c", type);
           charcode = PyString_FromString(strcharcode);
           typeobj = PyDict_GetItemString(pNumericTypesTDict, strcharcode);
           if (typeobj) {
              typeObject = typeobj;
           } else
             return (PyArrayObject *) PyErr_Format(_Error,
                   "Type object lookup returned NULL for type %d", type);
        }

instead of the original code:

        typeObject = pNumType[type];
        if (!typeObject)
                return (PyArrayObject *) PyErr_Format(_Error,
                    "Type object lookup returned NULL for type %d", type);
        
with no luck, as the segmentation fault still occurs.

Anyway, I've already patched my original code to use only integer codes, not
characters, so it is no longer a problem (at least for me).

> They're still not quite right,  because the interface is written in
> terms of int arrays,  which is not good for LP64 platforms where long is
> really what is needed to avoid creating 2G bottlenecks.  The naming is
> also not consistent and I will want to make it so before release of
> numarray-0.5.

OK, so perhaps it's better to use PyArray_FromDims rather than NA_Empty
(at least until the C-API stabilizes). That's good to know!

BTW, while patching the numarray sources I noticed that some character
codes are missing from numerictypes.py, namely those corresponding to
UInt16, Int64 and UInt64. They don't appear in recarray either (except for
Int64, which appears as 'N' in numfmt but has no counterpart in revfmt),
so one can't build recarrays with these types, because a charcode is needed
for the "formats" string.

Is this intentional? Do you plan to fill these gaps (it would be nice,
especially for recarrays)?

Thanks,

-- 
Francesc Alted


From haase at msg.ucsf.edu  Thu Jan 23 14:06:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Thu Jan 23 14:06:04 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu>
Message-ID: <08ad01c2c32b$900238f0$3b45da80@rodan>

Hi,
I can print numarrays of any int type just fine, but
I still get the compress error message with Float (or complex)
data:
>>> c
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>>> c.astype(na.Float)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
    separator, array_output)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
    format, item_length = _floatFormat(data, precision, suppress_small)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
AttributeError: 'module' object has no attribute 'compress'

I get this on Windows (2000) and on Linux, both with numarray 0.4.

Thanks,
Sebastian


----- Original Message -----
From: "Todd Miller" <jmiller at stsci.edu>
To: "Sebastian Haase" <haase at msg.ucsf.edu>
Cc: <Numpy-discussion at lists.sourceforge.net>
Sent: Thursday, December 19, 2002 5:58 AM
Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress'


> Sebastian Haase wrote:
>
> >Hi!
> >Somehow I have a problem with numarray. Please take a look at this:
> >
> Hi Sebastian,
>
> I've don't recall seeing anything like this,  nor can I reproduce it
> now.   If you've been following numarray for a while now,  I can say
> that it is important to remove the old version of numarray before
> installing the new version.   I recommend deleting your current
> installation and reinstalling numarray.
>
> compress() is a ufunc,  much like add() or put().  It is defined in
> ndarray.py,  right after the import of the modules ufunc and _ufunc.
> _ufunc in particular is a problematic module,  because it has followed
> the atypical development path of moving from C-code to Python code.
>  Because of this, and the fact that a .so or .dll overrides a .py,
>  older installations interfere with newer ones.  The atypical path was
> required because the original _ufuncmodule.c was so large that it could
> not be compiled on some systems;  as a result,  I split _ufuncmodule.c
> into pieces by data type and now use _ufunc.py to glue the pieces
together.
>
> Good luck!    Please let me know if reinstalling doesn't clear up the
> problem.
>
> Todd
>
> >
> >
> >>>>import numarray as na
> >>>>na.array([0, 0])
> >>>>
> >>>>
> >array([0, 0])
> >
> >
> >>>>na.array([0.0, 0.0])
> >>>>
> >>>>
> >Traceback (most recent call last):
> >  File "<input>", line 1, in ?
> >  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in
> >__repr__
> >    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163,
in
> >array2string
> >    separator, array_output)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125,
in
> >_array2string
> >    format, item_length = _floatFormat(data, precision, suppress_small)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246,
in
> >_floatFormat
> >    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data,
0),
> >data))
> >AttributeError: 'module' object has no attribute 'compress'
> >
> >The same workes fine with Numeric. But I would prefer numarray because
I'm
> >writing C++-extensions and I need "unsigned shorts".
> >
> >What is this error about?
> >
> >Thanks,
> >Sebastian
> >


From jmiller at stsci.edu  Thu Jan 23 14:33:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan 23 14:33:03 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu> <08ad01c2c32b$900238f0$3b45da80@rodan>
Message-ID: <3E306D73.6050303@stsci.edu>

Sebastian Haase wrote:

>Hi,
>I can print numarray of any int time just fine, but
>
OK.  I am assuming you deleted all of your old numarray installations as 
I recommended and reinstalled numarray-0.4.

What is your PYTHONPATH?
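
One way to check for a stale installation, sketched in modern Python rather
than the Python 2.2 of this thread. The helper name where_is is hypothetical,
and json is used only because it is always importable; on an affected system
one would pass "numarray" instead.

```python
import importlib
import sys

def where_is(module_name):
    """Return the file a module is actually loaded from, or None for built-ins."""
    module = importlib.import_module(module_name)
    return getattr(module, "__file__", None)

print(where_is("json"))   # shows which copy of the package wins on the search path
print(sys.path)           # directories are searched in this order
```

If the reported file lives in an unexpected directory, an older copy earlier
on the path is probably shadowing the new installation.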

>I still get the compress error message with Float (or complex)
>data:
>  
>
>>>>c
>>>>array([[0, 0, 0, ..., 0, 0, 0],
>>>>        
>>>>
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       ...,
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>  
>
>>>>c.astype(na.Float)
>>>>        
>>>>
>Traceback (most recent call last):
>  File "<input>", line 1, in ?
>  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in
>__repr__
>    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in
>array2string
>    separator, array_output)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in
>_array2string
>    format, item_length = _floatFormat(data, precision, suppress_small)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in
>_floatFormat
>    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0),
>data))
>AttributeError: 'module' object has no attribute 'compress'
>
>I get this on Windows (2000) and on Linux. Both numarray 0.4
>  
>
I'm not sure what's going on here, but I develop on both platforms, and
on Linux constantly. The self tests definitely pass on Linux. It must be
some kind of environment or runtime issue. What happens when you type:

 >>> import numtestall
 >>> numtestall.test()
... what gets printed here? ...
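
Another quick check for a stale installation is to ask Python where it
would load the package from and whether the expected names are present.
A minimal sketch in present-day Python follows. The helper name
describe_module is made up for illustration, and "json" merely stands in
for the package under suspicion (for example numarray).

```python
import importlib
import importlib.util


def describe_module(name, required_attrs=()):
    """Report where `name` is imported from and which attributes are missing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under that name on the current sys.path.
        return {"name": name, "origin": None, "missing": list(required_attrs)}
    module = importlib.import_module(name)
    missing = [attr for attr in required_attrs if not hasattr(module, attr)]
    return {"name": name, "origin": spec.origin, "missing": missing}


# "json" is only a stand-in; "compress" is deliberately a name it lacks.
report = describe_module("json", required_attrs=("dumps", "compress"))
print(report["origin"], report["missing"])
```

If the reported origin points into an old directory, or a name such as
compress shows up as missing, an older copy is probably shadowing the
new one.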

>Thanks,
>Sebastian
>
>
>
>----- Original Message -----
>From: "Todd Miller" <jmiller at stsci.edu>
>To: "Sebastian Haase" <haase at msg.ucsf.edu>
>Cc: <Numpy-discussion at lists.sourceforge.net>
>Sent: Thursday, December 19, 2002 5:58 AM
>Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress'
>
>
>  
>
>>Sebastian Haase wrote:
>>
>>    
>>
>>>Hi!
>>>Somehow I have a problem with numarray. Please take a look at this:
>>>
>>>      
>>>
>>Hi Sebastian,
>>
>>I don't recall seeing anything like this,  nor can I reproduce it
>>now.   If you've been following numarray for a while now,  I can say
>>that it is important to remove the old version of numarray before
>>installing the new version.   I recommend deleting your current
>>installation and reinstalling numarray.
>>
>>compress() is a ufunc,  much like add() or put().  It is defined in
>>ndarray.py,  right after the import of the modules ufunc and _ufunc.
>>_ufunc in particular is a problematic module,  because it has followed
>>the atypical development path of moving from C-code to Python code.
>> Because of this, and the fact that a .so or .dll overrides a .py,
>> older installations interfere with newer ones.  The atypical path was
>>required because the original _ufuncmodule.c was so large that it could
>>not be compiled on some systems;  as a result,  I split _ufuncmodule.c
>>into pieces by data type and now use _ufunc.py to glue the pieces
>>together.
>>
>>Good luck!    Please let me know if reinstalling doesn't clear up the
>>problem.
>>
>>Todd
>>
>>    
>>
>>> >>> import numarray as na
>>> >>> na.array([0, 0])
>>>array([0, 0])
>>> >>> na.array([0.0, 0.0])
>>>Traceback (most recent call last):
>>> File "<input>", line 1, in ?
>>> File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
>>>   MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
>>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
>>>   separator, array_output)
>>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
>>>   format, item_length = _floatFormat(data, precision, suppress_small)
>>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
>>>   non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
>>>AttributeError: 'module' object has no attribute 'compress'
>>>
>>>The same works fine with Numeric. But I would prefer numarray because I'm
>>>writing C++ extensions and I need "unsigned shorts".
>>>
>>>What is this error about?
>>>
>>>Thanks,
>>>Sebastian
>>>
>>>
>>>
>>>
>>>
>>>
>>>      
>>>
>>
>>
>>    
>>
>
>
>
>  
>


From j_r_fonseca at yahoo.co.uk  Thu Jan 23 16:10:02 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Thu Jan 23 16:10:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
Message-ID: <20030124000759.GA6042@localhost.localdomain>

With the ability to subclass types in recent versions of the Python
language, more people will be interested in subclassing Numeric arrays
for specific purposes.  Still, the use of functions instead of methods
takes away many of the advantages, such as the ability to be overloaded.

Taking this statement as an example:

	Numeric.put(myarray, myindices, myvalues)

In the current state of affairs, if we wanted this statement to
work with a sparse matrix class derived from a Numeric array, it would
have to be something like:

	Sparse.put(myarray, myindices, myvalues)

That is, it forces the underlying code to know whether it is dealing
with Numeric arrays or some other equivalent class. But it would be
much more useful to have simply:

	myarray.put(myindices, myvalues)

which would work regardless of the actual type of myarray, provided it
supplied the put() method. This would enormously improve code
reusability and extensibility.

I know that certain implementation details may make this difficult
(like many functions being implemented in pure Python), but any
advance made in this direction will improve the current
situation.

Also, I know that this example is a little unfortunate because numarray will
do these things with the __getitem__ and __setitem__ operators. But
others could easily be shown.
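
To make the argument concrete, here is a minimal sketch in plain Python.
DenseArray, SparseArray and mark are hypothetical stand-ins, not Numeric
or numarray classes. The generic function mark() never needs to know
which kind of array it was given, because each class supplies its own
put() method.

```python
class DenseArray:
    """Toy dense array: stores every element in a list."""

    def __init__(self, size):
        self.data = [0] * size

    def put(self, indices, values):
        for i, v in zip(indices, values):
            self.data[i] = v


class SparseArray:
    """Toy sparse array: stores only the non-zero elements."""

    def __init__(self, size):
        self.size = size
        self.entries = {}

    def put(self, indices, values):
        for i, v in zip(indices, values):
            if not 0 <= i < self.size:
                raise IndexError(i)
            if v == 0:
                self.entries.pop(i, None)
            else:
                self.entries[i] = v


def mark(array, indices):
    """Generic code: works with any object that provides put()."""
    array.put(indices, [1] * len(indices))


dense = DenseArray(4)
sparse = SparseArray(4)
mark(dense, [0, 2])
mark(sparse, [1, 3])
```

With a module-level function instead, mark() would have to choose
between Numeric.put and Sparse.put itself.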

Regards,

José Fonseca


From falted at openlc.org  Fri Jan 24 04:00:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 04:00:07 2003
Subject: [Numpy-discussion] typecodes in numarray
Message-ID: <200301241259.30243.falted@openlc.org>

Maybe I'm becoming a bit tedious with this, but if you look at:

>>> import numerictypes
>>> numerictypes.typecode
{Complex64: 'D', Int32: 'l', UInt16: 's', Complex32: 'F', Float64: 'd',
UInt8: 'b', Int16: 's', Float32: 'f', Int8: '1'}

you can find some incongruities that lead to weird things like:

>>> array([1,2], Int16).typecode()
's'
>>> array([1,2], UInt16).typecode()
's'  #  --> same as Int16!
>>> array([1,2], Int64).typecode()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 
730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: Int64
>>> array([1,2], UInt64).typecode()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 
730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: UInt64

Also, 'l' is used here to map Int32, while in recarray it is used to map
Boolean.

Moreover, Numeric 22.0 introduced the equivalent of the UInt16 and UInt32
types as 'w' and 'u' respectively. But, again, 'u' is used in recarray as a
synonym of UInt8.

I think it's important to agree on a definitive set of charcodes and use
them uniformly throughout numarray.
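
A check like the following would catch such clashes automatically. It is
only a sketch: the mapping is copied from the numerictypes.typecode
output shown above, with type names written as strings so that it runs
without numarray, and find_collisions is a hypothetical helper.

```python
from collections import defaultdict

# Copied from the numerictypes.typecode output quoted above.
TYPECODE = {
    "Complex64": "D", "Int32": "l", "UInt16": "s", "Complex32": "F",
    "Float64": "d", "UInt8": "b", "Int16": "s", "Float32": "f", "Int8": "1",
}


def find_collisions(typecode):
    """Return {code: [type names]} for every code shared by several types."""
    by_code = defaultdict(list)
    for type_name, code in typecode.items():
        by_code[code].append(type_name)
    return {code: sorted(names)
            for code, names in by_code.items() if len(names) > 1}


collisions = find_collisions(TYPECODE)
print(collisions)
```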

Suggestion: if recarray charcodes do not need to match the Numeric
ones, I propose that using the Python convention may be a good idea.
Look at the table in:
http://www.python.org/doc/current/lib/module-struct.html.

-- 
Francesc Alted


From perry at stsci.edu  Fri Jan 24 06:38:17 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 06:38:17 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <200301241259.30243.falted@openlc.org>
Message-ID: <JFEGLNDJEDNOMPPHDEJFCECKECAA.perry@stsci.edu>


> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of
> Francesc Alted
> Sent: Friday, January 24, 2003 7:00 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] typecodes in numarray
> 
> 
> Maybe I'm becoming a bit tedious with this, but if you look at:
> 
No, this sort of feedback is very valuable. We'll think about this a
bit, but I'd agree that consistency with Numeric codes is important. Some
of the history of the codes used by recarray arises from conventions
used in other software not related to Python or Numeric. But if
recarray is to be generic and used by others, we should hide, remove
or layer such conventions in a subclass. Let us think about how we should
do that.

Thanks, Perry 


From perry at stsci.edu  Fri Jan 24 09:04:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 09:04:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
Message-ID: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>


Todd Miller had some further comments that I thought were worth
posting as well (and I think he makes some very good points).

************************************************************************

My [i.e. Todd's]  thoughts about it:

>Maybe I'm becoming a bit tedious with this, but if you look at:
>
No.  It shows you're thinking about it carefully.   Having looked at all 
of the examples below,  I have some comments:

1.  The sparseness and obscurity of the typecode "wordspace" are both 
demonstrated here.  There are so few letters to choose from,  they're 
often already used in some other context.  Even given the large number 
of unused letters,  it's often difficult to choose good ones and to 
remember what has been chosen.  I think this is one of the reasons Perry 
chose to replace typecodes with true type objects which have rich, 
regular, and predictable symbolic names.

2. Typecodes were added as a backwards compatibility feature of 
numarray,  and I think it's probable that numarray beat Numeric to 
supporting most of these types, because otherwise they'd have been 
copied directly and there would be no problem.  I'm not really trying to 
play a blame-game here,  but I am making an argument that perhaps 
numarray should only go so far in the support of what I regard as an 
obsolescent feature.  If the Numeric developers choose to continue 
extending the use of typecodes in ways that are incompatible with 
numarray,  one way of dealing with it is to "just say no".  We are going 
beyond the scope of backwards compatibility to on-going compatibility. 
(Which we may still have to do, but it needs to be discussed and considered.)

3. STSCI has layered other software on top of numarray and recarray 
which astronomers use to do work.   It is the friction of that interface 
which makes correcting these consistency problems more difficult than 
might be immediately apparent.

>I think it's important to agree with a definitive set of charcodes and use
>them uniformly throughout numarray.
>
I wish this were possible,  but I'm thinking we should try to find an 
alternative approach altogether,  one which may be more verbose but 
implicitly free of conflict.

A means for specifying a recarray format might be created from tuples, 
type objects,  and integer repetition factors.

The verbosity of this approach might be a little tedious,  but it would 
also be transparent, maintainable, and conflict free.
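
As a rough sketch of what such a format could look like: NumType is a
hypothetical stand-in for numarray's type objects (only a name and an
item size), and record_size is an illustrative helper, not an existing
recarray function.

```python
from collections import namedtuple

# Hypothetical stand-in for a numarray type object: name plus size in bytes.
NumType = namedtuple("NumType", ["name", "itemsize"])

Int16 = NumType("Int16", 2)
Int32 = NumType("Int32", 4)
Float64 = NumType("Float64", 8)


def record_size(fmt):
    """Packed size in bytes of one record given as ((type, repeat), ...)."""
    total = 0
    for num_type, repeat in fmt:
        if repeat < 1:
            raise ValueError("repetition factor must be positive")
        total += num_type.itemsize * repeat
    return total


# Three shorts, four ints, twenty doubles.
fmt = ((Int16, 3), (Int32, 4), (Float64, 20))
size = record_size(fmt)
```

Because each field names its type object directly, there is no letter to
look up and nothing that can collide.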

I think we should add an "obsolescent feature" warning to numarray and 
recarray which flags any use of character typecodes when the appropriate 
command line switches are set.
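
In present-day Python such a switchable warning could be built on the
standard warnings module. This is a sketch under that assumption:
check_type_spec is a hypothetical helper, and the choice of
PendingDeprecationWarning is mine.

```python
import warnings


def check_type_spec(spec):
    """Warn when a type is given as a character typecode string."""
    if isinstance(spec, str):
        warnings.warn(
            "character typecode %r is obsolescent; use a type object" % spec,
            PendingDeprecationWarning,
            stacklevel=2,  # report the caller's line, not this helper's
        )
    return spec
```

PendingDeprecationWarning is ignored by default, so nothing is printed
unless the interpreter is started with a switch such as
-W default::PendingDeprecationWarning.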

>Suggestion: if recarray charcodes are not necessary to match the Numeric
>ones, I propose that using the Python convention maybe a good idea.
>Look at the table in:
>http://www.python.org/doc/current/lib/module-struct.html.
>
This sounds good to me,  except that it will break an existing interface 
that I don't have control over.  Therefore,  I suggest we correct the 
problem by coming up with something better.


From paul at pfdubois.com  Fri Jan 24 09:43:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 09:43:07 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
Message-ID: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>

I don't understand this remark:

<snip >but I am making an argument that perhaps 
> numarray should only go so far in the support of what I regard as an 
> obsolescent feature.  If the Numeric developers choose to continue 
> extending the use of typecodes in ways that are incompatible with 
> numarray,  one way of dealing with it is to "just say no".  
> We are going 
> beyond the scope of backwards compatibility to on-going compatibility. 
> (Which we may still have to do but needs to be discussed and 
> considered)
> 

There is no "on-going" Numeric development. It stops the minute numarray is
ready. Period. We developers all agreed on that. The whole reason for
numarray is that Numeric was pronounced unmaintainable and unextendable by
those who frequently had to work on it. To do anything else will fragment
the entire numerical python community and software set.


From falted at openlc.org  Fri Jan 24 10:48:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 10:48:04 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
Message-ID: <200301241946.55398.falted@openlc.org>

On Friday 24 January 2003 18:02, Todd Miller wrote:
>
> My [i.e. Todd's]  thoughts about it:
>
> No.  It shows you're thinking about it carefully.   Having looked at all
> of the examples below,  I have some comments:

I mostly agree with your comments, but let me point out some thoughts.

>
> 1.  The sparseness and obscurity of the typecode "wordspace" are both
> demonstrated here.  There are so few letters to choose from,  they're
> often already used in some other context.  Even given the large number
> of unused letters,  it's often difficult to choose good ones and to
> remember what has been chosen.  I think this is one of the reasons Perry
> chose to replace typecodes with true type objects which have rich,
> regular, and predictable symbolic names.

I completely agree that type objects are a brilliant idea.

> 3. STSCI has layered other software on top of numarray and recarray
> which astronomers use to do work.   It is the friction of that interface
> which makes correcting these consistency problems more difficult than
> might be immediately apparent.

Yeah, I know...

>
> >I think it's important to agree with a definitive set of charcodes and use
> >them uniformly throughout numarray.
>
> I wish this were possible,  but I'm thinking we should try to find an
> alternative approach altogether,  one which may be more verbose but
> implicitly free of conflict.
>
> A means for specifying a recarray format might be created from tuples,
> type objects,  and integer repetition factors.
>
> The verbosity of this approach might be a little tedious,  but it would
> also be transparent, maintainable, and conflict free.

I think this is a very good idea. In fact, while working on PyTables I was
recently pondering what would be the best way to define record arrays, and I
also think that a verbose approach should be the best.

After considering metaclasses and tuples, I ended up with a compromise
between the two: dictionaries combined with some function or class to
refine the definition.

My current thinking is something like:

recarrDescr = {
    "name"        : defineType(CharType, 16, ""),  # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),    # integer
    "grid_j"      : defineType(Int32, 1, 9),    # integer
    "pressure"    : defineType(Float32, 1, 1.),  # float  (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
    "idnumber"    : defineType(Int64, 1, 0),    # signed long long 
    }

where defineType is a class that accepts (type, shape, default) parameters.
It can be extended safely in the future if more needs appear.
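
A minimal sketch of how such a class might look. The class body, the
default_record and column_order helpers, and the use of plain strings
for the type names are all assumptions for illustration, not PyTables or
numarray code.

```python
class defineType:
    """Hypothetical column descriptor holding (type, shape, default)."""

    def __init__(self, type_name, shape=1, default=None):
        if shape < 1:
            raise ValueError("shape must be a positive integer")
        self.type = type_name
        self.shape = shape
        self.default = default

    def __repr__(self):
        return "defineType(%r, %r, %r)" % (self.type, self.shape, self.default)


# Type names are plain strings so the sketch runs without numarray.
recarrDescr = {
    "name": defineType("CharType", 16, ""),
    "TDCcount": defineType("UInt8", 1, 0),
    "pressure": defineType("Float32", 1, 1.0),
    "temperature": defineType("Float64", 32, list(range(32))),
}


def default_record(descr):
    """Build one record (column name -> default value) from a description."""
    return {name: column.default for name, column in descr.items()}


def column_order(descr):
    """Pick a reproducible column order; here simply sorted by name."""
    return sorted(descr)


record = default_record(recarrDescr)
```

One design point to settle with a dictionary is the column order, since
the description itself does not say which field comes first in the
record. The sketch sorts by name, but any explicit rule would do.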

A dictionary has the advantage over a tuple that you can map column names to
their contents quite easily, and it is more flexible than defining the fields
with a metaclass descendant (see
http://pytables.sourceforge.net/html-doc/usersguide-html3.html#subsection3.1.2)
because dictionaries can be built up at run time (metaclass descendants might
allow that too, but in a more mysterious way that I think is not worth the
trouble). In addition, the dictionary object is available in all Python
versions, whereas metaclasses are available only from 2.2 on. However, I
regard metaclasses as the most elegant solution (but elegance is not always
equivalent to convenience :().

Perhaps you may want to consider this for use in the recarray definition.

>
> I think we should add an "obsolescent feature" warning to numarray and
> recarray which flags any use of character typecodes when the appropriate
> command line switches are set.

Well, I don't fully agree with that. I do believe that type classes are a
more meaningful way of describing types, but charcodes can be quite
advantageous in certain situations, such as describing the contents of a
record in a compact way, or passing this info to C routines that deal with
the data.

For example, consider the benefits of describing a recarray format as:

"3s4i20d"

instead of

((Int16, 3), 
 (Int32, 4),
 (Float64, 20),
 )

the former being more handy in lots of situations.
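
The two notations carry the same information, so one can be derived from
the other. A small sketch, assuming only the three-letter code table
implied by the example above; expand_format and CODE_TO_TYPE are
hypothetical names, and the type names are strings so the code runs
without numarray.

```python
import re

# Assumed code table, taken from the equivalence shown above.
# It is deliberately incomplete.
CODE_TO_TYPE = {"s": "Int16", "i": "Int32", "d": "Float64"}

# An optional repeat count followed by a single letter.
_FIELD = re.compile(r"(\d*)([A-Za-z])")


def expand_format(fmt):
    """Turn a compact format such as '3s4i20d' into ((type, repeat), ...)."""
    fields = []
    pos = 0
    while pos < len(fmt):
        match = _FIELD.match(fmt, pos)
        if match is None:
            raise ValueError("cannot parse format at position %d" % pos)
        count, code = match.groups()
        if code not in CODE_TO_TYPE:
            raise ValueError("unknown typecode %r" % code)
        fields.append((CODE_TO_TYPE[code], int(count) if count else 1))
        pos = match.end()
    return tuple(fields)


expanded = expand_format("3s4i20d")
```

A layer like this would let the compact strings remain available as a
convenience while the tuple form stays the canonical description.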

I certainly believe that a coexistence of both can be very beneficial,
especially for 3rd party extension makers (like me :).

>
> >Suggestion: if recarray charcodes are not necessary to match the Numeric
> >ones, I propose that using the Python convention maybe a good idea.
> >Look at the table in:
> >http://www.python.org/doc/current/lib/module-struct.html.
>
> This sounds good to me,  except that it will break an existing interface
> that I don't have control over.  Therefore,  I suggest we correct the
> problem by coming up with something better.

Well, if charcodes finally stay in, this has an additional advantage in
that the Python crew has provided meaningful ways to express padding (character
"x"), endianness ("=", "<", ">") and alignment ("@"). So having a compact
expression like "@3sx4i20d", apart from looking like Chinese to Westerners,
may give a lot of info in a handy way.
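
For reference, this is how the struct module itself treats such strings.
Note that in struct the letter 's' means a char[] string and a 2-byte
short is spelled 'h', so the record from the earlier example is written
with 'h' below. The variable names are only for illustration.

```python
import struct

packed = "=3h4i20d"    # standard sizes, no alignment padding
padded = "=3hx4i20d"   # same record with one explicit pad byte

packed_size = struct.calcsize(packed)
padded_size = struct.calcsize(padded)

# "@" selects native sizes and alignment, so this value depends on the platform.
native_size = struct.calcsize("@3h4i20d")

# Byte order prefixes change the layout of each multi-byte item.
little = struct.pack("<h", 1)
big = struct.pack(">h", 1)

print(packed_size, padded_size, native_size, little, big)
```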

-- 
Francesc Alted


From jmiller at stsci.edu  Fri Jan 24 11:20:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 11:20:05 2003
Subject: [Fwd: Re: [Numpy-discussion] typecodes in numarray]
Message-ID: <3E319543.8040101@stsci.edu>



From jmiller at stsci.edu  Fri Jan 24 14:01:31 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri, 24 Jan 2003 14:01:31 -0500
Subject: [Numpy-discussion] typecodes in numarray
References: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <3E318D8B.1090403@stsci.edu>

Paul F Dubois wrote:

>I don't understand this remark:
>
><snip >but I am making an argument that perhaps 
>  
>
>>numarray should only go so far in the support of what I regard as an 
>>obsolescent feature.  If the Numeric developers choose to continue 
>>extending the use of typecodes in ways that are incompatible with 
>>numarray,  one way of dealing with it is to "just say no".  
>>We are going 
>>beyond the scope of backwards compatibility to on-going compatibility. 
>>(Which we may still have to do but needs to be discussed and 
>>considered)
>>
>>    
>>
>
>There is no "on-going" Numeric development. It stops the minute numarray is
>ready. Period. We developers all agreed on that. The whole reason for
>numarray is that Numeric was pronounced unmaintainable and unextendable by
>those who frequently had to work on it. To do anything else will fragment
>the entire numerical python community and software set.
>
>
>  
>
Thanks for clarifying Paul.   My point didn't quite come out right.   A 
better way to put it might have been:

1. Numarray and Numeric are subject to accidental divergence.  As long 
as they both continue to change concurrently,  they will probably differ 
even in interface.  Because numarray isn't quite ready yet,  they are 
both still changing.

2. Typecodes in particular are something numarray is superseding with 
something better.  Because of this, providing on-going compatibility 
with Numeric typecodes may not make sense.  

3. Numeric compatibility is not the only driver for the choice of 
recarray typecodes so I can't make arbitrary changes without affecting 
other software and people.

4. I think there's a clearer,  numarray type object based approach to 
describing recarray formats which does not use typecodes at all.  Thus, 
instead of attempting to weed through and unify layers of conflicting 
type codes,  we might be able to end-run the whole problem with an 
alternative approach.

Todd

>
>  
>




From perry at stsci.edu  Fri Jan 24 11:34:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 11:34:02 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOECOECAA.perry@stsci.edu>

I think Todd was referring to the recent addition of unsigned types
to Numeric, along with which came new typecodes. These types were already
in numarray at the time.

Perry

> -----Original Message-----
> From: Paul F Dubois [mailto:paul at pfdubois.com]
> Sent: Friday, January 24, 2003 12:42 PM
> To: 'Perry Greenfield'; falted at openlc.org;
> numpy-discussion at lists.sourceforge.net
> Subject: RE: [Numpy-discussion] typecodes in numarray
>
>
> I don't understand this remark:
>
> <snip >but I am making an argument that perhaps
> > numarray should only go so far in the support of what I regard as an
> > obsolescent feature.  If the Numeric developers choose to continue
> > extending the use of typecodes in ways that are incompatible with
> > numarray,  one way of dealing with it is to "just say no".
> > We are going
> > beyond the scope of backwards compatibility to on-going compatibility.
> > (Which we may still have to do but needs to be discussed and
> > considered)
> >
>
> There is no "on-going" Numeric development. It stops the minute
> numarray is
> ready. Period. We developers all agreed on that. The whole reason for
> numarray is that Numeric was pronounced unmaintainable and unextendable by
> those who frequently had to work on it. To do anything else will fragment
> the entire numerical python community and software set.
>
>
>
>
>
>


From jmiller at stsci.edu  Fri Jan 24 12:01:32 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 12:01:32 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org>
Message-ID: <3E319ED4.5060709@stsci.edu>

>
>
>>A means for specifying a recarray format might be created from tuples,
>>type objects,  and integer repetition factors.
>>
>>The verbosity of this approach might be a little tedious,  but it would
>>also be transparent, maintainable, and conflict free.
>>    
>>
>
>I think this is a very good idea. In fact, while working in PyTables I was
>lately pondering what would be the best way to define record arrays, and I
>also think that a verbose approach should be the best.
>
>After considering metaclasses, and tuples, I ended to a compromise solution
>between both which are dictionaries combined with some function or class to
>refine the definition.
>
>My current thinking is something like:
>
>recarrDescr = {
>    "name"        : defineType(CharType, 16, ""),  # 16-character String
>    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>    "grid_i"      : defineType(Int32, 1, 9),    # integer
>    "grid_j"      : defineType(Int32, 1, 9),    # integer
>    "pressure"    : defineType(Float32, 1, 1.),  # float  (single-precision)
>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>    "idnumber"    : defineType(Int64, 1, 0),    # signed long long 
>    }
>
>where defineType is a class that accepts (type, shape, default) parameters.
>It can be extended safely in the future if more needs appear.
>
You're way ahead of me here.  The only thing I don't like about this is 
the additional relative complexity because of the addition of field 
names and default values.   It would be nice to layer this more.

>Perhaps you may want to consider this for using in recarray definition.
>
We'll definitely consider it as we hash this out.  

>
>  
>
>>I think we should add an "obsolescent feature" warning to numarray and
>>recarray which flags any use of character typecodes when the appropriate
>>command line switches are set.
>>    
>>
>
>Well, I don't fully agree with that. I do believe that classes typecodes to
>be a more meaningful way for describing types, but charcodes can be quite
>advantageous in certain situations, like in describing in compact way the
>contents of a record, or passing this info to C-routines to deal with the
>data.
>
Yeah, I know.

>For example, consider the benefits of describing a recarray format as:
>
>"3s4i20d"
>
I know.

>
>instead of
>
>((Int16, 3), 
> (Int32, 4),
> (Float64, 20),
> )
>
This is pretty much exactly what I was thinking.  It is straightforward 
to imagine and difficult to forget.  

>
>the former being more handy in lots of situations.
>  
>
Would you please name some of these so we can explore handling them both 
ways?

>I certainly believe that a coexistence of both can be very beneficial,
>especially for 3rd party extension makers (like me :).
>
If there's a reasonable way to avoid supporting both,  we should.

>>>Suggestion: if recarray charcodes are not necessary to match the Numeric
>>>ones, I propose that using the Python convention maybe a good idea.
>>>Look at the table in:
>>>http://www.python.org/doc/current/lib/module-struct.html.
>>>      
>>>
>>This sounds good to me,  except that it will break an existing interface
>>that I don't have control over.  Therefore,  I suggest we correct the
>>problem by coming up with something better.
>>    
>>
>
>Well, if charcodes finally stay in, this has an additional advantage in
>that python crew has provided meaningful ways to express padding (character
>"x"), endianness ("=", "<", ">") and alignment ("@"). 
>
We might also add these to the type-repetition tuple.

Regards,
Todd


From hinsen at cnrs-orleans.fr  Fri Jan 24 12:13:05 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Jan 24 12:13:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124000759.GA6042@localhost.localdomain>
References: <20030124000759.GA6042@localhost.localdomain>
Message-ID: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>

José Fonseca <j_r_fonseca at yahoo.co.uk> writes:

> With the ability of subclassing types in recent versions of the Python
> language, more people will be interested in subclassing Numeric arrays
> for specific purposes.  Still the use of functions instead of methods
> takes away many of the advantages, the ability of being overloaded.

True. On the other hand, there is also an advantage: NumPy routines
can be used on standard Python data types such as number and sequence
types.

In the ideal world (which might come one day), core NumPy
functionality would be part of standard Python, and then all these
operations would work on other built-in types as well.

Until then, I am not sure that changing NumPy functions to methods
is a good idea. I need to call them on scalar numbers much more
often than I subclass arrays.
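
The trade-off can be illustrated with a small sketch in plain Python.
The function absolute() and the class Celsius are hypothetical. A
module-level function can accept scalars and ordinary sequences, and it
can still defer to an object's own method when one exists, which is one
possible middle ground rather than something proposed in this thread.

```python
def absolute(x):
    """Absolute value for scalars, lists, tuples, or objects with .absolute()."""
    own = getattr(x, "absolute", None)
    if callable(own):
        # Let a class (or subclass) supply its own behaviour.
        return own()
    if isinstance(x, (list, tuple)):
        return type(x)(absolute(item) for item in x)
    return abs(x)


class Celsius:
    """Toy class that supplies its own absolute() method."""

    def __init__(self, degrees):
        self.degrees = degrees

    def absolute(self):
        return Celsius(abs(self.degrees))


scalar_result = absolute(-2.5)
list_result = absolute([-1, 2, -3])
object_result = absolute(Celsius(-4))
```

A pure method call such as (-2.5).absolute() would fail, because a
Python float has no such method.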

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From paul at pfdubois.com  Fri Jan 24 12:36:03 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 12:36:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
Message-ID: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>

Every time the subject of subclassing a numeric array comes up, it is as if
nobody ever thought of it before. Been there, done that. It doesn't turn out
to be all that useful. To see why, consider a + b where a and b are Foo
instances, and Foo inherits from numarray.

a. a + b will be a numarray, not a Foo instance, unless you write a new +
operator.
b. Attempting to have numarray itself apply a subclass constructor to the
result runs into the problem that numarray does not have any idea what the
constructor's signature is or what information is needed to fill out that
constructor.
c. Even if the subclass accepts numarray's constructor signature, it would
rarely produce satisfactory results, just "losing" the Foo-ness details of a
and b.

This same argument applies to every method that returns a Foo instance, and
every ufunc. So you end up redoing everything anyway.

In short, worrying about subclassing is way down the list of things we ought
to consider. 
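
Points a and b can be reproduced with a toy base class in plain Python.
Array, Foo and FooWithAdd are hypothetical stand-ins, not numarray
classes.

```python
class Array:
    """Toy base class standing in for an array type."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # The base class only knows how to build itself.
        return Array(a + b for a, b in zip(self.values, other.values))


class Foo(Array):
    """Subclass whose constructor needs extra information (units)."""

    def __init__(self, values, units):
        super().__init__(values)
        self.units = units


class FooWithAdd(Foo):
    """Subclass that rewrites + so that the result stays a FooWithAdd."""

    def __add__(self, other):
        if self.units != other.units:
            raise ValueError("unit mismatch")
        base = super().__add__(other)
        return FooWithAdd(base.values, self.units)


plain_sum = Foo([1, 2], "m") + Foo([3, 4], "m")
kept_sum = FooWithAdd([1, 2], "m") + FooWithAdd([3, 4], "m")
```

Here plain_sum is an Array without units, while kept_sum keeps its
class only because + was rewritten. The base class could not call
Foo(values) itself, since it has no way of knowing about the units
argument.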

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net 
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On 
> Behalf Of Konrad Hinsen
> Sent: Friday, January 24, 2003 12:07 PM
> To: José Fonseca
> Cc: numpy-discussion at lists.sourceforge.net
> Subject: Re: [Numpy-discussion] Extensive use of methods 
> instead of functions
> 
> 
> José Fonseca <j_r_fonseca at yahoo.co.uk> writes:
> 
> > With the ability of subclassing types in recent versions of 
> the Python 
> > language, more people will be interested in subclassing 
> Numeric arrays 
> > for specific purposes.  Still the use of functions instead 
> of methods 
> > takes away many of the advantages, the ability of being overloaded.
> 
> True. On the other hand, there is also an advantage: NumPy 
> routines can be used on standard Python data types such as 
> number and sequence types.
> 
> In the ideal world (which might come one day), core NumPy 
> functionality would be part of standard Python, and then all 
> these operations would work on other built-in types as well.
> 
> Until then, I am not sure that changing NumPy functions to 
> methods is a good idea. I need to call them on scalar numbers 
> much more often than I subclass arrays.
> 
> Konrad.
> -- 
> -------------------------------------------------------------------------------
> Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
> Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
> Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
> 45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
> France                                   | Nederlands/Francais
> -------------------------------------------------------------------------------
> 
> 
> 


From perry at stsci.edu  Fri Jan 24 13:11:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 13:11:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID: <JFEGLNDJEDNOMPPHDEJFGECPECAA.perry@stsci.edu>

Paul Dubois writes:
>
> Every time the subject of subclassing a numeric array comes up, it is as if
> nobody ever thought of it before. Been there, done that. It doesn't turn out
> to be all that useful. To see why, consider a + b where a and b are Foo
> instances, and Foo inherits from numarray.
>
> a. a + b will be a numarray, not a Foo instance, unless you write a new +
> operator.
> b. Attempting to have numarray itself apply a subclass constructor to the
> result runs into the problem that numarray does not have any idea what the
> constructor's signature is or what information is needed to fill out that
> constructor.
> c. Even if the subclass accepts numarray's constructor signature, it would
> rarely produce satisfactory results just "losing" the Foo'ness details of a
> and b.
>
> This same argument applies to every method that returns a Foo instance, and
> every ufunc. So you end up redoing everything anyway.
>
> In short, worrying about subclassing is way down the list of things we ought
> to consider.
>
Paul illustrates some important points. While I'm not as down on the
ability to subclass (more on that later), he is absolutely right that
most think that subclassing is a breeze and don't realize that it
is far from being so.

The arguments for this would be helped immensely by a practical
example of a desired subclass. This does far more to illustrate
the issues than an abstract discussion. For most instances that I
have considered or thought about it is unavoidable that one must
override virtually all (if not all) the operators and functions.
Nevertheless, subclassing can still save a great deal of work
over implementing a completely new extension. But you'll have to
deal with defining how all the operators and functions should behave.
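
Paul's points (a) to (c) can be reproduced without numarray at all. The
following is only a minimal sketch: `Array` is a toy stand-in for a numeric
array type, and the names `Array`, `Foo`, `FooWithAdd` and `units` are
hypothetical, not part of numarray or Numeric.

```python
class Array:
    """Toy stand-in for a numeric array type (NOT numarray itself)."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # The base class only knows its own constructor, so the result
        # is always a plain Array (point a).
        return Array(x + y for x, y in zip(self.data, other.data))


class Foo(Array):
    """Subclass whose constructor needs extra information (point b)."""

    def __init__(self, data, units):
        Array.__init__(self, data)
        self.units = units


class FooWithAdd(Foo):
    """Keeping the Foo-ness means redefining the operator by hand."""

    def __add__(self, other):
        if self.units != other.units:
            raise ValueError("unit mismatch")
        summed = Array.__add__(self, other)
        return FooWithAdd(summed.data, self.units)


a = Foo([1, 2], "m")
b = Foo([3, 4], "m")
plain = a + b            # a plain Array; the units are lost
kept = FooWithAdd([1, 2], "m") + FooWithAdd([3, 4], "m")
```

Here `plain` has no `units` attribute, while `kept` retains it only because
`__add__` was rewritten; the same work would be needed for every other
operator and function.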

In our view, the most valuable subclassing in numarray comes from
subclassing NDArray, which handles all the structural operations
for arrays (recarray makes heavy use of this). But recarrays don't
try to support numerical operations, and that makes it fairly easy.
Subclassing numarrays is significantly more work for the reasons cited.

Perry


From jmiller at stsci.edu  Fri Jan 24 13:56:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 13:56:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <3E31B9DB.7080603@stsci.edu>

>
>
>> My current thinking is something like:
>>
>> recarrDescr = {
>>    "name"        : defineType(CharType, 16, ""),  # 16-character String
>>    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>>    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>>    "grid_i"      : defineType(Int32, 1, 9),    # integer
>>    "grid_j"      : defineType(Int32, 1, 9),    # integer
>>    "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
>>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>    "idnumber"    : defineType(Int64, 1, 0),    # signed long long
>> }
>>
>> where defineType is a class that accepts (type, shape, default) 
>> parameters.
>> It can be extended safely in the future if more needs appear.
>>
> You're way ahead of me here.  The only thing I don't like about this 
> is the additional relative complexity because of the addition of field 
> names and default values.   It would be nice to layer this more. 

One more thing I don't understand looking at this:  a dictionary is 
unordered.

Todd


From j_r_fonseca at yahoo.co.uk  Fri Jan 24 14:00:03 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Fri Jan 24 14:00:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
References: <20030124000759.GA6042@localhost.localdomain> <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
Message-ID: <20030124215828.GA32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 09:07:21PM +0100, Konrad Hinsen wrote:
> José Fonseca <j_r_fonseca at yahoo.co.uk> writes:
> 
> > With the ability of subclassing types in recent versions of the Python
> > language, more people will be interested in subclassing Numeric arrays
> > for specific purposes.  Still the use of functions instead of methods
> > takes away many of the advantages, the ability of being overloaded.
> 
> True. On the other hand, there is also an advantage: NumPy routines
> can be used on standard Python data types such as number and sequence
> types.
> 
> In the ideal world (which might come one day), core NumPy
> functionality would be part of standard Python, and then all these
> operations would work on other built-in types as well.
> 
> Until then, I am not sure that changing NumPy functions to methods
> is a good idea. I need to call them on scalar numbers much more
> often than I subclass arrays.

You've got a good point there. I often want to use them with other Numeric
array-like classes, but I've also used them with standard Python data
types for convenience.

Still, it's perfectly possible for both interfaces to coexist. Of course,
anyone who uses the .method version can't expect it to work with standard
Python data types, and has to either make a choice or call asarray() (or
something equivalent) first.
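
One way the two interfaces could coexist is sketched below. It is an
illustration only: `as_sequence` is a hypothetical stand-in for asarray(),
and `SparseVector` is a toy array-like class, not an existing Numeric or
numarray type. The function form defers to the object's own method when
there is one and otherwise falls back to plain Python data.

```python
def as_sequence(obj):
    """Hypothetical stand-in for asarray(): wrap scalars, copy sequences."""
    if isinstance(obj, (int, float, complex)):
        return [obj]
    return list(obj)


def take(obj, indices):
    """Function interface: defer to obj.take() when the object has one."""
    method = getattr(obj, "take", None)
    if callable(method):
        return method(indices)
    seq = as_sequence(obj)
    return [seq[i] for i in indices]


class SparseVector:
    """Toy array-like class (hypothetical) storing only nonzero entries."""

    def __init__(self, length, nonzero):
        self.length = length
        self.nonzero = dict(nonzero)

    def take(self, indices):
        return [self.nonzero.get(i, 0) for i in indices]


from_list = take([10, 20, 30], [2, 0])               # plain Python list
from_sparse = take(SparseVector(5, {1: 7}), [1, 3])  # object with a method
from_scalar = take(4.5, [0])                         # scalar number
```

With this arrangement Konrad's use case (scalars and sequences) keeps
working through the function, while array-like classes only need to supply
the method.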

Regards,

José Fonseca


From j_r_fonseca at yahoo.co.uk  Fri Jan 24 15:21:02 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Fri Jan 24 15:21:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
References: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr> <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID: <20030124231900.GB32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 12:34:54PM -0800, Paul F Dubois wrote:
> 
> Every time the subject of subclassing a numeric array comes up, it is as
> if nobody ever thought of it before.

Why do you treat me as if I was trying to sell the "Next Big Thing"!?

First, I must tell you that the first time I came across the idea of
subclassing Numeric arrays was while reading the "Subclassing"
subsection, in the "Special Topics" section of the Numeric Python
manual. Your name, Paul, appears as one of the authors.

Second, subclassing Numeric arrays may be useful. Again, the
distribution of Numeric Python even has one big example: making a linear
algebra oriented version of Numeric python, where the operations would
be the standard matrix and vector operations instead of the element-wise
operations.

> Been there, done that.  It doesn't turn out to be all that useful.

As the examples above show, you obviously have. Still, I don't see how
you can possibly say it isn't useful...

> To see why, consider a + b where a and b are Foo instances, and Foo
> inherits from numarray.
> 
> a. a + b will be a numarray, not a Foo instance, unless you write a
> new + operator.  b. Attempting to have numarray itself apply a
> subclass constructor to the result runs into the problem that numarray
> does not have any idea what the constructor's signature is or what
> information is needed to fill out that constructor.  c. Even if the
> subclass accepts numarray's constructor signature, it would rarely
> produce satisfactory results just "losing" the Foo'ness details of a
> and b.
> 
> This same argument applies to every method that returns a Foo
> instance, and every ufunc. So you end up redoing everything anyway.

[In general it may be useful to subclass Numeric arrays if one just
wants to add/overload methods, but no new properties.]

And third, if you had read my thread you'd have noticed that the use of
methods instead of functions has implications/benefits far beyond the
subclassing issue. It's particularly important for Numeric-like arrays.

All methods in Python are virtual, so you don't actually need to subclass
to use different kinds of objects in the same piece of code.

While you're right in the sense that for many practical applications
there is little use for subclassing (a sparse matrix class is one of
them, for instance), you can't deny that it is quite useful to have
Numeric-like arrays, on the same basis as the file-like objects in
Python: they could be strings or web pages, but as long as they define a
certain set of methods, they can be used wherever a file is expected.

> In short, worrying about subclassing is way down the list of things we
> ought to consider. 

If so, then why did your comment focus only on the subclassing issue?
The subclassing was a mere introduction [perhaps an unfortunate one, I
confess] to the method overloading issue.  Now, if you could (re)read my
first post and comment on my actual suggestion, I would appreciate it.

Of course, I have no problem if the Numeric/numarray maintainers
decide to turn it down. I'll most probably just use UserArray.py to create a
"method-ized" version of Numeric, so that my algorithms can work with
both Numeric arrays and sparse matrices. (I do have a real need for
this.)

BTW, there is an alternative to creating a fully method-ized Numeric array:
just add an attribute which points to the module to which the class belongs,
e.g., "myarray.module.take" would point to "Numeric.take" if it was a
Numeric array, or "Sparse.take" if it was a sparse matrix.
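
The "module attribute" idea could look roughly like the sketch below. All
names here are hypothetical: `DenseModule` and `SparseModule` are
namespaces standing in for the Numeric and Sparse modules, and `Dense` and
`Sparse` are toy classes, so this shows only the dispatch mechanism.

```python
from types import SimpleNamespace


def _dense_take(a, indices):
    return [a.data[i] for i in indices]


def _sparse_take(a, indices):
    return [a.nonzero.get(i, 0) for i in indices]


# Hypothetical stand-ins for the Numeric and Sparse modules.
DenseModule = SimpleNamespace(take=_dense_take)
SparseModule = SimpleNamespace(take=_sparse_take)


class Dense:
    module = DenseModule      # points at "its" module of functions

    def __init__(self, data):
        self.data = list(data)


class Sparse:
    module = SparseModule

    def __init__(self, nonzero):
        self.nonzero = dict(nonzero)


def first_and_last(a, n):
    """Generic algorithm: looks up take() through a.module."""
    return a.module.take(a, [0, n - 1])


dense_result = first_and_last(Dense([5, 6, 7]), 3)
sparse_result = first_and_last(Sparse({2: 9}), 3)
```

The generic algorithm never names a concrete module; it only relies on
every array-like class carrying a `module` attribute with the same set of
functions.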

Regards,

José Fonseca


From bsder at allcaps.org  Fri Jan 24 16:19:03 2003
From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.)
Date: Fri Jan 24 16:19:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124231900.GB32437@localhost.localdomain>
Message-ID: <Pine.LNX.4.44.0301241615390.23684-100000@mail.allcaps.org>

On Fri, 24 Jan 2003, José Fonseca wrote:

> Of course, I have no problem if the Numeric/numarray maintainers
> decide to turn it down. I'll most probably just use UserArray.py to create a
> "method-ized" version of Numeric, so that my algorithms can work with
> both Numeric arrays and sparse matrices. (I do have a real need for
> this.)

Sparse matrices are common enough that they really should be a base part 
of Numeric rather than requiring subclassing/extending/etc.  I know that 
Travis O. was working on some sparse matrix stuff a while back so you 
might want to contact him to get the current status of that work.

-a


From falted at openlc.org  Sat Jan 25 04:43:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 25 04:43:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E319ED4.5060709@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <200301251342.15164.falted@openlc.org>

On Friday 24 January 2003 21:15, Todd Miller wrote:
> >
> >My current thinking is something like:
> >
> >recarrDescr = {
> >    "name"        : defineType(CharType, 16, ""),  # 16-character String
> >    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
> >    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
> >    "grid_i"      : defineType(Int32, 1, 9),    # integer
> >    "grid_j"      : defineType(Int32, 1, 9),    # integer
> >    "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
> >    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
> >    "idnumber"    : defineType(Int64, 1, 0),    # signed long long
> > }
> >
> >where defineType is a class that accepts (type, shape, default)
> > parameters. It can be extended safely in the future if more needs appear.
>
> You're way ahead of me here.  The only thing I don't like about this is
> the additional relative complexity because of the addition of field
> names and default values.   It would be nice to layer this more.
>

Well, I think a map between field names and values is valuable from the
user's point of view. It helps to label the different pieces of information
in the recarray. Moreover, if __getattr__ and __setattr__ methods (or
__getitem__ and __setitem__) were implemented on recarray (as they are
in my recarray2 version, for example), the field name would become a very
convenient way to access a specific field (this introduces the
limitation that a field name must be a valid Python identifier, but I think
this is not a big restriction). By looking at the description dictionary,
the user can get a quick idea of what he can find in every field (with no
need for counting, which can be a big advantage, especially for long records).

With regard to default values, you can make this parameter (even the shape)
a keyword parameter in order to make it optional. In that way, the
definition can be as simple as "defineType(CharType)" (or even just
"CharType", if you add a bit of code) or as complete as
"defineType(CharType, shape, default, whatever_you_want)". I think this is
quite a flexible approach.
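
A minimal sketch of such a defineType with optional keyword parameters is
given below. It is an assumption about how the class could look, not the
PyTables or recarray2 code: the `normalize` helper is hypothetical, and
type names are plain strings so that the example does not depend on
numarray.

```python
class defineType:
    """Hypothetical field descriptor: type plus optional shape and default."""

    def __init__(self, typ, shape=1, default=None, **extra):
        self.typ = typ
        self.shape = shape
        self.default = default
        self.extra = extra  # room for options added in the future

    def __repr__(self):
        return "defineType(%r, shape=%r, default=%r)" % (
            self.typ, self.shape, self.default)


def normalize(descr):
    """Accept a bare type in place of defineType(type)."""
    return {name: value if isinstance(value, defineType) else defineType(value)
            for name, value in descr.items()}


# Type names are strings here only to keep the sketch self-contained.
recarrDescr = normalize({
    "name": defineType("CharType", 16, ""),
    "pressure": defineType("Float32", default=1.0),
    "idnumber": "Int64",          # bare type, wrapped by normalize()
})
```

The `**extra` catch-all is one possible way to leave room for the
"whatever_you_want" parameters without changing the signature later.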

>One more thing I don't understand looking at this:  a dictionary is 
>unordered.

Yeah, but this can be regarded as an advantage rather than a drawback in the
sense that you can choose the order you (the developer) prefer. For example,
I was first using an alphanumerical order to arrange the data fields, but
now I'm considering that an arrangement that optimizes the alignment of the
fields could be far better. For example, say that you have an (Int8, Int32,
Float64) record; in principle it would be easy to create a routine that
arranges this record in the form (Float64, Int32, Int8), which optimizes
access to the different fields (it may even be possible to introduce automatic
padding later on if recarrays support it in the future).
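
The reordering routine mentioned above could be as simple as sorting the
fields by decreasing item size. The sketch below assumes the item sizes
listed in `ITEMSIZE` and uses hypothetical field names; when all sizes are
powers of two, packing the fields in this order places each one at an
offset that is a multiple of its own size, so no padding is needed.

```python
# Item sizes in bytes for a few type names (assumed for this sketch).
ITEMSIZE = {"Int8": 1, "Int16": 2, "Int32": 4, "Int64": 8,
            "Float32": 4, "Float64": 8}


def order_for_alignment(fields):
    """Sort (name, type) pairs by decreasing item size; ties keep input order."""
    return sorted(fields, key=lambda field: -ITEMSIZE[field[1]])


def offsets(fields):
    """Byte offset of each field when packed back to back, without padding."""
    result, position = {}, 0
    for name, typename in fields:
        result[name] = position
        position += ITEMSIZE[typename]
    return result


record = [("flag", "Int8"), ("count", "Int32"), ("value", "Float64")]
ordered = order_for_alignment(record)
layout = offsets(ordered)
```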

Maybe you are getting confused in thinking that recarrDescr will create the
recarray. Not at all, this is a *metadata* definition that can be passed to
the actual recarray function for recarray creation. Its function would be
similar to the formats parameter (with typical values like "3a,4i,3w") in
recarray.array, but with more verbosity and all the reported advantages.

> >instead of
> >
> >((Int16, 3),
> > (Int32, 4),
> > (Float64, 20),
> > )
>
> This is pretty much exactly what I was thinking.  It is straightforward
> to imagine and difficult to forget.
>
> >the former being more handy in lots of situations.
>
> Would you please name some of these so we can explore handling them both
> ways?
>

Well, I'm afraid that the main advantage would be when dealing with
recarrays in C extension modules. In this kind of situation it would be far
better to deal with a "3a4i3w" array than a tuple of Python objects. But
maybe I'm wrong and the latter is not so complicated to manage; however, I
used to work a lot with records (even before meeting recarray) and I was
quite comfortable with formats in string mode.

Or perhaps it would be enough to provide a method for converting from the
standard metadata layout (dictionary or tuple or whatever) to a string
format. This should not be very difficult.
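
Such a conversion might look like the sketch below. Note the assumptions:
the codes in `STRUCT_CODE` are those of Python's standard `struct` module
(the one that already provides "x", "=", "<", ">" and "@"), not recarray's
own format codes, and the mapping from type names to codes is made up for
this example.

```python
import struct

# Mapping from type names to struct-module codes (an assumption of this
# sketch; recarray's own format codes are different).
STRUCT_CODE = {"Int8": "b", "UInt8": "B", "Int16": "h", "Int32": "i",
               "Int64": "q", "Float32": "f", "Float64": "d", "CharType": "s"}


def to_format(fields, byteorder="<"):
    """Turn ((typename, repeat), ...) into a struct format string."""
    return byteorder + "".join(
        "%d%s" % (repeat, STRUCT_CODE[typename]) for typename, repeat in fields)


fields = (("Int16", 3), ("Int32", 4), ("Float64", 20))
fmt = to_format(fields)          # "<3h4i20d"
size = struct.calcsize(fmt)      # record size in bytes, no padding with "<"
```

A C extension could then receive just the string, while Python code keeps
working with the tuple.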

> >
> >Well, if charcodes finally stay in, this has an additional advantage in
> >that the Python crew has provided meaningful ways to express padding
> > (character "x"), endianness ("=", "<", ">") and alignment ("@").
>
> We might also add these to the type-repetition tuple.

It would be nice, of course.

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 25 11:16:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 25 11:16:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
 <200301251342.15164.falted@openlc.org>
Message-ID: <3E32E5E3.2020704@stsci.edu>

Francesc Alted wrote:

>On Friday 24 January 2003 21:15, Todd Miller wrote:
>  
>
>>>My current thinking is something like:
>>>
>>>recarrDescr = {
>>>   "name"        : defineType(CharType, 16, ""),  # 16-character String
>>>   "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>>>   "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>>>   "grid_i"      : defineType(Int32, 1, 9),    # integer
>>>   "grid_j"      : defineType(Int32, 1, 9),    # integer
>>>   "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
>>>   "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>>   "idnumber"    : defineType(Int64, 1, 0),    # signed long long
>>> }
>>>      
>>>
Still think I'd prefer something separable:

recarrStruct = ((CharType, 16),
                UInt8,
                Int16,
                Int32,
                Int32,
                Float32,
                (Float64, 32),
                Int64)

recarrFields = ["name",
                "TDCcount",
                "ADCcount",
                "grid_i",
                "grid_j",
                "pressure",
                "temperature",
                "idnumber"]

I guess it might not be quite as good for large structs.

>>>where defineType is a class that accepts (type, shape, default)
>>>parameters. It can be extended safely in the future if more needs appear.
>>>      
>>>
>>You're way ahead of me here.  The only thing I don't like about this is
>>the additional relative complexity because of the addition of field
>>names and default values.   It would be nice to layer this more.
>>
>>    
>>
>
>Well, I think a map between field names and values is valuable from the
>user's point of view. It helps to label the different pieces of information
>in the recarray. Moreover, if __getattr__ and __setattr__ methods (or
>__getitem__ and __setitem__) were implemented on recarray (as they are
>in my recarray2 version, for example), the field name would become a very
>convenient way to access a specific field (this introduces the
>limitation that a field name must be a valid Python identifier, but I think
>this is not a big restriction). By looking at the description dictionary,
>the user can get a quick idea of what he can find in every field (with no
>need for counting, which can be a big advantage, especially for long records).
>
That's true and sounds nice.  I'm just thinking records with named fields
should be derived from records with positional fields.  If the functionality
is layered, you can use as much complexity as you need.
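
The layering described here could be sketched as two classes, with names
added strictly on top of a positional description. `PositionalRecord` and
`NamedRecord` are hypothetical names, and type names are strings so that
the example runs without numarray.

```python
class PositionalRecord:
    """Layer 1: fields are known only by position."""

    def __init__(self, struct):
        # Normalize bare types to (type, 1) pairs.
        self.struct = tuple(
            item if isinstance(item, tuple) else (item, 1) for item in struct)

    def field_type(self, index):
        return self.struct[index]


class NamedRecord(PositionalRecord):
    """Layer 2: adds names on top of the positional description."""

    def __init__(self, struct, names):
        PositionalRecord.__init__(self, struct)
        if len(names) != len(self.struct):
            raise ValueError("need exactly one name per field")
        self.index = {name: i for i, name in enumerate(names)}

    def field_type(self, key):
        if isinstance(key, str):
            key = self.index[key]
        return PositionalRecord.field_type(self, key)


recarrStruct = (("CharType", 16), "UInt8", "Int16", ("Float64", 32))
recarrFields = ["name", "TDCcount", "ADCcount", "temperature"]
rec = NamedRecord(recarrStruct, recarrFields)
```

Code that only cares about layout can stay with the first layer; the names
remain optional.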

It's a good sign that both you and I thought of an identical tuple format;
it's the obvious minimal one.

>
>With regard to default values, you can make this parameter (even the shape)
>a keyword parameter in order to make it optional. 
>
OK.  That's a good point.

>  
>
>>One more thing I don't understand looking at this:  a dictionary is 
>>unordered.
>>    
>>
>
>Yeah, but this can be regarded as an advantage rather than a drawback in the
>sense that you can choose the order you (the developer) prefer. For example,
>I was first using an alphanumerical order to arrange the data fields, but
>now I'm considering that an arrangement that optimizes the alignment of the
>fields could be far better. For example, say that you have an (Int8, Int32,
>Float64) record; in principle it would be easy to create a routine that
>arranges this record in the form (Float64, Int32, Int8), which optimizes
>access to the different fields (it may even be possible to introduce automatic
>padding later on if recarrays support it in the future).
>
>Maybe you are getting confused 
>
Yes and no. :)

>in thinking that recarrDescr will create the
>recarray. Not at all, this is a *metadata* definition that can be passed to
>the actual recarray function for recarray creation. 
>
Just like the type repetition tuple except also including field names 
and default values.   I don't think you lost me.  For what we do,  the 
exact physical layout of the "struct" is important, so order matters.  I 
see order as part of the meta-data,  but I don't usually deal with 
meta-entities so maybe I've got that part wrong.  :)

>Its function would be
>similar to the formats parameter (with typical values like "3a,4i,3w") in
>recarray.array, but with more verbosity and all the reported advantages.
>
>  
>
>>>instead of
>>>
>>>((Int16, 3),
>>>(Int32, 4),
>>>(Float64, 20),
>>>)
>>>      
>>>
>>This is pretty much exactly what I was thinking.  It is straightforward
>>to imagine and difficult to forget.
>>
>>    
>>
>>>the former being more handy in lots of situations.
>>>      
>>>
>>Would you please name some of these so we can explore handling them both
>>ways?
>>
>>    
>>
>
>Well, I'm afraid that the main advantage would be when dealing with
>recarrays in C extension modules. In this kind of situation it would be far
>better to deal with a "3a4i3w" array than a tuple of Python objects. But
>maybe I'm wrong and the latter is not so complicated to manage; however, I
>used to work a lot with records (even before meeting recarray) and I was
>quite comfortable with formats in string mode.
>
I was thinking that if the above was an issue,  we could write an API 
function(s) to "compile" the type-repetition tuple into arrays of ints 
which describe the type of each field and corresponding repetition factor.

>
>Or perhaps it would be enough to provide a method for converting from the
>standard metadata layout (dictionary or tuple or whatever), to a string
>format. This should not be very difficult.
>  
>
Almost exactly what I suggested above.

See you Monday,
Todd


From baecker at physik.tu-dresden.de  Sun Jan 26 02:41:02 2003
From: baecker at physik.tu-dresden.de (baecker at physik.tu-dresden.de)
Date: Sun Jan 26 02:41:02 2003
Subject: [Numpy-discussion] complex diagonal matrix
Message-ID: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>

Hi,

I just wondered if there is a "nicer" way of generating
a complex diagonal matrix than
  a)
     v=arange(10,typecode=Complex)
     mat=diag(v)
  b)
     v=arange(10)
     mat=diag(v)+0j

Namely, wouldn't something like
  v=arange(10)
  mat=diag(v,typecode=Complex)
be nicer?

BTW: I found that in the (excellent) documentation
of Numeric the definitions from Mlab.py are a bit hidden.
I know nothing about Matlab, and I expected
routines of this type to be found in the section
(together with zeros, ones, etc.).
Also, diag is not listed in the index
 http://www.pfdubois.com/numpy/html2/numpy-22.html#A
or is it?

Arnd


From hinsen at cnrs-orleans.fr  Sun Jan 26 03:11:02 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Sun Jan 26 03:11:02 2003
Subject: [Numpy-discussion] complex diagonal matrix
In-Reply-To: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
References: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
Message-ID: <m3y958tbcv.fsf@localhost.localdomain>

baecker at physik.tu-dresden.de writes:

> I just wondered if there is a "nicer" way of generating
> a complex diagonal matrix than
>   a)
>      v=arange(10,typecode=Complex)
>      mat=diag(v)
>   b)
>      v=arange(10)
>      mat=diag(v)+0j
> 
> Namely, wouldn't something like
>   v=arange(10)
>   mat=diag(v,typecode=Complex)
> be nicer?

Why would that be nicer?

Personally, I prefer to have explicit typecodes limited to a very
small number of array generators, and have all other functions apply
the standard type-preservation rules.
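
The type-preservation rule can be illustrated with a pure-Python toy. The
`diag` below is not Numeric's implementation, just a sketch on nested
lists that assumes a non-empty input: the function takes no typecode, and
the element type of the result follows from the input alone.

```python
def diag(v):
    """Square nested-list matrix with v on the diagonal.

    The off-diagonal zeros are created with the type of the first element,
    so the result has the same element type as the input.
    """
    zero = type(v[0])(0)
    n = len(v)
    return [[v[i] if i == j else zero for j in range(n)] for i in range(n)]


int_mat = diag([1, 2, 3])
complex_mat = diag([complex(x) for x in range(1, 4)])   # like approach a)
```

The caller chooses the type once, when creating the input, which mirrors
approach a) in the question.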

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From list at jsaul.de  Sun Jan 26 04:03:05 2003
From: list at jsaul.de (Joachim Saul)
Date: Sun Jan 26 04:03:05 2003
Subject: [Numpy-discussion] complex diagonal matrix
In-Reply-To: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
References: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
Message-ID: <20030126120117.GB869@jsaul.de>

* baecker at physik.tu-dresden.de [26.01.2003 11:40]:
> I just wondered if there is a "nicer" way of generating
> a complex diagonal matrix than
>   a)
>      v=arange(10,typecode=Complex)
>      mat=diag(v)
>   b)
>      v=arange(10)
>      mat=diag(v)+0j
>
> Namely, wouldn't something like
>   v=arange(10)
>   mat=diag(v,typecode=Complex)
> be nicer?

No, because diag() is supposed to create a diagonal, but *not* to
cast to another type. If you wanted to add that "functionality" to
functions like diag(), you would also have to add it to functions
like reshape() etc., i.e. practically everywhere.

The way it is handled now is reasonably simple and flexible, and
there is really no advantage of your suggestion compared to
approach a).

Cheers,
Joachim


From falted at openlc.org  Mon Jan 27 04:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 04:02:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E32E5E3.2020704@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu>
Message-ID: <200301271301.01659.falted@openlc.org>

On Saturday 25 January 2003 20:30, Todd Miller wrote:
>
> Still think I'd prefer something separable:
>
> recarrStruct = ((CharType, 16),
>                 UInt8,
>                 Int16,
>                 Int32,
>                 Int32,
>                 Float32,
>                 (Float64, 32),
>                 Int64)
>
> recarrFields = ["name",
>                 "TDCcount",
>                 "ADCcount",
>                 "grid_i",
>                 "grid_j",
>                 "pressure",
>                 "temperature",
>                 "idnumber"]
>
> I guess it might not be quite as good for large structs.

Me too...

>
> It's a good sign that both you and I thought of an identical tuple
> format; it's the obvious
> minimal one.

Yeah. We just differ in the way to arrange this metadata to be passed to the
recarray constructor. But I think this is secondary compared to the
flexibility that a verbose approach offers compared with the current string
format. In fact, more than one container might be supported to define the
metadata; one can start with tuples as you suggest, but in the future other
ways can be added (if considered convenient).

For example, I think I'll stick with the dictionary option for PyTables, but
a class declaration for the metadata would also be supported, as in:

class Small(IsRecord):
    var1 = defineType(CharType, 2, "")
    var2 = defineType(Int32, 1)
    var3 = Float64

This would not be difficult to support because, by accessing
Small().__dict__, you also get a dictionary. In addition, the latter will
ensure (by construction) that every field name is a valid Python
identifier, which is mandatory in my current implementation. I find these
containers (dictionaries and classes) both elegant and convenient.

>
> Just like the type repetition tuple except also including field names
> and default values.   I don't think you lost me.  For what we do,  the
> exact physical layout of the "struct" is important, so order matters.  I
> see order as part of the
> meta-data,  but I don't usually deal with meta-entities so maybe I've
> got that part wrong.  :)
>

Well, if you need positional fields, you may add an (optional) parameter,
called, for example, "position", so that you can fix it. 

>
> I was thinking that if the above was an issue,  we could write an API
> function(s) to "compile" the type-repetition tuple into arrays of ints
> which describe the type of each field and corresponding repetition factor.

Yeah, I agree that this would be the best solution. That way, the charcodes
will be factored out of the code, and just providing such an API (both
in Python and C) would be enough to reconstruct them, if needed. That would
allow more consistent numarray internal code. 

>
> See you Monday,

Right, how did you know that? :)

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 06:44:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 06:44:03 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu> <200301271301.01659.falted@openlc.org>
Message-ID: <3E354551.5090704@stsci.edu>

Francesc Alted wrote:

>Yeah. We just differ in the way to arrange this metadata to be passed to the
>recarray constructor. But I think this is secondary compared to the
>flexibility that a verbose approach offers compared with the current string
>format. 
>
Yes.  So one question is:  if we were to add type-repetition tuples to 
recarray as an alternative to the current character code strings,  would 
that be any form of improvement to recarray from your perspective?

As I see it,  recarray currently has a clean separation between format 
and naming which permits the latter to be optional.  Before changing 
that,  I'd need a clear argument why.  (I didn't design and generally 
don't even maintain recarray).

>In fact, more than one container might be supported to define the
>metadata; one can start with tuples as you suggest, but in the future other
>ways can be added (if considered convenient).
>  
>
>For example, I think I'll stick with the dictionary option for PyTables, but
>a class declaration for the metadata would also be supported, as in:
>
>class Small(IsRecord):
>    var1 = defineType(CharType, 2, "")
>    var2 = defineType(Int32, 1)
>    var3 = Float64
>
>This would not be difficult to support because, by accessing
>Small().__dict__, you also get a dictionary. In addition, the latter will
>ensure (by construction) that every field name is a valid Python
>identifier, which is mandatory in my current implementation. I find these
>containers (dictionaries and classes) both elegant and convenient.
>  
>
I'm not trying to be Mr. Negative here,  but one thing to keep in mind 
is this:

 >>> class C:
...     pass
...
 >>> c = C()
 >>> dir(c.__dict__)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', 
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', 
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', 
'__lt__', '__ne__', '__new__', '__reduce__', '__repr__', '__setattr__', 
'__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 
'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 
'popitem', 'setdefault', 'update', 'values']

Which is to say,  the instance dictionary is a little cluttered,  and it 
might not be that easy to determine which objects in it are there to 
define the data format.
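
For what it is worth, the listing above shows the methods of the dict type, not the entries stored in the dictionary. A class namespace does carry a few bookkeeping entries such as __module__ and __doc__, and those are easy to filter by name. The following is only a sketch in present-day Python; Small and declared_names are hypothetical names, not part of numarray or PyTables.

```python
class Small:
    """Hypothetical record declaration in the style quoted above."""
    var1 = ("CharType", 2)
    var2 = ("Int32", 1)
    var3 = "Float64"


def declared_names(cls):
    """Names defined in the class body, minus Python's own bookkeeping."""
    # vars(cls) also holds entries such as __module__ and __doc__;
    # every one of them starts with an underscore.
    return sorted(name for name in vars(cls) if not name.startswith("_"))


print(declared_names(Small))
```

Filtering by name is the crudest possible rule; it breaks as soon as a record needs a private helper attribute, so a marker type for fields may be the safer choice.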

>>Just like the type repetition tuple except also including field names
>>and default values.   I don't think you lost me.  For what we do,  the
>>exact physical layout of the "struct" is important, so order matters.  I
>>see order as part of the
>>meta-data,  but I don't usually deal with meta-entities so maybe I've
>>got that part wrong.  :)
>>
>
>Well, if you need positional fields, you may add a (optional) parameter,
>called for example, "position" so that you can fix it. 
>  
>
I'm sure that's not the easiest way to capture struct layout,  but I 
take your point.   Since position matters to me,  I'd prefer that 
capturing them was implicit.   Since it doesn't to you, it seems OK for 
it to be explicit.   Either default mode can support the other,  but 
capturing order with tuples is free,  while capturing order with a 
__dict__ will take some kind of extra work.
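
To make the trade-off concrete, here is a small sketch of both styles. With tuples, the order of the entries is the layout. With a mapping, each entry needs an explicit position that is sorted on afterwards (dictionaries in the Python 2.2 of this thread do not remember insertion order). The field names and the "position" key are assumptions made for illustration only.

```python
# Tuple form: the order of the entries *is* the struct layout.
layout = (
    ("field1", "CharType", 2),
    ("field2", "Int32", 1),
    ("field3", "Float64", 1),
)
names_from_tuple = [name for name, _type, _repeat in layout]

# Mapping form: each entry carries a hypothetical explicit "position",
# and the layout is recovered by sorting on it.
spec = {
    "field3": {"type": "Float64", "repeat": 1, "position": 3},
    "field1": {"type": "CharType", "repeat": 2, "position": 1},
    "field2": {"type": "Int32", "repeat": 1, "position": 2},
}
names_from_dict = sorted(spec, key=lambda name: spec[name]["position"])

print(names_from_tuple == names_from_dict)
```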

>>I was thinking that if the above was an issue,  we could write an API
>>function(s) to "compile" the type-repetition tuple into arrays of ints
>>which describe the type of each field and corresponding repetition factor.
>>    
>>
>
>Yeah, I agree that this would be the best solution. That way, the charcodes
>will be factored out from the code, and by just providing such and API (both
>in Python and C), would be enough to reconstruct them, if needed. That will
>allow a more consistent numarray internal code. 
>  
>
I'm thinking the general format for this may be converting N-tuples of 
types and ints into N arrays of types and ints.  And vice versa.
It's obvious how this works with numarray types.  I think the chararray 
types need work and need to be mapped into the same integer enumeration 
as the numeric types in a non-overlapping way.
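
A rough sketch of that round trip, in plain Python with lists standing in for the arrays. The type names are ordinary strings here, and compile_format and decompile_format are hypothetical helpers, not an existing numarray API.

```python
def compile_format(fields):
    """Split (type, repeat) pairs into two parallel lists."""
    types = [t for t, _ in fields]
    repeats = [int(n) for _, n in fields]
    return types, repeats


def decompile_format(types, repeats):
    """Inverse of compile_format: rebuild the tuple of (type, repeat) pairs."""
    if len(types) != len(repeats):
        raise ValueError("types and repeats must have the same length")
    return tuple(zip(types, repeats))


fmt = (("Int32", 1), ("Float64", 3), ("CharType", 2))
types, repeats = compile_format(fmt)
print(types, repeats)
```

In a C-level version the types list would presumably hold integer enumeration values, which is where the chararray question comes in.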

>See you Monday,
>  
>
>
>Right, how did you know that? :)
>  
>
Insightful on weekends anyway, 
Todd


From jmiller at stsci.edu  Mon Jan 27 08:30:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 08:30:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org>
Message-ID: <3E355E35.9070805@stsci.edu>

Francesc Alted wrote:

>A Dilluns 27 Gener 2003 15:42, Todd Miller va escriure:
>  
>
>>Yes.  So one question is:  if we were to add type-repetition tuples to
>>recarray as an alternative to the current character code strings,  would
>>that be any form of improvement to recarray from your perspective?
>>    
>>
>
>Well, at least, charcodes can be avoided. I think it's a big win... or maybe
>not as big?
>  
>
I think that avoiding the charcodes would be an improvement. 
 Type-repetition tuples provide a clear, well-defined way to define data 
formats.   It's not so clear that it eliminates the requirement for 
ongoing Numeric compatibility,  but it might.

>  
>
>>As I see it,  recarray currently has a clean separation between format
>>and naming which permits the latter to be optional.  Before changing
>>that,  I'd need a clear argument why.  (I didn't design and generally
>>don't even maintain recarray).
>>    
>>
>
>One argument is that a map is very clear to the user, although
>such a map can also be built *after* the names and format are passed to the
>recarray constructor and made accessible as an attribute. However, the latter
>solution is worse IMO, because the user has to supply two separate pieces of
>information when, actually, these should be regarded as a unit. Anyway,
>this may be a subjective perception.
>  
>
Well,  I think there's truth to the danger of separating names from data 
declarations,  but it is easy to map keys(), values() to the separate 
pieces in a different layer if necessary.  

>This would not be difficult to support because, by accessing to the
>Small().__dict__, you get also a dictionary. In addition, the latter will
>ensure (by construction) that you are not using a non-valid python
>identifier, which is mandatory in my current implementation. I find these
>containers (dictionaries and classes) both elegant and convenient.
>  
>
>>I'm not trying to be Mr. Negative here,  but one thing to keep in mind
>>    
>>
>
>Oh dear, you are right! 
>
For a few seconds there,  I thought I was on a roll!  

>In fact, I forgot that to make this work, you
>need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's
>post: http://mail.python.org/pipermail/python-list/2002-July/112007.html).
>I was following this recipe, but I forgot that I was using Python 2.2.
>
>So, as numarray has to work with previous Python versions, there is no point
>in worrying about that.
>  
>
In truth,   numarray-0.4 and up already require Python-2.2 and up.

>I'm sure that's not the easiest way to capture struct layout,  but I
>take your point.   Since position matters to me,  I'd prefer that
>capturing them was implicit.   Since it doesn't to you, it seems OK for
>it to be explicit.   Either default mode can support the other,  but
>capturing order with tuples is free,  while capturing order with a
>__dict__ will take some kind of extra work.
>  
>
>
>That's right. We have some different needs and priorities, and we should
>take the approach better suited to each other. But exchanging points of view
>is always a great thing.
>
>  
>
>>I'm thinking the general format for this may be converting N-tuples of
>>types and ints into N arrays of types and ints.  And vice versa.
>>It's obvious how this works with numarray types.  I think the chararray
>>types need work and need to be mapped into the same integer enumeration
>>as the numeric types in a non-overlapping way.
>>
>>    
>>
>
>I don't follow your point here. Why should there be a problem with
>chararrays?
>
What I was trying to say is that chararray types are not as well 
designed as the numarray types,  nor are they reflected in the C-API.

>  
>


From falted at openlc.org  Mon Jan 27 08:39:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 08:39:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E354551.5090704@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu>
Message-ID: <200301271717.19055.falted@openlc.org>

A Dilluns 27 Gener 2003 15:42, Todd Miller va escriure:
> Yes.  So one question is:  if we were to add type-repetition tuples to
> recarray as an alternative to the current character code strings,  would
> that be any form of improvement to recarray from your perspective?

Well, at least, charcodes can be avoided. I think it's a big win... or maybe
not as big?

>
> As I see it,  recarray currently has a clean separation between format
> and naming which permits the latter to be optional.  Before changing
> that,  I'd need a clear argument why.  (I didn't design and generally
> don't even maintain recarray).

One argument is that a map is very clear to the user, although
such a map can also be built *after* the names and format are passed to the
recarray constructor and made accessible as an attribute. However, the latter
solution is worse IMO, because the user has to supply two separate pieces of
information when, actually, these should be regarded as a unit. Anyway,
this may be a subjective perception.

> >This would not be difficult to support because, by accessing to the
> >Small().__dict__, you get also a dictionary. In addition, the latter will
> >ensure (by construction) that you are not using a non-valid python
> >identifier, which is mandatory in my current implementation. I find these
> >containers (dictionaries and classes) both elegant and convenient.
>
> I'm not trying to be Mr. Negative here,  but one thing to keep in mind

Oh dear, you are right! In fact, I forgot that to make this work, you
need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's
post: http://mail.python.org/pipermail/python-list/2002-July/112007.html).
I was following this recipe, but I forgot that I was using Python 2.2.

So, as numarray has to work with previous Python versions, there is no point
in worrying about that.

>
> I'm sure that's not the easiest way to capture struct layout,  but I
> take your point.   Since position matters to me,  I'd prefer that
> capturing them was implicit.   Since it doesn't to you, it seems OK for
> it to be explicit.   Either default mode can support the other,  but
> capturing order with tuples is free,  while capturing order with a
> __dict__ will take some kind of extra work.

That's right. We have some different needs and priorities, and we should
take the approach better suited to each other. But exchanging points of view
is always a great thing.

>
> I'm thinking the general format for this may be converting N-tuples of
> types and ints into N arrays of types and ints.  And vice versa.
> It's obvious how this works with numarray types.  I think the chararray
> types need work and need to be mapped into the same integer enumeration
> as the numeric types in a non-overlapping way.
>

I don't follow your point here. Why should there be a problem with
chararrays?

-- 
Francesc Alted


From Chris.Barker at noaa.gov  Mon Jan 27 10:20:06 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Mon Jan 27 10:20:06 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org>
Message-ID: <3E35768B.DD6454BE@noaa.gov>

Francesc Alted wrote:

> So, as numarray has to work with previous python versions, 

Why? Anyone using NumArray is either starting from scratch or porting
from Numeric, so having to port to a newer version of Python is a very
small deal. 


-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Mon Jan 27 10:34:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 10:34:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> <3E35768B.DD6454BE@noaa.gov>
Message-ID: <3E357B5F.9030908@stsci.edu>

Chris Barker wrote:

>Francesc Alted wrote:
>
>  
>
>>So, as numarray has to work with previous python versions, 
>>    
>>
>
>Why? Anyone using NumArray is either starting from scratch or porting
>from Numeric, so having to port to a newer version of Python is a very
>small deal. 
>  
>
Just to make it very clear:  numarray-0.4 and up require Python-2.2 or 
higher.  

Up until numarray-0.4 (released in November),  that was not the case, 
and numarray ran (and was tested!) on Python-2.0 and higher.

The desire to increase C-level Numeric compatibility and to improve 
simple indexing speed led us to a C base class, which is only supported 
in Python-2.2 and up.

Todd


From falted at openlc.org  Mon Jan 27 11:23:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 11:23:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E355E35.9070805@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu>
Message-ID: <200301272021.47587.falted@openlc.org>

A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure:
> >So, as numarray has to work with previous Python versions, there is no
> > point in worrying about that.
>
> In truth,   numarray-0.4 and up already require Python-2.2 and up.

Oh! I didn't know that. In that case, I think it's worth considering the
possibility of defining records as classes built on a metaclass. But,
of course, the final decision is yours.

> >>I'm thinking the general format for this may be converting N-tuples of
> >>types and ints into N arrays of types and ints.  And vice versa.
> >>It's obvious how this works with numarray types.  I think the chararray
> >>types need work and need to be mapped into the same integer enumeration
> >>as the numeric types in a non-overlapping way.
> >
> >I don't follow your point here. Why should there be a problem with
> >chararrays?
>
> What I was trying to say is that chararray types are not as well
> designed as the numarray types,  nor are they reflected in the C-API.

I see. Well, is such a unification really desirable? CharArray entities
come from one module and NumArray from another, and that should be OK. Why
bother creating a unified API or integer enumeration? I think this
should not be a big drawback for C-extension writers (although, to tell the
truth, it would be very elegant if you managed to do it, but maybe it is
not worth the effort, I don't know).

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 11:39:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 11:39:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <000001c2c635$624e9a40$6601a8c0@NICKLEBY>
Message-ID: <3E358A72.6050400@stsci.edu>

Paul F Dubois wrote:

>IMHO you can assume any Python you want. Look to the long term here, not the
>short.
>
You lost me.  numarray-0.4 needs at least Python-2.2 or baseclasses 
don't exist.  I had a slow Python equivalent for the baseclass as I 
refactored prior to numarray-0.4,  but it's gone now.

>
>I'm a bit uncertain on MA as to whether my old design is right. Maybe I
>should be inheriting from NDarray? So that MA is more of a sibling of
>numarray rather than a wrapper of it?
>  
>
I asked Perry about this one.  His points (salted a little by me) were:

1. If you inherit from NumArray,  you also inherit from NDArray.  If you 
only inherit from NDArray,  all you get are the structural operations.

2. If you inherit from NumArray,  you can use Liskov substitution to 
pass MA's directly into extensions expecting NumArrays.  This 
substitution may or may not be good.  Also,  isinstance(anMA, NumArray) 
will return True.  

3. If you inherit from NumArray,  you get numerical method definitions 
which may or may not be applicable to MA.  With a little thrashing,  we 
might also get MAs to work for ufuncs.   In fact, ufuncs are the key to 
whether or not the NumArray numerical methods add any value.
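
The two designs can be sketched with toy stand-ins. The classes below only borrow the names NDArray and NumArray; they are not the numarray classes, and MaskedFromND, MaskedFromNum and needs_numarray are hypothetical.

```python
class NDArray:
    """Stand-in base class: structural operations only."""

    def __init__(self, data):
        self.data = list(data)

    def reversed_copy(self):
        # A structural operation: works the same for every subclass.
        return type(self)(self.data[::-1])


class NumArray(NDArray):
    """Stand-in subclass: adds numerical methods."""

    def total(self):
        return sum(self.data)


class MaskedFromND(NDArray):
    """Sibling design: inherits structure but no numerical methods."""


class MaskedFromNum(NumArray):
    """Subclass design: substitutable wherever a NumArray is expected."""


def needs_numarray(a):
    # An extension-style function that insists on the NumArray interface.
    if not isinstance(a, NumArray):
        raise TypeError("NumArray required")
    return a.total()


print(needs_numarray(MaskedFromNum([1, 2, 3])))
print(isinstance(MaskedFromND([1]), NumArray))
```

Whether the inherited total() means anything for masked data is exactly the open question in point 3.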

Todd

>  
>


From jmiller at stsci.edu  Mon Jan 27 11:54:06 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 11:54:06 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> <200301272021.47587.falted@openlc.org>
Message-ID: <3E358DE0.7040501@stsci.edu>

Francesc Alted wrote:

>A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure:
>  
>
>>>So, as numarray has to work with previous Python versions, there is no
>>>point in worrying about that.
>>>      
>>>
>>In truth,   numarray-0.4 and up already require Python-2.2 and up.
>>    
>>
>
>Oh! I didn't know that. In that case, I think it's worth considering the
>possibility of defining records as classes built on a metaclass. But,
>of course, the final decision is yours.
>  
>
I don't know what you mean here.   Please spell it out a little more.

>  
>
>>>>I'm thinking the general format for this may be converting N-tuples of
>>>>types and ints into N arrays of types and ints.  And vice versa.
>>>>It's obvious how this works with numarray types.  I think the chararray
>>>>types need work and need to be mapped into the same integer enumeration
>>>>as the numeric types in a non-overlapping way.
>>>>        
>>>>
>>>I don't follow your point here. Why should there be a problem with
>>>chararrays?
>>>      
>>>
>>What I was trying to say is that chararray types are not as well
>>designed as the numarray types,  nor are they reflected in the C-API.
>>    
>>
>
>I see. Well, is such a unification really desirable? CharArray entities
>come from one module and NumArray from another, and that should be OK. Why
>bother creating a unified API or integer enumeration? 
>
It may not be necessary.  Int8 with repetition factors may work about 
the same.


From falted at openlc.org  Mon Jan 27 12:16:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 12:16:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E358DE0.7040501@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu>
Message-ID: <200301272114.53545.falted@openlc.org>

A Dilluns 27 Gener 2003 20:52, Todd Miller va escriure:
> >
> >Oh! I didn't know that. In that case, I think it's worth considering
> > the possibility of defining records as classes built on a
> > metaclass. But, of course, the final decision is yours.
>
> I don't know what you mean here.   Please spell it out a little more.

What I meant is that using something like:

class Small(IsRecord):
    field1 = defineType(CharType, 2, default="", position=1)
    field2 = defineType(Int32, 1, position=2)
    field3 = Float64

as a container for recarray metadata is definitely possible instead of the
tuple (formats="2aid",names=("field1","field2", "field3")), if using
Python 2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
you to effectively separate the declared attributes from the implicit ones
in normal classes.

Of course, you can tailor IsRecord to fit your needs.
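
To give an idea of the mechanism, here is a minimal sketch written in present-day Python syntax (Python 2.2 would spell the metaclass hook as __metaclass__). It is not the PyTables implementation; Field and RecordMeta are hypothetical names, and the type names are plain strings.

```python
class Field:
    """Hypothetical field declaration: type name, repeat count, default."""

    def __init__(self, typename, repeat=1, default=None):
        self.typename = typename
        self.repeat = repeat
        self.default = default


class RecordMeta(type):
    """Collect Field attributes into cls.fields when a class is created."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Only the declared fields are kept; __module__, __doc__ and any
        # other attribute of the class body are left out.
        cls.fields = {key: value for key, value in namespace.items()
                      if isinstance(value, Field)}
        return cls


class IsRecord(metaclass=RecordMeta):
    """Base class; subclasses declare their fields as class attributes."""


class Small(IsRecord):
    field1 = Field("CharType", 2, default="")
    field2 = Field("Int32")
    comment = "implicit attributes such as this one are left out"


print(sorted(Small.fields))
```

The metaclass runs once per class definition, so the separation of declared from implicit attributes costs nothing at instance creation time.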

I hope that I have expressed myself more clearly now,

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 12:54:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 12:54:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu> <200301272114.53545.falted@openlc.org>
Message-ID: <3E359C2B.4070509@stsci.edu>

Francesc Alted wrote:

>A Dilluns 27 Gener 2003 20:52, Todd Miller va escriure:
>  
>
>>>Oh! I didn't know that. In that case, I think it's worth considering
>>>the possibility of defining records as classes built on a
>>>metaclass. But, of course, the final decision is yours.
>>>      
>>>
>>I don't know what you mean here.   Please spell it out a little more.
>>    
>>
>
>What I meant is that using something like:
>
>class Small(IsRecord):
>    field1 = defineType(CharType, 2, default="", position=1)
>    field2 = defineType(Int32, 1, position=2)
>    field3 = Float64
>
>as a container for recarray metadata is definitely possible instead of the
>tuple (formats="2aid",names=("field1","field2", "field3")), if using
>Python 2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
>you to effectively separate the declared attributes from the implicit ones
>in normal classes.
>
>Of course, you can tailor IsRecord to fit your needs.
>
>I hope that I have expressed myself more clearly now,
>
>  
>
I looked at your docs here: 
http://pytables.sourceforge.net/html-doc/usersguide-html4.html#section4.2
and what you said above clicked.  Thanks.

Todd


From Chris.Barker at noaa.gov  Tue Jan 28 11:02:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 28 11:02:04 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> <3E288068.3070407@stsci.edu>
Message-ID: <3E36D14D.C3238DFA@noaa.gov>

Konrad Hinsen wrote:
> > M = array(l)
> > Mt = M.transpose()
> >
> > just isn't that much worse than:
> >
> > Mt = transpose(l)
> 
> No, but the automatic conversion enables me to write functions that
> accept any sequence type without even having to think about it.

I've used that too, but I also frequently use something like this:

def function(A):
	A = array(A)
	...

Which is pretty simple too. 
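
The same idiom can be spelled out without Numeric, as a sketch only: coerce the argument once at function entry, then work with a single known type. Here a plain list of floats stands in for the array, and a bare number is wrapped in a one-element list, which is a simplification of the rank-0 case. as_vector and scaled are hypothetical names.

```python
from collections.abc import Iterable
from numbers import Number


def as_vector(obj):
    """Return obj as a list of floats; a bare number becomes a 1-element list."""
    if isinstance(obj, Number):
        return [float(obj)]
    if isinstance(obj, Iterable):
        return [float(item) for item in obj]
    raise TypeError("expected a number or a sequence of numbers")


def scaled(a, factor):
    a = as_vector(a)            # same role as `A = array(A)` above
    return [factor * item for item in a]


print(scaled((1, 2), 2))
print(scaled(3, 2))
```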

> Moreover, it is almost essential in many situations to accept scalars
> in place of arrays, because scalars fulfill the role of rank-0 arrays.

Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
array dichotomy a little cleaner in NumArray?
 
> > I also agree that the point is not subclassing per se, it's
> > polymorphism. It should be easy to write a class that acts like an array
> > in all the ways that you need it to. 
> 
> True, and that is a weak point of NumPy.

Is this getting any better with NumArray?
 

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From falted at openlc.org  Tue Jan 28 11:42:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 28 11:42:07 2003
Subject: [Numpy-discussion] enum values visible in numeric types instances?
Message-ID: <200301282041.21145.falted@openlc.org>

Hi,

A couple of points related with numarray type objects:

1.- When working with numeric type instances like UInt8 or Float64, is
there a way to access their NumarrayType C enumeration counterpart? That
can be handy when you want to map between these objects and integers.

For example, right now, I'm forced to use these mappings in Pyrex:

# Conversion tables from/to classes to the numarray enum types
toenum = {num.Int8:tInt8,       num.UInt8:tUInt8,
          num.Int16:tInt16,     num.UInt16:tUInt16,
          num.Int32:tInt32,     num.UInt32:tUInt32,
          num.Float32:tFloat32, num.Float64:tFloat64,
          CharType:97   # ascii(97) --> 'a' # Special case (to be corrected)
          }

toclass = {tInt8:num.Int8,       tUInt8:num.UInt8,
           tInt16:num.Int16,     tUInt16:num.UInt16,
           tInt32:num.Int32,     tUInt32:num.UInt32,
           tFloat32:num.Float32, tFloat64:num.Float64,
           97:CharType   # ascii(97) --> 'a' # Special case (to be corrected)
          }

(yes, Pyrex lets you do that kind of "miracles", like mappings between
Python objects and C integers)

but if I had this access directly from the object (for example
Int8.enumType), my code (and C-extensions in general) could look simpler.
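
Until something like that exists, one table can at least be derived from the other, so the two cannot drift apart. The sketch below uses plain strings and made-up integer codes in place of the numarray classes and the real NumarrayType values, which are not assumed here; invert is a hypothetical helper.

```python
# Hypothetical enum codes; the real NumarrayType values are not assumed here.
toenum = {"Int8": 1, "UInt8": 2, "Int16": 3, "UInt16": 4, "CharType": 97}


def invert(mapping):
    """Build the reverse table, refusing codes that map to two names."""
    reverse = {}
    for name, code in mapping.items():
        if code in reverse:
            raise ValueError("code %r used by both %r and %r"
                             % (code, reverse[code], name))
        reverse[code] = name
    return reverse


toclass = invert(toenum)
print(toclass[97])
```

The duplicate check matters for the special-case code 97: if a future enumeration value ever reached it, building the reverse table would fail loudly instead of silently shadowing one type.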

2.- I understand now why Todd was worried about assigning CharArray objects
to an enumerated type. In fact, if you look at the above maps, I
have to map this special object myself to the number 97 (which is the ASCII
value for the character "a"). 97 is OK for now because it can't collide (at
least for a while) with other enumeration types.

My suggestion is that it would be a good thing to have a reserved enum type
for CharArray. And I think that mapping CharArrays to Bool or Int8 would
not be a good solution, because chararray objects differ from them in ways
that would make it a mess to distinguish the two kinds of object in C code
by just looking at the enumeration type. 

I don't know, but maybe recarrays also merit a place in enumeration (?). 

By the way, after the discussion with Todd I finally decided to remove all
the Numeric charcodes (and related codes) from PyTables. However, I can
still manage Numeric objects by converting them to numarray and accessing
the class type with the .type() method. And you know what? The code looks
much more logical and neat and, best of all, is less error-prone (well, at
least I hope so!). I definitely encourage you to make a similar transition in
numarray (although I guess that would be more difficult because you still
need Numeric compatibility).

Thanks,

-- 
Francesc Alted


From perry at stsci.edu  Tue Jan 28 13:59:08 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 28 13:59:08 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <3E36D14D.C3238DFA@noaa.gov>
Message-ID: <JFEGLNDJEDNOMPPHDEJFEEDMECAA.perry@stsci.edu>

> Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
> array dichotomy a little cleaner in NumArray?
>
Hmmm, I'd like to say yes, but I'm not sure what exactly you are
referring to. Please elaborate on how you think it should be
changed. About the only thing that comes to mind is that repr()
for rank-0 will be different for numarray than Numeric, and that
it will never be the result of any reduction or similar selection.
  
> > > I also agree that the point is not subclassing per se, it's
> > > polymorphism. It should be easy to write a class that acts 
> like an array
> > > in all the ways that you need it to. 
> > 
> > True, and that is a weak point of NumPy.
> 
> Is this getting any better with NumArray?
>  
Again, I hope so, but I find this too general to know if it satisfies
anyone's specific goals. I'd like to see specific examples. I think
it is often trickier than people initially think.

Perry


From jdhunter at ace.bsd.uchicago.edu  Wed Jan 29 13:13:03 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Wed Jan 29 13:13:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
Message-ID: <m2fzrbzmlc.fsf@mother.paradise.lost>

I have two equal length 1D arrays of 256-4096 complex or floating
point numbers which I need to put into a shape=(len(x),2) array.

I need to do this a lot, so I would like to use the most efficient
means.  Currently I am doing:

def somefunc(x,y):
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y
    do_something_with(X)

Is this the fastest way?

Thanks,
John Hunter


From list at jsaul.de  Thu Jan 30 01:20:04 2003
From: list at jsaul.de (Joachim Saul)
Date: Thu Jan 30 01:20:04 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost>
References: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <20030130091853.GA842@jsaul.de>

* John Hunter [2003-01-29 22:13]:
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
>
> Is this the fastest way?

X = transpose(array([x]+[y]))

It may not be the fastest possible way, but should be about a
factor of two faster; better than nothing.

Cheers,
Joachim


From karthik at james.hut.fi  Thu Jan 30 01:47:03 2003
From: karthik at james.hut.fi (Karthikesh Raju)
Date: Thu Jan 30 01:47:03 2003
Subject: [Numpy-discussion] Object too deep for desired array
In-Reply-To: <E18dySd-0000ec-00@sc8-sf-list2.sourceforge.net>
Message-ID: <Pine.SGI.4.21.0301301138340.1340362-100000@james.hut.fi>

Hi, 

I was trying out something like this:
import Numeric
import LinearAlgebra
import cmath
import RandomArray
import copy


def sMatrix(pd, code, window):
    if window == 0:
        nprime = 1
    else:
        nprime = window
    
    K, C = Numeric.shape(code)
    K1, L = Numeric.shape(pd)
    # check if K == K1 and raise an exception here
    sCode = Numeric.zeros([nprime*C,K*L*(window+1)],'d')

    for k in range(K):
        for l in range(L):
            code1 = copy.deepcopy(Numeric.array(code[k,0:C-pd[k,l]]))
            code1.shape = (C-pd[k,l],1)
            sCode1 = Numeric.concatenate((Numeric.zeros([pd[k,l],1]),Numeric.zeros([C*window,1]),code1))
            sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
    
    return sCode

if __name__ == "__main__":
    pd = Numeric.array([[2]])
    code = Numeric.array([[-1,1,-1,1,1]])
    np = sMatrix(pd,code,0)
    print np
    print "--"*30
    np = sMatrix(pd,code,1)
    print Numeric.shape(np)
    print np
    print "--"*30
    np = sMatrix(pd,code,2)
    print Numeric.shape(np)
    print np
    print "--"*30


------------------------------
And I get stuck with the following error message:

Traceback (most recent call last):
  File "sMatrix.py", line 31, in ?
    np = sMatrix(pd,code,0)
  File "sMatrix.py", line 24, in sMatrix
    sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
ValueError: Object too deep for desired array


------------

I think it is due to the many deep copy operations that I am performing. I
want to be in a position where slices of matrices are not
references but copies, and I should be able to move these
copies around. (Maybe it is inefficient, but that is what I did in
Matlab and I want some compatibility until I learn more Python and
migrate to Python completely.)

Is there a way out? Why is this a problem? Am I missing something?
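
A likely reading of the traceback, offered as a guess rather than a diagnosis: the deep copies are probably not the cause. sCode1 is concatenated from (n,1) blocks, so it has two dimensions, while sCode[:, j] is a one-dimensional slice, and "too deep" would then refer to that extra dimension. The pure-Python analogy below uses nesting depth of lists in place of array rank; depth and flatten_column are hypothetical helpers, not Numeric functions.

```python
def depth(obj):
    """Nesting depth of a list of lists: 1 for a vector, 2 for a matrix."""
    d = 0
    while isinstance(obj, list):
        d += 1
        obj = obj[0] if obj else None
    return d


def flatten_column(column):
    """Turn an (n, 1) column [[a], [b], ...] into a flat vector [a, b, ...]."""
    return [row[0] for row in column]


column = [[-1], [1], [-1]]        # shape (3, 1): depth 2
slot = [0, 0, 0]                  # one column of a matrix: depth 1
print(depth(column), depth(slot))

# Flattening the source first makes the depths agree.
slot[:] = flatten_column(column)
print(slot)
```

If this reading is right, giving sCode1 a one-dimensional shape before the assignment should be enough.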

Best regards,

karthik


-----------------------------------------------------------------------
Karthikesh Raju,		    email: karthik at james.hut.fi		
Researcher,			    http://www.cis.hut.fi/karthik
Helsinki University of Technology,  Tel: +358-9-451 5389
Laboratory of Comp. & Info. Sc.,    Fax: +358-9-451 3277
Department of Computer Sc.,
P.O Box 5400, FIN 02015 HUT,
Espoo, FINLAND
-----------------------------------------------------------------------


From pearu at cens.ioc.ee  Thu Jan 30 01:51:09 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu Jan 30 01:51:09 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <Pine.LNX.4.21.0301301141210.5388-100000@cens.kybi>

On Wed, 29 Jan 2003, John Hunter wrote:

> 
> I have two equal length 1D arrays of 256-4096 complex or floating
> point numbers which I need to put into a shape=(len(x),2) array.
> 
> I need to do this a lot, so I would like to use the most efficient
> means.  Currently I am doing:
> 
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
> 
> Is this the fastest way?

Maybe you could arrange your algorithm so that you first create
X and then reference its columns by x,y without copying:

# Allocate memory
X = zeros( (n,2), typecode=.. )

# Get references to columns
x = X[:,0]
y = X[:,1]

while 1:
  do_something_inplace_with(x,y)
  do_something_with(X)

Pearu


From jdhunter at ace.bsd.uchicago.edu  Thu Jan 30 11:26:05 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 11:26:05 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an
 array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost> (John Hunter's message of
 "Wed, 29 Jan 2003 15:13:03 -0600")
References: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <m2vg064eyj.fsf@mother.paradise.lost>

>>>>> "John" == John Hunter <jdhunter at ace.bsd.uchicago.edu> writes:

    John> I have two equal length 1D arrays of 256-4096 complex or
    John> floating point numbers which I need to put into a
    John> shape=(len(x),2) array.

    John> I need to do this a lot, so I would like to use the most
    John> efficient means.  Currently I am doing:

I tested all the suggested methods and the transpose with [x] and [y]
was the clear winner, with an 8 fold speed up over my original code.
The concatenate method was between 2-3 times faster.

Thanks to all who responded,
John Hunter

cruncher2:~/python/test> python test.py test_naive
test_naive 0.480427026749
cruncher2:~/python/test> python test.py test_concat
test_concat 0.189149975777
cruncher2:~/python/test> python test.py test_transpose
test_transpose 0.0698409080505


from Numeric import transpose, concatenate, reshape, array, zeros
from RandomArray import normal
import time, sys

def test_naive(x,y):
    "Naive approach"
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y

def test_concat(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = concatenate( ( reshape(x,(-1,1)), reshape(y,(-1,1)) ), 1)


def test_transpose(x,y):
    "Thanks to Joachim Saul"
    X = transpose(array([x]+[y]))


m = {'test_naive' : test_naive,
     'test_concat' : test_concat,
     'test_transpose' : test_transpose}

nse1 = normal(0.0, 1.0, (4096,))
nse2 = normal(0.0, 1.0, nse1.shape)

N = 1000

trials = range(N)

func = m[sys.argv[1]]
t1 = time.time()
for i in trials:
    func(nse1,nse2)
t2 = time.time()
print sys.argv[1], t2-t1
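
The same kind of comparison can be rerun with the standard timeit module. What follows is only a minimal sketch under stated assumptions: it uses plain Python lists rather than Numeric arrays, and the names pair_naive, pair_zip and best_time are hypothetical, so the timings it prints are not comparable to the numbers above.

```python
import timeit


def pair_naive(x, y):
    """Preallocate an n-by-2 table, then fill the two columns."""
    rows = [[0.0, 0.0] for _ in range(len(x))]
    for i in range(len(x)):
        rows[i][0] = x[i]
        rows[i][1] = y[i]
    return rows


def pair_zip(x, y):
    """Build the rows in one pass, loosely analogous to transpose([x]+[y])."""
    return [list(row) for row in zip(x, y)]


def best_time(func, x, y, number=100, repeat=3):
    """Return the fastest of `repeat` runs, each calling func `number` times."""
    timer = timeit.Timer(lambda: func(x, y))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    x = [float(i) for i in range(4096)]
    y = [2.0 * v for v in x]
    for func in (pair_naive, pair_zip):
        print(func.__name__, best_time(func, x, y))
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; a single timed loop, as in the script above, is more exposed to it.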


From jdhunter at ace.bsd.uchicago.edu  Thu Jan 30 14:18:04 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 14:18:04 2003
Subject: [Numpy-discussion] mlab functions: psd, csd, cohere, corrcoef
Message-ID: <m27kcm1duu.fsf@mother.paradise.lost>

I needed some spectral analysis functions, and finding none available,
wrote my own.  I use matlab a lot, so I wrote them to be matlab
compatible.  If you all think these look OK, I'm happy to submit them
for inclusion into MLab.  

-------------------------------------------------------------------

"""

Spectral analysis functions for Numerical python written for
compatibility with matlab commands with the same names.

  psd - Power spectral density using Welch's average periodogram
  csd - Cross spectral density using Welch's average periodogram
  cohere - Coherence (normalized cross spectral density)
  corrcoef - The matrix of correlation coefficients

The functions are designed to work for real and complex valued Numeric
arrays.

One of the major differences between this code and matlab's is that I
use functions for 'detrend' and 'window', and matlab uses vectors.
This can be easily changed, but I think the functional approach is a
bit more elegant.

Please send comments, questions and bugs to:

Author: John D. Hunter <jdhunter at ace.bsd.uchicago.edu>

"""

from __future__ import division
from MLab import mean, hanning, cov
from Numeric import zeros, ones, diagonal, transpose, matrixmultiply, \
     resize, sqrt, divide, array, Float, Complex, concatenate, \
     convolve, dot, conjugate, absolute, arange, reshape
from FFT import fft


def norm(x):
    return sqrt(dot(x,x))

def window_hanning(x):
    return hanning(len(x))*x

def window_none(x):
    return x

def detrend_mean(x):
    return x - mean(x)

def detrend_none(x):
    return x

def detrend_linear(x):
    """Remove the best fit line from x"""
    # I'm going to regress x on xx=range(len(x)) and return
    # x - (b*xx+a)
    xx = arange(len(x), typecode=x.typecode())
    X = transpose(array([xx]+[x]))
    C = cov(X)
    b = C[0,1]/C[0,0]
    a = mean(x) - b*mean(xx)
    return x-(b*xx+a)


def psd(x, NFFT=256, Fs=2, detrend=detrend_none,
        window=window_hanning, noverlap=0):
    """
    The power spectral density by Welch's average periodogram method.
    The vector x is divided into NFFT length segments.  Each segment
    is detrended by function detrend and windowed by function window.
    noverlap gives the length of the overlap between segments.  The
    absolute(fft(segment))**2 of each segment are averaged to compute Pxx,
    with a scaling to correct for power loss due to windowing.  Fs is
    the sampling frequency.

    -- NFFT must be a power of 2
    -- detrend and window are functions, unlike in matlab where they are
       vectors.
    -- if length x < NFFT, it will be zero padded to NFFT
    

    Refs:
      Bendat & Piersol -- Random Data: Analysis and Measurement
        Procedures, John Wiley & Sons (1986)

    """

    # n & (n-1) is zero only when n is a power of 2
    if NFFT <= 0 or NFFT & (NFFT-1):
        raise ValueError, 'NFFT must be a power of 2'

    # zero pad x up to NFFT if it is shorter than NFFT
    if len(x)<NFFT:
        n = len(x)
        x = resize(x, (NFFT,))
        x[n:] = 0
    

    # for real x, ignore the negative frequencies
    if x.typecode()==Complex: numFreqs = NFFT
    else: numFreqs = NFFT//2+1
        
    windowVals = window(ones((NFFT,),x.typecode()))
    step = NFFT-noverlap
    ind = range(0,len(x)-NFFT+1,step)
    n = len(ind)
    Pxx = zeros((numFreqs,n), Float)

    # do the ffts of the slices
    for i in range(n):
        thisX = x[ind[i]:ind[i]+NFFT]
        thisX = windowVals*detrend(thisX)
        fx = absolute(fft(thisX))**2
        Pxx[:,i] = fx[:numFreqs]

    # Scale the spectrum by the norm of the window to compensate for
    # windowing loss; see Bendat & Piersol Sec 11.5.2
    if n>1: Pxx = mean(Pxx,1)
    Pxx = divide(Pxx, norm(windowVals)**2)
    freqs = Fs/NFFT*arange(0,numFreqs)
    return Pxx, freqs


def csd(x, y, NFFT=256, Fs=2, detrend=detrend_none,
        window=window_hanning, noverlap=0):
    """
    The cross spectral density Pxy by Welch's average periodogram
    method.  The vectors x and y are divided into NFFT length
    segments.  Each segment is detrended by function detrend and
    windowed by function window.  noverlap gives the length of the
    overlap between segments.  The product of the direct FFTs of x and
    y are averaged over each segment to compute Pxy, with a scaling to
    correct for power loss due to windowing.  Fs is the sampling
    frequency.

    NFFT must be a power of 2

    Refs:
      Bendat & Piersol -- Random Data: Analysis and Measurement
        Procedures, John Wiley & Sons (1986)

    """

    # n & (n-1) is zero only when n is a power of 2
    if NFFT <= 0 or NFFT & (NFFT-1):
        raise ValueError, 'NFFT must be a power of 2'

    # zero pad x and y up to NFFT if they are shorter than NFFT
    if len(x)<NFFT:
        n = len(x)
        x = resize(x, (NFFT,))
        x[n:] = 0
    if len(y)<NFFT:
        n = len(y)
        y = resize(y, (NFFT,))
        y[n:] = 0

    # for real x, ignore the negative frequencies
    if x.typecode()==Complex: numFreqs = NFFT
    else: numFreqs = NFFT//2+1
        
    windowVals = window(ones((NFFT,),x.typecode()))
    step = NFFT-noverlap
    ind = range(0,len(x)-NFFT+1,step)
    n = len(ind)
    Pxy = zeros((numFreqs,n), Complex)

    # do the ffts of the slices
    for i in range(n):
        thisX = x[ind[i]:ind[i]+NFFT]
        thisX = windowVals*detrend(thisX)
        thisY = y[ind[i]:ind[i]+NFFT]
        thisY = windowVals*detrend(thisY)
        fx = fft(thisX)
        fy = fft(thisY)
        Pxy[:,i] = fy[:numFreqs]*conjugate(fx[:numFreqs])

    # Scale the spectrum by the norm of the window to compensate for
    # windowing loss; see Bendat & Piersol Sec 11.5.2
    if n>1: Pxy = mean(Pxy,1)
    Pxy = divide(Pxy, norm(windowVals)**2)
    freqs = Fs/NFFT*arange(0,numFreqs)
    return Pxy, freqs

def cohere(x, y, NFFT=256, Fs=2, detrend=detrend_none,
           window=window_hanning, noverlap=0):
    """
    cohere the coherence between x and y.  Coherence is the normalized
    cross spectral density

    Cxy = |Pxy|^2/(Pxx*Pyy)

    The return value is (Cxy, f), where f are the frequencies of the
    coherence vector.  See the docs for psd and csd for information
    about the function arguments NFFT, detrend, window, noverlap, as
    well as the methods used to compute Pxy, Pxx and Pyy.

    """

    
    Pxx,f = psd(x, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)
    Pyy,f = psd(y, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)
    Pxy,f = csd(x, y, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)

    Cxy = divide(absolute(Pxy)**2, Pxx*Pyy)
    return Cxy, f

def corrcoef(*args):
    """
    
    corrcoef(X) where X is a matrix returns a matrix of correlation
    coefficients for each row of X.
    
    corrcoef(x,y) where x and y are vectors returns the matrix of
    correlation coefficients for x and y.

    Numeric arrays can be real or complex

    The correlation matrix is defined from the covariance matrix C as

    r(i,j) = C[i,j] / sqrt(C[i,i]*C[j,j])
    """

    if len(args)==2:
        X = transpose(array([args[0]]+[args[1]]))
    elif len(args)==1:
        X = args[0]
    else:
        raise RuntimeError, 'Only expecting 1 or 2 arguments'

    
    C = cov(X)
    d = resize(diagonal(C), (2,1))
    r = divide(C,sqrt(matrixmultiply(d,transpose(d))))[0,1]
    try: return r.real
    except AttributeError: return r


-------------------------------------------------------------------
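
The docstrings above require NFFT to be a power of 2 and take the segment start indices from range(0, len(x)-NFFT+1, step). Both rules can be tried out in isolation. This is a minimal sketch in plain Python; is_power_of_two and segment_starts are hypothetical helper names, not part of the module above.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero, negatives and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def segment_starts(length, nfft, noverlap=0):
    """Start indices of the NFFT-long segments, following the range() call in psd."""
    step = nfft - noverlap
    if step <= 0:
        raise ValueError("noverlap must be smaller than NFFT")
    return list(range(0, length - nfft + 1, step))
```

With 1024 samples and NFFT=256 this gives four segments without overlap and seven segments with noverlap=128. Note that an even number such as 6 is not a power of 2.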

I wrote a little test code comparing the output of matlab's equivalent
functions.  Basically, I compute the psd or cohere in matlab and
python and do the rms difference on the resultant vectors

  RMS cohere python/matlab difference 0.000854587104587
  RMS psd python/matlab difference 0.00210783306638
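
The message does not show how the RMS difference was computed, so the following is only one plausible reading: the square root of the mean squared element-wise difference. The function name rms_difference is an assumption, and the sketch works on plain sequences rather than Numeric arrays.

```python
import math


def rms_difference(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # abs() keeps the result real even if the inputs are complex
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)) / len(a))
```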

I am not sure where these differences are arising, but they are quite
small.  I'm going to keep trying to track them down.

For corrcoef, the answers are the same past 8 significant digits.
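
For two real-valued vectors, the correlation coefficient is the covariance divided by the square root of the product of the two variances. A minimal sketch in plain Python, assuming real inputs and the N-1 normalization (which cancels in the ratio anyway); covariance and corrcoef_pair are hypothetical names, not the MLab functions.

```python
import math


def covariance(x, y):
    """Sample covariance (N-1 normalization) of two equal-length real sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def corrcoef_pair(x, y):
    """r = C[0,1] / sqrt(C[0,0] * C[1,1]) for real-valued x and y."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))
```

Perfectly linearly related inputs give r equal to 1 or -1 up to rounding, which makes a convenient sanity check.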

Hope this helps!
John Hunter


From haase at msg.ucsf.edu  Fri Jan 31 05:12:05 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Jan 31 05:12:05 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
Message-ID: <020a01c2c897$65bf2dc0$3b45da80@rodan>

Hi everybody,
I tried a 'python2.2 setup.py install'
of numarray  on a Mac running os-X (10.1; I have also Fink installed)
It starts crunching until:
/usr/bin/ld: Undefined symbols:
_fclearexcept
_fetestexcept

Anyone out there, who uses numarray on osX ?

I'm thankful for any pointer...

Sebastian Haase


From jmiller at stsci.edu  Fri Jan 31 07:31:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 31 07:31:01 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
References: <020a01c2c897$65bf2dc0$3b45da80@rodan>
Message-ID: <3E3A9628.3030704@stsci.edu>

Sebastian Haase wrote:

>Hi everybody,
>I tried a 'python2.2 setup.py install'
>of numarray  on a Mac running os-X (10.1; I have also Fink installed)
>I starts crunching until:
>/usr/bin/ld: Undefined symbols:
>_fclearexcept
>_fetestexcept
>
>Anyone out there, who uses numarray on osX ?
>
>I'm thankful for any pointer...
>
>Sebastian Haase
>  
>
Hi Sebastian,

I am very much a Mac-Amateur,  but I have run numarray under osX by 
first installing a local UNIX version of Python using the source 
tarball.  The steps were roughly as follows:

1. Obtain and unpack the Python source tarball in your home directory. 
 cd there.

2. Configure Python using:  ./configure --prefix=$HOME  

3. Edit the Makefile for the following:

61c61
 > LDFLAGS=
---
< LDFLAGS=      -framework System -framework CoreServices -framework Foundation

This was the only (reasonable) way I could figure out how to tunnel link 
time options down through the distutils in the proper command line 
order.  I'm not really sure this is a minimal set of frameworks,  but it 
did at least work.

4. Build and install python:  make ; make install

5.  Obtain and unpack the numarray source tarball.  cd there.

6.  Build and install numarray:  python setupall.py install

7.  Put $HOME/bin on your PATH and rehash.


Todd


From Chris.Barker at noaa.gov  Fri Jan 31 12:44:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Jan 31 12:44:02 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References: <m2fzrbzmlc.fsf@mother.paradise.lost> <m2vg064eyj.fsf@mother.paradise.lost>
Message-ID: <3E3ADC19.5566CB5A@noaa.gov>

John Hunter wrote:
>     John> I have two equal length 1D arrays of 256-4096 complex or
>     John> floating point numbers which I need to put into a
>     John> shape=(len(x),2) array.

> I tested all the suggested methods and the transpose with [x] and [y]
> was the clear winner, with an 8 fold speed up over my original code.
> The concatenate method was between 2-3 times faster.

I was a little surprised by this, as I figured that the transpose method
made an extra copy of the data (array() makes one copy, transpose()
another). So I looked at the source for concatenate:

def concatenate(a, axis=0):
    """concatenate(a, axis=0) joins the tuple of sequences in a into a single
    NumPy array.
    """
    if axis == 0:
        return multiarray.concatenate(a)
    else:
        new_list = []
        for m in a:
            new_list.append(swapaxes(m, axis, 0))
    return swapaxes(multiarray.concatenate(new_list), axis, 0)

So, if you are concatenating along anything other than the zero-th
axis, you end up doing something similar to the transpose method. Seeing
this, I tried something else:

def test_concat2(x,y):
    x.shape = (1,-1)
    y.shape = (1,-1)
    X = transpose( concatenate( (x, y) ) )
    x.shape = (-1,)
    y.shape = (-1,)

This then uses the native concatenate, but requires an extra copy in the
transpose.

Here's a somewhat cleaner version, though you get more copies:

def test_concat3(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = transpose( concatenate( ( reshape(x,(1,-1)), reshape(y,(1,-1)) ) ) )

Here are the test results:

testing on vectors of length:  4096

test_concat 0.286280035973
test_transpose 0.100033998489
test_naive 0.805399060249
test_concat3 0.109319090843
test_concat2 0.136469960213

All the transpose methods are essentially a tie. Would it be that hard
for concatenate to do its thing for any axis in C? It does seem like
this is a fairly basic operation, and shouldn't require more than one
copy.

By the way, I realised that the transpose method had an extra call.
transpose() can take an appropriate python sequence, so this works just
fine:

def test_transpose2(x,y):
    X = transpose([x]+[y])

However, it doesn't really save you the copy, as I'm pretty sure
transpose makes a copy internally anyway. Test results:
testing on vectors of length:  4096

test_transpose 0.104995965958
test_transpose2 0.103582024574

I think the winner is:

X = transpose([x]+[y])


well, I learned a little bit more about Numeric today.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rob at hooft.net  Fri Jan 31 13:36:03 2003
From: rob at hooft.net (Rob Hooft)
Date: Fri Jan 31 13:36:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References: <m2fzrbzmlc.fsf@mother.paradise.lost>	<m2vg064eyj.fsf@mother.paradise.lost> <3E3ADC19.5566CB5A@noaa.gov>
Message-ID: <3E3AEC19.6020907@hooft.net>

Chris Barker wrote:
> 
> X = transpose([x]+[y])
> 
> 
> well, I learned a little bit more about Numeric today.
> 

I've been skipping through a lot of messages today because I was getting 
behind on mailing list traffic, but I missed one thing in the discussion 
so far (sorry if it was marked already):

    transpose doesn't actually do any work.

Actually, transpose only sets the "strides" counts differently, and this 
is blazingly fast. What is NOT fast is using the transposed array later! 
The problem is that many routines actually require a contiguous array, 
and will make a temporary local contiguous copy. This may happen 
multiple times if the lifetime of the transposed array is long. Even 
routines that do not require a contiguous array and can actually use the 
strides may run significantly slower because the CPU cache is trashed a 
lot by the high strides.

Moral: you can't test this code by looping 1000 times through it; you 
actually should take into account the time it takes to make a contiguous 
array immediately after the transpose call.
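
The point about strides can be made concrete with a toy model. The class below is a hypothetical illustration on a flat Python list, not Numeric's implementation: transposing only swaps the shape and stride tuples, while producing a contiguous array has to visit and copy every element.

```python
class StridedView:
    """A 2-D view onto a flat list, addressed through per-axis strides."""

    def __init__(self, data, shape, strides):
        self.data = data          # shared flat buffer, never copied here
        self.shape = shape        # (rows, cols)
        self.strides = strides    # elements to skip per step along each axis

    def __getitem__(self, index):
        i, j = index
        return self.data[i * self.strides[0] + j * self.strides[1]]

    def transpose(self):
        """Swap shape and strides; the buffer itself is untouched."""
        return StridedView(self.data, self.shape[::-1], self.strides[::-1])

    def is_contiguous(self):
        """True when elements are laid out row by row with no gaps."""
        return self.strides == (self.shape[1], 1)

    def contiguous_copy(self):
        """Walk the view in row-major order and copy into a fresh buffer."""
        rows, cols = self.shape
        flat = [self[i, j] for i in range(rows) for j in range(cols)]
        return StridedView(flat, (rows, cols), (cols, 1))
```

In this model transpose() costs the same regardless of array size, and contiguous_copy() is where the time goes, which is the step a fair benchmark would need to include.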

Regards,

Rob Hooft
-- 
Rob W.W. Hooft  ||  rob at hooft.net  ||  http://www.hooft.net/people/rob/


From edcjones at erols.com  Wed Jan  1 20:29:44 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:29:44 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
Message-ID: <3E13C7DA.70906@erols.com>

Perry Greenfield wrote:
 > Edward Jones writes:

 > > I write code using both PIL and numarray. PIL uses strings for
 > > modes and numarray uses (optionally) strings as typecodes. This
 > > causes problems.  One fix is to emit a DeprecationWarning when
 > > string typecodes are used.  Two functions are needed:
 > > StringTypeWarningOn and StringTypeWarningOff.  The default
 > > should be to ignore this warning.
 >
 > I'm not sure I understand. Can you give me an example of problem
 > code or usage? It sounds like you are trying to test the types of
 > PIL and numarray objects in a generic sense. But I'd understand
 > better if you could show an example.

That's what I was thinking (incorrectly). But I don't need to directly 
compare PIL modes with numarray types.

My code never tries to deduce whether an array is a numarray or a PIL 
image from just the natype_or_mode. A module name (MODULE.NUMARRAY, 
MODULE.PIL) must also be given. I do things this way because I might 
want to include other array/image systems. In an earlier version, I had 
a MODULE.IPL for the Intel Image Processing Library.

The code also implements a policy of forbidding string types.

So now all I can say is:

1. UInt8 == 'X' should not raise an exception. It should return False.

3. There needs to be a function that returns True iff arg is a numarray 
type (UInt8, "UInt8", "b", ...).

def IsType(rep):
     from numerictypes import typeDict, NumericType
     return isinstance(rep, NumericType) or typeDict.has_key(rep)
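
Both requests can be illustrated without numarray. In the sketch below, NumericType, UInt8 and type_dict are stand-ins invented for the example (they are not the real numarray objects), written in present-day Python: comparing a type object with an unrelated value returns False instead of raising, and is_type accepts either a type object or a known string spelling.

```python
class NumericType:
    """Stand-in for a numarray type object; not the real numarray class."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Anything that is not a NumericType is simply unequal, never an error.
        if not isinstance(other, NumericType):
            return False
        return self.name == other.name

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash(self.name)


UInt8 = NumericType("UInt8")

# Hypothetical lookup table mapping string spellings to type objects.
type_dict = {"UInt8": UInt8, "b": UInt8}


def is_type(rep):
    """True iff rep is a type object or a known string spelling of one."""
    if isinstance(rep, NumericType):
        return True
    return isinstance(rep, str) and rep in type_dict
```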


Here is a typical piece of code. "module" can be MODULE.PIL or
MODULE.NUMARRAY.

----
"""General image casting function. Changes the C type of the pixels. 
Information can be lost. The "Convert" functions call C casting 
functions that clip the values. For example, if the input is a UInt16 
and the output is an Int16, any input value greater than 32767 becomes 32767.
"""
def ArrayToArrayCast(arrin, module, natype_or_mode):
     """Converts one array into another. Results are clipped."""
     pars = Parameters(arrin)
     if pars.module == module == MODULE.PIL and \
           pars.mode == natype_or_mode:
         return arrin
     if pars.module == module == MODULE.NUMARRAY and \
                      NA_SameType(pars.natype, natype_or_mode):
         return arrin
     if pars.module == MODULE.NUMARRAY and module == MODULE.NUMARRAY:
         return NA_To_NA_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.PIL and module == MODULE.PIL:
         return PIL_To_PIL_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.NUMARRAY and module == MODULE.PIL:
         return NA_To_PIL_Convert(arrin, natype_or_mode)
     if pars.module == MODULE.PIL and module == MODULE.NUMARRAY:
         return PIL_To_NA_Convert(arrin, natype_or_mode)
----
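
The clipping behaviour described in the docstring can be illustrated without numarray or PIL. This is a minimal sketch on plain Python lists; clip_to_range and uint16_to_int16 are hypothetical names, not the C casting functions that the Convert routines call.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip_to_range(values, lo, hi):
    """Clamp every value into [lo, hi], as a clipping cast would."""
    return [min(max(v, lo), hi) for v in values]


def uint16_to_int16(values):
    """UInt16 inputs are never negative, so only the upper bound can take effect."""
    return clip_to_range(values, INT16_MIN, INT16_MAX)
```

A wrapping cast would instead map 65535 to -1, which is why the docstring warns that information can be lost either way.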


From edcjones at erols.com  Wed Jan  1 20:42:05 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:42:05 2003
Subject: [Numpy-discussion] End of Holidays small comments
Message-ID: <3E13CB14.7040908@erols.com>

node35.html:

     >>> print x.type(), x.real.type()
     D d

should be

     >>> print x.type(), x.real.type()
     numarray type: Complex64 numarray type: Float64

------------------------------------------------

Why use both NUM_C_ARRAY and C_ARRAY?

------------------------------------------------

in  _ndarraymodule.c:

         {"_byteoffset",
          (getter)_ndarray_byteoffset_get,
          (setter)_ndarray_byteoffset_set,
          "shortest seperation between elements in bytes"},
         {"_bytestride",
          (getter)_ndarray_bytestride_get,
          (setter)_ndarray_bytestride_set,
          "shortest seperation between elements in bytes"},

One of the comments is wrong. Also "separation".

------------------------------------------------

libnumarraymodule.c:

     /* Create an empty array. */
     static PyArrayObject *
     NA_Empty(int ndim, int *shape, NumarrayType type)

node42.html:

     static PyObject* NA_Empty( NumarrayType type, int ndim, ...)

Serious documentation error.

------------------------------------------------

I think NA_New should be

     NA_New(int ndim, int* shape, NumarrayType type, void* buffer)

The current NA_New is useful only when ndim is known at code-writing time.

------------------------------------------------

node39.html:

     Note: the type parameter for a macro is one of the Numarray Numeric
     Data Types, not a NumarrayType enumeration value.

There should be an example of one of the GET/SET macros. How about

     unsigned char n;
     int i;
     ...
     n = NA_GET1(arr, UInt8, i);

------------------------------------------------

It seems that the parameters "aligned" and "writeable" are ignored in 
the source code for NA_NewAll and class NumArray.

------------------------------------------------

I would like to see an "int* strides" parameter added to NA_NewAll, so a
non-contiguous "buffer" can be used.

------------------------------------------------

I suggest NA_Copy(PyObject* arr) which is something like

static PyObject* NA_Copy(PyObject* arr)
{
     PyArrayObject* arr1 = arr;
     return NA_NewAll(arr1->nd, (long*) arr1->dimensions,
        arr1->descr->type_num, arr1->data, arr1->byteoffset,
        arr1->bytestride, arr1->byteorder, 1, 1);
}


From edcjones at erols.com  Wed Jan  1 20:45:34 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan  1 20:45:34 2003
Subject: [Numpy-discussion] Slicing API?
Message-ID: <3E13CBC3.6000207@erols.com>

Both in Numeric and now in numarray I have found a need for API 
functions for slicing. Has anyone thought about this?


From jmiller at stsci.edu  Thu Jan  2 06:03:16 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan  2 06:03:16 2003
Subject: [Numpy-discussion] Slicing API?
References: <3E13CBC3.6000207@erols.com>
Message-ID: <3E14481D.9080902@stsci.edu>

Edward C. Jones wrote:

> Both in Numeric and now in numarray I have found a need for API 
> functions for slicing. Has anyone thought about this?
>
Speaking for myself and the numarray C-API, the answer is no.   What API 
do you want?   Can you suggest function prototypes?

Todd


From jmiller at stsci.edu  Thu Jan  2 12:36:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan  2 12:36:53 2003
Subject: [Numpy-discussion] Slicing API?
References: <3E13CBC3.6000207@erols.com> <3E14481D.9080902@stsci.edu> <3E1497E1.1050808@erols.com>
Message-ID: <3E14A435.7040609@stsci.edu>

Edward C. Jones wrote:

> Todd Miller wrote:
>
>> Edward C. Jones wrote:
>>
>>> Both in Numeric and now in numarray I have found a need for API 
>>> functions for slicing. Has anyone thought about this?
>>>
>> Speaking for myself and the numarray C-API, the answer is no.   What 
>> API do you want?   Can you suggest function prototypes? 
>
>
> An API version of  arrout[slices] = arrin[slices]:
>
> static int
> NA_CopySlice(PyArrayObject* arrin, PyArrayObject* arrout,
>     int* startin, int* stepin, int* stopin, int* startout, int* stepout);
>
>
I would suggest something more like the following then:

typedef struct {
    int start, stop, step;
} NumSlice;

static int
NA_CopySlice(PyArrayObject* arrin, int indim, NumSlice *slicein,
    PyArrayObject* arrout,  int outdim, NumSlice *sliceout);

The differences are:

1.  A slice dimension count is added for both input and output arrays. 
 This enables use of partial indices.

2.  Slice values are expressed using the NumSlice typedef/struct rather 
than 3 independent int arrays.

3. The parameter order is shuffled so that input array parameters are 
kept together, and output array parameters are kept together.

But,  I still have these comments:

1.  It looks like it will be cumbersome to use.

2.  We should probably implement it as a callback to Python to avoid 
introducing another set of assignment semantics.  Thus, the 
implementation would really just be building up and executing the calls 
for:  outarr.__setitem__(outslices, inarr.__getitem__(inslices)).

3. The slicing implementation for numarray objects should be optimized 
to C this quarter, if not this month.  So in terms of efficiency, not to 
mention comment 2, this won't buy much.

4. Since Numeric doesn't have this already,  we're probably missing 
something obvious.  
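
Comment 2 can be illustrated in Python terms. The sketch below is a one-dimensional stand-in that uses plain lists instead of numarray arrays, so list assignment semantics apply; NumSlice mirrors the struct above and copy_slice is a hypothetical name.

```python
from collections import namedtuple

# Mirrors the NumSlice struct sketched above: one (start, stop, step) per axis.
NumSlice = namedtuple("NumSlice", ["start", "stop", "step"])


def copy_slice(arrin, slicein, arrout, sliceout):
    """1-D illustration of arrout[sliceout] = arrin[slicein] on Python lists."""
    src = slice(slicein.start, slicein.stop, slicein.step)
    dst = slice(sliceout.start, sliceout.stop, sliceout.step)
    # Spelled out as the two calls named in comment 2.
    arrout.__setitem__(dst, arrin.__getitem__(src))
    return arrout
```

For several dimensions the same idea would pass a tuple of slice objects, one per NumSlice, to __getitem__ and __setitem__.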

Comments?  Still interested?

Todd


From jmiller at stsci.edu  Fri Jan  3 09:49:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan  3 09:49:01 2003
Subject: [Numpy-discussion] End of Holidays small comments
References: <3E13CB14.7040908@erols.com>
Message-ID: <3E15CED2.9070402@stsci.edu>

Wow!   This is great feedback.  Thanks Edward.

Edward C. Jones wrote:

> node35.html:
>
>     >>> print x.type(), x.real.type()
>     D d
>
> should be
>
>     >>> print x.type(), x.real.type()
>     numarray type: Complex64 numarray type: Float64

I talked this over with Perry,  and think it should behave and be 
documented more like:
 >>> print x.type(), x.real.type()
Complex64  Float64

>
> ------------------------------------------------
>
> Why use both NUM_C_ARRAY and C_ARRAY?

In the context of the defining enumeration,  NUM_C_ARRAY looks correct. 
  Anywhere else,  C_ARRAY is about all I can stand.   C_ARRAY is so 
common that I thought a little irregularity would be tolerable.  Chalk 
it up to tastelessness.

>
> ------------------------------------------------
>
> in  _ndarraymodule.c:
>
>         {"_byteoffset",
>          (getter)_ndarray_byteoffset_get,
>          (setter)_ndarray_byteoffset_set,
>          "shortest seperation between elements in bytes"},
>         {"_bytestride",
>          (getter)_ndarray_bytestride_get,
>          (setter)_ndarray_bytestride_set,
>          "shortest seperation between elements in bytes"},
>
> One of the comments is wrong. Also "separation".

Noted.

>
> ------------------------------------------------
>
> libnumarraymodule.c:
>
>     /* Create an empty array. */
>     static PyArrayObject *
>     NA_Empty(int ndim, int *shape, NumarrayType type)
>
> node42.html:
>
>     static PyObject* NA_Empty( NumarrayType type, int ndim, ...)
>
Noted.

>
> ------------------------------------------------
>
> I think NA_New should be
>
>     NA_New(int ndim, int* shape, NumarrayType type, void* buffer)
>
> The current NA_New is useful only when ndim is known at code-writing 
> time.

NA_New is a  "convenience wrapper" around NA_NewAll,  but I see your point.

How about NA_vNew(),  in the spirit of vprintf?

>
> ------------------------------------------------
>
> node39.html:
>
>     Note: the type parameter for a macro is one of the Numarray Numeric
>     Data Types, not a NumarrayType enumeration value.
>
> There should be an example of one of the GET/SET macros. How about
>
>     unsigned char n;
>     int i;
>     ...
>     n = NA_GET1(arr, UInt8, i);

OK.

>
> ------------------------------------------------
>
> It seems that the parameters "aligned" and "writeable" are ignored in 
> the source code for NA_NewAll and class NumArray.

"aligned" is used.

"writeable" should probably be dropped since it is no longer used.   
Since doing that would break an interface someone might be using,  I'd 
rather not.

>
> ------------------------------------------------
>
> I would like to see an "int* strides" parameter added to NA_NewAll, so a
> non-contiguous "buffer" can be used. 

OK.   How about NA_NewAllWithStrides (or insert a better name here)?

>
> ------------------------------------------------
>
> I suggest NA_Copy(PyObject* arr) which is something like
>
> static PyObject* NA_Copy(PyObject* arr)
> {
>     PyArrayObject* arr1 = arr;
>     return NA_NewAll(arr1->nd, (long*) arr1->dimensions, 

This  ((long *)) doesn't work portably, so I would recommend avoiding it.

>
>        arr1->descr->type_num, arr1->data, arr1->byteoffset,
>        arr1->bytestride, arr1->byteorder, 1, 1);
> }
>
I'll add NA_Copy().


From jmiller at stsci.edu  Fri Jan  3 09:52:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan  3 09:52:02 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
References: <3E13C7DA.70906@erols.com>
Message-ID: <3E15CF75.8080207@stsci.edu>

Edward C. Jones wrote:

> So now all I can say is:
>
> 1. UInt8 == 'X' should not raise an exception. It should return False.

OK.   I'll change numarray to return False.

>
> 3. There needs to be a function that returns True iff arg is a numarry 
> type (UInt8, "UInt8", "b", ...).
>
> def IsType(rep):
>     from numerictypes import typeDict
>     return isinstance(rep, NumericType) or typeDict.has_key(rep)

Sounds good too.  I'll add this to numerictypes.

>
>
Thanks,
Todd


From edcjones at erols.com  Fri Jan  3 16:03:04 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Fri Jan  3 16:03:04 2003
Subject: [Numpy-discussion] Grepping the source
Message-ID: <3E162CCB.7070106@erols.com>

Here is a short program I find useful.

#! /usr/bin/env python

import os, sys, tempfile

"""Greps the numarray source code"""

command = \
"""grep -n "%s" \
   /usr/local/src/numarray-0.4/Include/numarray/arrayobject.h \
    ...
   /usr/local/src/numarray-0.4/Lib/_ufunc.py \
   ...
   /usr/local/src/numarray-0.4/Src/libnumarraymodule.c \
 > %s
"""

if len(sys.argv) != 2:
     raise Exception, 'program requires exactly one argument'

temp = tempfile.mktemp()
try:
     os.system(command % (sys.argv[1], temp))
     f = file(temp, 'r')
     lines = f.read().splitlines()
     f.close()
finally:
     if os.path.exists(temp):
         os.remove(temp)

common = len('/usr/local/src/numarray-0.4/')
d = {}
names = []
for line in lines:
     line = line[common:]
     colonloc = line.index(':')
     name = line[:colonloc]
     text = line[colonloc+1:]
     if not d.has_key(name):
         d[name] = []
         names.append(name)
     d[name].append(text)

for name in names:
     if len(d[name]) == 0:
         continue
     print '%s:' % name
     for text in d[name]:
         print '   %s' % text
     print
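
The grouping step of the script can be separated from the grep call, which makes it easy to try on its own. This is a minimal sketch in present-day Python; group_matches and format_groups are hypothetical names, and the input is assumed to be a list of "path:lineno:text" lines such as grep -n prints when given several files.

```python
def group_matches(lines, prefix=""):
    """Group 'path:lineno:text' lines by path, keeping first-seen order."""
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for line in lines:
        if line.startswith(prefix):
            line = line[len(prefix):]
        name, sep, text = line.partition(":")
        if not sep:
            continue  # not a grep match line; skip it
        groups.setdefault(name, []).append(text)
    return groups


def format_groups(groups):
    """Render the groups in the same layout the script above prints."""
    out = []
    for name, texts in groups.items():
        out.append("%s:" % name)
        out.extend("   %s" % text for text in texts)
        out.append("")
    return "\n".join(out)
```

The lines could come from subprocess.run(..., capture_output=True, text=True).stdout.splitlines() in Python 3.7 or later, which avoids the temporary file.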


From magnus at hetland.org  Fri Jan  3 16:24:04 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Fri Jan  3 16:24:04 2003
Subject: [Numpy-discussion] Grepping the source
In-Reply-To: <3E162CCB.7070106@erols.com>
References: <3E162CCB.7070106@erols.com>
Message-ID: <20030104002342.GA18694@idi.ntnu.no>

Edward C. Jones <edcjones at erols.com>:
[snip]
>     lines = f.read().splitlines()

You could use f.readlines() here... Or you could just use

  for line in open(...):

later, if you're using Python 2.2+

-- 
Magnus Lie Hetland
http://hetland.org


From perry at stsci.edu  Mon Jan  6 16:28:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Mon Jan  6 16:28:05 2003
Subject: [Numpy-discussion] package vs module
Message-ID: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>

Back in December the issue of whether numarray should be a package
or set of modules came up. When I asked about the possibility
of making numarray a package (on the scipy mailing list but I
can't seem to find the thread where it was discussed), I got
only positive comments. The issue needs to be raised here also.

Is there any objection to making numarray package based?
The implications are that 3rd party modules (e.g. FFT)
will be imported as part of the package structure, i.e.,

  import numarray.FFT 

or

  from numarray.FFT import *

instead of 

  import FFT

As usual there are advantages and disadvantages. The advantages
are that we will not have name collisions with existing Numeric
modules (currently we name FFT as FFT2 for this reason). It also
potentially reduces name collision issues in general. Most feel
it is a cleaner way to organize the software (at least based on
the feedback so far).

The main disadvantages I see so far are:

1) One will either have to change import statements in old code
   to match the new style (a pain, but generally changing imports
   is not terribly difficult since they are easy to identify) or
   explicitly add the path to each 3rd party module to Python
   Path (or some equivalent).
2) If numarray were accepted into the Python Standard Library, it
   would be the first case (as far as I can tell) of a standard
   library package where we would expect to add sub modules to
   it (e.g., FFT). Normally these would not be distributed with
   the standard library, so some general mechanism will be needed
   to allow numarray to find 3rd party packages outside of the
   Python directory structure. For example, I don't think we can
   require having people install FFT in the Standard Library 
   directory structure after Python is installed. Rather, we would
   probably have numarray look for extension modules in a standard
   named site-packages directory (or site-numarray?) or otherwise
   check a numarraypath environmental variable so that
   import numarray.FFT works properly. Perhaps others have ideas
   about how to best handle this.
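
One possible shape for the environment-variable idea in point 2, as a
sketch only: a package can append extra directories to its own __path__,
after which modules found there import as submodules. The package name
demo_pkg, the variable DEMO_PKG_PATH and the stand-in FFT module are all
hypothetical, and the code assumes a current Python 3 interpreter. It is
not how numarray does or would do it.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()

# The "core" package, installed in one place.
core_dir = os.path.join(root, "core", "demo_pkg")
os.makedirs(core_dir)
with open(os.path.join(core_dir, "__init__.py"), "w") as f:
    f.write(
        "import os\n"
        "# Append every directory named in DEMO_PKG_PATH to the package\n"
        "# search path, so add-ons living there import as demo_pkg.<name>.\n"
        "for _d in os.environ.get('DEMO_PKG_PATH', '').split(os.pathsep):\n"
        "    if _d:\n"
        "        __path__.append(_d)\n"
    )

# A third-party add-on, installed somewhere else entirely.
addon_dir = os.path.join(root, "addons")
os.makedirs(addon_dir)
with open(os.path.join(addon_dir, "FFT.py"), "w") as f:
    f.write("def describe():\n    return 'add-on FFT module'\n")

os.environ["DEMO_PKG_PATH"] = addon_dir
sys.path.insert(0, os.path.join(root, "core"))
importlib.invalidate_caches()

import demo_pkg.FFT  # resolved through the extended __path__

print(demo_pkg.FFT.describe())
```

With this layout the add-on directory can be removed or replaced without
touching the directory that holds the core package.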

Any other issues being overlooked?

Feedback?

Thanks, Perry


From magnus at hetland.org  Mon Jan  6 23:05:02 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Jan  6 23:05:02 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <20030107070426.GC4884@idi.ntnu.no>

Perry Greenfield <perry at stsci.edu>:
>
> Back in December the issue of whether numarray should be a package
> or set of modules came up. When I asked about the possibility
> of making numarray a package (on the scipy mailing list but I
> can't seem to find the thread where it was discussed), I got
> only positive comments. The issue needs to be raised here also.
> 
> Is there any objection to making numarray package based?

I think this seems like a very good and natural thing to do. (Maybe
names like RandomArray2 etc. can be changed too, now... :)

-- 
Magnus Lie Hetland
http://hetland.org


From pearu at cens.ioc.ee  Tue Jan  7 02:22:03 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Tue Jan  7 02:22:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>

On Mon, 6 Jan 2003, Perry Greenfield wrote:

> The main disadvantages I see so far are:
> 
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).
> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library 
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
> 
> Any other issues being overlooked?

There is one, though not so critical at this point but I will raise
it anyway. In summary, I am +1 for making numarray a package.

The issue is related to import time and memory usage: more extension
modules in a package increase both of them, even if users have no
intention of using these modules. On slower machines this may cause
inconvenience, especially in applications that call Python multiple times
for short tasks containing numarray operations.

Let me repeat: currently this is not a problem with either Numeric
(because it never imports its extension modules) or numarray, and it
will not become one until numarray contains a number of extension
modules that presumably are not small.

For a realistic example of this issue consider Scipy (as a sort of upper
bound on what numarray may become one day). Scipy contains a linalg module
that is an (almost complete) wrapper to the ATLAS/BLAS/LAPACK libraries,
and therefore importing the corresponding extension modules can be both
time and memory consuming.  For example, importing scipy into Python may
take 2-5 seconds on a PII 400MHz, mainly because of loading the linalg
extension modules. This time may be annoying for small but frequent tasks.

I wish Python's import mechanism were a bit smarter or lazier about
loading extension modules that are never used...
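
As an illustration of what such a lazier import could look like, here is a
sketch that assumes a much later interpreter than the one discussed in this
thread: Python 3.5+ provides importlib.util.LazyLoader, which postpones
executing a module until one of its attributes is first accessed. The
helper name lazy_import is made up for the example, and colorsys is just a
small standard-library module used as a stand-in.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only sets up the deferral
    return module


colorsys = lazy_import("colorsys")
# The module body is executed here, on the first attribute lookup.
hsv = colorsys.rgb_to_hsv(1.0, 1.0, 1.0)
```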

Pearu


From falted at openlc.org  Tue Jan  7 03:31:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan  7 03:31:07 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <20030107113009.GA2445@openlc.org>

On Mon, Jan 06, 2003 at 07:29:15PM -0500, Perry Greenfield wrote:
> The main disadvantages I see so far are:
> 
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).

I think this should be regarded as a minor annoyance compared with the
advantages of making numarray a package. In addition, the introduction of
numarray as a substitute for Numeric can justify some recoding of existing
applications.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library 
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
> 

Great. I would be glad to see a package containing the numarray kernel, so
that applications can use its core features, together with a mechanism to
add 3rd party packages. In particular, having something similar to
site-numarray to install these packages would be quite neat. In fact, I was
considering including a subset of numarray in the PyTables package (it only
needs the numarray core functionality), but if this reorganization takes
place, I would not need to do that anymore.

> Any other issues being overlooked?

Yeah. In case you decide to break numarray into several modules, what would
be the granularity of the separation? My preference is a reduced core
with basic functionality (to maximize the chances of being included in the
Python Standard Library, but also to allow an easy entry for people who may
wish to use this functionality) and then different, small, 3rd party
packages, but perhaps this is also the most laborious solution.

-- 
Francesc Alted                            PGP KeyID:      0x61C8C11F


From hinsen at cnrs-orleans.fr  Tue Jan  7 03:32:03 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Jan  7 03:32:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCOEDJCPAA.perry@stsci.edu>
Message-ID: <m3vg11fbdw.fsf@chinon.cnrs-orleans.fr>

Perry Greenfield <perry at stsci.edu> writes:

> Back in December the issue of whether numarray should be a package
> potentially reduces name collision issues in general. Most feel
> it is a cleaner way to organize the software (at least based on
> the feedback so far).

I agree. We have discussed converting NumPy into a package a few times
in the past; the major argument against it was compatibility issues.
Numarray will require some changes to import statements anyway, so
this seems the right time to make the change.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can

If you plan to unbundle FFT etc. from numarray, then I would prefer a
different naming scheme: numarray being just numarray, and some other
package name grouping together the other modules. That is not only a
question of installation, but also of general maintenance and of
clarity for users. I see the Python package system as a tree:
everything inside a package belongs together, is distributed together
and is maintained by the same people.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From paul at pfdubois.com  Tue Jan  7 09:25:06 2003
From: paul at pfdubois.com (paul at pfdubois.com)
Date: Tue Jan  7 09:25:06 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <20030107113009.GA2445@openlc.org>
Message-ID: <3E0D027100007B17@mta8.wss.scd.yahoo.com>

1. I favor the package approach.

2. I don't care if FFT is numarray.FFT or numpy.FFT (i.e., in a separate
place). However, see (3).

3. Extensions built with one version of Python/numarray may not work with
a different version. This means the safer approach is to have all addons
inside the same directory, so that you can blow away just one directory
and be sure that no 'old' packages remain. 

Some new stuff being put into Python also envisions being able to add various
zipped files to the Python path as places to be searched. Perhaps this represents
a packaging opportunity. I haven't paid enough attention to be sure.
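
A sketch of the zip idea, assuming the mechanism as it exists in current
Python 3 (a zip archive placed on sys.path is searched like a directory).
The archive name and the module zipped_fft are hypothetical. Note that only
pure-Python modules can be imported this way; compiled extension modules
cannot be loaded from inside an archive, so this does not address point 3
by itself.

```python
import importlib
import os
import sys
import tempfile
import zipfile

root = tempfile.mkdtemp()
archive = os.path.join(root, "addons.zip")

# Pack a one-function module into the archive.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_fft.py",
                "def describe():\n    return 'imported from zip'\n")

# Putting the archive itself on sys.path makes its contents importable.
sys.path.insert(0, archive)
importlib.invalidate_caches()

import zipped_fft

print(zipped_fft.describe())
```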

While we are on the subject of packaging, the current distribution places
all sorts of extraneous test and installation-related files in the Lib directory.
This makes it harder to work with the source when you are new to it.


From tim.hochberg at ieee.org  Tue Jan  7 09:35:17 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Jan  7 09:35:17 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>
References: <Pine.LNX.4.21.0301071133040.14691-100000@cens.kybi>
Message-ID: <3E1B0FAF.7020607@ieee.org>

Pearu Peterson wrote:

>On Mon, 6 Jan 2003, Perry Greenfield wrote:
>
>  
>
>>The main disadvantages I see so far are:
>>
>>1) One will either have to change import statements in old code
>>   to match the new style (a pain, but generally changing imports
>>   is not terribly difficult since they are easy to identify) or
>>   explicitly add the path to each 3rd party module to Python
>>   Path (or some equivalent).
>>2) If numarray were accepted into the Python Standard Library, it
>>   would be the first case (as far as I can tell) of a standard
>>   library package where we would expect to add sub modules to
>>   it (e.g., FFT). Normally these would not be distributed with
>>   the standard library, so some general mechanism will be needed
>>   to allow numarray to find 3rd party packages outside of the
>>   Python directory structure. For example, I don't think we can
>>   require having people install FFT in the Standard Library 
>>   directory structure after Python is installed. Rather, we would
>>   probably have numarray look for extension modules in a standard
>>   named site-packages directory (or site-numarray?) or otherwise
>>   check a numarraypath environmental variable so that
>>   import numarray.FFT works properly. Perhaps others have ideas
>>   about how to best handle this.
>>
>>Any other issues being overlooked?
>>    
>>
>
>There is one, though not so critical at this point but I will raise
>it anyway. In summary, I am +1 for making numarray a package.
>
>The issue is related to import time and memory usage: more extension
>modules in a package increase both of them, even if users have no
>intention of using these modules. On slower machines this may cause
>inconvenience, especially in applications that call Python multiple times
>for short tasks containing numarray operations.
>  
>
That's not right, is it? I'm pretty certain that submodules in a package 
are not loaded until explicitly imported. I'm not sure why SciPy is 
slow, maybe the __init__ imports everything? I don't have a copy here so 
I can't check right now.
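
This is easy to check with a throwaway package. The sketch below assumes a
current Python 3 interpreter; the package name lazy_demo and its linalg
submodule are made up. With an empty __init__.py, importing the package
does not load the submodule. It is loaded only when it is imported
explicitly, or when __init__.py imports it itself.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "lazy_demo")
os.makedirs(pkg_dir)

# An empty __init__.py: the package itself imports nothing.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "linalg.py"), "w") as f:
    f.write("LOADED = True\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import lazy_demo
loaded_before = "lazy_demo.linalg" in sys.modules  # submodule untouched so far

import lazy_demo.linalg
loaded_after = "lazy_demo.linalg" in sys.modules   # now it has been executed
```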

In any event I'm +1 for putting it in a package unless it interferes 
with it getting into the core. As Paul mentioned keeping it in a zip 
archive would be even cooler once that's an option.

-tim


From falted at openlc.org  Wed Jan  8 13:27:06 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan  8 13:27:06 2003
Subject: [Numpy-discussion] some recarray rework
Message-ID: <20030108212648.GA1309@openlc.org>

Hi,

In the context of optimizing the PyTables support for numarray and recarray
objects I have been playing with the recarray module, and ended up with a
somewhat improved version of it. Roughly, the modifications are:

- Addition of a cache to quickly access the columns (numarrays) in
  recarrays. This object is a map (dictionary) where the keys are the
  field names and the values are references to the columns, regarded as
  numarray objects. This dictionary is accessible through the new
  attribute "_fields".

- Addition of an attribute for recarray objects named "_record", which
  points to a special object ("Record2" class) that is aware of the
  "_fields" cache. It can be used to access the different rows in
  recarray objects in an efficient way.

- The "_record" object is callable (it defines the "__call__" method)
  so as to select the recarray row that is active during access to the
  different fields.
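
The three points above can be sketched without numarray at all. The
following is a simplified stand-in, not the attached code: the class name
RowCursor is hypothetical, plain Python lists play the role of the column
arrays, and the syntax is that of a current Python 3 interpreter.

```python
class RowCursor:
    """Stand-in for the "_record" idea: one reusable object per table."""

    def __init__(self, fields):
        # Write through __dict__ because __setattr__ is overridden below.
        self.__dict__["_fields"] = fields
        self.__dict__["_row"] = 0

    def __call__(self, row):
        # Select the active row and return the same object.
        self.__dict__["_row"] = row
        return self

    def __getattr__(self, name):
        # Only called for names not found in __dict__, i.e. field names.
        try:
            return self._fields[name][self._row]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Every attribute assignment is routed to the active row of a column.
        self._fields[name][self._row] = value


# Column cache: field name -> column (plain lists here, numarrays in the post).
fields = {"col1": [456, 2], "col2": ["dbe", "de"], "col3": [1.2, 1.3]}
record = RowCursor(fields)

first = record(0).col1   # read a field of row 0
record(1).col3 = 9.5     # write a field of row 1
```

Because the columns are looked up once and cached, a field access costs one
dictionary lookup and one index operation, and no new object is created per
row.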

Advantages

- Access to rows and columns (fields) in recarray objects is one
  order of magnitude faster (!).

- The new "_fields" and "_record" attributes provide convenient and
  intuitive ways to access the information in recarrays.

- The "_record" attribute supports the "__getattr__" and "__setattr__"
  methods, which are very convenient for accessing fields in a row.

Drawbacks

- The "_record" attribute always points to the same object, and you must
  pass it the row over which you want to operate. So, if you want to
  have two different objects pointing to different rows, you can't use
  the "_record" attribute to get them (but you can still use the
  existing Record class by calling the "__getitem__" method of a
  recarray object).

- Two new attributes are added to the already large number of recarray
  variables. However, these new variables have no special space
  requirements, as the "_record" object has only three scalar variables
  and "_fields" is a dictionary with as many entries as there are fields
  in the recarray, which should not be a large number.
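
The first drawback is ordinary aliasing, and a tiny stand-in makes it
visible. Again this is a sketch with hypothetical names (RowCursor, get)
and plain lists instead of numarrays, written for a current Python 3
interpreter.

```python
class RowCursor:
    """Minimal shared cursor: calling it moves the one and only row index."""

    def __init__(self, fields):
        self.fields = fields
        self.row = 0

    def __call__(self, row):
        self.row = row
        return self

    def get(self, name):
        return self.fields[name][self.row]


fields = {"col1": [456, 2]}
record = RowCursor(fields)

a = record(0)
b = record(1)              # moves the shared cursor; "a" now sees row 1 too
same_object = a is b
value_seen_by_a = a.get("col1")

# Independent views need one object per row, at the cost of an allocation each.
rows = [dict((name, col[i]) for name, col in fields.items()) for i in range(2)]
```

Here a and b are the same object, so a reports the value from row 1. Code
that needs two rows at once has to create separate per-row objects, which
is what the existing Record class does.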

I'm attaching this modified version as well as a testbed program to
test the new access methods and improved performance. The output of this
program, run on a 2GHz Pentium 4 machine, is also included.

Feel free to play with it and/or take/adapt the parts you consider better
suited to the recarray module.

-- 
Francesc Alted                            PGP KeyID:      0x61C8C11F
-------------- next part --------------
import numarray as num
import ndarray as mda
import memory
import chararray
import sys, copy, os, re, types, string

__version__ = '1.0'

class Char:
    """ data type Char class"""
    bytes = 1
    def __repr__(self):
        return "CharType"

CharType = Char()

# translation table to the num data types
numfmt = {'i1':num.Int8, 'u1':num.UInt8, 'i2':num.Int16, 'i4':num.Int32,
          'i8':num.Int64,
          'f4':num.Float32, 'f8':num.Float64,
          'l':num.Bool, 'b':num.Int8, 'u':num.UInt8, 's':num.Int16,
          'i':num.Int32, 'N':num.Int64,
          'f':num.Float32, 'd':num.Float64, 'r':num.Float32,
          'a':CharType,
          'Int8':num.Int8, 'Int16':num.Int16, 'Int32':num.Int32,
          'Int64':num.Int64,
          'UInt8':num.UInt8, 'Float32':num.Float32, 'Float64':num.Float64,
          'Bool':num.Bool}

# the reverse translation table of the above (for numarray only)
revfmt = {num.Int16:'s', num.Int32:'i', num.Int64:'N',
          num.Float32:'r', num.Float64:'d',
          num.Bool:'l', num.Int8:'b', num.UInt8:'u', CharType:'a'}

# TFORM regular expression
format_re = re.compile(r'(?P<repeat>^[0-9]*)(?P<dtype>[A-Za-z0-9.]+)')

def fromrecords (recList, formats=None, names=None):
    """ create a Record Array from a list of records in text form

        The data in the same field can be heterogeneous, they will be promoted
        to the highest data type.  This method is intended for creating
        smaller record arrays.  If used to create large array e.g.

        r=recarray.fromrecords([[2,3.,'abc']]*100000)

        it is slow.

    >>> r=fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r.field('col1')
    array([456,   2])
    >>> r.field('col2')
    CharArray(['dbe', 'de'])
    >>> import cPickle
    >>> print cPickle.loads(cPickle.dumps(r))
    RecArray[ 
    (456, 'dbe', 1.2),
    (2, 'de', 1.3)
    ]
    """

    _shape = len(recList)
    _nfields = len(recList[0])
    for _rec in recList:
        if len(_rec) != _nfields:
            raise ValueError, "inconsistent number of objects in each record"
    arrlist = [0]*_nfields
    for col in range(_nfields):
        tmp = [0]*_shape
        for row in range(_shape):
            tmp[row] = recList[row][col]
        try:
            arrlist[col] = num.array(tmp)
        except:
            try:
                arrlist[col] = chararray.array(tmp)
            except:
                raise ValueError, "inconsistent data at row %d,field %d" % (row, col)
    _array = fromarrays(arrlist, formats=formats, names=names)
    del arrlist
    del tmp
    return _array

def fromarrays (arrayList, formats=None, names=None):
    """ create a Record Array from a list of num/char arrays

    >>> x1=num.array([1,2,3,4])
    >>> x2=chararray.array(['a','dd','xyz','12'])
    >>> x3=num.array([1.1,2,3,4])
    >>> r=fromarrays([x1,x2,x3],names='a,b,c')
    >>> print r[1]
    (2, 'dd', 2.0)
    >>> x1[1]=34
    >>> r.field('a')
    array([1, 2, 3, 4])
    """

    _shape = len(arrayList[0])

    if formats == None:

        # go through each object in the list to see if it is a numarray or
        # chararray and determine the formats
        formats = ''
        for obj in arrayList:
            if isinstance(obj, chararray.CharArray):
                formats += `obj._itemsize` + 'a,'
            elif isinstance(obj, num.NumArray):
                if len(obj._shape) == 1: _repeat = ''
                elif len(obj._shape) == 2: _repeat = `obj._shape[1]`
                else: raise ValueError, "doesn't support numarray more than 2-D"

                formats += _repeat + revfmt[obj._type] + ','
            else:
                raise ValueError, "item in the array list must be numarray or chararray"
        formats=formats[:-1]

    for obj in arrayList:
        if len(obj) != _shape:
            raise ValueError, "array has different lengths"

    _array = RecArray(None, formats=formats, shape=_shape, names=names)

    # populate the record array (make a copy)
    for i in range(len(arrayList)):
        try:
            _array.field(_array._names[i])[:] = arrayList[i]
        except:
            print "Incorrect CharArray format %s, copy unsuccessful." % _array._formats[i]
    return _array

def fromstring (datastring, formats, shape=0, names=None):
    """ create a Record Array from binary data contained in a string"""
    _array = RecArray(chararray._stringToBuffer(datastring), formats, shape, names)
    if mda.product(_array._shape)*_array._itemsize > len(datastring):
        raise ValueError("Insufficient input data.")
    else: return _array

def fromfile(file, formats, shape=-1, names=None):
    """Create an array from binary file data

    If file is a string then that file is opened, else it is assumed
    to be a file object. No options at the moment, all file positioning
    must be done prior to this function call with a file object

    >>> import testdata, sys
    >>> fd=open(testdata.filename)
    >>> fd.seek(2880*2)
    >>> r=fromfile(fd, formats='d,i,5a', shape=3)
    >>> r._byteorder = "big"
    >>> print r[0]
    (5.1000000000000005, 61, 'abcde')
    >>> r._shape
    (3,)
    """

    if isinstance(shape, types.IntType) or isinstance(shape, types.LongType):
        shape = (shape,)
    name = 0
    if isinstance(file, types.StringType):
        name = 1
        file = open(file, 'rb')
    size = os.path.getsize(file.name) - file.tell()

    dummy = array(None, formats=formats, shape=0)
    itemsize = dummy._itemsize

    if shape and itemsize:
        shapesize = mda.product(shape)*itemsize
        if shapesize < 0:
            shape = list(shape)
            shape[ shape.index(-1) ] = size / -shapesize
            shape = tuple(shape)

    nbytes = mda.product(shape)*itemsize

    if nbytes > size:
        raise ValueError(
                "Not enough bytes left in file for specified shape and type")

    # create the array
    _array = RecArray(None, formats=formats, shape=shape, names=names)
    nbytesread = memory.file_readinto(file, _array._data)
    if nbytesread != nbytes:
        raise IOError("Didn't read as many bytes as expected")
    if name:
        file.close()
    return _array

# The test below was factored out of "array" due to platform specific
# floating point formatted results:  e+020 vs. e+20
if sys.platform == "win32":
    _fnumber = "2.5984589414244182e+020"
else:
    _fnumber = "2.5984589414244182e+20"

__test__ = {}
__test__["array_platform_test_workaround"] = """
        >>> r=array('a'*200,'r,3s,5a,i',3)
        >>> print r[0]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        >>> print r[1]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        """ % globals()
del _fnumber

def array(buffer=None, formats=None, shape=0, names=None):
    """This function creates a new instance of a RecArray.

    buffer      specifies the source of the array's initialization data.
                buffer can be: RecArray, list of records in text, list of
                numarray/chararray, None, string, buffer.

    formats     specifies the format definitions of the array's records.

    shape       specifies the array dimensions.

    names       specifies the field names.

    >>> r=array([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r=array('a'*200,'r,3i,5a,s',3)
    >>> r._bytestride
    23
    >>> r._names
    ['c1', 'c2', 'c3', 'c4']
    >>> r._repeats
    [1, 3, 5, 1]
    >>> r._shape
    (3,)
    """

    if (buffer is None) and (formats is None):
        raise ValueError("Must define formats if buffer=None")
    elif buffer is None or isinstance(buffer, types.BufferType):
        return RecArray(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.StringType):
        return fromstring(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.ListType) or isinstance(buffer, types.TupleType):
        if isinstance(buffer[0], num.NumArray) or isinstance(buffer[0], chararray.CharArray):
            return fromarrays(buffer, formats=formats, names=names)
        else:
            return fromrecords(buffer, formats=formats, names=names)
    elif isinstance(buffer, RecArray):
        return buffer.copy()
    elif isinstance(buffer, types.FileType):
        return fromfile(buffer, formats=formats, shape=shape, names=names)
    else:
        raise ValueError("Unknown input type")

def _RecGetType(name):
    """Converts a type repr string into a type."""
    if name == "CharType":
        return CharType
    else:
        return num._getType(name)

class RecArray(mda.NDArray):
    """Record Array Class"""

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=1):

        # names and formats can be either a string with components separated
        # by commas or a list of string values, e.g. ['i4', 'f4'] and 'i4,f4'
        # are equivalent formats

        self._parseFormats(formats)
        self._fieldNames(names)

        itemsize = self._stops[-1] + 1

        if shape != None:
            if type(shape) in [types.IntType, types.LongType]: shape = (shape,)
            elif (type(shape) == types.TupleType and type(shape[0]) in [types.IntType, types.LongType]):
                pass
            else: raise NameError, "Illegal shape %s" % `shape`

        #XXX need to check shape*itemsize == len(buffer)?

        self._shape = shape
        mda.NDArray.__init__(self, self._shape, itemsize, buffer=buffer,
                             byteoffset=byteoffset,
                             bytestride=bytestride,
                             aligned=aligned)
        self._byteorder = byteorder

        # Build the column arrays
        self._fields = self._get_fields()

        # Associate a record object for accessing values in each row
        # in an efficient way (i.e. without creating a new object each time)
        self._record = Record2(self)

    def _parseFormats(self, formats):
        """ Parse the field formats """

        if (type(formats) in [types.ListType, types.TupleType]):
            _fmt = formats[:]           ### make a copy
        elif (type(formats) == types.StringType):
            _fmt = string.split(formats, ',')
        else:
            raise NameError, "illegal input formats %s" % `formats`

        self._nfields = len(_fmt)
        self._repeats = [1] * self._nfields
        self._sizes = [0] * self._nfields
        self._stops = [0] * self._nfields

        # preserve the input for future reference
        self._formats = [''] * self._nfields

        sum = 0
        for i in range(self._nfields):

            # parse the formats into repeats and formats
            try:
                (_repeat, _dtype) = format_re.match(string.strip(_fmt[i])).groups()
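            except:
                # fail here instead of continuing with an undefined _repeat
                raise ValueError, 'format %s is not recognized' % _fmt[i]

            if _repeat == '': _repeat = 1
            else: _repeat = int(_repeat)    # regex guarantees digits only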
            _fmt[i] = numfmt[_dtype]
            self._repeats[i] = _repeat

            self._sizes[i] = _fmt[i].bytes * _repeat
            sum += self._sizes[i]
            self._stops[i] = sum - 1

            # Unify the appearance of _format, independent of input formats
            self._formats[i] = `_repeat`+revfmt[_fmt[i]]

        self._fmt = _fmt

    def __getstate__(self):
        """returns pickled state dictionary for RecArray"""
        state = mda.NDArray.__getstate__(self)
        state["_fmt"] = map(repr, self._fmt)
        return state
    
    def __setstate__(self, state):
        mda.NDArray.__setstate__(self, state)
        self._fmt = map(_RecGetType, state["_fmt"])

    def _fieldNames(self, names=None):
        """convert input field names into a list and assign to the _names
        attribute """

        if (names):
            if (type(names) in [types.ListType, types.TupleType]):
                pass
            elif (type(names) == types.StringType):
                names = string.split(names, ',')
            else:
                raise NameError, "illegal input names %s" % `names`

            self._names = map(lambda n:string.strip(n), names)
        else: self._names = []

        # if the names are not specified, they will be assigned as "c1, c2,..."
        # if not enough names are specified, they will be assigned as "c[n+1],
        # c[n+2],..." etc. where n is the number of specified names..."
        self._names += map(lambda i: 'c'+`i`, range(len(self._names)+1,self._nfields+1))

    def _get_fields(self):
        """ get a dictionary with fields as numeric arrays """

        # Iterate over all the fields
        fields = {}
        for fieldName in self._names:
            # determine the offset within the record
            indx = index_of(self._names, fieldName)
            _start = self._stops[indx] - self._sizes[indx] + 1

            _shape = self._shape
            _type = self._fmt[indx]
            _buffer = self._data
            _offset = self._byteoffset + _start

            # don't use self._itemsize due to possible slicing
            _stride = self._strides[0]

            _order = self._byteorder

            if isinstance(_type, Char):
                arr = chararray.CharArray(buffer=_buffer, shape=_shape,
                          itemsize=self._repeats[indx], byteoffset=_offset,
                          bytestride=_stride)
            else:
                arr = num.NumArray(shape=_shape, type=_type, buffer=_buffer,
                          byteoffset=_offset, bytestride=_stride,
                          byteorder = _order)

                # modify the _shape and _strides for array elements
                if (self._repeats[indx] > 1):
                    arr._shape = self._shape + (self._repeats[indx],)
                    arr._strides = (self._strides[0], _type.bytes)

            # Put this array as a value in dictionary
            fields[fieldName] = arr

        return fields

    def field(self, fieldName):
        """ get the field data as a numeric array """

        return self._fields[fieldName]
        
    def info(self):
        """display instance's attributes (except _data)"""
        _attrList = dir(self)
        _attrList.remove('_data')
        _attrList.remove('_fmt')
        for attr in _attrList:
            print '%s = %s' % (attr, getattr(self,attr))

    def __str__(self):
        outstr = 'RecArray[ \n'
        for i in self:
            outstr += Record.__str__(i) + ',\n'
        return outstr[:-2] + '\n]'

    ### The following __getitem__ is not in the requirements
    ### and is here for experimental purposes
    def __getitem__(self, key):
        if type(key) == types.TupleType:
            if len(key) == 1:
                return mda.NDArray.__getitem__(self,key[0])
            elif len(key) == 2 and type(key[1]) == types.StringType:
                return mda.NDArray.__getitem__(self,key[0]).field(key[1])
            else:
                raise NameError, "Illegal key %s" % `key`
        return mda.NDArray.__getitem__(self,key)

    def _getitem(self, key):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        return Record(self, row)

    def _setitem(self, key, value):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        for i in range(self._nfields):
            self.field(self._names[i])[row] = value.field(self._names[i])

    def reshape(*value):
        print "Cannot reshape record array."


class Record2:
    """Record2 Class

    This class is similar to Record, except that it is created once
    and stays associated with a recarray for its whole lifetime. When
    speed in traversing the recarray is required, this approach is
    more convenient than creating a new Record object for each row
    that is visited.

    """

    def __init__(self, input):

        self.__dict__["_array"] = input
        self.__dict__["_fields"] = input._fields
        self.__dict__["_row"] = 0

    def __call__(self, row):
        """ set the row for this record object """
        
        if row < self._array.shape[0]:
            self.__dict__["_row"] = row
            return self
        else:
            return None

    def __getattr__(self, fieldName):
        """ get the field data of the record"""
        
        try:
            return self._fields[fieldName][self._row]
        except:
            (type, value, traceback) = sys.exc_info()
            raise AttributeError, "Error accessing \"%s\" attr.\n %s" % \
                  (fieldName, "Error was: \"%s: %s\"" % (type,value))

    def __setattr__(self, fieldName, value):
        """ set the field data of the record"""

        self._fields[fieldName][self._row] = value

    def __str__(self):
        """ represent the record as an string """
        
        outlist = []
        for name in self._array._names:
            outlist.append(`self._fields[name][self._row]`)
        return "(" + ", ".join(outlist) + ")"

class Record:
    """Record Class"""

    def __init__(self, input, row=0):
        if isinstance(input, types.ListType) or isinstance(input, types.TupleType):
            input = fromrecords([input])
        if isinstance(input, RecArray):
            self.array = input
            self.row = row

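    def __getattr__(self, fieldName):
        """ get the field data of the record"""

        # __getattr__ is only consulted when normal lookup fails, so
        # anything that is not a field name must raise AttributeError
        # (silently returning None breaks hasattr() and operator lookup).
        if fieldName not in ('array', 'row') and \
               fieldName in self.array._names:
            return self.array._fields[fieldName][self.row]
        raise AttributeError, fieldName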

    def field(self, fieldName):
        """ get the field data of the record"""

        #return self.array.field(fieldName)[self.row]
        return self.array.field(fieldName)[self.row]

    def __str__(self):
        outstr = '('
        #for i in range(self.array._nfields):
        #    print self.array.field(i)[self.row]
        for name in self.array._names:
            #print self.array.field(name)[self.row]
            #print self.array._fields[name][self.row]
            ### this is not efficient, need to know how to convert N-bytes to each data type
            outstr += `self.array.field(name)[self.row]` + ', '
        return outstr[:-2] + ')'

def index_of(nameList, key):
    """ Get the index of the key in the name list.

        The key can be an integer or string.  If integer, it is the index
        in the list.  If string, the name matching will be case-insensitive and
        trailing blank-insensitive.
    """
    if (type(key) in [types.IntType, types.LongType]):
        indx = key
    elif (type(key) == types.StringType):
        _names = nameList[:]
        for i in range(len(_names)):
            _names[i] = string.lower(_names[i])
        try:
            indx = _names.index(string.strip(string.lower(key)))
        except ValueError:
            raise NameError, "Key %s does not exist" % key
    else:
        raise NameError, "Illegal key %s" % `key`

    return indx

def find_duplicate (list):
    """Find duplication in a list, return a list of duplicated elements"""
    dup = []
    for i in range(len(list)):
        if (list[i] in list[i+1:]):
            if (list[i] not in dup):
                dup.append(list[i])
    return dup

def test():
    import doctest, recarray
    return doctest.testmod(recarray)

if __name__ == "__main__":
    test()
-------------- next part --------------
import sys, time
import numarray as num
import chararray
import recarray
import recarray2  # This is my modified version

usage = \
"""usage: %s recordlength
     Set recordlength to 1000 at least to obtain decent figures!
""" % sys.argv[0]

try:
    reclen = int(sys.argv[1])
except (IndexError, ValueError):
    # No argument given, or it is not an integer
    print usage
    sys.exit()

delta = 0.000001

# Creation of recarrays objects for test
x1=num.array(num.arange(reclen))
x2=chararray.array(None, itemsize=7, shape=reclen)
x3=num.array(num.arange(reclen,reclen*3,2), num.Float64)
r1=recarray.fromarrays([x1,x2,x3],names='a,b,c')
r2=recarray2.fromarrays([x1,x2,x3],names='a,b,c')

print "recarray shape in test ==>", r2.shape

print "Assignment in recarray modified"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)  # select the row to be changed
    #rec.b = "changed"      # change the "b" field
    rec.c = float(row**2)  # Change the "c" field
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print "Field b on row 2 after re-assign:", r2.field("c")[2]
print

print "Assignment in recarray original"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    #r1.field("b")[row] = "changed"
    r1.field("c")[row] = float(row**2)
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print "Field b on row 2 after re-assign:", r1.field("c")[2]
print

print "Selection in recarray modified"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)
    if rec.a < 3:
        print "This record pass the cut ==>", rec.c, "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print

print "Selection in recarray original"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r1[row]
    if rec.field("a") < 3:
        print "This record pass the cut ==>", rec.field("c"), "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))

-------------- next part --------------
recarray shape in test ==> (10000,)
Assignment in recarray modified
-------------------------------
Assign time: 0.15  Rows/s: 66666
Field b on row 2 after re-assign: 4.0

Assignment in recarray original
-------------------------------
Assign time: 1.24  Rows/s: 8064
Field b on row 2 after re-assign: 4.0

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.18  Rows/s: 55555

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.52  Rows/s: 6578

From falted at openlc.org  Fri Jan 10 09:17:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 10 09:17:05 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
Message-ID: <200301101813.41407.falted@openlc.org>

Hi,

I think there are some data types missing in the recarray module. I can
create recarrays using the fromarrays function with no problems except if I
use UInt16, UInt32 and UInt64.

As these types are well supported by numarray, is there any reason why they
don't appear in the numfmt and revfmt mappings in the recarray module? Is it
safe to add them by hand in the source?

Thanks,

-- 
Francesc Alted


From perry at stsci.edu  Fri Jan 10 10:37:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 10 10:37:02 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
In-Reply-To: <200301101813.41407.falted@openlc.org>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEPLEBAA.perry@stsci.edu>

> Hi,
>
> I think there are some data types missing in the recarray module. I can
> create recarrays using the fromarrays function with no problems except
> if I use UInt16, UInt32 and UInt64.
>
> As these types are well supported by numarray, is there any reason why
> they don't appear in the numfmt and revfmt mappings in the recarray
> module? Is it safe to add them by hand in the source?
>
> Thanks,
>
> --
> Francesc Alted
>
Good point. We were using this for an I/O library that didn't use
these types so that's why they didn't get in there originally.
But you are right, they should be. Do you want to make the changes?

Thanks, Perry


From costas at malamas.com  Sat Jan 11 01:12:03 2003
From: costas at malamas.com (Costas Malamas)
Date: Sat Jan 11 01:12:03 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
Message-ID: <000701c2b951$74d59880$6e00a8c0@retek.int>

Hello all,

I have been trying to find a package/addon that will provide a sparse array
class to NumPy, or will at least trick NumPy into using a sparse array as a
regular array, to no avail.

By sparse array here, I do not mean a sparse matrix equation solver, but an
array class that accepts a "default value".  In other words, I would like to
instantiate a 1000x1000x1000 (1e9) array that will have at most 5-10%
populated (i.e. non-zero) elements.  The current NumPy will instantiate the
entire 1e9 array, which is a non-starter if you would like to calculate an
expression with say 4-5 arrays.  Instead, I'd like a class that will only
store the populated cells, and return the default value for the others
(ideally, while doing some smart disk I/O to preserve memory).
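
To make the request concrete, here is a minimal sketch of the kind of class
meant above, assuming a plain dict as the backing store. The name
DefaultSparseArray and its interface are hypothetical and not taken from any
existing package, and no disk I/O is attempted.

```python
class DefaultSparseArray:
    """Hypothetical N-D array that stores only cells differing from a default."""

    def __init__(self, shape, default=0.0):
        self.shape = tuple(shape)
        self.default = default
        self._cells = {}          # maps index tuples to stored values

    def _normalize(self, index):
        # Accept a bare integer for 1-D arrays, then bounds-check every axis.
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index %r out of bounds" % (index,))
        return index

    def __getitem__(self, index):
        return self._cells.get(self._normalize(index), self.default)

    def __setitem__(self, index, value):
        index = self._normalize(index)
        if value == self.default:
            self._cells.pop(index, None)   # keep the store sparse
        else:
            self._cells[index] = value

    def populated(self):
        """Number of cells actually held in memory."""
        return len(self._cells)


a = DefaultSparseArray((1000, 1000, 1000), default=0.0)
a[1, 2, 3] = 4.5
print(a[1, 2, 3], a[9, 9, 9], a.populated())
```

Memory use then grows with the number of populated cells rather than with
the nominal 1e9 size, at the cost of slow element-by-element arithmetic.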

I've tried SciPy, Scientific Python, and a few other modules floating
around; none seem to do the trick, yet I can't help but wonder that this is
not an uncommon setup for a lot of problem domains.  Is there a package out
there?  If there isn't, where should I start looking to create one? From
their description I think SparseLib++ at least would be a good starting
point as a base library.

As a secondary issue, is anyone aware of a package that can handle storage
of such arrays?  netCDF and HDF do not seem to fit the bill; a B-Tree
library seems a more natural fit...

Thanks in advance --any and all input appreciated,

Costas


From ehagemann at comcast.net  Sun Jan 12 15:14:06 2003
From: ehagemann at comcast.net (eric hagemann)
Date: Sun Jan 12 15:14:06 2003
Subject: [Numpy-discussion] questions about array types
Message-ID: <003c01c2ba90$32d015b0$6401a8c0@eric>

Rereading the numeric docs I see the reference to types Float, Float32, Float64 -- which make sense, however I am curious to understand the usefulness of types Float0, Float8 and Float16 which all seem synonyms for Float32.  Was there some thinking that there would be a converter written for 8bit floats?


>>> from Numeric import *
>>> a = array([1,2,3,4],Float32)
>>> fromstring(a.tostring(),Float32)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float)
array([   2.00000047,  512.00012255])  # corrupt, as would be expected
>>> fromstring(a.tostring(),Float0) #seems to convert back as if Float0 == Float32
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float8)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float16)
array([ 1.,  2.,  3.,  4.],'f')
>>>
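
The corrupt result for Float can be reproduced without Numeric. This is a
small sketch using only the standard struct module, assuming little-endian
byte order (as on a typical x86 machine): the 16 bytes of four 32-bit floats
are reinterpreted as two 64-bit floats.

```python
import struct

# Four 32-bit floats occupy 16 bytes ('<' forces little-endian, no padding).
raw = struct.pack('<4f', 1.0, 2.0, 3.0, 4.0)

# Reading the same bytes back with the matching item size recovers the data.
as_float32 = struct.unpack('<4f', raw)

# Reading them as 64-bit floats fuses each pair of neighbours into one double.
as_float64 = struct.unpack('<2d', raw)

print(len(raw), as_float32, as_float64)
```

The fact that Float0, Float8 and Float16 all round-trip the data suggests
they resolve to the same 4-byte item size as Float32.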

From oliphant at ee.byu.edu  Mon Jan 13 12:59:04 2003
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Jan 13 12:59:04 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
In-Reply-To: <000701c2b951$74d59880$6e00a8c0@retek.int>
Message-ID: <Pine.LNX.4.33L2.0301131355340.22743-100000@oliphant.ee.byu.edu>

> Hello all,
>
> I have been trying to find a package/addon that will provide a sparse array
> class to NumPy, or will at least trick NumPy to use a sparse array as a
> regular array, to no avail.
>

Sparse arrays are not a common object.  Sparse matrices have many, many
implementations of which I'm sure you're aware.

What you want is a general purpose N-D array that uses some kind of sparse
storage.  I'm not aware of such an object in any other language.  Most of
the time people remap their particular problem so that any sparse arrays
become sparse matrices.  All of the effort is then focused in manipulating
certain classes of sparse matrices.
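
As an illustration of that remapping (one possible convention, not a
standard one): a 3-D index can be folded into a 2-D matrix coordinate by
keeping the first axis as the row and flattening the remaining axes into the
column. The function names are made up for the example.

```python
def to_matrix_index(index, shape):
    """Fold an N-D index into (row, col): axis 0 is the row, the rest are flattened."""
    row = index[0]
    col = 0
    for i, n in zip(index[1:], shape[1:]):
        col = col * n + i          # row-major flattening of the trailing axes
    return row, col


def from_matrix_index(row, col, shape):
    """Inverse of to_matrix_index."""
    rest = []
    for n in reversed(shape[1:]):
        col, i = divmod(col, n)
        rest.append(i)
    return (row,) + tuple(reversed(rest))


shape = (1000, 1000, 1000)
rc = to_matrix_index((3, 4, 5), shape)
print(rc, from_matrix_index(rc[0], rc[1], shape))
```

With such a mapping the 1000x1000x1000 array becomes a 1000 x 1000000 sparse
matrix, which any existing sparse matrix storage scheme can hold.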

-Travis


From Chris.Barker at noaa.gov  Wed Jan 15 10:21:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 10:21:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
Message-ID: <3E2598CC.DAB8FD8A@noaa.gov>

Hi folks,

I use Numeric and wxPython together a lot (of course I do, I use Numeric
for everything!).

Unfortunately, since wxPython is not Numeric aware, you lose some real
potential performance advantages. For example, I'm now working on
expanding the extensions to graphics device contexts (DCs) so that you
can draw a whole bunch of objects with a single Python call. The idea is
that the looping can be done in C++, rather than Python, saving a lot of
overhead of the loop itself, as well as the Python-wxWindows translation
step.

For drawing thousands of points, the speed-up is substantial. It's less
substantial on more complex objects (rectangles give a factor of two
improvement for ~1000 objects), due to the longer time it takes to draw
the object itself, rather than make the call. 

Anyway, at the moment, Robin Dunn has the wrappers set up so that you
can pass in a NumPy array (or, indeed, any sequence) rather than a list
or tuple of coordinates, but it is faster to use a list than a NumPy
array, because for arrays, it uses the generic PySequence_GetItem call.
If we used the NumPy API directly, it should be faster than using a
list, not slower! This is how a representative section of the code looks
now:


bool      isFastSeq  = PyList_Check(pyPoints) || PyTuple_Check(pyPoints);
.
.
.
                // Get the point coordinates
                if (isFastSeq) {
                    obj = PySequence_Fast_GET_ITEM(pyPoints, i);
                }
                else {
                    obj = PySequence_GetItem(pyPoints, i);
                }

.
.
.

So you can see that if a NumPy array is passed in, PySequence_GetItem
will be used.

What I would like to do is have an isNumPyArray check, and then access
the NumPy array directly in that case.

The tricky part is that Robin does not want to have wxPython require
Numeric. (Oh how I dream of the day that NumArray becomes part of the
standard library!)
How can I check if an object is a NumPy array (and then use it as such),
without including Numeric during compilation?

I know one option is to have conditional compilation, with a NumPy and
a non-NumPy version, but Robin is managing a whole lot of different
versions as it is, and I don't think he wants to deal with twice as many!

Anyone have any ideas?

By the way, you can substitute NumArray for NumPy in this, as it is the
wave of the future, and particularly if it would be easier.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From paul at pfdubois.com  Wed Jan 15 10:50:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Wed Jan 15 10:50:07 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>

If you could do:
try:
    import Numeric
    haveNumeric = 1
except ImportError:
    haveNumeric = 0

in some initialization routine, then you could use this flag.
Alternately you could test on the fly (sys.modules is a dictionary keyed
by module name):
'Numeric' in sys.modules



From jmiller at stsci.edu  Wed Jan 15 10:57:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 10:57:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
 <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <3E25B253.1070108@stsci.edu>

Chris Barker wrote:

>Hi folks,
>
>I use Numeric and wxPython together a lot (of course I do, I use Numeric
>for everything!).
>
>Unfortunately, since wxPython is not Numeric aware, you lose some real
>potential performance advantages. For example, I'm now working on
>expanding the extensions to graphics device contexts (DCs) so that you
>can draw a whole bunch of objects with a single Python call. The idea is
>that the looping can be done in C++, rather than Python, saving a lot of
>overhead of the loop itself, as well as the Python-wxWindows translation
>step.
>
>For drawing thousands of points, the speed-up is substantial. It's less
>substantial on more complex objects (rectangles give a factor of two
>improvement for ~1000 objects), due to the longer time it takes to draw
>the object itself, rather than make the call. 
>
>Anyway, at the moment, Robin Dunn has the wrappers set up so that you
>can pass in a NumPy array (or, indeed, any sequence) rather than a list
>or tuple of coordinates, but it is faster to use a list than a NumPy
>array, because for arrays, it uses the generic PySequence_GetItem call.
>If we used the NumPy API directly, it should be faster than using a
>list, not slower! This is how a representative section of the code looks
>now:
>
>
>bool      isFastSeq  = PyList_Check(pyPoints) || PyTuple_Check(pyPoints);
>.
>.
>.
>                // Get the point coordinates
>                if (isFastSeq) {
>                    obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>                }
>                else {
>                    obj = PySequence_GetItem(pyPoints, i);
>                }
>
>.
>.
>.
>
>So you can see that if a NumPy array is passed in, PySequence_GetItem
>will be used.
>
>What I would like to do is have an isNumPyArray check, and then access
>the NumPy array directly in that case.
>
>The tricky part is that Robin does not want to have wxPython require
>Numeric. (Oh how I dream of the day that NumArray becomes part of the
>standard library!)
>How can I check if an object is a NumPy array (and then use it as such),
>without including Numeric during compilation?
>
>I know one option is to have conditional compilation, with a NumPy and
>a non-NumPy version, but Robin is managing a whole lot of different
>versions as it is, and I don't think he wants to deal with twice as many!
>
>Anyone have any ideas?
>
Use the Python C-API and string literals as the basis for the interface. 
 I think the steps are something like this:

1.  Import "Numeric". (PyImport_ImportModule)

2.  Get the module dictionary.    (PyModule_GetDict)

3.  Get "array" out of the dictionary.   (PyDict_GetItemString)

4.  Call "isinstance" on Numeric.array and the object.   
(PyObject_IsInstance)

Similarly:

1. Import "numarray".

2. Get the module dictionary.

3. Get "NumArray" out of the dictionary

4. Call the C-API equivalent of "isinstance" on numarray.NumArray and 
the object.

The first 3 steps of both cases can be done once, I think, and the
results stored in C static variables to avoid repeated fetches.
If any of the first 3 steps fails, then consider that case failed and
return False.
If it's not a Numeric array,  check to see if it's a numarray.
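
For reference, the same four steps written at the Python level, as a sketch
only. The helper name is hypothetical, and the standard-library array module
stands in for Numeric so that the example runs anywhere. In the real case the
module would be "Numeric", and the name looked up has to be a type object for
isinstance to accept it (an assumption: Numeric is believed to expose this as
ArrayType, while array() itself is a function).

```python
import importlib

_type_cache = {}   # plays the role of the C static variables


def is_instance_of(obj, module_name, type_name):
    """Return True only if module_name imports and obj is an instance of its type_name."""
    key = (module_name, type_name)
    if key not in _type_cache:
        try:
            module = importlib.import_module(module_name)    # step 1
            namespace = vars(module)                         # step 2
            _type_cache[key] = namespace[type_name]          # step 3
        except (ImportError, KeyError):
            _type_cache[key] = None                          # remember the failure
    cls = _type_cache[key]
    if not isinstance(cls, type):
        return False         # lookup failed, or the name is not a type
    return isinstance(obj, cls)                              # step 4


import array
print(is_instance_of(array.array('d', [1.0]), 'array', 'array'))
print(is_instance_of([1.0], 'no_such_module', 'array'))
```

Caching the failure keeps the check cheap when the module is absent, but it
also means a module installed later in the same process would go unnoticed.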

>
>By the way, you can substitute NumArray for NumPy in this, as it is the
>wave of the future, and particularly if it would be easier.
>
>-Chris
>  
>
Todd


From Chris.Barker at noaa.gov  Wed Jan 15 11:00:05 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 11:00:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
 extension package.
References: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>
Message-ID: <3E25A1E4.5CA8C453@noaa.gov>

Paul F Dubois wrote:
> 
> If you could do:
> try:
>     import Numeric
>     haveNumeric = 1
> except ImportError:
>     haveNumeric = 0
> 
> in some initialization routine, then you could use this flag.
> Alternately you could test on the fly (sys.modules is a dictionary keyed
> by module name):
> 'Numeric' in sys.modules

Thanks, but I'm talking about doing this at the C++ level in an
extension package, not at the Python level. This kind of thing is so
much easier in Python, of course!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Wed Jan 15 12:01:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:01:53 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
 <3E2598CC.DAB8FD8A@noaa.gov> <3E25B253.1070108@stsci.edu>
Message-ID: <3E25C182.8080906@stsci.edu>

Todd Miller wrote:

> Chris Barker wrote:
>
>> How can I check if an Object is a NumPy array (and then use it as such),
>> without including Numeric during compilation?
>>
>> I know one option is to have condition compilation, with a NumPy and
>> non-Numpy version, but Robin is managing a whole lot of different
>> version as it is, and I don't think he wants to deal with twice as many!
>>
>> Anyone have any ideas?
>>
> Use the Python C-API and string literals as the basis for the 
> interface. I think the steps are something like this:
>
> 1.  Import "Numeric". (PyImport_ImportModule)
>
> 2.  Get the module dictionary.    (PyModule_GetDict)
>
> 3.  Get "array" out of the dictionary.   (PyDict_GetItemString)
>
> 4.  Call "isinstance" on Numeric.array and the object.   
> (PyObject_IsInstance)
>
> Similarly:
>
> 1. Import "numarray".
>
> 2. Get the module dictionary.
>
> 3. Get "NumArray" out of the dictionary
>
> 4. Call the C-API equivalent of "isinstance" on numarray.NumArray and 
> the object.
>
> The first 3 steps of both cases can be initialized once, I think, and 
> stored in C static variables to avoid repeated fetches. 

On second thought,  just do two functions,  one for Numeric,  one for 
numarray.  

If any of the first 3 steps fail, return False.  Otherwise, return the 
result of the isinstance call.

>
> If it's not a Numeric array,  check to see if it's a numarray. 

My idea to couple these was "not good".  They're not compatible at that 
level anyway.

Since numarray and Numeric are only source level compatible,  C-code can 
be compiled to work with one or the other,  but not both at the same 
time.  It probably makes more sense to just implement for Numeric.  If 
you do want to implement for both,  treat them as separate cases with 
separate recognizer functions and element access code.

But...  It's not clear to me that knowing an object is an array will 
help since getting data elements still has to be done fast,  and that 
seems hard to do without knowing the arrayobject struct.   Keep in mind 
that Numeric and numarray arrays are strided and possibly discontiguous, 
 so there's more to data access than owning a base pointer, as would be 
the case in C.
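
To illustrate the point about strides (a generic sketch, not the actual
arrayobject struct): the byte offset of an element is the base offset plus
the sum of index times stride over each axis, so a transposed or sliced view
walks the same buffer in a different order. The names below are invented for
the example.

```python
import struct


def element_offset(index, strides, byteoffset=0):
    """Byte offset of an element: base offset plus sum of index[k] * strides[k]."""
    return byteoffset + sum(i * s for i, s in zip(index, strides))


# A 2x3 array of 64-bit floats stored row by row: values 0.0 .. 5.0.
buf = struct.pack('<6d', *[float(v) for v in range(6)])
itemsize = 8
c_strides = (3 * itemsize, itemsize)      # contiguous 2x3 layout
t_strides = (itemsize, 3 * itemsize)      # transposed 3x2 view of the same bytes


def get(index, strides):
    """Read one element through the given strides."""
    return struct.unpack_from('<d', buf, element_offset(index, strides))[0]


print(get((1, 2), c_strides), get((2, 1), t_strides), get((0, 1), t_strides))
```

Element access code that assumes a contiguous block would read the wrong
values from the transposed view, which is why the strides have to be known.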

Todd


From falted at openlc.org  Wed Jan 15 12:25:27 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:25:27 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25C182.8080906@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu>
Message-ID: <200301152123.45614.falted@openlc.org>

On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
> My idea to couple these was "not good".  They're not compatible at that
> level anyway.
>
> Since numarray and Numeric are only source level compatible,  C-code can
> be compiled to work with one or the other,  but not both at the same
> time.  It probably makes more sense to just implement for Numeric.  If
> you do want to implement for both,  treat them as separate cases with
> separate recognizer functions and element access code.
>
> But...  It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast,  and that
> seems hard to do without knowing the arrayobject struct.   Keep in mind
> that Numeric and numarray arrays are strided and possibly discontiguous,
>  so there's more to data access than owning a base pointer, as would be
> the case in C.

I think you can use the numarray High-Level C API to overcome these
difficulties. For example, by using the calls:

PyArrayObject* NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_OutputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_IoArray(PyObject *numarray, NumarrayType t, int requires)

as documented in the User's Guide, you can get well-behaved (i.e.
contiguous and well-aligned) C arrays (copying them, if needed) from
either numarray or Numeric arrays if you pass C_ARRAY as the value for
the requires parameter.

In fact, I'm using NA_InputArray in PyTables to manage both numarray and
Numeric arrays with good results.

-- 
Francesc Alted


From jmiller at stsci.edu  Wed Jan 15 12:40:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:40:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu>
 <3E25C182.8080906@stsci.edu> <200301152123.45614.falted@openlc.org>
Message-ID: <3E25CA79.40206@stsci.edu>

Francesc Alted wrote:

>On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>  
>
>>But...  It's not clear to me that knowing an object is an array will
>>help since getting data elements still has to be done fast,  and that
>>seems hard to do without knowing the arrayobject struct.   Keep in mind
>>that Numeric and numarray arrays are strided and possibly discontiguous,
>> so there's more to data access than owning a base pointer, as would be
>>the case in C.
>>    
>>
>
>I think you can use the numarray High-Level C API to overcome these
>difficulties.
>
<snip>

But doesn't using the numarray  C-API require a level of coupling 
(direct knowledge of numarray during compilation) that Chris is trying 
to avoid?

>
>  
>

Todd


From falted at openlc.org  Wed Jan 15 12:59:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:59:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25CA79.40206@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu>
Message-ID: <200301152158.44234.falted@openlc.org>

On Wednesday 15 January 2003 21:54, Todd Miller wrote:
> >I think you can use the numarray High-Level C API to overcome these
> >difficulties.
>
> But doesn't using the numarray  C-API require a level of coupling
> (direct knowledge of numarray during compilation) that Chris is trying
> to avoid?
>

Ooops!, you are right.

Perhaps this kind of scenario (accessing Numeric and numarray arrays from C)
will become more and more common as people become more aware of
numarray's capabilities and want to integrate it into their extensions. That
reinforces my belief that having a small core with the "glue"
functionality between numarray objects and 3rd party extensions in C (or
SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
Standard Library).

That way, people interested in supporting numarray objects in their
extensions have only to install this small core (or even include it as part
of the extension).

Well, speaking as a disinterested and impartial person ;-)

-- 
Francesc Alted


From Chris.Barker at noaa.gov  Wed Jan 15 13:50:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 13:50:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu> <200301152158.44234.falted@openlc.org>
Message-ID: <3E25C99A.9D5E1888@noaa.gov>

Francesc Alted wrote:

> that having a small core with the "glue"
> functionality between numarray objects and 3rd party extensions in C (or
> SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
> Standard Library).
> 
> That way, people interested in supporting numarray objects in their
> extensions have only to install this small core (or even include it as part
> of the extension).

I think that's a fabulous idea, but I have no idea how hard it would be.
There would still be the problem of keeping versions in sync. If I
distributed my package with the glue code, it would only work on
installations using the same version of Numeric (or NumArray, I suppose).


Thanks to all who have commented on my post. These are some ideas I now
have based on your comments:

> > Use the Python C-API and string literals as the basis for the
> > interface. I think the steps are something like this:
> >
> > 1.  Import "Numeric". (PyImport_ImportModule)
> >
> > 2.  Get the module dictionary.    (PyModule_GetDict)
> >
> > 3.  Get "array" out of the dictionary.   (PyDict_GetItemString)
> >
> > 4.  Call "isinstance" on Numeric.array and the object.
> > (PyObject_IsInstance)

OK, so now I can know, at runtime, whether Numeric has been imported.
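For comparison, here is a minimal Python-level sketch of the four quoted steps. It is only an analogue of the C calls, and it makes two assumptions that are not in the thread: the import in step 1 is swapped for a sys.modules lookup so that nothing gets imported as a side effect, and the name of the attribute holding the array type is passed in as a parameter rather than hard-coded. The stdlib "array" module is used purely as a stand-in for the optional package.

```python
import sys
import array  # stdlib stand-in for the optional array package


def is_instance_of_loaded_type(obj, module_name, type_name):
    """Return True only if module_name is already imported and obj is an
    instance of the type it exposes as type_name. Never triggers an import."""
    module = sys.modules.get(module_name)          # steps 1-2: find the module
    if module is None:
        return False                               # not imported, so obj cannot be one of its objects
    candidate = getattr(module, type_name, None)   # step 3: fetch the type by name
    if not isinstance(candidate, type):
        return False                               # name missing, or not a type at all
    return isinstance(obj, candidate)              # step 4: the instance check


# Hypothetical usage with the stand-in module:
points = array.array("d", [1.0, 2.0])
print(is_instance_of_loaded_type(points, "array", "array"))
print(is_instance_of_loaded_type(points, "no_such_array_package", "array"))
```

The check only answers "is this one of those objects?"; getting at the data quickly is a separate problem, as the next quoted paragraph points out.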

> But...  It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast,  and that
> seems hard to do without knowing the arrayobject struct.

Exactly, that's my whole problem. However, I have an idea about this. If
I do the above test, I can now put all the Numeric-specific code into a
conditional, so it would only get called if Numeric were imported. My
idea is that I could make sure Numeric was around at compile time, so I
could use all the Numeric API to access the array data, but it wouldn't
have to be installed at runtime, as none of the Numeric calls would be
executed if Numeric hadn't been imported. Would this work, or would the
system try to load the .dll or .so or whatever even if the calls weren't
executed?

All that being said, Tim Hochberg has mentioned that when he first made
wxPython DCs work with Numeric arrays (sorry I didn't give him credit
before, I had forgotten who did that, thanks Tim), he did some timing
and discovered that the overhead of the drawing calls was
substantially larger than the overhead of the indexing anyway, so
speeding up that process couldn't make much difference.

My timing indicated something different, but I'm using Linux/wxGTK/X11,
and I think the drawing calls return after the message has been sent to
X, but X may not have completed the actual drawing yet. This means that
I'm not timing the whole process, and if I did, I might not see such a
difference. I did some tests with 100,000 points, and found that I could
see the difference between a list and an array: the list was about twice
as fast. Drawing rectangles, however, I can't see the difference.

So, I think I'll probably shelve this for the moment, and concentrate on
getting all the drawing shapes supported by DrawXXXList methods.

Thanks for all your input.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From gvermeul at grenoble.cnrs.fr  Wed Jan 15 13:50:05 2003
From: gvermeul at grenoble.cnrs.fr (gvermeul at grenoble.cnrs.fr)
Date: Wed Jan 15 13:50:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled 
Message-ID: <200301152149.h0FLn6PN032653@grenoble.cnrs.fr>

> Gerard Vermeulen wrote:
> > I just want to point out that PyQwt plots NumPy arrays. I have played
> > a little bit with the Scipy-wxWindows interface, but it is no match
> > for PyQwt (I display x-y data with 16000 points).
> 
> Thanks for the tip, I'll check it out. I think what you have there is
> that the plotting is all done at the C++ level, expecting some kind of
> sequence of data points. That's exactly what I want to address with
> wxPython: being able to pass in a whole sequence and have the looping
> done at the C++ level.
>
Yes, I am using PyArray_ContiguousFromObject() to convert any sequence
into a NumPy array before copying the data into Qwt's double arrays. 
>
> Have you ever tested whether it's faster or slower to plot data passed in
> as a list vs. a NumPy array?
>
I did not test it, but there is certainly more overhead if you pass
a list or a tuple into PyArray_ContiguousFromObject() than if you pass a
NumPy array.
> 
> How do you access the data in the passed in sequence? Do you use:
> PySequence_GetItem ?
> 
No, see above. The code looks like (in "sip" language, sip is a sort of
swig, but more specialized to C++ and Qt):

    void setData(double *, double *, int);
%MemberCode
    PyObject *xSeq, *ySeq;
    $C *ptr;
    if (sipParseArgs(&sipArgsParsed, sipArgs, "mOO",
                     sipThisObj, sipClass_$C, &ptr, &xSeq, &ySeq)) {
        // Convert each argument to a contiguous array of doubles.
        PyArrayObject *x = (PyArrayObject *)
            PyArray_ContiguousFromObject(xSeq, PyArray_DOUBLE, 1, 0);
        if (!(x))
            return 0;
        PyArrayObject *y = (PyArrayObject *)
            PyArray_ContiguousFromObject(ySeq, PyArray_DOUBLE, 1, 0);
        if (!(y)) {
            Py_DECREF(x);   // do not leak x when y cannot be converted
            return 0;
        }
        int size;
        Py_BEGIN_ALLOW_THREADS
        // Use the shorter of the two lengths.
        size = (x->dimensions[0] < y->dimensions[0]) ?
            x->dimensions[0] : y->dimensions[0];
        ptr->setData((double*)(x->data), (double*)(y->data), size);
        Py_END_ALLOW_THREADS

        Py_DECREF(x);
        Py_DECREF(y);

        Py_INCREF(Py_None);
        return Py_None;
    }
%End

The setData calls copy the data.
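The logic of the member code above (convert whatever sequence comes in to a contiguous block of doubles, then use the shorter of the two lengths) can be sketched in plain Python. This is only an illustration under stated assumptions: the stdlib array module stands in for the Numeric array, and the function names as_double_array and prepare_curve are hypothetical, not part of PyQwt.

```python
from array import array


def as_double_array(seq):
    """Return seq as a contiguous array of C doubles, converting only if needed."""
    if isinstance(seq, array) and seq.typecode == "d":
        return seq              # already the right layout, so no conversion
    return array("d", seq)      # lists, tuples, other iterables: convert item by item


def prepare_curve(xs, ys):
    """Mirror the member code: convert both inputs, then take the common length."""
    x = as_double_array(xs)
    y = as_double_array(ys)
    size = min(len(x), len(y))  # plot only as many points as both sequences supply
    return x, y, size


x, y, size = prepare_curve([1, 2, 3], (4.0, 5.0))
print(size, list(x), list(y))
```

The first branch is where the extra overhead of passing a list or tuple comes from: an input that already has the right type and layout is returned as-is, while anything else is converted element by element.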
>
> thanks for the tip. Qwt (and PyQwt) look very nice, I may have to
> reconsider using PyQT!
> 

Gerard

>
> -Chris
> 
> 
> 
>  
> > Take a look at http://gerard.vermeulen.free.fr
> > 
> > PyQwt is an addon for PyQt (a Python wrapper for Qt) that knows nothing
> > about NumPy
> > 
> > Maybe it is possible to make a NumPy plot add-on for wxWindows, too.
> > 
> > Gerard
> > 
> > On Wed, Jan 15, 2003 at 09:22:20AM -0800, Chris Barker wrote:
> > > Hi folks,
> > >
> > > I use Numeric and wxPython together a lot (of course I do, I use Numeric
> > > for everything!).
> > >
> > > Unfortunately, since wxPython is not Numeric aware, you lose some real
> > > potential performance advantages. For example, I'm now working on
> > > expanding the extensions to graphics device contexts (DCs) so that you
> > > can draw a whole bunch of objects with a single Python call. The idea is
> > > that the looping can be done in C++, rather than Python, saving a lot of
> > > overhead of the loop itself, as well as the Python-wxWindows translation
> > > step.
> > >
> > > For drawing thousands of points, the speed-up is substantial. It's less
> > > substantial on more complex objects (rectangles give a factor of two
> > > improvement for ~1000 objects), due to the longer time it takes to draw
> > > the object itself, rather than make the call.
> > >
> > > Anyway, at the moment, Robin Dunn has the wrappers set up so that you
> > > can pass in a NumPy array (or, indeed, any sequence) rather than a list
> > > or tuple of coordinates, but it is faster to use a list than a NumPy
> > > array, because for arrays, it uses the generic PySequence_GetItem call.
> > > If we used the NumPy API directly, it should be faster than using a
> > > list, not slower! This is how a representative section of the code looks
> > > now:
> > >
> > >
> > > bool      isFastSeq  = PyList_Check(pyPoints) ||
> > > PyTuple_Check(pyPoints);
> > > .
> > > .
> > > .
> > >                 // Get the point coordinates
> > >                 if (isFastSeq) {
> > >                     obj = PySequence_Fast_GET_ITEM(pyPoints, i);
> > >                 }
> > >                 else {
> > >                     obj = PySequence_GetItem(pyPoints, i);
> > >                 }
> > >
> > > .
> > > .
> > > .
> > >
> > > So you can see that if a NumPy array is passed in, PySequence_GetItem
> > > will be used.
> > >
> > > What I would like to do is have an isNumPyArray check, and then access
> > > the NumPy array directly in that case.
> > >
> > > The tricky part is that Robin does not want to have wxPython require
> > > Numeric. (Oh how I dream of the day that NumArray becomes part of the
> > > standard library!)
> > > How can I check if an Object is a NumPy array (and then use it as such),
> > > without including Numeric during compilation?
> > >
> > > I know one option is to have conditional compilation, with a NumPy and
> > > non-NumPy version, but Robin is managing a whole lot of different
> > > versions as it is, and I don't think he wants to deal with twice as many!
> > >
> > > Anyone have any ideas?
> > >
> > > By the way, you can substitute NumArray for NumPy in this, as it is the
> > > wave of the future, and particularly if it would be easier.
> > >
> > > -Chris

From Jack.Jansen at oratrix.com  Wed Jan 15 14:18:05 2003
From: Jack.Jansen at oratrix.com (Jack Jansen)
Date: Wed Jan 15 14:18:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled  extension package.
In-Reply-To: <3E25A1E4.5CA8C453@noaa.gov>
Message-ID: <1D394963-28D7-11D7-AE69-000A27B19B96@oratrix.com>

On Wednesday, Jan 15, 2003, at 19:01 Europe/Amsterdam, Chris Barker 
wrote:

> Paul F Dubois wrote:
>>
>> If you could do:
>> try:
>>     import Numeric
>>     haveNumeric = 1
>> except ImportError:
>>     haveNumeric = 0
>>
>> in some initialization routine, then you could use this flag.
>> Alternately you could test on the fly
>> 'Numeric' in sys.modules
>
> Thanks, but I'm talking about doing this at the C++ level in an
> extension package, not at the Python level. This kind of thing is Soo
> much easier in Python, of course!

This can be done, but it is difficult, and you need the cooperation of 
both parties (Numeric and wxPython, in this case). The problem is that 
you need a way to pass C pointers from one extension module to the 
other. One of the pointers you want to pass is the PyTypeObject, so you 
can check that an object passed in from Python is of the correct type. 
Another is the address of some C routine that will get you a C pointer 
to the data. The first one may be visible from Python (so you can get 
at it through normal means) but the second one won't be.

The dirty way to do this (and you should probably avoid this) is to put 
these pointers into Python integers in the supplying module, and put 
them in the module namespace with a funny name 
(__ConvertToCPointerAddress). In wxPython you import Numeric, and if it 
succeeds you look up the funny name, convert the Python integer to a C 
pointer, cross your fingers, and call the address.

A cleaner way to do this is with cobject objects. These are in the 
core, in Objects/cobject.c. Numeric exports a cobject (again named 
__ConvertToCPointerAddress) with the address of the routine as the 
value. But, and this is the nice bit, cobjects can be passed along by 
Python code but can't be fiddled with. And cobject.c even provides a C 
function PyCObject_Import(char *modulename, char *attributename) which 
directly returns you the pointer you're looking for by importing the 
module, looking up the name, checking that it's a cobject and 
extracting the value.

And it even has support for "protocols": Cobjects have an extra field 
called the description, again only settable and readable from C. 
Modules that don't know about each others' existence could still decide 
on a common description that would signify that the pointer in the 
cobject has a specific meaning. We could decide here that the 
description C string "this pointer is a function that you pass 
one Python object and that returns the data just as Numeric would store 
it" fits that bill, and anyone in the world writing an extension 
module could follow the protocol.
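The mechanism described above can be tried out from Python with ctypes, as a rough sketch only. It assumes a CPython where the later PyCapsule API is available (PyCapsule replaced PyCObject in later Python releases, and a capsule's name plays roughly the role of the description here). The protocol string and the helper names export_pointer and read_doubles are hypothetical; a real supplier and consumer would do all of this in C.

```python
import ctypes

_api = ctypes.pythonapi
_api.PyCapsule_New.restype = ctypes.py_object
_api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
_api.PyCapsule_IsValid.restype = ctypes.c_int
_api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
_api.PyCapsule_GetPointer.restype = ctypes.c_void_p
_api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# Hypothetical agreed-upon description; it must outlive every capsule using it.
PROTOCOL = b"demo.pointer_to_three_doubles"

# The "supplying module" owns this C data and must keep it alive.
_payload = (ctypes.c_double * 3)(1.0, 2.0, 3.0)


def export_pointer():
    """Supplier side: wrap the raw address in an opaque object."""
    return _api.PyCapsule_New(ctypes.addressof(_payload), PROTOCOL, None)


def read_doubles(obj, protocol=PROTOCOL):
    """Consumer side: use the pointer only if obj is a capsule with our protocol."""
    if not _api.PyCapsule_IsValid(obj, protocol):
        return None                       # not a capsule, or a different protocol
    address = _api.PyCapsule_GetPointer(obj, protocol)
    return list((ctypes.c_double * 3).from_address(address))


capsule = export_pointer()
print(read_doubles(capsule))
print(read_doubles(capsule, b"some.other.protocol"))
```

Python code can pass the capsule around but cannot change the pointer inside it, which is the property that makes this cleaner than stashing the address in a Python integer.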
--
- Jack Jansen        <Jack.Jansen at oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -


From Jack.Jansen at oratrix.com  Wed Jan 15 14:34:05 2003
From: Jack.Jansen at oratrix.com (Jack Jansen)
Date: Wed Jan 15 14:34:05 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules
Message-ID: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>

Actually, wrt my previous message on cobjects for communicating between 
extension modules, we can do one better!

This is an idea I've been toying with for the MacPython extension 
types, and I think it's applicable to Numeric too. It goes as follows.

Each Numeric object has an attribute with a well-known name, let's call 
it "__Numeric_C_interface". This is a Cobject, and it is shared among 
all Numeric objects of the same type. The value of this C object is a 
pointer to a C structure with pointers to all the C routines you might 
want to call on the object, basically the PyArray_API structure (I 
think). The descr of the C object is a string with the version number 
of this particular PyArray_API structure.

An extension module that knows about this protocol and gets passed an 
object that it thinks might be a Numeric array checks whether the object 
has a __Numeric_C_interface attribute. If so it retrieves it, checks 
that it is a Cobject, gets the descriptor and tests it for 
compatibility and if it is compatible gets the cobject pointer and 
happily calls all the Numeric routines it needs.
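The lookup sequence just described (find the attribute, check its kind, check the version, then use the routines) can be modelled in pure Python. This is only a sketch of the control flow: the Interface class stands in for the cobject, the dictionary of routines stands in for the C structure of function pointers, and the version string "1" and the FakeArray class are hypothetical.

```python
INTERFACE_ATTR = "__Numeric_C_interface"   # attribute name proposed in this message
SUPPORTED_VERSION = "1"                    # hypothetical version string


class Interface:
    """Stand-in for the cobject: a version tag plus a table of routines."""

    def __init__(self, version, routines):
        self.version = version
        self.routines = routines


def get_interface(obj):
    """Return the routine table if obj advertises a compatible interface, else None."""
    iface = getattr(obj, INTERFACE_ATTR, None)
    if not isinstance(iface, Interface):       # "checks that it is a Cobject"
        return None
    if iface.version != SUPPORTED_VERSION:     # "tests it for compatibility"
        return None
    return iface.routines


class FakeArray:
    """Hypothetical array type that takes part in the protocol."""

    def __init__(self, values):
        self.values = list(values)


# Attached with setattr because a leading double underscore written inside
# the class body would be name-mangled. One object is shared by the type.
setattr(FakeArray, INTERFACE_ATTR,
        Interface("1", {"size": lambda a: len(a.values)}))

routines = get_interface(FakeArray([1, 2, 3]))
print(routines["size"](FakeArray([1, 2, 3])))
print(get_interface([1, 2, 3]))
```

An object that does not carry the attribute, or carries an incompatible version, simply takes the generic path; the consumer never needs to import the supplying module.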


From falted at openlc.org  Thu Jan 16 04:00:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Thu Jan 16 04:00:03 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules
In-Reply-To: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
Message-ID: <200301161259.13522.falted@openlc.org>

On Wednesday 15 January 2003 23:33, Jack Jansen wrote:
> Actually, wrt my previous message on cobjects for communicating between
> extension modules, we can do one better!
>
> This is an idea I've been toying with for the MacPython extension
> types, and I think it's applicable to Numeric too. It goes as follows.
>
> Each Numeric object has an attribute with a well-known name, let's call
> it "__Numeric_C_interface". This is a Cobject, and it is shared among
> all Numeric objects of the same type. The value of this C object is a
> pointer to a C structure with pointers to all the C routines you might
> want to call on the object, basically the PyArray_API structure (I
> think). The descr of the C object is a string with the version number
> of this particular PyArray_API structure.
>
> An extension module that knows about this protocol and gets passed an
> object that it thinks might be a Numeric array checks whether the object
> has an __Numeric_C_interface attribute. If so it retrieves it, checks
> that it is a Cobject, gets the descriptor and tests it for
> compatibility and if it is compatible gets the cobject pointer and
> happily calls all the Numeric routines it needs.

That's a nice idea. But I see two drawbacks:

- numarray needs to be reworked to include the Cobject descriptors, although
I don't know if this would be difficult or not.

- you still need to have Numeric or numarray installed on the client
machine. This could be the usual case, but what about extensions that want
to use Numeric internally (for a number of reasons, like better number
representation, convenient interface to C, etc.) without forcing the user to
install it?

However, designing a small library with a minimalist API (I'm thinking of
something similar to zlib) could be very handy in allowing extensions (but
also native Python modules) to deal with numarray objects.

As I said before, this would require the user to install only this small
library, but it can also be included in the application or package. However,
this second alternative can be tricky, as Chris Barker has pointed out, because
of the different numarray versions coming in the future. But IMO a series of
factors may alleviate this handicap:

- The numarray data structure should be very stable, as improvements are
normally made at the functionality level.

- The library should provide a minimalistic, high level API that, if it is
well designed, should cope with small modifications in the numarray data
structures. 

- Finally, when such differences have to be added, and doing so would break
the current API, this version should be marked as a major release,
and existing extensions (or whatever software is embedding the library)
will know that they have to release new versions if they want to support the
newest objects. But, hopefully, that should happen quite infrequently.

Of course, this small library should cope with both numarray and Numeric (at
least, the not too old versions of it) objects. But I think this shouldn't
pose a big problem as the current numarray API can already do that.

This logical separation between structure and functionality might also lead
to better acceptance by numerical software craftsmen, as they can be
more confident that the API for dealing with numarray objects will be quite
stable over time.

Well, this is just a thought. I must confess that I'm so interested in this
issue because I really want to support numarray objects in my project, and
I'm just wondering which is the best way to do that without creating too
much nuisance for the users. In fact, I'm considering building such a
library myself, but that could be a waste of time if I have to redo it for
every numarray release.

Cheers,

-- 
Francesc Alted


From peter.chang at nottingham.ac.uk  Thu Jan 16 08:47:04 2003
From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk)
Date: Thu Jan 16 08:47:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
  extension package.
In-Reply-To: <3E25C99A.9D5E1888@noaa.gov>
Message-ID: <Pine.LNX.4.44.0301161400450.27474-100000@eexpc1.eee.nott.ac.uk>

On Wed, 15 Jan 2003, Chris Barker wrote:
[...]

> My idea is that I could make sure Numeric was around at compile time, so
> I could use all the Numeric API to access the array data, but it
> wouldn't have to be installed at runtime, as none of the Numeric calls
> would be executed if Numeric hadn't been imported. Would this work, or
> would the system try to load the .dll or .so or whatever even if the
> calls weren't executed?

One way is to import a dynamic library, explicitly, which has glue code to
handle the array objects when you need them.

[...]

> My timing indicated something different, but I'm using Linux/wxGTK/X11,
> and I think the drawing calls return after the message has been sent to
> X, but X may not have completed the actual drawing yet.

That's right. X's communication model between client and server is
asynchronous.

> This means that I'm not timing the whole process, and if I did, I might
> not see such a difference.

You can synchronise the output buffer using XSync(3) and then do the 
timing.
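The effect is easy to reproduce without X at all. The sketch below is a toy model, not Xlib: the AsyncServer class, its draw and sync methods, and the per-request cost are all hypothetical stand-ins for a server that queues requests and executes them later. It only shows why a timing taken without waiting for the server under-reports the real cost.

```python
import queue
import threading
import time


class AsyncServer:
    """Toy asynchronous display server: requests are queued and executed
    later on a worker thread."""

    def __init__(self, cost_per_request):
        self._cost = cost_per_request
        self._requests = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            self._requests.get()
            time.sleep(self._cost)       # pretend to draw something
            self._requests.task_done()

    def draw(self):
        self._requests.put("draw")       # returns immediately, like a queued request

    def sync(self):
        self._requests.join()            # block until every queued request is done


def time_draws(server, count, wait):
    """Time count draw calls, optionally waiting for the server to finish."""
    start = time.perf_counter()
    for _ in range(count):
        server.draw()
    if wait:
        server.sync()
    return time.perf_counter() - start
```

Calling time_draws(server, n, wait=False) measures only how long it takes to queue the requests, while wait=True includes the drawing itself, which is the number that matters when comparing a list against an array.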

Peter


From Chris.Barker at noaa.gov  Thu Jan 16 09:58:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 09:58:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another 
 compiledextension package.
References: <Pine.LNX.4.44.0301161400450.27474-100000@eexpc1.eee.nott.ac.uk>
Message-ID: <3E26E45F.3C7E2293@noaa.gov>

peter.chang at nottingham.ac.uk wrote:

> You can synchronise the output buffer using XSync(3) and then do the
> timing.

I'd love to try this, but I confess I have no idea how! I'm working with
the *.i files that tell swig what to add when creating wrappers around
wxWindows for Python. wxWindows is using wxGTK, which is using GTK,
which is using Xlib (I think), so I'm pretty far away from X, and I
barely know enough C/C++ to attempt this.

I suppose I could try including Xlib, then calling XSync, but I need to
pass a reference to a display. I have no idea how to get that.

Any hints?

-Chris




From Chris.Barker at noaa.gov  Thu Jan 16 10:33:07 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 10:33:07 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
Message-ID: <3E26EC9D.A0B7D173@noaa.gov>

Jack Jansen wrote:

> An extension module that knows about this protocol and gets passed an
> object that it thinks might be a Numeric array checks whether the object
> has an __Numeric_C_interface attribute. If so it retrieves it, checks
> that it is a Cobject, gets the descriptor and tests it for
> compatibility and if it is compatible gets the cobject pointer and
> happily calls all the Numeric routines it needs.

Wow Jack! Are you single-handedly going to implement all my pet projects that
I'm too stupid to know how to do myself? (The other one was Universal
text file support.)

I can only barely follow what you're suggesting, but I still have a
question about it. It seems that while this would provide a way for an
extension module to identify whether an object was a Numeric array, and
then get a pointer to it, how would it know the API for dealing with the
arrays, without the Numeric header file? Or would you have to include
the header file when compiling, but not need the library at runtime
unless it was actually used? That seems a reasonable compromise.

If this would work, I think it's a great idea. Short of including
NumArray with the standard library (which I imagine is at least a couple
of Python releases away), it would be a great solution for folks that
are writing extensions that they want to be able to take advantage of
Numeric when it's there, but not require it.

Do any of the primary Numarray developers think this is a good and
doable idea?

-Chris




From peter.chang at nottingham.ac.uk  Thu Jan 16 11:22:03 2003
From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk)
Date: Thu Jan 16 11:22:03 2003
Subject: [Numpy-discussion] Optionally using Numeric in another 
 compiledextension package.
In-Reply-To: <3E26E45F.3C7E2293@noaa.gov>
Message-ID: <Pine.LNX.4.44.0301161804420.27474-100000@eexpc1.eee.nott.ac.uk>

On Thu, 16 Jan 2003, Chris Barker wrote:

> peter.chang at nottingham.ac.uk wrote:
> 
> > You can synchronise the output buffer using XSync(3) and then do the
> > timing.

Oops, that should be XSynchronize(3).

[...]

> I suppose I could try including Xlib, then calling XSync, but I need to
> pass a reference to a display. I have no idea how to get that. 
> 
> Any hints?

wxGetDisplayName() gives the Display name but not a pointer to the display 
structure. So this is not much help.

In gtk+, any program can be called with --sync to aid debugging. I'd guess 
wxWindows may allow you to do the same.

Peter


From jmiller at stsci.edu  Thu Jan 16 12:06:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan 16 12:06:05 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov>
Message-ID: <3E271006.4000607@stsci.edu>

Chris Barker wrote:

>Jack Jansen wrote:
>
>  
>
>>An extension module that knows about this protocol and gets passed an
>>object that it thinks might be a Numeric array checks whether the object
>>has an __Numeric_C_interface attribute. If so it retrieves it, checks
>>that it is a Cobject, gets the descriptor and tests it for
>>compatibility and if it is compatible gets the cobject pointer and
>>happily calls all the Numeric routines it needs.
>>    
>>
>
>Wow Jack! Are you single-handedly going to implement all my pet projects that
>I'm too stupid to know how to do myself? (The other one was Universal
>text file support.)
>
>I can only barely follow what you're suggesting, but I still have a
>question about it. It seems that while this would provide a way for an
>extension module to identify whether an object was a Numeric array, and
>then get a pointer to it, how would it know the API for dealing with the
>arrays, without the Numeric header file? Or would you have to include
>the header file when compiling, but not need the library at runtime
>unless it was actually used? That seems a reasonable compromise.
>
>If this would work, I think it's a great idea. Short of including
>NumArray with the standard library (which I imagine is at least a couple
>of Python releases away), it would be a great solution for folks that
>are writing extensions that they want to be able to take advantage of
>Numeric when it's there, but not require it.
>
>Do any of the primary Numarray developers think this is a good and
>doable idea?
>  
>
Roll out the time machine...  it's already done.

As long as you don't define the macros PY_ARRAY_UNIQUE_SYMBOL or 
NO_IMPORT_ARRAY,  any file that includes arrayobject.h gets a static 
copy of PyArray_API.

If the module executes import_array() at an appropriate time,  normally 
module initialization, but not necessarily,  the static PyArray_API gets 
filled in and becomes usable.    The import_array() call is critical; 
 without it,  API calls through the static PyArray_API are calls to NULL 
and segfault.

I think that if Numeric is not present,  and you call import_array(),   
it will fail quietly but leave the Python error status set.   So it 
might make sense to call PyErr_Clear() after doing import_array().  

>-Chris
>
So it sounds like your whole "weak linkage" scheme is plausible now with 
Numeric (maybe even numarray!), as would be a minimal API module.

1.  We discussed yesterday how to determine if an object is a Numeric 
array w/o even compiling with arrayobject.h.   The important idea there 
was that if Numeric is not present,  the "isarray" (or whatever) 
function will return false rather than segfaulting because the API 
pointer isn't filled in.

2. Call API functions in contexts where you know you're looking at 
Numeric arrays, i.e.,  right after isarray().  This creates a guard 
which prevents you from calling API functions when Numeric is not present.

3.  Call import_array() at some time before using the API functions, 
 possibly at module init time, failing quietly and clearing the error in 
installations where Numeric is not installed.
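The three points above amount to a guard pattern, which can be sketched at the Python level as follows. This is an analogue rather than the C code: the class name OptionalArraySupport and its methods are hypothetical, the stdlib array module stands in for Numeric, and a None type plays the part of the API pointer that never gets filled in.

```python
import importlib
from array import array  # stdlib stand-in for the optional array package


def try_import(name):
    """Import an optional module at init time; return None if it is absent."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None                     # fail quietly and leave no error behind


class OptionalArraySupport:
    def __init__(self, module_name, type_name):
        module = try_import(module_name)                     # point 3
        candidate = getattr(module, type_name, None) if module is not None else None
        # Stays None when support is missing, like an API pointer never filled in.
        self._array_type = candidate if isinstance(candidate, type) else None

    def is_array(self, obj):
        # Point 1: safe in all contexts, False when the package is absent.
        return self._array_type is not None and isinstance(obj, self._array_type)

    def total(self, values):
        if self.is_array(values):
            return sum(values.tolist())  # point 2: array-only calls sit behind the guard
        return sum(values)               # generic sequence path


present = OptionalArraySupport("array", "array")
absent = OptionalArraySupport("no_such_array_package", "array")
print(present.is_array(array("d", [1.0, 2.0])), absent.is_array(array("d", [1.0, 2.0])))
```

Because every array-specific call sits behind is_array(), an installation without the package only ever runs the generic path.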


Todd


From jmiller at stsci.edu  Fri Jan 17 14:16:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 17 14:16:03 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other 
 extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov>
Message-ID: <3E288068.3070407@stsci.edu>

Take a look at the attached extension module "testlite" which 
demonstrates the technique I evolved from this discussion. As we 
discussed,  this usage pattern enables the construction of an extension 
which will take advantage of numarray if it is there,  but will continue 
to work if the user has not installed numarray.  Here's how it works:

1. I created a new API function,  PyArray_isArray() which is safe to 
call in all contexts.  I defined it as:

 #define PyArray_isArray(o) (PyArray_API && NA_isNumArray(o))

I added NA_isNumArray(o) to the numarray C-API because it was the easy 
way  to do it.

2. Ordinary API functions are safe to call once an object has been 
identified to be a numarray because it implies (locally) that the 
PyArray_API pointer has been initialized.

3. I tried out the standard import_array() code and added some cleanup 
for the case where numarray is not installed.  

The only caveat I see at this point is that you are required to include 
numarray headers in order to use this.  In numarray's case,  this might 
necessitate header updates and/or function call modifications.  The 
numarray C-API should stabilize pretty soon,  but I don't think it's 
quite there yet.

The same approach should apply to Numeric.

This stuff is in numarray CVS now and should be in the next numarray 
release.

Todd


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testlite.c
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030117/b4f79444/attachment-0001.c>

From haase at msg.ucsf.edu  Fri Jan 17 14:25:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Jan 17 14:25:04 2003
Subject: [Numpy-discussion] make C array accessible to python without copy
Message-ID: <03fa01c2be77$4cae4430$3b45da80@rodan>

Hi,
What is the C API to make an array that got allocated,
let's say, by  a = new short[512*512],
accessible to Python as a numarray?

I tried NA_New - but that seems to make a copy.
I would need it to use the original memory space
so that I can "observe" the array from Python WHILE
the underlying C array changes (it's actually a camera image).

Thanks,
Sebastian Haase


From jmiller at stsci.edu  Fri Jan 17 15:17:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 17 15:17:01 2003
Subject: [Numpy-discussion] make C array accessible to python without
 copy
References: <03fa01c2be77$4cae4430$3b45da80@rodan>
Message-ID: <3E288EB1.80107@stsci.edu>

Sebastian Haase wrote:

>Hi,
>What is the C API to make an array that got allocated,
>let's say, by  a = new short[512*512],
>accessible to python as numarray.
>
What you want to do is not currently supported well in C.  The way to do 
what you want is:

1.  Create a buffer object from your C++ array.  The buffer object can 
be built such that it refers to the original copy of the data.

2.  Call  back into Python (numarray.NumArray) with your buffer object 
as the buffer parameter.

You can scavenge the code in NA_newAll (Src/newarray.ch) for most of the 
callback.
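The difference between wrapping the original memory and copying it can be seen with a small Python sketch. This is not the numarray call sequence from the two steps above; it assumes the stdlib array module as a stand-in for the C++ short[] buffer and a memoryview as a stand-in for the buffer object, just to show what "refers to the original copy of the data" buys you.

```python
from array import array

# Stand-in for the C-side short[] image buffer (8 pixels instead of 512*512).
frame = array("h", [0] * 8)

view = memoryview(frame)     # wraps the same memory; no copy is made
snapshot = frame.tolist()    # by contrast, this is an independent copy

frame[3] = 1234              # the "camera" writes a pixel

print(view[3])               # the observer sees the new value through the view
print(snapshot[3])           # the copy still holds the old value
```

An object built on the buffer behaves like view here, whereas a constructor that copies behaves like snapshot, which matches the behaviour reported for NA_New.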

>I tried NA_New - but that seems to make a copy.
>I would need it to use the original memory space
>so that I can "observe" the array from Python WHILE
>the underlying C array changes (it's actually a camera image)
>
That sounds cool!

>
>Thanks,
>Sebastian Haase
>


From falted at openlc.org  Sat Jan 18 01:23:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 18 01:23:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray
Message-ID: <200301181022.07015.falted@openlc.org>

Hi,

I'm trying to make a C array from a Numeric "c" (Character) typecode array
using the high level call:

NA_InputArray(PyObject *numarray, NumarrayType t, int requires)

with no success.

As I have been able to access all the other types (i.e.
'1','b','s','i','l','f','d') successfully, perhaps the character type is not
supported?

There is no tChar in the NumarrayType enum, so I've tried tUInt8 and tAny
as the value for the NumarrayType parameter, but both choices raise the same
error:

Traceback (most recent call last):
  File "table-tree2.py", line 77, in ?
    h5file.createArray('/columns', 'name', array(names), "Name column")
  File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in 
createArray
    setattr(group, name, object)
  File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in 
__setattr__
    value._f_putObjectInTree(name, self)
  File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in 
_f_putObjectInTree
    self.create()
  File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in 
create
    self.createArray(self.object, self.title)
  File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, 
in createArray
    array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY)
libnumarray.error: getShape: sequence object nested more than MAXDIM deep.

although I was passing only a Numeric 'c' array with a rather small shape
(10,16).

I just want to access the buffer data and the shape of this object from C
(well, I'm actually using Pyrex, but I think this is not important). Is that
possible using only numarray C calls?

Thanks,

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 18 08:27:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 18 08:27:04 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
Message-ID: <3E2983C3.7000304@stsci.edu>


Francesc Alted wrote:

>Hi,
>
>I'm trying to make a C array from a Numeric "c" (Character) typecode array
>using the high level call:
>
>NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
>
Unified handling of character arrays and numeric arrays doesn't exist 
yet in numarray.  There is no C-API for the chararray module because we 
haven't needed one.  But CharArrays are NDArrays and have attributes 
stored in PyArrayObjects just like numarrays.

>with no success.
>
>As I have been able to access all the other types (i.e.
>'1','b','s','i','l','f','d') successfully, perhaps character type is not
>supported?
>
>In the NumarrayType enum, there is no tChar, but I've tried tUInt8 and tAny
>as the value for NumarrayType parameter, but both choices issues the same
>error:
>
>Traceback (most recent call last):
>  File "table-tree2.py", line 77, in ?
>    h5file.createArray('/columns', 'name', array(names), "Name column")
>  File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in 
>createArray
>    setattr(group, name, object)
>  File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in 
>__setattr__
>    value._f_putObjectInTree(name, self)
>  File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in 
>_f_putObjectInTree
>    self.create()
>  File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in 
>create
>    self.createArray(self.object, self.title)
>  File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, 
>in createArray
>    array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY)
>libnumarray.error: getShape: sequence object nested more than MAXDIM deep.
>
NA_InputArray was intended to accept non-numeric sequences.  It could 
report this better...

>although I was passing only a Numeric 'c' with a rather small shape (10,16).
>
>I just want to access the buffer data, and the shape of this object from C
>(well, I'm actually using Pyrex, but I think this is not important). Is that
>possible by only using numarray C calls?
>
Look at Lib/chararray.py and Src/_chararraymodule.c.

If you can handle using a CharArray or RawCharArray, try:

1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in 
the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.

2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.

3. shape, strides, and itemsize should be directly accessible from the 
PyArrayObject.

CharArray has some extra stripping and padding semantics; these are lazy 
and hence absent without extra care in C.  RawCharArray has none.

CharArrays are really arrays of fixed length strings of bytes.  The 
string length is defined by the array itemsize.
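
As a sketch of what steps 2 and 3 amount to, here is the address
arithmetic in plain Python, with a bytes object standing in for the data
buffer. The function name element_bytes and the sample data are made up for
illustration; in C the same offsets would be added to the data pointer.

```python
def element_bytes(buf, byteoffset, strides, itemsize, index):
    """Return the itemsize bytes of the element at `index` (a tuple)."""
    # Offset of the element: byteoffset plus index[k] * strides[k] per axis.
    start = byteoffset + sum(i * s for i, s in zip(index, strides))
    return bytes(buf[start:start + itemsize])

# Ten fixed-length (4-byte) strings stored back to back: s000, s001, ...
raw = b"".join(b"s%03d" % i for i in range(10))

# Contiguous 1-d layout: one element every 4 bytes.
a = element_bytes(raw, 0, (4,), 4, (3,))

# Every second element starting at element 1: byteoffset 4, stride 8.
b = element_bytes(raw, 4, (8,), 4, (2,))

# The same buffer seen as a (2, 5) array: strides (20, 4).
c = element_bytes(raw, 0, (20, 4), 4, (1, 2))
```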

>Thanks,
>
>  
>
Todd


From falted at openlc.org  Sat Jan 18 10:18:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 18 10:18:02 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
In-Reply-To: <3E2983C3.7000304@stsci.edu>
References: <3E2983C3.7000304@stsci.edu>
Message-ID: <200301181917.29533.falted@openlc.org>

On Saturday 18 January 2003 17:41, Todd Miller wrote:
> >I just want to access the buffer data, and the shape of this object from C
> >(well, I'm actually using Pyrex, but I think this is not important). Is
> > that possible by only using numarray C calls?
>
> Look at Lib/chararray.py and Src/_chararraymodule.c.
>
> If you can handle using a CharArray or RawCharArray, try:
>
> 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in
> the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.
>
> 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.
>
> 3. shape, strides, and itemsize should be directly accessible from the
> PyArrayObject.

Ok. I'll try to do that.

>
> CharArray has some extra stripping and padding semantics; these are lazy
> and hence absent without extra care in C.  RawCharArray has none.
>

By the way, is it safe to assume that CharArray objects are contiguous? Or
RawCharArray objects? The same question goes for RecArray objects. Or is it
always advisable to check with the iscontiguous() method? If these objects
can be non-contiguous, I guess there is still no function like
NA_InputArray that works with CharArray or RecArray objects in order to
obtain well-behaved objects. Is that true?

I think it would be possible for me to include support for numarray objects
in the next release of PyTables. Thanks!

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 18 11:57:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 18 11:57:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org>
Message-ID: <3E29B52C.2030602@stsci.edu>

Francesc Alted wrote:

>On Saturday 18 January 2003 17:41, Todd Miller wrote:
>  
>
>>>I just want to access the buffer data, and the shape of this object from C
>>>(well, I'm actually using Pyrex, but I think this is not important). Is
>>>that possible by only using numarray C calls?
>>>      
>>>
>>Look at Lib/chararray.py and Src/_chararraymodule.c.
>>
>>If you can handle using a CharArray or RawCharArray, try:
>>
>>1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in
>>the PyArrayObject.  Even _chararraymodule.c doesn't do this right yet.
>>
>>2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer.
>>
>>3. shape, strides, and itemsize should be directly accessible from the
>>PyArrayObject.
>>    
>>
>
>Ok. I'll try to do that.
>
>  
>
>>CharArray has some extra stripping and padding semantics; these are lazy
>>and hence absent without extra care in C.  RawCharArray has none.
>>
>>    
>>
>
>By the way, is it safe to assume that CharArray objects are contiguous? or
>RawCharArray?.
>
Mostly no.   Each fixed length element is stored as a contiguous 
sequence of bytes.  Anything goes for the rest,  so you need to look at 
the strides arrays and byteoffset.

>The same question goes for RecArray objects. 
>
No.  It's possible to select every 10th record, for instance, in a 
slice.  I believe the resulting decimated array would be a discontiguous 
view of the original.  

>Or it is always
>convenient to check with iscontiguous() method if they are or not?.
>
I'm not even certain the method works correctly for chararray and 
recarray.  

I think the portion of chararray that has been written in C considers 
array strides.
recarray is pure python.  In both cases,  I think I'd just forget about 
contiguity and use the strides arrays.

> In case
>these objects can be non-contiguous, I guess there's still not a function
>like NA_InputArray that works with CharArray or RecArray objects in order to
>obtain well-behaved objects. Is that true?
>
True.  But neither recarray nor chararray really has behavedness 
problems like misalignment,
byteswapping, or type conversion.  I think contiguity is the only issue, 
and that is solved
just by calling .copy().  You might argue that  records contain 
byteswapped and misaligned fields.   I don't have an immediate answer to 
that.

My preference is to use strides and forget about contiguity,  but you 
could also make contiguous copies simply.  No one I'm aware of has yet 
tried access to misbehaved records in C.
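
To make the two options concrete, here is a small sketch in plain Python,
not numarray code. It assumes C (row-major) ordering, and the helper names
is_contiguous and contiguous_copy are invented for the example.

```python
from itertools import product

def is_contiguous(shape, strides, itemsize):
    """True if the strides describe densely packed, row-major elements."""
    expected = itemsize
    for n, s in zip(reversed(shape), reversed(strides)):
        if n > 1 and s != expected:
            return False
        expected *= n
    return True

def contiguous_copy(buf, byteoffset, shape, strides, itemsize):
    """Gather the elements of a strided view into one packed bytes object."""
    out = bytearray()
    for index in product(*(range(n) for n in shape)):
        start = byteoffset + sum(i * s for i, s in zip(index, strides))
        out += buf[start:start + itemsize]
    return bytes(out)

# Ten 4-byte records; a "decimated" view takes every second one.
raw = b"".join(b"s%03d" % i for i in range(10))
dense = is_contiguous((2, 5), (20, 4), 4)
decimated = is_contiguous((5,), (8,), 4)
packed = contiguous_copy(raw, 0, (5,), (8,), 4)
```

Walking the strides works for either layout; the contiguity test is only
useful as a shortcut for the dense case.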

>
>I think it would be possible to me to include support for numarray objects
>in next release of PyTables. 
>
Great!

>Thanks!,
>  
>


From verveer at embl.de  Sun Jan 19 06:39:09 2003
From: verveer at embl.de (verveer at embl.de)
Date: Sun Jan 19 06:39:09 2003
Subject: [Numpy-discussion] numarray bug?
Message-ID: <1042987080.3e2ab8489e640@webmail.EMBL-Heidelberg.DE>

Hi, 
 
The following gives an error: 
 
>>> print numarray.Int8 == numarray.Any 
Traceback (most recent call last): 
  File "<stdin>", line 1, in ? 
  File "/usr/local/lib/python2.2/site-packages/numarray/numerictypes.py", line 
102, in __cmp__ 
    return genericTypeRank.index(self.name) - 
genericTypeRank.index(other.name) 
ValueError: list.index(x): x not in list 
 
A bug? 
 
Cheers, Peter 
 
-- 
Dr. Peter J. Verveer 
Cell Biology and Cell Biophysics Programme 
EMBL 
Meyerhofstrasse 1 
D-69117 Heidelberg 
Germany 
Tel. : +49 6221 387245 
Fax  : +49 6221 387242 
Email: verveer at embl-heidelberg.de 
 
 
From falted at openlc.org  Mon Jan 20 04:17:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 20 04:17:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
In-Reply-To: <3E29B52C.2030602@stsci.edu>
References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> <3E29B52C.2030602@stsci.edu>
Message-ID: <200301201316.06127.falted@openlc.org>

On Saturday 18 January 2003 21:12, Todd Miller wrote:
> >By the way, is it safe to assume that CharArray objects are contiguous? or
> >RawCharArray?.
>
> Mostly no.   Each fixed length element is stored as a contiguous
> sequence of bytes.  Anything goes for the rest,  so you need to look at
> the strides arrays and byteoffset.
>
> >The same question goes for RecArray objects.
>
> No.  It's possible to select every 10th record, for instance, in a
> slice.  I believe the resulting decimated array would be a discontiguous
> view of the original.
>
> >Or it is always
> >convenient to check with iscontiguous() method if they are or not?.
>
> I'm not even certain the method works correctly for chararray and
> recarray.

Well, during my tests with numarray 0.4, iscontiguous() seems to work well,
both for chararrays and recarrays.

> In both cases,  I think I'd just forget about
> contiguity and use the strides arrays.

Yeah, but I still want to use the iscontiguous() method just to speed up
the code a bit.

> You might argue that  records contain
> byteswapped and misaligned fields.   I don't have an immediate answer to
> that.

Exactly, I am pondering how to deal with HDF5 objects coming from machines
with a different endianness (misalignment is not a problem in my case) than
the local machine. But I think I can manage that by creating recarray
buffers with the byteorder parameter set appropriately during the HDF5 table
reads. Then all the data can be read correctly, because numarray will
byteswap the data whenever the recarray is accessed.

Moreover, if this object is to be used frequently, I can speed up access
to the recarray by byteswapping the columns (as arrays) using their
byteswap() method. In the future it would be nice to provide a generic
byteswap method for recarrays.
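
The idea of swapping a column once, instead of on every access, can be
sketched with the standard array module (this is not the recarray code; the
function name native_column is invented, and typecode 'h' is assumed to be
2 bytes wide):

```python
import array
import struct
import sys

def native_column(raw_bytes, typecode, stored_byteorder):
    """Load one column and byteswap it once if it is not in native order."""
    col = array.array(typecode)
    col.frombytes(raw_bytes)
    if stored_byteorder != sys.byteorder:
        col.byteswap()          # in-place swap of every item
    return col

# The same two 16-bit values as written by a big- and a little-endian machine.
from_big = native_column(struct.pack(">2h", 1, 258), "h", "big")
from_little = native_column(struct.pack("<2h", 1, 258), "h", "little")
```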

Thanks,

-- 
Francesc Alted


From falted at openlc.org  Mon Jan 20 11:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 20 11:02:02 2003
Subject: [Numpy-discussion] recarray2 re-visited
Message-ID: <200301202000.53584.falted@openlc.org>

Hi,

As I needed a byteswap() method for recarray, after a bit of hacking I've
made one myself. It is based on my own version of recarray, to take
advantage of the _fields cache so as to both speed up and simplify the new
code.

Basically, the new method takes a recarray, checks which columns are
numarray arrays and invokes their byteswap() method if needed. Easy, but
effective. Moreover, _byteswap() and togglebyteorder() methods are provided
to be compatible with the existing methods of NumArray objects.

As a plus, the recarray __str__ has been modified so that printing takes
the byteorder of the recarray into account, and so that printing is faster
by a factor of about 30, which can be handy in some situations.

Do with it whatever you want,

-- 
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2.py
Type: text/x-python
Size: 21435 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030120/b7a180d9/attachment-0002.py>
-------------- next part --------------
recarray shape in test ==> (10000,)
Assignment in recarray original
-------------------------------
Assign time: 1.24  Rows/s: 8064

Assignment in recarray modified
-------------------------------
Assign time: 0.16  Rows/s: 62499  Speed-up: 7.75

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.53  Rows/s: 6535

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.15  Rows/s: 66666  Speed-up: 10.2

Printing in recarray original
------------------------------
Print time: 18.11  Rows/s: 552

Printing in recarray modified
------------------------------
Print time: 0.63  Rows/s: 15872  Speed-up: 28.746

-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2-test.py
Type: text/x-python
Size: 2946 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030120/b7a180d9/attachment-0003.py>

From falted at openlc.org  Tue Jan 21 08:01:13 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 08:01:13 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
Message-ID: <200301211744.55666.falted@openlc.org>

Hi,

Is anybody aware of any function (either in C or Python or a mixture of
both) to easily convert Numerical Python arrays from/to numarray arrays?

I mean, I would like a function that, without having to copy all the data
element by element, is able to copy the data buffer (or even
share it, if that is possible at all) from one object to the other.

Thanks,

-- 
Francesc Alted


From haase at msg.ucsf.edu  Tue Jan 21 10:41:07 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Tue Jan 21 10:41:07 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
References: <200301211744.55666.falted@openlc.org>
Message-ID: <051501c2c17c$a83e8410$3b45da80@rodan>

Hi,
I think this is actually quite related to my post from Friday:
[Numpy-discussion] make C array accessible to python without copy

-> So, to reformulate: Who actually holds the array data in memory? Or: where
does the memory get allocated, and where (and how many) pointers to it exist?
As I understood Todd Miller's answer, there is such a thing as a
"buffer object" that does all the work, so one would just have to take
that and build a "new" numarray or Numeric structure around it  (referring
to the Subject of this email)   or  (in the case of my Friday email)  just
have that "buffer object" point to a different memory space (one that was
already allocated by the C program).

Agree ? (Did I get it right?)

Sebastian Haase



From falted at openlc.org  Tue Jan 21 11:24:08 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 11:24:08 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
In-Reply-To: <3E2D74A2.40204@stsci.edu>
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu>
Message-ID: <200301212005.30328.falted@openlc.org>

On Tuesday 21 January 2003 17:26, you wrote:
> Francesc Alted wrote:
> >Anybody is aware of any function (either in C or Python or a mixture of
> >both) to easily convert Numerical Python arrays from/to numarray arrays?
>
> I think you should look at numarray.fromlist() and NumArray.tolist().  I
> think fromlist() will work on a nested sequence object,  and hence a
> Numeric array.

Yeah, I knew that, but I was looking for something more optimal.

>
> >I mean, I would like to use such a funtion that, without having to copy
> >element by element all the data, be able to copy the data buffer (or even
> >use the same if possible at all) from one object to the other.
>
> I have not looked at this yet;   it's a very good question.  Note that
> going from numarray to Numeric there are issues with making the buffer
> well-behaved.

I think this should be not too difficult to achieve and I'll try to explain
why.

When going from numarray to Numeric, numarray already has the NA_InputArray
C-API function, which returns a well-behaved array. But strictly speaking, we
don't even need a well-behaved array (that is too restrictive a condition),
as both Numeric and numarray support discontiguous data. Even the byteorder
should not be a problem: as Numeric itself has no such property,
we can create a Numeric array in native order as the result and
byteswap the numarray object (if needed) before doing the conversion.

So, non-alignment remains the only issue that may force a buffer copy
during numarray ==> Numeric conversion. Is that correct? If so, is there
a workaround, i.e. can we still get a Numeric array from a misaligned
numarray object without copying the data?

As for going in the other direction (i.e. Numeric ==> numarray), since
numarray supports discontiguity, misalignment and byteswapped data, this
conversion should not require a data buffer copy at all.

Once we have a pointer to the data buffer, it is only a matter of
wrapping a Numeric or numarray object around it, taking the metadata from the
original object, and returning the new object as the result.

All in all, this conversion *seems* not to be too difficult a task.

Making such conversion functions (in C, but also with Python
counterparts) available might open the door to the co-existence
of Numeric and numarray objects in the same program, and that would ease
numarray deployment in existing Numeric software.

Comments?

-- 
Francesc Alted


From falted at openlc.org  Tue Jan 21 11:24:11 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 11:24:11 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
In-Reply-To: <051501c2c17c$a83e8410$3b45da80@rodan>
References: <200301211744.55666.falted@openlc.org> <051501c2c17c$a83e8410$3b45da80@rodan>
Message-ID: <200301212020.57384.falted@openlc.org>

On Tuesday 21 January 2003 19:41, Sebastian Haase wrote:
> Hi,
> I think this is actually quite related to my post from Friday:
> [Numpy-discussion] make C array accessible to python without copy
>
> -> So, to reformulate: Who hold actually the array data in memory? Or:
> where gets the memory allocated and where/how many pointers to that exist? 
>   I understood the answer that Todd Miller gave, that there is such a thing
> as a "buffer object" that does all the work, so then: one would just have
> to take that and build a "new" numarray or Numeric structure around it 
> (referring to the Subject of this email)   or  (in the case of my
> Friday-email)  just have that "buffer object" point to a different memory
> space (that got already allocated by the C-program) .
>
> Agree ? (Did I get it right?)

Well, so-so. I think the buffer object is a property of numarray objects,
not Numeric objects. So, in the numarray ==> Numeric conversion you
may need to access the internals of the buffer (for example by using the
high level numarray C-API) and obtain a data buffer (in the C
sense, not an object) that can be used to build the Numeric object (with the
help of the numarray object metadata). The opposite direction needs something
similar but with the roles inverted. See my previous message for a more
in-depth explanation.

I think the conversion (without copying) is not a difficult process, but it
is not quite as easy as that.

Well, I'm just a newcomer to numarray and my opinions about this may
well be completely wrong, of course. Take them with caution!

-- 
Francesc Alted


From paul at pfdubois.com  Tue Jan 21 12:06:34 2003
From: paul at pfdubois.com (paul at pfdubois.com)
Date: Tue Jan 21 12:06:34 2003
Subject: [Numpy-discussion] RE: numarray/Numeric upkeep?
Message-ID: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com>

Here are some of the factors leading to the slow rate of change of Numeric
lately.
a. I changed to a new project and have had a lot of startup learning to
do. My new project uses Numeric but not in as central a way as my old one.
b. I mistakenly thought numarray would be ready sooner so that I was trying
to let it slide.
c. I announced last year, in view of (a), that I was needing to be replaced
as HeadNummie. It would be logical to turn this over to the Numarray people,
but they aren't ready to do it until Numarray is ready, so nothing happened.
d. Except for Travis, most of the other listed Numeric developers aren't
in fact doing patches, releases, etc.
e. Not all patches that are submitted are correct or desirable, historically.
I'm not saying anything about any patches you may have submitted, just pointing
out that applying them requires real work, not just mechanical patching.
In fact the rate of error in patches is quite high and I've learned to be
cautious.
f. Some patches interfere with each other; for example, a patch for making
64 bit machines work right and a patch for some specific bug collided.

I've started to work on the MA for Numarray but I'm not able to do much
work on Numeric right now. This is a place where someone else has to help.


>-- Original Message --
>To: dubois at users.sourceforge.net
>Subject: numarray/Numeric upkeep?
>From: Michael Stone <mbrierst at users.sourceforge.net>
>Cc: <mbrierst at users.sourceforge.net>
>Date: Tue, 21 Jan 2003 11:32:03 -0800
>
>
>
>No one seems to be doing bugfixes for Numeric or numarray.
>Nothing seems to have happened for several months.  Lots of bugs have been
>posted for Numeric, some easily fixable (I submitted one with a patch).
>
>Any idea if either project will become active again anytime soon?


From perry at stsci.edu  Tue Jan 21 12:28:13 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 12:28:13 2003
Subject: [Numpy-discussion] RE: numarray/Numeric upkeep?
In-Reply-To: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEBLECAA.perry@stsci.edu>

Michael Stone wrote:

> >No one seems to be doing bugfixes for Numeric or numarray.
> >Nothing seems to have happened for several months.  Lots of bugs 
> have been
...

It certainly isn't true that nothing has happened for several months
with numarray. On what do you base this belief? While not all bugs
have been fixed, the oldest listed in the numarray bug tracker is
from December. Is there a bug you feel needs urgent attention?

Work is continuing and new releases will be coming out.

As to Paul's comments regarding when numarray will be ready,
my guess is when the following are complete:

- Package reorganization (make numarray a package)
- Optimization for small arrays (making numarray's speed with small arrays
   more comparable with Numeric; this is probably the single largest
   remaining item)
- Porting some well known packages such as MA (which Paul is working on),
   scipy, pyopengl and such to work with numarray. Some of this has been
   started.

There are other smaller things to do as well. But I'm hoping that
we can be done with these in a few months.

Perry


From bazell at comcast.net  Tue Jan 21 12:33:35 2003
From: bazell at comcast.net (Dave Bazell)
Date: Tue Jan 21 12:33:35 2003
Subject: [Numpy-discussion] array operation
Message-ID: <00bd01c2c18c$10ab5000$6401a8c0@DB>

I am trying to see if I can use where() or choose() to do this.  I can't
really figure it out.

I have a 2-d array data where each row is an observation and each column is
an attribute of the observation:

data =
[[.3, .2, 2.3,...]    <- observation 1
 [.7, 1.2, .4...]     <- observation 2
...]]

I have another 1-d array that contains a code for the class of object:

class = [0,1,0,1,1,3,2,0,...]

where class[i] = the class of the ith object in the data array.  Thus,
observation 1 above is class 0, observation 2 is class 1, and so on.

I want to select all objects of a given class from data array.  I can do
this with a loop

for i in range(ndat):
    if class[i] == 0:
        do something
    ....

Is there a way to use where() or choose() to do this?  Would it be more
efficient?

Thanks,

Dave


From perry at stsci.edu  Tue Jan 21 13:02:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 13:02:05 2003
Subject: [Numpy-discussion] array operation
In-Reply-To: <00bd01c2c18c$10ab5000$6401a8c0@DB>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOEBMECAA.perry@stsci.edu>

Dave Bazell writes:
> I am trying to see if I can use where() or choose() to do this.  I can't
> really figure it out.
> 
> I have a 2-d array data where each row is an observation and each 
> column is
> an attribute of the observation:
> 
> data =
> [[.3, .2, 2.3,...]    <- observation 1
>  [.7, 1.2, .4...]     <- observation 2
> ...]]
> 
> I have another 1-d array that contains a code for the class of object:
> 
> class = [0,1,0,1,1,3,2,0,...]

Note that using class as a variable name is illegal; it is a reserved
keyword. I use code for that array below.
> 
> where class[i] = the class of the ith object in the data array.  Thus,
> observation 1 above is class 0, observation 2 is class 1, and so on.
> 
> I want to select all objects of a given class from data array.  I can do
> this with a loop
> 
I assume you mean you want to select all the rows corresponding to all
the observations where the code for the class corresponding to that
observation equals some particular value.

If so then for numarray this ought to work.

index = nonzero(code==1) # want indices of all the obs where class code = 1
selected_obs = data[index]

(or in one line if you wish: selected_obs = data[nonzero(code==1)]  )
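
For anyone unsure what the two lines compute, here is the same selection
spelled out in plain Python with lists. This is only a sketch of the
semantics with made-up sample data, not numarray code; the array version
avoids the explicit Python loop.

```python
# Four observations with three attributes each, plus their class codes.
data = [[0.3, 0.2, 2.3],
        [0.7, 1.2, 0.4],
        [0.1, 0.9, 1.5],
        [2.0, 0.5, 0.6]]
code = [0, 1, 0, 1]

# Like nonzero(code == 1): positions where the class code is 1.
index = [i for i, c in enumerate(code) if c == 1]

# Like data[index]: the rows (observations) at those positions.
selected_obs = [data[i] for i in index]
```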

> for i in range(ndat):
>     if class == 0:
>         do something
>    ....
> 
> Is there a way to use where() or choose() to do this?  Would it be more
> efficient?
> 
Perry


From Chris.Barker at noaa.gov  Tue Jan 21 14:30:10 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 21 14:30:10 2003
Subject: [Numpy-discussion] array operation
References: <JFEGLNDJEDNOMPPHDEJFOEBMECAA.perry@stsci.edu>
Message-ID: <3E2DC965.9328BCD6@noaa.gov>

Perry Greenfield wrote:

> If so then for numarray this ought to work.
> 
> index = nonzero(code==1) # want indices of all the obs where class code = 1
> selected_obs = data[index]

or for Numeric, use take():

selected_obs = take(data,nonzero(code == 1),1)

(this will select columns corresponding to where the code == 1, which is
how I read your question)


By the way, choose() and where() do something similar, but give you an
array back that is the same size as the one you start with, with some
(or all) of the elements replaced. take() gives you a smaller array that
is a subset of the original one, which I think is what you want here.
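
The difference in the shape of the result can be seen in a few lines of
plain Python (lists instead of arrays, made-up values; the comments only
indicate roughly which array call each line mimics):

```python
values = [10, 20, 30, 40]
code = [0, 1, 0, 1]

# where()-style: same length as the input, non-matching elements replaced.
replaced = [v if c == 1 else -1 for v, c in zip(values, code)]

# take()-style: only the matching elements, so the result is shorter.
subset = [v for v, c in zip(values, code) if c == 1]
```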

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Tue Jan 21 14:39:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jan 21 14:39:04 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray
 arrays?
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> <200301212005.30328.falted@openlc.org>
Message-ID: <3E2DCBDA.1040604@stsci.edu>

Francesc Alted wrote:

>I think this should be not too difficult to achieve and I'll try to explain
>why.
>
>When going from numarray to Numeric, numarray already have NA_InputArray
>C-API function that returns a well-behaved array. But strictly speaking, we
>don't even need a well-behaved array (this is a too restrictive condition)
>as both Numeric and numarray support discontiguous data. Even the byteorder
>should be not a problem, because, as Numeric itself has no such a property,
>we can create a Numeric array that is in native order as the result and
>byteswap the numarray object (if needed) before doing the conversion.
>
In-place byteswapping sounds like a bad idea to me.  What if the array 
is based upon a readonly buffer?  We've just started using these at 
STSCI because a readonly memory map imposes no load on the system swap 
file.  With a read only mapping,  the buffer itself has readonly pages; 
 these cannot be swapped in-place.
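
A small illustration of the problem in present-day Python, with a bytes
object standing in for the read-only mapping (an assumption for the sake of
a self-contained example; a read-only mmap behaves similarly):

```python
import array

# bytes objects expose a read-only buffer, much like a read-only mapping.
readonly = memoryview(b"\x00\x01\x00\x02")

try:
    # Attempt to swap the two bytes of the first 16-bit item in place.
    readonly[0], readonly[1] = readonly[1], readonly[0]
    swapped_in_place = True
except TypeError:
    swapped_in_place = False

# Swapping a copy works and leaves the original buffer untouched.
copy = array.array("H")
copy.frombytes(readonly.tobytes())
copy.byteswap()
```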

>So, non-alignment remains as the only issue that may cause a buffer copy
>during numarray ==> Numeric conversion. Is that correct?. 
>
I don't think so.

>If yes, it is
>possible to do a workaround about that, i.e. we can still get a Numeric from
>a numarray without copying the data in case of numarray misaligned objects?.
>  
>
I don't see how.  The primary source of misaligned arrays is numerical 
columns in recarrays.  It seems to me that if the data is misaligned, 
 you either have to copy it to someplace else which is aligned,  or 
teach the function which is going to process it how to access it 
byte-wise.  Only the former sounds feasible to me.

>Regarding to going in the other sense (ie. Numeric ==> numarray), as
>numarray supports discontiguity, misalignment and byteswapped data, this
>conversion should not imply a data buffer copy at all. 
>  
>
This sounds correct.  

>Once we have a pointer to the data buffer, it is only a matter of
>wrapping a Numeric or numarray object around it getting this info from the
>original object, and returning the new object as a result.
>
>All in all, this conversion *seems* to be not a too difficult task.
>  
>
It seems straightforward in principle,  but the memory management issues 
seem a little tricky to me.   It's easy to get buffers from numarrays, 
and create numarrays from buffers.  I guess we need a module which does 
the same for Numeric.  

There are two easy ways to "get a buffer" from a Numeric array:

1.  Wrap the Numeric data in a buffer object.
2.  Add support for the buffer API to the Numeric object.

Off hand,  I'm not sure which is better,  although (1) is less intrusive 
to Numeric and I suppose is the place to start.  This should be easy.

But,  I'm not sure how to create a Numeric array from a buffer.  It's 
easy to get the data pointer from a buffer, and to construct a Numeric 
array from a data pointer,   but we also need a way to stash the pointer 
to the buffer object.    I don't like the idea of modifying Numeric's 
PyArrayObject.  

>Making such a conversion functions (in C, but also having Python
>counterparts) available might represent to open the door to a co-existence
>of Numeric and numarray objects in the same program, and that would easy the
>numarray deployment in existing Numeric software.
>
>Comments?
>  
>
All in all,  I think this is a great idea which would really boost 
interoperability.  I wish there was a simpler approach which required no 
modifications to Numeric.

Todd 


From falted at openlc.org  Wed Jan 22 01:53:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 01:53:01 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
Message-ID: <200301221051.57337.falted@openlc.org>

Hi,

I have discovered that the Numeric emulation functions in numarray don't
accept a character typecode as the type parameter.

This is not immediately apparent because the type parameter is of type 'int',
and passing it a 'char' may not be good practice. But the fact is that
Numeric *does* accept the charcodes in the type parameter. 

For example, this is the normal way to call the PyArray_FromDims function:

arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)

but, in Numeric, this other manner also works:

arr = PyArray_FromDims(self.rank, self.dimensions, 'd')

Now, in numarray, if you pass a character to the type parameter, a
"segmentation fault" is issued.

Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
handled as types in Numeric. I think something like this should be added to
the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.

Another thing. It seems to me that NA_New and NA_Empty functions are not
well documented in the numarray documentation as they differ from the
definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will
stay, because I prefer them a lot more than the documented ones :-)

Bye,

-- 
Francesc Alted


From jmiller at stsci.edu  Wed Jan 22 06:52:08 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 22 06:52:08 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation
 functions in numarray
References: <200301221051.57337.falted@openlc.org>
Message-ID: <3E2EAFE9.4060900@stsci.edu>

Francesc Alted wrote:

>Hi,
>
>I have discovered that the Numeric emulation functions in numarray doesn't
>accept a character typecode as type parameter.
>
Interesting.  

>
>This is not immediately apparent because type parameter is of type 'int',
>and passing it a 'char' maybe not a good practice. 
>
I wrote the emulation functions using the manual and intuition rather 
than the existing code.  There will be others like this.

>But the fact is that
>Numeric *do* accept the charcodes in the type parameter. 
>
>  
>
No argument here.  numarray can "always" be more compatible than it is 
"now",  for any value of always or now.  I think the only real way to 
avoid that would be to build Numeric into numarray,  which sounds 
dubious. :)

>For example, this is the normal way to call the PyArray_FromDims function:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)
>
>but, in Numeric, this other manner also works:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, 'd')
>  
>
This was nicely illustrated.

>Now, in numarray, if you pass a character to the type parameter, a
>"segmentation fault" is issued.
>  
>
Decidedly not good.

>Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
>handled as types in Numeric. I think something like this should be added to
>the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.
>
I did a simple implementation of PyArray_DescrFromType trying to add 
support for f2py.

There are 2 real issues with it that I see:

1.  It still doesn't handle character codes.  I think it could handle 
both NumericTypes and character codes without conflict because of the 
way the ASCII character set is laid out.

2. I just added it so that it *could* be called since I think f2py 
needed it.  I didn't call it anywhere from the other compatibility 
functions.
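
A rough Python sketch of the idea in item 1, assuming the integer type numbers form a small enumeration that sits well below the ASCII codes of the letters used as typecodes. Both tables and the name `resolve_type` are illustrative only; they are not numarray's real type numbers or API.

```python
# Hypothetical small enumeration of type numbers (NOT numarray's actual values).
TYPE_NUMBERS = {0: "Int8", 1: "UInt8", 2: "Int16", 3: "Int32",
                4: "Float32", 5: "Float64"}

# A few Numeric-style character codes, keyed by the character itself.
CHAR_CODES = {"1": "Int8", "b": "UInt8", "s": "Int16", "l": "Int32",
              "f": "Float32", "d": "Float64"}


def resolve_type(code):
    """Accept a small integer type number or a character code passed as an int."""
    if code in TYPE_NUMBERS:
        return TYPE_NUMBERS[code]
    # Range-check before any lookup so a stray value is rejected cleanly
    # instead of being used as an index.
    if 0 <= code < 128 and chr(code) in CHAR_CODES:
        return CHAR_CODES[chr(code)]
    raise ValueError("unknown type code: %r" % (code,))
```

Because the lowest character code used here is `'1'` (ASCII 49), the two ranges cannot collide as long as the enumeration stays small.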

Care to do another patch?  

>Another thing. It seems to me that NA_New and NA_Empty functions are not
>well documented in the numarray documentation as they differ from the
>definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will
>stay, because I prefer them a lot more than the documented ones :-)
>
If you're working from CVS,  the form they're in now was the result of 
someone's detailed comments.

They're still not quite right,  because the interface is written in 
terms of int arrays,  which is not good for LP64 platforms where long is 
really what is needed to avoid creating 2G bottlenecks.  The naming is 
also not consistent and I will want to make it so before release of
numarray-0.5.

>Bye,
>
>  
>
Todd


From falted at openlc.org  Wed Jan 22 09:48:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 09:48:03 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
In-Reply-To: <3E2EAFE9.4060900@stsci.edu>
References: <200301221051.57337.falted@openlc.org> <3E2EAFE9.4060900@stsci.edu>
Message-ID: <200301221846.13358.falted@openlc.org>

On Wednesday 22 January 2003 15:51, Todd Miller wrote:
>
> I did a simple implementation of PyArray_DescrFromType trying to add
> support for f2py.

> There are 2 real issues with it that I see:
>
> 1.  It still doesn't handle character codes.  I think it could handle
> both NumericTypes and character codes without conflict because of the
> way the ASCII character set is layed out.

I think so

>
> 2. I just added it so that it *could* be called since I think f2py
> needed it.  I didn't call it anywhere from the other compatability
> functions.
>

I tried to patch your PyArray_DescrFromType, but nothing changed
because, as you said, no compatibility function calls it.

> Care to do another patch?

Well, I've tried to patch the NA_NewAll function in newarray.c:

        typeObject = pNumType[type];
        if (!typeObject) {
           /* Test if it is a Numeric charcode */
           sprintf(strcharcode, "%c", type);
           charcode = PyString_FromString(strcharcode);
           typeobj = PyDict_GetItemString(pNumericTypesTDict, strcharcode);
           if (typeobj) {
              typeObject = typeobj;
           } else
             return (PyArrayObject *) PyErr_Format(_Error,
                   "Type object lookup returned NULL for type %d", type);
        }

instead of the original code:

        typeObject = pNumType[type];
        if (!typeObject)
                return (PyArrayObject *) PyErr_Format(_Error,
                    "Type object lookup returned NULL for type %d", type);
        
with no luck as the segmentation fault continues to appear.

Anyway, I've already patched my original code to use only integer codes, not
character codes, so it is no longer a problem (at least for me).

> They're still not quite right,  because the interface is written in
> terms of int arrays,  which is not good for LP64 platforms where long is
> really what is needed to avoid creating 2G bottlenecks.  The naming is
> also not consistent and I will want to make it so before release of
> numarray-0.5.

Ok, so perhaps it's better to use PyArray_FromDims rather than NA_Empty
(at least until the C-API stabilizes). That's good to know!

BTW, while patching the numarray sources I noticed some missing
character code types in numerictypes.py. These are the ones corresponding to
UInt16, Int64 and UInt64. They don't appear in recarray either (except for
Int64, which appears as 'N' in numfmt, but with no counterpart in revfmt),
so one can't build recarrays with these types because you need a charcode
for the "formats" string.

Is this intentional? Do you plan to fill these gaps (it would be nice,
especially for recarrays)?

Thanks,

-- 
Francesc Alted


From haase at msg.ucsf.edu  Thu Jan 23 14:06:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Thu Jan 23 14:06:04 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu>
Message-ID: <08ad01c2c32b$900238f0$3b45da80@rodan>

Hi,
I can print a numarray of any int type just fine, but
I still get the compress error message with Float (or complex)
data:
>>> c
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>>> c.astype(na.Float)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
    separator, array_output)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
    format, item_length = _floatFormat(data, precision, suppress_small)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
AttributeError: 'module' object has no attribute 'compress'

I get this on Windows (2000) and on Linux, both with numarray 0.4.

Thanks,
Sebastian


----- Original Message -----
From: "Todd Miller" <jmiller at stsci.edu>
To: "Sebastian Haase" <haase at msg.ucsf.edu>
Cc: <Numpy-discussion at lists.sourceforge.net>
Sent: Thursday, December 19, 2002 5:58 AM
Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress'


> Sebastian Haase wrote:
>
> >Hi!
> >Somehow I have a problem with numarray. Please take a look at this:
> >
> Hi Sebastian,
>
> I've don't recall seeing anything like this,  nor can I reproduce it
> now.   If you've been following numarray for a while now,  I can say
> that it is important to remove the old version of numarray before
> installing the new version.   I recommend deleting your current
> installation and reinstalling numarray.
>
> compress() is a ufunc,  much like add() or put().  It is defined in
> ndarray.py,  right after the import of the modules ufunc and _ufunc.
> _ufunc in particular is a problematic module,  because it has followed
> the atypical development path of moving from C-code to Python code.
>  Because of this, and the fact that a .so or .dll overrides a .py,
>  older installations interfere with newer ones.  The atypical path was
> required because the original _ufuncmodule.c was so large that it could
> not be compiled on some systems;  as a result,  I split _ufuncmodule.c
> into pieces by data type and now use _ufunc.py to glue the pieces
together.
>
> Good luck!    Please let me know if reinstalling doesn't clear up the
> problem.
>
> Todd
>
> >
> >
> >>>>import numarray as na
> >>>>na.array([0, 0])
> >>>>
> >>>>
> >array([0, 0])
> >
> >
> >>>>na.array([0.0, 0.0])
> >>>>
> >>>>
> >Traceback (most recent call last):
> >  File "<input>", line 1, in ?
> >  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in
> >__repr__
> >    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163,
in
> >array2string
> >    separator, array_output)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125,
in
> >_array2string
> >    format, item_length = _floatFormat(data, precision, suppress_small)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246,
in
> >_floatFormat
> >    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data,
0),
> >data))
> >AttributeError: 'module' object has no attribute 'compress'
> >
> >The same workes fine with Numeric. But I would prefer numarray because
I'm
> >writing C++-extensions and I need "unsigned shorts".
> >
> >What is this error about?
> >
> >Thanks,
> >Sebastian
> >
> >
> >
> >
> >-------------------------------------------------------
> >This SF.NET email is sponsored by: Order your Holiday Geek Presents Now!
> >Green Lasers, Hip Geek T-Shirts, Remote Control Tanks, Caffeinated Soap,
> >MP3 Players,  XBox Games,  Flying Saucers,  WebCams,  Smart Putty.
> >T H I N K G E E K . C O M       http://www.thinkgeek.com/sf/
> >_______________________________________________
> >Numpy-discussion mailing list
> >Numpy-discussion at lists.sourceforge.net
> >https://lists.sourceforge.net/lists/listinfo/numpy-discussion
> >
> >
>
>
>
>


From jmiller at stsci.edu  Thu Jan 23 14:33:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan 23 14:33:03 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu> <08ad01c2c32b$900238f0$3b45da80@rodan>
Message-ID: <3E306D73.6050303@stsci.edu>

Sebastian Haase wrote:

>Hi,
>I can print numarray of any int time just fine, but
>
OK.  I am assuming you deleted all of your old numarray installations as 
I recommended and reinstalled numarray-0.4.

What is your PYTHONPATH?

>I still get the compress error message with Float (or complex)
>data:
>
>>>> c
>array([[0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       ...,
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>>>> c.astype(na.Float)
>Traceback (most recent call last):
>  File "<input>", line 1, in ?
>  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
>    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
>    separator, array_output)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
>    format, item_length = _floatFormat(data, precision, suppress_small)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
>    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
>AttributeError: 'module' object has no attribute 'compress'
>
>I get this on Windows (2000) and on Linux. Both numarray 0.4
>
I'm not sure what's going on here,  but I develop on both platforms, 
 and Linux constantly.    The self tests definitely pass in Linux.   It 
must be some kind of environment issue or runtime issue.  What happens 
when you type:

 >>> import numtestall
 >>> numtestall.test()
... what gets printed here? ...
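
When a module seems to be missing an attribute it should have, it can also help to check which file Python actually imported. The sketch below is written for current Python 3 and uses only the standard library. `json` is just a stand-in module for the demonstration, and `describe_module` is a hypothetical helper name; substitute the package under suspicion.

```python
import importlib


def describe_module(name, attr):
    """Report where a module was loaded from and whether it has `attr`."""
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attr": hasattr(module, attr),
    }


report = describe_module("json", "dumps")
print(report["file"], report["has_attr"])
```

If the reported path points at a leftover copy from an older installation, that would be consistent with the stale-install explanation given earlier in this thread.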

>Thanks,
>Sebastian


From j_r_fonseca at yahoo.co.uk  Thu Jan 23 16:10:02 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Thu Jan 23 16:10:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
Message-ID: <20030124000759.GA6042@localhost.localdomain>

With the ability to subclass types in recent versions of the Python
language, more people will be interested in subclassing Numeric arrays
for specific purposes.  However, the use of functions instead of methods
takes away many of the advantages, notably the ability to override them
in a subclass.

Taking this statement as an example:

	Numeric.put(myarray, myindices, myvalues)

In the current state of affairs, if we wanted this statement to
work with a sparse matrix class derived from a Numeric array, it would
have to be something like:

	Sparse.put(myarray, myindices, myvalues)

That is, it forces the underlying code to know whether it is dealing
with Numeric arrays or some other equivalent class. But it would be
much more useful to have simply:

	myarray.put(myindices, myvalues)

which would work regardless of the actual type of myarray, provided it
supplied the put() method. This would enormously improve code
reusability and extensibility.
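
A small self-contained sketch of the point, in plain Python. The classes `DenseVector` and `SparseVector` and the helper `mark_endpoints` are hypothetical illustrations, not part of Numeric or numarray; the only thing they share is a `put()` method.

```python
class DenseVector:
    """Stores every element in a list."""

    def __init__(self, values):
        self.values = list(values)

    def put(self, indices, values):
        for i, v in zip(indices, values):
            self.values[i] = v

    def to_list(self):
        return list(self.values)


class SparseVector:
    """Stores only the nonzero elements in a dict."""

    def __init__(self, size):
        self.size = size
        self.nonzero = {}

    def put(self, indices, values):
        for i, v in zip(indices, values):
            if v:
                self.nonzero[i] = v
            else:
                self.nonzero.pop(i, None)

    def to_list(self):
        return [self.nonzero.get(i, 0) for i in range(self.size)]


def mark_endpoints(vector, last_index):
    """Generic code: works with any object that supplies put()."""
    vector.put([0, last_index], [1, 1])
    return vector
```

`mark_endpoints` never needs to know which class it was given, which is the reusability argument made above.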

I know that there are certain implementation details that may make this
difficult (like many functions being implemented in pure Python), but any
advances made in this direction will be an improvement on the current
situation.

Also, I know that this example is not the best one, because numarray will
do these things with the __getitem__ and __setitem__ operators. But
others could easily be shown.

Regards,

José Fonseca


From falted at openlc.org  Fri Jan 24 04:00:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 04:00:07 2003
Subject: [Numpy-discussion] typecodes in numarray
Message-ID: <200301241259.30243.falted@openlc.org>

Maybe I'm becoming a bit tedious with this, but if you look at:

>>> import numerictypes
>>> numerictypes.typecode
{Complex64: 'D', Int32: 'l', UInt16: 's', Complex32: 'F', Float64: 'd',
UInt8: 'b', Int16: 's', Float32: 'f', Int8: '1'}

you can find some inconsistencies that lead to weird things like:

>>> array([1,2], Int16).typecode()
's'
>>> array([1,2], UInt16).typecode()
's'  #  --> same as Int16!
>>> array([1,2], Int64).typecode()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: Int64
>>> array([1,2], UInt64).typecode()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: UInt64

Also, 'l' is used here to map Int32, while in recarray it is used to map Boolean.

Moreover, Numeric 22.0 introduced the equivalent of the UInt16 and UInt32 types
as 'w' and 'u' respectively. But, again, 'u' is used in recarray as a synonym
for UInt8.

I think it's important to agree on a definitive set of charcodes and use
them uniformly throughout numarray.
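
As a quick check of the kind of clash shown above, this sketch scans a typecode table for characters that map to more than one type. The table copies the `numerictypes.typecode` output quoted in this message, with type names written as plain strings so the snippet runs without numarray; `find_collisions` is a hypothetical helper name.

```python
from collections import defaultdict

# Copied from the numerictypes.typecode output above (names as strings).
TYPECODES = {
    "Complex64": "D", "Int32": "l", "UInt16": "s", "Complex32": "F",
    "Float64": "d", "UInt8": "b", "Int16": "s", "Float32": "f", "Int8": "1",
}


def find_collisions(typecodes):
    """Return {charcode: [type names]} for every charcode used more than once."""
    by_code = defaultdict(list)
    for type_name, code in typecodes.items():
        by_code[code].append(type_name)
    return {code: sorted(names) for code, names in by_code.items()
            if len(names) > 1}


collisions = find_collisions(TYPECODES)
print(collisions)
```

A check like this could run as part of a test suite whenever a new charcode is added.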

Suggestion: if recarray charcodes do not need to match the Numeric
ones, I propose that using the Python convention may be a good idea.
Look at the table in:
http://www.python.org/doc/current/lib/module-struct.html.
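
For reference, a short sketch of what following the `struct` convention could look like. The format characters are those documented for Python's `struct` module; the pairing with numarray type names (written here as strings) is my assumption about how such a table might be laid out, not an existing numarray mapping.

```python
import struct

# struct format characters, paired with numarray-style type names.
# The pairing is an illustrative assumption.
STRUCT_CODES = {
    "Int8": "b", "UInt8": "B",
    "Int16": "h", "UInt16": "H",
    "Int32": "i", "UInt32": "I",
    "Int64": "q", "UInt64": "Q",
    "Float32": "f", "Float64": "d",
}


def itemsize(type_name):
    """Size in bytes, using struct's standard (platform-independent) sizes."""
    return struct.calcsize("<" + STRUCT_CODES[type_name])
```

With this table every type gets a distinct letter, including UInt16, Int64 and UInt64.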

-- 
Francesc Alted


From perry at stsci.edu  Fri Jan 24 06:38:17 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 06:38:17 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <200301241259.30243.falted@openlc.org>
Message-ID: <JFEGLNDJEDNOMPPHDEJFCECKECAA.perry@stsci.edu>


> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of
> Francesc Alted
> Sent: Friday, January 24, 2003 7:00 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] typecodes in numarray
> 
> 
> Maybe I'm becoming a bit tedious with this, but if you look at:
> 
No, this sort of feedback is very valuable. We'll think about this a
bit, but I'd agree that consistency with Numeric codes is important. Some
of the history of the codes used by recarray arise from conventions
used in other software not related to Python or Numeric. But if
recarray is to be generic and used by others, we should hide, remove
or layer such conventions in a subclass. Let us think about how we should
do that.

Thanks, Perry 


From perry at stsci.edu  Fri Jan 24 09:04:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 09:04:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
Message-ID: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>


Todd Miller had some further comments that I thought were worth
posting as well (and I think he makes some very good points).

************************************************************************

My [i.e. Todd's]  thoughts about it:

>Maybe I'm becoming a bit tedious with this, but if you look at:
>
No.  It shows you're thinking about it carefully.   Having looked at all 
of the examples below,  I have some comments:

1.  The sparseness and obscurity of the typecode "wordspace" are both 
demonstrated here.  There are so few letters to choose from,  they're 
often already used in some other context.  Even given the large number 
of unused letters,  it's often difficult to choose good ones and to 
remember what has been chosen.  I think this is one of the reasons Perry 
chose to replace typecodes with true type objects which have rich, 
regular, and predictable symbolic names.

2. Typecodes were added as a backwards compatibility feature of 
numarray,  and I think it's probable that numarray beat Numeric to 
supporting most of these types, because otherwise they'd have been 
copied directly and there would be no problem.  I'm not really trying to 
play a blame-game here,  but I am making an argument that perhaps 
numarray should only go so far in the support of what I regard as an 
obsolescent feature.  If the Numeric developers choose to continue 
extending the use of typecodes in ways that are incompatible with 
numarray,  one way of dealing with it is to "just say no".  We are going 
beyond the scope of backwards compatibility to on-going compatibility. 
(Which we may still have to do but needs to be discussed and considered)

3. STSCI has layered other software on top of numarray and recarray 
which astronomers use to do work.   It is the friction of that interface 
which makes correcting these consistency problems more difficult than 
might be immediately apparent.

>I think it's important to agree with a definitive set of charcodes and use
>them uniformly throughout numarray.
>
I wish this were possible,  but I'm thinking we should try to find an 
alternative approach altogether,  one which may be more verbose but 
implicitly free of conflict.

A means for specifying a recarray format might be created from tuples, 
type objects,  and integer repetition factors.

The verbosity of this approach might be a little tedious,  but it would 
also be transparent, maintainable, and conflict free.

I think we should add an "obsolescent feature" warning to numarray and 
recarray which flags any use of character typecodes when the appropriate 
command line switches are set.

>Suggestion: if recarray charcodes are not necessary to match the Numeric
>ones, I propose that using the Python convention maybe a good idea.
>Look at the table in:
>http://www.python.org/doc/current/lib/module-struct.html.
>
This sounds good to me,  except that it will break an existing interface 
that I don't have control over.  Therefore,  I suggest we correct the 
problem by coming up with something better.


From paul at pfdubois.com  Fri Jan 24 09:43:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 09:43:07 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
Message-ID: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>

I don't understand this remark:

<snip >but I am making an argument that perhaps 
> numarray should only go so far in the support of what I regard as an 
> obsolescent feature.  If the Numeric developers choose to continue 
> extending the use of typecodes in ways that are incompatible with 
> numarray,  one way of dealing with it is to "just say no".  
> We are going 
> beyond the scope of backwards compatability to on-going compatabilty. 
> (Which we may still have to do but needs to be discussed and 
> considered)
> 

There is no "on-going" Numeric development. It stops the minute numarray is
ready. Period. We developers all agreed on that. The whole reason for
numarray is that Numeric was pronounced unmaintainable and unextendable by
those who frequently had to work on it. To do anything else will fragment
the entire numerical python community and software set.


From falted at openlc.org  Fri Jan 24 10:48:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 10:48:04 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
Message-ID: <200301241946.55398.falted@openlc.org>

On Friday 24 January 2003 18:02, Todd Miller wrote:
>
> My [i.e. Todd's]  thoughts about it:
>
> No.  It shows you're thinking about it carefully.   Having looked at all
> of the examples below,  I have some comments:

I mostly agree with your comments, but let me point out some thoughts.

>
> 1.  The sparseness and obscurity of the typecode "wordspace" are both
> demonstrated here.  There are so few letters to choose from,  they're
> often already used in some other context.  Even given the large number
> of unused letters,  it's often difficult to choose good ones and to
> remember what has been chosen.  I think this is one of the reasons Perry
> chose to replace typecodes with true type objects which have rich,
> regular, and predictable symbolic names.

I completely agree that type objects are a brilliant idea.

> 3. STSCI has layered other software on top of numarray and recarray
> which astronomers use to do work.   It is the friction of that interface
> which makes correcting these consistency problems more difficult than
> might be immediately apparent.

Yeah, I know...

>
> >I think it's important to agree with a definitive set of charcodes and use
> >them uniformly throughout numarray.
>
> I wish this were possible,  but I'm thinking we should try to find an
> alternative approach altogether,  one which may be more verbose but
> implicitly free of conflict.
>
> A means for specifying a recarray format might be created from tuples,
> type objects,  and integer repetition factors.
>
> The verbosity of this approach might be a litte tedious,  but it would
> also be transparent, maintainable, and conflict free.

I think this is a very good idea. In fact, while working on PyTables I was
lately pondering what would be the best way to define record arrays, and I
also think that a verbose approach would be best.

After considering metaclasses and tuples, I ended up with a compromise
between the two: dictionaries combined with some function or class to
refine the definition.

My current thinking is something like:

recarrDescr = {
    "name"        : defineType(CharType, 16, ""),  # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),    # integer
    "grid_j"      : defineType(Int32, 1, 9),    # integer
    "pressure"    : defineType(Float32, 1, 1.),  # float  (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
    "idnumber"    : defineType(Int64, 1, 0),    # signed long long 
    }

where defineType is a class that accepts (type, shape, default) parameters.
It can be extended safely in the future if more needs appear.
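
A minimal sketch of what such a class could look like, assuming nothing beyond the (type, shape, default) parameters described above. Plain strings stand in for the numarray type objects so the snippet is self-contained; the attribute names and `__repr__` are illustrative guesses, not PyTables code.

```python
class defineType:
    """Hypothetical column descriptor holding (type, shape, default)."""

    def __init__(self, type, shape=1, default=None):
        self.type = type
        self.shape = shape
        self.default = default

    def __repr__(self):
        return "defineType(%r, %r, %r)" % (self.type, self.shape, self.default)


# Strings stand in for type objects such as UInt8 or Float64.
descr = {
    "TDCcount": defineType("UInt8", 1, 0),
    "pressure": defineType("Float32", 1, 1.0),
    "temperature": defineType("Float64", 32, list(range(32))),
}

# The dictionary maps column names directly to their definitions.
column_names = sorted(descr)
```

New optional parameters can be added to `__init__` later without breaking existing descriptions, which is the extensibility mentioned above.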

A dictionary has the advantage over a tuple in that you can map column names
to their contents quite easily, and it is more flexible than defining the
fields with a metaclass descendant (see
http://pytables.sourceforge.net/html-doc/usersguide-html3.html#subsection3.1.2)
because dictionaries can be built up at run time (metaclass descendants might
allow that too, but in a more mysterious way that I think is not worth the
trouble). In addition, the dictionary object is available in all Python
versions, whereas metaclasses are only available from 2.2 on. However, I
regard metaclasses as the most elegant solution (but elegance is not always
equivalent to convenience :().

Perhaps you may want to consider this for use in the recarray definition.

>
> I think we should add an "obsolescent feature" warning to numarray and
> recarray which flags any use of character typecodes when the appropriate
> command line switches are set.

Well, I don't fully agree with that. I do believe that class typecodes are
a more meaningful way of describing types, but charcodes can be quite
advantageous in certain situations, like describing the contents of a
record in a compact way, or passing this info to C routines that deal with
the data.

For example, consider the benefits of describing a recarray format as:

"3s4i20d"

instead of

((Int16, 3), 
 (Int32, 4),
 (Float64, 20),
 )

the former being more handy in lots of situations.

I certainly believe that the coexistence of both can be very beneficial,
especially for 3rd party extension makers (like me :).

>
> >Suggestion: if recarray charcodes are not necessary to match the Numeric
> >ones, I propose that using the Python convention maybe a good idea.
> >Look at the table in:
> >http://www.python.org/doc/current/lib/module-struct.html.
>
> This sounds good to me,  except that it will break an existing interface
> that I don't have control over.  Therefore,  I suggest we correct the
> problem by coming up with something better.

Well, if charcodes finally stay in, this has an additional advantage in
that the Python crew has provided meaningful ways to express padding
(character "x"), endianness ("=", "<", ">") and alignment ("@"). So a compact
expression like "@3sx4i20d", apart from looking like Chinese to Westerners,
may give a lot of info in a handy way.
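
As a rough illustration of the struct convention mentioned above, here is a
small sketch using only the standard struct module. One assumption to note:
in struct, a short integer is spelled "h" (there "s" means a byte string), so
the record is written "3h4i20d" in this sketch rather than "3s4i20d".

```python
import struct

# Three shorts, one pad byte ("x"), four ints, twenty doubles.
# "<" selects little-endian order with standard sizes and no alignment padding.
packed_fmt = "<3hx4i20d"
# "@" selects native byte order, native sizes and native alignment.
native_fmt = "@3hx4i20d"

packed_size = struct.calcsize(packed_fmt)  # 3*2 + 1 + 4*4 + 20*8 bytes
native_size = struct.calcsize(native_fmt)  # platform dependent, may add padding

# The pad byte consumes no argument, so 3 + 4 + 20 values are packed.
values = [1, 2, 3] + [10, 20, 30, 40] + [0.5] * 20
record = struct.pack(packed_fmt, *values)
roundtrip = struct.unpack(packed_fmt, record)
```

The native size may be larger than the packed one because "@" lets the
platform insert alignment padding before the ints and the doubles.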

-- 
Francesc Alted


From jmiller at stsci.edu  Fri Jan 24 11:20:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 11:20:05 2003
Subject: [Fwd: Re: [Numpy-discussion] typecodes in numarray]
Message-ID: <3E319543.8040101@stsci.edu>


-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 38
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20030124/12498748/attachment-0001.mht>

From jmiller at stsci.edu  Fri Jan 24 14:01:31 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri, 24 Jan 2003 14:01:31 -0500
Subject: [Numpy-discussion] typecodes in numarray
References: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <3E318D8B.1090403@stsci.edu>

Paul F Dubois wrote:

>I don't understand this remark:
>
>><snip> but I am making an argument that perhaps
>>numarray should only go so far in the support of what I regard as an
>>obsolescent feature.  If the Numeric developers choose to continue
>>extending the use of typecodes in ways that are incompatible with
>>numarray,  one way of dealing with it is to "just say no".
>>We are going beyond the scope of backwards compatibility to on-going
>>compatibility. (Which we may still have to do but needs to be discussed
>>and considered)
>
>There is no "on-going" Numeric development. It stops the minute numarray is
>ready. Period. We developers all agreed on that. The whole reason for
>numarray is that Numeric was pronounced unmaintainable and unextendable by
>those who frequently had to work on it. To do anything else will fragment
>the entire numerical python community and software set.
>
Thanks for clarifying Paul.   My point didn't quite come out right.   A 
better way to put it might have been:

1. Numarray and Numeric are subject to accidental divergence.  As long 
as they both continue to change concurrently,  they will probably differ 
even in interface.  Because numarray isn't quite ready yet,  they are 
both still changing.

2. Typecodes in particular are something numarray is superseding with 
something better.  Because of this, providing on-going compatibility 
with Numeric typecodes may not make sense.  

3. Numeric compatibility is not the only driver for the choice of 
recarray typecodes so I can't make arbitrary changes without affecting 
other software and people.

4. I think there's a clearer,  numarray type object based approach to 
describing recarray formats which does not use typecodes at all.  Thus, 
instead of attempting to weed through and unify layers of conflicting 
type codes,  we might be able to end-run the whole problem with an 
alternative approach.

Todd

>
>-------------------------------------------------------
>This SF.NET email is sponsored by:
>SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
>http://www.vasoftware.com
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>  
>




From perry at stsci.edu  Fri Jan 24 11:34:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 11:34:02 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <JFEGLNDJEDNOMPPHDEJFOECOECAA.perry@stsci.edu>

I think Todd was referring to the recent addition of unsigned types
to Numeric, along with which came new typecodes. These types were already
in numarray at the time.

Perry

> -----Original Message-----
> From: Paul F Dubois [mailto:paul at pfdubois.com]
> Sent: Friday, January 24, 2003 12:42 PM
> To: 'Perry Greenfield'; falted at openlc.org;
> numpy-discussion at lists.sourceforge.net
> Subject: RE: [Numpy-discussion] typecodes in numarray
>
>
> I don't understand this remark:
>
> <snip >but I am making an argument that perhaps
> > numarray should only go so far in the support of what I regard as an
> > obsolescent feature.  If the Numeric developers choose to continue
> > extending the use of typecodes in ways that are incompatible with
> > numarray,  one way of dealing with it is to "just say no".
> > We are going
> > beyond the scope of backwards compatibility to on-going compatibility.
> > (Which we may still have to do but needs to be discussed and
> > considered)
> >
>
> There is no "on-going" Numeric development. It stops the minute
> numarray is
> ready. Period. We developers all agreed on that. The whole reason for
> numarray is that Numeric was pronounced unmaintainable and unextendable by
> those who frequently had to work on it. To do anything else will fragment
> the entire numerical python community and software set.
>
>
>
>
>
>


From jmiller at stsci.edu  Fri Jan 24 12:01:32 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 12:01:32 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org>
Message-ID: <3E319ED4.5060709@stsci.edu>

>
>
>>A means for specifying a recarray format might be created from tuples,
>>type objects,  and integer repetition factors.
>>
>>The verbosity of this approach might be a little tedious,  but it would
>>also be transparent, maintainable, and conflict free.
>>    
>>
>
>I think this is a very good idea. In fact, while working on PyTables I was
>lately pondering what would be the best way to define record arrays, and I
>also think that a verbose approach should be the best.
>
>After considering metaclasses and tuples, I ended up with a compromise
>between the two: dictionaries combined with some function or class to
>refine the definition.
>
>My current thinking is something like:
>
>recarrDescr = {
>    "name"        : defineType(CharType, 16, ""),  # 16-character String
>    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>    "grid_i"      : defineType(Int32, 1, 9),    # integer
>    "grid_j"      : defineType(Int32, 1, 9),    # integer
>    "pressure"    : defineType(Float32, 1, 1.),  # float  (single-precision)
>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>    "idnumber"    : defineType(Int64, 1, 0),    # signed long long 
>    }
>
>where defineType is a class that accepts (type, shape, default) parameters.
>It can be extended safely in the future if more needs appear.
>
You're way ahead of me here.  The only thing I don't like about this is 
the additional relative complexity because of the addition of field 
names and default values.   It would be nice to layer this more.

>Perhaps you may want to consider this for using in recarray definition.
>
We'll definitely consider it as we hash this out.  

>
>  
>
>>I think we should add an "obsolescent feature" warning to numarray and
>>recarray which flags any use of character typecodes when the appropriate
>>command line switches are set.
>>    
>>
>
>Well, I don't fully agree with that. I do believe that class typecodes are
>a more meaningful way of describing types, but charcodes can be quite
>advantageous in certain situations, like describing the contents of a
>record in a compact way, or passing this info to C routines that deal with
>the data.
>
Yeah, I know.

>For example, consider the benefits of describing a recarray format as:
>
>"3s4i20d"
>
I know.

>
>instead of
>
>((Int16, 3), 
> (Int32, 4),
> (Float64, 20),
> )
>
This is pretty much exactly what I was thinking.  It is straightforward 
to imagine and difficult to forget.  

>
>the former being more handy in lots of situations.
>  
>
Would you please name some of these so we can explore handling them both 
ways?

>I certainly believe that the coexistence of both can be very beneficial,
>especially for 3rd party extension makers (like me :).
>
If there's a reasonable way to avoid supporting both,  we should.

>>>Suggestion: if recarray charcodes are not necessary to match the Numeric
>>>ones, I propose that using the Python convention maybe a good idea.
>>>Look at the table in:
>>>http://www.python.org/doc/current/lib/module-struct.html.
>>>      
>>>
>>This sounds good to me,  except that it will break an existing interface
>>that I don't have control over.  Therefore,  I suggest we correct the
>>problem by coming up with something better.
>>    
>>
>
>Well, if charcodes finally stay in, this has an additional advantage in
>that the Python crew has provided meaningful ways to express padding
>(character "x"), endianness ("=", "<", ">") and alignment ("@"). 
>
We might also add these to the type-repetition tuple.

Regards,
Todd


From hinsen at cnrs-orleans.fr  Fri Jan 24 12:13:05 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Jan 24 12:13:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124000759.GA6042@localhost.localdomain>
References: <20030124000759.GA6042@localhost.localdomain>
Message-ID: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>

José Fonseca <j_r_fonseca at yahoo.co.uk> writes:

> With the ability of subclassing types in recent versions of the Python
> language, more people will be interested in subclassing Numeric arrays
> for specific purposes.  Still the use of functions instead of methods
> takes away many of the advantages, the ability of being overloaded.

True. On the other hand, there is also an advantage: NumPy routines
can be used on standard Python data types such as number and sequence
types.

In the ideal world (which might come one day), core NumPy
functionality would be part of standard Python, and then all these
operations would work on other built-in types as well.

Until then, I am not sure that changing NumPy functions to methods
is a good idea. I need to call them on scalar numbers much more
often than I subclass arrays.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From paul at pfdubois.com  Fri Jan 24 12:36:03 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 12:36:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
Message-ID: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>

Every time the subject of subclassing a numeric array comes up, it is as if
nobody ever thought of it before. Been there, done that. It doesn't turn out
to be all that useful. To see why, consider a + b where a and b are Foo
instances, and Foo inherits from numarray.

a. a + b will be a numarray, not a Foo instance, unless you write a new +
operator.
b. Attempting to have numarray itself apply a subclass constructor to the
result runs into the problem that numarray does not have any idea what the
constructor's signature is or what information is needed to fill out that
constructor.
c. Even if the subclass accepts numarray's constructor signature, it would
rarely produce satisfactory results, since it would just "lose" the Foo-ness
details of a and b.

This same argument applies to every method that returns a Foo instance, and
every ufunc. So you end up redoing everything anyway.
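
The point about a + b can be sketched in a few lines of plain Python. This is
only an illustration under stated assumptions: Array is a toy stand-in for a
numeric array type (it is not numarray), and Foo, FooWithAdd and the "units"
attribute are hypothetical names invented for the example.

```python
class Array:
    """Toy stand-in for a numeric array type (not numarray itself)."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # The base class only knows how to build another Array.
        return Array(x + y for x, y in zip(self.data, other.data))


class Foo(Array):
    """Subclass carrying extra state the base class knows nothing about."""

    def __init__(self, data, units):
        Array.__init__(self, data)
        self.units = units


class FooWithAdd(Foo):
    def __add__(self, other):
        # The subclass has to decide for itself how 'units' combine.
        if self.units != other.units:
            raise ValueError("unit mismatch")
        summed = Array.__add__(self, other)
        return FooWithAdd(summed.data, self.units)


a = Foo([1, 2], "m")
b = Foo([3, 4], "m")
plain = a + b          # an Array: the Foo-ness is lost

c = FooWithAdd([1, 2], "m")
d = FooWithAdd([3, 4], "m")
kept = c + d           # a FooWithAdd, thanks to the override
```

The same override would be needed for every other operator and function that
should return the subclass, which is the cost described above.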

In short, worrying about subclassing is way down the list of things we ought
to consider. 

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net 
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On 
> Behalf Of Konrad Hinsen
> Sent: Friday, January 24, 2003 12:07 PM
> To: José Fonseca
> Cc: numpy-discussion at lists.sourceforge.net
> Subject: Re: [Numpy-discussion] Extensive use of methods 
> instead of functions
> 
> 
> José Fonseca <j_r_fonseca at yahoo.co.uk> writes:
> 
> > With the ability of subclassing types in recent versions of 
> the Python 
> > language, more people will be interested in subclassing 
> Numeric arrays 
> > for specific purposes.  Still the use of functions instead 
> of methods 
> > takes away many of the advantages, the ability of being overloaded.
> 
> True. On the other hand, there is also an advantage: NumPy 
> routines can be used on standard Python data types such as 
> number and sequence types.
> 
> In the ideal world (which might come one day), core NumPy 
> functionality would be part of standard Python, and then all 
> these operations would work on other built-in types as well.
> 
> Until then, I am not sure that changing NumPy functions to 
> methods is a good idea. I need to call them on scalar numbers 
> much more often than I subclass arrays.
> 
> Konrad.
> -- 
> --------------------------------------------------------------
> -----------------
> Konrad Hinsen                            | E-Mail: 
> hinsen at cnrs-orleans.fr
> Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
> Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
> 45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
> France                                   | Nederlands/Francais
> --------------------------------------------------------------
> -----------------
> 


From perry at stsci.edu  Fri Jan 24 13:11:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 13:11:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID: <JFEGLNDJEDNOMPPHDEJFGECPECAA.perry@stsci.edu>

Paul Dubois writes:
>
> Every time the subject of subclassing a numeric array comes up, it as if
> nobody ever thought of it before. Been there, done that. It
> doesn't turn out
> to be all that useful. To see why, consider a + b where a and b are Foo
> instances, and Foo inherits from numarray.
>
> a. a + b will be a numarray, not a Foo instance, unless you write a new +
> operator.
> b. Attempting to have numarray itself apply a subclass constructor to the
> result runs into the problem that numarray does not have any idea what the
> constructor's signature is or what information is needed to fill out that
> constructor.
> c. Even if the subclass accepts numarray's constructor signature, it would
> rarely produced satisfactory results just "losing" the Foo'ness
> details of a
> and b.
>
> This same argument applies to every method that returns a Foo
> instance, and
> every ufunc. So you end up redoing everything anyway.
>
> In short, worrying about subclassing is way down the list of
> things we ought
> to consider.
>
Paul illustrates some important points. While I'm not as down on the
ability to subclass (more on that later), he is absolutely right that
most people think that subclassing is a breeze and don't realize that it
is far from being so.

The arguments for this would be helped immensely by a practical
example of a desired subclass. This does far more to illustrate
the issues than an abstract discussion. For most instances that I
have considered or thought about it is unavoidable that one must
override virtually all (if not all) the operators and functions.
Nevertheless, subclassing can still save a great deal of work
over implementing a completely new extension. But you'll have to
deal with defining how all the operators and functions should behave.

In our view, the most valuable subclassing in numarray comes from
subclassing NDArray, which handles all the structural operations
for arrays (recarray makes heavy use of this). But recarrays don't
try to support numerical operations, and that makes it fairly easy.
Subclassing numarrays is significantly more work for the reasons cited.

Perry


From jmiller at stsci.edu  Fri Jan 24 13:56:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 13:56:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <3E31B9DB.7080603@stsci.edu>

>
>
>> My current thinking is something like:
>>
>> recarrDescr = {
>>    "name"        : defineType(CharType, 16, ""),  # 16-character String
>>    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>>    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>>    "grid_i"      : defineType(Int32, 1, 9),    # integer
>>    "grid_j"      : defineType(Int32, 1, 9),    # integer
>>    "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
>>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>    "idnumber"    : defineType(Int64, 1, 0),    # signed long long
>>    }
>>
>> where defineType is a class that accepts (type, shape, default) 
>> parameters.
>> It can be extended safely in the future if more needs appear.
>>
> You're way ahead of me here.  The only thing I don't like about this 
> is the additional relative complexity because of the addition of field 
> names and default values.   It would be nice to layer this more. 

One more thing I don't understand looking at this:  a dictionary is 
unordered.

Todd


From j_r_fonseca at yahoo.co.uk  Fri Jan 24 14:00:03 2003
From: j_r_fonseca at yahoo.co.uk (=?iso-8859-15?Q?Jos=E9?= Fonseca)
Date: Fri Jan 24 14:00:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
References: <20030124000759.GA6042@localhost.localdomain> <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr>
Message-ID: <20030124215828.GA32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 09:07:21PM +0100, Konrad Hinsen wrote:
> José Fonseca <j_r_fonseca at yahoo.co.uk> writes:
> 
> > With the ability of subclassing types in recent versions of the Python
> > language, more people will be interested in subclassing Numeric arrays
> > for specific purposes.  Still the use of functions instead of methods
> > takes away many of the advantages, the ability of being overloaded.
> 
> True. On the other hand, there is also an advantage: NumPy routines
> can be used on standard Python data types such as number and sequence
> types.
> 
> In the ideal world (which might come one day), core NumPy
> functionality would be part of standard Python, and then all these
> operations would work on other built-in types as well.
> 
> Until then, I am not sure that changing NumPy functions to methods
> is a good idea. I need to call them on scalar numbers much more
> often than I subclass arrays.

You've got a good point there. I often want to use them with other
Numeric array-like classes, but I've also used them with standard Python
data types for convenience. 

Still, it's perfectly possible for both interfaces to co-exist. Of course,
anyone using the .method version can't expect it to work with standard
Python data types, and has to choose between the function version and
calling asarray() or something equivalent first.
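
A minimal sketch of how the two interfaces could co-exist, in plain Python.
MiniArray and the helper functions here are hypothetical stand-ins written for
this example; they are not the Numeric or numarray implementations.

```python
class MiniArray:
    """Hypothetical array-like class offering a method interface."""

    def __init__(self, data):
        self.data = list(data)

    def take(self, indices):
        return MiniArray(self.data[i] for i in indices)


def asarray(obj):
    """Return obj unchanged if it already offers take(), else wrap it."""
    if hasattr(obj, "take"):
        return obj
    return MiniArray(obj)


def take(obj, indices):
    """Function interface: works on arrays and plain sequences alike."""
    return asarray(obj).take(indices)


from_list = take([10, 20, 30], [0, 2])              # plain Python list
from_array = MiniArray([10, 20, 30]).take([0, 2])   # method call
```

The function version simply delegates to the method, so any class that defines
take() can take part without subclassing.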

Regards,

José Fonseca
__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com


From j_r_fonseca at yahoo.co.uk  Fri Jan 24 15:21:02 2003
From: j_r_fonseca at yahoo.co.uk (=?iso-8859-15?Q?'Jos=E9?= Fonseca')
Date: Fri Jan 24 15:21:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
References: <m3n0lqcnwm.fsf@chinon.cnrs-orleans.fr> <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID: <20030124231900.GB32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 12:34:54PM -0800, Paul F Dubois wrote:
> 
> Every time the subject of subclassing a numeric array comes up, it as
> if nobody ever thought of it before.

Why do you treat me as if I was trying to sell the "Next Big Thing"!?

First, I must tell you that the first time I came across the idea of
subclassing Numeric arrays was while reading the "Subclassing"
subsection, in the "Special Topics" section of the Numeric Python
manual. Your name, Paul, appears as one of the authors.

Second, subclassing Numeric arrays may be useful. Again, the
distribution of Numeric Python even has one big example: making a linear
algebra oriented version of Numeric Python, where the operations would
be the standard matrix and vector operations instead of the element-wise
operations. 

> Been there, done that.  It doesn't turn out to be all that useful. 

As the examples above show, it is obvious you did. Still, I don't see how
you can possibly say it isn't useful...

> To see why, consider a + b where a and b are Foo instances, and Foo
> inherits from numarray.
> 
> a. a + b will be a numarray, not a Foo instance, unless you write a
> new + operator.  b. Attempting to have numarray itself apply a
> subclass constructor to the result runs into the problem that numarray
> does not have any idea what the constructor's signature is or what
> information is needed to fill out that constructor.  c. Even if the
> subclass accepts numarray's constructor signature, it would rarely
> produced satisfactory results just "losing" the Foo'ness details of a
> and b.
> 
> This same argument applies to every method that returns a Foo
> instance, and every ufunc. So you end up redoing everything anyway.

[In general it may be useful to subclass Numeric arrays if one just
wants to add/overload methods, but no new properties.]

And third, if you read my thread you'd notice that the use of methods
instead of functions has implications/benefits well beyond the
subclassing issue. It's particularly important for Numeric-like arrays. 

All methods in Python are virtual, so you don't actually need to subclass
to use different kinds of objects in the same piece of code.

While you're right in the sense that for many practical applications
there is little use for subclassing (a sparse matrix class is one of
them, for instance), you can't deny that it is quite useful to have
Numeric-like arrays, on the same basis as is currently done with
file-like objects in Python, i.e., they could be strings or web pages,
as long as they define a given set of methods.

> In short, worrying about subclassing is way down the list of things we
> ought to consider. 

If so, then why did your comment focus only on the subclassing issue?
The subclassing was a mere introduction [perhaps unfortunate, I confess]
to the method overloading issue.  Now, if you could (re)read my first
post and comment on my actual suggestion I would appreciate it.

Of course, I have no problem if the Numeric/numarray maintainers
decide to turn it down. I'll most probably just use UserArray.py to create a
"method-ized" version of Numeric, so that my algorithms can work with
both Numeric arrays and sparse matrices. (I do have a real need
for this.)

BTW, there is an alternative to creating a fully method-ized Numeric array:
just add an attribute which points to the module to which the class belongs,
e.g., "myarray.module.take" would point to "Numeric.take" if it was a
Numeric array, or "Sparse.take" if it was a sparse matrix.
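
The "module" attribute idea could look roughly like the sketch below. All
names here (DenseArray, SparseArray, the two ops classes standing in for
modules) are hypothetical; real code would point the attribute at an actual
module such as Numeric.

```python
class _DenseOps:
    """Stands in for a module of functions operating on dense arrays."""

    @staticmethod
    def take(a, indices):
        return DenseArray([a.values[i] for i in indices])


class _SparseOps:
    """Stands in for a module of functions operating on sparse arrays."""

    @staticmethod
    def take(a, indices):
        # Keep only the requested positions that hold a stored value.
        picked = {new: a.nonzero[old]
                  for new, old in enumerate(indices) if old in a.nonzero}
        return SparseArray(picked, len(indices))


class DenseArray:
    module = _DenseOps

    def __init__(self, values):
        self.values = list(values)


class SparseArray:
    module = _SparseOps

    def __init__(self, nonzero, length):
        self.nonzero = dict(nonzero)   # position -> stored value
        self.length = length


def first_and_last(arr, n):
    """Generic algorithm: dispatches through arr.module, no subclassing."""
    return arr.module.take(arr, [0, n - 1])


dense_result = first_and_last(DenseArray([1, 2, 3]), 3)
sparse_result = first_and_last(SparseArray({2: 5}, 3), 3)
```

The algorithm never names Numeric or Sparse directly; it only relies on each
array carrying a reference to the functions that know how to handle it.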

Regards,

José Fonseca


From bsder at allcaps.org  Fri Jan 24 16:19:03 2003
From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.)
Date: Fri Jan 24 16:19:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124231900.GB32437@localhost.localdomain>
Message-ID: <Pine.LNX.4.44.0301241615390.23684-100000@mail.allcaps.org>

On Fri, 24 Jan 2003, 'José Fonseca' wrote:

> Of course that I have no problems if the Numeric/numarray maintainers
> decide to turn it down. I'll most probably just use UserArray.py to create a
> "method-ized" version of Numeric, so that my algorithms can work with
> both Numeric array and sparse matrices. (I do have a real case need of
> for this.)

Sparse matrices are common enough that they really should be a base part 
of Numeric rather than requiring subclassing/extending/etc.  I know that 
Travis O. was working on some sparse matrix stuff a while back so you 
might want to contact him to get the current status of that work.

-a


From falted at openlc.org  Sat Jan 25 04:43:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 25 04:43:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E319ED4.5060709@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <200301251342.15164.falted@openlc.org>

On Friday 24 January 2003 21:15, Todd Miller wrote:
> >
> >My current thinking is something like:
> >
> >recarrDescr = {
> >    "name"        : defineType(CharType, 16, ""),  # 16-character String
> >    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
> >    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
> >    "grid_i"      : defineType(Int32, 1, 9),    # integer
> >    "grid_j"      : defineType(Int32, 1, 9),    # integer
> >    "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
> >    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
> >    "idnumber"    : defineType(Int64, 1, 0),    # signed long long
> >    }
> >
> >where defineType is a class that accepts (type, shape, default) parameters.
> >It can be extended safely in the future if more needs appear.
>
> You're way ahead of me here.  The only thing I don't like about this is
> the additional relative complexity because of the addition of field
> names and default values.   It would be nice to layer this more.
>

Well, I think a map between field names and values is valuable from the
user's point of view. It may help him to label the different pieces of
information in the recarray. Moreover, if __getattr__ and __setattr__ methods
(or __getitem__ and __setitem__) were implemented on recarray (as they are
in my recarray2 version, for example), the field name would become a very
convenient way to access a specific field by name (this introduces the
limitation that a field name must be a valid Python identifier, but I think
this is not a big restriction). By looking at the description dictionary,
the user can get a quick idea of what he can find in every field (with no
need for counting, which can be a big advantage especially for long records).

With regard to default values, you can make this parameter (and even the
shape) a keyword parameter in order to make it optional. In that way, the
definition can be as simple as "defineType(CharType)" (or even just
"CharType", if you add a bit of code) or as complete as
"defineType(CharType, shape, default, whatever_you_want)". I think this is
quite a flexible approach.
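
A minimal sketch of such a defineType class with optional keyword parameters
follows. It is an assumption-laden illustration: the attribute names, the
defaults (shape=1, default=None) and the normalize() helper are invented here,
and type objects are represented by plain strings so that the sketch runs
without numarray.

```python
class defineType:
    """Hypothetical field descriptor: only the type is mandatory."""

    def __init__(self, type, shape=1, default=None, **extra):
        self.type = type
        self.shape = shape
        self.default = default
        self.extra = extra      # room for options added in the future


def normalize(descr):
    """Accept a bare type in place of a defineType instance."""
    return {name: value if isinstance(value, defineType) else defineType(value)
            for name, value in descr.items()}


# Type objects are plain strings here to keep the sketch self-contained.
recarrDescr = normalize({
    "name": defineType("CharType", shape=16, default=""),
    "TDCcount": "UInt8",                          # bare type, wrapped for us
    "pressure": defineType("Float32", default=1.0),
})
```

With this shape, new options can be added later through keyword arguments
without breaking existing descriptions.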

>One more thing I don't understand looking at this:  a dictionary is 
>unordered.

Yeah, but this can be regarded as an advantage rather than a drawback, in the
sense that you can choose the order you (the developer) prefer. For example,
I was at first using alphanumerical order to arrange the data fields, but
now I'm considering that an arrangement that optimizes the alignment of the
fields could be far better. For example, say that you have an (Int8, Int32,
Float64) record; in principle it would be easy to create a routine that
arranges this record in the form (Float64, Int32, Int8), which optimizes
access to the different fields (it may even be possible to introduce automatic
padding later on, if recarrays support it in the future).
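
The reordering routine could be sketched as below. This is only an
illustration: it uses the standard struct module to look up item sizes, the
mapping from type names to struct codes is an assumption, and the field names
are made up.

```python
import struct

# Assumed struct codes standing in for Int8, Int32 and Float64.
ITEM_CODE = {"Int8": "b", "Int32": "i", "Float64": "d"}


def order_for_alignment(fields):
    """Sort (name, type) pairs from largest to smallest item size."""
    return sorted(fields,
                  key=lambda field: struct.calcsize(ITEM_CODE[field[1]]),
                  reverse=True)


def native_size(fields):
    """Size of one record when native alignment padding is applied."""
    return struct.calcsize("@" + "".join(ITEM_CODE[t] for _, t in fields))


declared = [("flag", "Int8"), ("count", "Int32"), ("value", "Float64")]
reordered = order_for_alignment(declared)
```

Placing the largest items first means each field already starts on a suitable
boundary, so the natively aligned record needs no internal padding.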

Maybe you are getting confused in thinking that recarrDescr will create the
recarray. Not at all: this is a *metadata* definition that can be passed to
the actual recarray function for recarray creation. Its role would be
similar to the formats parameter (with typical values like "3a,4i,3w") in
recarray.array, but with more verbosity and all the reported advantages.

> >instead of
> >
> >((Int16, 3),
> > (Int32, 4),
> > (Float64, 20),
> > )
>
> This is pretty much exactly what I was thinking.  It is straightforward
> to imagine and difficult to forget.
>
> >the former being more handy in lots of situations.
>
> Would you please name some of these so we can explore handling them both
> ways?
>

Well, I'm afraid that the biggest advantage would come when dealing with
recarrays in C extension modules. In this kind of situation it would be far
better to deal with a "3a4i3w" array than a tuple of Python objects. But
maybe I'm wrong and the latter is not so complicated to manage; however, I
used to work a lot with records (even before meeting recarray) and I was
quite comfortable with formats in string mode.

Or perhaps it would be enough to provide a method for converting from the
standard metadata layout (dictionary or tuple or whatever) to a string
format. This should not be very difficult.
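
Such a conversion could be as small as the following sketch. The mapping from
type names to character codes is an assumption: it uses the struct module's
codes ("h", "i", "d") rather than recarray's own, and type objects are again
represented by strings.

```python
import struct

# Assumed mapping from type names to struct-module character codes.
CODES = {"Int16": "h", "Int32": "i", "Float64": "d"}


def to_format(layout):
    """Convert ((type, repeat), ...) into a compact format string."""
    return "".join("%d%s" % (repeat, CODES[type_name])
                   for type_name, repeat in layout)


layout = (("Int16", 3), ("Int32", 4), ("Float64", 20))
fmt = to_format(layout)
# "=" asks struct for standard sizes with no alignment padding.
record_size = struct.calcsize("=" + fmt)
```

The string form can then be handed to C code, while Python code keeps working
with the more readable tuple layout.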

> >
> >Well, if charcodes finally stay in, this have an additional advantage in
> >that python crew has provided meaningful ways to express padding
> > (character "x"), endianess ("=", "<", ">") and alignment ("@").
>
> We might also add these to the type-repetition tuple.

It would be nice, of course.
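
For reference, the characters quoted above are the ones used by Python's
standard struct module. A short sketch of what they do there (this shows the
struct module only, not recarray, which may or may not adopt the same
conventions):

```python
import struct

# "<" selects little-endian byte order, standard sizes, no implicit alignment.
packed_size = struct.calcsize("<bid")      # 1 + 4 + 8 bytes
# "x" is one explicit pad byte; three of them move the int to offset 4.
padded_size = struct.calcsize("<bxxxid")   # 1 + 3 + 4 + 8 bytes
# "@" asks for native sizes and alignment, so the result depends on the platform.
native_size = struct.calcsize("@bid")

# The same 16-bit value packed with both byte orders.
little = struct.pack("<h", 1)
big = struct.pack(">h", 1)
```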

-- 
Francesc Alted


From jmiller at stsci.edu  Sat Jan 25 11:16:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 25 11:16:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu>
 <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
 <200301251342.15164.falted@openlc.org>
Message-ID: <3E32E5E3.2020704@stsci.edu>

Francesc Alted wrote:

>On Friday 24 January 2003 21:15, Todd Miller wrote:
>  
>
>>>My current thinking is something like:
>>>
>>>recarrDescr = {
>>>   "name"        : defineType(CharType, 16, ""),  # 16-character String
>>>   "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
>>>   "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
>>>   "grid_i"      : defineType(Int32, 1, 9),    # integer
>>>   "grid_j"      : defineType(Int32, 1, 9),    # integer
>>>   "pressure"    : defineType(Float32, 1, 1.),  # float (single-precision)
>>>   "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>>   "idnumber"    : defineType(Int64, 1, 0),    # signed long long
>>>   }
>>>      
>>>
Still think I'd prefer something separable:

recarrStruct = ((CharType, 16),
                UInt8,
                Int16,
                Int32,
                Int32,
                Float32,
                (Float64, 32),
                Int64)

recarrFields = ["name",
                "TDCcount",
                "ADCcount",
                "grid_i",
                "grid_j",
                "pressure",
                "temperature",
                "idnumber"]

I guess it might not be quite as good for large structs.

>>>where defineType is a class that accepts (type, shape, default)
>>>parameters. It can be extended safely in the future if more needs appear.
>>>      
>>>
>>You're way ahead of me here.  The only thing I don't like about this is
>>the additional relative complexity because of the addition of field
>>names and default values.   It would be nice to layer this more.
>>
>>    
>>
>
>Well, I think a map between field names and values is valuable from the
>user's point of view. It may help him to label the different information on
>the recarray. Moreover, if __getattr__ and __setattr__ methods (or
>__getitem__ and __setitem__) would get implemented on recarray (as they are
>in my recarray2 version, for example), the field name can become a very
>convenient manner to access a specific field by name (this introduce the
>limitation that field name must be a valid python identifier, but I think
>this is not a big restriction). By looking at the description dictionary,
>the user can have a quick idea of what he can find in every field (with no
>need of counting, which can be a big advantage specially for long records).
>
That's true and sounds nice.  I'm just thinking records with named fields
should be derived from records with positional fields.  If the
functionality is layered, you can use as much complexity as you need.

It's a good sign that both you and I thought of an identical tuple format;
it's the obvious minimal one.
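
The layering idea can be sketched roughly like this. NamedLayout is a
hypothetical class made up for illustration, and the type names are strings
standing in for numarray type objects; nothing here is existing recarray API.

```python
# Positional description: one entry per field, in physical order.
recarrStruct = (("CharType", 16), "UInt8", "Int16", "Int32",
                "Int32", "Float32", ("Float64", 32), "Int64")

recarrFields = ["name", "TDCcount", "ADCcount", "grid_i",
                "grid_j", "pressure", "temperature", "idnumber"]

class NamedLayout:
    """Hypothetical named layer built on top of a positional layout."""

    def __init__(self, struct, names):
        if len(struct) != len(names):
            raise ValueError("need exactly one name per field")
        self.struct = tuple(struct)
        # Map each field name to its position in the struct.
        self.index = {name: i for i, name in enumerate(names)}

    def __getitem__(self, key):
        # Accept either a position or a field name.
        if isinstance(key, str):
            key = self.index[key]
        return self.struct[key]

layout = NamedLayout(recarrStruct, recarrFields)
```

Code that only needs positions can keep using recarrStruct directly; the
names are an optional layer on top, and field order stays explicit.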

>
>With regard to default values, you can make this parameter (even the shape)
>a keyword parameter in order to make it optional. 
>
OK.  That's a good point.

>  
>
>>One more thing I don't understand looking at this:  a dictionary is 
>>unordered.
>>    
>>
>
>Yeah, but this can be regarded as an advantage rather than a drawback in the
>sense that you can choose the order you (the developer) prefer. For example,
>I was using first a alphanumerical order to arrange the data fields, but
>now, I'm considering that a arrangement that optimizes the alignment of the
>fields could be far better. As for one, say that you have a (Int8, Int32,
>Float64) record; in principle it could be easy to create a routine that
>arranges this record in the form (Float64,Int32, Int8) that optimizes the
>different field access (it may be even possible to introduce automatic
>padding later on if recarrays would support them in the future).
>
>Maybe you are getting confused 
>
Yes and no. :)

>in thinking that recarrDescr will create the
>recarray. Not at all, this a *metadata* definition that can be passed to the
>actual recarray funtion for recarray creation. 
>
Just like the type repetition tuple except also including field names
and default values.   I don't think you lost me.  For what we do,  the
exact physical layout of the "struct" is important, so order matters.  I
see order as part of the meta-data,  but I don't usually deal with
meta-entities so maybe I've got that part wrong.  :)

>Its function would be
>similar to the formats parameter (with typical values like "3a,4i,3w") in
>recarray.array, but with more verbosity and all the reported advantages.
>
>  
>
>>>instead of
>>>
>>>((Int16, 3),
>>>(Int32, 4),
>>>(Float64, 20),
>>>)
>>>      
>>>
>>This is pretty much exactly what I was thinking.  It is straightforward
>>to imagine and difficult to forget.
>>
>>    
>>
>>>the former being more handy in lots of situations.
>>>      
>>>
>>Would you please name some of these so we can explore handling them both
>>ways?
>>
>>    
>>
>
>Well, I'm afraid that the best advantage would be when dealing with
>recarrays in C extension modules. In this kind of situation it would be far
>better to deal with a "3a4i3w" array than a tuple of python objects. But
>maybe I'm wrong and the latter is not so-complicated to manage; however, I
>used to work a lot with records (even before meeting recarray) and I was
>quite comfortable with formats in string mode.
>
I was thinking that if the above was an issue,  we could write an API 
function(s) to "compile" the type-repetition tuple into arrays of ints 
which describe the type of each field and corresponding repetition factor.

>
>Or perhaps it would be enough to provide a method for converting from the
>standard metadata layout (dictionary or tuple or whatever), to a string
>format. This should be not very difficult.
>  
>
Almost exactly what I suggested above.

See you Monday,
Todd


From baecker at physik.tu-dresden.de  Sun Jan 26 02:41:02 2003
From: baecker at physik.tu-dresden.de (baecker at physik.tu-dresden.de)
Date: Sun Jan 26 02:41:02 2003
Subject: [Numpy-discussion] complex diagonal matrix
Message-ID: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>

Hi,

I just wondered if there is a "nicer" way of generating
a complex diagonal matrix than
  a)
     v=arange(10,typecode=Complex)
     mat=diag(v)
  b)
     v=arange(10)
     mat=diag(v)+0j

Namely, wouldn't something like
  v=arange(10)
  mat=diag(v,typecode=Complex)
be nicer?

BTW: I found that in the (excellent) documentation
of Numeric the definitions from Mlab.py are a bit hidden.
In my case I know nothing about matlab, and I expected
this type of routine to be found in the section
(together with zeros, ones etc.).
Also, diag is not listed in the index
 http://www.pfdubois.com/numpy/html2/numpy-22.html#A
or is it?

Arnd


From hinsen at cnrs-orleans.fr  Sun Jan 26 03:11:02 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Sun Jan 26 03:11:02 2003
Subject: [Numpy-discussion] complex diagonal matrix
In-Reply-To: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
References: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
Message-ID: <m3y958tbcv.fsf@localhost.localdomain>

baecker at physik.tu-dresden.de writes:

> I just wondered if there is a "nicer" way of generating
> a complex diagonal matrix than
>   a)
>      v=arange(10,typecode=Complex)
>      mat=diag(v)
>   b)
>      v=arange(10)
>      mat=diag(v)+0j
> 
> Namely, wouldn't something like
>   v=arange(10)
>   mat=diag(v,typecode=Complex)
> be nicer?

Why would that be nicer?

Personally, I prefer to have explicit typecodes limited to a very
small number of array generators, and have all other functions apply
the standard type-preservation rules.
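
A rough illustration of what type preservation means here, using plain
Python lists instead of Numeric arrays (the diag below is a stand-in written
for this sketch, not Numeric's implementation):

```python
def diag(values):
    """Return a square list-of-lists with `values` on the main diagonal."""
    n = len(values)
    # Build the off-diagonal zero from the element type, so the type is preserved.
    zero = type(values[0])() if n else 0
    return [[values[i] if i == j else zero for j in range(n)]
            for i in range(n)]

# The type is chosen once, when the values are generated ...
v = [complex(k) for k in range(4)]
# ... and diag() simply carries it through.
mat = diag(v)
```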

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From list at jsaul.de  Sun Jan 26 04:03:05 2003
From: list at jsaul.de (Joachim Saul)
Date: Sun Jan 26 04:03:05 2003
Subject: [Numpy-discussion] complex diagonal matrix
In-Reply-To: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
References: <Pine.LNX.4.51.0301261134210.26705@ptpcp2.phy.tu-dresden.de>
Message-ID: <20030126120117.GB869@jsaul.de>

* baecker at physik.tu-dresden.de [26.01.2003 11:40]:
> I just wondered if there is a "nicer" way of generating
> a complex diagonal matrix than
>   a)
>      v=arange(10,typecode=Complex)
>      mat=diag(v)
>   b)
>      v=arange(10)
>      mat=diag(v)+0j
>
> Namely, wouldn't something like
>   v=arange(10)
>   mat=diag(v,typecode=Complex)
> be nicer?

No, because diag() is supposed to create a diagonal, but *not* to
cast to another type. If you wanted to add that "functionality" to
functions like diag(), you would also have to add it to functions
like reshape() etc., i.e. practically everywhere.

The way it is handled now is reasonably simple and flexible, and
there is really no advantage of your suggestion compared to
approach a).

Cheers,
Joachim


From falted at openlc.org  Mon Jan 27 04:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 04:02:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E32E5E3.2020704@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu>
Message-ID: <200301271301.01659.falted@openlc.org>

On Saturday 25 January 2003 20:30, Todd Miller wrote:
>
> Still think I'd prefer something seperable:
>
> recarrStruct = (   (CharType, 16),
>                             UInt8,
>                             Int16,
>                             Int32,
>                             Int32,
>                             Float32,
>                             (Float64, 32),
>                             Int64 )
>
> recarrFields = ["name",
>   "TDCcount",
>   "ADCcount",
>    "grid_i",
>    "grid_j",
>    "pressure",
>    "temperature",
>    "idnumber"]
>
> I guess it might not be quite as good for large structs.

Me too...

>
> It's a good sign that both you and I thought of an identical tuple
> format; it's the obvious
> minimal one.

Yeah. We just differ in the way to arrange this metadata to be passed to the
recarray constructor. But I think this is secondary compared to the
flexibility that a verbose approach offers over the current string
format. In fact, more than one container might be supported to define the
metadata; one can start with tuples as you suggest, and other ways can be
added in the future (if considered convenient).

For example, I think I'll stick with the dictionary option for PyTables, but
a class declaration for the metadata would also be supported, as in:

class Small(IsRecord):
    var1 = defineType(CharType, 2, "")
    var2 = defineType(Int32, 1)
    var3 = Float64

This would not be difficult to support because, by accessing
Small().__dict__, you also get a dictionary. In addition, the class form
ensures (by construction) that you are not using an invalid python
identifier, which is mandatory in my current implementation. I find these
containers (dictionaries and classes) both elegant and convenient.

>
> Just like the type repetition tuple except also including field names
> and default values.   I don't think you lost me.  For what we do,  the
> exact physical layout of the "struct" is important, so order matters.  I
> see order as part of the
> meta-data,  but I don't usually deal with meta-entities so maybe I've
> got that part wrong.  :)
>

Well, if you need positional fields, you may add an (optional) parameter,
called for example "position", so that you can fix it.

>
> I was thinking that if the above was an issue,  we could write an API
> function(s) to "compile" the type-repetition tuple into arrays of ints
> which describe the type of each field and corresponding repetition factor.

Yeah, I agree that this would be the best solution. That way, the charcodes
will be factored out from the code, and just providing such an API (both
in Python and C) would be enough to reconstruct them, if needed. That will
allow a more consistent numarray internal code.

>
> See you Monday,

Right, how did you know that? :)

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 06:44:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 06:44:03 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu> <200301271301.01659.falted@openlc.org>
Message-ID: <3E354551.5090704@stsci.edu>

Francesc Alted wrote:

>Yeah. We just differ in the way to arrange this metadata to be passed to the
>recarray constructor. But I think this is secondary compared to the
>flexibility that a verbose approach offers compared with the actual string
>format. 
>
Yes.  So one question is:  if we were to add type-repetition tuples to 
recarray as an alternative to the current character code strings,  would 
that be any form of improvement to recarray from your perspective?

As I see it,  recarray currently has a clean separation between format 
and naming which permits the latter to be optional.  Before changing 
that,  I'd need a clear argument why.  (I didn't design and generally 
don't even maintain recarray).

>In fact, more than one container might be supported to define the
>metadata; one can start with tuples as you suggest, but in the future other
>ways can be added (if considered convenient).
>  
>
>For example, I think I'll stick with the dictionary option for PyTables, but
>also a class declaration for the metadata would be supported, like in :
>
>class Small(IsRecord):
>    var1 = defineType(CharType, 2, "")
>    var2 = defineType(Int32, 1)
>    var3 = Float64
>
>This would not be difficult to support because, by accessing to the
>Small().__dict__, you get also a dictionary. In addition, the latter will
>ensure (by construction) that you are not using a non-valid python
>identifier, which is mandatory in my current implementation. I find these
>containers (dictionaries and classes) both elegant and convenient.
>  
>
I'm not trying to be Mr. Negative here,  but one thing to keep in mind 
is this:

 >>> class C:
...     pass
...
 >>> c = C()
 >>> dir(c.__dict__)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', 
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', 
'__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', 
'__lt__', '__ne__', '__new__', '__reduce__', '__repr__', '__setattr__', 
'__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 
'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 
'popitem', 'setdefault', 'update', 'values']

Which is to say,  the instance dictionary is a little cluttered,  and it 
might not be that easy to determine which objects in it are there to 
define the data format.
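
Note that dir() on a dictionary lists the dictionary's methods; the entries
themselves are its keys. The clutter is easier to see in the class
dictionary, where the declared attributes sit next to implicit entries such
as __module__. A minimal sketch of filtering them, assuming that field
declarations are plain class attributes and that double-underscore names are
never fields (declared_fields is a hypothetical helper, and the attribute
values are placeholders):

```python
class Small:
    var1 = ("CharType", 2)
    var2 = ("Int32", 1)
    var3 = "Float64"

def declared_fields(cls):
    """Return the class attributes that are not implicit double-underscore entries."""
    return {name: value for name, value in vars(cls).items()
            if not (name.startswith("__") and name.endswith("__"))}

fields = declared_fields(Small)
# The instance dictionary is empty: class attributes live on the class.
instance_entries = vars(Small())
```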

>>Just like the type repetition tuple except also including field names
>>and default values.   I don't think you lost me.  For what we do,  the
>>exact physical layout of the "struct" is important, so order matters.  I
>>see order as part of the
>>meta-data,  but I don't usually deal with meta-entities so maybe I've
>>got that part wrong.  :)
>>
>
>Well, if you need positional fields, you may add a (optional) parameter,
>called for example, "position" so that you can fix it. 
>  
>
I'm sure that's not the easiest way to capture struct layout,  but I 
take your point.   Since position matters to me,  I'd prefer that 
capturing them was implicit.   Since it doesn't to you, it seems OK for 
it to be explicit.   Either default mode can support the other,  but 
capturing order with tuples is free,  while capturing order with a 
__dict__ will take some kind of extra work.

>>I was thinking that if the above was an issue,  we could write an API
>>function(s) to "compile" the type-repetition tuple into arrays of ints
>>which describe the type of each field and corresponding repetition factor.
>>    
>>
>
>Yeah, I agree that this would be the best solution. That way, the charcodes
>will be factored out from the code, and by just providing such and API (both
>in Python and C), would be enough to reconstruct them, if needed. That will
>allow a more consistent numarray internal code. 
>  
>
I'm thinking the general format for this may be converting N-tuples of 
types and ints into N arrays of types and ints.  And vice versa.
It's obvious how this works with numarray types.  I think the chararray 
types need work and need to be mapped into the same integer enumeration 
as the numeric types in a non-overlapping way.
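
A sketch of that round trip, under stated assumptions: the integer
enumeration below is invented for illustration (it is not numarray's actual
numbering), plain lists stand in for arrays of ints, and compile_struct and
decompile_struct are hypothetical names.

```python
# Invented enumeration; character and numeric types share one
# non-overlapping integer range.
TYPE_IDS = {"CharType": 0, "Int16": 1, "Int32": 2, "Float64": 3}
TYPE_NAMES = {number: name for name, number in TYPE_IDS.items()}

def compile_struct(struct):
    """Split a type-repetition tuple into parallel lists of ints."""
    type_ids, repeats = [], []
    for entry in struct:
        # An entry is either (type, repetition) or a bare type (repetition 1).
        name, count = entry if isinstance(entry, tuple) else (entry, 1)
        type_ids.append(TYPE_IDS[name])
        repeats.append(count)
    return type_ids, repeats

def decompile_struct(type_ids, repeats):
    """Rebuild the type-repetition tuple from the two integer lists."""
    return tuple((TYPE_NAMES[t], r) for t, r in zip(type_ids, repeats))

struct = (("Int16", 3), ("Int32", 4), ("Float64", 20))
type_ids, repeats = compile_struct(struct)
```

Two flat integer arrays are easy to hand to C code, which is the case raised
earlier in the thread.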

>See you Monday,
>  
>
>
>Right, how did you know that? :)
>  
>
Insightful on weekends anyway, 
Todd


From jmiller at stsci.edu  Mon Jan 27 08:30:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 08:30:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org>
Message-ID: <3E355E35.9070805@stsci.edu>

Francesc Alted wrote:

>On Monday 27 January 2003 15:42, Todd Miller wrote:
>  
>
>>Yes.  So one question is:  if we were to add type-repetition tuples to
>>recarray as an alternative to the current character code strings,  would
>>that be any form of improvement to recarray from your perspective?
>>    
>>
>
>Well, at least, charcodes can be avoided. I think it's a big win... or maybe
>not as big?
>  
>
I think that avoiding the charcodes would be an improvement. 
Type-repetition tuples provide a clear, well-defined way to define data 
formats.   It's not so clear that it eliminates the requirement for 
on-going Numeric compatibility,  but it might.

>  
>
>>As I see it,  recarray currently has a clean seperation between format
>>and naming which permits the latter to be optional.  Before changing
>>that,  I'd need a clear argument why.  (I didn't design and generally
>>don't even maintain recarray).
>>    
>>
>
>One argument is the fact that a map is very clear to the user, although that
>such a map can be built *after* the names and format are passed to the
>recarray constructor and be accessible as an atribute. However, the latter
>solution is worse IMO, because the user has to supply two separate pieces of
>information when, actually, these should be regarded as a unity. Anyway,
>this maybe a subjective perception.
>  
>
Well,  I think there's truth to the danger of separating names from data 
declarations,  but it is easy to map keys(), values() to the separate 
pieces in a different layer if necessary.  

>This would not be difficult to support because, by accessing to the
>Small().__dict__, you get also a dictionary. In addition, the latter will
>ensure (by construction) that you are not using a non-valid python
>identifier, which is mandatory in my current implementation. I find these
>containers (dictionaries and classes) both elegant and convenient.
>  
>
>>I'm not trying to be Mr. Negative here,  but one thing to keep in mind
>>    
>>
>
>Oh dear, you are right!. 
>
For a few seconds there,  I thought I was on a roll!  

>In fact, I forgot that to make this to work, you
>need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's
>post: http://mail.python.org/pipermail/python-list/2002-July/112007.html).
>I was following this recipe, but I forgot that I was using Python 2.2.
>
>So, as numarray has to work with previous python versions, there is no point
>to care about that.
>  
>
In truth,   numarray-0.4 and up already require Python-2.2 and up.

>I'm sure that's not the easiest way to capture struct layout,  but I
>take your point.   Since position matters to me,  I'd prefer that
>capturing them was implicit.   Since it doesn't to you, it seems OK for
>it to be explicit.   Either default mode can support the other,  but
>capturing order with tuples is free,  while capturing order with a
>__dict__ will take some kind of extra work.
>  
>
>
>That's right. We have some different needs and priorities, and we should
>take the approach better suited to each other. But exchanging points of view
>is always a great thing.
>
>  
>
>>I'm thinking the general format for this may be converting N-tuples of
>>types and ints into N arrays of types and ints.  And vice versa.
>>It's obvious how this works with numarray types.  I think the chararray
>>types need work and need to be mapped into the same integer enumeration
>>as the numeric types in a non-overlapping way.
>>
>>    
>>
>
>I can't catch your point here. Why there should be a problem with
>chararrays?.
>
What I was trying to say is that chararray types are not as well 
designed as the numarray types,  nor are they reflected in the C-API.

>  
>


From falted at openlc.org  Mon Jan 27 08:39:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 08:39:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E354551.5090704@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu>
Message-ID: <200301271717.19055.falted@openlc.org>

On Monday 27 January 2003 15:42, Todd Miller wrote:
> Yes.  So one question is:  if we were to add type-repetition tuples to
> recarray as an alternative to the current character code strings,  would
> that be any form of improvement to recarray from your perspective?

Well, at least, charcodes can be avoided. I think it's a big win... or maybe
not as big?

>
> As I see it,  recarray currently has a clean seperation between format
> and naming which permits the latter to be optional.  Before changing
> that,  I'd need a clear argument why.  (I didn't design and generally
> don't even maintain recarray).

One argument is the fact that a map is very clear to the user, although
such a map can be built *after* the names and format are passed to the
recarray constructor and be accessible as an attribute. However, the latter
solution is worse IMO, because the user has to supply two separate pieces of
information when, actually, these should be regarded as a unit. Anyway,
this may be a subjective perception.

> >This would not be difficult to support because, by accessing to the
> >Small().__dict__, you get also a dictionary. In addition, the latter will
> >ensure (by construction) that you are not using a non-valid python
> >identifier, which is mandatory in my current implementation. I find these
> >containers (dictionaries and classes) both elegant and convenient.
>
> I'm not trying to be Mr. Negative here,  but one thing to keep in mind

Oh dear, you are right! In fact, I forgot that to make this work, you
need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's
post: http://mail.python.org/pipermail/python-list/2002-July/112007.html).
I was following this recipe, but I forgot that I was using Python 2.2.

So, as numarray has to work with previous python versions, there is no point
in caring about that.

>
> I'm sure that's not the easiest way to capture struct layout,  but I
> take your point.   Since position matters to me,  I'd prefer that
> capturing them was implicit.   Since it doesn't to you, it seems OK for
> it to be explicit.   Either default mode can support the other,  but
> capturing order with tuples is free,  while capturing order with a
> __dict__ will take some kind of extra work.

That's right. We have somewhat different needs and priorities, and each of
us should take the approach best suited to them. But exchanging points of
view is always a great thing.

>
> I'm thinking the general format for this may be converting N-tuples of
> types and ints into N arrays of types and ints.  And vice versa.
> It's obvious how this works with numarray types.  I think the chararray
> types need work and need to be mapped into the same integer enumeration
> as the numeric types in a non-overlapping way.
>

I can't catch your point here. Why should there be a problem with
chararrays?

-- 
Francesc Alted


From Chris.Barker at noaa.gov  Mon Jan 27 10:20:06 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Mon Jan 27 10:20:06 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org>
Message-ID: <3E35768B.DD6454BE@noaa.gov>

Francesc Alted wrote:

> So, as numarray has to work with previous python versions, 

Why? Anyone using NumArray is either starting from scratch or porting
from Numeric, so having to port to a newer version of Python is a very
small deal. 


-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From jmiller at stsci.edu  Mon Jan 27 10:34:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 10:34:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> <3E35768B.DD6454BE@noaa.gov>
Message-ID: <3E357B5F.9030908@stsci.edu>

Chris Barker wrote:

>Francesc Alted wrote:
>
>  
>
>>So, as numarray has to work with previous python versions, 
>>    
>>
>
>Why? Anyone using NumArray is either starting from scratch or porting
>from Numeric, so having to port to a newer version of Python is a very
>small deal. 
>  
>
Just to make it very clear:  numarray-0.4 and up require Python-2.2 or 
higher.  

Up until numarray-0.4 (released in November),  that was not the case, 
and numarray ran (and was tested!) on Python-2.0 and higher.

The desire to increase C-level Numeric compatibility and to improve 
simple indexing speed led us to a C baseclass, which is only supported 
in Python-2.2 and  up.

Todd


From falted at openlc.org  Mon Jan 27 11:23:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 11:23:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E355E35.9070805@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu>
Message-ID: <200301272021.47587.falted@openlc.org>

On Monday 27 January 2003 17:28, Todd Miller wrote:
> >So, as numarray has to work with previous python versions, there is no
> > point to care about that.
>
> In truth,   numarray-0.4 and up already require Python-2.2 and up.

Oh! I didn't know that. In that case, I think it's worth considering the
possibility of defining records as classes derived from metaclasses. But,
of course, the ultimate decision is yours.

> >>I'm thinking the general format for this may be converting N-tuples of
> >>types and ints into N arrays of types and ints.  And vice versa.
> >>It's obvious how this works with numarray types.  I think the chararray
> >>types need work and need to be mapped into the same integer enumeration
> >>as the numeric types in a non-overlapping way.
> >
> >I can't catch your point here. Why there should be a problem with
> >chararrays?.
>
> What I was trying to see is that chararray types are not as well
> designed as the numarray types,  nor are they reflected in the C-API.

I see. Well, is such a unification really desirable? CharArray entities
come from one module and NumArray from another, and that should be ok. Why
bother creating a unified API or integer enumeration? I think this
should not be a big drawback for C-extension crafters (although, to tell the
truth, it would be very elegant if you managed to do that, but maybe it is
not worth the effort, I don't know).

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 11:39:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 11:39:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <000001c2c635$624e9a40$6601a8c0@NICKLEBY>
Message-ID: <3E358A72.6050400@stsci.edu>

Paul F Dubois wrote:

>IMHO you can assume any Python you want. Look to the long term here, not the
>short.
>
You lost me.  numarray-0.4 needs at least Python-2.2 or baseclasses 
don't exist.  I had a slow Python equivalent for the baseclass as I 
refactored prior to numarray-0.4,  but it's gone now.

>
>I'm a bit uncertain on MA as to whether my old design is right. Maybe I
>should be inheriting from NDarray? So that MA is more of a sibling of
>numarray rather than a wrapper of it?
>  
>
I asked Perry about this one.  His points (salted a little by me) were:

1. If you inherit from NumArray,  you also inherit from NDArray.  If you 
only inherit from NDArray,  all you get are the structural operations.

2. If you inherit from NumArray,  you can use Liskov substitution to 
pass MA's directly into extensions expecting NumArrays.  This 
substitution may or may not be good.  Also,  isinstance(anMA, NumArray) 
will return True.  

3. If you inherit from NumArray,  you get numerical method definitions 
which may or may not be applicable to MA.  With a little thrashing,  we 
might also get MAs to work for ufuncs.   In fact, ufuncs are the key to 
whether or not the NumArray numerical methods add any value.
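
The trade-off in points 1 to 3 can be sketched with stand-in classes. These
are toy classes written for illustration, not the real NDArray, NumArray or
MA, and the masking rule in sum() is an assumption of the sketch.

```python
class NDArray:
    """Stand-in for the structural base class."""
    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

class NumArray(NDArray):
    """Stand-in adding numerical methods to the structural base."""
    def sum(self):
        return sum(self.values)

class MaskedArray(NumArray):
    """Hypothetical MA that inherits from NumArray and overrides sum()."""
    def __init__(self, values, mask):
        super().__init__(values)
        self.mask = list(mask)

    def sum(self):
        # Skip the elements whose mask entry is true.
        return sum(v for v, m in zip(self.values, self.mask) if not m)

def total(array):
    """Code written for NumArray; any subclass can be passed in unchanged."""
    return array.sum()

ma = MaskedArray([1, 2, 3, 4], mask=[False, True, False, False])
```

Inheriting from NumArray makes the substitution in total() work, but every
inherited numerical method that is not overridden would silently ignore the
mask, which is the "may or may not be applicable" concern in point 3.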

Todd

>  
>


From jmiller at stsci.edu  Mon Jan 27 11:54:06 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 11:54:06 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> <200301272021.47587.falted@openlc.org>
Message-ID: <3E358DE0.7040501@stsci.edu>

Francesc Alted wrote:

>On Monday 27 January 2003 17:28, Todd Miller wrote:
>  
>
>>>So, as numarray has to work with previous python versions, there is no
>>>point to care about that.
>>>      
>>>
>>In truth,   numarray-0.4 and up already require Python-2.2 and up.
>>    
>>
>
>Oh!, I didn't know that. In such a case, I think it's worth to consider the
>possibility to define records as classes descendants from metaclasses. But,
>of course, you have the ultimate decision.
>  
>
I don't know what you mean here.   Please spell it out a little more.

>  
>
>>>>I'm thinking the general format for this may be converting N-tuples of
>>>>types and ints into N arrays of types and ints.  And vice versa.
>>>>It's obvious how this works with numarray types.  I think the chararray
>>>>types need work and need to be mapped into the same integer enumeration
>>>>as the numeric types in a non-overlapping way.
>>>>        
>>>>
>>>I can't catch your point here. Why there should be a problem with
>>>chararrays?.
>>>      
>>>
>>What I was trying to see is that chararray types are not as well
>>designed as the numarray types,  nor are they reflected in the C-API.
>>    
>>
>
>I see. Well, is it really desirable such a unification? CharArray entities
>come from a module and NumArray from another one, and that should be ok. Why
>bother in creating a unified API or integer enumeration?. 
>
It may not be necessary.  Int8 with repetition factors may work about 
the same.


From falted at openlc.org  Mon Jan 27 12:16:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 12:16:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E358DE0.7040501@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu>
Message-ID: <200301272114.53545.falted@openlc.org>

On Monday 27 January 2003 20:52, Todd Miller wrote:
> >
> >Oh! I didn't know that. In that case, I think it's worth considering
> > the possibility of defining records as classes derived from
> > metaclasses. But, of course, the final decision is yours.
>
> I don't know what you mean here.   Please spell it out a little more.

What I meant is that using something like:

class Small(IsRecord):
    field1 = defineType(CharType, 2, default="", position=1)
    field2 = defineType(Int32, 1, position=2)
    field3 = Float64

as a container for recarray metadata is definitely possible instead of the
tuple (formats="2aid",names=("field1","field2", "field3")), if using
Python2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
you to effectively separate the declared attributes from the implicit ones
in normal classes.
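
To make the idea concrete, here is a minimal sketch of how such a
metaclass could collect the declared fields. It is written in current
Python 3 syntax (Python 2.2 would use __metaclass__ instead), and the
names Field, RecordMeta and _fields are made up for this illustration;
this is not the actual PyTables implementation:

```python
class Field:
    """Hypothetical field descriptor: a type name, an item count, a position."""
    def __init__(self, type_name, count=1, position=None):
        self.type_name = type_name
        self.count = count
        self.position = position


class RecordMeta(type):
    """Collect the Field attributes declared in a class body into _fields."""
    def __new__(mcls, name, bases, namespace):
        declared = {key: value for key, value in namespace.items()
                    if isinstance(value, Field)}
        # Fields with an explicit position come first, sorted by position;
        # the others follow in declaration order (sorted() is stable).
        order = sorted(declared,
                       key=lambda key: (declared[key].position is None,
                                        declared[key].position or 0))
        cls = super().__new__(mcls, name, bases, namespace)
        cls._fields = [(key, declared[key]) for key in order]
        return cls


class IsRecord(metaclass=RecordMeta):
    """Base class; each subclass describes one record layout."""


class Small(IsRecord):
    field1 = Field("char", 2, position=1)
    field2 = Field("int32", 1, position=2)
    field3 = Field("float64")


names = [name for name, _ in Small._fields]
print(names)
```

The declared attributes end up in Small._fields, separate from whatever
else the class body defines, which is the separation described above.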

Of course, you can tailor IsRecord to fit your needs.

I hope that I have expressed myself more clearly now,

-- 
Francesc Alted


From jmiller at stsci.edu  Mon Jan 27 12:54:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Mon Jan 27 12:54:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <JFEGLNDJEDNOMPPHDEJFOECNECAA.perry@stsci.edu> <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu> <200301272114.53545.falted@openlc.org>
Message-ID: <3E359C2B.4070509@stsci.edu>

Francesc Alted wrote:

>On Monday 27 January 2003 20:52, Todd Miller wrote:
>  
>
>>>Oh! I didn't know that. In that case, I think it's worth considering
>>>the possibility of defining records as classes derived from
>>>metaclasses. But, of course, the final decision is yours.
>>>      
>>>
>>I don't know what you mean here.   Please spell it out a little more.
>>    
>>
>
>What I meant is that using something like:
>
>class Small(IsRecord):
>    field1 = defineType(CharType, 2, default="", position=1)
>    field2 = defineType(Int32, 1, position=2)
>    field3 = Float64
>
>as a container for recarray metadata is definitely possible instead of the
>tuple (formats="2aid",names=("field1","field2", "field3")), if using
>Python2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
>you to effectively separate the declared attributes from the implicit ones
>in normal classes.
>
>Of course, you can tailor IsRecord to fit your needs.
>
>I hope that I have expressed myself more clearly now,
>
>  
>
I looked at your docs here: 
http://pytables.sourceforge.net/html-doc/usersguide-html4.html#section4.2
and what you said above clicked.  Thanks.

Todd


From Chris.Barker at noaa.gov  Tue Jan 28 11:02:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 28 11:02:04 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> <3E288068.3070407@stsci.edu>
Message-ID: <3E36D14D.C3238DFA@noaa.gov>

Konrad Hinsen wrote:
> > M = array(l)
> > Mt = M.transpose()
> >
> > just isn't that much worse than:
> >
> > Mt = transpose(l)
> 
> No, but the automatic conversion enables me to write functions that
> accept any sequence type without even having to think about it.

I've used that too, but I also frequently use something like this:

def function(A):
	A = array(A)
	...

Which is pretty simple too. 

> Moreover, it is almost essential in many situations to accept scalars
> in place of arrays, because scalars fulfill the role of rank-0 arrays.

Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
array dichotomy a little cleaner in NumArray?
 
> > I also agree that the point is not subclassing per se, it's
> > polymorphism. It should be easy to write a class that acts like an array
> > in all the ways that you need it to. 
> 
> True, and that is a weak point of NumPy.

Is this getting any better with NumArray?
 

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From falted at openlc.org  Tue Jan 28 11:42:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 28 11:42:07 2003
Subject: [Numpy-discussion] enum values visible in numeric types instances?
Message-ID: <200301282041.21145.falted@openlc.org>

Hi,

A couple of points related with numarray type objects:

1.- When working with numeric type instances like UInt8 or Float64, is
there a way to access their C counterpart in the NumarrayType enumeration?
That would be handy when you want to map between these objects and integers.

For example, right now, I'm forced to use these mappings in Pyrex:

# Conversion tables from/to classes to the numarray enum types
toenum = {num.Int8:tInt8,       num.UInt8:tUInt8,
          num.Int16:tInt16,     num.UInt16:tUInt16,
          num.Int32:tInt32,     num.UInt32:tUInt32,
          num.Float32:tFloat32, num.Float64:tFloat64,
          CharType:97   # ascii(97) --> 'a' # Special case (to be corrected)
          }

toclass = {tInt8:num.Int8,       tUInt8:num.UInt8,
           tInt16:num.Int16,     tUInt16:num.UInt16,
           tInt32:num.Int32,     tUInt32:num.UInt32,
           tFloat32:num.Float32, tFloat64:num.Float64,
           97:CharType   # ascii(97) --> 'a' # Special case (to be corrected)
          }

(yes, Pyrex lets you do these kinds of "miracles", like mappings between
Python objects and C integers)

but if I had this access directly from the object (for example
Int8.enumType), my code (and C-extensions in general) could look simpler.
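
As a rough sketch of what I have in mind (plain Python, with a stand-in
class instead of the real numarray type objects; the attribute name
enumType, the from_enum helper and the integer values are all
hypothetical and do not reflect numarray's actual enum values):

```python
class NumericType:
    """Hypothetical stand-in for a numarray type object."""
    _by_enum = {}

    def __init__(self, name, enum_type):
        self.name = name
        self.enumType = enum_type            # integer used on the C side
        NumericType._by_enum[enum_type] = self

    @classmethod
    def from_enum(cls, enum_type):
        """Reverse lookup: C enum integer -> type object."""
        return cls._by_enum[enum_type]

    def __repr__(self):
        return self.name


# Placeholder integers, chosen only for this example.
Int8 = NumericType("Int8", 1)
UInt8 = NumericType("UInt8", 2)
Float64 = NumericType("Float64", 3)

print(Int8.enumType)
print(NumericType.from_enum(3))
```

With something like this, both hand-written tables above would collapse
into one attribute access and one lookup.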

2.- I understand now why Todd was worried about assigning CharArray objects
an enumerated type. In fact, if you look at the above maps, I have to map
this special object myself to the number 97 (which is the ASCII value for
the character "a"). 97 is ok for now because it can't collide (at least for
a while) with other enumeration types.

My suggestion is that it would be a good thing to have a reserved enum type
for CharArray. And I think that mapping CharArrays to Bool or Int8 would
not be a good solution: chararray objects differ from them in enough ways
that it would be a mess to distinguish the two in C code by just looking at
the enumeration type. 

I don't know, but maybe recarrays also merit a place in enumeration (?). 

By the way, after the discussion with Todd I finally decided to remove all
the Numeric charcodes (and related codes) from PyTables. However, I can
still manage Numeric objects by converting them to numarray and accessing
the class type with the .type() method. And you know what? The code looks
much more logical and neat and, best of all, is less error-prone (well, at
least I hope so!). I definitely encourage you to make a similar transition in
numarray (although I guess that would be more difficult because you still
need Numeric compatibility).

Thanks,

-- 
Francesc Alted


From perry at stsci.edu  Tue Jan 28 13:59:08 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 28 13:59:08 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <3E36D14D.C3238DFA@noaa.gov>
Message-ID: <JFEGLNDJEDNOMPPHDEJFEEDMECAA.perry@stsci.edu>

> Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
> array dichotomy a little cleaner in NumArray ?
>
Hmmm, I'd like to say yes, but I'm not sure what exactly you are
referring to. Please elaborate on how you think it should be
changed. About the only thing that comes to mind is that repr()
for rank-0 will be different for numarray than Numeric, and that
it will never be the result of any reduction or similar selection.
  
> > > I also agree that the point is not subclassing per se, it's
> > > polymorphism. It should be easy to write a class that acts like
> > > an array in all the ways that you need it to. 
> > 
> > True, and that is a weak point of NumPy.
> 
> Is this getting any better with NumArray?
>  
Again, I hope so, but I find this too general to know if it satisfies
anyone's specific goals. I'd like to see specific examples. I think
it is often trickier than people initially think.

Perry


From jdhunter at ace.bsd.uchicago.edu  Wed Jan 29 13:13:03 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Wed Jan 29 13:13:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
Message-ID: <m2fzrbzmlc.fsf@mother.paradise.lost>

I have two equal length 1D arrays of 256-4096 complex or floating
point numbers which I need to put into a shape=(len(x),2) array.

I need to do this a lot, so I would like to use the most efficient
means.  Currently I am doing:

def somefunc(x,y):
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y
    do_something_with(X)

Is this the fastest way?

Thanks,
John Hunter


From list at jsaul.de  Thu Jan 30 01:20:04 2003
From: list at jsaul.de (Joachim Saul)
Date: Thu Jan 30 01:20:04 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost>
References: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <20030130091853.GA842@jsaul.de>

* John Hunter [2003-01-29 22:13]:
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
>
> Is this the fastest way?

X = transpose(array([x]+[y]))

It may not be the fastest possible way, but should be about a
factor of two faster; better than nothing.
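
In case the idiom is not obvious: [x]+[y] is a two-element list of rows,
so array() sees a (2, n) object, and transpose() turns that into the
(n, 2) layout you want. Here is the same shape logic as a small sketch
with plain Python lists standing in for Numeric arrays (an illustration
only, not a timing test):

```python
x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

rows = [x] + [y]                          # two rows of length 3: shape (2, 3)
X = [list(pair) for pair in zip(*rows)]   # transposed: shape (3, 2)

print(X)   # [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
```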

Cheers,
Joachim


From karthik at james.hut.fi  Thu Jan 30 01:47:03 2003
From: karthik at james.hut.fi (Karthikesh Raju)
Date: Thu Jan 30 01:47:03 2003
Subject: [Numpy-discussion] Object too deep for desired array
In-Reply-To: <E18dySd-0000ec-00@sc8-sf-list2.sourceforge.net>
Message-ID: <Pine.SGI.4.21.0301301138340.1340362-100000@james.hut.fi>

Hi, 

I was trying out something like this:
import Numeric
import LinearAlgebra
import cmath
import RandomArray
import copy


def sMatrix(pd, code, window):
    if window == 0:
        nprime = 1
    else:
        nprime = window
    
    K, C = Numeric.shape(code)
    K1, L = Numeric.shape(pd)
    # check if K == K1 and raise an exception here
    sCode = Numeric.zeros([nprime*C,K*L*(window+1)],'d')

    for k in range(K):
        for l in range(L):
            code1 = copy.deepcopy(Numeric.array(code[k,0:C-pd[k,l]]))
            code1.shape = (C-pd[k,l],1)
            sCode1 = Numeric.concatenate((Numeric.zeros([pd[k,l],1]),Numeric.zeros([C*window,1]),code1))
            sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
    
    return sCode

if __name__ == "__main__":
    pd = Numeric.array([[2]])
    code = Numeric.array([[-1,1,-1,1,1]])
    np = sMatrix(pd,code,0)
    print np
    print "--"*30
    np = sMatrix(pd,code,1)
    print Numeric.shape(np)
    print np
    print "--"*30
    np = sMatrix(pd,code,2)
    print Numeric.shape(np)
    print np
    print "--"*30


------------------------------
And I get stuck with the following error message:

Traceback (most recent call last):
  File "sMatrix.py", line 31, in ?
    np = sMatrix(pd,code,0)
  File "sMatrix.py", line 24, in sMatrix
    sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
ValueError: Object too deep for desired array


------------

I think it is due to the many deep copy operations that I am performing. I
want to be in a position where slices of matrices are not references but
copies, so that I can move these copies around. (Maybe it is inefficient,
but that is what I did in Matlab and I want some compatibility until I
learn more Python and migrate to it completely.)

Is there a way out? Why is this a problem? Am I missing something?

Best regards,

karthik


-----------------------------------------------------------------------
Karthikesh Raju,		    email: karthik at james.hut.fi		
Researcher,			    http://www.cis.hut.fi/karthik
Helsinki University of Technology,  Tel: +358-9-451 5389
Laboratory of Comp. & Info. Sc.,    Fax: +358-9-451 3277
Department of Computer Sc.,
P.O Box 5400, FIN 02015 HUT,
Espoo, FINLAND
-----------------------------------------------------------------------


From pearu at cens.ioc.ee  Thu Jan 30 01:51:09 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu Jan 30 01:51:09 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <Pine.LNX.4.21.0301301141210.5388-100000@cens.kybi>

On Wed, 29 Jan 2003, John Hunter wrote:

> 
> I have two equal length 1D arrays of 256-4096 complex or floating
> point numbers which I need to put into a shape=(len(x),2) array.
> 
> I need to do this a lot, so I would like to use the most efficient
> means.  Currently I am doing:
> 
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
> 
> Is this the fastest way?

Maybe you could arrange your algorithm so that you first create
X and then reference its columns by x,y without copying:

# Allocate memory
X = zeros( (n,2), typecode=.. )

# Get references to columns
x = X[:,0]
y = X[:,1]

while 1:
  do_something_inplace_with(x,y)
  do_something_with(X)

Pearu


From jdhunter at ace.bsd.uchicago.edu  Thu Jan 30 11:26:05 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 11:26:05 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an
 array
In-Reply-To: <m2fzrbzmlc.fsf@mother.paradise.lost> (John Hunter's message of
 "Wed, 29 Jan 2003 15:13:03 -0600")
References: <m2fzrbzmlc.fsf@mother.paradise.lost>
Message-ID: <m2vg064eyj.fsf@mother.paradise.lost>

>>>>> "John" == John Hunter <jdhunter at ace.bsd.uchicago.edu> writes:

    John> I have two equal length 1D arrays of 256-4096 complex or
    John> floating point numbers which I need to put into a
    John> shape=(len(x),2) array.

    John> I need to do this a lot, so I would like to use the most
    John> efficient means.  Currently I am doing:

I tested all the suggested methods and the transpose with [x] and [y]
was the clear winner, with an 8 fold speed up over my original code.
The concatenate method was between 2-3 times faster.

Thanks to all who responded,
John Hunter

cruncher2:~/python/test> python test.py test_naive
test_naive 0.480427026749
cruncher2:~/python/test> python test.py test_concat
test_concat 0.189149975777
cruncher2:~/python/test> python test.py test_transpose
test_transpose 0.0698409080505


from Numeric import transpose, concatenate, reshape, array, zeros
from RandomArray import normal
import time, sys

def test_naive(x,y):
    "Naive approach"
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y

def test_concat(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = concatenate( ( reshape(x,(-1,1)), reshape(y,(-1,1)) ), 1)


def test_transpose(x,y):
    "Thanks to Joachim Saul"
    X = transpose(array([x]+[y]))


m = {'test_naive' : test_naive,
     'test_concat' : test_concat,
     'test_transpose' : test_transpose}

nse1 = normal(0.0, 1.0, (4096,))
nse2 = normal(0.0, 1.0, nse1.shape)

N = 1000

trials = range(N)

func = m[sys.argv[1]]
t1 = time.time()
for i in trials:
    func(nse1,nse2)
t2 = time.time()
print sys.argv[1], t2-t1


From jdhunter at ace.bsd.uchicago.edu  Thu Jan 30 14:18:04 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 14:18:04 2003
Subject: [Numpy-discussion] mlab functions: psd, csd, cohere, corrcoef
Message-ID: <m27kcm1duu.fsf@mother.paradise.lost>

I needed some spectral analysis functions, and finding none available,
wrote my own.  I use matlab a lot, so I wrote them to be matlab
compatible.  If you all think these look OK, I'm happy to submit them
for inclusion into MLab.  

-------------------------------------------------------------------

"""

Spectral analysis functions for Numerical python written for
compatibility with matlab commands with the same names.

  psd - Power spectral density using Welch's average periodogram
  csd - Cross spectral density using Welch's average periodogram
  cohere - Coherence (normalized cross spectral density)
  corrcoef - The matrix of correlation coefficients

The functions are designed to work for real and complex valued Numeric
arrays.

One of the major differences between this code and matlab's is that I
use functions for 'detrend' and 'window', and matlab uses vectors.
This can be easily changed, but I think the functional approach is a
bit more elegant.

Please send comments, questions and bugs to:

Author: John D. Hunter <jdhunter at ace.bsd.uchicago.edu>

"""

from __future__ import division
from MLab import mean, hanning, cov
from Numeric import zeros, ones, diagonal, transpose, matrixmultiply, \
     resize, sqrt, divide, array, Float, Complex, concatenate, \
     convolve, dot, conjugate, absolute, arange, reshape
from FFT import fft


def norm(x):
    return sqrt(dot(x,x))

def window_hanning(x):
    return hanning(len(x))*x

def window_none(x):
    return x

def detrend_mean(x):
    return x - mean(x)

def detrend_none(x):
    return x

def detrend_linear(x):
    """Remove the best fit line from x"""
    # I'm going to regress x on xx=range(len(x)) and return
    # x - (b*xx+a)
    xx = arange(len(x), typecode=x.typecode())
    X = transpose(array([xx]+[x]))
    C = cov(X)
    b = C[0,1]/C[0,0]
    a = mean(x) - b*mean(xx)
    return x-(b*xx+a)


def psd(x, NFFT=256, Fs=2, detrend=detrend_none,
        window=window_hanning, noverlap=0):
    """
    The power spectral density by Welch's average periodogram method.
    The vector x is divided into NFFT length segments.  Each segment
    is detrended by function detrend and windowed by function window.
    noverlap gives the length of the overlap between segments.  The
    absolute(fft(segment))**2 of each segment are averaged to compute Pxx,
    with a scaling to correct for power loss due to windowing.  Fs is
    the sampling frequency.

    -- NFFT must be a power of 2
    -- detrend and window are functions, unlike in matlab where they are
       vectors.
    -- if length x < NFFT, it will be zero padded to NFFT
    

    Refs:
      Bendat & Piersol -- Random Data: Analysis and Measurement
        Procedures, John Wiley & Sons (1986)

    """

    # a positive power of 2 has exactly one bit set
    if NFFT < 1 or NFFT & (NFFT-1):
        raise ValueError, 'NFFT must be a power of 2'

    # zero pad x up to NFFT if it is shorter than NFFT
    if len(x)<NFFT:
        n = len(x)
        x = resize(x, (NFFT,))
        x[n:] = 0
    

    # for real x, ignore the negative frequencies
    if x.typecode()==Complex: numFreqs = NFFT
    else: numFreqs = NFFT//2+1
        
    windowVals = window(ones((NFFT,),x.typecode()))
    step = NFFT-noverlap
    ind = range(0,len(x)-NFFT+1,step)
    n = len(ind)
    Pxx = zeros((numFreqs,n), Float)

    # do the ffts of the slices
    for i in range(n):
        thisX = x[ind[i]:ind[i]+NFFT]
        thisX = windowVals*detrend(thisX)
        fx = absolute(fft(thisX))**2
        Pxx[:,i] = fx[:numFreqs]

    # Scale the spectrum by the norm of the window to compensate for
    # windowing loss; see Bendat & Piersol Sec 11.5.2
    if n>1: Pxx = mean(Pxx,1)
    Pxx = divide(Pxx, norm(windowVals)**2)
    freqs = Fs/NFFT*arange(0,numFreqs)
    return Pxx, freqs


def csd(x, y, NFFT=256, Fs=2, detrend=detrend_none,
        window=window_hanning, noverlap=0):
    """
    The cross spectral density Pxy by Welch's average periodogram
    method.  The vectors x and y are divided into NFFT length
    segments.  Each segment is detrended by function detrend and
    windowed by function window.  noverlap gives the length of the
    overlap between segments.  The product of the direct FFTs of x and
    y are averaged over each segment to compute Pxy, with a scaling to
    correct for power loss due to windowing.  Fs is the sampling
    frequency.

    NFFT must be a power of 2

    Refs:
      Bendat & Piersol -- Random Data: Analysis and Measurement
        Procedures, John Wiley & Sons (1986)

    """

    # a positive power of 2 has exactly one bit set
    if NFFT < 1 or NFFT & (NFFT-1):
        raise ValueError, 'NFFT must be a power of 2'

    # zero pad x and y up to NFFT if they are shorter than NFFT
    if len(x)<NFFT:
        n = len(x)
        x = resize(x, (NFFT,))
        x[n:] = 0
    if len(y)<NFFT:
        n = len(y)
        y = resize(y, (NFFT,))
        y[n:] = 0

    # for real x, ignore the negative frequencies
    if x.typecode()==Complex: numFreqs = NFFT
    else: numFreqs = NFFT//2+1
        
    windowVals = window(ones((NFFT,),x.typecode()))
    step = NFFT-noverlap
    ind = range(0,len(x)-NFFT+1,step)
    n = len(ind)
    Pxy = zeros((numFreqs,n), Complex)

    # do the ffts of the slices
    for i in range(n):
        thisX = x[ind[i]:ind[i]+NFFT]
        thisX = windowVals*detrend(thisX)
        thisY = y[ind[i]:ind[i]+NFFT]
        thisY = windowVals*detrend(thisY)
        fx = fft(thisX)
        fy = fft(thisY)
        Pxy[:,i] = fy[:numFreqs]*conjugate(fx[:numFreqs])

    # Scale the spectrum by the norm of the window to compensate for
    # windowing loss; see Bendat & Piersol Sec 11.5.2
    if n>1: Pxy = mean(Pxy,1)
    Pxy = divide(Pxy, norm(windowVals)**2)
    freqs = Fs/NFFT*arange(0,numFreqs)
    return Pxy, freqs

def cohere(x, y, NFFT=256, Fs=2, detrend=detrend_none,
           window=window_hanning, noverlap=0):
    """
    cohere the coherence between x and y.  Coherence is the normalized
    cross spectral density

    Cxy = |Pxy|^2/(Pxx*Pyy)

    The return value is (Cxy, f), where f are the frequencies of the
    coherence vector.  See the docs for psd and csd for information
    about the function arguments NFFT, detrend, window, noverlap, as
    well as the methods used to compute Pxy, Pxx and Pyy.

    """

    
    Pxx,f = psd(x, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)
    Pyy,f = psd(y, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)
    Pxy,f = csd(x, y, NFFT=NFFT, Fs=Fs, detrend=detrend,
              window=window, noverlap=noverlap)

    Cxy = divide(absolute(Pxy)**2, Pxx*Pyy)
    return Cxy, f

def corrcoef(*args):
    """
    
    corrcoef(X) where X is a matrix returns a matrix of correlation
    coefficients for each row of X.
    
    corrcoef(x,y) where x and y are vectors returns the matrix of
    correlation coefficients for x and y.

    Numeric arrays can be real or complex

    The correlation matrix is defined from the covariance matrix C as

    r(i,j) = C[i,j] / sqrt(C[i,i]*C[j,j])
    """

    if len(args)==2:
        X = transpose(array([args[0]]+[args[1]]))
    elif len(args)==1:
        X = args[0]
    else:
        raise RuntimeError, 'Only expecting 1 or 2 arguments'

    
    C = cov(X)
    d = resize(diagonal(C), (2,1))
    r = divide(C,sqrt(matrixmultiply(d,transpose(d))))[0,1]
    try: return r.real
    except AttributeError: return r


-------------------------------------------------------------------
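
For anyone who wants to check the segmenting used by psd and csd without
installing Numeric, the start indices can be reproduced in plain Python.
This is only a sketch of the indexing step; the helper name
segment_starts and the extra noverlap check are additions for this
example and are not part of the module above:

```python
def segment_starts(n_samples, nfft=256, noverlap=0):
    """Start indices of the length-nfft segments, as computed in psd/csd."""
    if not 0 <= noverlap < nfft:
        raise ValueError("noverlap must satisfy 0 <= noverlap < nfft")
    step = nfft - noverlap
    # Only full segments are used; a trailing partial segment is dropped.
    return list(range(0, n_samples - nfft + 1, step))


no_overlap = segment_starts(1024, nfft=256, noverlap=0)
half_overlap = segment_starts(1024, nfft=256, noverlap=128)

print(no_overlap)         # [0, 256, 512, 768]
print(len(half_overlap))  # 7
```

With 50% overlap the same 1024 samples give 7 segments instead of 4, so
more periodograms are averaged.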

I wrote a little test code comparing the output of matlab's equivalent
functions.  Basically, I compute the psd or cohere in matlab and
python and do the rms difference on the resultant vectors

  RMS cohere python/matlab difference 0.000854587104587
  RMS psd python/matlab difference 0.00210783306638

I am not sure where these differences are arising, but they are quite
small.  I'm going to keep trying to track them down.

For corrcoef, the answers are the same past 8 significant digits.
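
As a cross-check that needs neither Numeric nor matlab, the two-vector
case can be written out in plain Python for real-valued data, using
r = Cxy / sqrt(Cxx*Cyy). The function name corrcoef_pair is made up for
this sketch, and it uses the N-1 normalization for the covariances
(which cancels in the ratio anyway):

```python
import math


def corrcoef_pair(x, y):
    """Correlation coefficient of two equal-length real sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    cyy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cxy / math.sqrt(cxx * cyy)


r_pos = corrcoef_pair([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly correlated
r_neg = corrcoef_pair([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly anti-correlated
print(r_pos, r_neg)
```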

Hope this helps!
John Hunter


From haase at msg.ucsf.edu  Fri Jan 31 05:12:05 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Jan 31 05:12:05 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
Message-ID: <020a01c2c897$65bf2dc0$3b45da80@rodan>

Hi everybody,
I tried a 'python2.2 setup.py install'
of numarray  on a Mac running os-X (10.1; I have also Fink installed)
It starts crunching until:
/usr/bin/ld: Undefined symbols:
_fclearexcept
_fetestexcept

Anyone out there, who uses numarray on osX ?

I'm thankful for any pointer...

Sebastian Haase


From jmiller at stsci.edu  Fri Jan 31 07:31:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 31 07:31:01 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
References: <020a01c2c897$65bf2dc0$3b45da80@rodan>
Message-ID: <3E3A9628.3030704@stsci.edu>

Sebastian Haase wrote:

>Hi everybody,
>I tried a 'python2.2 setup.py install'
>of numarray  on a Mac running os-X (10.1; I have also Fink installed)
>It starts crunching until:
>/usr/bin/ld: Undefined symbols:
>_fclearexcept
>_fetestexcept
>
>Anyone out there, who uses numarray on osX ?
>
>I'm thankful for any pointer...
>
>Sebastian Haase
>  
>
Hi Sebastian,

I am very much a Mac-Amateur,  but I have run numarray under osX by 
first installing a local UNIX version of Python using the source 
tarball.  The steps were roughly as follows:

1. Obtain and unpack the Python source tarball in your home directory. 
 cd there.

2. Configure Python using:  ./configure --prefix=$HOME  

3. Edit the Makefile for the following:

61c61
 > LDFLAGS=
---
< LDFLAGS=      -framework System -framework CoreServices -framework Foundation

This was the only (reasonable) way I could figure out how to tunnel link 
time options down through the distutils in the proper command line 
order.  I'm not really sure this is a minimal set of frameworks,  but it 
did at least work.

4. Build and install python:  make ; make install

5.  Obtain and unpack the numarray source tarball.  cd there.

6.  Build and install numarray:  python setupall.py install

7.  Put $HOME/bin on your PATH and rehash.


Todd



From Chris.Barker at noaa.gov  Fri Jan 31 12:44:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Jan 31 12:44:02 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References: <m2fzrbzmlc.fsf@mother.paradise.lost> <m2vg064eyj.fsf@mother.paradise.lost>
Message-ID: <3E3ADC19.5566CB5A@noaa.gov>

John Hunter wrote:
>     John> I have two equal length 1D arrays of 256-4096 complex or
>     John> floating point numbers which I need to put into a
>     John> shape=(len(x),2) array.

> I tested all the suggested methods and the transpose with [x] and [y]
> was the clear winner, with an 8 fold speed up over my original code.
> The concatenate method was between 2-3 times faster.

I was a little surprised by this, as I figured that the transpose method
made an extra copy of the data (array() makes one copy, transpose()
another). So I looked at the source for concatenate:

def concatenate(a, axis=0):
    """concatenate(a, axis=0) joins the tuple of sequences in a into a
    single NumPy array.
    """
    if axis == 0:
        return multiarray.concatenate(a)
    else:
        new_list = []
        for m in a:
            new_list.append(swapaxes(m, axis, 0))
    return swapaxes(multiarray.concatenate(new_list), axis, 0)

So, if you are concatenating along anything other than the zero-th
axis, you end up doing something similar to the transpose method. Seeing
this, I tried something else:

def test_concat2(x,y):
    x.shape = (1,-1)
    y.shape = (1,-1)
    X = transpose( concatenate( (x, y) ) )
    x.shape = (-1,)
    y.shape = (-1,)

This then uses the native concatenate, but requires an extra copy in the
transpose.

Here's a somewhat cleaner version, though you get more copies:

def test_concat3(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = transpose( concatenate( ( reshape(x,(1,-1)), reshape(y,(1,-1)) ) ) )

Here are the test results:

testing on vectors of length:  4096

test_concat 0.286280035973
test_transpose 0.100033998489
test_naive 0.805399060249
test_concat3 0.109319090843
test_concat2 0.136469960213

All the transpose methods are essentially a tie. Would it be that hard
for concatenate to do its thing for any axis in C? It does seem like
this is a fairly basic operation, and shouldn't require more than one
copy.

By the way, I realised that the transpose method had an extra call.
transpose() can take an appropriate python sequence, so this works just
fine:

def test_transpose2(x,y):
    X = transpose([x]+[y])

However, it doesn't really save you the copy, as I'm pretty sure
transpose makes a copy internally anyway. Test results:
testing on vectors of length:  4096

test_transpose 0.104995965958
test_transpose2 0.103582024574

I think the winner is:

X = transpose([x]+[y])


well, I learned a little bit more about Numeric today.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From rob at hooft.net  Fri Jan 31 13:36:03 2003
From: rob at hooft.net (Rob Hooft)
Date: Fri Jan 31 13:36:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References: <m2fzrbzmlc.fsf@mother.paradise.lost>	<m2vg064eyj.fsf@mother.paradise.lost> <3E3ADC19.5566CB5A@noaa.gov>
Message-ID: <3E3AEC19.6020907@hooft.net>

Chris Barker wrote:
> 
> X = transpose([x]+[y])
> 
> 
> well, I learned a little bit more about Numeric today.
> 

I've been skipping through a lot of messages today because I was getting 
behind on mailing list traffic, but I missed one thing in the discussion 
so far (sorry if it was marked already):

    transpose doesn't actually do any work.

Actually, transpose only sets the "strides" counts differently, and this 
is blazingly fast. What is NOT fast is using the transposed array later! 
The problem is that many routines actually require a contiguous array, 
and will make a temporary local contiguous copy. This may happen 
multiple times if the lifetime of the transposed array is long. Even 
routines that do not require a contiguous array and can actually use the 
strides may run significantly slower because the CPU cache is trashed a 
lot by the high strides.

Moral: you can't test this code by looping 1000 times through it, you 
actually should take into account the time it takes to make a contiguous 
array immediately after the transpose call.
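
To illustrate what "only sets the strides differently" means, here is a
small model in plain Python. It is not Numeric code; the helper names
c_strides and is_c_contiguous are made up for this sketch, and 8 bytes
per item is assumed (a double):

```python
def c_strides(shape, itemsize):
    """Byte strides of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True if the strides are exactly those of a fresh C-ordered array."""
    return strides == c_strides(shape, itemsize)


shape = (2, 4096)                 # array([x]+[y]) for two 4096-vectors
strides = c_strides(shape, 8)     # (32768, 8)

# Transposing swaps shape and strides; no data is moved.
t_shape, t_strides = shape[::-1], strides[::-1]

print(t_shape, t_strides)                        # (4096, 2) (8, 32768)
print(is_c_contiguous(t_shape, t_strides, 8))    # False
```

Walking along a row of the transposed array now jumps 32768 bytes per
element, which is why later operations either copy it into contiguous
storage or run slowly.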

Regards,

Rob Hooft
-- 
Rob W.W. Hooft  ||  rob at hooft.net  ||  http://www.hooft.net/people/rob/