From edcjones at erols.com Wed Jan 1 20:29:44 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan 1 20:29:44 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
Message-ID: <3E13C7DA.70906@erols.com>

Perry Greenfield wrote:

> Edward Jones writes:
> > I write code using both PIL and numarray. PIL uses strings for
> > modes and numarray uses (optionally) strings as typecodes. This
> > causes problems. One fix is to emit a DeprecationWarning when
> > string typecodes are used. Two functions are needed:
> > StringTypeWarningOn and StringTypeWarningOff. The default
> > should be to ignore this warning.
>
> I'm not sure I understand. Can you give me an example of problem
> code or usage? It sounds like you are trying to test the types of
> PIL and numarray objects in a generic sense. But I'd understand
> better if you could show an example.

That's what I was thinking (incorrectly). But I don't need to directly compare PIL modes with numarray types. My code never tries to deduce whether an array is a numarray or a PIL image from just the natype_or_mode. A module name (MODULE.NUMARRAY, MODULE.PIL) must also be given. I do things this way because I might want to include other array/image systems. In an earlier version, I had a MODULE.IPL for the Intel Image Processing Library. The code also implements a policy of forbidding string types.

So now all I can say is:

1. UInt8 == 'X' should not raise an exception. It should return False.

3. There needs to be a function that returns True iff arg is a numarray type (UInt8, "UInt8", "b", ...).

def IsType(rep):
    from numerictypes import typeDict
    return isinstance(rep, NumericType) or typeDict.has_key(rep)

Here is a typical piece of code. "module" can be MODULE.PIL or MODULE.NUMARRAY.

----
"""General image casting function. Changes the C type of the pixels. Information can be lost.
The "Convert" functions call C casting functions that clip the values, For example, if the input is a UInt16 and the output is a Int16, any input value greater than 32767 becomes 32767. """ def ArrayToArrayCast(arrin, module, natype_or_mode): """Converts one array into another. Results are clipped.""" pars = Parameters(arrin) if pars.module == module == MODULE.PIL and \ pars.mode == natype_or_mode: return arrin if pars.module == module == MODULE.NUMARRAY and \ NA_SameType(pars.natype, natype_or_mode): return arrin if pars.module == MODULE.NUMARRAY and module == MODULE.NUMARRAY: return NA_To_NA_Convert(arrin, natype_or_mode) if pars.module == MODULE.PIL and module == MODULE.PIL: return PIL_To_PIL_Convert(arrin, natype_or_mode) if pars.module == MODULE.NUMARRAY and module == MODULE.PIL: return NA_To_PIL_Convert(arrin, natype_or_mode) if pars.module == MODULE.PIL and module == MODULE.NUMARRAY: return PIL_To_NA_Convert(arrin, natype_or_mode) ---- From edcjones at erols.com Wed Jan 1 20:42:05 2003 From: edcjones at erols.com (Edward C. Jones) Date: Wed Jan 1 20:42:05 2003 Subject: [Numpy-discussion] End of Holidays small comments Message-ID: <3E13CB14.7040908@erols.com> node35.html: >>> print x.type(), x.real.type() D d should be >>> print x.type(), x.real.type() numarray type: Complex64 numarray type: Float64 ------------------------------------------------ Why use both NUM_C_ARRAY and C_ARRAY? ------------------------------------------------ in _ndarraymodule.c: {"_byteoffset", (getter)_ndarray_byteoffset_get, (setter)_ndarray_byteoffset_set, "shortest seperation between elements in bytes"}, {"_bytestride", (getter)_ndarray_bytestride_get, (setter)_ndarray_bytestride_set, "shortest seperation between elements in bytes"}, One of the comments is wrong. Also "separation". ------------------------------------------------ libnumarraymodule.c: /* Create an empty array. 
*/ static PyArrayObject * NA_Empty(int ndim, int *shape, NumarrayType type) node42.html: static PyObject* NA_Empty( NumarrayType type, int ndim, ...) Serious documentation error. ------------------------------------------------ I think NA_New should be NA_New(int ndim, int* shape, NumarrayType type, void* buffer) The current NA_New is useful only when ndim is known at code-writing time. ------------------------------------------------ node39.html: Note: the type parameter for a macro is one of the Numarray Numeric Data Types, not a NumarrayType enumeration value. There should be an example of one of the GET/SET macros. How about unsigned char n; int i; ... n = NA_GET1(arr, UInt8, i); ------------------------------------------------ It seems that the parameters "aligned" and "writeable" are ignored in the source code for NA_NewAll and class NumArray. ------------------------------------------------ I would like to see an "int* strides" parameter added to NA_NewAll, so a non-contiguous "buffer" can be used. ------------------------------------------------ I suggest NA_Copy(PyObject* arr) which is something like static PyObject* NA_Copy(PyObject* arr) { PyArrayObject* arr1 = arr; return NA_NewAll(arr1->nd, (long*) arr1->dimensions, arr1->descr->type_num, arr1->data, arr1->byteoffset, arr1->bytestride, arr1->byteorder, 1, 1); } From edcjones at erols.com Wed Jan 1 20:45:34 2003 From: edcjones at erols.com (Edward C. Jones) Date: Wed Jan 1 20:45:34 2003 Subject: [Numpy-discussion] Slicing API? Message-ID: <3E13CBC3.6000207@erols.com> Both in Numeric and now in numarray I have found a need for API functions for slicing. Has anyone thought about this? From jmiller at stsci.edu Thu Jan 2 06:03:16 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 2 06:03:16 2003 Subject: [Numpy-discussion] Slicing API? References: <3E13CBC3.6000207@erols.com> Message-ID: <3E14481D.9080902@stsci.edu> Edward C. 
Jones wrote: > Both in Numeric and now in numarray I have found a need for API > functions for slicing. Has anyone thought about this? > Speaking for myself and the numarray C-API, the answer is no. What API do you want? Can you suggest function prototypes? Todd From jmiller at stsci.edu Thu Jan 2 12:36:53 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 2 12:36:53 2003 Subject: [Numpy-discussion] Slicing API? References: <3E13CBC3.6000207@erols.com> <3E14481D.9080902@stsci.edu> <3E1497E1.1050808@erols.com> Message-ID: <3E14A435.7040609@stsci.edu> Edward C. Jones wrote: > Todd Miller wrote: > >> Edward C. Jones wrote: >> >>> Both in Numeric and now in numarray I have found a need for API >>> functions for slicing. Has anyone thought about this? >>> >> Speaking for myself and the numarray C-API, the answer is no. What >> API do you want? Can you suggest function prototypes? > > > An API version of arrout[slices] = arrin[slices]: > > static int > NA_CopySlice(PyArrayObject* arrin, PyArrayObject* arrout, > int* startin, int* stepin, int* stopin, int* startout, int* stepout); > > I would suggest something more like the following then: typedef struct { int start, stop, step; } NumSlice; static int NA_CopySlice(PyArrayObject* arrin, int indim, NumSlice *slicein, PyArrayObject* arrout, int outdim, NumSlice *sliceout); The differences are: 1. A slice dimension count is added for both input and output arrays. This enables use of partial indices. 2. Slice values are expressed using the NumSlice typedef/struct rather than 3 independent int arrays. 3. The parameter order is shuffled so that input array parameters are kept together, and output array parameters are kept together. But, I still have these comments: 1. It looks like it will be cumbersome to use. 2. We should probably implement it as a callback to Python to avoid introducing another set of assignment semantics. 
Thus, the implementation would really just be building up and executing the calls for: outarr.__setitem__(outslices, inarr.__getitem__(inslices)).

3. The slicing implementation for numarray objects should be optimized to C this quarter, if not this month. So in terms of efficiency, not to mention comment 2, this won't buy much.

4. Since Numeric doesn't have this already, we're probably missing something obvious.

Comments? Still interested?

Todd

From jmiller at stsci.edu Fri Jan 3 09:49:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 3 09:49:01 2003
Subject: [Numpy-discussion] End of Holidays small comments
References: <3E13CB14.7040908@erols.com>
Message-ID: <3E15CED2.9070402@stsci.edu>

Wow! This is great feedback. Thanks Edward.

Edward C. Jones wrote:

> node35.html:
>
> >>> print x.type(), x.real.type()
> D d
>
> should be
>
> >>> print x.type(), x.real.type()
> numarray type: Complex64 numarray type: Float64

I talked this over with Perry, and think it should behave and be documented more like:

>>> print x.type(), x.real.type()
Complex64 Float64

> ------------------------------------------------
>
> Why use both NUM_C_ARRAY and C_ARRAY?

In the context of the defining enumeration, NUM_C_ARRAY looks correct. Anywhere else, C_ARRAY is about all I can stand. C_ARRAY is so common that I thought a little irregularity would be tolerable. Chalk it up to tastelessness.

> ------------------------------------------------
>
> in _ndarraymodule.c:
>
> {"_byteoffset",
> (getter)_ndarray_byteoffset_get,
> (setter)_ndarray_byteoffset_set,
> "shortest seperation between elements in bytes"},
> {"_bytestride",
> (getter)_ndarray_bytestride_get,
> (setter)_ndarray_bytestride_set,
> "shortest seperation between elements in bytes"},
>
> One of the comments is wrong. Also "separation".

Noted.

> ------------------------------------------------
>
> libnumarraymodule.c:
>
> /* Create an empty array.
*/ > static PyArrayObject * > NA_Empty(int ndim, int *shape, NumarrayType type) > > node42.html: > > static PyObject* NA_Empty( NumarrayType type, int ndim, ...) > Noted. > > ------------------------------------------------ > > I think NA_New should be > > NA_New(int ndim, int* shape, NumarrayType type, void* buffer) > > The current NA_New is useful only when ndim is known at code-writing > time. NA_New is a "convenience wrapper" around NA_NewAll, but I see your point. How about NA_vNew(), in the spirit of vprintf? > > ------------------------------------------------ > > node39.html: > > Note: the type parameter for a macro is one of the Numarray Numeric > Data Types, not a NumarrayType enumeration value. > > There should be an example of one of the GET/SET macros. How about > > unsigned char n; > int i; > ... > n = NA_GET1(arr, UInt8, i); OK. > > ------------------------------------------------ > > It seems that the parameters "aligned" and "writeable" are ignored in > the source code for NA_NewAll and class NumArray. "aligned" is used. "writeable" should probably be dropped since it is no longer used. Since doing that would break an interface someone might be using, I'd rather not. > > ------------------------------------------------ > > I would like to see an "int* strides" parameter added to NA_NewAll, so a > non-contiguous "buffer" can be used. OK. How about NA_NewAllWithStrides (or insert a better name here)? > > ------------------------------------------------ > > I suggest NA_Copy(PyObject* arr) which is something like > > static PyObject* NA_Copy(PyObject* arr) > { > PyArrayObject* arr1 = arr; > return NA_NewAll(arr1->nd, (long*) arr1->dimensions, This ((long *)) doesn't work portably, so I would recommend avoiding it. > > arr1->descr->type_num, arr1->data, arr1->byteoffset, > arr1->bytestride, arr1->byteorder, 1, 1); > } > I'll add NA_Copy(). 
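As a side note on the byteoffset/bytestride and "strides" items above, here is a minimal pure-Python sketch of how a byte offset and a byte stride pick elements out of a non-contiguous buffer. It does not use the numarray C-API; the function name strided_read and the sample record layout are assumptions made only for illustration, and it covers a single dimension only.

```python
import struct


def strided_read(buf, fmt, count, byteoffset, bytestride):
    """Read `count` items of struct format `fmt` from `buf`.

    Item i is read at byte position byteoffset + i * bytestride, so the
    items do not have to be adjacent in memory.
    """
    return [struct.unpack_from(fmt, buf, byteoffset + i * bytestride)[0]
            for i in range(count)]


# Hypothetical buffer: three packed records, each a little-endian int16
# followed by a uint8 (3 bytes per record, no padding).
records = struct.pack("<hBhBhB", 10, 1, 20, 2, 30, 3)

int16_column = strided_read(records, "<h", 3, 0, 3)   # offset 0, stride 3
uint8_column = strided_read(records, "<B", 3, 2, 3)   # offset 2, stride 3
```

Each "column" here is a view with its own offset and the shared record size as stride, which is the situation a per-dimension strides parameter would generalize.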
From jmiller at stsci.edu Fri Jan 3 09:52:02 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 3 09:52:02 2003 Subject: [Numpy-discussion] numarray types and PIL modes, revisited References: <3E13C7DA.70906@erols.com> Message-ID: <3E15CF75.8080207@stsci.edu> Edward C. Jones wrote: > So now all I can say is: > > 1. UInt8 == 'X' should not raise an exception. It should return False. OK. I'll change numarray to return False. > > 3. There needs to be a function that returns True iff arg is a numarry > type (UInt8, "UInt8", "b", ...). > > def IsType(rep): > from numerictypes import typeDict > return isinstance(rep, NumericType) or typeDict.has_key(rep) Sounds good too. I'll add this to numerictypes. > > Thanks, Todd From edcjones at erols.com Fri Jan 3 16:03:04 2003 From: edcjones at erols.com (Edward C. Jones) Date: Fri Jan 3 16:03:04 2003 Subject: [Numpy-discussion] Grepping the source Message-ID: <3E162CCB.7070106@erols.com> Here is a short program I find useful. #! /usr/bin/env python import os, sys, tempfile """Greps the numarray source code""" command = \ """grep -n "%s" \ /usr/local/src/numarray-0.4/Include/numarray/arrayobject.h \ ... /usr/local/src/numarray-0.4/Lib/_ufunc.py \ ... 
/usr/local/src/numarray-0.4/Src/libnumarraymodule.c \ > %s """ if len(sys.argv) != 2: raise Exception, 'program requires exactly one argument' temp = tempfile.mktemp() try: os.system(command % (sys.argv[1], temp)) f = file(temp, 'r') lines = f.read().splitlines() f.close() finally: if os.path.exists(temp): os.remove(temp) common = len('/usr/local/src/numarray-0.4/') d = {} names = [] for line in lines: line = line[common:] colonloc = line.index(':') name = line[:colonloc] text = line[colonloc+1:] if not d.has_key(name): d[name] = [] names.append(name) d[name].append(text) for name in names: if len(d[name]) == 0: continue print '%s:' % name for text in d[name]: print ' %s' % text print From magnus at hetland.org Fri Jan 3 16:24:04 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Fri Jan 3 16:24:04 2003 Subject: [Numpy-discussion] Grepping the source In-Reply-To: <3E162CCB.7070106@erols.com> References: <3E162CCB.7070106@erols.com> Message-ID: <20030104002342.GA18694@idi.ntnu.no> Edward C. Jones : [snip] > lines = f.read().splitlines() You could use f.readlines() here... Or you could just use for line in open(...): later, if you're using Python 2.2+ -- Magnus Lie Hetland http://hetland.org From perry at stsci.edu Mon Jan 6 16:28:05 2003 From: perry at stsci.edu (Perry Greenfield) Date: Mon Jan 6 16:28:05 2003 Subject: [Numpy-discussion] package vs module Message-ID: Back in December the issue of whether numarray should be a package or set of modules came up. When I asked about the possibility of making numarray a package (on the scipy mailing list but I can't seem to find the thread where it was discussed), I got only positive comments. The issue needs to be raised here also. Is there any objection to making numarray package based? The implications are that 3rd party modules (e.g. 
FFT) will be imported as part of the package structure, i.e., import numarray.FFT or from numarray.FFT import * instead of import FFT As usual there are advantages and disadvantages. The advantages are that we will not have name collisions with existing Numeric modules (currently we name FFT as FFT2 for this reason). It also potentially reduces name collision issues in general. Most feel it is a cleaner way to organize the software (at least based on the feedback so far). The main disadvantages I see so far are: 1) One will either have to change import statements in old code to match the new style (a pain, but generally changing imports is not terribly difficult since they are easy to identify) or explicitly add the path to each 3rd party module to Python Path (or some equivalent). 2) If numarray were accepted into the Python Standard Library, it would be the first case (as far as I can tell) of a standard library package where we would expect to add sub modules to it (e.g., FFT)). Normally these would not be distributed with the standard library, so some general mechanism will be needed to allow numarray to find 3rd party packages outside of the Python directory structure. For example, I don't think we can require having people install FFT in the Standard Library directory structure after Python is installed. Rather, we would probably have numarray look for extension modules in a standard named site-packages directory (or site-numarray?) or otherwise check a numarraypath environmental variable so that import numarray.FFT works properly. Perhaps others have ideas about how to best handle this. Any other issues being overlooked? Feedback? 
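One possible shape for the lookup mechanism in point 2, sketched under stated assumptions: a package's __init__.py can extend its own __path__ from an environment variable, so that sub-modules installed elsewhere are still importable as package.submodule. The package name demo_numarray and the variable DEMO_NUMARRAYPATH are hypothetical stand-ins, not anything numarray defines; the sketch builds throwaway directories so it can be run as-is.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway layout: the "core" package in one place, an add-on elsewhere.
root = tempfile.mkdtemp()
core_dir = os.path.join(root, "core", "demo_numarray")
addon_dir = os.path.join(root, "addons")
os.makedirs(core_dir)
os.makedirs(addon_dir)

# The package's __init__.py appends every directory named in the
# (hypothetical) DEMO_NUMARRAYPATH variable to the package search path.
with open(os.path.join(core_dir, "__init__.py"), "w") as f:
    f.write(
        "import os\n"
        "for d in os.environ.get('DEMO_NUMARRAYPATH', '').split(os.pathsep):\n"
        "    if d:\n"
        "        __path__.append(d)\n"
    )

# A third-party module that lives outside the package directory.
with open(os.path.join(addon_dir, "FFT.py"), "w") as f:
    f.write("def describe():\n    return 'FFT add-on found'\n")

os.environ["DEMO_NUMARRAYPATH"] = addon_dir
sys.path.insert(0, os.path.join(root, "core"))
importlib.invalidate_caches()

# The add-on is now importable as a sub-module of the package.
FFT = importlib.import_module("demo_numarray.FFT")
message = FFT.describe()
```

A fixed, conventionally named directory could be appended to __path__ in the same way instead of, or in addition to, the environment variable.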
Thanks, Perry From magnus at hetland.org Mon Jan 6 23:05:02 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Jan 6 23:05:02 2003 Subject: [Numpy-discussion] package vs module In-Reply-To: References: Message-ID: <20030107070426.GC4884@idi.ntnu.no> Perry Greenfield : > > Back in December the issue of whether numarray should be a package > or set of modules came up. When I asked about the possibility > of making numarray a package (on the scipy mailing list but I > can't seem to find the thread where it was discussed), I got > only positive comments. The issue needs to be raised here also. > > Is there any objection to making numarray package based? I think this seems like a very good and natural thing to do. (Maybe names like RandomArray2 etc. can be changed too, now... :) -- Magnus Lie Hetland http://hetland.org From pearu at cens.ioc.ee Tue Jan 7 02:22:03 2003 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Tue Jan 7 02:22:03 2003 Subject: [Numpy-discussion] package vs module In-Reply-To: Message-ID: On Mon, 6 Jan 2003, Perry Greenfield wrote: > The main disadvantages I see so far are: > > 1) One will either have to change import statements in old code > to match the new style (a pain, but generally changing imports > is not terribly difficult since they are easy to identify) or > explicitly add the path to each 3rd party module to Python > Path (or some equivalent). > 2) If numarray were accepted into the Python Standard Library, it > would be the first case (as far as I can tell) of a standard > library package where we would expect to add sub modules to > it (e.g., FFT)). Normally these would not be distributed with > the standard library, so some general mechanism will be needed > to allow numarray to find 3rd party packages outside of the > Python directory structure. For example, I don't think we can > require having people install FFT in the Standard Library > directory structure after Python is installed. 
> Rather, we would probably have numarray look for extension modules
> in a standard named site-packages directory (or site-numarray?) or
> otherwise check a numarraypath environmental variable so that
> import numarray.FFT works properly. Perhaps others have ideas
> about how to best handle this.
>
> Any other issues being overlooked?

There is one, though not so critical at this point, but I will raise it anyway. In summary, I am +1 for making numarray a package.

The issue is related to import time and memory usage: more extension modules in a package increase both of them, even if users have no intention of using these modules. On slower machines this may cause inconveniences, especially in applications that call Python multiple times for short tasks containing numarray operations.

Let me repeat: currently this is not a problem with Numeric (because it never imports its extension modules), and it will not be one with numarray until numarray contains a number of extension modules that presumably are not small.

For a realistic example of this issue, consider Scipy (as a sort of upper bound on what numarray may become one day). Scipy contains a linalg module that is an (almost complete) wrapper to the ATLAS/BLAS/LAPACK libraries, and therefore importing the corresponding extension modules can be both time and memory consuming. For example, importing scipy into Python may take 2-5 seconds on a PII 400MHz, mainly because of loading the linalg extension modules. This time may be annoying for small but frequent tasks.

I wish Python's import mechanism were a bit smarter or lazier in loading extension modules that are never used...
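The "lazier" loading wished for above can be approximated in user code with a small proxy that defers the real import until the first attribute access. This is only a sketch, not something Numeric, numarray or Scipy provide; the class name LazyModule is hypothetical, and a standard-library module stands in for an expensive extension module.

```python
import importlib


class LazyModule(object):
    """Stand-in for a module that is imported on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None   # nothing has been imported yet

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself,
        # i.e. for everything the real module is expected to provide.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# "colorsys" plays the role of a heavy sub-module such as a linalg wrapper.
heavy = LazyModule("colorsys")
loaded_before_use = heavy._module is not None   # False: no import cost paid yet
hue = heavy.rgb_to_hsv(1.0, 0.0, 0.0)[0]        # the import happens here
loaded_after_use = heavy._module is not None    # True
```

A package __init__ could expose its costly sub-modules through such proxies, so short scripts that never touch them do not pay for loading them.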
Pearu

From falted at openlc.org Tue Jan 7 03:31:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 7 03:31:07 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID: <20030107113009.GA2445@openlc.org>

On Mon, Jan 06, 2003 at 07:29:15PM -0500, Perry Greenfield wrote:

> The main disadvantages I see so far are:
>
> 1) One will either have to change import statements in old code
> to match the new style (a pain, but generally changing imports
> is not terribly difficult since they are easy to identify) or
> explicitly add the path to each 3rd party module to Python
> Path (or some equivalent).

I think this should be regarded as a minor annoyance compared with the advantages of making numarray a package. In addition, the introduction of numarray as a substitute for Numeric can justify some recoding of existing applications.

> 2) If numarray were accepted into the Python Standard Library, it
> would be the first case (as far as I can tell) of a standard
> library package where we would expect to add sub modules to
> it (e.g., FFT)). Normally these would not be distributed with
> the standard library, so some general mechanism will be needed
> to allow numarray to find 3rd party packages outside of the
> Python directory structure. For example, I don't think we can
> require having people install FFT in the Standard Library
> directory structure after Python is installed. Rather, we would
> probably have numarray look for extension modules in a standard
> named site-packages directory (or site-numarray?) or otherwise
> check a numarraypath environmental variable so that
> import numarray.FFT works properly. Perhaps others have ideas
> about how to best handle this.
>

Great. I would be glad to see a package containing the numarray kernel, so that applications can use its core features, together with a mechanism to add 3rd party packages.
In particular, having something similar to site-numarray to install these packages could be quite neat. In fact, I was considering including a subset of numarray in the PyTables package (it only needs the numarray core functionality), but if this reorganization takes place, I would not need to do that anymore.

> Any other issues being overlooked?

Yeah. In case you decide to break numarray into several modules, what would be the granularity of the separation? My preference is a reduced core with basic functionality (to maximize the chances of being included in the Python Standard Library, but also to allow an easy entry for people who may wish to use this functionality) and then different, small, 3rd party packages, but perhaps this is also the most laborious solution.

--
Francesc Alted PGP KeyID: 0x61C8C11F

From hinsen at cnrs-orleans.fr Tue Jan 7 03:32:03 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Jan 7 03:32:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID:

Perry Greenfield writes:

> Back in December the issue of whether numarray should be a package
> potentially reduces name collision issues in general. Most feel
> it is a cleaner way to organize the software (at least based on
> the feedback so far).

I agree. We have discussed converting NumPy into a package a few times in the past; the major argument against it was compatibility issues. Numarray will require some changes to import statements anyway, so this seems the right time to make the change.

> 2) If numarray were accepted into the Python Standard Library, it
> would be the first case (as far as I can tell) of a standard
> library package where we would expect to add sub modules to
> it (e.g., FFT)). Normally these would not be distributed with
> the standard library, so some general mechanism will be needed
> to allow numarray to find 3rd party packages outside of the
> Python directory structure.
For example, I don't think we can If you plan to unbundle FFT etc. from numarray, then I would prefer a different naming scheme: numarray being just numarray, and some other package name grouping together the other modules. That is not only a question of installation, but also of general maintenance and of clarity for users. I see the Python package system as a tree: everything inside a package belongs together, is distributed together and is maintained by the same people. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Tue Jan 7 09:25:06 2003 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jan 7 09:25:06 2003 Subject: [Numpy-discussion] package vs module In-Reply-To: <20030107113009.GA2445@openlc.org> Message-ID: <3E0D027100007B17@mta8.wss.scd.yahoo.com> 1. I favor the package approach. 2. I don't care if FFT is numarray.FFT or numpy.FFT (i.e., in a separate place). However, see (3). 3. Extensions built with one version of Python/numarray may not work with a different version. This means the safer approach is to have all addons inside the same directory, so that you can blow away just one directory and be sure that no 'old' packages remain. Some new stuff being put into Python also envisions being able to add various zipped files to the Python path as places to be searched. Perhaps this represents a packaging opportunity. I haven't paid enough attention to be sure. While we are on the subject of packaging, the current distribution places all sorts of extraneous test and installation-related files in the Lib directory. 
This makes it harder to work with the source when you are new to it. From tim.hochberg at ieee.org Tue Jan 7 09:35:17 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Tue Jan 7 09:35:17 2003 Subject: [Numpy-discussion] package vs module In-Reply-To: References: Message-ID: <3E1B0FAF.7020607@ieee.org> Pearu Peterson wrote: >On Mon, 6 Jan 2003, Perry Greenfield wrote: > > > >>The main disadvantages I see so far are: >> >>1) One will either have to change import statements in old code >> to match the new style (a pain, but generally changing imports >> is not terribly difficult since they are easy to identify) or >> explicitly add the path to each 3rd party module to Python >> Path (or some equivalent). >>2) If numarray were accepted into the Python Standard Library, it >> would be the first case (as far as I can tell) of a standard >> library package where we would expect to add sub modules to >> it (e.g., FFT)). Normally these would not be distributed with >> the standard library, so some general mechanism will be needed >> to allow numarray to find 3rd party packages outside of the >> Python directory structure. For example, I don't think we can >> require having people install FFT in the Standard Library >> directory structure after Python is installed. Rather, we would >> probably have numarray look for extension modules in a standard >> named site-packages directory (or site-numarray?) or otherwise >> check a numarraypath environmental variable so that >> import numarray.FFT works properly. Perhaps others have ideas >> about how to best handle this. >> >>Any other issues being overlooked? >> >> > >There is one, though not so critical at this point but I will raise >it anyway. In summary, I am +1 for making numarray a package. > >The issue is releated to import time and memory usage: more extension >modules in a package increase both of them, even if users have no >indention to use these modules. 
On slower machines this may cause >inconvinieces, especially in applications that call Python multiple times >for short tasks containing numarray operation. > > That's not right, is it? I'm pretty certain that submodules in a package are not loaded until explicitly imported. I'm not sure why SciPy is slow, maybe the __init__ imports everything? I don't have a copy here so I can't check right now. In any event I'm +1 for putting it in a package unless it interferes with it getting into the core. As Paul mentioned keeping it in a zip archive would be even cooler once that's an option. -tim From falted at openlc.org Wed Jan 8 13:27:06 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Jan 8 13:27:06 2003 Subject: [Numpy-discussion] some recarray rework Message-ID: <20030108212648.GA1309@openlc.org> Hi, In the context of optimizing the PyTables support for numarray and recarray objects I have been playing with recarray module, and ended with a somewhat improved version of it. Roughly, the modifications done are: - Addition of a cache to quickly access the columns (numarrays) in recarrays. This object is a map (dictionary) where keys are the name fields and values are the pointers to columns regarded as numarrays entities. This dictionary is accessible through the new attribute "_fields". - Addition of an attribute for recarray objects named "_record" which points to a special object ("Record2" class) and that it is aware of the "_fields" cache. It that can be used to access the different rows in recarray objects in an efficient way. - The "_record" object is callable (it defines the "__call__" method) so as to select the recarray row that is active during access to the different fields. Advantages - Access to rows and columns (fields) in recarray objects are one order of magnitude faster (!). - The new "_fields" and "_record" attributes provides convenient and intuitive ways to access the information in recarrays. 
- The "_record" attribute supports the "__getattr__" and "__setattr__" methods, which are very convenient for accessing fields in a row.

Drawbacks

- The "_record" attribute always points to the same object, and you must pass it the row on which you want to operate. So, if you want two different objects pointing to different rows, you can't use the "_record" attribute to get them (but you can still use the existing Record class by calling the "__getitem__" method of a recarray object).

- Two new attributes are added to the already large number of recarray variables. However, these new variables have no special space requirements: the "_record" object has only three scalar variables, and "_fields" is a dictionary with as many entries as there are fields in the recarray, which should not be a large number.

I'm attaching this modified version as well as a testbed program in order to test the new access methods and improved performance. The output of this program, run on a 2 GHz Pentium 4 machine, is also included.

Feel free to play with it and/or take/adapt the parts you consider better suited to the recarray module.
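To make the "_fields" cache and the callable "_record" object described above concrete, here is a small pure-Python sketch of the idea. It is not the attached code: plain lists stand in for the column numarrays, and the class name RecordCursor is hypothetical.

```python
class RecordCursor(object):
    """Single reusable row view over a dict mapping field names to columns."""

    def __init__(self, fields):
        # Use object.__setattr__ so our own __setattr__ (which writes into
        # the columns) is bypassed for internal state.
        object.__setattr__(self, "_fields", fields)
        object.__setattr__(self, "_row", 0)

    def __call__(self, row):
        # Select the active row and return the same object again.
        object.__setattr__(self, "_row", row)
        return self

    def __getattr__(self, name):
        # Reached only for names that are not real attributes: field names.
        try:
            return self._fields[name][self._row]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self._fields:
            raise AttributeError(name)
        self._fields[name][self._row] = value


# The column cache: one entry per field, looked up once by name.
fields = {"col1": [456, 2], "col2": ["dbe", "de"]}
record = RecordCursor(fields)

second_value = record(1).col1      # read field "col1" of row 1
record(0).col1 = 7                 # write field "col1" of row 0
same_object = record(0) is record(1)
```

The last line shows the drawback listed above: every call returns the same object, so two names cannot refer to two different rows at the same time.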
--
Francesc Alted PGP KeyID: 0x61C8C11F
-------------- next part --------------
import numarray as num
import ndarray as mda
import memory
import chararray
import sys, copy, os, re, types, string

__version__ = '1.0'

class Char:
    """ data type Char class"""
    bytes = 1
    def __repr__(self):
        return "CharType"

CharType = Char()

# translation table to the num data types
numfmt = {'i1':num.Int8, 'u1':num.UInt8, 'i2':num.Int16, 'i4':num.Int32,
          'i8':num.Int64, 'f4':num.Float32, 'f8':num.Float64,
          'l':num.Bool, 'b':num.Int8, 'u':num.UInt8, 's':num.Int16,
          'i':num.Int32, 'N':num.Int64, 'f':num.Float32, 'd':num.Float64,
          'r':num.Float32, 'a':CharType,
          'Int8':num.Int8, 'Int16':num.Int16, 'Int32':num.Int32,
          'Int64':num.Int64, 'UInt8':num.UInt8, 'Float32':num.Float32,
          'Float64':num.Float64, 'Bool':num.Bool}

# the reverse translation table of the above (for numarray only)
revfmt = {num.Int16:'s', num.Int32:'i', num.Int64:'N', num.Float32:'r',
          num.Float64:'d', num.Bool:'l', num.Int8:'b', num.UInt8:'u',
          CharType:'a'}

# TFORM regular expression
format_re = re.compile(r'(?P<repeat>^[0-9]*)(?P<dtype>[A-Za-z0-9.]+)')

def fromrecords (recList, formats=None, names=None):
    """ create a Record Array from a list of records in text form

        The data in the same field can be heterogeneous, they will be promoted
        to the highest data type. This method is intended for creating
        smaller record arrays. If used to create large array e.g.

        r=recarray.fromrecords([[2,3.,'abc']]*100000)

        it is slow.
>>> r=fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3') >>> print r[0] (456, 'dbe', 1.2) >>> r.field('col1') array([456, 2]) >>> r.field('col2') CharArray(['dbe', 'de']) >>> import cPickle >>> print cPickle.loads(cPickle.dumps(r)) RecArray[ (456, 'dbe', 1.2), (2, 'de', 1.3) ] """ _shape = len(recList) _nfields = len(recList[0]) for _rec in recList: if len(_rec) != _nfields: raise ValueError, "inconsistent number of objects in each record" arrlist = [0]*_nfields for col in range(_nfields): tmp = [0]*_shape for row in range(_shape): tmp[row] = recList[row][col] try: arrlist[col] = num.array(tmp) except: try: arrlist[col] = chararray.array(tmp) except: raise ValueError, "inconsistent data at row %d,field %d" % (row, col) _array = fromarrays(arrlist, formats=formats, names=names) del arrlist del tmp return _array def fromarrays (arrayList, formats=None, names=None): """ create a Record Array from a list of num/char arrays >>> x1=num.array([1,2,3,4]) >>> x2=chararray.array(['a','dd','xyz','12']) >>> x3=num.array([1.1,2,3,4]) >>> r=fromarrays([x1,x2,x3],names='a,b,c') >>> print r[1] (2, 'dd', 2.0) >>> x1[1]=34 >>> r.field('a') array([1, 2, 3, 4]) """ _shape = len(arrayList[0]) if formats == None: # go through each object in the list to see if it is a numarray or # chararray and determine the formats formats = '' for obj in arrayList: if isinstance(obj, chararray.CharArray): formats += `obj._itemsize` + 'a,' elif isinstance(obj, num.NumArray): if len(obj._shape) == 1: _repeat = '' elif len(obj._shape) == 2: _repeat = `obj._shape[1]` else: raise ValueError, "doesn't support numarray more than 2-D" formats += _repeat + revfmt[obj._type] + ',' else: raise ValueError, "item in the array list must be numarray or chararray" formats=formats[:-1] for obj in arrayList: if len(obj) != _shape: raise ValueError, "array has different lengths" _array = RecArray(None, formats=formats, shape=_shape, names=names) # populate the record array (make a copy) for i in 
range(len(arrayList)): try: _array.field(_array._names[i])[:] = arrayList[i] except: print "Incorrect CharArray format %s, copy unsuccessful." % _array._formats[i] return _array def fromstring (datastring, formats, shape=0, names=None): """ create a Record Array from binary data contained in a string""" _array = RecArray(chararray._stringToBuffer(datastring), formats, shape, names) if mda.product(_array._shape)*_array._itemsize > len(datastring): raise ValueError("Insufficient input data.") else: return _array def fromfile(file, formats, shape=-1, names=None): """Create an array from binary file data If file is a string then that file is opened, else it is assumed to be a file object. No options at the moment, all file positioning must be done prior to this function call with a file object >>> import testdata, sys >>> fd=open(testdata.filename) >>> fd.seek(2880*2) >>> r=fromfile(fd, formats='d,i,5a', shape=3) >>> r._byteorder = "big" >>> print r[0] (5.1000000000000005, 61, 'abcde') >>> r._shape (3,) """ if isinstance(shape, types.IntType) or isinstance(shape, types.LongType): shape = (shape,) name = 0 if isinstance(file, types.StringType): name = 1 file = open(file, 'rb') size = os.path.getsize(file.name) - file.tell() dummy = array(None, formats=formats, shape=0) itemsize = dummy._itemsize if shape and itemsize: shapesize = mda.product(shape)*itemsize if shapesize < 0: shape = list(shape) shape[ shape.index(-1) ] = size / -shapesize shape = tuple(shape) nbytes = mda.product(shape)*itemsize if nbytes > size: raise ValueError( "Not enough bytes left in file for specified shape and type") # create the array _array = RecArray(None, formats=formats, shape=shape, names=names) nbytesread = memory.file_readinto(file, _array._data) if nbytesread != nbytes: raise IOError("Didn't read as many bytes as expected") if name: file.close() return _array # The test below was factored out of "array" due to platform specific # floating point formatted results: e+020 vs. 
# e+20
if sys.platform == "win32":
    _fnumber = "2.5984589414244182e+020"
else:
    _fnumber = "2.5984589414244182e+20"

__test__ = {}
__test__["array_platform_test_workaround"] = """
>>> r=array('a'*200,'r,3s,5a,i',3)
>>> print r[0]
(%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
>>> print r[1]
(%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
""" % globals()
del _fnumber

def array(buffer=None, formats=None, shape=0, names=None):
    """This function creates a new instance of a RecArray.

    buffer      specifies the source of the array's initialization data.
                buffer can be: RecArray, list of records in text, list of
                numarray/chararray, None, string, buffer.

    formats     specifies the format definitions of the array's records.

    shape       specifies the array dimensions.

    names       specifies the field names.

    >>> r=array([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r=array('a'*200,'r,3i,5a,s',3)
    >>> r._bytestride
    23
    >>> r._names
    ['c1', 'c2', 'c3', 'c4']
    >>> r._repeats
    [1, 3, 5, 1]
    >>> r._shape
    (3,)
    """

    if (buffer is None) and (formats is None):
        raise ValueError("Must define formats if buffer=None")
    elif buffer is None or isinstance(buffer, types.BufferType):
        return RecArray(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.StringType):
        return fromstring(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.ListType) or isinstance(buffer, types.TupleType):
        if isinstance(buffer[0], num.NumArray) or isinstance(buffer[0], chararray.CharArray):
            return fromarrays(buffer, formats=formats, names=names)
        else:
            return fromrecords(buffer, formats=formats, names=names)
    elif isinstance(buffer, RecArray):
        return buffer.copy()
    elif isinstance(buffer, types.FileType):
        return fromfile(buffer, formats=formats, shape=shape, names=names)
    else:
        raise ValueError("Unknown input type")

def _RecGetType(name):
    """Converts a type repr string into a type."""
    if name == "CharType":
        return CharType
    else:
        return num._getType(name)

class RecArray(mda.NDArray):
    """Record Array Class"""

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=1):

        # names and formats can be either a string with components separated
        # by commas or a list of string values, e.g. ['i4', 'f4'] and 'i4,f4'
        # are equivalent formats

        self._parseFormats(formats)
        self._fieldNames(names)

        itemsize = self._stops[-1] + 1

        if shape != None:
            if type(shape) in [types.IntType, types.LongType]:
                shape = (shape,)
            elif (type(shape) == types.TupleType and
                  type(shape[0]) in [types.IntType, types.LongType]):
                pass
            else:
                raise NameError, "Illegal shape %s" % `shape`

        #XXX need to check shape*itemsize == len(buffer)?

        self._shape = shape
        mda.NDArray.__init__(self, self._shape, itemsize, buffer=buffer,
                             byteoffset=byteoffset,
                             bytestride=bytestride,
                             aligned=aligned)
        self._byteorder = byteorder

        # Build the column arrays
        self._fields = self._get_fields()

        # Associate a record object for accessing values in each row
        # in an efficient way (i.e.
        # without creating a new object each time)
        self._record = Record2(self)

    def _parseFormats(self, formats):
        """ Parse the field formats """

        if (type(formats) in [types.ListType, types.TupleType]):
            _fmt = formats[:]   ### make a copy
        elif (type(formats) == types.StringType):
            _fmt = string.split(formats, ',')
        else:
            raise NameError, "illegal input formats %s" % `formats`

        self._nfields = len(_fmt)
        self._repeats = [1] * self._nfields
        self._sizes = [0] * self._nfields
        self._stops = [0] * self._nfields

        # preserve the input for future reference
        self._formats = [''] * self._nfields

        sum = 0
        for i in range(self._nfields):

            # parse the formats into repeats and formats
            try:
                (_repeat, _dtype) = format_re.match(string.strip(_fmt[i])).groups()
            except: print 'format %s is not recognized' % _fmt[i]

            if _repeat == '': _repeat = 1
            else: _repeat = eval(_repeat)
            _fmt[i] = numfmt[_dtype]
            self._repeats[i] = _repeat

            self._sizes[i] = _fmt[i].bytes * _repeat
            sum += self._sizes[i]
            self._stops[i] = sum - 1

            # Unify the appearance of _format, independent of input formats
            self._formats[i] = `_repeat`+revfmt[_fmt[i]]

        self._fmt = _fmt

    def __getstate__(self):
        """returns pickled state dictionary for RecArray"""
        state = mda.NDArray.__getstate__(self)
        state["_fmt"] = map(repr, self._fmt)
        return state

    def __setstate__(self, state):
        mda.NDArray.__setstate__(self, state)
        self._fmt = map(_RecGetType, state["_fmt"])

    def _fieldNames(self, names=None):
        """convert input field names into a list and assign to the _names
        attribute """

        if (names):
            if (type(names) in [types.ListType, types.TupleType]):
                pass
            elif (type(names) == types.StringType):
                names = string.split(names, ',')
            else:
                raise NameError, "illegal input names %s" % `names`

            self._names = map(lambda n:string.strip(n), names)
        else:
            self._names = []

        # if the names are not specified, they will be assigned as "c1, c2,..."
        # if not enough names are specified, they will be assigned as "c[n+1],
        # c[n+2],..." etc. where n is the number of specified names..."
        self._names += map(lambda i: 'c'+`i`,
                           range(len(self._names)+1,self._nfields+1))

    def _get_fields(self):
        """ get a dictionary with fields as numeric arrays """

        # Iterate over all the fields
        fields = {}
        for fieldName in self._names:
            # determine the offset within the record
            indx = index_of(self._names, fieldName)
            _start = self._stops[indx] - self._sizes[indx] + 1

            _shape = self._shape
            _type = self._fmt[indx]
            _buffer = self._data
            _offset = self._byteoffset + _start

            # don't use self._itemsize due to possible slicing
            _stride = self._strides[0]

            _order = self._byteorder

            if isinstance(_type, Char):
                arr = chararray.CharArray(buffer=_buffer, shape=_shape,
                                          itemsize=self._repeats[indx],
                                          byteoffset=_offset,
                                          bytestride=_stride)
            else:
                arr = num.NumArray(shape=_shape, type=_type, buffer=_buffer,
                                   byteoffset=_offset, bytestride=_stride,
                                   byteorder = _order)

                # modify the _shape and _strides for array elements
                if (self._repeats[indx] > 1):
                    arr._shape = self._shape + (self._repeats[indx],)
                    arr._strides = (self._strides[0], _type.bytes)

            # Put this array as a value in dictionary
            fields[fieldName] = arr

        return fields

    def field(self, fieldName):
        """ get the field data as a numeric array """
        return self._fields[fieldName]

    def info(self):
        """display instance's attributes (except _data)"""
        _attrList = dir(self)
        _attrList.remove('_data')
        _attrList.remove('_fmt')
        for attr in _attrList:
            print '%s = %s' % (attr, getattr(self,attr))

    def __str__(self):
        outstr = 'RecArray[ \n'
        for i in self:
            outstr += Record.__str__(i) + ',\n'
        return outstr[:-2] + '\n]'

    ### The following __getitem__ is not in the requirements
    ### and is here for experimental purposes
    def __getitem__(self, key):
        if type(key) == types.TupleType:
            if len(key) == 1:
                return mda.NDArray.__getitem__(self,key[0])
            elif len(key) == 2 and type(key[1]) == types.StringType:
                return mda.NDArray.__getitem__(self,key[0]).field(key[1])
            else:
                raise NameError, "Illegal key %s" % `key`
        return mda.NDArray.__getitem__(self,key)

    def _getitem(self, key):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        return Record(self, row)

    def _setitem(self, key, value):
        byteoffset = self._getByteOffset(key)
        row = (byteoffset - self._byteoffset) / self._strides[0]
        for i in range(self._nfields):
            self.field(self._names[i])[row] = value.field(self._names[i])

    def reshape(*value):
        print "Cannot reshape record array."

class Record2:
    """Record2 Class

    This class is similar to Record except for the fact that it is
    created and associated with a recarray at creation time. When
    speed in traversing the recarray is required this approach is more
    convenient than creating a new Record object for each row that is
    visited.

    """

    def __init__(self, input):

        self.__dict__["_array"] = input
        self.__dict__["_fields"] = input._fields
        self.__dict__["_row"] = 0

    def __call__(self, row):
        """ set the row for this record object """

        if row < self._array.shape[0]:
            self.__dict__["_row"] = row
            return self
        else:
            return None

    def __getattr__(self, fieldName):
        """ get the field data of the record"""

        try:
            return self._fields[fieldName][self._row]
        except:
            (type, value, traceback) = sys.exc_info()
            raise AttributeError, "Error accessing \"%s\" attr.\n %s" % \
                  (fieldName, "Error was: \"%s: %s\"" % (type,value))

    def __setattr__(self, fieldName, value):
        """ set the field data of the record"""

        self._fields[fieldName][self._row] = value

    def __str__(self):
        """ represent the record as a string """

        outlist = []
        for name in self._array._names:
            outlist.append(`self._fields[name][self._row]`)
        return "(" + ", ".join(outlist) + ")"

class Record:
    """Record Class"""

    def __init__(self, input, row=0):
        if isinstance(input, types.ListType) or isinstance(input, types.TupleType):
            input = fromrecords([input])
        if isinstance(input, RecArray):
            self.array = input
            self.row = row

    def __getattr__(self, fieldName):
        """ get the field data of the record"""

        #return self.array.field(fieldName)[self.row]
        if fieldName in self.array._names:
            #return self.array.field(fieldName)[self.row]
            return self.array._fields[fieldName][self.row]

    def field(self, fieldName):
        """ get the field data of the record"""

        #return self.array.field(fieldName)[self.row]
        return self.array.field(fieldName)[self.row]

    def __str__(self):
        outstr = '('
        #for i in range(self.array._nfields):
        #    print self.array.field(i)[self.row]
        for name in self.array._names:
            #print self.array.field(name)[self.row]
            #print self.array._fields[name][self.row]
            ### this is not efficient, need to know how to convert N-bytes to each data type
            outstr += `self.array.field(name)[self.row]` + ', '
        return outstr[:-2] + ')'

def index_of(nameList, key):
    """ Get the index of the key in the name list.

    The key can be an integer or string.  If integer, it is the index
    in the list.  If string, the name matching will be case-insensitive and
    trailing blank-insensitive.
    """
    if (type(key) in [types.IntType, types.LongType]):
        indx = key
    elif (type(key) == types.StringType):
        _names = nameList[:]
        for i in range(len(_names)):
            _names[i] = string.lower(_names[i])
        try:
            indx = _names.index(string.strip(string.lower(key)))
        except:
            raise NameError, "Key %s does not exist" % key
    else:
        raise NameError, "Illegal key %s" % `key`

    return indx

def find_duplicate (list):
    """Find duplication in a list, return a list of duplicated elements"""
    dup = []
    for i in range(len(list)):
        if (list[i] in list[i+1:]):
            if (list[i] not in dup):
                dup.append(list[i])
    return dup

def test():
    import doctest, recarray
    return doctest.testmod(recarray)

if __name__ == "__main__":
    test()
-------------- next part --------------
import sys, time
import numarray as num
import chararray
import recarray
import recarray2  # This is my modified version

usage = \
"""usage: %s recordlength
     Set recordlength to 1000 at least to obtain decent figures!
""" % sys.argv[0]

try:
    reclen = int(sys.argv[1])
except:
    print usage
    sys.exit()

delta = 0.000001

# Creation of recarrays objects for test
x1=num.array(num.arange(reclen))
x2=chararray.array(None, itemsize=7, shape=reclen)
x3=num.array(num.arange(reclen,reclen*3,2), num.Float64)
r1=recarray.fromarrays([x1,x2,x3],names='a,b,c')
r2=recarray2.fromarrays([x1,x2,x3],names='a,b,c')

print "recarray shape in test ==>", r2.shape

print "Assignment in recarray modified"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)  # select the row to be changed
    #rec.b = "changed"     # change the "b" field
    rec.c = float(row**2)  # Change the "c" field
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print "Field b on row 2 after re-assign:", r2.field("c")[2]
print

print "Assignment in recarray original"
print "-------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    #r1.field("b")[row] = "changed"
    r1.field("c")[row] = float(row**2)
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print "Field b on row 2 after re-assign:", r1.field("c")[2]
print

print "Selection in recarray modified"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r2._record(row)
    if rec.a < 3:
        print "This record pass the cut ==>", rec.c, "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
print

print "Selection in recarray original"
print "------------------------------"
t1 = time.clock()
for row in xrange(reclen):
    rec = r1[row]
    if rec.field("a") < 3:
        print "This record pass the cut ==>", rec.field("c"), "(row", row, ")"
t2 = time.clock()
ttime = round(t2-t1, 3)
print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta))
-------------- next part --------------
recarray shape in test ==> (10000,)
Assignment in recarray modified
-------------------------------
Assign time: 0.15  Rows/s: 66666
Field b on row 2 after re-assign: 4.0

Assignment in recarray original
-------------------------------
Assign time: 1.24  Rows/s: 8064
Field b on row 2 after re-assign: 4.0

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.18  Rows/s: 55555

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.52  Rows/s: 6578

From falted at openlc.org  Fri Jan 10 09:17:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 10 09:17:05 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
Message-ID: <200301101813.41407.falted@openlc.org>

Hi,

I think there are some data types missing in the recarray module. I can
create recarrays using the fromarrays function with no problems except if
I use UInt16, UInt32 and UInt64.

As these types are well supported by numarray, is there any reason why
they don't appear in the numfmt and revfmt mappings in the recarray
module? Is it safe to add them by hand in the source?

Thanks,

-- 
Francesc Alted

From perry at stsci.edu  Fri Jan 10 10:37:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 10 10:37:02 2003
Subject: [Numpy-discussion] Some datatypes missing in numarray recarray?
In-Reply-To: <200301101813.41407.falted@openlc.org>
Message-ID:

> Hi,
>
> I think there are some data types missing in the recarray module. I can
> create recarrays using the fromarrays function with no problems except
> if I use UInt16, UInt32 and UInt64.
>
> As these types are well supported by numarray, is there any reason why
> they don't appear in the numfmt and revfmt mappings in the recarray
> module? Is it safe to add them by hand in the source?
>
> Thanks,
>
> --
> Francesc Alted
>
Good point. We were using this for an I/O library that didn't use
these types, so that's why they didn't get in there originally.
But you are right, they should be. Do you want to make the changes?

Thanks, Perry

From costas at malamas.com  Sat Jan 11 01:12:03 2003
From: costas at malamas.com (Costas Malamas)
Date: Sat Jan 11 01:12:03 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
Message-ID: <000701c2b951$74d59880$6e00a8c0@retek.int>

Hello all,

I have been trying to find a package/addon that will provide a sparse
array class to NumPy, or will at least trick NumPy into using a sparse
array as a regular array, to no avail.

By sparse array here, I do not mean a sparse matrix equation solver, but
an array class that accepts a "default value". In other words, I would
like to instantiate a 1000x1000x1000 (1e9) array that will have at most
5-10% populated (i.e. non-zero) elements. The current NumPy will
instantiate the entire 1e9 array, which is a non-starter if you would
like to calculate an expression with, say, 4-5 arrays. Instead, I'd like
a class that will only store the populated cells, and return the default
value for the others (ideally, by doing some smart disk I/O to preserve
memory).

I've tried SciPy, Scientific Python, and a few other modules floating
around; none seem to do the trick, yet I can't help but think that this
is not an uncommon setup for a lot of problem domains. Is there a package
out there? If there isn't, where should I start looking to create one?
From their description I think SparseLib++ at least would be a good
starting point as a base library.

As a secondary issue, is anyone aware of a package that can handle
storage of such arrays? netCDF and HDF do not seem to fit the bill; a
B-Tree library seems a more natural fit...
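
The "default value" behaviour described above can be sketched in a few lines of pure Python. This is only a minimal illustration, assuming a dictionary keyed by index tuples is acceptable storage; the class name `SparseArray` and its `populated` method are hypothetical, not part of NumPy, SciPy or numarray, and no arithmetic, slicing or disk I/O is attempted.

```python
class SparseArray(object):
    """N-D array that stores only the cells differing from a default."""

    def __init__(self, shape, default=0.0):
        self.shape = tuple(shape)
        self.default = default
        self._cells = {}            # index tuple -> stored value

    def _key(self, index):
        # A single integer index arrives bare; normalise it to a tuple.
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index %r out of bounds" % (index,))
        return index

    def __getitem__(self, index):
        # Unpopulated cells fall back to the default value.
        return self._cells.get(self._key(index), self.default)

    def __setitem__(self, index, value):
        key = self._key(index)
        if value == self.default:
            # Writing the default frees the cell instead of storing it.
            self._cells.pop(key, None)
        else:
            self._cells[key] = value

    def populated(self):
        """Number of cells actually held in memory."""
        return len(self._cells)


a = SparseArray((1000, 1000, 1000))   # no 1e9-element allocation happens
a[1, 2, 3] = 5.0
```

Memory use grows with the number of populated cells rather than with the nominal shape, at the cost of slow element-wise access compared with a dense array.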

Thanks in advance -- any and all input appreciated,

Costas

From ehagemann at comcast.net  Sun Jan 12 15:14:06 2003
From: ehagemann at comcast.net (eric hagemann)
Date: Sun Jan 12 15:14:06 2003
Subject: [Numpy-discussion] questions about array types
Message-ID: <003c01c2ba90$32d015b0$6401a8c0@eric>

Rereading the Numeric docs I see the reference to types Float, Float32,
Float64 -- which make sense. However, I am curious to understand the
usefulness of types Float0, Float8 and Float16, which all seem to be
synonyms for Float32. Was there some thinking that there would be a
converter written for 8-bit floats?

>>> from Numeric import *
>>> a = array([1,2,3,4],Float32)
>>> fromstring(a.tostring(),Float32)
array([ 1., 2., 3., 4.],'f')
>>> fromstring(a.tostring(),Float)
array([ 2.00000047, 512.00012255])   # corrupt, as would be expected
>>> fromstring(a.tostring(),Float0)  # seems to convert back as if Float0 == Float32
array([ 1., 2., 3., 4.],'f')
>>> fromstring(a.tostring(),Float8)
array([ 1., 2., 3., 4.],'f')
>>> fromstring(a.tostring(),Float16)
array([ 1., 2., 3., 4.],'f')
>>>

From oliphant at ee.byu.edu  Mon Jan 13 12:59:04 2003
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Jan 13 12:59:04 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
In-Reply-To: <000701c2b951$74d59880$6e00a8c0@retek.int>
Message-ID:

> Hello all,
>
> I have been trying to find a package/addon that will provide a sparse
> array class to NumPy, or will at least trick NumPy to use a sparse
> array as a regular array, to no avail.
>

Sparse arrays are not a common object. Sparse matrices have many, many
implementations, of which I'm sure you're aware.

What you want is a general purpose N-D array that uses some kind of
sparse storage. I'm not aware of such an object in any other language.

Most of the time people remap their particular problem so that any
sparse arrays become sparse matrices.
All of the effort is then focused on manipulating certain classes of
sparse matrices.

-Travis

From Chris.Barker at noaa.gov  Wed Jan 15 10:21:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 10:21:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
Message-ID: <3E2598CC.DAB8FD8A@noaa.gov>

Hi folks,

I use Numeric and wxPython together a lot (of course I do, I use Numeric
for everything!).

Unfortunately, since wxPython is not Numeric aware, you lose some real
potential performance advantages. For example, I'm now working on
expanding the extensions to graphics device contexts (DCs) so that you
can draw a whole bunch of objects with a single Python call. The idea is
that the looping can be done in C++, rather than Python, saving a lot of
the overhead of the loop itself, as well as the Python-wxWindows
translation step.

For drawing thousands of points, the speed-up is substantial. It's less
substantial on more complex objects (rectangles give a factor of two
improvement for ~1000 objects), due to the longer time it takes to draw
the object itself, rather than make the call.

Anyway, at the moment, Robin Dunn has the wrappers set up so that you
can pass in a NumPy array (or, indeed, any sequence) rather than a list
or tuple of coordinates, but it is faster to use a list than a NumPy
array, because for arrays, it uses the generic PySequence_GetItem call.
If we used the NumPy API directly, it should be faster than using a
list, not slower! This is how a representative section of the code looks
now:

bool isFastSeq = PyList_Check(pyPoints) || PyTuple_Check(pyPoints);
.
.
.
    // Get the point coordinates
    if (isFastSeq) {
        obj = PySequence_Fast_GET_ITEM(pyPoints, i);
    }
    else {
        obj = PySequence_GetItem(pyPoints, i);
    }
.
.
.

So you can see that if a NumPy array is passed in, PySequence_GetItem
will be used.

What I would like to do is have an isNumPyArray check, and then access
the NumPy array directly in that case.

The tricky part is that Robin does not want to have wxPython require
Numeric. (Oh how I dream of the day that NumArray becomes part of the
standard library!) How can I check if an object is a NumPy array (and
then use it as such), without including Numeric during compilation?

I know one option is to have conditional compilation, with a NumPy and
non-NumPy version, but Robin is managing a whole lot of different
versions as it is, and I don't think he wants to deal with twice as many!

Anyone have any ideas?

By the way, you can substitute NumArray for NumPy in this, as it is the
wave of the future, and particularly if it would be easier.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From paul at pfdubois.com  Wed Jan 15 10:50:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Wed Jan 15 10:50:07 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
In-Reply-To: <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>

If you could do:

    try:
        import Numeric
        haveNumeric = 1
    except:
        haveNumeric = 0

in some initialization routine, then you could use this flag.
Alternately you could test on the fly

    'Numeric' in sys.modules

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On
> Behalf Of Chris Barker
> Sent: Wednesday, January 15, 2003 9:22 AM
> Cc: Numpy-discussion
> Subject: [Numpy-discussion] Optionally using Numeric in
> another compiled extension package.
>
>
> Hi folks,
>
> I use Numeric and wxPython together a lot (of course I do, I
> use Numeric for everything!).
>
> Unfortunately, since wxPython is not Numeric aware, you lose
> some real potential performance advantages. For example, I'm
> now working on expanding the extensions to graphics device
> contexts (DCs) so that you can draw a whole bunch of objects
> with a single Python call. The idea is that the looping can
> be done in C++, rather than Python, saving a lot of the overhead
> of the loop itself, as well as the Python-wxWindows translation step.
>
> For drawing thousands of points, the speed-up is substantial.
> It's less substantial on more complex objects (rectangles
> give a factor of two improvement for ~1000 objects), due to
> the longer time it takes to draw the object itself, rather
> than make the call.
>
> Anyway, at the moment, Robin Dunn has the wrappers set up so
> that you can pass in a NumPy array (or, indeed, any sequence)
> rather than a list or tuple of coordinates, but it is faster
> to use a list than a NumPy array, because for arrays, it uses
> the generic PySequence_GetItem call. If we used the NumPy API
> directly, it should be faster than using a list, not slower!
> This is how a representative section of the code looks
> now:
>
>
> bool isFastSeq = PyList_Check(pyPoints) ||
> PyTuple_Check(pyPoints);
> .
> .
> .
>     // Get the point coordinates
>     if (isFastSeq) {
>         obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>     }
>     else {
>         obj = PySequence_GetItem(pyPoints, i);
>     }
>
> .
> .
> .
>
> So you can see that if a NumPy array is passed in,
> PySequence_GetItem will be used.
>
> What I would like to do is have an isNumPyArray check, and
> then access the NumPy array directly in that case.
>
> The tricky part is that Robin does not want to have wxPython
> require Numeric. (Oh how I dream of the day that NumArray
> becomes part of the standard library!) How can I check if an
> object is a NumPy array (and then use it as such), without
> including Numeric during compilation?
>
> I know one option is to have conditional compilation, with a
> NumPy and non-NumPy version, but Robin is managing a whole
> lot of different versions as it is, and I don't think he wants
> to deal with twice as many!
>
> Anyone have any ideas?
>
> By the way, you can substitute NumArray for NumPy in this, as
> it is the wave of the future, and particularly if it would be easier.
>
> -Chris
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> NOAA/OR&R/HAZMAT         (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>

From jmiller at stsci.edu  Wed Jan 15 10:57:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 10:57:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
 <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <3E25B253.1070108@stsci.edu>

Chris Barker wrote:

>Hi folks,
>
>I use Numeric and wxPython together a lot (of course I do, I use Numeric
>for everything!).
>
>Unfortunately, since wxPython is not Numeric aware, you lose some real
>potential performance advantages. For example, I'm now working on
>expanding the extensions to graphics device contexts (DCs) so that you
>can draw a whole bunch of objects with a single Python call.
>The idea is that the looping can be done in C++, rather than Python,
>saving a lot of the overhead of the loop itself, as well as the
>Python-wxWindows translation step.
>
>For drawing thousands of points, the speed-up is substantial. It's less
>substantial on more complex objects (rectangles give a factor of two
>improvement for ~1000 objects), due to the longer time it takes to draw
>the object itself, rather than make the call.
>
>Anyway, at the moment, Robin Dunn has the wrappers set up so that you
>can pass in a NumPy array (or, indeed, any sequence) rather than a list
>or tuple of coordinates, but it is faster to use a list than a NumPy
>array, because for arrays, it uses the generic PySequence_GetItem call.
>If we used the NumPy API directly, it should be faster than using a
>list, not slower! This is how a representative section of the code looks
>now:
>
>
>bool isFastSeq = PyList_Check(pyPoints) ||
>PyTuple_Check(pyPoints);
>.
>.
>.
>    // Get the point coordinates
>    if (isFastSeq) {
>        obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>    }
>    else {
>        obj = PySequence_GetItem(pyPoints, i);
>    }
>
>.
>.
>.
>
>So you can see that if a NumPy array is passed in, PySequence_GetItem
>will be used.
>
>What I would like to do is have an isNumPyArray check, and then access
>the NumPy array directly in that case.
>
>The tricky part is that Robin does not want to have wxPython require
>Numeric. (Oh how I dream of the day that NumArray becomes part of the
>standard library!)
>How can I check if an object is a NumPy array (and then use it as such),
>without including Numeric during compilation?
>
>I know one option is to have conditional compilation, with a NumPy and
>non-NumPy version, but Robin is managing a whole lot of different
>versions as it is, and I don't think he wants to deal with twice as many!
>
>Anyone have any ideas?
>
Use the Python C-API and string literals as the basis for the interface.
I think the steps are something like this:

1. Import "Numeric".
   (PyImport_ImportModule)

2. Get the module dictionary. (PyModule_GetDict)

3. Get "array" out of the dictionary. (PyDict_GetItemString)

4. Call "isinstance" on Numeric.array and the object.
   (PyObject_IsInstance)

Similarly:

1. Import "numarray".

2. Get the module dictionary.

3. Get "NumArray" out of the dictionary.

4. Call the C-API equivalent of "isinstance" on numarray.NumArray and
   the object.

The first 3 steps of both cases can be initialized once, I think, and
stored in C static variables to avoid repeated fetches. If any of the
first 3 steps fail, then consider that case failed and return False.

If it's not a Numeric array, check to see if it's a numarray.

>
>By the way, you can substitute NumArray for NumPy in this, as it is the
>wave of the future, and particularly if it would be easier.
>
>-Chris
>
>
Todd

From Chris.Barker at noaa.gov  Wed Jan 15 11:00:05 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 11:00:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>
Message-ID: <3E25A1E4.5CA8C453@noaa.gov>

Paul F Dubois wrote:
>
> If you could do:
>     try:
>         import Numeric
>         haveNumeric = 1
>     except:
>         haveNumeric = 0
>
> in some initialization routine, then you could use this flag.
> Alternately you could test on the fly
>     'Numeric' in sys.modules

Thanks, but I'm talking about doing this at the C++ level in an
extension package, not at the Python level. This kind of thing is so much
easier in Python, of course!

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jmiller at stsci.edu  Wed Jan 15 12:01:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:01:53 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
 extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu> <3E2598CC.DAB8FD8A@noaa.gov> <3E25B253.1070108@stsci.edu> Message-ID: <3E25C182.8080906@stsci.edu> Todd Miller wrote: > Chris Barker wrote: > >> How can I check if an Object is a NumPy array (and then use it as such), >> without including Numeric during compilation? >> >> I know one option is to have condition compilation, with a NumPy and >> non-Numpy version, but Robin is managing a whole lot of different >> version as it is, and I don't think he wants to deal with twice as many! >> >> Anyone have any ideas? >> > Use the Python C-API and string literals as the basis for the > interface. I think the steps are something like this: > > 1. Import "Numeric". (PyImport_ImportModule) > > 2. Get the module dictionary. (PyModule_GetDict) > > 3. Get "array" out of the dictionary. (PyDict_GetItemString) > > 4. Call "isinstance" on Numeric.array and the object. > (PyObject_IsInstance) > > Similarly: > > 1. Import "numarray". > > 2. Get the module dictionary. > > 3. Get "NumArray" out of the dictionary > > 4. Call the C-API equivalent of "isinstance" on numarray.NumArray and > the object. > > The first 3 steps of both cases can be initialized once, I think, and > stored in C static variables to avoid repeated fetches. On second thought, just do two functions, one for Numeric, one for numarray. If any of the first 3 steps fail, return False. Otherwise, return the result of the isinstance call. > > If it's not a Numeric array, check to see if it's a numarray. My idea to couple these was "not good". They're not compatible at that level anyway. Since numarray and Numeric are only source level compatible, C-code can be compiled to work with one or the other, but not both at the same time. It probably makes more sense to just implement for Numeric. If you do want to implement for both, treat them as seperate cases with seperate recognizer functions and element access code. But... 
It's not clear to me that knowing an object is an array will help, since getting data elements still has to be done fast, and that seems hard to do without knowing the arrayobject struct. Keep in mind that Numeric and numarray arrays are strided and possibly discontiguous, so there's more to data access than owning a base pointer, as would be the case in C.

Todd

From falted at openlc.org Wed Jan 15 12:25:27 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:25:27 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25C182.8080906@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu>
Message-ID: <200301152123.45614.falted@openlc.org>

On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
> My idea to couple these was "not good". They're not compatible at that
> level anyway.
>
> Since numarray and Numeric are only source level compatible, C-code can
> be compiled to work with one or the other, but not both at the same
> time. It probably makes more sense to just implement for Numeric. If
> you do want to implement for both, treat them as separate cases with
> separate recognizer functions and element access code.
>
> But... It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast, and that
> seems hard to do without knowing the arrayobject struct. Keep in mind
> that Numeric and numarray arrays are strided and possibly discontiguous,
> so there's more to data access than owning a base pointer, as would be
> the case in C.

I think you can use the numarray High-Level C API to overcome these difficulties.
For example, by using the calls:

PyArrayObject* NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_OutputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_IoArray(PyObject *numarray, NumarrayType t, int requires)

as documented in the User's Guide, you can get well-behaved (i.e. contiguous and well-aligned) C arrays (copying them, if needed) from either numarray or Numeric arrays if you pass C_ARRAY as the value for the requires parameter.

In fact, I'm using NA_InputArray in PyTables to manage both numarray and Numeric arrays with good results.

--
Francesc Alted

From jmiller at stsci.edu Wed Jan 15 12:40:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:40:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu> <200301152123.45614.falted@openlc.org>
Message-ID: <3E25CA79.40206@stsci.edu>

Francesc Alted wrote:

>On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
>
>>But... It's not clear to me that knowing an object is an array will
>>help since getting data elements still has to be done fast, and that
>>seems hard to do without knowing the arrayobject struct. Keep in mind
>>that Numeric and numarray arrays are strided and possibly discontiguous,
>> so there's more to data access than owning a base pointer, as would be
>>the case in C.
>>
>>
>
>I think you can use the numarray High-Level C API to overcome these
>difficulties.
>
But doesn't using the numarray C-API require a level of coupling (direct knowledge of numarray during compilation) that Chris is trying to avoid?

>
>
>
Todd

From falted at openlc.org Wed Jan 15 12:59:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:59:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25CA79.40206@stsci.edu> References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu> Message-ID: <200301152158.44234.falted@openlc.org> A Dimecres 15 Gener 2003 21:54, Todd Miller va escriure: > >I think you can use the numarray High-Level C API to overcome these > >dificulties. > > But doesn't using the numarray C-API require a level of coupling > (direct knowledge of numarray during compilation) that Chris is trying > to avoid? > Ooops!, you are right. Perhaps this kind of scenario (accessing Numeric and numarray arrays from C) would be more and more common as people is getting more aware of the numarray capabilities and want to integrate it in their extensions. That reinforces me in the belief that having a small core with the "glue" functionality between numarray objects and 3rd party extensions in C (or SWIG, Pyrex or whatever) can be a good thing (until numarray is in the Standard Library). That way, people interested in supporting numarray objects in their extensions has only to install this small core (or even include it as part of the extension). Well, speaking as non-interested and impartial person ;-) -- Francesc Alted From Chris.Barker at noaa.gov Wed Jan 15 13:50:02 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Jan 15 13:50:02 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu> <200301152158.44234.falted@openlc.org> Message-ID: <3E25C99A.9D5E1888@noaa.gov> Francesc Alted wrote: > that having a small core with the "glue" > functionality between numarray objects and 3rd party extensions in C (or > SWIG, Pyrex or whatever) can be a good thing (until numarray is in the > Standard Library). 
>
> That way, people interested in supporting numarray objects in their
> extensions have only to install this small core (or even include it as part
> of the extension).

I think that's a fabulous idea, but I have no idea how hard it would be. There would still be the problem of keeping versions in sync. If I distributed my package with the glue code, it would only work on installations using the same version of Numeric (or NumArray, I suppose).

Thanks to all who have commented on my post. These are some ideas I now have based on your comments:

> > Use the Python C-API and string literals as the basis for the
> > interface. I think the steps are something like this:
> >
> > 1. Import "Numeric". (PyImport_ImportModule)
> >
> > 2. Get the module dictionary. (PyModule_GetDict)
> >
> > 3. Get "array" out of the dictionary. (PyDict_GetItemString)
> >
> > 4. Call "isinstance" on Numeric.array and the object.
> > (PyObject_IsInstance)

OK, so now I can know, at runtime, whether Numeric has been imported.

> But... It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast, and that
> seems hard to do without knowing the arrayobject struct.

Exactly. That's my whole problem. However, I have an idea about this. If I do the above test, I can now put all the Numeric-specific code into a conditional, so it would only get called if Numeric were imported.

My idea is that I could make sure Numeric was around at compile time, so I could use all the Numeric API to access the array data, but it wouldn't have to be installed at runtime, as none of the Numeric calls would be executed if Numeric hadn't been imported. Would this work, or would the system try to load the .dll or .so or whatever even if the calls weren't executed?
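The runtime test described above (has the optional package been imported, and is this object one of its arrays?) can be sketched at the Python level. This is only an illustration under stated assumptions: the helper names `loaded_type` and `draw_points` are hypothetical, and the standard-library `array` module stands in for Numeric so that the sketch runs anywhere. A real extension would perform the equivalent lookups through the C-API calls named in the quoted steps.

```python
import sys


def loaded_type(module_name, type_name):
    """Return module.type_name if the module is already imported, else None.

    Hypothetical helper: it only consults sys.modules, so it never
    triggers an import of the optional package.
    """
    module = sys.modules.get(module_name)
    if module is None:
        return None
    candidate = getattr(module, type_name, None)
    # Only accept real classes, since the result is used with isinstance().
    return candidate if isinstance(candidate, type) else None


def draw_points(points, module_name="array", type_name="array"):
    """Take the array-specific branch only when the optional module is loaded.

    The stdlib "array" module is an assumed stand-in for Numeric here.
    """
    array_type = loaded_type(module_name, type_name)
    if array_type is not None and isinstance(points, array_type):
        return "array path"      # where array-specific access code would go
    return "sequence path"       # generic fallback for lists, tuples, etc.
```

If the user never imported the optional module, no object can be one of its arrays, so the guarded branch is simply never reached.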
All that being said, Tim Hochberg has mentioned that when he first made wxPython DCs work with Numeric arrays (sorry I didn't give him credit before, I had forgotten who did that; thanks, Tim), he did some timing and discovered that the overhead of the drawing calls was substantially larger than the overhead of the indexing anyway, so speeding up that process couldn't make much difference.

My timing indicated something different, but I'm using Linux/wxGTK/X11, and I think the drawing calls return after the message has been sent to X, but X may not have completed the actual drawing yet. This means that I'm not timing the whole process, and if I did, I might not see such a difference.

I did some tests with 100,000 points, and found that I could see the difference between a list and an array, and the list was about twice as fast. Drawing rectangles, however, I can't see the difference.

So, I think I'll probably shelve this for the moment, and concentrate on getting all the drawing shapes supported by DrawXXXList methods.

Thanks for all your input.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From gvermeul at grenoble.cnrs.fr Wed Jan 15 13:50:05 2003
From: gvermeul at grenoble.cnrs.fr (gvermeul at grenoble.cnrs.fr)
Date: Wed Jan 15 13:50:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled
Message-ID: <200301152149.h0FLn6PN032653@grenoble.cnrs.fr>

> Gerard Vermeulen wrote:
> > I just want to point out that PyQwt plots NumPy arrays. I have played
> > a little bit with the Scipy-wxWindows interface, but it is no match
> > for PyQwt (I display x-y data with 16000 points).
>
> Thanks for the tip, I'll check it out. I think what you have there is
> that the plotting is all done at the C++ level, expecting some kind of
> sequence of data points.
That's exactly what I want to adress with > wxPython: being able to pass in a whole sequence and have the looping > done at the C++ level. > Yes, I am using PyArray_ContiguousFromObject() to convert any sequence into a NumPy array before copying the data into Qwt's double arrays. > > Have you ever tested whether it's fster or slower to plot data passed in > as a list vs. a NumPy array? > I did not test it, but there is certainly more overhead if you pass a list or a tuple into PyArray_ContiguousFromObject() than a NumPy array > > How do you access the data in the passed in sequence? Do you use: > PySequence_GetItem ? > No, see above. The code looks like (in "sip" language, sip is a sort of swig, but more specialized to C++ and Qt): void setData(double *, double *, int); %MemberCode PyObject *xSeq, *ySeq; $C *ptr; if (sipParseArgs(&sipArgsParsed, sipArgs, "mOO", sipThisObj, sipClass_$C, &ptr, &xSeq, &ySeq)) { PyArrayObject *x = (PyArrayObject *) PyArray_ContiguousFromObject(xSeq, PyArray_DOUBLE, 1, 0); if (!(x)) return 0; PyArrayObject *y = (PyArrayObject *) PyArray_ContiguousFromObject(ySeq, PyArray_DOUBLE, 1, 0); if (!(y)) return 0; int size; Py_BEGIN_ALLOW_THREADS size = (x->dimensions[0] < y->dimensions[0]) ? x->dimensions[0] : y->dimensions[0]; ptr->setData((double*)(x->data), (double*)(y->data), size); Py_END_ALLOW_THREADS Py_DECREF(x); Py_DECREF(y); Py_INCREF(Py_None); return Py_None; } %End The setData calls copy the data. > > thanks for the tip. Qwt (and PyQwt) look very nice, I may have to > reconsider using PyQT! > Gerard > > -Chris > > > > > > Take a look at http://gerard.vermeulen.free.fr > > > > PyQwt is an addon for PyQt (a Python wrapper for Qt) that knows nothing > > about NumPy > > > > Maybe it is possible to make a NumPy plot add-on for wxWindows, too. 
> > > > Gerard > > > > On Wed, Jan 15, 2003 at 09:22:20AM -0800, Chris Barker wrote: > > > Hi folks, > > > > > > I use Numeric an wxPython together a lot (of course I do, I use Numeric > > > for everything!). > > > > > > Unfortunately, since wxPython is not Numeric aware, you lose some real > > > potential performance advantages. For example, I'm now working on > > > expanding the extensions to graphics device contexts (DCs) so that you > > > can draw a whole bunch of objects with a single Python call. The idea is > > > that the looping can be done in C++, rather than Python, saving a lot of > > > overhead of the loop itself, as well as the Python-wxWindows translation > > > step. > > > > > > For drawing thousands of points, the speed-up is substantial. It's less > > > substantial on more complex objects (rectangles give a factor of two > > > improvement for ~1000 objects), due to the longer time it takes to draw > > > the object itself, rather than make the call. > > > > > > Anyway, at the moment, Robin Dunn has the wrappers set up so that you > > > can pass in a NumPy array (or, indeed, and sequence) rather than a list > > > or tuple of coordinates, but it is faster to use a list than a NumPy > > > array, because for arrays, it uses the generic PySequence_GetItem call. > > > If we used the NumPy API directly, it should be faster than using a > > > list, not slower! THis is how a representative section of the code looks > > > now: > > > > > > > > > bool isFastSeq = PyList_Check(pyPoints) || > > > PyTuple_Check(pyPoints); > > > . > > > . > > > . > > > // Get the point coordinants > > > if (isFastSeq) { > > > obj = PySequence_Fast_GET_ITEM(pyPoints, i); > > > } > > > else { > > > obj = PySequence_GetItem(pyPoints, i); > > > } > > > > > > . > > > . > > > . > > > > > > So you can see that if a NumPy array is passed in, PySequence_GetItem > > > will be used. 
> > >
> > > What I would like to do is have an isNumPyArray check, and then access
> > > the NumPy array directly in that case.
> > >
> > > The tricky part is that Robin does not want to have wxPython require
> > > Numeric. (Oh how I dream of the day that NumArray becomes part of the
> > > standard library!)
> > > How can I check if an Object is a NumPy array (and then use it as such),
> > > without including Numeric during compilation?
> > >
> > > I know one option is to have condition compilation, with a NumPy and
> > > non-Numpy version, but Robin is managing a whole lot of different
> > > version as it is, and I don't think he wants to deal with twice as many!
> > >
> > > Anyone have any ideas?
> > >
> > > By the way, you can substitute NumArray for NumPy in this, as it is the
> > > wave of the future, and particularly if it would be easier.
> > >
> > > -Chris
> > >
> > >
> > > --
> > > Christopher Barker, Ph.D.
> > > Oceanographer
> > >
> > > NOAA/OR&R/HAZMAT (206) 526-6959 voice
> > > 7600 Sand Point Way NE (206) 526-6329 fax
> > > Seattle, WA 98115 (206) 526-6317 main reception
> > >
> > > Chris.Barker at noaa.gov
>
> --
> Christopher Barker, Ph.D.
> Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > ------------------------------------------------------------- This message was sent using HTTPS service from CNRS Grenoble. ---> https://grenoble.cnrs.fr <--- From Jack.Jansen at oratrix.com Wed Jan 15 14:18:05 2003 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Wed Jan 15 14:18:05 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. In-Reply-To: <3E25A1E4.5CA8C453@noaa.gov> Message-ID: <1D394963-28D7-11D7-AE69-000A27B19B96@oratrix.com> On woensdag, jan 15, 2003, at 19:01 Europe/Amsterdam, Chris Barker wrote: > Paul F Dubois wrote: >> >> If you could do: >> try: >> import Numeric >> haveNumeric = 1 >> except: >> haveNumeric = 0 >> >> in some initialization routine, then you could use this flag. >> Alternately you could test on the fly >> 'Numeric' in [m.__name__ for m in sys.modules] > > Thanks, but I'm talking about doing this at the C++ level in an > extension package, not at the Python level. This kind of thing is Soo > much easier in Python, of course! This can be done, but it is difficult, and you need the cooperation of both parties (Numeric and wxPython, in this case). The problem is that you need a way to pass C pointers from one extension module to the other. One of the pointers you want to pass is the PyTypeObject, so you can check that an object passed in from Python is of the correct type. Another is the address of some C routine that will get you a C pointer to the data. The first one may be visible from Python (so you can get at it through normal means) but the second one won't be. The dirty way to do this (and you should probably avoid this) is to put these pointers into Python integers in the supplying module, and put them in the module namespace with a funny name (__ConvertToCPointerAddress). 
In wxPython you import Numeric, and if it succeeds you look up the funny name, convert the Python integer to a C pointer, cross your fingers, and call the address. A cleaner way to do this is with cobject objects. These are in the core, in Objects/cobject.c. Numeric exports a cobject (again named __ConvertToCPointerAddress) with the address of the routine as the value. But, and this is the nice bit, cobjects can be passed along by Python code but can't be fiddled with. And cobject.c even provides a C function PyCObject_Import(char *modulename, char *attributename) which directly returns you the pointer you're looking for by importing the module, looking up the name, checking that it's a cobject and extracting the value. And it even has support for "protocols": Cobjects have an extra field called the description, again only settable and readable from C. Modules that don't know about each others' existence could still decide on a common description that would signify that the pointer in the cobject has a specific meaning. We could decide here that if the description is the C string "this pointer is a function that you pass one Python object and that returns the data just as Numeric would store it" would fit that bill, and anyone in the world writing an extension module could follow the protocol. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Jack.Jansen at oratrix.com Wed Jan 15 14:34:05 2003 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Wed Jan 15 14:34:05 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules Message-ID: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> Actually, wrt my previous message on cobjects for communicating between extension modules, we can do one better! This is an idea I've been toying with for the MacPython extension types, and I think it's applicable to Numeric too. It goes as follows. 
Each Numeric object has an attribute with a well-known name, lets call it "__Numeric_C_interface". This is a Cobject, and it is shared among all Numeric objects of the same type. The value of this C object is a pointer to a C structure with pointers to all the C routines you might want to call on the object, basically the PyArray_API structure (I think). The descr of the C object is a string with the version number of this particular PyArray_API structure. An extension module that knows about this protocol and gets passed an object that it think might be a Numeric array checks whether the object has an __Numeric_C_interface attribute. If so it retrieves it, checks that it is a Cobject, gets the descriptor and tests it for compatibility and if it is compatible gets the cobject pointer and happily calls all the Numeric routines it needs. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From falted at openlc.org Thu Jan 16 04:00:03 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Jan 16 04:00:03 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules In-Reply-To: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> Message-ID: <200301161259.13522.falted@openlc.org> A Dimecres 15 Gener 2003 23:33, Jack Jansen va escriure: > Actually, wrt my previous message on cobjects for communicating between > extension modules, we can do one better! > > This is an idea I've been toying with for the MacPython extension > types, and I think it's applicable to Numeric too. It goes as follows. > > Each Numeric object has an attribute with a well-known name, lets call > it "__Numeric_C_interface". This is a Cobject, and it is shared among > all Numeric objects of the same type. 
The value of this C object is a > pointer to a C structure with pointers to all the C routines you might > want to call on the object, basically the PyArray_API structure (I > think). The descr of the C object is a string with the version number > of this particular PyArray_API structure. > > An extension module that knows about this protocol and gets passed an > object that it think might be a Numeric array checks whether the object > has an __Numeric_C_interface attribute. If so it retrieves it, checks > that it is a Cobject, gets the descriptor and tests it for > compatibility and if it is compatible gets the cobject pointer and > happily calls all the Numeric routines it needs. That's a nice idea. But I see two drawbacks: - numarray needs to be reworked to include the Cobject descriptors, although I don't know if this would be difficult or not. - you still need to have Numeric or numarray installed on the client machine. This could be the usual case, but what about extensions that want to use Numeric internally (because a number of reasons, like better number representation, convenient interface to C, etc) without forcing the user to install it? However, designing a small library with a minimalist API (I'm thinking in something similar to zlib) could be very handy in allowing extensions (but also native python modules) to deal with numarray objects. As I said before, this would require the user to install only this small library, but it can also be included in the application or package. However, this second alternative can be tricky, as Chris Barker has signaled, because the different numarray versions coming in the future. But IMO a series of factors may alleviate this handicap: - The numarray data structure should be very stable, as improvements are normally made at the functionality level. - The library should provide a minimalistic, high level API that, if it is well designed, should cope with small modifications in the numarray data structures. 
- Finally, when these differences has to be added, and that would break the current API, this version should be marked as a major release, and existing extensions (or whatever software that is embedding the library) will know that they have to release new versions if they want to support the newest objects. But, hopefully, that should happen quite unfrequently. Of course, this small library should cope with both numarray and Numeric (at least, the not too old versions of it) objects. But I think this shouldn't pose a big problem as the actual numarray API already can do that. This logical separation between structure and functionality migth also lead to a better acceptation by numerical software cratftsmen, as they can be more confident in that the API to deal with numarray objects will be quite stable throughout the time. Well, this is just a thought. I must confess that I'm so interested on that issue because I really want to support numarray objects in my project, and I'm just wondering which is the best way to do that without creating too much nuissance to the users. In fact, I'm pondering to build up such a library myself, but that can be a waste of time if I've to redone it in every numarray release. Cheers, -- Francesc Alted From peter.chang at nottingham.ac.uk Thu Jan 16 08:47:04 2003 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Thu Jan 16 08:47:04 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. In-Reply-To: <3E25C99A.9D5E1888@noaa.gov> Message-ID: On Wed, 15 Jan 2003, Chris Barker wrote: [...] > My idea is that I could make sure Numeric was around at compile time, so > I could use all the Numeric API to access the array data, but it > wouldn't have to be installed at runtime, as none of the Numeric calls > would be executed if Numeric hadn't been imported. Would this work, or > would the system try to load the .dll or .so or whatever even if the > calls weren't executed? 
One way is to import a dynamic library, explicitly, which has glue code to handle the array objects when you need them.

[...]

> My timing indicated something different, but I'm using Linux/wxGTK/X11,
> and I think the drawing calls return after the message has been sent to
> X, but X may not have completed the actual drawing yet.

That's right. X's communication model between client and server is asynchronous.

> This means that I'm not timing the whole process, and if I did, I might
> not see such a difference.

You can synchronise the output buffer using XSync(3) and then do the timing.

Peter

From Chris.Barker at noaa.gov Thu Jan 16 09:58:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 09:58:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiledextension package.
References:
Message-ID: <3E26E45F.3C7E2293@noaa.gov>

peter.chang at nottingham.ac.uk wrote:
> You can synchronise the output buffer using XSync(3) and then do the
> timing.

I'd love to try this, but I confess I have no idea how! I'm working with the *.i files that tell swig what to add when creating wrappers around wxWindows for Python. wxWindows is using wxGTK, which is using GTK, which is using Xlib (I think), so I'm pretty far away from X, and I barely know enough C/C++ to attempt this.

I suppose I could try including Xlib, then calling XSync, but I need to pass a reference to a display. I have no idea how to get that.

Any hints?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Thu Jan 16 10:33:07 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Thu Jan 16 10:33:07 2003
Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com>
Message-ID: <3E26EC9D.A0B7D173@noaa.gov>

Jack Jansen wrote:

> An extension module that knows about this protocol and gets passed an
> object that it thinks might be a Numeric array checks whether the object
> has an __Numeric_C_interface attribute. If so it retrieves it, checks
> that it is a Cobject, gets the descriptor and tests it for
> compatibility and if it is compatible gets the cobject pointer and
> happily calls all the Numeric routines it needs.

Wow Jack! Are you single-handedly going to implement all my pet projects that I'm too stupid to know how to do myself? (The other one was universal text file support.)

I can only barely follow what you're suggesting, but I still have a question about it. It seems that while this would provide a way for an extension module to identify whether an object was a Numeric array, and then get a pointer to it, how would it know the API for dealing with the arrays without the Numeric header file? Or would you have to include the header file when compiling, but not need the library at runtime unless it was actually used, which seems a reasonable compromise?

If this would work, I think it's a great idea. Short of including NumArray with the standard library (which I imagine is at least a couple of Python releases away), it would be a great solution for folks that are writing extensions that they want to be able to take advantage of Numeric when it's there, but not require it.

Do any of the primary Numarray developers think this is a good and doable idea?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From peter.chang at nottingham.ac.uk Thu Jan 16 11:22:03 2003 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Thu Jan 16 11:22:03 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiledextension package. In-Reply-To: <3E26E45F.3C7E2293@noaa.gov> Message-ID: On Thu, 16 Jan 2003, Chris Barker wrote: > peter.chang at nottingham.ac.uk wrote: > > > You can synchronise the output buffer using XSync(3) and then do the > > timing. Oops, that should be XSynchronize(3). [...] > I suppose I could try including Xlib, then calling XSync, but I need to > pass a reference to a disply. I have not idea how to get that. > > Any hints? wxGetDisplayName() gives the Display name but not a pointer to the display structure. So this is not much help. In gtk+, any program can be called with --sync to aid debugging. I'd guess wxWindows may allow you to do the same. Peter From jmiller at stsci.edu Thu Jan 16 12:06:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 16 12:06:05 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> Message-ID: <3E271006.4000607@stsci.edu> Chris Barker wrote: >Jack Jansen wrote: > > > >>An extension module that knows about this protocol and gets passed an >>object that it think might be a Numeric array checks whether the object >>has an __Numeric_C_interface attribute. If so it retrieves it, checks >>that it is a Cobject, gets the descriptor and tests it for >>compatibility and if it is compatible gets the cobject pointer and >>happily calls all the Numeric routines it needs. >> >> > >Wow Jack! are single handely going to impliment all my pet projects that >I'm too stupid to know how to do my self ? 
(the other one was Universal >text file support) > >I can only barely follow what you're suggesting, but I still have a >question about it. It seems while this would provide a way ro an >extension module to identify whether an object was a Numeric array, and >then get a pointer to it, how would it know the API for dealing with the >arrays, without the Numeric header file? Or would you have to include >the header file when compiling, but not need the library at runtime >unless it was actually used, which seems a reasonable compromise. > >If this would work, I think it's a great idea. Short of including >NumArray with the standard library (which I imagine is a least a couple >of Python releases away), it would be a great solution for folks that >are writing extensions that they want to be able take advantage of >Numeric when it's there, but not require it. > >Do any of the primary Numarray developers think this is a good and >doable idea? > > Roll out the time machine... it's already done. As long as you don't define the macros PY_ARRAY_UNIQUE_SYMBOL or NO_IMPORT_ARRAY, any file that includes arrayobject.h gets a static copy of PyArray_API. If the module executes import_array() at an appropriate time, normally module initialization, but not necessarily, the static PyArray_API gets filled in and becomes usable. The import_array() call is critical; without it, API calls through the static PyArray_API are calls to NULL and segfault. I think that if Numeric is not present, and you call import_array(), it will fail quietly but leave the Python error status set. So it might make sense to call PyErr_Clear() after doing import_array(). >-Chris > So it sounds like your whole "weak linkage" scheme is plausible now with Numeric (maybe even numarray!), as would be a minimal API module. 1. We discussed yesterday how to determine if an object is a Numeric array w/o even compiling with arrayobject.h. 
The important idea there was that if Numeric is not present, the "isarray" (or whatever) function will return false rather than segfaulting because the API pointer isn't filled in. 2. Call API functions in contexts where you know you're looking at Numeric arrays, i.e., right after isarray(). This creates a guard which prevents you from calling API functions when Numeric is not present. 3. Call import_array() at some time before using the API functions, possibly at module init time, failing quietly and clearing the error in installations where Numeric is not installed. Todd From jmiller at stsci.edu Fri Jan 17 14:16:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 17 14:16:03 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> Message-ID: <3E288068.3070407@stsci.edu> Take a look at the attached extension module "testlite" which demonstrates the technique I evolved from this discussion. As we discussed, this usage pattern enables the construction of an extension which will take advantage of numarray if it is there, but will continue to work if the user has not installed numarray. Here's how it works: 1. I created a new API function, PyArray_isArray() which is safe to call in all contexts. I defined it as: #define PyArray_isArray(o) (PyArray_API && NA_isNumArray(o)) I added NA_isNumArray(o) to the numarray C-API because it was the easy way to do it. 2. Ordinary API functions are safe to call once an object has been identified to be a numarray because it implies (locally) that the PyArray_API pointer has been initialized. 3. I tried out the standard import_array() code and added some cleanup for the case where numarray is not installed. The only caveat I see at this point is that you are required to include numarray headers in order to use this. 
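[Editor's sketch] The guard described above, an "is it an array?" check that is safe to call even when the array package is not installed, has a rough Python-level analogue. This is only an illustration and not the C code in testlite.c: the helper names optional_import, is_numarray and describe are hypothetical, and the attribute NumArray is assumed from the class name numarray.NumArray mentioned elsewhere in this thread.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Try once, at "module init" time; a missing package is not an error here.
_numarray = optional_import("numarray")


def is_numarray(obj):
    """Safe in all contexts: returns False when numarray is absent."""
    if _numarray is None:
        return False
    array_type = getattr(_numarray, "NumArray", None)  # assumed class name
    return array_type is not None and isinstance(obj, array_type)


def describe(obj):
    """Touch array-specific code only behind the guard."""
    if is_numarray(obj):
        return "numarray array"
    return "plain Python object"


print(describe([1, 2, 3]))
```

The point of the pattern is the same as in the C version: the check itself never depends on the optional package having been loaded.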
In numarray's case, this might necessitate header updates and/or function call modifications. The numarray C-API should stabilize pretty soon, but I don't think its quite there yet. The same approach should apply to Numeric. This stuff is in numarray CVS now and should be in the next numarray release. Todd -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: testlite.c URL: From haase at msg.ucsf.edu Fri Jan 17 14:25:04 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jan 17 14:25:04 2003 Subject: [Numpy-discussion] make C array accessible to python without copy Message-ID: <03fa01c2be77$4cae4430$3b45da80@rodan> Hi, What is the C API to make an array that got allocated, let's say, by a = new short[512*512], accessible to python as numarray. I tried NA_New - but that seems to make a copy. I would need it to use the original memory space so that I can "observe" the array from Python WHILE the underlying C array changes (it's actually a camera image) Thanks, Sebastian Haase From jmiller at stsci.edu Fri Jan 17 15:17:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 17 15:17:01 2003 Subject: [Numpy-discussion] make C array accessible to python without copy References: <03fa01c2be77$4cae4430$3b45da80@rodan> Message-ID: <3E288EB1.80107@stsci.edu> Sebastian Haase wrote: >Hi, >What is the C API to make an array that got allocated, >let's say, by a = new short[512*512], >accessible to python as numarray. > What you want to do is not currently supported well in C. The way to do what you want is: 1. Create a buffer object from your C++ array. The buffer object can be built such that it refers to the original copy of the data. 2. Call back into Python (numarray.NumArray) with your buffer object as the buffer parameter. You can scavenge the code in NA_newAll (Src/newarray.ch) for most of the callback. >I tried NA_New - but that seems to make a copy. 
>I would need it to use the original memory space
>so that I can "observe" the array from Python WHILE
>the underlying C array changes (it's actually a camera image)
>
That sounds cool!

>
>Thanks,
>Sebastian Haase
>

From falted at openlc.org Sat Jan 18 01:23:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 18 01:23:03 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray
Message-ID: <200301181022.07015.falted@openlc.org>

Hi,

I'm trying to make a C array from a Numeric "c" (Character) typecode array
using the high level call:

NA_InputArray(PyObject *numarray, NumarrayType t, int requires)

with no success.

As I have been able to access all the other types (i.e.
'1','b','s','i','l','f','d') successfully, perhaps the character type is not
supported?

In the NumarrayType enum, there is no tChar. I've tried tUInt8 and tAny
as the value for the NumarrayType parameter, but both choices raise the same
error:

Traceback (most recent call last):
  File "table-tree2.py", line 77, in ?
    h5file.createArray('/columns', 'name', array(names), "Name column")
  File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in createArray
    setattr(group, name, object)
  File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in __setattr__
    value._f_putObjectInTree(name, self)
  File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in _f_putObjectInTree
    self.create()
  File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in create
    self.createArray(self.object, self.title)
  File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, in createArray
    array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY)
libnumarray.error: getShape: sequence object nested more than MAXDIM deep.

although I was passing only a Numeric 'c' with a rather small shape (10,16).

I just want to access the buffer data, and the shape of this object from C
(well, I'm actually using Pyrex, but I think this is not important). Is that
possible by only using numarray C calls?

Thanks,

-- 
Francesc Alted

From jmiller at stsci.edu Sat Jan 18 08:27:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 18 08:27:04 2003
Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray]
Message-ID: <3E2983C3.7000304@stsci.edu>

Francesc Alted wrote:

>Hi,
>
>I'm trying to make a C array from a Numeric "c" (Character) typecode array
>using the high level call:
>
>NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
>
Unified handling of character arrays and numeric arrays doesn't exist
yet in numarray. There is no C-API for the chararray module because we
haven't needed one. But CharArrays are NDArrays and have attributes
stored in PyArrayObjects just like numarrays.

>with no success.
>
>As I have been able to access all the other types (i.e.
>'1','b','s','i','l','f','d') successfully, perhaps character type is not
>supported?
> >In the NumarrayType enum, there is no tChar, but I've tried tUInt8 and tAny >as the value for NumarrayType parameter, but both choices issues the same >error: > >Traceback (most recent call last): > File "table-tree2.py", line 77, in ? > h5file.createArray('/columns', 'name', array(names), "Name column") > File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in >createArray > setattr(group, name, object) > File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in >__setattr__ > value._f_putObjectInTree(name, self) > File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in >_f_putObjectInTree > self.create() > File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in >create > self.createArray(self.object, self.title) > File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, >in createArray > array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY) >libnumarray.error: getShape: sequence object nested more than MAXDIM deep. > NA_InputArray was intended to accept non-numeric sequences. It could report this better... >although I was passing only a Numeric 'c' with a rather small shape (10,16). > >I just want to access the buffer data, and the shape of this object from C >(well, I'm actually using Pyrex, but I think this is not important). Is that >possible by only using numarray C calls? > Look at Lib/chararray.py and Src/_chararraymodule.c. If you can handle using a CharArray or RawCharArray, try: 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. 3. shape, strides, and itemsize should be directly accessible from the PyArrayObject. CharArray has some extra stripping and padding semantics; these are lazy and hence absent without extra care in C. RawCharArray has none. CharArrays are really arrays of fixed length strings of bytes. 
The string length is defined by the array itemsize. >Thanks, > > > Todd From falted at openlc.org Sat Jan 18 10:18:02 2003 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 18 10:18:02 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] In-Reply-To: <3E2983C3.7000304@stsci.edu> References: <3E2983C3.7000304@stsci.edu> Message-ID: <200301181917.29533.falted@openlc.org> A Dissabte 18 Gener 2003 17:41, Todd Miller va escriure: > >I just want to access the buffer data, and the shape of this object from C > >(well, I'm actually using Pyrex, but I think this is not important). Is > > that possible by only using numarray C calls? > > Look at Lib/chararray.py and Src/_chararraymodule.c. > > If you can handle using a CharArray or RawCharArray, try: > > 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in > the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. > > 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. > > 3. shape, strides, and itemsize should be directly accessible from the > PyArrayObject. Ok. I'll try to do that. > > CharArray has some extra stripping and padding semantics; these are lazy > and hence absent without extra care in C. RawCharArray has none. > By the way, is it safe to assume that CharArray objects are contiguous? or RawCharArray?. The same question goes for RecArray objects. Or it is always convenient to check with iscontiguous() method if they are or not?. In case these objects can be non-contiguous, I guess there's still not a function like NA_InputArray that works with CharArray or RecArray objects in order to obtain well-behaved objects. Is that true? I think it would be possible to me to include support for numarray objects in next release of PyTables. 
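
[Editor's sketch] A rough illustration of the byteoffset / strides / itemsize bookkeeping discussed in this exchange, written in plain Python rather than against the numarray C-API. The function name and the sample buffer are made up for the sketch, and it handles only the one-dimensional case; the only idea taken from the thread is that element i starts at byteoffset + i*stride and spans itemsize bytes, so contiguity never has to be assumed.

```python
def fixed_length_items(buf, count, stride, itemsize, byteoffset=0):
    """Collect `count` fixed-length byte strings from a 1-D strided layout."""
    view = memoryview(buf)
    items = []
    for i in range(count):
        start = byteoffset + i * stride
        items.append(bytes(view[start:start + itemsize]))
    return items


# Five 3-byte strings stored back to back (a contiguous layout).
data = b"aaabbbcccdddeee"

contiguous = fixed_length_items(data, count=5, stride=3, itemsize=3)

# A "view" of every second element: same buffer, doubled stride, no copy
# of the underlying data is needed to describe it.
every_other = fixed_length_items(data, count=3, stride=6, itemsize=3)

# A view that starts at the second element.
shifted = fixed_length_items(data, count=2, stride=6, itemsize=3, byteoffset=3)

print(contiguous, every_other, shifted)
```

When stride equals itemsize the layout is contiguous; any other stride describes a discontiguous view such as the "every 10th record" slice mentioned in the thread.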
Thanks!, -- Francesc Alted From jmiller at stsci.edu Sat Jan 18 11:57:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Sat Jan 18 11:57:03 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> Message-ID: <3E29B52C.2030602@stsci.edu> Francesc Alted wrote: >A Dissabte 18 Gener 2003 17:41, Todd Miller va escriure: > > >>>I just want to access the buffer data, and the shape of this object from C >>>(well, I'm actually using Pyrex, but I think this is not important). Is >>>that possible by only using numarray C calls? >>> >>> >>Look at Lib/chararray.py and Src/_chararraymodule.c. >> >>If you can handle using a CharArray or RawCharArray, try: >> >>1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in >>the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. >> >>2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. >> >>3. shape, strides, and itemsize should be directly accessible from the >>PyArrayObject. >> >> > >Ok. I'll try to do that. > > > >>CharArray has some extra stripping and padding semantics; these are lazy >>and hence absent without extra care in C. RawCharArray has none. >> >> >> > >By the way, is it safe to assume that CharArray objects are contiguous? or >RawCharArray?. > Mostly no. Each fixed length element is stored as a contiguous sequence of bytes. Anything goes for the rest, so you need to look at the strides arrays and byteoffset. >The same question goes for RecArray objects. > No. It's possible to select every 10th record, for instance, in a slice. I believe the resulting decimated array would be a discontiguous view of the original. >Or it is always >convenient to check with iscontiguous() method if they are or not?. > I'm not even certain the method works correctly for chararray and recarray. I think the portion of chararray that has been written in C considers array strides. 
recarray is pure python. In both cases, I think I'd just forget about
contiguity and use the strides arrays.

> In case
>these objects can be non-contiguous, I guess there's still not a function
>like NA_InputArray that works with CharArray or RecArray objects in order to
>obtain well-behaved objects. Is that true?
>
True. But neither recarray nor chararray really has behavedness
problems like misalignment, byteswapping, or type conversion. I think
contiguity is the only issue, and that is solved just by calling
.copy(). You might argue that records contain byteswapped and
misaligned fields. I don't have an immediate answer to that. My
preference is to use strides and forget about contiguity, but you could
also make contiguous copies simply. No one I'm aware of has yet tried
access to misbehaved records in C.

>
>I think it would be possible to me to include support for numarray objects
>in next release of PyTables.
>
Great!

>Thanks!,
>
>

From verveer at embl.de Sun Jan 19 06:39:09 2003
From: verveer at embl.de (verveer at embl.de)
Date: Sun Jan 19 06:39:09 2003
Subject: [Numpy-discussion] numarray bug?
Message-ID: <1042987080.3e2ab8489e640@webmail.EMBL-Heidelberg.DE>

Hi,

The following gives an error:

>>> print numarray.Int8 == numarray.Any
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numerictypes.py", line 102, in __cmp__
    return genericTypeRank.index(self.name) - genericTypeRank.index(other.name)
ValueError: list.index(x): x not in list

A bug?

Cheers, Peter

-- 
Dr. Peter J. Verveer
Cell Biology and Cell Biophysics Programme
EMBL
Meyerhofstrasse 1
D-69117 Heidelberg
Germany
Tel.
: +49 6221 387245 Fax : +49 6221 387242 Email: verveer at embl-heidelberg.de From falted at openlc.org Mon Jan 20 04:17:03 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 20 04:17:03 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] In-Reply-To: <3E29B52C.2030602@stsci.edu> References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> <3E29B52C.2030602@stsci.edu> Message-ID: <200301201316.06127.falted@openlc.org> A Dissabte 18 Gener 2003 21:12, Todd Miller va escriure: > >By the way, is it safe to assume that CharArray objects are contiguous? or > >RawCharArray?. > > Mostly no. Each fixed length element is stored as a contiguous > sequence of bytes. Anything goes for the rest, so you need to look at > the strides arrays and byteoffset. > > >The same question goes for RecArray objects. > > No. It's possible to select every 10th record, for instance, in a > slice. I believe the resulting decimated array would be a discontiguous > view of the original. > > >Or it is always > >convenient to check with iscontiguous() method if they are or not?. > > I'm not even certain the method works correctly for chararray and > recarray. Well, during my tests with numarray 0.4, iscontiguous() seems to work well, both for chararrays and recarrays. > In both cases, I think I'd just forget about > contiguity and use the strides arrays. Yeah, but I still want to use iscontiguous() method just to speed-up a bit the code. > You might argue that records contain > byteswapped and misaligned fields. I don't have an immediate answer to > that. Exactly, I am pondering how to deal with HDF5 objects coming from machines with a different endianess (misalignment is not a problem in my case) than the local machine. But I think I can manage that by creating recarrays buffers with the byteorder parameter set appropriately during the HDF5 table reads. 
Then, all the data can be read correctly because numarray will byteswap the data whenever this recarray will be accessed. Moreover, if this object is to be used frequently, I can speed-up the access to this recarray by byteswapping the columns (as arrays) using their byteswap() method. In the future it would be nice to provide a generica byteswap method for recarrays. Thanks, -- Francesc Alted From falted at openlc.org Mon Jan 20 11:02:02 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 20 11:02:02 2003 Subject: [Numpy-discussion] recarray2 re-visited Message-ID: <200301202000.53584.falted@openlc.org> Hi, As I needed a byteswap() method for recarray, after a bit of hacking I've made one myself. This is based on my own version of recarray to take advantage of the _fields cache so as to both speed-up and simplify the new code. Basically, the new method takes a recarray, checking which columns are numarray arrays and invoking their byteswap() method if needed. Easy, but effective. Moreover, a _byteswap() and togglebyteorder() are provided to be compatible with existing methods in NumArray objects. As a plus, the recarray __str__ has been modified in order to allow a printing having in mind the byteorder of the recarray, and improving the speed of printing by a factor of 30, that can be handy in some situations. Do with it whatever you want, -- Francesc Alted -------------- next part -------------- A non-text attachment was scrubbed... 
Name: recarray2.py
Type: text/x-python
Size: 21435 bytes
Desc: not available
URL: 
-------------- next part --------------
recarray shape in test ==> (10000,)

Assignment in recarray original
-------------------------------
Assign time: 1.24 Rows/s: 8064

Assignment in recarray modified
-------------------------------
Assign time: 0.16 Rows/s: 62499
Speed-up: 7.75

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.53 Rows/s: 6535

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.15 Rows/s: 66666
Speed-up: 10.2

Printing in recarray original
------------------------------
Print time: 18.11 Rows/s: 552

Printing in recarray modified
------------------------------
Print time: 0.63 Rows/s: 15872
Speed-up: 28.746
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2-test.py
Type: text/x-python
Size: 2946 bytes
Desc: not available
URL: 

From falted at openlc.org Tue Jan 21 08:01:13 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 08:01:13 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
Message-ID: <200301211744.55666.falted@openlc.org>

Hi,

Is anybody aware of any function (either in C or Python or a mixture of
both) to easily convert Numerical Python arrays from/to numarray arrays?

I mean, I would like to use a function that, without having to copy all
the data element by element, is able to copy the data buffer (or even
share it, if that is possible at all) from one object to the other.
Thanks, -- Francesc Alted From haase at msg.ucsf.edu Tue Jan 21 10:41:07 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue Jan 21 10:41:07 2003 Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays? References: <200301211744.55666.falted@openlc.org> Message-ID: <051501c2c17c$a83e8410$3b45da80@rodan> Hi, I think this is actually quite related to my post from Friday: [Numpy-discussion] make C array accessible to python without copy -> So, to reformulate: Who hold actually the array data in memory? Or: where gets the memory allocated and where/how many pointers to that exist? I understood the answer that Todd Miller gave, that there is such a thing as a "buffer object" that does all the work, so then: one would just have to take that and build a "new" numarray or Numeric structure around it (referring to the Subject of this email) or (in the case of my Friday-email) just have that "buffer object" point to a different memory space (that got already allocated by the C-program) . Agree ? (Did I get it right?) Sebastian Haase ----- Original Message ----- From falted at openlc.org Tue Jan 21 11:24:08 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 21 11:24:08 2003 Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays? In-Reply-To: <3E2D74A2.40204@stsci.edu> References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> Message-ID: <200301212005.30328.falted@openlc.org> A Dimarts 21 Gener 2003 17:26, v?reu escriure: > Francesc Alted wrote: > >Anybody is aware of any function (either in C or Python or a mixture of > >both) to easily convert Numerical Python arrays from/to numarray arrays? > > I think you should look at numarray.fromlist() and NumArray.tolist(). I > think fromlist() will work on a nested sequence object, and hence a > Numeric array. Yeah, I knew that, but I was looking for something more optimal. 
> > >I mean, I would like to use such a funtion that, without having to copy > >element by element all the data, be able to copy the data buffer (or even > >use the same if possible at all) from one object to the other. > > I have not looked at this yet; it's a very good question. Note that > going from numarray to Numeric there are issues with making the buffer > well-behaved. I think this should be not too difficult to achieve and I'll try to explain why. When going from numarray to Numeric, numarray already have NA_InputArray C-API function that returns a well-behaved array. But strictly speaking, we don't even need a well-behaved array (this is a too restrictive condition) as both Numeric and numarray support discontiguous data. Even the byteorder should be not a problem, because, as Numeric itself has no such a property, we can create a Numeric array that is in native order as the result and byteswap the numarray object (if needed) before doing the conversion. So, non-alignment remains as the only issue that may cause a buffer copy during numarray ==> Numeric conversion. Is that correct?. If yes, it is possible to do a workaround about that, i.e. we can still get a Numeric from a numarray without copying the data in case of numarray misaligned objects?. Regarding to going in the other sense (ie. Numeric ==> numarray), as numarray supports discontiguity, misalignment and byteswapped data, this conversion should not imply a data buffer copy at all. Once we have a pointer to the data buffer, it is only a matter of wrapping a Numeric or numarray object around it getting this info from the original object, and returning the new object as a result. All in all, this conversion *seems* to be not a too difficult task. 
Making such a conversion functions (in C, but also having Python counterparts) available might represent to open the door to a co-existence of Numeric and numarray objects in the same program, and that would easy the numarray deployment in existing Numeric software. Comments? -- Francesc Alted From falted at openlc.org Tue Jan 21 11:24:11 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 21 11:24:11 2003 Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays? In-Reply-To: <051501c2c17c$a83e8410$3b45da80@rodan> References: <200301211744.55666.falted@openlc.org> <051501c2c17c$a83e8410$3b45da80@rodan> Message-ID: <200301212020.57384.falted@openlc.org> A Dimarts 21 Gener 2003 19:41, Sebastian Haase va escriure: > Hi, > I think this is actually quite related to my post from Friday: > [Numpy-discussion] make C array accessible to python without copy > > -> So, to reformulate: Who hold actually the array data in memory? Or: > where gets the memory allocated and where/how many pointers to that exist? > I understood the answer that Todd Miller gave, that there is such a thing > as a "buffer object" that does all the work, so then: one would just have > to take that and build a "new" numarray or Numeric structure around it > (referring to the Subject of this email) or (in the case of my > Friday-email) just have that "buffer object" point to a different memory > space (that got already allocated by the C-program) . > > Agree ? (Did I get it right?) Well, so so. I think the buffer object is a property of numarray objects, not Numeric objects. So, in the numarray ==> Numeric conversion process you may need to access the internals of the buffer (for example by using the high level numarray C-API) and manage to obtain a data buffer (in the C sense, not an object) that can be used to build the Numeric object (with the help of the numarray object metadata). The opposite way needs something similar but with inverted roles. 
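
[Editor's sketch] The idea running through these messages, wrapping a new array-like object around memory that already exists instead of copying it, can be shown with Python's own buffer machinery. This is only an analogy and uses neither Numeric nor numarray: the bytearray stands in for memory owned by C code (for example the camera frame from the earlier post), and the 2x2 int16 layout is an arbitrary choice for the example.

```python
import struct

# Stand-in for memory allocated elsewhere: a 2x2 frame of native int16 pixels.
frame = bytearray(2 * 2 * struct.calcsize("h"))

# Wrap the existing memory with shape metadata; no pixel data is copied.
pixels = memoryview(frame).cast("h", shape=[2, 2])

# The "C side" rewrites the buffer in place ...
struct.pack_into("4h", frame, 0, 1, 2, 3, 4)

# ... and the new values are visible through the wrapper.
print(pixels.tolist())
```

The wrapper holds only metadata (format, shape) plus a reference to the buffer, which is why the lifetime of the underlying memory becomes the main thing to get right.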
See my previous message for a more in-depth explanation. I think the conversion (without copying) is not a difficult process, but no so-easy like that. Well, I'm just a newcomer to numarray and my opinions about that may perfectly be completely wrong, of course. Take them with caution!. -- Francesc Alted From paul at pfdubois.com Tue Jan 21 12:06:34 2003 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jan 21 12:06:34 2003 Subject: [Numpy-discussion] RE: numarray/Numeric upkeep? Message-ID: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com> Here are some of the factors leading to the slow rate of change of Numeric lately. a. I changed to a new project and have had a lot of startup learning to do. My new project uses Numeric but not in as central a way as my old one. b. I mistakenly thought numarray would be ready sooner so that I was trying to let it slide. c. I announced last year, in view of (a), that I was needing to be replaced as HeadNummie. It would be logical to turn this over to the Numarray people, but they aren't ready to do it until Numarray is ready, so nothing happened. d. Except for Travis, most of the other listed Numeric developers aren't in fact doing patches, releases, etc. e. Not all patches that are submitted are correct or desirable, historically. I'm not saying anything about any patches you may have submitted, just pointing out that applying them requires real work, not just mechanical patching. In fact the rate of error in patches is quite high and I've learned to be cautious. f. Some patches interfere with each other; for example, a patch for making 64 bit machines work right and a patch for some specific bug collided. I've started to work on the MA for Numarray but I'm not able to do much work on Numeric right now. This is a place where someone else has to help. >-- Original Message -- >To: dubois at users.sourceforge.net >Subject: numarray/Numeric upkeep? 
>From: Michael Stone >Cc: >Date: Tue, 21 Jan 2003 11:32:03 -0800 > > > >No one seems to be doing bugfixes for Numeric or numarray. >Nothing seems to have happened for several months. Lots of bugs have been >posted for Numeric, some easily fixable (I submitted one with a patch). > >Any idea if either project will become active again anytime soon? From perry at stsci.edu Tue Jan 21 12:28:13 2003 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 21 12:28:13 2003 Subject: [Numpy-discussion] RE: numarray/Numeric upkeep? In-Reply-To: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com> Message-ID: Michael Stone wrote: > >No one seems to be doing bugfixes for Numeric or numarray. > >Nothing seems to have happened for several months. Lots of bugs > have been ... It certainly isn't true that nothing has happened for several months with numarray. On what do you base this belief? While not all bugs have been fixed, the oldest listed in the numarray bug tracker is from December. Is there a bug you feel needs urgent attention? Work is continuing and new releases will be coming out. As to Paul's comments regarding when numarray will be ready, my guess is when the following are complete: - Package reorganization (make numarray a package) - Optimization for small arrays (making numarray'speed with small arrays more comparable with Numeric; this is probably the single largest remaining item) - Porting some well known packages such as MA (which Paul is working on), scipy, pyopengl and such to work with numarray. Some of this has been started. There are other smaller things to do as well. But I'm hoping that we can be done with these in a few months. Perry From bazell at comcast.net Tue Jan 21 12:33:35 2003 From: bazell at comcast.net (Dave Bazell) Date: Tue Jan 21 12:33:35 2003 Subject: [Numpy-discussion] array operation Message-ID: <00bd01c2c18c$10ab5000$6401a8c0@DB> I am trying to see if I can use where() or choose() to do this. I can't really figure it out. 

I have a 2-d array data where each row is an observation and each column is
an attribute of the observation:

data =
[[.3, .2, 2.3,...]    <- observation 1
 [.7, 1.2, .4...]     <- observation 2
 ...]]

I have another 1-d array that contains a code for the class of object:

class = [0,1,0,1,1,3,2,0,...]

where class[i] = the class of the ith object in the data array. Thus,
observation 1 above is class 0, observation 2 is class 1, and so on.

I want to select all objects of a given class from data array. I can do
this with a loop

for i in range(ndat):
    if class == 0:
        do something
    ....

Is there a way to use where() or choose() to do this? Would it be more
efficient?

Thanks,
Dave

From perry at stsci.edu Tue Jan 21 13:02:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 13:02:05 2003
Subject: [Numpy-discussion] array operation
In-Reply-To: <00bd01c2c18c$10ab5000$6401a8c0@DB>
Message-ID: 

Dave Bazell writes:

> I am trying to see if I can use where() or choose() to do this. I can't
> really figure it out.
>
> I have a 2-d array data where each row is an observation and each
> column is
> an attribute of the observation:
>
> data =
> [[.3, .2, 2.3,...] <- observation 1
> [.7, 1.2, .4...] <- observation 2
> ...]]
>
> I have another 1-d array that contains a code for the class of object:
>
> class = [0,1,0,1,1,3,2,0,...]

Note that using class is illegal, it is a reserved keyword.

>
> where class[i] = the class of the ith object in the data array. Thus,
> observation 1 above is class 0, observation 2 is class 1, and so on.
>
> I want to select all objects of a given class from data array. I can do
> this with a loop
>
I assume you mean you want to select all the rows corresponding to all
the observations where the code for the class corresponding to that
observation equals some particular value. If so then for numarray
this ought to work.

index = nonzero(code==1) # want indices of all the obs where class code = 1
selected_obs = data[index]

(or in one line if you wish: selected_obs = data[nonzero(code==1)] )

> for i in range(ndat):
>     if class == 0:
>         do something
>         ....
>
> Is there a way to use where() or choose() to do this? Would it be more
> efficient?
>

Perry

From Chris.Barker at noaa.gov Tue Jan 21 14:30:10 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 21 14:30:10 2003
Subject: [Numpy-discussion] array operation
References:
Message-ID: <3E2DC965.9328BCD6@noaa.gov>

Perry Greenfield wrote:
> If so then for numarray this ought to work.
>
> index = nonzero(code==1) # want indices of all the obs where class code = 1
> selected_obs = data[index]

or for Numeric, use take():

selected_obs = take(data,nonzero(code == 1),1)

(this will select columns corresponding to where the code == 1, which is
how I read your question)

By the way, choose() and where() do something similar, but give you an
array back that is the same size as the one you start with, with some (or
all) of the elements replaced. take() gives you a smaller array that is
a subset of the original one, which I think is what you want here.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jmiller at stsci.edu Tue Jan 21 14:39:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jan 21 14:39:04 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> <200301212005.30328.falted@openlc.org>
Message-ID: <3E2DCBDA.1040604@stsci.edu>

Francesc Alted wrote:

>I think this should be not too difficult to achieve and I'll try to explain
>why.
>
>When going from numarray to Numeric, numarray already have NA_InputArray
>C-API function that returns a well-behaved array.
>But strictly speaking, we
>don't even need a well-behaved array (this is a too restrictive condition)
>as both Numeric and numarray support discontiguous data. Even the byteorder
>should be not a problem, because, as Numeric itself has no such a property,
>we can create a Numeric array that is in native order as the result and
>byteswap the numarray object (if needed) before doing the conversion.
>

In-place byteswapping sounds like a bad idea to me. What if the array
is based upon a readonly buffer? We've just started using these at
STSCI because a readonly memory map imposes no load on the system swap
file. With a read only mapping, the buffer itself has readonly pages;
these cannot be swapped in-place.

>So, non-alignment remains as the only issue that may cause a buffer copy
>during numarray ==> Numeric conversion. Is that correct?.
>

I don't think so.

>If yes, it is
>possible to do a workaround about that, i.e. we can still get a Numeric from
>a numarray without copying the data in case of numarray misaligned objects?.
>
>

I don't see how. The primary source of misaligned arrays is numerical
columns in recarrays. It seems to me that if the data is misaligned,
you either have to copy it to someplace else which is aligned, or teach
the function which is going to process it how to access it byte-wise.
Only the former sounds feasible to me.

>Regarding to going in the other sense (ie. Numeric ==> numarray), as
>numarray supports discontiguity, misalignment and byteswapped data, this
>conversion should not imply a data buffer copy at all.
>
>

This sounds correct.

>Once we have a pointer to the data buffer, it is only a matter of
>wrapping a Numeric or numarray object around it getting this info from the
>original object, and returning the new object as a result.
>
>All in all, this conversion *seems* to be not a too difficult task.
>
>

It seems straightforward in principle, but the memory management issues
seem a little tricky to me.
It's easy to get buffers from numarrays, and create numarrays from
buffers. I guess we need a module which does the same for Numeric.

There are two easy ways to "get a buffer" from a Numeric array:

1. Wrap the Numeric data in a buffer object.

2. Add support for the buffer API to the Numeric object.

Off hand, I'm not sure which is better, although (1) is less intrusive
to Numeric and I suppose is the place to start. This should be easy.

But, I'm not sure how to create a Numeric array from a buffer. It's
easy to get the data pointer from a buffer, and to construct a Numeric
array from a data pointer, but we also need a way to stash the pointer
to the buffer object. I don't like the idea of modifying Numeric's
PyArrayObject.

>Making such a conversion functions (in C, but also having Python
>counterparts) available might represent to open the door to a co-existence
>of Numeric and numarray objects in the same program, and that would ease the
>numarray deployment in existing Numeric software.
>
>Comments?
>
>

All in all, I think this is a great idea which would really boost
interoperability. I wish there was a simpler approach which required no
modifications to Numeric.

Todd

From falted at openlc.org Wed Jan 22 01:53:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 01:53:01 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
Message-ID: <200301221051.57337.falted@openlc.org>

Hi,

I have discovered that the Numeric emulation functions in numarray don't
accept a character typecode as type parameter.

This is not immediately apparent because the type parameter is of type 'int',
and passing it a 'char' may not be good practice. But the fact is that
Numeric *does* accept the charcodes in the type parameter.

For example, this is the normal way to call the PyArray_FromDims function:

arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)

but, in Numeric, this other manner also works:

arr = PyArray_FromDims(self.rank, self.dimensions, 'd')

Now, in numarray, if you pass a character to the type parameter, a
"segmentation fault" is issued.

Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
handled as types in Numeric. I think something like this should be added to
the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.

Another thing. It seems to me that NA_New and NA_Empty functions are not
well documented in the numarray documentation as they differ from the
definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will
stay, because I prefer them a lot more than the documented ones :-)

Bye,

--
Francesc Alted

From jmiller at stsci.edu Wed Jan 22 06:52:08 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 22 06:52:08 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
References: <200301221051.57337.falted@openlc.org>
Message-ID: <3E2EAFE9.4060900@stsci.edu>

Francesc Alted wrote:

>Hi,
>
>I have discovered that the Numeric emulation functions in numarray don't
>accept a character typecode as type parameter.
>

Interesting.

>
>This is not immediately apparent because the type parameter is of type 'int',
>and passing it a 'char' may not be good practice.
>

I wrote the emulation functions using the manual and intuition rather
than the existing code. There will be others like this.

>But the fact is that
>Numeric *does* accept the charcodes in the type parameter.
>
>
>

No argument here. numarray can "always" be more compatible than it is
"now", for any value of always or now. I think the only real way to
avoid that would be to build Numeric into numarray, which sounds dubious.
:)

>For example, this is the normal way to call the PyArray_FromDims function:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64)
>
>but, in Numeric, this other manner also works:
>
>arr = PyArray_FromDims(self.rank, self.dimensions, 'd')
>
>

This was nicely illustrated.

>Now, in numarray, if you pass a character to the type parameter, a
>"segmentation fault" is issued.
>
>

Decidedly not good.

>Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are
>handled as types in Numeric. I think something like this should be added to
>the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch.
>

I did a simple implementation of PyArray_DescrFromType trying to add
support for f2py. There are 2 real issues with it that I see:

1. It still doesn't handle character codes. I think it could handle
both NumericTypes and character codes without conflict because of the
way the ASCII character set is laid out.

2. I just added it so that it *could* be called since I think f2py
needed it. I didn't call it anywhere from the other compatibility
functions.

Care to do another patch?

>Another thing. It seems to me that NA_New and NA_Empty functions are not
>well documented in the numarray documentation as they differ from the
>definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will
>stay, because I prefer them a lot more than the documented ones :-)
>

If you're working from CVS, the form they're in now was the result of
someone's detailed comments. They're still not quite right, because the
interface is written in terms of int arrays, which is not good for LP64
platforms where long is really what is needed to avoid creating 2G
bottlenecks. The naming is also not consistent and I will want to make
it so before release of numarray-0.5.
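
An illustrative sketch (not part of the original message) of the idea in
issue 1 above: if the integer type numbers are all small, they cannot collide
with the ordinals of printable ASCII characters, so a single int parameter
could carry either. The type numbers and the function name resolve_type are
hypothetical; the real numarray enum values are not shown in this thread, and
the sketch is written in Python rather than C only to keep it short.

```python
# Hypothetical type numbers, chosen only for illustration; they are assumed
# to stay well below the ordinals of printable ASCII characters.
TYPE_BY_NUMBER = {1: "Int8", 2: "UInt8", 3: "Int16", 4: "Int32",
                  5: "Float32", 6: "Float64"}

# Character codes of the kind numerictypes.typecode reports.
TYPE_BY_CHARCODE = {"1": "Int8", "b": "UInt8", "s": "Int16", "l": "Int32",
                    "f": "Float32", "d": "Float64"}


def resolve_type(code):
    """Map an integer type number or a one-character typecode to a type name."""
    if isinstance(code, str):
        # A C function declared with an int parameter would see the
        # character's ordinal, so convert to that representation first.
        code = ord(code)
    if code in TYPE_BY_NUMBER:
        return TYPE_BY_NUMBER[code]
    # Only look the value up as a character if it is in the ASCII range.
    if 0 <= code < 128 and chr(code) in TYPE_BY_CHARCODE:
        return TYPE_BY_CHARCODE[chr(code)]
    raise ValueError("unknown type code: %r" % (code,))
```

With these assumed tables, resolve_type(6) and resolve_type('d') both give
"Float64", and an unknown value raises ValueError instead of indexing past
the end of a table.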

>Bye,
>
>
>

Todd

From falted at openlc.org Wed Jan 22 09:48:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 22 09:48:03 2003
Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray
In-Reply-To: <3E2EAFE9.4060900@stsci.edu>
References: <200301221051.57337.falted@openlc.org> <3E2EAFE9.4060900@stsci.edu>
Message-ID: <200301221846.13358.falted@openlc.org>

On Wednesday 22 January 2003 15:51, Todd Miller wrote:
>
> I did a simple implementation of PyArray_DescrFromType trying to add
> support for f2py.
> There are 2 real issues with it that I see:
>
> 1. It still doesn't handle character codes. I think it could handle
> both NumericTypes and character codes without conflict because of the
> way the ASCII character set is laid out.

I think so

>
> 2. I just added it so that it *could* be called since I think f2py
> needed it. I didn't call it anywhere from the other compatibility
> functions.
>

I tried to patch your PyArray_DescrFromType, but nothing has changed
because, as you said, no compatibility function calls it.

> Care to do another patch?

Well, I've tried to patch the NA_NewAll function in newarray.c:

  typeObject = pNumType[type];
  if (!typeObject) {
    /* Test if it is a Numeric charcode */
    sprintf(strcharcode, "%c", type);
    charcode = PyString_FromString(strcharcode);
    typeobj = PyDict_GetItemString(pNumericTypesTDict, strcharcode);
    if (typeobj) {
      typeObject = typeobj;
    }
    else
      return (PyArrayObject *) PyErr_Format(_Error,
        "Type object lookup returned NULL for type %d", type);
  }

instead of the original code:

  typeObject = pNumType[type];
  if (!typeObject)
    return (PyArrayObject *) PyErr_Format(_Error,
      "Type object lookup returned NULL for type %d", type);

with no luck as the segmentation fault continues to appear. Anyway, I've
already patched my original code to use only integer codes, not character,
so it would not be a problem (at least for me).

> They're still not quite right, because the interface is written in
> terms of int arrays, which is not good for LP64 platforms where long is
> really what is needed to avoid creating 2G bottlenecks. The naming is
> also not consistent and I will want to make it so before release of
> numarray-0.5.

Ok, so perhaps it's better to use the PyArray_FromDims rather than NA_Empty
(at least, until the C-API stabilizes). It's good to know that!

BTW, during the patching work of numarray sources I noticed some missing
character code types in numerictypes.py. These are the ones corresponding
to: UInt16, Int64 and UInt64. In recarray, they don't appear either (except
for Int64 which appears as 'N' in numfmt, but with no counterpart in
revfmt), so one can't build up recarrays with these types because you need
a charcode for the "formats" string.

Is this intentional? Do you plan to fill these gaps (it would be nice,
especially for recarrays)?

Thanks,

--
Francesc Alted

From haase at msg.ucsf.edu Thu Jan 23 14:06:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Thu Jan 23 14:06:04 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu>
Message-ID: <08ad01c2c32b$900238f0$3b45da80@rodan>

Hi,
I can print numarray of any int type just fine, but I still get the
compress error message with Float (or complex) data:

>>>c
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>>>c.astype(na.Float)
Traceback (most recent call last):
  File "", line 1, in ?
  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
    separator, array_output)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
    format, item_length = _floatFormat(data, precision, suppress_small)
  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
AttributeError: 'module' object has no attribute 'compress'

I get this on Windows (2000) and on Linux. Both numarray 0.4

Thanks,
Sebastian

----- Original Message -----
From: "Todd Miller"
To: "Sebastian Haase"
Cc:
Sent: Thursday, December 19, 2002 5:58 AM
Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress'

> Sebastian Haase wrote:
>
> >Hi!
> >Somehow I have a problem with numarray. Please take a look at this:
> >
> Hi Sebastian,
>
> I don't recall seeing anything like this, nor can I reproduce it
> now. If you've been following numarray for a while now, I can say
> that it is important to remove the old version of numarray before
> installing the new version. I recommend deleting your current
> installation and reinstalling numarray.
>
> compress() is a ufunc, much like add() or put(). It is defined in
> ndarray.py, right after the import of the modules ufunc and _ufunc.
> _ufunc in particular is a problematic module, because it has followed
> the atypical development path of moving from C-code to Python code.
> Because of this, and the fact that a .so or .dll overrides a .py,
> older installations interfere with newer ones. The atypical path was
> required because the original _ufuncmodule.c was so large that it could
> not be compiled on some systems; as a result, I split _ufuncmodule.c
> into pieces by data type and now use _ufunc.py to glue the pieces together.
>
> Good luck!
> Please let me know if reinstalling doesn't clear up the
> problem.
>
> Todd
>
> >>>>import numarray as na
> >>>>na.array([0, 0])
> >array([0, 0])
> >
> >>>>na.array([0.0, 0.0])
> >Traceback (most recent call last):
> >  File "", line 1, in ?
> >  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
> >    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
> >    separator, array_output)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
> >    format, item_length = _floatFormat(data, precision, suppress_small)
> >  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
> >    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
> >AttributeError: 'module' object has no attribute 'compress'
> >
> >The same works fine with Numeric. But I would prefer numarray because I'm
> >writing C++-extensions and I need "unsigned shorts".
> >
> >What is this error about?
> >
> >Thanks,
> >Sebastian
> >

From jmiller at stsci.edu Thu Jan 23 14:33:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Jan 23 14:33:03 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu> <08ad01c2c32b$900238f0$3b45da80@rodan>
Message-ID: <3E306D73.6050303@stsci.edu>

Sebastian Haase wrote:

>Hi,
>I can print numarray of any int time just fine, but
>

OK. I am assuming you deleted all of your old numarray installations as
I recommended and reinstalled numarray-0.4. What is your PYTHONPATH?

>I still get the compress error message with Float (or complex)
>data:
>
>>>>c
>array([[0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       ...,
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0],
>       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>
>>>>c.astype(na.Float)
>Traceback (most recent call last):
>  File "", line 1, in ?
>  File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__
>    MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string
>    separator, array_output)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string
>    format, item_length = _floatFormat(data, precision, suppress_small)
>  File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat
>    non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data))
>AttributeError: 'module' object has no attribute 'compress'
>
>I get this on Windows (2000) and on Linux.
>Both numarray 0.4
>
>

I'm not sure what's going on here, but I develop on both platforms, and
Linux constantly. The self tests definitely pass in Linux. It must be
some kind of environment issue or runtime issue. What happens when you
type:

>>> import numtestall
>>> numtestall.test()
... what gets printed here? ...

>Thanks,
>Sebastian
>
>----- Original Message -----
>From: "Todd Miller"
>To: "Sebastian Haase"
>Cc:
>Sent: Thursday, December 19, 2002 5:58 AM
>Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress'
>
>>Sebastian Haase wrote:
>>
>>>Hi!
>>>Somehow I have a problem with numarray. Please take a look at this:
>>>
>>Hi Sebastian,
>>
>>I don't recall seeing anything like this, nor can I reproduce it
>>now. If you've been following numarray for a while now, I can say
>>that it is important to remove the old version of numarray before
>>installing the new version. I recommend deleting your current
>>installation and reinstalling numarray.
>>
>>compress() is a ufunc, much like add() or put(). It is defined in
>>ndarray.py, right after the import of the modules ufunc and _ufunc.
>>_ufunc in particular is a problematic module, because it has followed
>>the atypical development path of moving from C-code to Python code.
>>Because of this, and the fact that a .so or .dll overrides a .py,
>>older installations interfere with newer ones. The atypical path was
>>required because the original _ufuncmodule.c was so large that it could
>>not be compiled on some systems; as a result, I split _ufuncmodule.c
>>into pieces by data type and now use _ufunc.py to glue the pieces
>>together.
>>
>>Good luck! Please let me know if reinstalling doesn't clear up the
>>problem.

From j_r_fonseca at yahoo.co.uk Thu Jan 23 16:10:02 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Thu Jan 23 16:10:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
Message-ID: <20030124000759.GA6042@localhost.localdomain>

With the ability of subclassing types in recent versions of the Python
language, more people will be interested in subclassing Numeric arrays
for specific purposes. Still, the use of functions instead of methods
takes away many of the advantages, notably the ability of being
overloaded.

Taking this statement as an example:

  Numeric.put(myarray, myindices, myvalues)

In the current state of affairs, if we wanted this statement to work
with a sparse matrix class derived from a Numeric array, it would have
to be something like:

  Sparse.put(myarray, myindices, myvalues)

That is, it forces the underlying code to know whether it is dealing
with Numeric arrays, or some other equivalent class. But it would be
much more useful to have simply:

  myarray.put(myindices, myvalues)

which would work regardless of the actual type of myarray, provided it
supplied the put() method. This would improve enormously code
reusability and extensibility.

I know that there are certain implementation details that may make this
difficult (like many functions being implemented in pure Python), but
any advances made in this direction will be an improvement of the
current situation.

Also, I know that this example is a little unfortunate because numarray
will do these things with the __getitem__ and __setitem__ operators. But
others could easily be shown.

Regards,

José Fonseca

From falted at openlc.org Fri Jan 24 04:00:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 04:00:07 2003
Subject: [Numpy-discussion] typecodes in numarray
Message-ID: <200301241259.30243.falted@openlc.org>

Maybe I'm becoming a bit tedious with this, but if you look at:

>>> import numerictypes
>>> numerictypes.typecode
{Complex64: 'D', Int32: 'l', UInt16: 's', Complex32: 'F', Float64: 'd',
UInt8: 'b', Int16: 's', Float32: 'f', Int8: '1'}

you can find some inconsistencies that lead to weird things like:

>>> array([1,2], Int16).typecode()
's'
>>> array([1,2], UInt16).typecode()
's'   # --> same as Int16!
>>> array([1,2], Int64).typecode()
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: Int64
>>> array([1,2], UInt64).typecode()
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: UInt64

Also, 'l' is used here to map Int32, while in recarray it is used to map
Boolean.

Moreover, Numeric 22.0 introduced the equivalent of UInt16 and UInt32 types
as 'w' and 'u' respectively. But, again, 'u' is used in recarray as a synonym
of UInt8.

I think it's important to agree on a definitive set of charcodes and use
them uniformly throughout numarray.

Suggestion: if recarray charcodes are not required to match the Numeric
ones, I propose that using the Python convention may be a good idea.
Look at the table in:
http://www.python.org/doc/current/lib/module-struct.html.
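
An illustrative sketch (not part of the original message) of what the
struct-module convention would offer, using only the Python standard library.
The pairing of numarray type names with struct codes below is an assumption
made for illustration; neither numarray nor recarray defines it. Note that in
this convention 'b' means a signed byte, unlike the 'b' that numarray 0.4
reports for UInt8.

```python
import struct

# Assumed pairing of numarray type names with struct format characters.
# Every integer and float type gets its own, distinct code.
STRUCT_CODES = [
    ("Int8", "b"), ("UInt8", "B"),
    ("Int16", "h"), ("UInt16", "H"),
    ("Int32", "i"), ("UInt32", "I"),
    ("Int64", "q"), ("UInt64", "Q"),
    ("Float32", "f"), ("Float64", "d"),
]


def item_sizes():
    """Return the size in bytes of each type under struct's standard sizes.

    The "=" prefix selects native byte order with standard sizes and no
    alignment padding, so the result does not depend on the platform.
    """
    return {name: struct.calcsize("=" + code) for name, code in STRUCT_CODES}


if __name__ == "__main__":
    for name, size in item_sizes().items():
        print(name, size)
```

Under this assumed table the Int16/UInt16 clash and the missing Int64/UInt64
codes shown above would not arise, since signed and unsigned variants differ
by letter case.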

--
Francesc Alted

From perry at stsci.edu Fri Jan 24 06:38:17 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 06:38:17 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <200301241259.30243.falted@openlc.org>
Message-ID:

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of
> Francesc Alted
> Sent: Friday, January 24, 2003 7:00 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] typecodes in numarray
>
>
> Maybe I'm becoming a bit tedious with this, but if you look at:
>

No, this sort of feedback is very valuable. We'll think about
this a bit, but I'd agree that consistency with Numeric codes
is important. Some of the history of the codes used by recarray
arise from conventions used in other software not related to
Python or Numeric. But if recarray is to be generic and used
by others, we should hide, remove or layer such conventions
in a subclass. Let us think about how we should do that.

Thanks, Perry

From perry at stsci.edu Fri Jan 24 09:04:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 09:04:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
Message-ID:

Todd Miller had some further comments that I thought were worth
posting as well (and I think he makes some very good points).

************************************************************************

My [i.e. Todd's] thoughts about it:

>Maybe I'm becoming a bit tedious with this, but if you look at:
>

No. It shows you're thinking about it carefully. Having looked at all
of the examples below, I have some comments:

1. The sparseness and obscurity of the typecode "wordspace" are both
demonstrated here. There are so few letters to choose from, they're
often already used in some other context. Even given the large number
of unused letters, it's often difficult to choose good ones and to
remember what has been chosen.
I think this is one of the reasons Perry chose to replace typecodes
with true type objects which have rich, regular, and predictable
symbolic names.

2. Typecodes were added as a backwards compatibility feature of
numarray, and I think it's probable that numarray beat Numeric to
supporting most of these types, because otherwise they'd have been
copied directly and there would be no problem. I'm not really trying to
play a blame-game here, but I am making an argument that perhaps
numarray should only go so far in the support of what I regard as an
obsolescent feature. If the Numeric developers choose to continue
extending the use of typecodes in ways that are incompatible with
numarray, one way of dealing with it is to "just say no". We are going
beyond the scope of backwards compatibility to on-going compatibility.
(Which we may still have to do but needs to be discussed and considered)

3. STSCI has layered other software on top of numarray and recarray
which astronomers use to do work. It is the friction of that interface
which makes correcting these consistency problems more difficult than
might be immediately apparent.

>I think it's important to agree with a definitive set of charcodes and use
>them uniformly throughout numarray.
>

I wish this were possible, but I'm thinking we should try to find an
alternative approach altogether, one which may be more verbose but
implicitly free of conflict.

A means for specifying a recarray format might be created from tuples,
type objects, and integer repetition factors.

The verbosity of this approach might be a little tedious, but it would
also be transparent, maintainable, and conflict free.

I think we should add an "obsolescent feature" warning to numarray and
recarray which flags any use of character typecodes when the appropriate
command line switches are set.

>Suggestion: if recarray charcodes are not necessary to match the Numeric
>ones, I propose that using the Python convention may be a good idea.
>Look at the table in:
>http://www.python.org/doc/current/lib/module-struct.html.
>

This sounds good to me, except that it will break an existing interface
that I don't have control over. Therefore, I suggest we correct the
problem by coming up with something better.

From paul at pfdubois.com Fri Jan 24 09:43:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 09:43:07 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To:
Message-ID: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>

I don't understand this remark:

but I am making an argument that perhaps
> numarray should only go so far in the support of what I regard as an
> obsolescent feature. If the Numeric developers choose to continue
> extending the use of typecodes in ways that are incompatible with
> numarray, one way of dealing with it is to "just say no".
> We are going
> beyond the scope of backwards compatibility to on-going compatibility.
> (Which we may still have to do but needs to be discussed and
> considered)
>

There is no "on-going" Numeric development. It stops the minute numarray is
ready. Period. We developers all agreed on that. The whole reason for
numarray is that Numeric was pronounced unmaintainable and unextendable by
those who frequently had to work on it. To do anything else will fragment
the entire numerical python community and software set.

From falted at openlc.org Fri Jan 24 10:48:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 10:48:04 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To:
References:
Message-ID: <200301241946.55398.falted@openlc.org>

On Friday 24 January 2003 18:02, Todd Miller wrote:
>
> My [i.e. Todd's] thoughts about it:
>
> No. It shows you're thinking about it carefully. Having looked at all
> of the examples below, I have some comments:

I mostly agree with your comments, but let me point out some thoughts

>
> 1.
> The sparseness and obscurity of the typecode "wordspace" are both
> demonstrated here. There are so few letters to choose from, they're
> often already used in some other context. Even given the large number
> of unused letters, it's often difficult to choose good ones and to
> remember what has been chosen. I think this is one of the reasons Perry
> chose to replace typecodes with true type objects which have rich,
> regular, and predictable symbolic names.

I completely agree that type objects are a brilliant idea.

> 3. STSCI has layered other software on top of numarray and recarray
> which astronomers use to do work. It is the friction of that interface
> which makes correcting these consistency problems more difficult than
> might be immediately apparent.

Yeah, I know...

>
> >I think it's important to agree with a definitive set of charcodes and use
> >them uniformly throughout numarray.
>
> I wish this were possible, but I'm thinking we should try to find an
> alternative approach altogether, one which may be more verbose but
> implicitly free of conflict.
>
> A means for specifying a recarray format might be created from tuples,
> type objects, and integer repetition factors.
>
> The verbosity of this approach might be a little tedious, but it would
> also be transparent, maintainable, and conflict free.

I think this is a very good idea. In fact, while working on PyTables I was
lately pondering what would be the best way to define record arrays, and I
also think that a verbose approach should be the best. After considering
metaclasses and tuples, I ended up with a compromise solution between the
two: dictionaries combined with some function or class to refine the
definition.

My current thinking is something like:

recarrDescr = {
    "name"        : defineType(CharType, 16, ""),         # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),              # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),              # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),              # integer
    "grid_j"      : defineType(Int32, 1, 9),              # integer
    "pressure"    : defineType(Float32, 1, 1.),           # float (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
    "idnumber"    : defineType(Int64, 1, 0),              # signed long long
    }

where defineType is a class that accepts (type, shape, default) parameters.
It can be extended safely in the future if more needs appear.

A dictionary has the advantage over a tuple in that you can map column
names to their contents quite easily, and it is more flexible than defining
the fields with a metaclass descendant (see
http://pytables.sourceforge.net/html-doc/usersguide-html3.html#subsection3.1.2)
because dictionaries can be built up at run time (metaclass descendants
might allow that too, but in a more mysterious way that I think is not
worth it). In addition, the dictionary object is available in all Python
versions, whereas metaclasses are only available from 2.2 on. However, I
regard metaclasses as the most elegant solution (but elegance is not always
equivalent to convenience :().

Perhaps you may want to consider this for use in the recarray definition.

>
> I think we should add an "obsolescent feature" warning to numarray and
> recarray which flags any use of character typecodes when the appropriate
> command line switches are set.

Well, I don't fully agree with that. I do believe class-based typecodes to
be a more meaningful way of describing types, but charcodes can be quite
advantageous in certain situations, like describing the contents of a
record in a compact way, or passing this info to C routines that deal with
the data.
For example, consider the benefits of describing a recarray format as:

"3s4i20d"

instead of

((Int16, 3),
 (Int32, 4),
 (Float64, 20),
 )

the former being more handy in lots of situations.

I certainly believe that a coexistence of both can be very beneficial,
especially for 3rd party extension makers (like me :).

>
> >Suggestion: if recarray charcodes are not necessary to match the Numeric
> >ones, I propose that using the Python convention maybe a good idea.
> >Look at the table in:
> >http://www.python.org/doc/current/lib/module-struct.html.
>
> This sounds good to me, except that it will break an existing interface
> that I don't have control over. Therefore, I suggest we correct the
> problem by coming up with something better.

Well, if charcodes finally stay in, this has an additional advantage in
that the Python crew has provided meaningful ways to express padding
(character "x"), endianness ("=", "<", ">") and alignment ("@").

So having a compact expression like "@3sx4i20d", apart from resembling
Chinese to occidentals, may give a lot of info in a handy way.

--
Francesc Alted

From jmiller at stsci.edu Fri Jan 24 11:20:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 11:20:05 2003
Subject: [Fwd: Re: [Numpy-discussion] typecodes in numarray]
Message-ID: <3E319543.8040101@stsci.edu>

From jmiller at stsci.edu Fri Jan 24 14:01:31 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri, 24 Jan 2003 14:01:31 -0500
Subject: [Numpy-discussion] typecodes in numarray
References: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <3E318D8B.1090403@stsci.edu>

Paul F Dubois wrote:

>I don't understand this remark:
>
>but I am making an argument that perhaps
>
>>numarray should only go so far in the support of what I regard as an
>>obsolescent feature.
>>If the Numeric developers choose to continue
>>extending the use of typecodes in ways that are incompatible with
>>numarray, one way of dealing with it is to "just say no".
>>We are going
>>beyond the scope of backwards compatibility to on-going compatibility.
>>(Which we may still have to do but needs to be discussed and
>>considered)
>>
>
>There is no "on-going" Numeric development. It stops the minute numarray is
>ready. Period. We developers all agreed on that. The whole reason for
>numarray is that Numeric was pronounced unmaintainable and unextendable by
>those who frequently had to work on it. To do anything else will fragment
>the entire numerical python community and software set.
>

Thanks for clarifying, Paul. My point didn't quite come out right. A
better way to put it might have been:

1. Numarray and Numeric are subject to accidental divergence. As long as
they both continue to change concurrently, they will probably differ even
in interface. Because numarray isn't quite ready yet, they are both still
changing.

2. Typecodes in particular are something numarray is superseding with
something better. Because of this, providing on-going compatibility with
Numeric typecodes may not make sense.

3. Numeric compatibility is not the only driver for the choice of recarray
typecodes, so I can't make arbitrary changes without affecting other
software and people.

4. I think there's a clearer, numarray type object based approach to
describing recarray formats which does not use typecodes at all. Thus,
instead of attempting to weed through and unify layers of conflicting type
codes, we might be able to end-run the whole problem with an alternative
approach.

Todd

From perry at stsci.edu Fri Jan 24 11:34:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 11:34:02 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID:

I think Todd was referring to the recent addition of unsigned types to
Numeric, along with which came new typecodes. These types were already in
numarray at the time.

Perry

> -----Original Message-----
> From: Paul F Dubois [mailto:paul at pfdubois.com]
> Sent: Friday, January 24, 2003 12:42 PM
> To: 'Perry Greenfield'; falted at openlc.org;
> numpy-discussion at lists.sourceforge.net
> Subject: RE: [Numpy-discussion] typecodes in numarray
>
>
> I don't understand this remark:
>
> but I am making an argument that perhaps
> > numarray should only go so far in the support of what I regard as an
> > obsolescent feature. If the Numeric developers choose to continue
> > extending the use of typecodes in ways that are incompatible with
> > numarray, one way of dealing with it is to "just say no".
> > We are going
> > beyond the scope of backwards compatibility to on-going compatibility.
> > (Which we may still have to do but needs to be discussed and
> > considered)
> >
>
> There is no "on-going" Numeric development. It stops the minute
> numarray is
> ready. Period. We developers all agreed on that. The whole reason for
> numarray is that Numeric was pronounced unmaintainable and unextendable by
> those who frequently had to work on it. To do anything else will fragment
> the entire numerical python community and software set.
>

From jmiller at stsci.edu Fri Jan 24 12:01:32 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 12:01:32 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <200301241946.55398.falted@openlc.org>
Message-ID: <3E319ED4.5060709@stsci.edu>

>
>>A means for specifying a recarray format might be created from tuples,
>>type objects, and integer repetition factors.
>>
>>The verbosity of this approach might be a little tedious, but it would
>>also be transparent, maintainable, and conflict free.
>>
>
>I think this is a very good idea. In fact, while working in PyTables I was
>lately pondering what would be the best way to define record arrays, and I
>also think that a verbose approach should be the best.
>
>After considering metaclasses and tuples, I ended up with a compromise
>between the two: dictionaries combined with some function or class to
>refine the definition.
>
>My current thinking is something like:
>
>recarrDescr = {
>    "name"        : defineType(CharType, 16, ""),         # 16-character String
>    "TDCcount"    : defineType(UInt8, 1, 0),              # unsigned byte
>    "ADCcount"    : defineType(Int16, 1, 0),              # signed short integer
>    "grid_i"      : defineType(Int32, 1, 9),              # integer
>    "grid_j"      : defineType(Int32, 1, 9),              # integer
>    "pressure"    : defineType(Float32, 1, 1.),           # float (single-precision)
>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>    "idnumber"    : defineType(Int64, 1, 0),              # signed long long
>    }
>
>where defineType is a class that accepts (type, shape, default) parameters.
>It can be extended safely in the future if more needs appear.
>

You're way ahead of me here. The only thing I don't like about this is
the additional relative complexity because of the addition of field names
and default values. It would be nice to layer this more.

>Perhaps you may want to consider this for use in the recarray definition.
>

We'll definitely consider it as we hash this out.

>
>>I think we should add an "obsolescent feature" warning to numarray and
>>recarray which flags any use of character typecodes when the appropriate
>>command line switches are set.
>>
>
>Well, I don't fully agree with that. I do believe class-based typecodes to
>be a more meaningful way of describing types, but charcodes can be quite
>advantageous in certain situations, like describing the contents of a
>record in a compact way, or passing this info to C routines that deal with
>the data.
>

Yeah, I know.

>For example, consider the benefits of describing a recarray format as:
>
>"3s4i20d"
>

I know.

>
>instead of
>
>((Int16, 3),
> (Int32, 4),
> (Float64, 20),
> )
>

This is pretty much exactly what I was thinking. It is straightforward
to imagine and difficult to forget.

>
>the former being more handy in lots of situations.
>

Would you please name some of these so we can explore handling them both
ways?

>I certainly believe that a coexistence of both can be very beneficial,
>especially for 3rd party extension makers (like me :).
>

If there's a reasonable way to avoid supporting both, we should.

>>>Suggestion: if recarray charcodes are not necessary to match the Numeric
>>>ones, I propose that using the Python convention maybe a good idea.
>>>Look at the table in:
>>>http://www.python.org/doc/current/lib/module-struct.html.
>>>
>>This sounds good to me, except that it will break an existing interface
>>that I don't have control over. Therefore, I suggest we correct the
>>problem by coming up with something better.
>>
>
>Well, if charcodes finally stay in, this has an additional advantage in
>that the Python crew has provided meaningful ways to express padding
>(character "x"), endianness ("=", "<", ">") and alignment ("@").
>

We might also add these to the type-repetition tuple.
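
For readers unfamiliar with the struct-module conventions discussed above (byte-order and alignment prefixes, and the "x" pad byte), here is a minimal sketch using only the Python standard library. It reuses the field counts of the "3s4i20d" example; note that the struct module spells a 16-bit integer "h" ("s" means a char array there), and the sample values are made up for illustration.

```python
import struct

# Field counts from the "3s4i20d" recarray example: 3 int16, 4 int32, 20 float64.
# In the struct module a 16-bit integer is "h", so the format becomes "3h4i20d".
packed = "=3h4i20d"    # "=": native byte order, standard sizes, no alignment
padded = "=3hx4i20d"   # "x" inserts one explicit pad byte after the int16s
native = "@3hx4i20d"   # "@": native sizes and alignment (platform dependent)

packed_size = struct.calcsize(packed)   # 3*2 + 4*4 + 20*8 bytes
padded_size = struct.calcsize(padded)   # exactly one byte more than packed
native_size = struct.calcsize(native)   # may include extra alignment padding

# Round-trip one record with an explicit little-endian layout.
fields = [1, 2, 3] + [10, 20, 30, 40] + [float(i) for i in range(20)]
record = struct.pack("<3h4i20d", *fields)
values = list(struct.unpack("<3h4i20d", record))
```

The native size is deliberately not spelled out, since it depends on the platform's alignment rules.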

Regards,
Todd

From hinsen at cnrs-orleans.fr Fri Jan 24 12:13:05 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Jan 24 12:13:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124000759.GA6042@localhost.localdomain>
References: <20030124000759.GA6042@localhost.localdomain>
Message-ID:

José Fonseca writes:

> With the ability of subclassing types in recent versions of the Python
> language, more people will be interested in subclassing Numeric arrays
> for specific purposes. Still the use of functions instead of methods
> takes away many of the advantages, the ability of being overloaded.

True. On the other hand, there is also an advantage: NumPy routines
can be used on standard Python data types such as number and sequence
types.

In the ideal world (which might come one day), core NumPy
functionality would be part of standard Python, and then all these
operations would work on other built-in types as well.

Until then, I am not sure that changing NumPy functions to methods
is a good idea. I need to call them on scalar numbers much more
often than I subclass arrays.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From paul at pfdubois.com Fri Jan 24 12:36:03 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Jan 24 12:36:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To:
Message-ID: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>

Every time the subject of subclassing a numeric array comes up, it is as if
nobody ever thought of it before. Been there, done that. It doesn't turn out
to be all that useful.
To see why, consider a + b where a and b are Foo instances, and Foo
inherits from numarray.

a. a + b will be a numarray, not a Foo instance, unless you write a new +
operator.
b. Attempting to have numarray itself apply a subclass constructor to the
result runs into the problem that numarray does not have any idea what the
constructor's signature is or what information is needed to fill out that
constructor.
c. Even if the subclass accepts numarray's constructor signature, it would
rarely produce satisfactory results just "losing" the Foo'ness details of a
and b.

This same argument applies to every method that returns a Foo instance, and
every ufunc. So you end up redoing everything anyway.

In short, worrying about subclassing is way down the list of things we ought
to consider.

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On
> Behalf Of Konrad Hinsen
> Sent: Friday, January 24, 2003 12:07 PM
> To: José Fonseca
> Cc: numpy-discussion at lists.sourceforge.net
> Subject: Re: [Numpy-discussion] Extensive use of methods
> instead of functions
>
>
> José Fonseca writes:
>
> > With the ability of subclassing types in recent versions of the Python
> > language, more people will be interested in subclassing Numeric arrays
> > for specific purposes. Still the use of functions instead of methods
> > takes away many of the advantages, the ability of being overloaded.
>
> True. On the other hand, there is also an advantage: NumPy
> routines can be used on standard Python data types such as
> number and sequence types.
>
> In the ideal world (which might come one day), core NumPy
> functionality would be part of standard Python, and then all
> these operations would work on other built-in types as well.
>
> Until then, I am not sure that changing NumPy functions to
> methods is a good idea. I need to call them on scalar numbers
> much more often than I subclass arrays.
>
> Konrad.
> --
> -------------------------------------------------------------------------------
> Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
> Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
> Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
> 45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
> France                                   | Nederlands/Francais
> -------------------------------------------------------------------------------
>

From perry at stsci.edu Fri Jan 24 13:11:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 13:11:05 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID:

Paul Dubois writes:
>
> Every time the subject of subclassing a numeric array comes up, it is as if
> nobody ever thought of it before. Been there, done that. It
> doesn't turn out
> to be all that useful. To see why, consider a + b where a and b are Foo
> instances, and Foo inherits from numarray.
>
> a. a + b will be a numarray, not a Foo instance, unless you write a new +
> operator.
> b. Attempting to have numarray itself apply a subclass constructor to the
> result runs into the problem that numarray does not have any idea what the
> constructor's signature is or what information is needed to fill out that
> constructor.
> c. Even if the subclass accepts numarray's constructor signature, it would
> rarely produce satisfactory results just "losing" the Foo'ness
> details of a
> and b.
>
> This same argument applies to every method that returns a Foo
> instance, and
> every ufunc.
> So you end up redoing everything anyway.
>
> In short, worrying about subclassing is way down the list of
> things we ought
> to consider.
>

Paul illustrates some important points. While I'm not as down on the
ability to subclass (more on that later), he is absolutely right that
most think that subclassing is a breeze and don't realize that it is
far from being so. The arguments for this would be helped immensely by
a practical example of a desired subclass. This does far more to
illustrate the issues than an abstract discussion. For most instances
that I have considered or thought about it is unavoidable that one must
override virtually all (if not all) the operators and functions.
Nevertheless, subclassing can still save a great deal of work over
implementing a completely new extension. But you'll have to deal with
defining how all the operators and functions should behave.

In our view, the most valuable subclassing in numarray comes from
subclassing NDArray, which handles all the structural operations for
arrays (recarray makes heavy use of this). But recarrays don't try to
support numerical operations, and that makes it fairly easy.
Subclassing numarrays is significantly more work for the reasons cited.
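
Paul's points (a) and (b) can be reproduced with any built-in sequence type. The sketch below uses list purely as a stand-in for numarray (an assumption made so the example runs without numarray; note that + on lists concatenates rather than adds element-wise), and the Foo and Foo2 classes and their units attribute are hypothetical.

```python
class Foo(list):
    """Hypothetical subclass carrying one extra detail ('units')."""

    def __init__(self, data, units=""):
        super().__init__(data)
        self.units = units


a = Foo([1, 2], units="m")
b = Foo([3, 4], units="m")

# The inherited operator knows nothing about Foo's constructor,
# so the result is a plain list and the 'units' detail is lost.
c = a + b
result_is_plain_list = type(c) is list


class Foo2(Foo):
    """Same idea, but + is overridden so the result is rebuilt as Foo2."""

    def __add__(self, other):
        # The subclass has to decide how its extra details combine.
        return Foo2(list.__add__(self, other), units=self.units)


d = Foo2([1, 2], units="m") + Foo2([3, 4], units="m")
```

Every other operator or method that should return the subclass would need the same treatment, which is the "redoing everything" cost described above.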

Perry

From jmiller at stsci.edu Fri Jan 24 13:56:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 13:56:01 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <3E31B9DB.7080603@stsci.edu>

>
>
>> My current thinking is something like:
>>
>> recarrDescr = {
>>     "name"        : defineType(CharType, 16, ""),         # 16-character String
>>     "TDCcount"    : defineType(UInt8, 1, 0),              # unsigned byte
>>     "ADCcount"    : defineType(Int16, 1, 0),              # signed short integer
>>     "grid_i"      : defineType(Int32, 1, 9),              # integer
>>     "grid_j"      : defineType(Int32, 1, 9),              # integer
>>     "pressure"    : defineType(Float32, 1, 1.),           # float (single-precision)
>>     "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>     "idnumber"    : defineType(Int64, 1, 0),              # signed long long
>>     }
>>
>> where defineType is a class that accepts (type, shape, default)
>> parameters.
>> It can be extended safely in the future if more needs appear.
>>
> You're way ahead of me here. The only thing I don't like about this
> is the additional relative complexity because of the addition of field
> names and default values. It would be nice to layer this more.

One more thing I don't understand looking at this: a dictionary is
unordered.

Todd

From j_r_fonseca at yahoo.co.uk Fri Jan 24 14:00:03 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Fri Jan 24 14:00:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To:
References: <20030124000759.GA6042@localhost.localdomain>
Message-ID: <20030124215828.GA32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 09:07:21PM +0100, Konrad Hinsen wrote:
> José Fonseca writes:
>
> > With the ability of subclassing types in recent versions of the Python
> > language, more people will be interested in subclassing Numeric arrays
> > for specific purposes.
> > Still the use of functions instead of methods
> > takes away many of the advantages, the ability of being overloaded.
>
> True. On the other hand, there is also an advantage: NumPy routines
> can be used on standard Python data types such as number and sequence
> types.
>
> In the ideal world (which might come one day), core NumPy
> functionality would be part of standard Python, and then all these
> operations would work on other built-in types as well.
>
> Until then, I am not sure that changing NumPy functions to methods
> is a good idea. I need to call them on scalar numbers much more
> often than I subclass arrays.

You've got a good point there. I often want to use them with other Numeric
array-like classes, but I've also used them with standard Python data
types for convenience.

Still, it's perfectly possible for both interfaces to co-exist. Of course,
when one uses the .method version one can't expect it to work with
standard Python data types, and has to make a choice, or call asarray()
or something equivalent before using it.

Regards,

José Fonseca

From j_r_fonseca at yahoo.co.uk Fri Jan 24 15:21:02 2003
From: j_r_fonseca at yahoo.co.uk ('José Fonseca')
Date: Fri Jan 24 15:21:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
References: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY>
Message-ID: <20030124231900.GB32437@localhost.localdomain>

On Fri, Jan 24, 2003 at 12:34:54PM -0800, Paul F Dubois wrote:
>
> Every time the subject of subclassing a numeric array comes up, it is as
> if nobody ever thought of it before.

Why do you treat me as if I was trying to sell the "Next Big Thing"!?

First, I must tell you that the first time I came across the idea of
subclassing Numeric arrays was while reading the "Subclassing"
subsection, in the "Special Topics" section of the Numeric Python
manual. Your name, Paul, appears as one of the authors.

Second, subclassing Numeric arrays may be useful. Again, the distribution
of Numeric Python even has one big example: making a linear algebra
oriented version of Numeric python, where the operations would be the
standard matrix and vector operations instead of the element-wise
operations.

> Been there, done that. It doesn't turn out to be all that useful.

As seen from the examples above, it is obvious you did. Still, I don't see
how you can possibly say it isn't useful...

> To see why, consider a + b where a and b are Foo instances, and Foo
> inherits from numarray.
>
> a. a + b will be a numarray, not a Foo instance, unless you write a
> new + operator. b. Attempting to have numarray itself apply a
> subclass constructor to the result runs into the problem that numarray
> does not have any idea what the constructor's signature is or what
> information is needed to fill out that constructor. c. Even if the
> subclass accepts numarray's constructor signature, it would rarely
> produce satisfactory results just "losing" the Foo'ness details of a
> and b.
>
> This same argument applies to every method that returns a Foo
> instance, and every ufunc. So you end up redoing everything anyway.

[In general it may be useful to subclass Numeric arrays if one just
wants to add/overload methods, but no new properties.]

And third, if you read my thread you'd notice that the use of methods
instead of functions has implications/benefits much beyond the
subclassing issue. It's particularly important for Numeric-like arrays.
All objects in Python are virtual so you don't actually need to subclass
to use different kinds of objects in the same piece of code.
While you're right in the sense that for many practical applications
there is little use for subclassing (a sparse matrix class is one of them,
for instance), you can't deny that it is quite useful to have Numeric-like
arrays, on the same basis as is currently done with file-like objects in
Python, i.e., they could be strings or web pages, as long as they define
a given set of methods.

> In short, worrying about subclassing is way down the list of things we
> ought to consider.

If so, then why did your comment focus only on the subclassing issue?
The subclassing was a mere introduction [perhaps unfortunate, I confess]
to the method overloading issue.

Now, if you could (re)read my first post and comment on my actual
suggestion I would appreciate it.

Of course, I have no problem if the Numeric/numarray maintainers
decide to turn it down. I'll most probably just use UserArray.py to create a
"method-ized" version of Numeric, so that my algorithms can work with
both Numeric arrays and sparse matrices. (I do have a real need for
this.)

BTW, there is an alternative to creating a fully method-ized Numeric array:
just add an attribute that points to the module to which the class belongs,
e.g., "myarray.module.take" would point to "Numeric.take" if it was a
Numeric array, or "Sparse.take" if it was a sparse matrix.

Regards,

José Fonseca

From bsder at allcaps.org Fri Jan 24 16:19:03 2003
From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.)
Date: Fri Jan 24 16:19:03 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <20030124231900.GB32437@localhost.localdomain>
Message-ID:

On Fri, 24 Jan 2003, 'José Fonseca' wrote:

> Of course, I have no problem if the Numeric/numarray maintainers
> decide to turn it down.
> I'll most probably just use UserArray.py to create a
> "method-ized" version of Numeric, so that my algorithms can work with
> both Numeric arrays and sparse matrices. (I do have a real need for
> this.)

Sparse matrices are common enough that they really should be a base part
of Numeric rather than requiring subclassing/extending/etc.

I know that Travis O. was working on some sparse matrix stuff a while
back, so you might want to contact him to get the current status of that
work.

-a

From falted at openlc.org Sat Jan 25 04:43:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Sat Jan 25 04:43:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E319ED4.5060709@stsci.edu>
References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu>
Message-ID: <200301251342.15164.falted@openlc.org>

On Friday 24 January 2003 21:15, Todd Miller wrote:
> >
> >My current thinking is something like:
> >
> >recarrDescr = {
> >    "name"        : defineType(CharType, 16, ""),         # 16-character String
> >    "TDCcount"    : defineType(UInt8, 1, 0),              # unsigned byte
> >    "ADCcount"    : defineType(Int16, 1, 0),              # signed short integer
> >    "grid_i"      : defineType(Int32, 1, 9),              # integer
> >    "grid_j"      : defineType(Int32, 1, 9),              # integer
> >    "pressure"    : defineType(Float32, 1, 1.),           # float (single-precision)
> >    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
> >    "idnumber"    : defineType(Int64, 1, 0),              # signed long long
> >    }
> >
> >where defineType is a class that accepts (type, shape, default)
> > parameters. It can be extended safely in the future if more needs appear.
>
> You're way ahead of me here. The only thing I don't like about this is
> the additional relative complexity because of the addition of field
> names and default values. It would be nice to layer this more.
>

Well, I think a map between field names and values is valuable from the
user's point of view. It may help him to label the different information on
the recarray.
Moreover, if __getattr__ and __setattr__ methods (or
__getitem__ and __setitem__) were implemented on recarray (as they are
in my recarray2 version, for example), the field name could become a very
convenient way to access a specific field by name (this introduces the
limitation that a field name must be a valid Python identifier, but I think
this is not a big restriction). By looking at the description dictionary,
the user can get a quick idea of what he can find in every field (with no
need for counting, which can be a big advantage, especially for long
records).

With regard to default values, you can make this parameter (even the shape)
a keyword parameter in order to make it optional. In that way, the
definition can be as simple as "defineType(CharType)" (or even just
"CharType", if you add a bit of code) or as complete as
"defineType(CharType, shape, default, whatever_you_want)". I think this is
quite a flexible approach.

>One more thing I don't understand looking at this: a dictionary is
>unordered.

Yeah, but this can be regarded as an advantage rather than a drawback in the
sense that you can choose the order you (the developer) prefer. For example,
I was first using an alphanumerical order to arrange the data fields, but
now I'm considering that an arrangement that optimizes the alignment of the
fields could be far better. For example, say that you have an (Int8, Int32,
Float64) record; in principle it could be easy to create a routine that
rearranges this record in the form (Float64, Int32, Int8), which optimizes
access to the different fields (it may even be possible to introduce
automatic padding later on if recarrays support it in the future).

Maybe you are getting confused in thinking that recarrDescr will create the
recarray. Not at all, this is a *metadata* definition that can be passed to
the actual recarray function for recarray creation.
Its function would be
similar to the formats parameter (with typical values like "3a,4i,3w") in
recarray.array, but with more verbosity and all the reported advantages.

> >instead of
> >
> >((Int16, 3),
> > (Int32, 4),
> > (Float64, 20),
> > )
>
> This is pretty much exactly what I was thinking. It is straightforward
> to imagine and difficult to forget.
>
> >the former being more handy in lots of situations.
>
> Would you please name some of these so we can explore handling them both
> ways?
>

Well, I'm afraid that the biggest advantage would be when dealing with
recarrays in C extension modules. In this kind of situation it would be far
better to deal with a "3a4i3w" array than a tuple of Python objects. But
maybe I'm wrong and the latter is not so complicated to manage; however, I
used to work a lot with records (even before meeting recarray) and I was
quite comfortable with formats in string mode.

Or perhaps it would be enough to provide a method for converting from the
standard metadata layout (dictionary or tuple or whatever) to a string
format. This should not be very difficult.

> >
> >Well, if charcodes finally stay in, this has an additional advantage in
> >that the Python crew has provided meaningful ways to express padding
> > (character "x"), endianness ("=", "<", ">") and alignment ("@").
>
> We might also add these to the type-repetition tuple.

It would be nice, of course.
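
As a rough illustration of the description dictionary with optional keyword parameters and an alignment-friendly field order, here is a minimal sketch. The defineType class below is a hypothetical stand-in written for this example, not PyTables or numarray code, and the NumType tuples are assumed placeholders for numarray type objects (only a name and an item size).

```python
from collections import namedtuple

# Assumed stand-ins for numarray type objects: a name and a size in bytes.
NumType = namedtuple("NumType", "name itemsize")
Int8 = NumType("Int8", 1)
Int32 = NumType("Int32", 4)
Float64 = NumType("Float64", 8)


class defineType:
    """Hypothetical field description; shape and default are optional."""

    def __init__(self, natype, shape=1, default=None):
        self.natype = natype
        self.shape = shape
        self.default = default


def alignment_order(descr):
    """Return field names sorted by decreasing item size, then by name."""
    return sorted(descr, key=lambda name: (-descr[name].natype.itemsize, name))


descr = {
    "flag": defineType(Int8),                    # shortest possible definition
    "grid_i": defineType(Int32, default=9),      # default given by keyword
    "pressure": defineType(Float64, shape=32),   # double[32]
}

# The dictionary itself fixes no layout; the order is derived from metadata.
order = alignment_order(descr)
```

Sorting by decreasing item size is only one possible policy; any other ordering rule could be plugged in, since the dictionary carries metadata only.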

--
Francesc Alted

From jmiller at stsci.edu Sat Jan 25 11:16:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Sat Jan 25 11:16:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu> <200301251342.15164.falted@openlc.org>
Message-ID: <3E32E5E3.2020704@stsci.edu>

Francesc Alted wrote:

>On Friday 24 January 2003 21:15, Todd Miller wrote:
>
>>>My current thinking is something like:
>>>
>>>recarrDescr = {
>>>    "name"        : defineType(CharType, 16, ""),         # 16-character String
>>>    "TDCcount"    : defineType(UInt8, 1, 0),              # unsigned byte
>>>    "ADCcount"    : defineType(Int16, 1, 0),              # signed short integer
>>>    "grid_i"      : defineType(Int32, 1, 9),              # integer
>>>    "grid_j"      : defineType(Int32, 1, 9),              # integer
>>>    "pressure"    : defineType(Float32, 1, 1.),           # float (single-precision)
>>>    "temperature" : defineType(Float64, 32, arange(32)),  # double[32]
>>>    "idnumber"    : defineType(Int64, 1, 0),              # signed long long
>>>    }
>>>

Still think I'd prefer something separable:

recarrStruct = ( (CharType, 16), UInt8, Int16, Int32, Int32, Float32,
                 (Float64, 32), Int64 )

recarrFields = ["name", "TDCcount", "ADCcount", "grid_i", "grid_j",
                "pressure", "temperature", "idnumber"]

I guess it might not be quite as good for large structs.

>>>where defineType is a class that accepts (type, shape, default)
>>>parameters. It can be extended safely in the future if more needs appear.
>>>
>>You're way ahead of me here. The only thing I don't like about this is
>>the additional relative complexity because of the addition of field
>>names and default values. It would be nice to layer this more.
>>
>
>Well, I think a map between field names and values is valuable from the
>user's point of view. It may help him to label the different information on
>the recarray.
Moreover, if __getattr__ and __setattr__ methods (or >__getitem__ and __setitem__) would get implemented on recarray (as they are >in my recarray2 version, for example), the field name can become a very >convenient manner to access a specific field by name (this introduce the >limitation that field name must be a valid python identifier, but I think >this is not a big restriction). By looking at the description dictionary, >the user can have a quick idea of what he can find in every field (with no >need of counting, which can be a big advantage specially for long records). > That's true and sounds nice. I'm just thinking records with named fields should be derived from records with positional fields. If the functionality is layered, you can use as much complexity as you need. It's a good sign that both you and I thought of an identical tuple format; it's the obvious minimal one. > >With regard to default values, you can make this parameter (even the shape) >a keyword parameter in order to make it optional. > OK. That's a good point. > > >>One more thing I don't understand looking at this: a dictionary is >>unordered. >> >> > >Yeah, but this can be regarded as an advantage rather than a drawback in the >sense that you can choose the order you (the developer) prefer. For example, >I was using first a alphanumerical order to arrange the data fields, but >now, I'm considering that a arrangement that optimizes the alignment of the >fields could be far better. As for one, say that you have a (Int8, Int32, >Float64) record; in principle it could be easy to create a routine that >arranges this record in the form (Float64,Int32, Int8) that optimizes the >different field access (it may be even possible to introduce automatic >padding later on if recarrays would support them in the future). > >Maybe you are getting confused > Yes and no. :) >in thinking that recarrDescr will create the >recarray. 
Not at all, this is a *metadata* definition that can be passed to the
>actual recarray function for recarray creation.
>

Just like the type repetition tuple except also including field names
and default values. I don't think you lost me. For what we do, the
exact physical layout of the "struct" is important, so order matters. I
see order as part of the meta-data, but I don't usually deal with
meta-entities so maybe I've got that part wrong. :)

>Its function would be
>similar to the formats parameter (with typical values like "3a,4i,3w") in
>recarray.array, but with more verbosity and all the reported advantages.
>
>>>instead of
>>>
>>>((Int16, 3),
>>>(Int32, 4),
>>>(Float64, 20),
>>>)
>>>
>>This is pretty much exactly what I was thinking. It is straightforward
>>to imagine and difficult to forget.
>>
>>>the former being more handy in lots of situations.
>>>
>>Would you please name some of these so we can explore handling them both
>>ways?
>>
>
>Well, I'm afraid that the best advantage would be when dealing with
>recarrays in C extension modules. In this kind of situation it would be far
>better to deal with a "3a4i3w" array than a tuple of python objects. But
>maybe I'm wrong and the latter is not so complicated to manage; however, I
>used to work a lot with records (even before meeting recarray) and I was
>quite comfortable with formats in string mode.
>

I was thinking that if the above was an issue, we could write an API
function(s) to "compile" the type-repetition tuple into arrays of ints
which describe the type of each field and corresponding repetition factor.

>
>Or perhaps it would be enough to provide a method for converting from the
>standard metadata layout (dictionary or tuple or whatever) to a string
>format. This should not be very difficult.
>

Almost exactly what I suggested above.
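
The "compile" step described above could look roughly like the following.
This is only a sketch under stated assumptions: the function names
compile_format and decompile_format are invented for illustration, type
names are plain strings standing in for numarray type objects, and the
integer codes are placeholders rather than numarray's real enumeration.

```python
# Hypothetical integer codes; numarray's real enumeration values are not
# given in this thread, so these numbers are placeholders.
TYPE_CODES = {"Int16": 0, "Int32": 1, "Float64": 2}
CODE_NAMES = {code: name for name, code in TYPE_CODES.items()}

def compile_format(fmt):
    """Split a type-repetition tuple into (type codes, repeat counts)."""
    codes, repeats = [], []
    for item in fmt:
        # A bare type is treated as shorthand for (type, 1).
        name, count = item if isinstance(item, tuple) else (item, 1)
        codes.append(TYPE_CODES[name])
        repeats.append(count)
    return codes, repeats

def decompile_format(codes, repeats):
    """Rebuild a type-repetition tuple from the two parallel lists."""
    return tuple(
        CODE_NAMES[c] if n == 1 else (CODE_NAMES[c], n)
        for c, n in zip(codes, repeats)
    )

fmt = (("Int16", 3), ("Int32", 4), ("Float64", 20), "Int32")
codes, repeats = compile_format(fmt)
print(codes, repeats)
```

Two flat lists of ints are easy to hand to C code, which is the situation
the string formats were being defended for.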
See you Monday, Todd From baecker at physik.tu-dresden.de Sun Jan 26 02:41:02 2003 From: baecker at physik.tu-dresden.de (baecker at physik.tu-dresden.de) Date: Sun Jan 26 02:41:02 2003 Subject: [Numpy-discussion] complex diagonal matrix Message-ID: Hi, I just wondered if there is a "nicer" way of generating a complex diagonal matrix than a) v=arange(10,typecode=Complex) mat=diag(v) b) v=arange(10) mat=diag(v)+0j Namely, wouldn't something like v=arange(10) mat=diag(v,typecode=Complex) be nicer? BTW: I somehow found that in the (excellent) documentation of Numeric the definitions from Mlab.py are a bit hidden. In my case I know nothing about matlab and I somehow expected that this type of routines are to be found in the section (together with zeros,ones etc. etc....) Also diag is not listed in the index http://www.pfdubois.com/numpy/html2/numpy-22.html#A or ? Arnd From hinsen at cnrs-orleans.fr Sun Jan 26 03:11:02 2003 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Jan 26 03:11:02 2003 Subject: [Numpy-discussion] complex diagonal matrix In-Reply-To: References: Message-ID: baecker at physik.tu-dresden.de writes: > I just wondered if there is a "nicer" way of generating > a complex diagonal matrix than > a) > v=arange(10,typecode=Complex) > mat=diag(v) > b) > v=arange(10) > mat=diag(v)+0j > > Namely, wouldn't something like > v=arange(10) > mat=diag(v,typecode=Complex) > be nicer? Why would that be nicer? Personally, I prefer to have explicit typecodes limited to a very small number of array generators, and have all other functions apply the standard type-preservation rules. Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From list at jsaul.de Sun Jan 26 04:03:05 2003 From: list at jsaul.de (Joachim Saul) Date: Sun Jan 26 04:03:05 2003 Subject: [Numpy-discussion] complex diagonal matrix In-Reply-To: References: Message-ID: <20030126120117.GB869@jsaul.de> * baecker at physik.tu-dresden.de [26.01.2003 11:40]: > I just wondered if there is a "nicer" way of generating > a complex diagonal matrix than > a) > v=arange(10,typecode=Complex) > mat=diag(v) > b) > v=arange(10) > mat=diag(v)+0j > > Namely, wouldn't something like > v=arange(10) > mat=diag(v,typecode=Complex) > be nicer? No, because diag() is supposed to create a diagonal, but *not* to cast to another type. If you wanted to add that "functionality" to functions like diag(), you would also have to add it to functions like reshape() etc., i.e. practically everywhere. The way it is handled now is reasonably simple and flexible, and there is really no advantage of your suggestion compared to approach a). 
Cheers,
Joachim

From falted at openlc.org Mon Jan 27 04:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 04:02:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E32E5E3.2020704@stsci.edu>
References: <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu>
Message-ID: <200301271301.01659.falted@openlc.org>

On Saturday 25 January 2003 20:30, Todd Miller wrote:
>
> Still think I'd prefer something separable:
>
> recarrStruct = ( (CharType, 16),
>                  UInt8,
>                  Int16,
>                  Int32,
>                  Int32,
>                  Float32,
>                  (Float64, 32),
>                  Int64 )
>
> recarrFields = ["name",
>                 "TDCcount",
>                 "ADCcount",
>                 "grid_i",
>                 "grid_j",
>                 "pressure",
>                 "temperature",
>                 "idnumber"]
>
> I guess it might not be quite as good for large structs.

Me too...

>
> It's a good sign that both you and I thought of an identical tuple
> format; it's the obvious
> minimal one.

Yeah. We just differ in the way to arrange this metadata to be passed to the
recarray constructor. But I think this is secondary compared to the
flexibility that a verbose approach offers compared with the current string
format. In fact, more than one container might be supported to define the
metadata; one can start with tuples as you suggest, but in the future other
ways can be added (if considered convenient).

For example, I think I'll stick with the dictionary option for PyTables, but
a class declaration for the metadata would also be supported, as in:

class Small(IsRecord):
    var1 = defineType(CharType, 2, "")
    var2 = defineType(Int32, 1)
    var3 = Float64

This would not be difficult to support because, by accessing
Small().__dict__, you also get a dictionary. In addition, the latter will
ensure (by construction) that you are not using a non-valid python
identifier, which is mandatory in my current implementation. I find these
containers (dictionaries and classes) both elegant and convenient.
> > Just like the type repetition tuple except also including field names > and default values. I don't think you lost me. For what we do, the > exact physical layout of the "struct" is important, so order matters. I > see order as part of the > meta-data, but I don't usually deal with meta-entities so maybe I've > got that part wrong. :) > Well, if you need positional fields, you may add a (optional) parameter, called for example, "position" so that you can fix it. > > I was thinking that if the above was an issue, we could write an API > function(s) to "compile" the type-repetition tuple into arrays of ints > which describe the type of each field and corresponding repetition factor. Yeah, I agree that this would be the best solution. That way, the charcodes will be factored out from the code, and by just providing such and API (both in Python and C), would be enough to reconstruct them, if needed. That will allow a more consistent numarray internal code. > > See you Monday, Right, how did you know that? :) -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 06:44:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 06:44:03 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu> <200301271301.01659.falted@openlc.org> Message-ID: <3E354551.5090704@stsci.edu> Francesc Alted wrote: >Yeah. We just differ in the way to arrange this metadata to be passed to the >recarray constructor. But I think this is secondary compared to the >flexibility that a verbose approach offers compared with the actual string >format. > Yes. So one question is: if we were to add type-repetition tuples to recarray as an alternative to the current character code strings, would that be any form of improvement to recarray from your perspective? As I see it, recarray currently has a clean seperation between format and naming which permits the latter to be optional. 
Before changing that, I'd need a clear argument why. (I didn't design and generally don't even maintain recarray). >In fact, more than one container might be supported to define the >metadata; one can start with tuples as you suggest, but in the future other >ways can be added (if considered convenient). > > >For example, I think I'll stick with the dictionary option for PyTables, but >also a class declaration for the metadata would be supported, like in : > >class Small(IsRecord): > var1 = defineType(CharType, 2, "") > var2 = defineType(Int32, 1) > var3 = Float64 > >This would not be difficult to support because, by accessing to the >Small().__dict__, you get also a dictionary. In addition, the latter will >ensure (by construction) that you are not using a non-valid python >identifier, which is mandatory in my current implementation. I find these >containers (dictionaries and classes) both elegant and convenient. > > I'm not trying to be Mr. Negative here, but one thing to keep in mind is this: >>> class C: ... pass ... >>> c = C() >>> dir(c.__dict__) ['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values'] Which is to say, the instance dictionary is a little cluttered, and it might not be that easy to determine which objects in it are there to define the data format. >>Just like the type repetition tuple except also including field names >>and default values. I don't think you lost me. For what we do, the >>exact physical layout of the "struct" is important, so order matters. 
I >>see order as part of the >>meta-data, but I don't usually deal with meta-entities so maybe I've >>got that part wrong. :) >> > >Well, if you need positional fields, you may add a (optional) parameter, >called for example, "position" so that you can fix it. > > I'm sure that's not the easiest way to capture struct layout, but I take your point. Since position matters to me, I'd prefer that capturing them was implicit. Since it doesn't to you, it seems OK for it to be explicit. Either default mode can support the other, but capturing order with tuples is free, while capturing order with a __dict__ will take some kind of extra work. >>I was thinking that if the above was an issue, we could write an API >>function(s) to "compile" the type-repetition tuple into arrays of ints >>which describe the type of each field and corresponding repetition factor. >> >> > >Yeah, I agree that this would be the best solution. That way, the charcodes >will be factored out from the code, and by just providing such and API (both >in Python and C), would be enough to reconstruct them, if needed. That will >allow a more consistent numarray internal code. > > I'm thinking the general format for this may be converting N-tuples of types and ints into N arrays of types and ints. And vice versa. It's obvious how this works with numarray types. I think the chararray types need work and need to be mapped into the same integer enumeration as the numeric types in a non-overlapping way. >See you Monday, > > > >Right, how did you know that? 
:) > > Insightful on weekends anyway, Todd From jmiller at stsci.edu Mon Jan 27 08:30:02 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 08:30:02 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> Message-ID: <3E355E35.9070805@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 15:42, Todd Miller va escriure: > > >>Yes. So one question is: if we were to add type-repetition tuples to >>recarray as an alternative to the current character code strings, would >>that be any form of improvement to recarray from your perspective? >> >> > >Well, at least, charcodes can be avoided. I think it's a big win... or maybe >not as big? > > I think that avoiding the charcodes would be an improvement. Type-repetition tuples provide a clear well defined way to define data formats. It's not so clear that it eliminates the requirement for on-going Numeric compatability, but it might. > > >>As I see it, recarray currently has a clean seperation between format >>and naming which permits the latter to be optional. Before changing >>that, I'd need a clear argument why. (I didn't design and generally >>don't even maintain recarray). >> >> > >One argument is the fact that a map is very clear to the user, although that >such a map can be built *after* the names and format are passed to the >recarray constructor and be accessible as an atribute. However, the latter >solution is worse IMO, because the user has to supply two separate pieces of >information when, actually, these should be regarded as a unity. Anyway, >this maybe a subjective perception. > > Well, I think there's truth to the danger of seperating names from data declarations, but it is easy to map keys(), values() to the seperate pieces in a different layer if necessary. >This would not be difficult to support because, by accessing to the >Small().__dict__, you get also a dictionary. 
In addition, the latter will >ensure (by construction) that you are not using a non-valid python >identifier, which is mandatory in my current implementation. I find these >containers (dictionaries and classes) both elegant and convenient. > > >>I'm not trying to be Mr. Negative here, but one thing to keep in mind >> >> > >Oh dear, you are right!. > For a few seconds there, I thought I was on a roll! >In fact, I forgot that to make this to work, you >need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's >post: http://mail.python.org/pipermail/python-list/2002-July/112007.html). >I was following this recipe, but I forgot that I was using Python 2.2. > >So, as numarray has to work with previous python versions, there is no point >to care about that. > > In truth, numarray-0.4 and up already require Python-2.2 and up. >I'm sure that's not the easiest way to capture struct layout, but I >take your point. Since position matters to me, I'd prefer that >capturing them was implicit. Since it doesn't to you, it seems OK for >it to be explicit. Either default mode can support the other, but >capturing order with tuples is free, while capturing order with a >__dict__ will take some kind of extra work. > > > >That's right. We have some different needs and priorities, and we should >take the approach better suited to each other. But exchanging points of view >is always a great thing. > > > >>I'm thinking the general format for this may be converting N-tuples of >>types and ints into N arrays of types and ints. And vice versa. >>It's obvious how this works with numarray types. I think the chararray >>types need work and need to be mapped into the same integer enumeration >>as the numeric types in a non-overlapping way. >> >> >> > >I can't catch your point here. Why there should be a problem with >chararrays?. > What I was trying to see is that chararray types are not as well designed as the numarray types, nor are they reflected in the C-API. 
>
>

From falted at openlc.org Mon Jan 27 08:39:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 08:39:05 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E354551.5090704@stsci.edu>
References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu>
Message-ID: <200301271717.19055.falted@openlc.org>

On Monday 27 January 2003 15:42, Todd Miller wrote:
> Yes. So one question is: if we were to add type-repetition tuples to
> recarray as an alternative to the current character code strings, would
> that be any form of improvement to recarray from your perspective?

Well, at least, charcodes can be avoided. I think it's a big win... or maybe
not as big?

>
> As I see it, recarray currently has a clean separation between format
> and naming which permits the latter to be optional. Before changing
> that, I'd need a clear argument why. (I didn't design and generally
> don't even maintain recarray).

One argument is the fact that a map is very clear to the user, although
such a map can be built *after* the names and format are passed to the
recarray constructor and be accessible as an attribute. However, the latter
solution is worse IMO, because the user has to supply two separate pieces of
information when, actually, these should be regarded as a unity. Anyway,
this may be a subjective perception.

> >This would not be difficult to support because, by accessing to the
> >Small().__dict__, you get also a dictionary. In addition, the latter will
> >ensure (by construction) that you are not using a non-valid python
> >identifier, which is mandatory in my current implementation. I find these
> >containers (dictionaries and classes) both elegant and convenient.
>
> I'm not trying to be Mr. Negative here, but one thing to keep in mind

Oh dear, you are right!.
In fact, I forgot that to make this to work, you need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's post: http://mail.python.org/pipermail/python-list/2002-July/112007.html). I was following this recipe, but I forgot that I was using Python 2.2. So, as numarray has to work with previous python versions, there is no point to care about that. > > I'm sure that's not the easiest way to capture struct layout, but I > take your point. Since position matters to me, I'd prefer that > capturing them was implicit. Since it doesn't to you, it seems OK for > it to be explicit. Either default mode can support the other, but > capturing order with tuples is free, while capturing order with a > __dict__ will take some kind of extra work. That's right. We have some different needs and priorities, and we should take the approach better suited to each other. But exchanging points of view is always a great thing. > > I'm thinking the general format for this may be converting N-tuples of > types and ints into N arrays of types and ints. And vice versa. > It's obvious how this works with numarray types. I think the chararray > types need work and need to be mapped into the same integer enumeration > as the numeric types in a non-overlapping way. > I can't catch your point here. Why there should be a problem with chararrays?. -- Francesc Alted From Chris.Barker at noaa.gov Mon Jan 27 10:20:06 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Jan 27 10:20:06 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> Message-ID: <3E35768B.DD6454BE@noaa.gov> Francesc Alted wrote: > So, as numarray has to work with previous python versions, Why? Anyone using NumArray is either starting from scratch or porting from Numeric, so having to port to a newer version of Python is a very small deal. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jmiller at stsci.edu Mon Jan 27 10:34:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 10:34:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> <3E35768B.DD6454BE@noaa.gov> Message-ID: <3E357B5F.9030908@stsci.edu> Chris Barker wrote: >Francesc Alted wrote: > > > >>So, as numarray has to work with previous python versions, >> >> > >Why? Anyone using NumArray is either starting from scratch or porting >from Numeric, so having to port to a newer version of Python is a very >small deal. > > Just to make it very clear: numarray-0.4 and up require Python-2.2 or higher. Up until numarray-0.4 (released in November), that was not the case, and numarray ran (and was tested!) on Python-2.0 and higher. The desire to increase C-level Numeric compatability and to improve simple indexing speed led us to a C baseclass, which is only supported in Python-2.2 and up. Todd From falted at openlc.org Mon Jan 27 11:23:01 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 27 11:23:01 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: <3E355E35.9070805@stsci.edu> References: <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> Message-ID: <200301272021.47587.falted@openlc.org> A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure: > >So, as numarray has to work with previous python versions, there is no > > point to care about that. > > In truth, numarray-0.4 and up already require Python-2.2 and up. Oh!, I didn't know that. In such a case, I think it's worth to consider the possibility to define records as classes descendants from metaclasses. But, of course, you have the ultimate decision. 
> >>I'm thinking the general format for this may be converting N-tuples of > >>types and ints into N arrays of types and ints. And vice versa. > >>It's obvious how this works with numarray types. I think the chararray > >>types need work and need to be mapped into the same integer enumeration > >>as the numeric types in a non-overlapping way. > > > >I can't catch your point here. Why there should be a problem with > >chararrays?. > > What I was trying to see is that chararray types are not as well > designed as the numarray types, nor are they reflected in the C-API. I see. Well, is it really desirable such a unification? CharArray entities come from a module and NumArray from another one, and that should be ok. Why bother in creating a unified API or integer enumeration?. I think this should be not a big drawback for C-extension crafters (although, to say the truth, that would be very elegant if you manage to do that, but maybe it is not worth the effort, I don't know). -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 11:39:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 11:39:01 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <000001c2c635$624e9a40$6601a8c0@NICKLEBY> Message-ID: <3E358A72.6050400@stsci.edu> Paul F Dubois wrote: >IMHO you can assume any Python you want. Look to the long term here, not the >short. > You lost me. numarray-0.4 needs at least Python-2.2 or baseclasses don't exist. I had a slow Python equivalent for the baseclass as I refactored prior to numarray-0.4, but it's gone now. > >I'm a bit uncertain on MA as to whether my old design is right. Maybe I >should be inheriting from NDarray? So that MA is more of a sibling of >numarray rather than a wrapper of it? > > I asked Perry about this one. His points (salted a little by me) were: 1. If you inherit from NumArray, you also inherit from NDArray. If you only inherit from NDArray, all you get are the structural operations. 2. 
If you inherit from NumArray, you can use Liskov substitution to pass MA's directly into extensions expecting NumArrays. This substitution may or may not be good. Also, isinstance(anMA, numarray) will return True. 3. If you inherit from NumArray, you get numerical method definitions which may or may not be applicable to MA. With a little thrashing, we might also get MAs to work for ufuncs. In fact, ufuncs are the key to whether or not the NumArray numerical methods add any value. Todd > > From jmiller at stsci.edu Mon Jan 27 11:54:06 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 11:54:06 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> <200301272021.47587.falted@openlc.org> Message-ID: <3E358DE0.7040501@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure: > > >>>So, as numarray has to work with previous python versions, there is no >>>point to care about that. >>> >>> >>In truth, numarray-0.4 and up already require Python-2.2 and up. >> >> > >Oh!, I didn't know that. In such a case, I think it's worth to consider the >possibility to define records as classes descendants from metaclasses. But, >of course, you have the ultimate decision. > > I don't know what you mean here. Please spell it out a little more. > > >>>>I'm thinking the general format for this may be converting N-tuples of >>>>types and ints into N arrays of types and ints. And vice versa. >>>>It's obvious how this works with numarray types. I think the chararray >>>>types need work and need to be mapped into the same integer enumeration >>>>as the numeric types in a non-overlapping way. >>>> >>>> >>>I can't catch your point here. Why there should be a problem with >>>chararrays?. >>> >>> >>What I was trying to see is that chararray types are not as well >>designed as the numarray types, nor are they reflected in the C-API. >> >> > >I see. 
Well, is it really desirable such a unification? CharArray entities
>come from a module and NumArray from another one, and that should be ok. Why
>bother in creating a unified API or integer enumeration?.
>

It may not be necessary. Int8 with repetition factors may work about the
same.

From falted at openlc.org Mon Jan 27 12:16:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 12:16:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E358DE0.7040501@stsci.edu>
References: <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu>
Message-ID: <200301272114.53545.falted@openlc.org>

On Monday 27 January 2003 20:52, Todd Miller wrote:
> >
> >Oh!, I didn't know that. In such a case, I think it's worth to consider
> > the possibility to define records as classes descendants from
> > metaclasses. But, of course, you have the ultimate decision.
>
> I don't know what you mean here. Please spell it out a little more.

I was trying to say that using something like:

class Small(IsRecord):
    field1 = defineType(CharType, 2, default="", position=1)
    field2 = defineType(Int32, 1, position=2)
    field3 = Float64

as a container for recarray metadata is definitely possible instead of the
tuple (formats="2aid",names=("field1","field2", "field3")), if using
Python2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
you to effectively separate the declared attributes from the implicit ones
in normal classes.

Of course, you can tailor IsRecord so as to fulfill your needs.
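
To make the idea concrete, here is one way the declared attributes could be
separated from the implicit ones. It is a sketch only, written with
present-day Python 3 metaclass syntax rather than the Python 2.2 form
discussed in this thread. The names defineType and IsRecord follow the
message above, but their implementation here is a guess; MetaRecord and the
_fields attribute are invented for illustration, and type names are plain
strings standing in for numarray type objects.

```python
class defineType:
    """Hypothetical field descriptor: (type, shape, default, position)."""
    def __init__(self, type_, shape=1, default=None, position=None):
        self.type = type_
        self.shape = shape
        self.default = default
        self.position = position

class MetaRecord(type):
    """Collect the declared fields, ignoring implicit class attributes."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        fields = {}
        for key, value in namespace.items():
            if key.startswith("__"):
                continue  # skip __module__, __qualname__, __doc__, ...
            if not isinstance(value, defineType):
                value = defineType(value)  # bare type, e.g. "Float64"
            fields[key] = value
        cls._fields = fields  # declaration order is kept by the dict
        return cls

class IsRecord(metaclass=MetaRecord):
    pass

class Small(IsRecord):
    field1 = defineType("CharType", 2, default="", position=1)
    field2 = defineType("Int32", 1, position=2)
    field3 = "Float64"

print(list(Small._fields))
```

In this sketch every non-dunder class attribute is treated as a field, so a
real implementation would need a rule for methods and helper attributes.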
I hope that I have expressed myself more clearly now, -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 12:54:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 12:54:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu> <200301272114.53545.falted@openlc.org> Message-ID: <3E359C2B.4070509@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 20:52, Todd Miller va escriure: > > >>>Oh!, I didn't know that. In such a case, I think it's worth to consider >>>the possibility to define records as classes descendants from >>>metaclasses. But, of course, you have the ultimate decision. >>> >>> >>I don't know what you mean here. Please spell it out a little more. >> >> > >I was trying to mean that using something like : > >class Small(IsRecord): > field1 = defineType(CharType, 2, default="", position=1) > field2 = defineType(Int32, 1, position=2) > field3 = Float64 > >as as container for recarray metadata is definitely possible instead of the >tuple (formats="2aid",names=("field1","field2", "field3")), if using >Python2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows >you to effectively separate the declared attributes from the implicit ones >in normal classes. > >Of course, you can taylor IsRecord so as to fulfill your needs. > >I hope that I have expressed myself more clearly now, > > > I looked at your docs here: http://pytables.sourceforge.net/html-doc/usersguide-html4.html#section4.2 and what you said above clicked. Thanks. 
Todd

From Chris.Barker at noaa.gov Tue Jan 28 11:02:04 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 28 11:02:04 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> <3E288068.3070407@stsci.edu>
Message-ID: <3E36D14D.C3238DFA@noaa.gov>

Konrad Hinsen wrote:

> > M = array(l)
> > Mt = M.transpose()
> >
> > just isn't that much worse than:
> >
> > Mt = transpose(l)
>
> No, but the automatic conversion enables me to write functions that
> accept any sequence type without even having to think about it.

I've used that too, but I also frequently use something like this:

def function(A):
    A = array(A)
    ...

Which is pretty simple too.

> Moreover, it is almost essential in many situations to accept scalars
> in place of arrays, because scalars fulfill the role of rank-0 arrays.

Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
array dichotomy a little cleaner in NumArray ?

> > I also agree that the point is not subclassing per se, it's
> > polymorphism. It should be easy to write a class that acts like an array
> > in all the ways that you need it to.
>
> True, and that is a weak point of NumPy.

Is this getting any better with NumArray?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From falted at openlc.org Tue Jan 28 11:42:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 28 11:42:07 2003
Subject: [Numpy-discussion] enum values visible in numeric types instances?
Message-ID: <200301282041.21145.falted@openlc.org>

Hi,

A couple of points related with numarray type objects:

1.- When working with numeric types instances like UInt8 or Float64, is
there a way to access to their enumeration NumarrayType C counterpart?.
That can be handy when you want to map between these objects and integers. For
example, right now I'm forced to use these mappings in Pyrex:

# Conversion tables from/to classes to the numarray enum types
toenum = {num.Int8:tInt8,       num.UInt8:tUInt8,
          num.Int16:tInt16,     num.UInt16:tUInt16,
          num.Int32:tInt32,     num.UInt32:tUInt32,
          num.Float32:tFloat32, num.Float64:tFloat64,
          CharType:97   # ascii(97) --> 'a' # Special case (to be corrected)
          }

toclass = {tInt8:num.Int8,       tUInt8:num.UInt8,
           tInt16:num.Int16,     tUInt16:num.UInt16,
           tInt32:num.Int32,     tUInt32:num.UInt32,
           tFloat32:num.Float32, tFloat64:num.Float64,
           97:CharType   # ascii(97) --> 'a' # Special case (to be corrected)
           }

(yes, Pyrex lets you do these kinds of "miracles", like mappings between Python
objects and C integers), but if I had this access directly from the object (for
example Int8.enumType), my code (and C extensions in general) could look
simpler.

2.- I understand now why Todd was worried about CharArray objects being
assigned to an enumerated type. In fact, if you look at the above maps, I have
to map this special object myself as the number 97 (which is the ASCII value
for the character "a"). 97 is OK for now because it can't collide (at least for
a while) with other enumeration types. My suggestion is that it would be a good
thing to have a reserved enum type for CharArray. And I think that mapping
CharArrays to Bool or Int8 would not be a good solution, because chararray
objects differ from them in ways that would make it a mess to distinguish the
two kinds of object in C code by just looking at the enumeration type.

I don't know, but maybe recarrays also merit a place in the enumeration (?).

By the way, after the discussion with Todd I finally decided to remove all the
Numeric charcodes (and related codes) from PyTables. However, I can still
manage Numeric objects by converting them to numarray and accessing the class
type with the .type() method. And you know what?
The code looks much more logical and neat and, best of all, is less
error-prone (well, at least I hope so!). I definitely encourage you to make a
similar transition in numarray (although I guess that would be more difficult
because you still need Numeric compatibility).

Thanks,

-- 
Francesc Alted

From perry at stsci.edu Tue Jan 28 13:59:08 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 28 13:59:08 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
In-Reply-To: <3E36D14D.C3238DFA@noaa.gov>
Message-ID:

> Yes, this is critical. Isn't there a plan to make the scalar -- rank-0
> array dichotomy a little cleaner in NumArray?
>
Hmmm, I'd like to say yes, but I'm not sure what exactly you are
referring to. Please elaborate on how you think it should be
changed. About the only thing that comes to mind is that repr() for
rank-0 will be different for numarray than Numeric, and that a rank-0
array will never be the result of any reduction or similar selection.

> > > I also agree that the point is not subclassing per se, it's
> > > polymorphism. It should be easy to write a class that acts like an array
> > > in all the ways that you need it to.
> >
> > True, and that is a weak point of NumPy.
>
> Is this getting any better with NumArray?
>
Again, I hope so, but I find this too general to know if it
satisfies anyone's specific goals. I'd like to see specific
examples. I think it is often trickier than people initially think.

Perry

From jdhunter at ace.bsd.uchicago.edu Wed Jan 29 13:13:03 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Wed Jan 29 13:13:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
Message-ID:

I have two equal length 1D arrays of 256-4096 complex or floating
point numbers which I need to put into a shape=(len(x),2) array.

I need to do this a lot, so I would like to use the most efficient
means.
Currently I am doing:

def somefunc(x,y):
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y
    do_something_with(X)

Is this the fastest way?

Thanks,
John Hunter

From list at jsaul.de Thu Jan 30 01:20:04 2003
From: list at jsaul.de (Joachim Saul)
Date: Thu Jan 30 01:20:04 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To:
References:
Message-ID: <20030130091853.GA842@jsaul.de>

* John Hunter [2003-01-29 22:13]:
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
>
> Is this the fastest way?

    X = transpose(array([x]+[y]))

It may not be the fastest possible way, but it should be about a factor
of two faster; better than nothing.

Cheers,
Joachim

From karthik at james.hut.fi Thu Jan 30 01:47:03 2003
From: karthik at james.hut.fi (Karthikesh Raju)
Date: Thu Jan 30 01:47:03 2003
Subject: [Numpy-discussion] Object too deep for desired array
In-Reply-To:
Message-ID:

Hi,

I was trying out something like this:

import Numeric
import LinearAlgebra
import cmath
import RandomArray
import copy

def sMatrix(pd, code, window):
    if window == 0:
        nprime = 1
    else:
        nprime = window
    K, C = Numeric.shape(code)
    K1, L = Numeric.shape(pd)
    # check if K == K1 and raise an exception here
    sCode = Numeric.zeros([nprime*C,K*L*(window+1)],'d')
    for k in range(K):
        for l in range(L):
            code1 = copy.deepcopy(Numeric.array(code[k,0:C-pd[k,l]]))
            code1.shape = (C-pd[k,l],1)
            sCode1= Numeric.concatenate((Numeric.zeros([pd[k,l],1]),Numeric.zeros([C*window,1]),code1))
            sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
    return sCode

if __name__ == "__main__":
    pd = Numeric.array([[2]])
    code = Numeric.array([[-1,1,-1,1,1]])
    np = sMatrix(pd,code,0)
    print np
    print "--"*30
    np = sMatrix(pd,code,1)
    print Numeric.shape(np)
    print np
    print "--"*30
    np = sMatrix(pd,code,2)
    print Numeric.shape(np)
    print np
    print "--"*30

------------------------------

And I get stuck with the following error message:

Traceback (most recent call last):
  File "sMatrix.py", line 31, in ?
    np = sMatrix(pd,code,0)
  File "sMatrix.py", line 24, in sMatrix
    sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
ValueError: Object too deep for desired array

------------

I think it is due to the many deep copy operations that I am performing. I
want to be in a position where slices of matrices are not references but
copies in their own right, so that I can move these copies around. (Maybe it
is inefficient, but that is what I did in Matlab, and I want some
compatibility until I learn more Python and migrate to it completely.)

Is there a way out? Why is this a problem? Am I missing something?

Best regards,
karthik

-----------------------------------------------------------------------
Karthikesh Raju,                    email: karthik at james.hut.fi
Researcher,                         http://www.cis.hut.fi/karthik
Helsinki University of Technology,  Tel: +358-9-451 5389
Laboratory of Comp. & Info. Sc.,    Fax: +358-9-451 3277
Department of Computer Sc.,
P.O Box 5400, FIN 02015 HUT,
Espoo, FINLAND
-----------------------------------------------------------------------

From pearu at cens.ioc.ee Thu Jan 30 01:51:09 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu Jan 30 01:51:09 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To:
Message-ID:

On Wed, 29 Jan 2003, John Hunter wrote:

> I have two equal length 1D arrays of 256-4096 complex or floating
> point numbers which I need to put into a shape=(len(x),2) array.
>
> I need to do this a lot, so I would like to use the most efficient
> means. Currently I am doing:
>
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
>
> Is this the fastest way?

Maybe you could arrange your algorithm so that you first create X and then
reference its columns through x and y without copying:

  # Allocate memory
  X = zeros( (n,2), typecode=.. )

  # Get references to columns
  x = X[:,0]
  y = X[:,1]

  while 1:
      do_something_inplace_with(x,y)
      do_something_with(X)

Pearu

From jdhunter at ace.bsd.uchicago.edu Thu Jan 30 11:26:05 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 11:26:05 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: (John Hunter's message of "Wed, 29 Jan 2003 15:13:03 -0600")
References:
Message-ID:

>>>>> "John" == John Hunter writes:

    John> I have two equal length 1D arrays of 256-4096 complex or
    John> floating point numbers which I need to put into a
    John> shape=(len(x),2) array.

    John> I need to do this a lot, so I would like to use the most
    John> efficient means. Currently I am doing:

I tested all the suggested methods and the transpose with [x] and [y]
was the clear winner, with an 8 fold speed up over my original code.
The concatenate method was between 2-3 times faster.

Thanks to all who responded,
John Hunter

cruncher2:~/python/test> python test.py test_naive
test_naive 0.480427026749
cruncher2:~/python/test> python test.py test_concat
test_concat 0.189149975777
cruncher2:~/python/test> python test.py test_transpose
test_transpose 0.0698409080505

from Numeric import transpose, concatenate, reshape, array, zeros
from RandomArray import normal
import time, sys

def test_naive(x,y):
    "Naive approach"
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y

def test_concat(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = concatenate( ( reshape(x,(-1,1)), reshape(y,(-1,1)) ), 1)

def test_transpose(x,y):
    "Thanks to Joachim Saul"
    X = transpose(array([x]+[y]))

m = {'test_naive' : test_naive,
     'test_concat' : test_concat,
     'test_transpose' : test_transpose}

nse1 = normal(0.0, 1.0, (4096,))
nse2 = normal(0.0, 1.0, nse1.shape)

N = 1000
trials = range(N)
func = m[sys.argv[1]]
t1 = time.time()
for i in trials:
    func(nse1,nse2)
t2 = time.time()
print sys.argv[1], t2-t1

From jdhunter at ace.bsd.uchicago.edu Thu Jan 30 14:18:04 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 14:18:04 2003
Subject: [Numpy-discussion] mlab functions: psd, csd, cohere, corrcoef
Message-ID:

I needed some spectral analysis functions, and finding none available,
wrote my own. I use matlab a lot, so I wrote them to be matlab
compatible.

If you all think these look OK, I'm happy to submit them for inclusion
into MLab.

-------------------------------------------------------------------
"""

Spectral analysis functions for Numerical python written for
compatibility with matlab commands with the same names.

  psd - Power spectral density using Welch's average periodogram
  csd - Cross spectral density using Welch's average periodogram
  cohere - Coherence (normalized cross spectral density)
  corrcoef - The matrix of correlation coefficients

The functions are designed to work for real and complex valued Numeric
arrays.

One of the major differences between this code and matlab's is that I
use functions for 'detrend' and 'window', and matlab uses vectors.
This can be easily changed, but I think the functional approach is a
bit more elegant.

Please send comments, questions and bugs to:

Author: John D.
Hunter """ from __future__ import division from MLab import mean, hanning, cov from Numeric import zeros, ones, diagonal, transpose, matrixmultiply, \ resize, sqrt, divide, array, Float, Complex, concatenate, \ convolve, dot, conjugate, absolute, arange, reshape from FFT import fft def norm(x): return sqrt(dot(x,x)) def window_hanning(x): return hanning(len(x))*x def window_none(x): return x def detrend_mean(x): return x - mean(x) def detrend_none(x): return x def detrend_linear(x): """Remove the best fit line from x""" # I'm going to regress x on xx=range(len(x)) and return # x - (b*xx+a) xx = arange(len(x), typecode=x.typecode()) X = transpose(array([xx]+[x])) C = cov(X) b = C[0,1]/C[0,0] a = mean(x) - b*mean(xx) return x-(b*xx+a) def psd(x, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ The power spectral density by Welches average periodogram method. The vector x is divided into NFFT length segments. Each segment is detrended by function detrend and windowed by function window. noperlap gives the length of the overlap between segments. The absolute(fft(segment))**2 of each segment are averaged to compute Pxx, with a scaling to correct for power loss due to windowing. Fs is the sampling frequency. -- NFFT must be a power of 2 -- detrend and window are functions, unlike in matlab where they are vectors. -- if length x < NFFT, it will be zero padded to NFFT Refs: Bendat & Piersol -- Random Data: Analysis and Measurement Procedures, John Wiley & Sons (1986) """ if NFFT % 2: raise ValueError, 'NFFT must be a power of 2' # zero pad x up to NFFT if it is shorter than NFFT if len(x)1: Pxx = mean(Pxx,1) Pxx = divide(Pxx, norm(windowVals)**2) freqs = Fs/NFFT*arange(0,numFreqs) return Pxx, freqs def csd(x, y, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ The cross spectral density Pxy by Welches average periodogram method. The vectors x and y are divided into NFFT length segments. 
Each segment is detrended by function detrend and windowed by function window. noverlap gives the length of the overlap between segments. The product of the direct FFTs of x and y are averaged over each segment to compute Pxy, with a scaling to correct for power loss due to windowing. Fs is the sampling frequency. NFFT must be a power of 2 Refs: Bendat & Piersol -- Random Data: Analysis and Measurement Procedures, John Wiley & Sons (1986) """ if NFFT % 2: raise ValueError, 'NFFT must be a power of 2' # zero pad x and y up to NFFT if they are shorter than NFFT if len(x)1: Pxy = mean(Pxy,1) Pxy = divide(Pxy, norm(windowVals)**2) freqs = Fs/NFFT*arange(0,numFreqs) return Pxy, freqs def cohere(x, y, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ cohere the coherence between x and y. Coherence is the normalized cross spectral density Cxy = |Pxy|^2/(Pxx*Pyy) The return value is (Cxy, f), where f are the frequencies of the coherence vector. See the docs for psd and csd for information about the function arguments NFFT, detrend, windowm noverlap, as well as the methods used to compute Pxy, Pxx and Pyy. """ Pxx,f = psd(x, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Pyy,f = psd(y, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Pxy,f = csd(x, y, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Cxy = divide(absolute(Pxy)**2, Pxx*Pyy) return Cxy, f def corrcoef(*args): """ corrcoef(X) where X is a matrix returns a matrix of correlation coefficients for each row of X. corrcoef(x,y) where x and y are vectors returns the matrix or correlation coefficients for x and y. 
    Numeric arrays can be real or complex

    The correlation matrix is defined from the covariance matrix C as

    r(i,j) = C[i,j] / sqrt(C[i,i]*C[j,j])
    """

    if len(args)==2:
        X = transpose(array([args[0]]+[args[1]]))
    elif len(args)==1:
        X = args[0]
    else:
        raise RuntimeError, 'Only expecting 1 or 2 arguments'

    C = cov(X)
    d = resize(diagonal(C), (2,1))
    r = divide(C,sqrt(matrixmultiply(d,transpose(d))))[0,1]
    try: return r.real
    except AttributeError: return r
-------------------------------------------------------------------

I wrote a little test code comparing the output with that of matlab's
equivalent functions. Basically, I compute the psd or cohere in matlab
and python and take the RMS difference of the resultant vectors:

  RMS cohere python/matlab difference 0.000854587104587
  RMS psd python/matlab difference 0.00210783306638

I am not sure where these differences are arising, but they are quite
small. I'm going to keep trying to track them down. For corrcoef,
the answers are the same past 8 significant digits.

Hope this helps!
John Hunter

From haase at msg.ucsf.edu Fri Jan 31 05:12:05 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri Jan 31 05:12:05 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
Message-ID: <020a01c2c897$65bf2dc0$3b45da80@rodan>

Hi everybody,
I tried a 'python2.2 setup.py install'
of numarray on a Mac running os-X (10.1; I also have Fink installed).
It starts crunching until:
/usr/bin/ld: Undefined symbols:
_fclearexcept
_fetestexcept

Is there anyone out there who uses numarray on osX?

I'm thankful for any pointer...
Sebastian Haase

From jmiller at stsci.edu Fri Jan 31 07:31:01 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 31 07:31:01 2003
Subject: [Numpy-discussion] numarray 0.4 on osX/darwin
References: <020a01c2c897$65bf2dc0$3b45da80@rodan>
Message-ID: <3E3A9628.3030704@stsci.edu>

Sebastian Haase wrote:

>Hi everybody,
>I tried a 'python2.2 setup.py install'
>of numarray on a Mac running os-X (10.1; I have also Fink installed)
>I starts crunching until:
>/usr/bin/ld: Undefined symbols:
>_fclearexcept
>_fetestexcept
>
>Anyone out there, who uses numarray on osX ?
>
>I'm thankful for any pointer...
>
>Sebastian Haase

Hi Sebastian,

I am very much a Mac amateur, but I have run numarray under osX by first
installing a local UNIX version of Python using the source tarball. The
steps were roughly as follows:

1. Obtain and unpack the Python source tarball in your home directory. cd
there.

2. Configure Python using: ./configure --prefix=$HOME

3. Edit the Makefile for the following:

   61c61
   > LDFLAGS=
   ---
   < LDFLAGS= -framework System -framework CoreServices -framework Foundation

This was the only (reasonable) way I could figure out to tunnel link-time
options down through the distutils in the proper command line order. I'm
not really sure this is a minimal set of frameworks, but it did at least
work.

4. Build and install python: make ; make install

5. Obtain and unpack the numarray source tarball. cd there.

6. Build and install numarray: python setupall.py install

7. Put $HOME/bin on your PATH and rehash.

Todd

>-------------------------------------------------------
>This SF.NET email is sponsored by:
>SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
>http://www.vasoftware.com
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion

From Chris.Barker at noaa.gov Fri Jan 31 12:44:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Jan 31 12:44:02 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References:
Message-ID: <3E3ADC19.5566CB5A@noaa.gov>

John Hunter wrote:

>     John> I have two equal length 1D arrays of 256-4096 complex or
>     John> floating point numbers which I need to put into a
>     John> shape=(len(x),2) array.

> I tested all the suggested methods and the transpose with [x] and [y]
> was the clear winner, with an 8 fold speed up over my original code.
> The concatenate method was between 2-3 times faster.

I was a little surprised by this, as I figured that the transpose method
made an extra copy of the data (array() makes one copy, transpose()
another). So I looked at the source for concatenate:

def concatenate(a, axis=0):
    """concatenate(a, axis=0) joins the tuple of sequences in a into a single
    NumPy array.
    """
    if axis == 0:
        return multiarray.concatenate(a)
    else:
        new_list = []
        for m in a:
            new_list.append(swapaxes(m, axis, 0))
    return swapaxes(multiarray.concatenate(new_list), axis, 0)

So, if you are concatenating along anything other than the zero-th axis,
you end up doing something similar to the transpose method. Seeing this,
I tried something else:

def test_concat2(x,y):
    x.shape = (1,-1)
    y.shape = (1,-1)
    X = transpose( concatenate( (x, y) ) )
    x.shape = (-1,)
    y.shape = (-1,)

This then uses the native concatenate, but requires an extra copy in the
transpose.
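
The swapaxes trick in the concatenate() source quoted above can be mimicked with plain nested lists. What follows is only an illustrative sketch in modern Python, not Numeric code: the helper names (transpose_rows, concat_axis0, concat_axis1) are made up for the example, and it says nothing about how many copies Numeric itself makes. It just shows that joining along axis 1 is the same as transposing, joining along axis 0, and transposing back, and that this yields the same (n, 2) result as transpose([x]+[y]).

```python
def transpose_rows(rows):
    """Swap the two axes of a rectangular nested list (hypothetical helper)."""
    return [list(column) for column in zip(*rows)]


def concat_axis0(blocks):
    """Join 2-D nested lists by stacking their rows (hypothetical helper)."""
    joined = []
    for block in blocks:
        joined.extend(list(row) for row in block)
    return joined


def concat_axis1(blocks):
    """Join along axis 1 by swapping axes, joining along axis 0, swapping back."""
    return transpose_rows(concat_axis0([transpose_rows(b) for b in blocks]))


x = [1, 2, 3]
y = [10, 20, 30]

# Column-vector form, as produced by reshape(x, (-1, 1)) in the thread.
x_col = [[value] for value in x]
y_col = [[value] for value in y]

via_concat = concat_axis1([x_col, y_col])   # the "concatenate along axis 1" route
via_transpose = transpose_rows([x, y])      # the transpose([x]+[y]) route
```

Both routes give one row per sample and one column per input vector, which is the shape=(len(x),2) layout asked for at the start of the thread.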
Here's a somewhat cleaner version, though you get more copies:

def test_concat3(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = transpose( concatenate( ( reshape(x,(1,-1)), reshape(y,(1,-1)) ) ) )

Here are the test results:

testing on vectors of length: 4096
test_concat 0.286280035973
test_transpose 0.100033998489
test_naive 0.805399060249
test_concat3 0.109319090843
test_concat2 0.136469960213

All the transpose methods are essentially a tie.

Would it be that hard for concatenate to do its thing for any axis in C?
It does seem like this is a fairly basic operation, and it shouldn't
require more than one copy.

By the way, I realised that the transpose method had an extra call.
transpose() can take an appropriate python sequence, so this works just
fine:

def test_transpose2(x,y):
    X = transpose([x]+[y])

However, it doesn't really save you the copy, as I'm pretty sure
transpose makes a copy internally anyway. Test results:

testing on vectors of length: 4096
test_transpose 0.104995965958
test_transpose2 0.103582024574

I think the winner is:

X = transpose([x]+[y])

Well, I learned a little bit more about Numeric today.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From rob at hooft.net Fri Jan 31 13:36:03 2003
From: rob at hooft.net (Rob Hooft)
Date: Fri Jan 31 13:36:03 2003
Subject: [Numpy-discussion] fastest way to make two vectors into anarray
References: <3E3ADC19.5566CB5A@noaa.gov>
Message-ID: <3E3AEC19.6020907@hooft.net>

Chris Barker wrote:
>
> X = transpose([x]+[y])
>
> well, I learned a little bit more about Numeric today.
>

I've been skimming through a lot of messages today because I was getting
behind on mailing list traffic, but I missed one thing in the discussion
so far (sorry if it was mentioned already): transpose doesn't actually do
any work.
Actually, transpose only sets the "strides" counts differently, and this
is blazingly fast. What is NOT fast is using the transposed array later!
The problem is that many routines actually require a contiguous array, and
will make a temporary local contiguous copy. This may happen multiple
times if the lifetime of the transposed array is long. Even routines that
do not require a contiguous array and can actually use the strides may run
significantly slower because the CPU cache is thrashed a lot by the high
strides.

Moral: you can't test this code by looping 1000 times through it; you
should actually take into account the time it takes to make a contiguous
array immediately after the transpose call.

Regards,

Rob Hooft

-- 
Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/

From edcjones at erols.com Wed Jan 1 20:29:44 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Wed Jan 1 20:29:44 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
Message-ID: <3E13C7DA.70906@erols.com>

Perry Greenfield wrote:

> Edward Jones writes:
> > I write code using both PIL and numarray. PIL uses strings for
> > modes and numarray uses (optionally) strings as typecodes. This
> > causes problems. One fix is to emit a DeprecationWarning when
> > string typecodes are used. Two functions are needed:
> > StringTypeWarningOn and StringTypeWarningOff. The default
> > should be to ignore this warning.
>
> I'm not sure I understand. Can you give me an example of problem
> code or usage? It sounds like you are trying to test the types of
> PIL and numarray objects in a generic sense. But I'd understand
> better if you could show an example.

That's what I was thinking (incorrectly). But I don't need to directly
compare PIL modes with numarray types. My code never tries to deduce
whether an array is a numarray or a PIL image from just the
natype_or_mode. A module name (MODULE.NUMARRAY, MODULE.PIL) must also be
given.
I do things this way because I might want to include other array/image systems. In an earlier version, I had a MODULE.IPL for the Intel Image Processing Library. The code also implements a policy of forbidding string types. So now all I can say is: 1. UInt8 == 'X' should not raise an exception. It should return False. 3. There needs to be a function that returns True iff arg is a numarry type (UInt8, "UInt8", "b", ...). def IsType(rep): from numerictypes import typeDict return isinstance(rep, NumericType) or typeDict.has_key(rep) Here is a typical piece of code. "module" can be MODULE.PIL or MODULE.NUMARRAY. ---- """General image casting function. Changes the C type of the pixels. Information can be lost. The "Convert" functions call C casting functions that clip the values, For example, if the input is a UInt16 and the output is a Int16, any input value greater than 32767 becomes 32767. """ def ArrayToArrayCast(arrin, module, natype_or_mode): """Converts one array into another. Results are clipped.""" pars = Parameters(arrin) if pars.module == module == MODULE.PIL and \ pars.mode == natype_or_mode: return arrin if pars.module == module == MODULE.NUMARRAY and \ NA_SameType(pars.natype, natype_or_mode): return arrin if pars.module == MODULE.NUMARRAY and module == MODULE.NUMARRAY: return NA_To_NA_Convert(arrin, natype_or_mode) if pars.module == MODULE.PIL and module == MODULE.PIL: return PIL_To_PIL_Convert(arrin, natype_or_mode) if pars.module == MODULE.NUMARRAY and module == MODULE.PIL: return NA_To_PIL_Convert(arrin, natype_or_mode) if pars.module == MODULE.PIL and module == MODULE.NUMARRAY: return PIL_To_NA_Convert(arrin, natype_or_mode) ---- From edcjones at erols.com Wed Jan 1 20:42:05 2003 From: edcjones at erols.com (Edward C. 
Jones) Date: Wed Jan 1 20:42:05 2003 Subject: [Numpy-discussion] End of Holidays small comments Message-ID: <3E13CB14.7040908@erols.com> node35.html: >>> print x.type(), x.real.type() D d should be >>> print x.type(), x.real.type() numarray type: Complex64 numarray type: Float64 ------------------------------------------------ Why use both NUM_C_ARRAY and C_ARRAY? ------------------------------------------------ in _ndarraymodule.c: {"_byteoffset", (getter)_ndarray_byteoffset_get, (setter)_ndarray_byteoffset_set, "shortest seperation between elements in bytes"}, {"_bytestride", (getter)_ndarray_bytestride_get, (setter)_ndarray_bytestride_set, "shortest seperation between elements in bytes"}, One of the comments is wrong. Also "separation". ------------------------------------------------ libnumarraymodule.c: /* Create an empty array. */ static PyArrayObject * NA_Empty(int ndim, int *shape, NumarrayType type) node42.html: static PyObject* NA_Empty( NumarrayType type, int ndim, ...) Serious documentation error. ------------------------------------------------ I think NA_New should be NA_New(int ndim, int* shape, NumarrayType type, void* buffer) The current NA_New is useful only when ndim is known at code-writing time. ------------------------------------------------ node39.html: Note: the type parameter for a macro is one of the Numarray Numeric Data Types, not a NumarrayType enumeration value. There should be an example of one of the GET/SET macros. How about unsigned char n; int i; ... n = NA_GET1(arr, UInt8, i); ------------------------------------------------ It seems that the parameters "aligned" and "writeable" are ignored in the source code for NA_NewAll and class NumArray. ------------------------------------------------ I would like to see an "int* strides" parameter added to NA_NewAll, so a non-contiguous "buffer" can be used. 
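
To make the strides request above concrete, here is a small plain-Python sketch of what per-dimension byte strides buy you when a buffer is not contiguous. It is only an illustration under stated assumptions: strided_offset and read_strided are hypothetical helpers written for this example, not part of the numarray C-API, and the buffer is built with the standard struct module rather than with numarray.

```python
import struct


def strided_offset(index, strides, byteoffset=0):
    """Byte offset of the element at `index`, given per-dimension byte strides."""
    if len(index) != len(strides):
        raise ValueError("index and strides must have the same length")
    return byteoffset + sum(i * s for i, s in zip(index, strides))


def read_strided(buffer, shape, strides, fmt="<i", byteoffset=0):
    """Gather a 2-D strided view of `buffer` into nested lists (hypothetical helper)."""
    rows, cols = shape
    return [
        [
            struct.unpack_from(
                fmt, buffer, strided_offset((r, c), strides, byteoffset)
            )[0]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A contiguous 3x4 block of 32-bit integers 0..11:
# row stride is 16 bytes, column stride is 4 bytes.
buf = struct.pack("<12i", *range(12))

# Every other column of the same buffer: shape 3x2, column stride doubled to 8.
every_other_column = read_strided(buf, shape=(3, 2), strides=(16, 8))

# The transpose of the same buffer: shape 4x3 with the two strides swapped.
transposed = read_strided(buf, shape=(4, 3), strides=(4, 16))
```

Neither view moves any bytes; only the shape and strides change, which is why a strides argument would be enough to describe a non-contiguous buffer.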
------------------------------------------------ I suggest NA_Copy(PyObject* arr) which is something like static PyObject* NA_Copy(PyObject* arr) { PyArrayObject* arr1 = arr; return NA_NewAll(arr1->nd, (long*) arr1->dimensions, arr1->descr->type_num, arr1->data, arr1->byteoffset, arr1->bytestride, arr1->byteorder, 1, 1); } From edcjones at erols.com Wed Jan 1 20:45:34 2003 From: edcjones at erols.com (Edward C. Jones) Date: Wed Jan 1 20:45:34 2003 Subject: [Numpy-discussion] Slicing API? Message-ID: <3E13CBC3.6000207@erols.com> Both in Numeric and now in numarray I have found a need for API functions for slicing. Has anyone thought about this? From jmiller at stsci.edu Thu Jan 2 06:03:16 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 2 06:03:16 2003 Subject: [Numpy-discussion] Slicing API? References: <3E13CBC3.6000207@erols.com> Message-ID: <3E14481D.9080902@stsci.edu> Edward C. Jones wrote: > Both in Numeric and now in numarray I have found a need for API > functions for slicing. Has anyone thought about this? > Speaking for myself and the numarray C-API, the answer is no. What API do you want? Can you suggest function prototypes? Todd From jmiller at stsci.edu Thu Jan 2 12:36:53 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 2 12:36:53 2003 Subject: [Numpy-discussion] Slicing API? References: <3E13CBC3.6000207@erols.com> <3E14481D.9080902@stsci.edu> <3E1497E1.1050808@erols.com> Message-ID: <3E14A435.7040609@stsci.edu> Edward C. Jones wrote: > Todd Miller wrote: > >> Edward C. Jones wrote: >> >>> Both in Numeric and now in numarray I have found a need for API >>> functions for slicing. Has anyone thought about this? >>> >> Speaking for myself and the numarray C-API, the answer is no. What >> API do you want? Can you suggest function prototypes? 
> > > An API version of arrout[slices] = arrin[slices]: > > static int > NA_CopySlice(PyArrayObject* arrin, PyArrayObject* arrout, > int* startin, int* stepin, int* stopin, int* startout, int* stepout); > > I would suggest something more like the following then: typedef struct { int start, stop, step; } NumSlice; static int NA_CopySlice(PyArrayObject* arrin, int indim, NumSlice *slicein, PyArrayObject* arrout, int outdim, NumSlice *sliceout); The differences are: 1. A slice dimension count is added for both input and output arrays. This enables use of partial indices. 2. Slice values are expressed using the NumSlice typedef/struct rather than 3 independent int arrays. 3. The parameter order is shuffled so that input array parameters are kept together, and output array parameters are kept together. But, I still have these comments: 1. It looks like it will be cumbersome to use. 2. We should probably implement it as a callback to Python to avoid introducing another set of assignment semantics. Thus, the implementation would really just be building up and executing the calls for: outarr.__setitem__(outslices, inarr.__getitem__(inslices)). 3. The slicing implementation for numarray objects should be optimized to C this quarter, if not this month. So in terms of efficiency, not to mention comment 2, this won't buy much. 4. Since Numeric doesn't have this already, we're probably missing something obvious. Comments? Still interested? Todd From jmiller at stsci.edu Fri Jan 3 09:49:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 3 09:49:01 2003 Subject: [Numpy-discussion] End of Holidays small comments References: <3E13CB14.7040908@erols.com> Message-ID: <3E15CED2.9070402@stsci.edu> Wow! This is great feedback. Thanks Edward. Edward C. 
Jones wrote:

> node35.html:
>
> >>> print x.type(), x.real.type()
> D d
>
> should be
>
> >>> print x.type(), x.real.type()
> numarray type: Complex64 numarray type: Float64

I talked this over with Perry, and we think it should behave and be
documented more like:

>>> print x.type(), x.real.type()
Complex64 Float64

>
> ------------------------------------------------
>
> Why use both NUM_C_ARRAY and C_ARRAY?

In the context of the defining enumeration, NUM_C_ARRAY looks correct.
Anywhere else, C_ARRAY is about all I can stand. C_ARRAY is so common
that I thought a little irregularity would be tolerable. Chalk it up to
tastelessness.

>
> ------------------------------------------------
>
> in _ndarraymodule.c:
>
> {"_byteoffset",
> (getter)_ndarray_byteoffset_get,
> (setter)_ndarray_byteoffset_set,
> "shortest seperation between elements in bytes"},
> {"_bytestride",
> (getter)_ndarray_bytestride_get,
> (setter)_ndarray_bytestride_set,
> "shortest seperation between elements in bytes"},
>
> One of the comments is wrong. Also "separation".

Noted.

>
> ------------------------------------------------
>
> libnumarraymodule.c:
>
> /* Create an empty array. */
> static PyArrayObject *
> NA_Empty(int ndim, int *shape, NumarrayType type)
>
> node42.html:
>
> static PyObject* NA_Empty( NumarrayType type, int ndim, ...)
>

Noted.

>
> ------------------------------------------------
>
> I think NA_New should be
>
> NA_New(int ndim, int* shape, NumarrayType type, void* buffer)
>
> The current NA_New is useful only when ndim is known at code-writing
> time.

NA_New is a "convenience wrapper" around NA_NewAll, but I see your point.
How about NA_vNew(), in the spirit of vprintf?

>
> ------------------------------------------------
>
> node39.html:
>
> Note: the type parameter for a macro is one of the Numarray Numeric
> Data Types, not a NumarrayType enumeration value.
>
> There should be an example of one of the GET/SET macros. How about
>
> unsigned char n;
> int i;
> ...
> n = NA_GET1(arr, UInt8, i);

OK.

>
> ------------------------------------------------
>
> It seems that the parameters "aligned" and "writeable" are ignored in
> the source code for NA_NewAll and class NumArray.

"aligned" is used. "writeable" should probably be dropped since it is
no longer used. Since doing that would break an interface someone might
be using, I'd rather not.

>
> ------------------------------------------------
>
> I would like to see an "int* strides" parameter added to NA_NewAll, so a
> non-contiguous "buffer" can be used.

OK. How about NA_NewAllWithStrides (or insert a better name here)?

>
> ------------------------------------------------
>
> I suggest NA_Copy(PyObject* arr) which is something like
>
> static PyObject* NA_Copy(PyObject* arr)
> {
> PyArrayObject* arr1 = arr;
> return NA_NewAll(arr1->nd, (long*) arr1->dimensions,

This ((long *)) doesn't work portably, so I would recommend avoiding it.

>
> arr1->descr->type_num, arr1->data, arr1->byteoffset,
> arr1->bytestride, arr1->byteorder, 1, 1);
> }
>

I'll add NA_Copy().

From jmiller at stsci.edu Fri Jan 3 09:52:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 3 09:52:02 2003
Subject: [Numpy-discussion] numarray types and PIL modes, revisited
References: <3E13C7DA.70906@erols.com>
Message-ID: <3E15CF75.8080207@stsci.edu>

Edward C. Jones wrote:

> So now all I can say is:
>
> 1. UInt8 == 'X' should not raise an exception. It should return False.

OK. I'll change numarray to return False.

>
> 3. There needs to be a function that returns True iff arg is a numarray
> type (UInt8, "UInt8", "b", ...).
>
> def IsType(rep):
> from numerictypes import typeDict
> return isinstance(rep, NumericType) or typeDict.has_key(rep)

Sounds good too. I'll add this to numerictypes.

>
>

Thanks,
Todd

From edcjones at erols.com Fri Jan 3 16:03:04 2003
From: edcjones at erols.com (Edward C. Jones)
Date: Fri Jan 3 16:03:04 2003
Subject: [Numpy-discussion] Grepping the source
Message-ID: <3E162CCB.7070106@erols.com>

Here is a short program I find useful.

#! /usr/bin/env python

import os, sys, tempfile

"""Greps the numarray source code"""

command = \
"""grep -n "%s" \
/usr/local/src/numarray-0.4/Include/numarray/arrayobject.h \
...
/usr/local/src/numarray-0.4/Lib/_ufunc.py \
...
/usr/local/src/numarray-0.4/Src/libnumarraymodule.c \
> %s
"""

if len(sys.argv) != 2:
    raise Exception, 'program requires exactly one argument'

temp = tempfile.mktemp()
try:
    os.system(command % (sys.argv[1], temp))
    f = file(temp, 'r')
    lines = f.read().splitlines()
    f.close()
finally:
    if os.path.exists(temp):
        os.remove(temp)

common = len('/usr/local/src/numarray-0.4/')
d = {}
names = []
for line in lines:
    line = line[common:]
    colonloc = line.index(':')
    name = line[:colonloc]
    text = line[colonloc+1:]
    if not d.has_key(name):
        d[name] = []
        names.append(name)
    d[name].append(text)

for name in names:
    if len(d[name]) == 0:
        continue
    print '%s:' % name
    for text in d[name]:
        print '    %s' % text
    print

From magnus at hetland.org Fri Jan 3 16:24:04 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Fri Jan 3 16:24:04 2003
Subject: [Numpy-discussion] Grepping the source
In-Reply-To: <3E162CCB.7070106@erols.com>
References: <3E162CCB.7070106@erols.com>
Message-ID: <20030104002342.GA18694@idi.ntnu.no>

Edward C. Jones :
[snip]
> lines = f.read().splitlines()

You could use f.readlines() here... Or you could just use

for line in open(...):

later, if you're using Python 2.2+

--
Magnus Lie Hetland
http://hetland.org

From perry at stsci.edu Mon Jan 6 16:28:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Mon Jan 6 16:28:05 2003
Subject: [Numpy-discussion] package vs module
Message-ID:

Back in December the issue of whether numarray should be a package or
set of modules came up.
When I asked about the possibility of making numarray a package (on the
scipy mailing list but I can't seem to find the thread where it was
discussed), I got only positive comments. The issue needs to be raised
here also.

Is there any objection to making numarray package based?

The implications are that 3rd party modules (e.g. FFT) will be imported
as part of the package structure, i.e.,

import numarray.FFT

or

from numarray.FFT import *

instead of

import FFT

As usual there are advantages and disadvantages.

The advantages are that we will not have name collisions with existing
Numeric modules (currently we name FFT as FFT2 for this reason). It also
potentially reduces name collision issues in general. Most feel it is a
cleaner way to organize the software (at least based on the feedback so
far).

The main disadvantages I see so far are:

1) One will either have to change import statements in old code
   to match the new style (a pain, but generally changing imports
   is not terribly difficult since they are easy to identify) or
   explicitly add the path to each 3rd party module to Python
   Path (or some equivalent).
2) If numarray were accepted into the Python Standard Library, it
   would be the first case (as far as I can tell) of a standard
   library package where we would expect to add sub modules to
   it (e.g., FFT). Normally these would not be distributed with
   the standard library, so some general mechanism will be needed
   to allow numarray to find 3rd party packages outside of the
   Python directory structure. For example, I don't think we can
   require having people install FFT in the Standard Library
   directory structure after Python is installed. Rather, we would
   probably have numarray look for extension modules in a standard
   named site-packages directory (or site-numarray?) or otherwise
   check a numarraypath environmental variable so that
   import numarray.FFT works properly. Perhaps others have ideas
   about how to best handle this.

Any other issues being overlooked? Feedback?
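
A minimal sketch of the lookup floated in point 2, written in present-day Python rather than the Python 2.2 of this thread. Every name is hypothetical: `demo_pkg` stands in for numarray, `DEMO_PKG_PATH` for the suggested numarraypath variable, and the add-on `FFT` module is a stub. The package's `__init__` appends each directory named in the environment variable to the package's `__path__`, so that a submodule installed outside the package directory can still be imported under the package name.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical names: "demo_pkg" plays the role of numarray and
# DEMO_PKG_PATH the role of the proposed "numarraypath" variable.
INIT_SOURCE = '''
import os
# Append every directory listed in DEMO_PKG_PATH to this package's
# submodule search path, so add-ons can live outside the package directory.
for _extra in os.environ.get("DEMO_PKG_PATH", "").split(os.pathsep):
    if _extra and _extra not in __path__:
        __path__.append(_extra)
'''


def build_demo(root):
    """Create the core package and one add-on module in separate directories."""
    pkg_dir = os.path.join(root, "lib", "demo_pkg")
    addon_dir = os.path.join(root, "addons")
    os.makedirs(pkg_dir)
    os.makedirs(addon_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(INIT_SOURCE)
    # Stub add-on, installed outside the package directory.
    with open(os.path.join(addon_dir, "FFT.py"), "w") as f:
        f.write("def marker():\n    return 'add-on FFT loaded'\n")
    return os.path.join(root, "lib"), addon_dir


root = tempfile.mkdtemp()
lib_dir, addon_dir = build_demo(root)
os.environ["DEMO_PKG_PATH"] = addon_dir
sys.path.insert(0, lib_dir)

# Importing the submodule first runs demo_pkg/__init__.py, which extends
# __path__; the import system then searches the add-on directory as well.
fft = importlib.import_module("demo_pkg.FFT")
print(fft.marker())
```

The sketch only shows that the `import numarray.FFT` spelling can survive with add-ons stored elsewhere; whether that is acceptable for a standard-library package is the open question raised above.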
Thanks, Perry

From magnus at hetland.org Mon Jan 6 23:05:02 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Jan 6 23:05:02 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID: <20030107070426.GC4884@idi.ntnu.no>

Perry Greenfield :
>
> Back in December the issue of whether numarray should be a package
> or set of modules came up. When I asked about the possibility
> of making numarray a package (on the scipy mailing list but I
> can't seem to find the thread where it was discussed), I got
> only positive comments. The issue needs to be raised here also.
>
> Is there any objection to making numarray package based?

I think this seems like a very good and natural thing to do.

(Maybe names like RandomArray2 etc. can be changed too, now... :)

--
Magnus Lie Hetland
http://hetland.org

From pearu at cens.ioc.ee Tue Jan 7 02:22:03 2003
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Tue Jan 7 02:22:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
Message-ID:

On Mon, 6 Jan 2003, Perry Greenfield wrote:

> The main disadvantages I see so far are:
>
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).
> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
>
> Any other issues being overlooked?

There is one, though not so critical at this point, but I will raise
it anyway. In summary, I am +1 for making numarray a package.

The issue is related to import time and memory usage: more extension
modules in a package increase both of them, even if users have no
intention to use these modules. On slower machines this may cause
inconveniences, especially in applications that call Python multiple
times for short tasks containing numarray operations.

Let me repeat: currently this is not a problem with either Numeric
(because it never imports its extension modules) or numarray, and it
will not become one until numarray contains a number of extension
modules that presumably are not small.

For a realistic example of this issue consider Scipy (as a sort of upper
bound what numarray may become one day). Scipy contains a linalg module
that is an (almost complete) wrapper to ATLAS/BLAS/LAPACK libraries and
therefore importing the corresponding extension modules can be both time
and memory consuming. For example, importing scipy to Python may take
2-5 seconds on PII 400MHz, mainly because of loading the linalg
extension modules. This time may be annoying for small but frequent
tasks.

I wish Python's import mechanism would be a bit smarter or lazier in
loading extension modules that are never used...
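
The lazier loading wished for here can be approximated in present-day Python with `importlib.util.LazyLoader`, which did not exist when this was written. A minimal sketch, assuming a hypothetical module `heavy_linalg_demo` that stands in for a large extension module and sets an environment variable (`HEAVY_DEMO_LOADED`, also invented for this example) as a visible side effect of being executed. The sketch uses a pure-Python stand-in; how well the approach suits compiled extension modules is not tested here.

```python
import importlib.util
import os
import sys
import tempfile


def lazy_import(name):
    """Return a module object whose code runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader, exec_module only arms the deferral; the module body
    # is executed later, when an attribute is first looked up.
    loader.exec_module(module)
    return module


# Hypothetical stand-in for an expensive module: executing it leaves a
# marker in the environment so the moment of loading can be observed.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "heavy_linalg_demo.py"), "w") as f:
    f.write(
        "import os\n"
        "os.environ['HEAVY_DEMO_LOADED'] = '1'\n"
        "def solve():\n"
        "    return 'solved'\n"
    )
sys.path.insert(0, workdir)
os.environ.pop("HEAVY_DEMO_LOADED", None)

heavy = lazy_import("heavy_linalg_demo")
loaded_before = "HEAVY_DEMO_LOADED" in os.environ   # module body not run yet
result = heavy.solve()                              # first use triggers the load
loaded_after = "HEAVY_DEMO_LOADED" in os.environ
print(loaded_before, result, loaded_after)
```

A package `__init__` could register its heavy submodules this way, so that a short script pays the import cost only for the pieces it actually touches.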
Pearu

From falted at openlc.org Tue Jan 7 03:31:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 7 03:31:07 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID: <20030107113009.GA2445@openlc.org>

On Mon, Jan 06, 2003 at 07:29:15PM -0500, Perry Greenfield wrote:
> The main disadvantages I see so far are:
>
> 1) One will either have to change import statements in old code
>    to match the new style (a pain, but generally changing imports
>    is not terribly difficult since they are easy to identify) or
>    explicitly add the path to each 3rd party module to Python
>    Path (or some equivalent).

I think this should be regarded as a minor annoyance compared with the
advantages of making numarray a package. In addition, the introduction
of numarray as a substitute for Numeric can justify some recoding of
existing applications.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can
>    require having people install FFT in the Standard Library
>    directory structure after Python is installed. Rather, we would
>    probably have numarray look for extension modules in a standard
>    named site-packages directory (or site-numarray?) or otherwise
>    check a numarraypath environmental variable so that
>    import numarray.FFT works properly. Perhaps others have ideas
>    about how to best handle this.
>

Great. I would be glad to see a package containing the numarray kernel
in order to allow applications to use its core features, and have a
mechanism to add 3rd party packages.
In particular, having something similar to site-numarray to install
these packages can be quite neat.

In fact, I was pondering including a subset of numarray in the PyTables
package (it only needs the numarray core functionality), but if this
reorganization takes place, I would not need to do that anymore.

> Any other issues being overlooked?

Yeah. In case you decide to break numarray into several modules, what
would the granularity of the separation be? My preference is for a
reduced core with basic functionality (to maximize the chances of being
included in the Python Standard Library, but also to allow an easy entry
for people who may wish to use this functionality) and then different,
small, 3rd party packages, but perhaps this is also the most laborious
solution.

--
Francesc Alted PGP KeyID: 0x61C8C11F

From hinsen at cnrs-orleans.fr Tue Jan 7 03:32:03 2003
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Tue Jan 7 03:32:03 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID:

Perry Greenfield writes:

> Back in December the issue of whether numarray should be a package
> potentially reduces name collision issues in general. Most feel
> it is a cleaner way to organize the software (at least based on
> the feedback so far).

I agree. We have discussed converting NumPy into a package a few times
in the past, the major argument against it was compatibility issues.
Numarray will require some changes to import statements anyway, so
this seems the right time to make the change.

> 2) If numarray were accepted into the Python Standard Library, it
>    would be the first case (as far as I can tell) of a standard
>    library package where we would expect to add sub modules to
>    it (e.g., FFT). Normally these would not be distributed with
>    the standard library, so some general mechanism will be needed
>    to allow numarray to find 3rd party packages outside of the
>    Python directory structure. For example, I don't think we can

If you plan to unbundle FFT etc. from numarray, then I would prefer a
different naming scheme: numarray being just numarray, and some other
package name grouping together the other modules. That is not only a
question of installation, but also of general maintenance and of
clarity for users. I see the Python package system as a tree:
everything inside a package belongs together, is distributed together
and is maintained by the same people.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From paul at pfdubois.com Tue Jan 7 09:25:06 2003
From: paul at pfdubois.com (paul at pfdubois.com)
Date: Tue Jan 7 09:25:06 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To: <20030107113009.GA2445@openlc.org>
Message-ID: <3E0D027100007B17@mta8.wss.scd.yahoo.com>

1. I favor the package approach.

2. I don't care if FFT is numarray.FFT or numpy.FFT (i.e., in a separate
place). However, see (3).

3. Extensions built with one version of Python/numarray may not work
with a different version. This means the safer approach is to have all
addons inside the same directory, so that you can blow away just one
directory and be sure that no 'old' packages remain.

Some new stuff being put into Python also envisions being able to add
various zipped files to the Python path as places to be searched.
Perhaps this represents a packaging opportunity. I haven't paid enough
attention to be sure.

While we are on the subject of packaging, the current distribution
places all sorts of extraneous test and installation-related files in
the Lib directory.
This makes it harder to work with the source when you are new to it.

From tim.hochberg at ieee.org Tue Jan 7 09:35:17 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Jan 7 09:35:17 2003
Subject: [Numpy-discussion] package vs module
In-Reply-To:
References:
Message-ID: <3E1B0FAF.7020607@ieee.org>

Pearu Peterson wrote:

>On Mon, 6 Jan 2003, Perry Greenfield wrote:
>
>
>
>>The main disadvantages I see so far are:
>>
>>1) One will either have to change import statements in old code
>>   to match the new style (a pain, but generally changing imports
>>   is not terribly difficult since they are easy to identify) or
>>   explicitly add the path to each 3rd party module to Python
>>   Path (or some equivalent).
>>2) If numarray were accepted into the Python Standard Library, it
>>   would be the first case (as far as I can tell) of a standard
>>   library package where we would expect to add sub modules to
>>   it (e.g., FFT). Normally these would not be distributed with
>>   the standard library, so some general mechanism will be needed
>>   to allow numarray to find 3rd party packages outside of the
>>   Python directory structure. For example, I don't think we can
>>   require having people install FFT in the Standard Library
>>   directory structure after Python is installed. Rather, we would
>>   probably have numarray look for extension modules in a standard
>>   named site-packages directory (or site-numarray?) or otherwise
>>   check a numarraypath environmental variable so that
>>   import numarray.FFT works properly. Perhaps others have ideas
>>   about how to best handle this.
>>
>>Any other issues being overlooked?
>>
>>
>
>There is one, though not so critical at this point, but I will raise
>it anyway. In summary, I am +1 for making numarray a package.
>
>The issue is related to import time and memory usage: more extension
>modules in a package increase both of them, even if users have no
>intention to use these modules. On slower machines this may cause
>inconveniences, especially in applications that call Python multiple times
>for short tasks containing numarray operations.
>
>

That's not right, is it? I'm pretty certain that submodules in a package
are not loaded until explicitly imported. I'm not sure why SciPy is
slow, maybe the __init__ imports everything? I don't have a copy here so
I can't check right now.

In any event I'm +1 for putting it in a package unless it interferes
with it getting into the core. As Paul mentioned keeping it in a zip
archive would be even cooler once that's an option.

-tim

From falted at openlc.org Wed Jan 8 13:27:06 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 8 13:27:06 2003
Subject: [Numpy-discussion] some recarray rework
Message-ID: <20030108212648.GA1309@openlc.org>

Hi,

In the context of optimizing the PyTables support for numarray and
recarray objects I have been playing with the recarray module, and
ended up with a somewhat improved version of it.

Roughly, the modifications done are:

- Addition of a cache to quickly access the columns (numarrays) in
  recarrays. This object is a map (dictionary) where keys are the
  field names and values are the pointers to columns regarded as
  numarray entities. This dictionary is accessible through the new
  attribute "_fields".

- Addition of an attribute for recarray objects named "_record" which
  points to a special object ("Record2" class) that is aware of the
  "_fields" cache. It can be used to access the different rows in
  recarray objects in an efficient way.

- The "_record" object is callable (it defines the "__call__" method)
  so as to select the recarray row that is active during access to the
  different fields.

Advantages

- Access to rows and columns (fields) in recarray objects is one order
  of magnitude faster (!).

- The new "_fields" and "_record" attributes provide convenient and
  intuitive ways to access the information in recarrays.

- The "_record" attribute supports the "__getattr__" and "__setattr__"
  methods that are very convenient to access fields in a row.

Drawbacks

- The "_record" attribute always points to the same object and you must
  pass it the row over which you want to operate. So, if you want to
  have two different objects pointing to different rows, you can't use
  the "_record" attribute to get them (but you can still use the
  existing Record class by calling the "__getitem__" method of a
  recarray object).

- Two new attributes are added to the already large number of recarray
  variables. However, these new variables have no special space
  requirements, as the "_record" object has only three scalar variables
  and "_fields" is a dictionary with as many entries as there are
  fields in the recarray, which should not be a large amount.

I'm attaching this modified version as well as a testbed program in
order to test its new access methods and improved performance. The
output of this program run on a 2 GHz Pentium 4 machine is also
included.

Feel free to play with it and/or take/adapt the parts you consider
better suited to recarray module.
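
The column cache and the single reusable record object described above can be sketched without numarray. This is an illustration in present-day Python under stated assumptions, not the attached implementation: `ColumnTable` and `RowCursor` are hypothetical names, and plain lists stand in for the numarray columns.

```python
class RowCursor:
    """Reusable view of one row; calling it re-aims it instead of allocating."""

    def __init__(self, columns, nrows):
        # Write through __dict__ so that __setattr__ below is bypassed.
        self.__dict__["_columns"] = columns
        self.__dict__["_nrows"] = nrows
        self.__dict__["_row"] = 0

    def __call__(self, row):
        """Select the active row and return the same cursor object."""
        if not 0 <= row < self._nrows:
            raise IndexError(row)
        self.__dict__["_row"] = row
        return self

    def __getattr__(self, name):
        # Only reached for names that are not real attributes: field names.
        try:
            return self._columns[name][self._row]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name not in self._columns:
            raise AttributeError(name)
        self._columns[name][self._row] = value


class ColumnTable:
    """Toy record array: one cached column per field plus one shared cursor."""

    def __init__(self, names, rows):
        self.names = list(names)
        # Column cache, built once: field name -> list of values.
        self._fields = {
            name: [row[i] for row in rows] for i, name in enumerate(self.names)
        }
        self._record = RowCursor(self._fields, len(rows))


table = ColumnTable(["col1", "col2"], [(456, "dbe"), (2, "de")])
rec = table._record
first = rec(0).col1              # read a field of row 0
rec(1).col2 = "xyz"              # write a field of row 1
alias = table._record(0)         # re-aims the one shared cursor
same_object = alias is rec
print(first, table._fields["col2"], same_object)
```

The last two lines illustrate the first drawback listed above: because there is only one cursor per table, asking for row 0 again also moves `rec` back to row 0, so two simultaneously valid row objects need the ordinary per-row Record path instead.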

--
Francesc Alted PGP KeyID: 0x61C8C11F
-------------- next part --------------
import numarray as num
import ndarray as mda
import memory
import chararray
import sys, copy, os, re, types, string

__version__ = '1.0'

class Char:
    """ data type Char class"""
    bytes = 1
    def __repr__(self):
        return "CharType"

CharType = Char()

# translation table to the num data types
numfmt = {'i1':num.Int8, 'u1':num.UInt8, 'i2':num.Int16, 'i4':num.Int32,
          'i8':num.Int64, 'f4':num.Float32, 'f8':num.Float64,
          'l':num.Bool, 'b':num.Int8, 'u':num.UInt8, 's':num.Int16,
          'i':num.Int32, 'N':num.Int64, 'f':num.Float32, 'd':num.Float64,
          'r':num.Float32, 'a':CharType,
          'Int8':num.Int8, 'Int16':num.Int16, 'Int32':num.Int32,
          'Int64':num.Int64, 'UInt8':num.UInt8, 'Float32':num.Float32,
          'Float64':num.Float64, 'Bool':num.Bool}

# the reverse translation table of the above (for numarray only)
revfmt = {num.Int16:'s', num.Int32:'i', num.Int64:'N', num.Float32:'r',
          num.Float64:'d', num.Bool:'l', num.Int8:'b', num.UInt8:'u',
          CharType:'a'}

# TFORM regular expression
format_re = re.compile(r'(?P<repeat>^[0-9]*)(?P<dtype>[A-Za-z0-9.]+)')

def fromrecords (recList, formats=None, names=None):
    """ create a Record Array from a list of records in text form

    The data in the same field can be heterogeneous, they will be promoted
    to the highest data type. This method is intended for creating
    smaller record arrays. If used to create large array e.g.

    r=recarray.fromrecords([[2,3.,'abc']]*100000)

    it is slow.

    >>> r=fromrecords([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r.field('col1')
    array([456, 2])
    >>> r.field('col2')
    CharArray(['dbe', 'de'])
    >>> import cPickle
    >>> print cPickle.loads(cPickle.dumps(r))
    RecArray[
    (456, 'dbe', 1.2),
    (2, 'de', 1.3)
    ]
    """

    _shape = len(recList)
    _nfields = len(recList[0])
    for _rec in recList:
        if len(_rec) != _nfields:
            raise ValueError, "inconsistent number of objects in each record"
    arrlist = [0]*_nfields
    for col in range(_nfields):
        tmp = [0]*_shape
        for row in range(_shape):
            tmp[row] = recList[row][col]
        try:
            arrlist[col] = num.array(tmp)
        except:
            try:
                arrlist[col] = chararray.array(tmp)
            except:
                raise ValueError, "inconsistent data at row %d,field %d" % (row, col)
    _array = fromarrays(arrlist, formats=formats, names=names)
    del arrlist
    del tmp
    return _array

def fromarrays (arrayList, formats=None, names=None):
    """ create a Record Array from a list of num/char arrays

    >>> x1=num.array([1,2,3,4])
    >>> x2=chararray.array(['a','dd','xyz','12'])
    >>> x3=num.array([1.1,2,3,4])
    >>> r=fromarrays([x1,x2,x3],names='a,b,c')
    >>> print r[1]
    (2, 'dd', 2.0)
    >>> x1[1]=34
    >>> r.field('a')
    array([1, 2, 3, 4])
    """

    _shape = len(arrayList[0])

    if formats == None:

        # go through each object in the list to see if it is a numarray or
        # chararray and determine the formats
        formats = ''
        for obj in arrayList:
            if isinstance(obj, chararray.CharArray):
                formats += `obj._itemsize` + 'a,'
            elif isinstance(obj, num.NumArray):
                if len(obj._shape) == 1: _repeat = ''
                elif len(obj._shape) == 2: _repeat = `obj._shape[1]`
                else: raise ValueError, "doesn't support numarray more than 2-D"

                formats += _repeat + revfmt[obj._type] + ','
            else:
                raise ValueError, "item in the array list must be numarray or chararray"
        formats=formats[:-1]

    for obj in arrayList:
        if len(obj) != _shape:
            raise ValueError, "array has different lengths"

    _array = RecArray(None, formats=formats, shape=_shape, names=names)

    # populate the record array (make a copy)
    for i in range(len(arrayList)):
        try:
            _array.field(_array._names[i])[:] = arrayList[i]
        except:
            print "Incorrect CharArray format %s, copy unsuccessful." % _array._formats[i]
    return _array

def fromstring (datastring, formats, shape=0, names=None):
    """ create a Record Array from binary data contained in a string"""
    _array = RecArray(chararray._stringToBuffer(datastring), formats, shape, names)
    if mda.product(_array._shape)*_array._itemsize > len(datastring):
        raise ValueError("Insufficient input data.")
    else: return _array

def fromfile(file, formats, shape=-1, names=None):
    """Create an array from binary file data

    If file is a string then that file is opened, else it is assumed
    to be a file object. No options at the moment, all file positioning
    must be done prior to this function call with a file object

    >>> import testdata, sys
    >>> fd=open(testdata.filename)
    >>> fd.seek(2880*2)
    >>> r=fromfile(fd, formats='d,i,5a', shape=3)
    >>> r._byteorder = "big"
    >>> print r[0]
    (5.1000000000000005, 61, 'abcde')
    >>> r._shape
    (3,)
    """

    if isinstance(shape, types.IntType) or isinstance(shape, types.LongType):
        shape = (shape,)
    name = 0
    if isinstance(file, types.StringType):
        name = 1
        file = open(file, 'rb')
    size = os.path.getsize(file.name) - file.tell()

    dummy = array(None, formats=formats, shape=0)
    itemsize = dummy._itemsize

    if shape and itemsize:
        shapesize = mda.product(shape)*itemsize
        if shapesize < 0:
            shape = list(shape)
            shape[ shape.index(-1) ] = size / -shapesize
            shape = tuple(shape)

    nbytes = mda.product(shape)*itemsize

    if nbytes > size:
        raise ValueError(
                "Not enough bytes left in file for specified shape and type")

    # create the array
    _array = RecArray(None, formats=formats, shape=shape, names=names)
    nbytesread = memory.file_readinto(file, _array._data)
    if nbytesread != nbytes:
        raise IOError("Didn't read as many bytes as expected")
    if name:
        file.close()
    return _array

# The test below was factored out of "array" due to platform specific
# floating point formatted results: e+020 vs. e+20
if sys.platform == "win32":
    _fnumber = "2.5984589414244182e+020"
else:
    _fnumber = "2.5984589414244182e+20"

__test__ = {}
__test__["array_platform_test_workaround"] = """
        >>> r=array('a'*200,'r,3s,5a,i',3)
        >>> print r[0]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        >>> print r[1]
        (%(_fnumber)s, array([24929, 24929, 24929], type=Int16), 'aaaaa', 1633771873)
        """ % globals()
del _fnumber

def array(buffer=None, formats=None, shape=0, names=None):
    """This function creates a new instance of a RecArray.

    buffer      specifies the source of the array's initialization data.
                buffer can be: RecArray, list of records in text, list of
                numarray/chararray, None, string, buffer.

    formats     specifies the format definitions of the array's records.

    shape       specifies the array dimensions.

    names       specifies the field names.

    >>> r=array([[456,'dbe',1.2],[2,'de',1.3]],names='col1,col2,col3')
    >>> print r[0]
    (456, 'dbe', 1.2)
    >>> r=array('a'*200,'r,3i,5a,s',3)
    >>> r._bytestride
    23
    >>> r._names
    ['c1', 'c2', 'c3', 'c4']
    >>> r._repeats
    [1, 3, 5, 1]
    >>> r._shape
    (3,)
    """

    if (buffer is None) and (formats is None):
        raise ValueError("Must define formats if buffer=None")
    elif buffer is None or isinstance(buffer, types.BufferType):
        return RecArray(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.StringType):
        return fromstring(buffer, formats=formats, shape=shape, names=names)
    elif isinstance(buffer, types.ListType) or isinstance(buffer, types.TupleType):
        if isinstance(buffer[0], num.NumArray) or isinstance(buffer[0], chararray.CharArray):
            return fromarrays(buffer, formats=formats, names=names)
        else:
            return fromrecords(buffer, formats=formats, names=names)
    elif isinstance(buffer, RecArray):
        return buffer.copy()
    elif isinstance(buffer, types.FileType):
        return fromfile(buffer, formats=formats, shape=shape, names=names)
    else:
        raise ValueError("Unknown input type")

def _RecGetType(name):
    """Converts a type repr string into a type."""
    if name == "CharType":
        return CharType
    else:
        return num._getType(name)

class RecArray(mda.NDArray):
    """Record Array Class"""

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=1):

        # names and formats can be either a string with components separated
        # by commas or a list of string values, e.g. ['i4', 'f4'] and 'i4,f4'
        # are equivalent formats

        self._parseFormats(formats)
        self._fieldNames(names)

        itemsize = self._stops[-1] + 1

        if shape != None:
            if type(shape) in [types.IntType, types.LongType]: shape = (shape,)
            elif (type(shape) == types.TupleType and type(shape[0]) in [types.IntType, types.LongType]):
                pass
            else: raise NameError, "Illegal shape %s" % `shape`

        #XXX need to check shape*itemsize == len(buffer)?

        self._shape = shape
        mda.NDArray.__init__(self, self._shape, itemsize, buffer=buffer,
                             byteoffset=byteoffset,
                             bytestride=bytestride,
                             aligned=aligned)
        self._byteorder = byteorder

        # Build the column arrays
        self._fields = self._get_fields()

        # Associate a record object for accessing values in each row
        # in an efficient way (i.e. without creating a new object each time)
        self._record = Record2(self)

    def _parseFormats(self, formats):
        """ Parse the field formats """

        if (type(formats) in [types.ListType, types.TupleType]):
            _fmt = formats[:]    ### make a copy
        elif (type(formats) == types.StringType):
            _fmt = string.split(formats, ',')
        else:
            raise NameError, "illegal input formats %s" % `formats`

        self._nfields = len(_fmt)
        self._repeats = [1] * self._nfields
        self._sizes = [0] * self._nfields
        self._stops = [0] * self._nfields

        # preserve the input for future reference
        self._formats = [''] * self._nfields

        sum = 0
        for i in range(self._nfields):

            # parse the formats into repeats and formats
            try:
                (_repeat, _dtype) = format_re.match(string.strip(_fmt[i])).groups()
            except: print 'format %s is not recognized' % _fmt[i]

            if _repeat == '': _repeat = 1
            else: _repeat = eval(_repeat)
            _fmt[i] = numfmt[_dtype]
            self._repeats[i] = _repeat

            self._sizes[i] = _fmt[i].bytes * _repeat
            sum += self._sizes[i]
            self._stops[i] = sum - 1

            # Unify the appearance of _format, independent of input formats
            self._formats[i] = `_repeat`+revfmt[_fmt[i]]

        self._fmt = _fmt

    def __getstate__(self):
        """returns pickled state dictionary for RecArray"""
        state = mda.NDArray.__getstate__(self)
        state["_fmt"] = map(repr, self._fmt)
        return state

    def __setstate__(self, state):
        mda.NDArray.__setstate__(self, state)
        self._fmt = map(_RecGetType, state["_fmt"])

    def _fieldNames(self, names=None):
        """convert input field names into a list and assign to the _names
        attribute """

        if (names):
            if (type(names) in [types.ListType, types.TupleType]):
                pass
            elif (type(names) == types.StringType):
                names = string.split(names, ',')
            else:
                raise NameError, "illegal input names %s" % `names`

            self._names = map(lambda n:string.strip(n), names)
        else: self._names = []

        # if the names are not specified, they will be assigned as "c1, c2,..."
        # if not enough names are specified, they will be assigned as "c[n+1],
        # c[n+2],..." etc. where n is the number of specified names..."
        self._names += map(lambda i: 'c'+`i`, range(len(self._names)+1,self._nfields+1))

    def _get_fields(self):
        """ get a dictionary with fields as numeric arrays """

        # Iterate over all the fields
        fields = {}
        for fieldName in self._names:
            # determine the offset within the record
            indx = index_of(self._names, fieldName)
            _start = self._stops[indx] - self._sizes[indx] + 1

            _shape = self._shape
            _type = self._fmt[indx]
            _buffer = self._data
            _offset = self._byteoffset + _start

            # don't use self._itemsize due to possible slicing
            _stride = self._strides[0]

            _order = self._byteorder

            if isinstance(_type, Char):
                arr = chararray.CharArray(buffer=_buffer, shape=_shape,
                          itemsize=self._repeats[indx], byteoffset=_offset,
                          bytestride=_stride)
            else:
                arr = num.NumArray(shape=_shape, type=_type, buffer=_buffer,
                          byteoffset=_offset, bytestride=_stride,
                          byteorder = _order)

                # modify the _shape and _strides for array elements
                if (self._repeats[indx] > 1):
                    arr._shape = self._shape + (self._repeats[indx],)
                    arr._strides = (self._strides[0], _type.bytes)

            # Put this array as a value in dictionary
            fields[fieldName] = arr

        return fields

    def field(self, fieldName):
        """ get the field data as a numeric array """
        return self._fields[fieldName]

    def info(self):
        """display instance's attributes (except _data)"""
        _attrList = dir(self)
        _attrList.remove('_data')
        _attrList.remove('_fmt')
        for attr in _attrList:
            print '%s = %s' % (attr, getattr(self,attr))

    def __str__(self):
        outstr = 'RecArray[ \n'
        for i in self:
            outstr += Record.__str__(i) + ',\n'
        return outstr[:-2] + '\n]'

    ### The following __getitem__ is not in the requirements
    ### and is here for experimental purposes
    def __getitem__(self, key):
        if type(key) == types.TupleType:
            if len(key) == 1:
                return mda.NDArray.__getitem__(self,key[0])
            elif len(key) == 2 and type(key[1]) == types.StringType:
                return mda.NDArray.__getitem__(self,key[0]).field(key[1])
            else:
                raise NameError, "Illegal key %s" % `key`
        return mda.NDArray.__getitem__(self,key)

    def _getitem(self, key):
byteoffset = self._getByteOffset(key) row = (byteoffset - self._byteoffset) / self._strides[0] return Record(self, row) def _setitem(self, key, value): byteoffset = self._getByteOffset(key) row = (byteoffset - self._byteoffset) / self._strides[0] for i in range(self._nfields): self.field(self._names[i])[row] = value.field(self._names[i]) def reshape(*value): print "Cannot reshape record array." class Record2: """Record2 Class This class is similar to Record except for the fact that it is created and associated with a recarray in their creation time. When speed in traversing the recarray is required this approach is more convenient than create a new Record object for each row that is visited. """ def __init__(self, input): self.__dict__["_array"] = input self.__dict__["_fields"] = input._fields self.__dict__["_row"] = 0 def __call__(self, row): """ set the row for this record object """ if row < self._array.shape[0]: self.__dict__["_row"] = row return self else: return None def __getattr__(self, fieldName): """ get the field data of the record""" try: return self._fields[fieldName][self._row] except: (type, value, traceback) = sys.exc_info() raise AttributeError, "Error accessing \"%s\" attr.\n %s" % \ (fieldName, "Error was: \"%s: %s\"" % (type,value)) def __setattr__(self, fieldName, value): """ set the field data of the record""" self._fields[fieldName][self._row] = value def __str__(self): """ represent the record as an string """ outlist = [] for name in self._array._names: outlist.append(`self._fields[name][self._row]`) return "(" + ", ".join(outlist) + ")" class Record: """Record Class""" def __init__(self, input, row=0): if isinstance(input, types.ListType) or isinstance(input, types.TupleType): input = fromrecords([input]) if isinstance(input, RecArray): self.array = input self.row = row def __getattr__(self, fieldName): """ get the field data of the record""" #return self.array.field(fieldName)[self.row] if fieldName in self.array._names: #return 
self.array.field(fieldName)[self.row] return self.array._fields[fieldName][self.row] def field(self, fieldName): """ get the field data of the record""" #return self.array.field(fieldName)[self.row] return self.array.field(fieldName)[self.row] def __str__(self): outstr = '(' #for i in range(self.array._nfields): # print self.array.field(i)[self.row] for name in self.array._names: #print self.array.field(name)[self.row] #print self.array._fields[name][self.row] ### this is not efficient, need to know how to convert N-bytes to each data type outstr += `self.array.field(name)[self.row]` + ', ' return outstr[:-2] + ')' def index_of(nameList, key): """ Get the index of the key in the name list. The key can be an integer or string. If integer, it is the index in the list. If string, the name matching will be case-insensitive and trailing blank-insensitive. """ if (type(key) in [types.IntType, types.LongType]): indx = key elif (type(key) == types.StringType): _names = nameList[:] for i in range(len(_names)): _names[i] = string.lower(_names[i]) try: indx = _names.index(string.strip(string.lower(key))) except: raise NameError, "Key %s does not exist" % key else: raise NameError, "Illegal key %s" % `key` return indx def find_duplicate (list): """Find duplication in a list, return a list of dupicated elements""" dup = [] for i in range(len(list)): if (list[i] in list[i+1:]): if (list[i] not in dup): dup.append(list[i]) return dup def test(): import doctest, recarray return doctest.testmod(recarray) if __name__ == "__main__": test() -------------- next part -------------- import sys, time import numarray as num import chararray import recarray import recarray2 # This is my modified version usage = \ """usage: %s recordlength Set recordlength to 1000 at least to obtain decent figures! 
""" % sys.argv[0] try: reclen = int(sys.argv[1]) except: print usage sys.exit() delta = 0.000001 # Creation of recarrays objects for test x1=num.array(num.arange(reclen)) x2=chararray.array(None, itemsize=7, shape=reclen) x3=num.array(num.arange(reclen,reclen*3,2), num.Float64) r1=recarray.fromarrays([x1,x2,x3],names='a,b,c') r2=recarray2.fromarrays([x1,x2,x3],names='a,b,c') print "recarray shape in test ==>", r2.shape print "Assignment in recarray modified" print "-------------------------------" t1 = time.clock() for row in xrange(reclen): rec = r2._record(row) # select the row to be changed #rec.b = "changed" # change the "b" field rec.c = float(row**2) # Change the "c" field t2 = time.clock() ttime = round(t2-t1, 3) print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta)) print "Field b on row 2 after re-assign:", r2.field("c")[2] print print "Assignment in recarray original" print "-------------------------------" t1 = time.clock() for row in xrange(reclen): #r1.field("b")[row] = "changed" r1.field("c")[row] = float(row**2) t2 = time.clock() ttime = round(t2-t1, 3) print "Assign time:", ttime, " Rows/s:", int(reclen/(ttime+delta)) print "Field b on row 2 after re-assign:", r1.field("c")[2] print print "Selection in recarray modified" print "------------------------------" t1 = time.clock() for row in xrange(reclen): rec = r2._record(row) if rec.a < 3: print "This record pass the cut ==>", rec.c, "(row", row, ")" t2 = time.clock() ttime = round(t2-t1, 3) print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta)) print print "Selection in recarray original" print "------------------------------" t1 = time.clock() for row in xrange(reclen): rec = r1[row] if rec.field("a") < 3: print "This record pass the cut ==>", rec.field("c"), "(row", row, ")" t2 = time.clock() ttime = round(t2-t1, 3) print "Select time:", ttime, " Rows/s:", int(reclen/(ttime+delta)) -------------- next part -------------- recarray shape in test ==> (10000,) Assignment in 
recarray modified ------------------------------- Assign time: 0.15 Rows/s: 66666 Field b on row 2 after re-assign: 4.0 Assignment in recarray original ------------------------------- Assign time: 1.24 Rows/s: 8064 Field b on row 2 after re-assign: 4.0 Selection in recarray modified ------------------------------ This record pass the cut ==> 0.0 (row 0 ) This record pass the cut ==> 1.0 (row 1 ) This record pass the cut ==> 4.0 (row 2 ) Select time: 0.18 Rows/s: 55555 Selection in recarray original ------------------------------ This record pass the cut ==> 0.0 (row 0 ) This record pass the cut ==> 1.0 (row 1 ) This record pass the cut ==> 4.0 (row 2 ) Select time: 1.52 Rows/s: 6578 From falted at openlc.org Fri Jan 10 09:17:05 2003 From: falted at openlc.org (Francesc Alted) Date: Fri Jan 10 09:17:05 2003 Subject: [Numpy-discussion] Some datatypes missing in numarray recarray? Message-ID: <200301101813.41407.falted@openlc.org> Hi, I think there are some data types missing in the recarray module. I can create recarrays using the fromarrays function with no problems except if I use UInt16, UInt32 and UInt64. As these types are well supported by numarray, is there any reason why they don't appear on numfmt and revfmt mappings in recarray module?. Is it safe to add them by hand in the source? Thanks, -- Francesc Alted From perry at stsci.edu Fri Jan 10 10:37:02 2003 From: perry at stsci.edu (Perry Greenfield) Date: Fri Jan 10 10:37:02 2003 Subject: [Numpy-discussion] Some datatypes missing in numarray recarray? In-Reply-To: <200301101813.41407.falted@openlc.org> Message-ID: > Hi, > > I think there are some data types missing in the recarray module. I can > create recarrays using the fromarrays function with no problems > except if I > use UInt16, UInt32 and UInt64. > > As these types are well supported by numarray, is there any > reason why they > don't appear on numfmt and revfmt mappings in recarray module?. Is it safe > to add them by hand in the source? 
>
> Thanks,
>
> --
> Francesc Alted
>

Good point. We were using this for an I/O library that didn't use these
types, so that's why they didn't get in there originally. But you are
right, they should be. Do you want to make the changes?

Thanks, Perry

From costas at malamas.com Sat Jan 11 01:12:03 2003
From: costas at malamas.com (Costas Malamas)
Date: Sat Jan 11 01:12:03 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
Message-ID: <000701c2b951$74d59880$6e00a8c0@retek.int>

Hello all,

I have been trying to find a package/addon that will provide a sparse
array class to NumPy, or will at least trick NumPy into using a sparse
array as a regular array, to no avail.

By sparse array here, I do not mean a sparse matrix equation solver, but
an array class that accepts a "default value". In other words, I would
like to instantiate a 1000x1000x1000 (1e9) array that will have at most
5-10% populated (i.e. non-zero) elements. The current NumPy will
instantiate the entire 1e9 array, which is a non-starter if you would
like to calculate an expression with, say, 4-5 arrays. Instead, I'd like
a class that will only store the populated cells and return the default
value for the others (ideally, by doing some smart disk I/O to preserve
memory).

I've tried SciPy, Scientific Python, and a few other modules floating
around; none seem to do the trick, yet I can't help but think that this
is not an uncommon setup for a lot of problem domains. Is there a package
out there? If there isn't, where should I start looking to create one?
From their description, I think SparseLib++ at least would be a good
starting point as a base library.

As a secondary issue, is anyone aware of a package that can handle
storage of such arrays? netCDF and HDF do not seem to fit the bill; a
B-Tree library seems a more natural fit...
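A minimal sketch of the kind of class described above, assuming a plain
Python dict keyed by index tuples is an acceptable store. The name
SparseArray and its methods are hypothetical and do not correspond to any
existing NumPy or SciPy API; it shows only the "default value" lookup,
not arithmetic or disk-backed storage.

```python
class SparseArray:
    """N-D array that stores only the cells that differ from a default."""

    def __init__(self, shape, default=0.0):
        self.shape = tuple(shape)
        self.default = default
        self._cells = {}  # maps an index tuple to its stored value

    def _check(self, index):
        # Reject indices of the wrong rank or outside the declared shape.
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index %r out of bounds" % (index,))
        return tuple(index)

    def __getitem__(self, index):
        # Unpopulated cells fall back to the default value.
        return self._cells.get(self._check(index), self.default)

    def __setitem__(self, index, value):
        key = self._check(index)
        if value == self.default:
            # Writing the default frees the cell instead of storing it.
            self._cells.pop(key, None)
        else:
            self._cells[key] = value

    def populated(self):
        """Number of cells actually stored."""
        return len(self._cells)


a = SparseArray((1000, 1000, 1000), default=0.0)
a[1, 2, 3] = 4.5
a[999, 0, 0] = -1.0
a[1, 2, 3] = 0.0  # set back to the default, so the cell is dropped
```

Only explicitly assigned cells take memory here, but an in-memory dict
has a large per-entry overhead, so 5-10% of 1e9 cells would still call
for the disk-backed store (B-Tree or similar) mentioned above.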
Thanks in advance --any and all input appreciated,

Costas

From ehagemann at comcast.net Sun Jan 12 15:14:06 2003
From: ehagemann at comcast.net (eric hagemann)
Date: Sun Jan 12 15:14:06 2003
Subject: [Numpy-discussion] questions about array types
Message-ID: <003c01c2ba90$32d015b0$6401a8c0@eric>

Rereading the Numeric docs, I see the reference to the types Float,
Float32 and Float64, which make sense. However, I am curious to
understand the usefulness of the types Float0, Float8 and Float16, which
all seem to be synonyms for Float32. Was there some thinking that there
would be a converter written for 8-bit floats?

>>> from Numeric import *
>>> a = array([1,2,3,4],Float32)
>>> fromstring(a.tostring(),Float32)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float)
array([   2.00000047,  512.00012255])   # corrupt, as would be expected
>>> fromstring(a.tostring(),Float0)   # seems to convert back as if Float0 == Float32
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float8)
array([ 1.,  2.,  3.,  4.],'f')
>>> fromstring(a.tostring(),Float16)
array([ 1.,  2.,  3.,  4.],'f')
>>>

From oliphant at ee.byu.edu Mon Jan 13 12:59:04 2003
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Jan 13 12:59:04 2003
Subject: [Numpy-discussion] Sparse Arrays in NumPy?
In-Reply-To: <000701c2b951$74d59880$6e00a8c0@retek.int>
Message-ID:

> Hello all,
>
> I have been trying to find a package/addon that will provide a sparse array
> class to NumPy, or will at least trick NumPy to use a sparse array as a
> regular array, to no avail.
>

Sparse arrays are not a common object. Sparse matrices have many, many
implementations, of which I'm sure you're aware. What you want is a
general-purpose N-D array that uses some kind of sparse storage. I'm not
aware of such an object in any other language. Most of the time people
remap their particular problem so that any sparse arrays become sparse
matrices.
All of the effort is then focused on manipulating certain classes of
sparse matrices.

-Travis

From Chris.Barker at noaa.gov Wed Jan 15 10:21:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 10:21:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu>
Message-ID: <3E2598CC.DAB8FD8A@noaa.gov>

Hi folks,

I use Numeric and wxPython together a lot (of course I do, I use Numeric
for everything!).

Unfortunately, since wxPython is not Numeric aware, you lose some real
potential performance advantages. For example, I'm now working on
expanding the extensions to graphics device contexts (DCs) so that you
can draw a whole bunch of objects with a single Python call. The idea is
that the looping can be done in C++, rather than Python, saving a lot of
overhead of the loop itself, as well as the Python-wxWindows translation
step.

For drawing thousands of points, the speed-up is substantial. It's less
substantial on more complex objects (rectangles give a factor of two
improvement for ~1000 objects), due to the longer time it takes to draw
the object itself, rather than make the call.

Anyway, at the moment, Robin Dunn has the wrappers set up so that you
can pass in a NumPy array (or, indeed, any sequence) rather than a list
or tuple of coordinates, but it is faster to use a list than a NumPy
array, because for arrays, it uses the generic PySequence_GetItem call.
If we used the NumPy API directly, it should be faster than using a
list, not slower! This is how a representative section of the code looks
now:

    bool isFastSeq = PyList_Check(pyPoints) || PyTuple_Check(pyPoints);
    .
    .
    .
        // Get the point coordinates
        if (isFastSeq) {
            obj = PySequence_Fast_GET_ITEM(pyPoints, i);
        }
        else {
            obj = PySequence_GetItem(pyPoints, i);
        }
    .
    .
    .

So you can see that if a NumPy array is passed in, PySequence_GetItem
will be used.

What I would like to do is have an isNumPyArray check, and then access
the NumPy array directly in that case.

The tricky part is that Robin does not want to have wxPython require
Numeric. (Oh how I dream of the day that NumArray becomes part of the
standard library!)
How can I check if an Object is a NumPy array (and then use it as such),
without including Numeric during compilation?

I know one option is to have conditional compilation, with a NumPy and
non-NumPy version, but Robin is managing a whole lot of different
versions as it is, and I don't think he wants to deal with twice as many!

Anyone have any ideas?

By the way, you can substitute NumArray for NumPy in this, as it is the
wave of the future, and particularly if it would be easier.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From paul at pfdubois.com Wed Jan 15 10:50:07 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Wed Jan 15 10:50:07 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>

If you could do:

    try:
        import Numeric
        haveNumeric = 1
    except ImportError:
        haveNumeric = 0

in some initialization routine, then you could use this flag.
Alternately, you could test on the fly whether Numeric has already been
imported (sys.modules is a dictionary keyed by module name):

    'Numeric' in sys.modules

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On
> Behalf Of Chris Barker
> Sent: Wednesday, January 15, 2003 9:22 AM
> Cc: Numpy-discussion
> Subject: [Numpy-discussion] Optionally using Numeric in
> another compiled extension package.
>
>
> Hi folks,
>
> I use Numeric and wxPython together a lot (of course I do, I
> use Numeric for everything!).
>
> Unfortunately, since wxPython is not Numeric aware, you lose
> some real potential performance advantages. For example, I'm
> now working on expanding the extensions to graphics device
> contexts (DCs) so that you can draw a whole bunch of objects
> with a single Python call. The idea is that the looping can
> be done in C++, rather than Python, saving a lot of overhead
> of the loop itself, as well as the Python-wxWindows translation step.
>
> For drawing thousands of points, the speed-up is substantial.
> It's less substantial on more complex objects (rectangles
> give a factor of two improvement for ~1000 objects), due to
> the longer time it takes to draw the object itself, rather
> than make the call.
>
> Anyway, at the moment, Robin Dunn has the wrappers set up so
> that you can pass in a NumPy array (or, indeed, any sequence)
> rather than a list or tuple of coordinates, but it is faster
> to use a list than a NumPy array, because for arrays, it uses
> the generic PySequence_GetItem call. If we used the NumPy API
> directly, it should be faster than using a list, not slower!
> This is how a representative section of the code looks
> now:
>
>
>     bool isFastSeq = PyList_Check(pyPoints) ||
>                      PyTuple_Check(pyPoints);
>     .
>     .
>     .
>         // Get the point coordinates
>         if (isFastSeq) {
>             obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>         }
>         else {
>             obj = PySequence_GetItem(pyPoints, i);
>         }
>
>     .
>     .
>     .
>
> So you can see that if a NumPy array is passed in,
> PySequence_GetItem will be used.
>
> What I would like to do is have an isNumPyArray check, and
> then access the NumPy array directly in that case.
>
> The tricky part is that Robin does not want to have wxPython
> require Numeric. (Oh how I dream of the day that NumArray
> becomes part of the standard library!) How can I check if an
> Object is a NumPy array (and then use it as such), without
> including Numeric during compilation?
>
> I know one option is to have conditional compilation, with a
> NumPy and non-NumPy version, but Robin is managing a whole
> lot of different versions as it is, and I don't think he wants
> to deal with twice as many!
>
> Anyone have any ideas?
>
> By the way, you can substitute NumArray for NumPy in this, as
> it is the wave of the future, and particularly if it would be easier.
>
> -Chris
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> NOAA/OR&R/HAZMAT         (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>

From jmiller at stsci.edu Wed Jan 15 10:57:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 10:57:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu> <3E2598CC.DAB8FD8A@noaa.gov>
Message-ID: <3E25B253.1070108@stsci.edu>

Chris Barker wrote:

>Hi folks,
>
>I use Numeric and wxPython together a lot (of course I do, I use Numeric
>for everything!).
>
>Unfortunately, since wxPython is not Numeric aware, you lose some real
>potential performance advantages. For example, I'm now working on
>expanding the extensions to graphics device contexts (DCs) so that you
>can draw a whole bunch of objects with a single Python call.
>The idea is that the looping can be done in C++, rather than Python,
>saving a lot of overhead of the loop itself, as well as the
>Python-wxWindows translation step.
>
>For drawing thousands of points, the speed-up is substantial. It's less
>substantial on more complex objects (rectangles give a factor of two
>improvement for ~1000 objects), due to the longer time it takes to draw
>the object itself, rather than make the call.
>
>Anyway, at the moment, Robin Dunn has the wrappers set up so that you
>can pass in a NumPy array (or, indeed, any sequence) rather than a list
>or tuple of coordinates, but it is faster to use a list than a NumPy
>array, because for arrays, it uses the generic PySequence_GetItem call.
>If we used the NumPy API directly, it should be faster than using a
>list, not slower! This is how a representative section of the code looks
>now:
>
>
>bool isFastSeq = PyList_Check(pyPoints) ||
>PyTuple_Check(pyPoints);
>.
>.
>.
>    // Get the point coordinates
>    if (isFastSeq) {
>        obj = PySequence_Fast_GET_ITEM(pyPoints, i);
>    }
>    else {
>        obj = PySequence_GetItem(pyPoints, i);
>    }
>
>.
>.
>.
>
>So you can see that if a NumPy array is passed in, PySequence_GetItem
>will be used.
>
>What I would like to do is have an isNumPyArray check, and then access
>the NumPy array directly in that case.
>
>The tricky part is that Robin does not want to have wxPython require
>Numeric. (Oh how I dream of the day that NumArray becomes part of the
>standard library!)
>How can I check if an Object is a NumPy array (and then use it as such),
>without including Numeric during compilation?
>
>I know one option is to have conditional compilation, with a NumPy and
>non-NumPy version, but Robin is managing a whole lot of different
>versions as it is, and I don't think he wants to deal with twice as many!
>
>Anyone have any ideas?
>

Use the Python C-API and string literals as the basis for the interface.
I think the steps are something like this:

1. Import "Numeric".
   (PyImport_ImportModule)

2. Get the module dictionary. (PyModule_GetDict)

3. Get "array" out of the dictionary. (PyDict_GetItemString)

4. Call "isinstance" on Numeric.array and the object.
   (PyObject_IsInstance)

Similarly:

1. Import "numarray".

2. Get the module dictionary.

3. Get "NumArray" out of the dictionary.

4. Call the C-API equivalent of "isinstance" on numarray.NumArray and
   the object.

The first 3 steps of both cases can be initialized once, I think, and
stored in C static variables to avoid repeated fetches.

If any of the first 3 steps fail, then consider that case failed and
return False.

If it's not a Numeric array, check to see if it's a numarray.

>
>By the way, you can substitute NumArray for NumPy in this, as it is the
>wave of the future, and particularly if it would be easier.
>
>-Chris
>
>

Todd

From Chris.Barker at noaa.gov Wed Jan 15 11:00:05 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 11:00:05 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <000001c2bcc6$dd570790$6601a8c0@NICKLEBY>
Message-ID: <3E25A1E4.5CA8C453@noaa.gov>

Paul F Dubois wrote:
>
> If you could do:
>    try:
>       import Numeric
>       haveNumeric = 1
>    except ImportError:
>       haveNumeric = 0
>
> in some initialization routine, then you could use this flag.
> Alternately you could test on the fly
>     'Numeric' in sys.modules

Thanks, but I'm talking about doing this at the C++ level in an
extension package, not at the Python level. This kind of thing is so
much easier in Python, of course!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jmiller at stsci.edu Wed Jan 15 12:01:53 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:01:53 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E119E0E.2010403@stsci.edu> <3E2598CC.DAB8FD8A@noaa.gov> <3E25B253.1070108@stsci.edu>
Message-ID: <3E25C182.8080906@stsci.edu>

Todd Miller wrote:

> Chris Barker wrote:
>
>> How can I check if an Object is a NumPy array (and then use it as such),
>> without including Numeric during compilation?
>>
>> I know one option is to have conditional compilation, with a NumPy and
>> non-NumPy version, but Robin is managing a whole lot of different
>> versions as it is, and I don't think he wants to deal with twice as many!
>>
>> Anyone have any ideas?
>>
> Use the Python C-API and string literals as the basis for the
> interface. I think the steps are something like this:
>
> 1. Import "Numeric". (PyImport_ImportModule)
>
> 2. Get the module dictionary. (PyModule_GetDict)
>
> 3. Get "array" out of the dictionary. (PyDict_GetItemString)
>
> 4. Call "isinstance" on Numeric.array and the object.
> (PyObject_IsInstance)
>
> Similarly:
>
> 1. Import "numarray".
>
> 2. Get the module dictionary.
>
> 3. Get "NumArray" out of the dictionary
>
> 4. Call the C-API equivalent of "isinstance" on numarray.NumArray and
> the object.
>
> The first 3 steps of both cases can be initialized once, I think, and
> stored in C static variables to avoid repeated fetches.

On second thought, just do two functions, one for Numeric, one for
numarray. If any of the first 3 steps fail, return False. Otherwise,
return the result of the isinstance call.

>
> If it's not a Numeric array, check to see if it's a numarray.

My idea to couple these was "not good". They're not compatible at that
level anyway.

Since numarray and Numeric are only source-level compatible, C code can
be compiled to work with one or the other, but not both at the same
time. It probably makes more sense to just implement for Numeric. If
you do want to implement for both, treat them as separate cases with
separate recognizer functions and element access code.

But...
It's not clear to me that knowing an object is an array will help, since
getting data elements still has to be done fast, and that seems hard to
do without knowing the arrayobject struct. Keep in mind that Numeric and
numarray arrays are strided and possibly discontiguous, so there's more
to data access than owning a base pointer, as would be the case in C.

Todd

From falted at openlc.org Wed Jan 15 12:25:27 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:25:27 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25C182.8080906@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu>
Message-ID: <200301152123.45614.falted@openlc.org>

On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
> My idea to couple these was "not good". They're not compatible at that
> level anyway.
>
> Since numarray and Numeric are only source level compatible, C-code can
> be compiled to work with one or the other, but not both at the same
> time. It probably makes more sense to just implement for Numeric. If
> you do want to implement for both, treat them as separate cases with
> separate recognizer functions and element access code.
>
> But... It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast, and that
> seems hard to do without knowing the arrayobject struct. Keep in mind
> that Numeric and numarray arrays are strided and possibly discontiguous,
> so there's more to data access than owning a base pointer, as would be
> the case in C.

I think you can use the numarray High-Level C API to overcome these
difficulties.
For example, by using the calls:

PyArrayObject* NA_InputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_OutputArray(PyObject *numarray, NumarrayType t, int requires)
PyArrayObject* NA_IoArray(PyObject *numarray, NumarrayType t, int requires)

as documented in the User's Guide, you can get well-behaved (i.e.
contiguous and well-aligned) C arrays (copying them, if needed) from
either numarray or Numeric arrays if you pass C_ARRAY as the value for
the requires parameter.

In fact, I'm using NA_InputArray in PyTables to manage both numarray and
Numeric arrays with good results.

--
Francesc Alted

From jmiller at stsci.edu Wed Jan 15 12:40:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Jan 15 12:40:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <3E25B253.1070108@stsci.edu> <3E25C182.8080906@stsci.edu> <200301152123.45614.falted@openlc.org>
Message-ID: <3E25CA79.40206@stsci.edu>

Francesc Alted wrote:

>On Wednesday 15 January 2003 21:16, Todd Miller wrote:
>
>
>>But... It's not clear to me that knowing an object is an array will
>>help since getting data elements still has to be done fast, and that
>>seems hard to do without knowing the arrayobject struct. Keep in mind
>>that Numeric and numarray arrays are strided and possibly discontiguous,
>> so there's more to data access than owning a base pointer, as would be
>>the case in C.
>>
>>
>
>I think you can use the numarray High-Level C API to overcome these
>difficulties.
>

But doesn't using the numarray C-API require a level of coupling
(direct knowledge of numarray during compilation) that Chris is trying
to avoid?

>
>
>

Todd

From falted at openlc.org Wed Jan 15 12:59:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Jan 15 12:59:04 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
In-Reply-To: <3E25CA79.40206@stsci.edu>
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu>
Message-ID: <200301152158.44234.falted@openlc.org>

On Wednesday 15 January 2003 21:54, Todd Miller wrote:
> >I think you can use the numarray High-Level C API to overcome these
> >difficulties.
>
> But doesn't using the numarray C-API require a level of coupling
> (direct knowledge of numarray during compilation) that Chris is trying
> to avoid?
>

Oops, you are right.

Perhaps this kind of scenario (accessing Numeric and numarray arrays
from C) will become more and more common as people become more aware of
numarray's capabilities and want to integrate it into their extensions.

That reinforces my belief that having a small core with the "glue"
functionality between numarray objects and 3rd party extensions in C (or
SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
Standard Library).

That way, people interested in supporting numarray objects in their
extensions have only to install this small core (or even include it as
part of the extension).

Well, speaking as a disinterested and impartial person ;-)

--
Francesc Alted

From Chris.Barker at noaa.gov Wed Jan 15 13:50:02 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Wed Jan 15 13:50:02 2003
Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package.
References: <20021230235736.GA15420@idi.ntnu.no> <200301152123.45614.falted@openlc.org> <3E25CA79.40206@stsci.edu> <200301152158.44234.falted@openlc.org>
Message-ID: <3E25C99A.9D5E1888@noaa.gov>

Francesc Alted wrote:
> that having a small core with the "glue"
> functionality between numarray objects and 3rd party extensions in C (or
> SWIG, Pyrex or whatever) can be a good thing (until numarray is in the
> Standard Library).
>
> That way, people interested in supporting numarray objects in their
> extensions have only to install this small core (or even include it as part
> of the extension).

I think that's a fabulous idea, but I have no idea how hard it would be.
There would still be the problem of keeping versions in sync. If I
distributed my package with the glue code, it would only work on
installations using the same version of Numeric (or NumArray, I suppose).

Thanks to all who have commented on my post. These are some ideas I now
have based on your comments:

> > Use the Python C-API and string literals as the basis for the
> > interface. I think the steps are something like this:
> >
> > 1. Import "Numeric". (PyImport_ImportModule)
> >
> > 2. Get the module dictionary. (PyModule_GetDict)
> >
> > 3. Get "array" out of the dictionary. (PyDict_GetItemString)
> >
> > 4. Call "isinstance" on Numeric.array and the object.
> > (PyObject_IsInstance)

OK, so now I can know, at runtime, whether Numeric has been imported.

> But... It's not clear to me that knowing an object is an array will
> help since getting data elements still has to be done fast, and that
> seems hard to do without knowing the arrayobject struct.

Exactly. That's my whole problem. However, I have an idea about this. If
I do the above test, I can now put all the Numeric-specific code into a
conditional, so it would only get called if Numeric were imported. My
idea is that I could make sure Numeric was around at compile time, so I
could use all the Numeric API to access the array data, but it wouldn't
have to be installed at runtime, as none of the Numeric calls would be
executed if Numeric hadn't been imported. Would this work, or would the
system try to load the .dll or .so or whatever even if the calls weren't
executed?
All that being said, Tim Hochberg has mentioned that when he first made wxPython DCs work with Numeric Arrays (sorry I didn't give him credit before, I had forgotten who did that, thanks Tim), he did some timing and discovered that the overhead of the drawing calls was substantially larger than the overhead of the indexing anyway, so speeding up that process couldn't make much difference. My timing indicated something different, but I'm using Linux/wxGTK/X11, and I think the drawing calls return after the message has been sent to X, but X may not have completed the actual drawing yet. This means that I'm not timing the whole process, and if I did, I might not see such a difference. I did some tests with 100,000 points, and found that I could see the difference between a List and an Array, and the List was about twice as fast. Drawing rectangles, however, I can't see the difference. So, I think I'll probably shelve this for the moment, and concentrate on getting all the drawing shapes supported by DrawXXXList methods. Thanks for all your input. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gvermeul at grenoble.cnrs.fr Wed Jan 15 13:50:05 2003 From: gvermeul at grenoble.cnrs.fr (gvermeul at grenoble.cnrs.fr) Date: Wed Jan 15 13:50:05 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled Message-ID: <200301152149.h0FLn6PN032653@grenoble.cnrs.fr> > Gerard Vermeulen wrote: > > I just want to point out that PyQwt plots NumPy arrays. I have played > > a little bit with the Scipy-wxWindows interface, but it is no match > > for PyQwt (I display x-y data with 16000 points). > > Thanks for the tip, I'll check it out. I think what you have there is > that the plotting is all done at the C++ level, expecting some kind of > sequence of data points.
That's exactly what I want to address with > wxPython: being able to pass in a whole sequence and have the looping > done at the C++ level. > Yes, I am using PyArray_ContiguousFromObject() to convert any sequence into a NumPy array before copying the data into Qwt's double arrays. > > Have you ever tested whether it's faster or slower to plot data passed in > as a list vs. a NumPy array? > I did not test it, but there is certainly more overhead if you pass a list or a tuple into PyArray_ContiguousFromObject() than if you pass a NumPy array. > > How do you access the data in the passed in sequence? Do you use: > PySequence_GetItem ? > No, see above. The code looks like this (in "sip" language; sip is a sort of swig, but more specialized to C++ and Qt):

    void setData(double *, double *, int);
    %MemberCode
        PyObject *xSeq, *ySeq;
        $C *ptr;

        if (sipParseArgs(&sipArgsParsed, sipArgs, "mOO",
                         sipThisObj, sipClass_$C, &ptr, &xSeq, &ySeq)) {
            /* Convert any sequence into a contiguous 1-D array of doubles. */
            PyArrayObject *x = (PyArrayObject *)
                PyArray_ContiguousFromObject(xSeq, PyArray_DOUBLE, 1, 0);
            if (!(x))
                return 0;
            PyArrayObject *y = (PyArrayObject *)
                PyArray_ContiguousFromObject(ySeq, PyArray_DOUBLE, 1, 0);
            if (!(y)) {
                Py_DECREF(x);   /* do not leak x when y cannot be converted */
                return 0;
            }

            int size;
            Py_BEGIN_ALLOW_THREADS
            /* Use the shorter of the two arrays. */
            size = (x->dimensions[0] < y->dimensions[0])
                 ? x->dimensions[0] : y->dimensions[0];
            ptr->setData((double*)(x->data), (double*)(y->data), size);
            Py_END_ALLOW_THREADS

            Py_DECREF(x);
            Py_DECREF(y);
            Py_INCREF(Py_None);
            return Py_None;
        }
    %End

The setData calls copy the data. > > thanks for the tip. Qwt (and PyQwt) look very nice, I may have to > reconsider using PyQT! > Gerard > > -Chris > > > > > > Take a look at http://gerard.vermeulen.free.fr > > > > PyQwt is an addon for PyQt (a Python wrapper for Qt) that knows nothing > > about NumPy > > > > Maybe it is possible to make a NumPy plot add-on for wxWindows, too.
> > > > Gerard > > > > On Wed, Jan 15, 2003 at 09:22:20AM -0800, Chris Barker wrote: > > > Hi folks, > > > > > > I use Numeric an wxPython together a lot (of course I do, I use Numeric > > > for everything!). > > > > > > Unfortunately, since wxPython is not Numeric aware, you lose some real > > > potential performance advantages. For example, I'm now working on > > > expanding the extensions to graphics device contexts (DCs) so that you > > > can draw a whole bunch of objects with a single Python call. The idea is > > > that the looping can be done in C++, rather than Python, saving a lot of > > > overhead of the loop itself, as well as the Python-wxWindows translation > > > step. > > > > > > For drawing thousands of points, the speed-up is substantial. It's less > > > substantial on more complex objects (rectangles give a factor of two > > > improvement for ~1000 objects), due to the longer time it takes to draw > > > the object itself, rather than make the call. > > > > > > Anyway, at the moment, Robin Dunn has the wrappers set up so that you > > > can pass in a NumPy array (or, indeed, and sequence) rather than a list > > > or tuple of coordinates, but it is faster to use a list than a NumPy > > > array, because for arrays, it uses the generic PySequence_GetItem call. > > > If we used the NumPy API directly, it should be faster than using a > > > list, not slower! THis is how a representative section of the code looks > > > now: > > > > > > > > > bool isFastSeq = PyList_Check(pyPoints) || > > > PyTuple_Check(pyPoints); > > > . > > > . > > > . > > > // Get the point coordinants > > > if (isFastSeq) { > > > obj = PySequence_Fast_GET_ITEM(pyPoints, i); > > > } > > > else { > > > obj = PySequence_GetItem(pyPoints, i); > > > } > > > > > > . > > > . > > > . > > > > > > So you can see that if a NumPy array is passed in, PySequence_GetItem > > > will be used. 
> > > > > > What I would like to do is have an isNumPyArray check, and then access > > > the NumPy array directly in that case. > > > > > > The tricky part is that Robin does not want to have wxPython require > > > Numeric. (Oh how I dream of the day that NumArray becomes part of the > > > standard library!) > > > How can I check if an Object is a NumPy array (and then use it as such), > > > without including Numeric during compilation? > > > > > > I know one option is to have condition compilation, with a NumPy and > > > non-Numpy version, but Robin is managing a whole lot of different > > > version as it is, and I don't think he wants to deal with twice as many! > > > > > > Anyone have any ideas? > > > > > > By the way, you can substitute NumArray for NumPy in this, as it is the > > > wave of the future, and particularly if it would be easier. > > > > > > -Chris > > > > > > > > > -- > > > Christopher Barker, Ph.D. > > > Oceanographer > > > > > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > > > 7600 Sand Point Way NE (206) 526-6329 fax > > > Seattle, WA 98115 (206) 526-6317 main reception > > > > > > Chris.Barker at noaa.gov > > -- > Christopher Barker, Ph.D.
> Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov From Jack.Jansen at oratrix.com Wed Jan 15 14:18:05 2003 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Wed Jan 15 14:18:05 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. In-Reply-To: <3E25A1E4.5CA8C453@noaa.gov> Message-ID: <1D394963-28D7-11D7-AE69-000A27B19B96@oratrix.com> On Wednesday, Jan 15, 2003, at 19:01 Europe/Amsterdam, Chris Barker wrote: > Paul F Dubois wrote: >>
>> If you could do:
>>     try:
>>         import Numeric
>>         haveNumeric = 1
>>     except ImportError:
>>         haveNumeric = 0
>>
>> in some initialization routine, then you could use this flag.
>> Alternately you could test on the fly
>>     'Numeric' in sys.modules
> > Thanks, but I'm talking about doing this at the C++ level in an > extension package, not at the Python level. This kind of thing is Soo > much easier in Python, of course! This can be done, but it is difficult, and you need the cooperation of both parties (Numeric and wxPython, in this case). The problem is that you need a way to pass C pointers from one extension module to the other. One of the pointers you want to pass is the PyTypeObject, so you can check that an object passed in from Python is of the correct type. Another is the address of some C routine that will get you a C pointer to the data. The first one may be visible from Python (so you can get at it through normal means) but the second one won't be. The dirty way to do this (and you should probably avoid this) is to put these pointers into Python integers in the supplying module, and put them in the module namespace with a funny name (__ConvertToCPointerAddress).
In wxPython you import Numeric, and if it succeeds you look up the funny name, convert the Python integer to a C pointer, cross your fingers, and call the address. A cleaner way to do this is with cobject objects. These are in the core, in Objects/cobject.c. Numeric exports a cobject (again named __ConvertToCPointerAddress) with the address of the routine as the value. But, and this is the nice bit, cobjects can be passed along by Python code but can't be fiddled with. And cobject.c even provides a C function PyCObject_Import(char *modulename, char *attributename) which directly returns you the pointer you're looking for by importing the module, looking up the name, checking that it's a cobject and extracting the value. And it even has support for "protocols": Cobjects have an extra field called the description, again only settable and readable from C. Modules that don't know about each others' existence could still decide on a common description that would signify that the pointer in the cobject has a specific meaning. We could decide here that if the description is the C string "this pointer is a function that you pass one Python object and that returns the data just as Numeric would store it" would fit that bill, and anyone in the world writing an extension module could follow the protocol. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From Jack.Jansen at oratrix.com Wed Jan 15 14:34:05 2003 From: Jack.Jansen at oratrix.com (Jack Jansen) Date: Wed Jan 15 14:34:05 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules Message-ID: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> Actually, wrt my previous message on cobjects for communicating between extension modules, we can do one better! This is an idea I've been toying with for the MacPython extension types, and I think it's applicable to Numeric too. It goes as follows. 
Each Numeric object has an attribute with a well-known name, let's call it "__Numeric_C_interface". This is a Cobject, and it is shared among all Numeric objects of the same type. The value of this C object is a pointer to a C structure with pointers to all the C routines you might want to call on the object, basically the PyArray_API structure (I think). The descr of the C object is a string with the version number of this particular PyArray_API structure. An extension module that knows about this protocol and gets passed an object that it thinks might be a Numeric array checks whether the object has an __Numeric_C_interface attribute. If so, it retrieves it, checks that it is a Cobject, gets the descriptor and tests it for compatibility, and if it is compatible gets the cobject pointer and happily calls all the Numeric routines it needs. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From falted at openlc.org Thu Jan 16 04:00:03 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Jan 16 04:00:03 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules In-Reply-To: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> Message-ID: <200301161259.13522.falted@openlc.org> On Wednesday 15 January 2003 23:33, Jack Jansen wrote: > Actually, wrt my previous message on cobjects for communicating between > extension modules, we can do one better! > > This is an idea I've been toying with for the MacPython extension > types, and I think it's applicable to Numeric too. It goes as follows. > > Each Numeric object has an attribute with a well-known name, let's call > it "__Numeric_C_interface". This is a Cobject, and it is shared among > all Numeric objects of the same type.
The value of this C object is a > pointer to a C structure with pointers to all the C routines you might > want to call on the object, basically the PyArray_API structure (I > think). The descr of the C object is a string with the version number > of this particular PyArray_API structure. > > An extension module that knows about this protocol and gets passed an > object that it think might be a Numeric array checks whether the object > has an __Numeric_C_interface attribute. If so it retrieves it, checks > that it is a Cobject, gets the descriptor and tests it for > compatibility and if it is compatible gets the cobject pointer and > happily calls all the Numeric routines it needs. That's a nice idea. But I see two drawbacks: - numarray needs to be reworked to include the Cobject descriptors, although I don't know if this would be difficult or not. - you still need to have Numeric or numarray installed on the client machine. This could be the usual case, but what about extensions that want to use Numeric internally (because a number of reasons, like better number representation, convenient interface to C, etc) without forcing the user to install it? However, designing a small library with a minimalist API (I'm thinking in something similar to zlib) could be very handy in allowing extensions (but also native python modules) to deal with numarray objects. As I said before, this would require the user to install only this small library, but it can also be included in the application or package. However, this second alternative can be tricky, as Chris Barker has signaled, because the different numarray versions coming in the future. But IMO a series of factors may alleviate this handicap: - The numarray data structure should be very stable, as improvements are normally made at the functionality level. - The library should provide a minimalistic, high level API that, if it is well designed, should cope with small modifications in the numarray data structures. 
- Finally, when such differences have to be added in a way that would break the current API, that version should be marked as a major release, and existing extensions (or whatever software embeds the library) will know that they have to release new versions if they want to support the newest objects. But, hopefully, that should happen quite infrequently. Of course, this small library should cope with both numarray and Numeric (at least, the not too old versions of it) objects. But I think this shouldn't pose a big problem, as the current numarray API can already do that. This logical separation between structure and functionality might also lead to better acceptance by numerical software craftsmen, as they can be more confident that the API for dealing with numarray objects will be quite stable over time. Well, this is just a thought. I must confess that I'm so interested in this issue because I really want to support numarray objects in my project, and I'm just wondering which is the best way to do that without creating too much nuisance for the users. In fact, I'm considering building such a library myself, but that could be a waste of time if I have to redo it for every numarray release. Cheers, -- Francesc Alted From peter.chang at nottingham.ac.uk Thu Jan 16 08:47:04 2003 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Thu Jan 16 08:47:04 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. In-Reply-To: <3E25C99A.9D5E1888@noaa.gov> Message-ID: On Wed, 15 Jan 2003, Chris Barker wrote: [...] > My idea is that I could make sure Numeric was around at compile time, so > I could use all the Numeric API to access the array data, but it > wouldn't have to be installed at runtime, as none of the Numeric calls > would be executed if Numeric hadn't been imported. Would this work, or > would the system try to load the .dll or .so or whatever even if the > calls weren't executed?
One way is to explicitly import a dynamic library that has the glue code to handle the array objects when you need them. [...] > My timing indicated something different, but I'm using Linux/wxGTK/X11, > and I think the drawing calls return after the message has been sent to > X, but X may not have completed the actual drawing yet. That's right. X's communication model between client and server is asynchronous. > This means that I'm not timing the whole process, and if I did, I might > not see such a difference. You can synchronise the output buffer using XSync(3) and then do the timing. Peter From Chris.Barker at noaa.gov Thu Jan 16 09:58:04 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jan 16 09:58:04 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiled extension package. References: Message-ID: <3E26E45F.3C7E2293@noaa.gov> peter.chang at nottingham.ac.uk wrote: > You can synchronise the output buffer using XSync(3) and then do the > timing. I'd love to try this, but I confess I have no idea how! I'm working with the *.i files that tell swig what to add when creating wrappers around wxWindows for Python. wxWindows is using wxGTK, which is using GTK, which is using Xlib (I think), so I'm pretty far away from X, and I barely know enough C/C++ to attempt this. I suppose I could try including Xlib, then calling XSync, but I need to pass a reference to a display. I have no idea how to get that. Any hints? -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Jan 16 10:33:07 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Jan 16 10:33:07 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> Message-ID: <3E26EC9D.A0B7D173@noaa.gov> Jack Jansen wrote: > An extension module that knows about this protocol and gets passed an > object that it thinks might be a Numeric array checks whether the object > has an __Numeric_C_interface attribute. If so it retrieves it, checks > that it is a Cobject, gets the descriptor and tests it for > compatibility and if it is compatible gets the cobject pointer and > happily calls all the Numeric routines it needs. Wow Jack! Are you single-handedly going to implement all my pet projects that I'm too stupid to know how to do myself? (The other one was Universal text file support.) I can only barely follow what you're suggesting, but I still have a question about it. It seems this would provide a way for an extension module to identify whether an object was a Numeric array, and then get a pointer to it, but how would it know the API for dealing with the arrays without the Numeric header file? Or would you have to include the header file when compiling, but not need the library at runtime unless it was actually used? That seems a reasonable compromise. If this would work, I think it's a great idea. Short of including NumArray with the standard library (which I imagine is at least a couple of Python releases away), it would be a great solution for folks who are writing extensions that they want to be able to take advantage of Numeric when it's there, but not require it. Do any of the primary Numarray developers think this is a good and doable idea? -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From peter.chang at nottingham.ac.uk Thu Jan 16 11:22:03 2003 From: peter.chang at nottingham.ac.uk (peter.chang at nottingham.ac.uk) Date: Thu Jan 16 11:22:03 2003 Subject: [Numpy-discussion] Optionally using Numeric in another compiledextension package. In-Reply-To: <3E26E45F.3C7E2293@noaa.gov> Message-ID: On Thu, 16 Jan 2003, Chris Barker wrote: > peter.chang at nottingham.ac.uk wrote: > > > You can synchronise the output buffer using XSync(3) and then do the > > timing. Oops, that should be XSynchronize(3). [...] > I suppose I could try including Xlib, then calling XSync, but I need to > pass a reference to a disply. I have not idea how to get that. > > Any hints? wxGetDisplayName() gives the Display name but not a pointer to the display structure. So this is not much help. In gtk+, any program can be called with --sync to aid debugging. I'd guess wxWindows may allow you to do the same. Peter From jmiller at stsci.edu Thu Jan 16 12:06:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 16 12:06:05 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> Message-ID: <3E271006.4000607@stsci.edu> Chris Barker wrote: >Jack Jansen wrote: > > > >>An extension module that knows about this protocol and gets passed an >>object that it think might be a Numeric array checks whether the object >>has an __Numeric_C_interface attribute. If so it retrieves it, checks >>that it is a Cobject, gets the descriptor and tests it for >>compatibility and if it is compatible gets the cobject pointer and >>happily calls all the Numeric routines it needs. >> >> > >Wow Jack! are single handely going to impliment all my pet projects that >I'm too stupid to know how to do my self ? 
(the other one was Universal >text file support) > >I can only barely follow what you're suggesting, but I still have a >question about it. It seems while this would provide a way ro an >extension module to identify whether an object was a Numeric array, and >then get a pointer to it, how would it know the API for dealing with the >arrays, without the Numeric header file? Or would you have to include >the header file when compiling, but not need the library at runtime >unless it was actually used, which seems a reasonable compromise. > >If this would work, I think it's a great idea. Short of including >NumArray with the standard library (which I imagine is a least a couple >of Python releases away), it would be a great solution for folks that >are writing extensions that they want to be able take advantage of >Numeric when it's there, but not require it. > >Do any of the primary Numarray developers think this is a good and >doable idea? > > Roll out the time machine... it's already done. As long as you don't define the macros PY_ARRAY_UNIQUE_SYMBOL or NO_IMPORT_ARRAY, any file that includes arrayobject.h gets a static copy of PyArray_API. If the module executes import_array() at an appropriate time, normally module initialization, but not necessarily, the static PyArray_API gets filled in and becomes usable. The import_array() call is critical; without it, API calls through the static PyArray_API are calls to NULL and segfault. I think that if Numeric is not present, and you call import_array(), it will fail quietly but leave the Python error status set. So it might make sense to call PyErr_Clear() after doing import_array(). >-Chris > So it sounds like your whole "weak linkage" scheme is plausible now with Numeric (maybe even numarray!), as would be a minimal API module. 1. We discussed yesterday how to determine if an object is a Numeric array w/o even compiling with arrayobject.h. 
The important idea there was that if Numeric is not present, the "isarray" (or whatever) function will return false rather than segfaulting because the API pointer isn't filled in. 2. Call API functions in contexts where you know you're looking at Numeric arrays, i.e., right after isarray(). This creates a guard which prevents you from calling API functions when Numeric is not present. 3. Call import_array() at some time before using the API functions, possibly at module init time, failing quietly and clearing the error in installations where Numeric is not installed. Todd From jmiller at stsci.edu Fri Jan 17 14:16:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 17 14:16:03 2003 Subject: [Numpy-discussion] Exporting Numpy C functionality to other extension modules References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> Message-ID: <3E288068.3070407@stsci.edu> Take a look at the attached extension module "testlite" which demonstrates the technique I evolved from this discussion. As we discussed, this usage pattern enables the construction of an extension which will take advantage of numarray if it is there, but will continue to work if the user has not installed numarray. Here's how it works: 1. I created a new API function, PyArray_isArray() which is safe to call in all contexts. I defined it as: #define PyArray_isArray(o) (PyArray_API && NA_isNumArray(o)) I added NA_isNumArray(o) to the numarray C-API because it was the easy way to do it. 2. Ordinary API functions are safe to call once an object has been identified to be a numarray because it implies (locally) that the PyArray_API pointer has been initialized. 3. I tried out the standard import_array() code and added some cleanup for the case where numarray is not installed. The only caveat I see at this point is that you are required to include numarray headers in order to use this. 
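The guard pattern in the three steps above can be illustrated in pure Python. This is a sketch under stated assumptions, not the testlite code: `api_table`, `load_optional_api`, `is_array` and `total` are hypothetical names, `api_table` plays the role of the PyArray_API pointer (None until filled in), and the standard-library `array` module stands in for numarray so that the snippet runs without any extra package.

```python
import importlib

# Hypothetical stand-in for the PyArray_API pointer: None means "not filled in".
api_table = None


def load_optional_api(module_name):
    """Rough analogue of import_array(): try the import, fail quietly."""
    global api_table
    try:
        api_table = importlib.import_module(module_name)
    except ImportError:
        api_table = None          # swallow the error, like PyErr_Clear()
    return api_table is not None


def is_array(obj, type_name="array"):
    """Safe in all contexts: False whenever the optional package is absent."""
    return api_table is not None and isinstance(obj, getattr(api_table, type_name))


def total(obj):
    """Take the package-specific path only behind the is_array() guard."""
    if is_array(obj):
        return sum(obj.tolist())  # "API call", reached only when the API is loaded
    return sum(obj)               # generic fallback for any other sequence


# Package missing: the guard is False and the fallback path is used.
print(load_optional_api("no_such_array_package"), is_array([1, 2, 3]), total([1, 2, 3]))

# Package present (stdlib "array" as the stand-in): the guarded path is used.
print(load_optional_api("array"))
values = api_table.array("d", [1.0, 2.5])
print(is_array(values), total(values))
```

The short-circuit `and` in `is_array` is the Python counterpart of `PyArray_API && NA_isNumArray(o)`: the second operand is never evaluated when the table is empty.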
In numarray's case, this might necessitate header updates and/or function call modifications. The numarray C-API should stabilize pretty soon, but I don't think its quite there yet. The same approach should apply to Numeric. This stuff is in numarray CVS now and should be in the next numarray release. Todd -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: testlite.c URL: From haase at msg.ucsf.edu Fri Jan 17 14:25:04 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jan 17 14:25:04 2003 Subject: [Numpy-discussion] make C array accessible to python without copy Message-ID: <03fa01c2be77$4cae4430$3b45da80@rodan> Hi, What is the C API to make an array that got allocated, let's say, by a = new short[512*512], accessible to python as numarray. I tried NA_New - but that seems to make a copy. I would need it to use the original memory space so that I can "observe" the array from Python WHILE the underlying C array changes (it's actually a camera image) Thanks, Sebastian Haase From jmiller at stsci.edu Fri Jan 17 15:17:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 17 15:17:01 2003 Subject: [Numpy-discussion] make C array accessible to python without copy References: <03fa01c2be77$4cae4430$3b45da80@rodan> Message-ID: <3E288EB1.80107@stsci.edu> Sebastian Haase wrote: >Hi, >What is the C API to make an array that got allocated, >let's say, by a = new short[512*512], >accessible to python as numarray. > What you want to do is not currently supported well in C. The way to do what you want is: 1. Create a buffer object from your C++ array. The buffer object can be built such that it refers to the original copy of the data. 2. Call back into Python (numarray.NumArray) with your buffer object as the buffer parameter. You can scavenge the code in NA_newAll (Src/newarray.ch) for most of the callback. >I tried NA_New - but that seems to make a copy. 
>I would need it to use the original memory space >so that I can "observe" the array from Python WHILE >the underlying C array changes (it's actually a camera image) > That sounds cool! > >Thanks, >Sebastian Haase > From falted at openlc.org Sat Jan 18 01:23:03 2003 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 18 01:23:03 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray Message-ID: <200301181022.07015.falted@openlc.org> Hi, I'm trying to make a C array from a Numeric "c" (Character) typecode array using the high level call: NA_InputArray(PyObject *numarray, NumarrayType t, int requires) with no success. As I have been able to access all the other types (i.e. '1','b','s','i','l','f','d') successfully, perhaps character type is not supported? In the NumarrayType enum, there is no tChar, but I've tried tUInt8 and tAny as the value for NumarrayType parameter, but both choices issues the same error: Traceback (most recent call last): File "table-tree2.py", line 77, in ?
h5file.createArray('/columns', 'name', array(names), "Name column") File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in createArray setattr(group, name, object) File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in __setattr__ value._f_putObjectInTree(name, self) File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in _f_putObjectInTree self.create() File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in create self.createArray(self.object, self.title) File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, in createArray array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY) libnumarray.error: getShape: sequence object nested more than MAXDIM deep. although I was passing only a Numeric 'c' with a rather small shape (10,16). I just want to access the buffer data, and the shape of this object from C (well, I'm actually using Pyrex, but I think this is not important). Is that possible by only using numarray C calls? Thanks, -- Francesc Alted From jmiller at stsci.edu Sat Jan 18 08:27:04 2003 From: jmiller at stsci.edu (Todd Miller) Date: Sat Jan 18 08:27:04 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] Message-ID: <3E2983C3.7000304@stsci.edu> Francesc Alted wrote: >Hi, > >I'm trying to make a C array from a Numeric "c" (Character) typecode array >using the high level call: > >NA_InputArray(PyObject *numarray, NumarrayType t, int requires) > Unified handling of character arrays and numeric arrays doesn't exist yet in numarray. There is no C-API for the chararray module because we haven't needed one. But CharArrays are NDArrays and have attributes stored in PyArrayObjects just like numarrays. >with no success. > >As I have been able to access all the other types (i.e. >'1','b','s','i','l','f','d') successfully, perhaps character type is not >supported? 
> >In the NumarrayType enum, there is no tChar, but I've tried tUInt8 and tAny >as the value for NumarrayType parameter, but both choices issues the same >error: > >Traceback (most recent call last): > File "table-tree2.py", line 77, in ? > h5file.createArray('/columns', 'name', array(names), "Name column") > File "/home/falted/PyTables/pytables-0.3/tables/File.py", line 400, in >createArray > setattr(group, name, object) > File "/home/falted/PyTables/pytables-0.3/tables/Group.py", line 355, in >__setattr__ > value._f_putObjectInTree(name, self) > File "/home/falted/PyTables/pytables-0.3/tables/Leaf.py", line 71, in >_f_putObjectInTree > self.create() > File "/home/falted/PyTables/pytables-0.3/tables/Array.py", line 83, in >create > self.createArray(self.object, self.title) > File "/home/falted/PyTables/pytables-0.3/src/hdf5Extension.pyx", line 913, >in createArray > array = NA_InputArray(arr, numfmt2[arr.typecode()], C_ARRAY) >libnumarray.error: getShape: sequence object nested more than MAXDIM deep. > NA_InputArray was intended to accept non-numeric sequences. It could report this better... >although I was passing only a Numeric 'c' with a rather small shape (10,16). > >I just want to access the buffer data, and the shape of this object from C >(well, I'm actually using Pyrex, but I think this is not important). Is that >possible by only using numarray C calls? > Look at Lib/chararray.py and Src/_chararraymodule.c. If you can handle using a CharArray or RawCharArray, try: 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. 3. shape, strides, and itemsize should be directly accessible from the PyArrayObject. CharArray has some extra stripping and padding semantics; these are lazy and hence absent without extra care in C. RawCharArray has none. CharArrays are really arrays of fixed length strings of bytes. 
The string length is defined by the array itemsize. >Thanks, > > > Todd From falted at openlc.org Sat Jan 18 10:18:02 2003 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 18 10:18:02 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] In-Reply-To: <3E2983C3.7000304@stsci.edu> References: <3E2983C3.7000304@stsci.edu> Message-ID: <200301181917.29533.falted@openlc.org> A Dissabte 18 Gener 2003 17:41, Todd Miller va escriure: > >I just want to access the buffer data, and the shape of this object from C > >(well, I'm actually using Pyrex, but I think this is not important). Is > > that possible by only using numarray C calls? > > Look at Lib/chararray.py and Src/_chararraymodule.c. > > If you can handle using a CharArray or RawCharArray, try: > > 1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in > the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. > > 2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. > > 3. shape, strides, and itemsize should be directly accessible from the > PyArrayObject. Ok. I'll try to do that. > > CharArray has some extra stripping and padding semantics; these are lazy > and hence absent without extra care in C. RawCharArray has none. > By the way, is it safe to assume that CharArray objects are contiguous? or RawCharArray?. The same question goes for RecArray objects. Or it is always convenient to check with iscontiguous() method if they are or not?. In case these objects can be non-contiguous, I guess there's still not a function like NA_InputArray that works with CharArray or RecArray objects in order to obtain well-behaved objects. Is that true? I think it would be possible to me to include support for numarray objects in next release of PyTables. 
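The three steps quoted above (data pointer plus byteoffset, then shape, strides, and itemsize) amount to simple address arithmetic. What follows is a minimal sketch of that arithmetic in plain Python over a bytes buffer. It is not the numarray C-API: the helper name element_at and the sample layout are hypothetical and serve only to show how strides locate each fixed-length string, whether or not the array is contiguous.

```python
def element_at(buf, byteoffset, strides, itemsize, index):
    """Return the raw bytes of one fixed-length element.

    buf        -- bytes-like object holding the array data
    byteoffset -- offset of element (0, 0, ...) from the start of buf
    strides    -- bytes to step per index along each dimension
    itemsize   -- length in bytes of each fixed-length string
    index      -- tuple of indices, one per dimension
    """
    start = byteoffset + sum(i * s for i, s in zip(index, strides))
    return bytes(buf[start:start + itemsize])


# Hypothetical layout: a 2 x 3 array of 4-byte strings stored contiguously
# after an 8-byte header, so the strides are (3 * 4, 4).
data = b"HEADER__" + b"aaaabbbbccccddddeeeeffff"
strides, itemsize, byteoffset = (12, 4), 4, 8

print(element_at(data, byteoffset, strides, itemsize, (1, 2)))

# A discontiguous view taking every second column of the same buffer:
# same data, shape (2, 2), column stride doubled. Nothing is copied.
view_strides = (12, 8)
print(element_at(data, byteoffset, view_strides, itemsize, (1, 1)))
```

Both calls address the same bytes, which is why working from the strides makes a separate contiguity check optional rather than required.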
Thanks!, -- Francesc Alted From jmiller at stsci.edu Sat Jan 18 11:57:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Sat Jan 18 11:57:03 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> Message-ID: <3E29B52C.2030602@stsci.edu> Francesc Alted wrote: >A Dissabte 18 Gener 2003 17:41, Todd Miller va escriure: > > >>>I just want to access the buffer data, and the shape of this object from C >>>(well, I'm actually using Pyrex, but I think this is not important). Is >>>that possible by only using numarray C calls? >>> >>> >>Look at Lib/chararray.py and Src/_chararraymodule.c. >> >>If you can handle using a CharArray or RawCharArray, try: >> >>1. call NA_updateDataPtr( array ) to refresh the data buffer pointer in >>the PyArrayObject. Even _chararraymodule.c doesn't do this right yet. >> >>2. call NA_OFFSETDATA(array) to add the byteoffset to the pointer. >> >>3. shape, strides, and itemsize should be directly accessible from the >>PyArrayObject. >> >> > >Ok. I'll try to do that. > > > >>CharArray has some extra stripping and padding semantics; these are lazy >>and hence absent without extra care in C. RawCharArray has none. >> >> >> > >By the way, is it safe to assume that CharArray objects are contiguous? or >RawCharArray?. > Mostly no. Each fixed length element is stored as a contiguous sequence of bytes. Anything goes for the rest, so you need to look at the strides arrays and byteoffset. >The same question goes for RecArray objects. > No. It's possible to select every 10th record, for instance, in a slice. I believe the resulting decimated array would be a discontiguous view of the original. >Or it is always >convenient to check with iscontiguous() method if they are or not?. > I'm not even certain the method works correctly for chararray and recarray. I think the portion of chararray that has been written in C considers array strides. 
recarray is pure python. In both cases, I think I'd just forget about
contiguity and use the strides arrays.

> In case
>these objects can be non-contiguous, I guess there's still not a function
>like NA_InputArray that works with CharArray or RecArray objects in order to
>obtain well-behaved objects. Is that true?
>
True. But neither recarray nor chararray really has behavedness problems
like misalignment, byteswapping, or type conversion. I think contiguity is
the only issue, and that is solved just by calling .copy(). You might argue
that records contain byteswapped and misaligned fields. I don't have an
immediate answer to that. My preference is to use strides and forget about
contiguity, but you could also make contiguous copies simply. No one I'm
aware of has yet tried access to misbehaved records in C.

>
>I think it would be possible to me to include support for numarray objects
>in next release of PyTables.
>
Great!

>Thanks!,
>
>

From verveer at embl.de Sun Jan 19 06:39:09 2003
From: verveer at embl.de (verveer at embl.de)
Date: Sun Jan 19 06:39:09 2003
Subject: [Numpy-discussion] numarray bug?
Message-ID: <1042987080.3e2ab8489e640@webmail.EMBL-Heidelberg.DE>

Hi,

The following gives an error:

>>> print numarray.Int8 == numarray.Any
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numerictypes.py", line 102, in __cmp__
    return genericTypeRank.index(self.name) - genericTypeRank.index(other.name)
ValueError: list.index(x): x not in list

A bug?

Cheers, Peter

--
Dr. Peter J. Verveer
Cell Biology and Cell Biophysics Programme
EMBL
Meyerhofstrasse 1
D-69117 Heidelberg
Germany
Tel.
: +49 6221 387245 Fax : +49 6221 387242 Email: verveer at embl-heidelberg.de From falted at openlc.org Mon Jan 20 04:17:03 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 20 04:17:03 2003 Subject: [Numpy-discussion] Accessing a Numeric 'c' array from numarray] In-Reply-To: <3E29B52C.2030602@stsci.edu> References: <3E2983C3.7000304@stsci.edu> <200301181917.29533.falted@openlc.org> <3E29B52C.2030602@stsci.edu> Message-ID: <200301201316.06127.falted@openlc.org> A Dissabte 18 Gener 2003 21:12, Todd Miller va escriure: > >By the way, is it safe to assume that CharArray objects are contiguous? or > >RawCharArray?. > > Mostly no. Each fixed length element is stored as a contiguous > sequence of bytes. Anything goes for the rest, so you need to look at > the strides arrays and byteoffset. > > >The same question goes for RecArray objects. > > No. It's possible to select every 10th record, for instance, in a > slice. I believe the resulting decimated array would be a discontiguous > view of the original. > > >Or it is always > >convenient to check with iscontiguous() method if they are or not?. > > I'm not even certain the method works correctly for chararray and > recarray. Well, during my tests with numarray 0.4, iscontiguous() seems to work well, both for chararrays and recarrays. > In both cases, I think I'd just forget about > contiguity and use the strides arrays. Yeah, but I still want to use iscontiguous() method just to speed-up a bit the code. > You might argue that records contain > byteswapped and misaligned fields. I don't have an immediate answer to > that. Exactly, I am pondering how to deal with HDF5 objects coming from machines with a different endianess (misalignment is not a problem in my case) than the local machine. But I think I can manage that by creating recarrays buffers with the byteorder parameter set appropriately during the HDF5 table reads. 
Then, all the data can be read correctly because numarray will byteswap the
data whenever this recarray is accessed. Moreover, if this object is to be
used frequently, I can speed up access to this recarray by byteswapping the
columns (as arrays) using their byteswap() method.

In the future it would be nice to provide a generic byteswap method for
recarrays.

Thanks,

--
Francesc Alted

From falted at openlc.org Mon Jan 20 11:02:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 20 11:02:02 2003
Subject: [Numpy-discussion] recarray2 re-visited
Message-ID: <200301202000.53584.falted@openlc.org>

Hi,

As I needed a byteswap() method for recarray, after a bit of hacking I've
made one myself. This is based on my own version of recarray to take
advantage of the _fields cache so as to both speed up and simplify the new
code.

Basically, the new method takes a recarray, checks which columns are
numarray arrays, and invokes their byteswap() method if needed. Easy, but
effective. Moreover, _byteswap() and togglebyteorder() are provided to be
compatible with existing methods in NumArray objects.

As a plus, the recarray __str__ has been modified so that printing takes
the byteorder of the recarray into account; this also improves the speed of
printing by a factor of 30, which can be handy in some situations.

Do with it whatever you want,

--
Francesc Alted
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2.py
Type: text/x-python
Size: 21435 bytes
Desc: not available
URL:
-------------- next part --------------
recarray shape in test ==> (10000,)

Assignment in recarray original
-------------------------------
Assign time: 1.24 Rows/s: 8064

Assignment in recarray modified
-------------------------------
Assign time: 0.16 Rows/s: 62499
Speed-up: 7.75

Selection in recarray original
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 1.53 Rows/s: 6535

Selection in recarray modified
------------------------------
This record pass the cut ==> 0.0 (row 0 )
This record pass the cut ==> 1.0 (row 1 )
This record pass the cut ==> 4.0 (row 2 )
Select time: 0.15 Rows/s: 66666
Speed-up: 10.2

Printing in recarray original
------------------------------
Print time: 18.11 Rows/s: 552

Printing in recarray modified
------------------------------
Print time: 0.63 Rows/s: 15872
Speed-up: 28.746
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recarray2-test.py
Type: text/x-python
Size: 2946 bytes
Desc: not available
URL:

From falted at openlc.org Tue Jan 21 08:01:13 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 08:01:13 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
Message-ID: <200301211744.55666.falted@openlc.org>

Hi,

Is anybody aware of any function (either in C or Python or a mixture of
both) to easily convert Numerical Python arrays from/to numarray arrays?

I mean, I would like a function that, without copying all the data element
by element, can copy the data buffer (or even share it, if that is possible
at all) from one object to the other.
Thanks, -- Francesc Alted From haase at msg.ucsf.edu Tue Jan 21 10:41:07 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue Jan 21 10:41:07 2003 Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays? References: <200301211744.55666.falted@openlc.org> Message-ID: <051501c2c17c$a83e8410$3b45da80@rodan> Hi, I think this is actually quite related to my post from Friday: [Numpy-discussion] make C array accessible to python without copy -> So, to reformulate: Who hold actually the array data in memory? Or: where gets the memory allocated and where/how many pointers to that exist? I understood the answer that Todd Miller gave, that there is such a thing as a "buffer object" that does all the work, so then: one would just have to take that and build a "new" numarray or Numeric structure around it (referring to the Subject of this email) or (in the case of my Friday-email) just have that "buffer object" point to a different memory space (that got already allocated by the C-program) . Agree ? (Did I get it right?) Sebastian Haase ----- Original Message ----- From falted at openlc.org Tue Jan 21 11:24:08 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 21 11:24:08 2003 Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays? In-Reply-To: <3E2D74A2.40204@stsci.edu> References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> Message-ID: <200301212005.30328.falted@openlc.org> A Dimarts 21 Gener 2003 17:26, v?reu escriure: > Francesc Alted wrote: > >Anybody is aware of any function (either in C or Python or a mixture of > >both) to easily convert Numerical Python arrays from/to numarray arrays? > > I think you should look at numarray.fromlist() and NumArray.tolist(). I > think fromlist() will work on a nested sequence object, and hence a > Numeric array. Yeah, I knew that, but I was looking for something more optimal. 
> > >I mean, I would like to use such a funtion that, without having to copy > >element by element all the data, be able to copy the data buffer (or even > >use the same if possible at all) from one object to the other. > > I have not looked at this yet; it's a very good question. Note that > going from numarray to Numeric there are issues with making the buffer > well-behaved. I think this should be not too difficult to achieve and I'll try to explain why. When going from numarray to Numeric, numarray already have NA_InputArray C-API function that returns a well-behaved array. But strictly speaking, we don't even need a well-behaved array (this is a too restrictive condition) as both Numeric and numarray support discontiguous data. Even the byteorder should be not a problem, because, as Numeric itself has no such a property, we can create a Numeric array that is in native order as the result and byteswap the numarray object (if needed) before doing the conversion. So, non-alignment remains as the only issue that may cause a buffer copy during numarray ==> Numeric conversion. Is that correct?. If yes, it is possible to do a workaround about that, i.e. we can still get a Numeric from a numarray without copying the data in case of numarray misaligned objects?. Regarding to going in the other sense (ie. Numeric ==> numarray), as numarray supports discontiguity, misalignment and byteswapped data, this conversion should not imply a data buffer copy at all. Once we have a pointer to the data buffer, it is only a matter of wrapping a Numeric or numarray object around it getting this info from the original object, and returning the new object as a result. All in all, this conversion *seems* to be not a too difficult task. 
Making such conversion functions available (in C, but also with Python
counterparts) might open the door to the co-existence of Numeric and
numarray objects in the same program, and that would ease numarray
deployment in existing Numeric software.

Comments?

--
Francesc Alted

From falted at openlc.org Tue Jan 21 11:24:11 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Jan 21 11:24:11 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
In-Reply-To: <051501c2c17c$a83e8410$3b45da80@rodan>
References: <200301211744.55666.falted@openlc.org> <051501c2c17c$a83e8410$3b45da80@rodan>
Message-ID: <200301212020.57384.falted@openlc.org>

On Tuesday 21 January 2003 19:41, Sebastian Haase wrote:
> Hi,
> I think this is actually quite related to my post from Friday:
> [Numpy-discussion] make C array accessible to python without copy
>
> -> So, to reformulate: Who hold actually the array data in memory? Or:
> where gets the memory allocated and where/how many pointers to that exist?
> I understood the answer that Todd Miller gave, that there is such a thing
> as a "buffer object" that does all the work, so then: one would just have
> to take that and build a "new" numarray or Numeric structure around it
> (referring to the Subject of this email) or (in the case of my
> Friday-email) just have that "buffer object" point to a different memory
> space (that got already allocated by the C-program) .
>
> Agree ? (Did I get it right?)

Well, so so. I think the buffer object is a property of numarray objects,
not Numeric objects. So, in the numarray ==> Numeric conversion process you
may need to access the internals of the buffer (for example by using the
high level numarray C-API) and manage to obtain a data buffer (in the C
sense, not an object) that can be used to build the Numeric object (with the
help of the numarray object metadata). The opposite way needs something
similar but with inverted roles.
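As a rough illustration of building a second object around an existing data buffer instead of copying it, here is a sketch that uses only Python's standard array and memoryview types. It does not use Numeric or numarray at all; the objects below are stand-ins, and the only point is that several views, each with its own metadata, can see the same memory.

```python
from array import array

# The "owner" of the memory: eight native doubles.
owner = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# A second object wrapped around the same buffer; no element is copied.
flat = memoryview(owner)

# Metadata (here the shape) can differ per view without touching the data:
# reinterpret the same bytes as a 2 x 4 grid of doubles.
grid = flat.cast("B").cast("d", shape=[2, 4])

# A write through one object is visible through the others.
owner[5] = 50.0
print(grid[1, 1])
print(flat[5])
```

The memory-management question raised in the thread shows up here too: while the views exist, the owning array cannot be resized, because the views hold a reference to its buffer.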
See my previous message for a more in-depth explanation. I think the conversion (without copying) is not a difficult process, but no so-easy like that. Well, I'm just a newcomer to numarray and my opinions about that may perfectly be completely wrong, of course. Take them with caution!. -- Francesc Alted From paul at pfdubois.com Tue Jan 21 12:06:34 2003 From: paul at pfdubois.com (paul at pfdubois.com) Date: Tue Jan 21 12:06:34 2003 Subject: [Numpy-discussion] RE: numarray/Numeric upkeep? Message-ID: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com> Here are some of the factors leading to the slow rate of change of Numeric lately. a. I changed to a new project and have had a lot of startup learning to do. My new project uses Numeric but not in as central a way as my old one. b. I mistakenly thought numarray would be ready sooner so that I was trying to let it slide. c. I announced last year, in view of (a), that I was needing to be replaced as HeadNummie. It would be logical to turn this over to the Numarray people, but they aren't ready to do it until Numarray is ready, so nothing happened. d. Except for Travis, most of the other listed Numeric developers aren't in fact doing patches, releases, etc. e. Not all patches that are submitted are correct or desirable, historically. I'm not saying anything about any patches you may have submitted, just pointing out that applying them requires real work, not just mechanical patching. In fact the rate of error in patches is quite high and I've learned to be cautious. f. Some patches interfere with each other; for example, a patch for making 64 bit machines work right and a patch for some specific bug collided. I've started to work on the MA for Numarray but I'm not able to do much work on Numeric right now. This is a place where someone else has to help. >-- Original Message -- >To: dubois at users.sourceforge.net >Subject: numarray/Numeric upkeep? 
>From: Michael Stone >Cc: >Date: Tue, 21 Jan 2003 11:32:03 -0800 > > > >No one seems to be doing bugfixes for Numeric or numarray. >Nothing seems to have happened for several months. Lots of bugs have been >posted for Numeric, some easily fixable (I submitted one with a patch). > >Any idea if either project will become active again anytime soon? From perry at stsci.edu Tue Jan 21 12:28:13 2003 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 21 12:28:13 2003 Subject: [Numpy-discussion] RE: numarray/Numeric upkeep? In-Reply-To: <3E0D02A0000164FB@mta6.wss.scd.yahoo.com> Message-ID: Michael Stone wrote: > >No one seems to be doing bugfixes for Numeric or numarray. > >Nothing seems to have happened for several months. Lots of bugs > have been ... It certainly isn't true that nothing has happened for several months with numarray. On what do you base this belief? While not all bugs have been fixed, the oldest listed in the numarray bug tracker is from December. Is there a bug you feel needs urgent attention? Work is continuing and new releases will be coming out. As to Paul's comments regarding when numarray will be ready, my guess is when the following are complete: - Package reorganization (make numarray a package) - Optimization for small arrays (making numarray'speed with small arrays more comparable with Numeric; this is probably the single largest remaining item) - Porting some well known packages such as MA (which Paul is working on), scipy, pyopengl and such to work with numarray. Some of this has been started. There are other smaller things to do as well. But I'm hoping that we can be done with these in a few months. Perry From bazell at comcast.net Tue Jan 21 12:33:35 2003 From: bazell at comcast.net (Dave Bazell) Date: Tue Jan 21 12:33:35 2003 Subject: [Numpy-discussion] array operation Message-ID: <00bd01c2c18c$10ab5000$6401a8c0@DB> I am trying to see if I can use where() or choose() to do this. I can't really figure it out. 
I have a 2-d array data where each row is an observation and each column is
an attribute of the observation:

data =
[[.3, .2, 2.3,...]  <- observation 1
 [.7, 1.2, .4...]   <- observation 2
 ...]]

I have another 1-d array that contains a code for the class of object:

class = [0,1,0,1,1,3,2,0,...]

where class[i] = the class of the ith object in the data array. Thus,
observation 1 above is class 0, observation 2 is class 1, and so on.

I want to select all objects of a given class from data array. I can do
this with a loop

for i in range(ndat):
    if class == 0:
        do something
    ....

Is there a way to use where() or choose() to do this? Would it be more
efficient?

Thanks,

Dave

From perry at stsci.edu Tue Jan 21 13:02:05 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Jan 21 13:02:05 2003
Subject: [Numpy-discussion] array operation
In-Reply-To: <00bd01c2c18c$10ab5000$6401a8c0@DB>
Message-ID:

Dave Bazell writes:

> I am trying to see if I can use where() or choose() to do this. I can't
> really figure it out.
>
> I have a 2-d array data where each row is an observation and each
> column is
> an attribute of the observation:
>
> data =
> [[.3, .2, 2.3,...] <- observation 1
> [.7, 1.2, .4...] <- observation 2
> ...]]
>
> I have another 1-d array that contains a code for the class of object:
>
> class = [0,1,0,1,1,3,2,0,...]

Note that using class is illegal, it is a reserved keyword.

>
> where class[i] = the class of the ith object in the data array. Thus,
> observation 1 above is class 0, observation 2 is class 1, and so on.
>
> I want to select all objects of a given class from data array. I can do
> this with a loop
>
I assume you mean you want to select all the rows corresponding to
all the observations where the code for the class corresponding to
that observation equals some particular value.

If so then for numarray this ought to work.

index = nonzero(code==1) # want indices of all the obs where class code = 1
selected_obs = data[index]

(or in one line if you wish: selected_obs = data[nonzero(code==1)] )

> for i in range(ndat):
> if class == 0:
> do something
> ....
>
> Is there a way to use where() or choose() to do this? Would it be more
> efficient?
>

Perry

From Chris.Barker at noaa.gov Tue Jan 21 14:30:10 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Tue Jan 21 14:30:10 2003
Subject: [Numpy-discussion] array operation
References:
Message-ID: <3E2DC965.9328BCD6@noaa.gov>

Perry Greenfield wrote:
> If so then for numarray this ought to work.
>
> index = nonzero(code==1) # want indices of all the obs where class code = 1
> selected_obs = data[index]

or, for Numeric, use take():

selected_obs = take(data,nonzero(code == 1),1)

(this will select columns corresponding to where the code == 1, which is
how I read your question)

By the way, choose() and where() do something similar, but give you an
array back that is the same size as the one you start with, with some
(or all) of the elements replaced. take() gives you a smaller array that
is a subset of the original one, which I think is what you want here.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker at noaa.gov

From jmiller at stsci.edu Tue Jan 21 14:39:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Jan 21 14:39:04 2003
Subject: [Numpy-discussion] Conversion functions between Numeric and numarray arrays?
References: <200301211744.55666.falted@openlc.org> <3E2D74A2.40204@stsci.edu> <200301212005.30328.falted@openlc.org>
Message-ID: <3E2DCBDA.1040604@stsci.edu>

Francesc Alted wrote:

>I think this should be not too difficult to achieve and I'll try to explain
>why.
>
>When going from numarray to Numeric, numarray already have NA_InputArray
>C-API function that returns a well-behaved array.
But strictly speaking, we >don't even need a well-behaved array (this is a too restrictive condition) >as both Numeric and numarray support discontiguous data. Even the byteorder >should be not a problem, because, as Numeric itself has no such a property, >we can create a Numeric array that is in native order as the result and >byteswap the numarray object (if needed) before doing the conversion. > In-place byteswapping sounds like a bad idea to me. What if the array is based upon a readonly buffer? We've just started using these at STSCI because a readonly memory map imposes no load on the system swap file. With a read only mapping, the buffer itself has readonly pages; these cannot be swapped in-place. >So, non-alignment remains as the only issue that may cause a buffer copy >during numarray ==> Numeric conversion. Is that correct?. > I don't think so. >If yes, it is >possible to do a workaround about that, i.e. we can still get a Numeric from >a numarray without copying the data in case of numarray misaligned objects?. > > I don't see how. The primary source of misaligned arrays is numerical columns in recarrays. It seems to me that if the data is misaligned, you either have to copy it to someplace else which is aligned, or teach the function which is going to process it how to access it byte-wise. Only the former sounds feasible to me. >Regarding to going in the other sense (ie. Numeric ==> numarray), as >numarray supports discontiguity, misalignment and byteswapped data, this >conversion should not imply a data buffer copy at all. > > This sounds correct. >Once we have a pointer to the data buffer, it is only a matter of >wrapping a Numeric or numarray object around it getting this info from the >original object, and returning the new object as a result. > >All in all, this conversion *seems* to be not a too difficult task. > > It seems straightforward in principle, but the memory management issues seem a little tricky to me. 
It's easy to get buffers from numarrays, and create numarrays from buffers. I guess we need a module which does the same for Numeric. There are two easy ways to "get a buffer" from a Numeric array: 1. Wrap the Numeric data in a buffer object. 2. Add support for the buffer API to the Numeric object. Off hand, I'm not sure which is better, although (1) is less intrusive to Numeric and I suppose is the place to start. This should be easy. But, I'm not sure how to create a Numeric array from a buffer. It's easy to get the data pointer from a buffer, and to construct a Numeric array from a data pointer, but we also need a way to stash the pointer to the buffer object. I don't like the idea of modifying Numeric's PyArrayObject. >Making such a conversion functions (in C, but also having Python >counterparts) available might represent to open the door to a co-existence >of Numeric and numarray objects in the same program, and that would easy the >numarray deployment in existing Numeric software. > >Comments? > > All in all, I think this is a great idea which would really boost interoperability. I wish there was a simpler approach which required no modifications to Numeric. Todd From falted at openlc.org Wed Jan 22 01:53:01 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Jan 22 01:53:01 2003 Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray Message-ID: <200301221051.57337.falted@openlc.org> Hi, I have discovered that the Numeric emulation functions in numarray doesn't accept a character typecode as type parameter. This is not immediately apparent because type parameter is of type 'int', and passing it a 'char' maybe not a good practice. But the fact is that Numeric *do* accept the charcodes in the type parameter. 
For example, this is the normal way to call the PyArray_FromDims function: arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64) but, in Numeric, this other manner also works: arr = PyArray_FromDims(self.rank, self.dimensions, 'd') Now, in numarray, if you pass a character to the type parameter, a "segmentation fault" is issued. Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are handled as types in Numeric. I think something like this should be added to the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch. Another thing. It seems to me that NA_New and NA_Empty functions are not well documented in the numarray documentation as they differ from the definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will stay, because I prefer them a lot more than the documented ones :-) Bye, -- Francesc Alted From jmiller at stsci.edu Wed Jan 22 06:52:08 2003 From: jmiller at stsci.edu (Todd Miller) Date: Wed Jan 22 06:52:08 2003 Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray References: <200301221051.57337.falted@openlc.org> Message-ID: <3E2EAFE9.4060900@stsci.edu> Francesc Alted wrote: >Hi, > >I have discovered that the Numeric emulation functions in numarray doesn't >accept a character typecode as type parameter. > Interesting. > >This is not immediately apparent because type parameter is of type 'int', >and passing it a 'char' maybe not a good practice. > I wrote the emulation functions using the manual and intuition rather than the existing code. There will be others like this. >But the fact is that >Numeric *do* accept the charcodes in the type parameter. > > > No argument here. numarray can "always" be more compatible than it is "now", for any value of always or now. I think the only real way to avoid that would be to build Numeric into numarray, which sounds dubious. 
:) >For example, this is the normal way to call the PyArray_FromDims function: > >arr = PyArray_FromDims(self.rank, self.dimensions, tFloat64) > >but, in Numeric, this other manner also works: > >arr = PyArray_FromDims(self.rank, self.dimensions, 'd') > > This was nicely illustrated. >Now, in numarray, if you pass a character to the type parameter, a >"segmentation fault" is issued. > > Decidedly not good. >Look at the end of Numeric-22.0/Src/arraytypes.c, to see how characters are >handled as types in Numeric. I think something like this should be added to >the deferred_libnumarray_init in numarray-0.4/Src/newarray.ch. > I did a simple implementation of PyArray_DescrFromType trying to add support for f2py. There are 2 real issues with it that I see: 1. It still doesn't handle character codes. I think it could handle both NumericTypes and character codes without conflict because of the way the ASCII character set is layed out. 2. I just added it so that it *could* be called since I think f2py needed it. I didn't call it anywhere from the other compatability functions. Care to do another patch? >Another thing. It seems to me that NA_New and NA_Empty functions are not >well documented in the numarray documentation as they differ from the >definitions in numarray-0.4/Src/newarray.ch. I hope that the latter will >stay, because I prefer them a lot more than the documented ones :-) > If you're working from CVS, the form they're in now was the result of someone's detailed comments. They're still not quite right, because the interface is written in terms of int arrays, which is not good for LP64 platforms where long is really what is needed to avoid creating 2G bottlenecks. The naming is also not consistent and I will want to make it so before release of numarray-0.5. 
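On the point made earlier in this reply, that a type argument could carry either a type number or a Numeric character code without conflict: the following is a small sketch of such a lookup, written in Python for brevity. The type numbers and the character-code table are hypothetical stand-ins and not numarray's actual values; the sketch relies only on the assumption that every type number is smaller than 32, where printable ASCII begins.

```python
# Hypothetical type numbers; the real enum values in numarray may differ.
T_INT32, T_FLOAT32, T_FLOAT64 = 6, 10, 11

# Hypothetical mapping from Numeric-style character codes to type numbers.
CHARCODE_TO_TYPE = {"i": T_INT32, "f": T_FLOAT32, "d": T_FLOAT64}
KNOWN_TYPES = set(CHARCODE_TO_TYPE.values())


def normalize_type(t):
    """Accept a type number or a one-character code; return a type number.

    An integer that is not a known type number is retried as the ordinal
    of a character code, which mirrors a C caller passing 'd' as an int.
    Anything else raises ValueError instead of indexing out of range.
    """
    if isinstance(t, str):
        if len(t) != 1:
            raise ValueError("expected a single character code, got %r" % (t,))
        t = ord(t)
    if t in KNOWN_TYPES:
        return t
    code = chr(t) if 32 <= t < 127 else None
    if code in CHARCODE_TO_TYPE:
        return CHARCODE_TO_TYPE[code]
    raise ValueError("unknown type %r" % (t,))


print(normalize_type(T_FLOAT64), normalize_type("d"), normalize_type(ord("d")))
```

The useful property is the final branch: an unrecognised value is rejected with an error rather than being used as an index into a type table.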
>Bye, > > > Todd From falted at openlc.org Wed Jan 22 09:48:03 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Jan 22 09:48:03 2003 Subject: [Numpy-discussion] Incomplete support in certain Numeric emulation functions in numarray In-Reply-To: <3E2EAFE9.4060900@stsci.edu> References: <200301221051.57337.falted@openlc.org> <3E2EAFE9.4060900@stsci.edu> Message-ID: <200301221846.13358.falted@openlc.org> A Dimecres 22 Gener 2003 15:51, Todd Miller va escriure: > > I did a simple implementation of PyArray_DescrFromType trying to add > support for f2py. > There are 2 real issues with it that I see: > > 1. It still doesn't handle character codes. I think it could handle > both NumericTypes and character codes without conflict because of the > way the ASCII character set is layed out. I think so > > 2. I just added it so that it *could* be called since I think f2py > needed it. I didn't call it anywhere from the other compatability > functions. > I tried to patch your PyArray_DescrFromType, but nothing has changed because, as you said, any compatabilty function call it. > Care to do another patch? Well, I've tried to patch the NA_NewAll funtion in newarray.c: typeObject = pNumType[type]; if (!typeObject) { /* Test if it is a Numeric charcode */ sprintf(strcharcode, "%c", type); charcode = PyString_FromString(strcharcode); typeobj = PyDict_GetItemString(pNumericTypesTDict, strcharcode); if (typeobj) { typeObject = typeobj; } else return (PyArrayObject *) PyErr_Format(_Error, "Type object lookup returned NULL for type %d", type); } instead of the original code: typeObject = pNumType[type]; if (!typeObject) return (PyArrayObject *) PyErr_Format(_Error, "Type object lookup returned NULL for type %d", type); with no luck as the segmentation fault continues to appear. Anyway, I've already patched my original code to use only integer codes, not character, so it would be a problem (at least for me). 
> They're still not quite right, because the interface is written in
> terms of int arrays, which is not good for LP64 platforms where long is
> really what is needed to avoid creating 2G bottlenecks. The naming is
> also not consistent and I will want to make it so before release of
> numarray-0.5.

OK, so perhaps it's better to use PyArray_FromDims rather than NA_Empty (at
least until the C API stabilizes). That's good to know!

BTW, while patching the numarray sources I noticed that some character
codes are missing from numerictypes.py, namely those corresponding to
UInt16, Int64 and UInt64. They don't appear in recarray either (except for
Int64, which appears as 'N' in numfmt but has no counterpart in revfmt), so
one can't build recarrays with these types, because a charcode is needed
for the "formats" string.

Is this intentional? Do you plan to fill these gaps (it would be nice,
especially for recarrays)?

Thanks,

-- 
Francesc Alted

From haase at msg.ucsf.edu Thu Jan 23 14:06:04 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Thu Jan 23 14:06:04 2003
Subject: [Numpy-discussion] Have a problem: what is attribute 'compress'
References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu>
Message-ID: <08ad01c2c32b$900238f0$3b45da80@rodan>

Hi,
I can print a numarray of any int type just fine, but I still get the
compress error message with Float (or complex) data:

>>>c
>>>array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], type=UInt16)
>>>c.astype(na.Float)
Traceback (most recent call last):
  File "", line 1, in ?
File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in __repr__ MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1) File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in array2string separator, array_output) File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in _array2string format, item_length = _floatFormat(data, precision, suppress_small) File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in _floatFormat non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), data)) AttributeError: 'module' object has no attribute 'compress' I get this on Windows (2000) and on Linux. Both numarray 0.4 Thanks, Sebastian ----- Original Message ----- From: "Todd Miller" To: "Sebastian Haase" Cc: Sent: Thursday, December 19, 2002 5:58 AM Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress' > Sebastian Haase wrote: > > >Hi! > >Somehow I have a problem with numarray. Please take a look at this: > > > Hi Sebastian, > > I've don't recall seeing anything like this, nor can I reproduce it > now. If you've been following numarray for a while now, I can say > that it is important to remove the old version of numarray before > installing the new version. I recommend deleting your current > installation and reinstalling numarray. > > compress() is a ufunc, much like add() or put(). It is defined in > ndarray.py, right after the import of the modules ufunc and _ufunc. > _ufunc in particular is a problematic module, because it has followed > the atypical development path of moving from C-code to Python code. > Because of this, and the fact that a .so or .dll overrides a .py, > older installations interfere with newer ones. The atypical path was > required because the original _ufuncmodule.c was so large that it could > not be compiled on some systems; as a result, I split _ufuncmodule.c > into pieces by data type and now use _ufunc.py to glue the pieces together. > > Good luck! 
Please let me know if reinstalling doesn't clear up the > problem. > > Todd > > > > > > >>>>import numarray as na > >>>>na.array([0, 0]) > >>>> > >>>> > >array([0, 0]) > > > > > >>>>na.array([0.0, 0.0]) > >>>> > >>>> > >Traceback (most recent call last): > > File "", line 1, in ? > > File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in > >__repr__ > > MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1) > > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in > >array2string > > separator, array_output) > > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in > >_array2string > > format, item_length = _floatFormat(data, precision, suppress_small) > > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in > >_floatFormat > > non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), > >data)) > >AttributeError: 'module' object has no attribute 'compress' > > > >The same workes fine with Numeric. But I would prefer numarray because I'm > >writing C++-extensions and I need "unsigned shorts". > > > >What is this error about? > > > >Thanks, > >Sebastian > > > > > > > > > >------------------------------------------------------- > >This SF.NET email is sponsored by: Order your Holiday Geek Presents Now! > >Green Lasers, Hip Geek T-Shirts, Remote Control Tanks, Caffeinated Soap, > >MP3 Players, XBox Games, Flying Saucers, WebCams, Smart Putty. > >T H I N K G E E K . 
C O M http://www.thinkgeek.com/sf/ > >_______________________________________________ > >Numpy-discussion mailing list > >Numpy-discussion at lists.sourceforge.net > >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > > > > From jmiller at stsci.edu Thu Jan 23 14:33:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Jan 23 14:33:03 2003 Subject: [Numpy-discussion] Have a problem: what is attribute 'compress' References: <3E00FDB5.2090804@erols.com> <004b01c2a6fd$195c95f0$3b45da80@rodan> <3E01D07B.3070009@stsci.edu> <08ad01c2c32b$900238f0$3b45da80@rodan> Message-ID: <3E306D73.6050303@stsci.edu> Sebastian Haase wrote: >Hi, >I can print numarray of any int time just fine, but > OK. I am assuming you deleted all of your old numarray installations as I recommended and reinstalled numarray-0.4. What is your PYTHONPATH? >I still get the compress error message with Float (or complex) >data: > > >>>>c >>>>array([[0, 0, 0, ..., 0, 0, 0], >>>> >>>> > [0, 0, 0, ..., 0, 0, 0], > [0, 0, 0, ..., 0, 0, 0], > ..., > [0, 0, 0, ..., 0, 0, 0], > [0, 0, 0, ..., 0, 0, 0], > [0, 0, 0, ..., 0, 0, 0]], type=UInt16) > > >>>>c.astype(na.Float) >>>> >>>> >Traceback (most recent call last): > File "", line 1, in ? > File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in >__repr__ > MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1) > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, in >array2string > separator, array_output) > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, in >_array2string > format, item_length = _floatFormat(data, precision, suppress_small) > File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, in >_floatFormat > non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, 0), >data)) >AttributeError: 'module' object has no attribute 'compress' > >I get this on Windows (2000) and on Linux. 
Both numarray 0.4 > > I'm not sure what's going on here, but I develop on both platforms, and Linux constantly. The self tests definitely pass in Linux. It must be some kind of environment issue or runtime issue. What happens when you type: >>> import numtestall >>> numtestall.test() ... what gets printed here? ... >Thanks, >Sebastian > > > >----- Original Message ----- >From: "Todd Miller" >To: "Sebastian Haase" >Cc: >Sent: Thursday, December 19, 2002 5:58 AM >Subject: Re: [Numpy-discussion] Have a problem: what is attribute 'compress' > > > > >>Sebastian Haase wrote: >> >> >> >>>Hi! >>>Somehow I have a problem with numarray. Please take a look at this: >>> >>> >>> >>Hi Sebastian, >> >>I've don't recall seeing anything like this, nor can I reproduce it >>now. If you've been following numarray for a while now, I can say >>that it is important to remove the old version of numarray before >>installing the new version. I recommend deleting your current >>installation and reinstalling numarray. >> >>compress() is a ufunc, much like add() or put(). It is defined in >>ndarray.py, right after the import of the modules ufunc and _ufunc. >>_ufunc in particular is a problematic module, because it has followed >>the atypical development path of moving from C-code to Python code. >> Because of this, and the fact that a .so or .dll overrides a .py, >> older installations interfere with newer ones. The atypical path was >>required because the original _ufuncmodule.c was so large that it could >>not be compiled on some systems; as a result, I split _ufuncmodule.c >>into pieces by data type and now use _ufunc.py to glue the pieces >> >> >together. > > >>Good luck! Please let me know if reinstalling doesn't clear up the >>problem. 
>> >>Todd >> >> >> >>> >>> >>>>>>import numarray as na >>>>>>na.array([0, 0]) >>>>>> >>>>>> >>>>>> >>>>>> >>>array([0, 0]) >>> >>> >>> >>> >>>>>>na.array([0.0, 0.0]) >>>>>> >>>>>> >>>>>> >>>>>> >>>Traceback (most recent call last): >>> File "", line 1, in ? >>> File "C:\Python22\Lib\site-packages\numarray\numarray.py", line 581, in >>>__repr__ >>> MAX_LINE_WIDTH, PRECISION, SUPPRESS_SMALL, ', ', 1) >>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 163, >>> >>> >in > > >>>array2string >>> separator, array_output) >>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 125, >>> >>> >in > > >>>_array2string >>> format, item_length = _floatFormat(data, precision, suppress_small) >>> File "C:\Python22\Lib\site-packages\numarray\arrayprint.py", line 246, >>> >>> >in > > >>>_floatFormat >>> non_zero = numarray.abs(numarray.compress(numarray.not_equal(data, >>> >>> >0), > > >>>data)) >>>AttributeError: 'module' object has no attribute 'compress' >>> >>>The same workes fine with Numeric. But I would prefer numarray because >>> >>> >I'm > > >>>writing C++-extensions and I need "unsigned shorts". >>> >>>What is this error about? >>> >>>Thanks, >>>Sebastian >>> >>> >>> >>> >>>------------------------------------------------------- >>>This SF.NET email is sponsored by: Order your Holiday Geek Presents Now! >>>Green Lasers, Hip Geek T-Shirts, Remote Control Tanks, Caffeinated Soap, >>>MP3 Players, XBox Games, Flying Saucers, WebCams, Smart Putty. >>>T H I N K G E E K . C O M http://www.thinkgeek.com/sf/ >>>_______________________________________________ >>>Numpy-discussion mailing list >>>Numpy-discussion at lists.sourceforge.net >>>https://lists.sourceforge.net/lists/listinfo/numpy-discussion >>> >>> >>> >>> >> >> >> >> > > > >------------------------------------------------------- >This SF.NET email is sponsored by: >SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See! 
>http://www.vasoftware.com
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

From j_r_fonseca at yahoo.co.uk Thu Jan 23 16:10:02 2003
From: j_r_fonseca at yahoo.co.uk (José Fonseca)
Date: Thu Jan 23 16:10:02 2003
Subject: [Numpy-discussion] Extensive use of methods instead of functions
Message-ID: <20030124000759.GA6042@localhost.localdomain>

With the ability to subclass types in recent versions of the Python
language, more people will be interested in subclassing Numeric arrays
for specific purposes. Still, the use of functions instead of methods
takes away many of the advantages, notably the ability to be overridden.

Taking this statement as an example:

    Numeric.put(myarray, myindices, myvalues)

In the current state of affairs, if we wanted this statement to work
with a sparse matrix class derived from a Numeric array, it would have
to be something like:

    Sparse.put(myarray, myindices, myvalues)

That is, it forces the underlying code to know whether it is dealing
with Numeric arrays or some other equivalent class. But it would be
much more useful to have simply:

    myarray.put(myindices, myvalues)

which would work regardless of the actual type of myarray, provided it
supplied the put() method. This would enormously improve code
reusability and extensibility.

I know that there are certain implementation details that may make this
difficult (like many functions being implemented in pure Python), but
any advances made in this direction will be an improvement on the
current situation.

Also, I know that this example is a little unfortunate because numarray
will do these things with the __getitem__ and __setitem__ operators.
But others could easily be shown.

Regards,

José Fonseca

From falted at openlc.org Fri Jan 24 04:00:07 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Jan 24 04:00:07 2003
Subject: [Numpy-discussion] typecodes in numarray
Message-ID: <200301241259.30243.falted@openlc.org>

Maybe I'm becoming a bit tedious with this, but if you look at:

>>> import numerictypes
>>> numerictypes.typecode
{Complex64: 'D', Int32: 'l', UInt16: 's', Complex32: 'F', Float64: 'd',
UInt8: 'b', Int16: 's', Float32: 'f', Int8: '1'}

you can find some inconsistencies that lead to weird things like:

>>> array([1,2], Int16).typecode()
's'
>>> array([1,2], UInt16).typecode()
's'   # --> same as Int16!
>>> array([1,2], Int64).typecode()
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: Int64
>>> array([1,2], UInt64).typecode()
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/site-packages/numarray/numarray.py", line 730, in typecode
    return numerictypes.typecode[self._type]
KeyError: numarray type: UInt64

Also, 'l' is used here to map Int32, while in recarray it is used to map
Boolean. Moreover, Numeric 22.0 introduced the equivalents of the UInt16
and UInt32 types as 'w' and 'u' respectively. But, again, 'u' is used in
recarray as a synonym of UInt8.

I think it's important to agree on a definitive set of charcodes and use
them uniformly throughout numarray.

Suggestion: if the recarray charcodes do not need to match the Numeric
ones, I propose that using the Python convention may be a good idea.
Look at the table in:
http://www.python.org/doc/current/lib/module-struct.html.
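
For reference, the standard struct module already assigns a distinct letter to each signed and unsigned integer width, which is the convention suggested above. The sketch below only calls struct.calcsize with the "<" prefix, which selects the standard sizes. The pairing of struct letters with numarray type names in STRUCT_CODES is an assumption made for illustration; it is not an existing numarray table.

```python
import struct

# Assumed pairing of numarray type names with struct format characters.
STRUCT_CODES = {
    "Int8": "b", "UInt8": "B",
    "Int16": "h", "UInt16": "H",
    "Int32": "i", "UInt32": "I",
    "Int64": "q", "UInt64": "Q",
    "Float32": "f", "Float64": "d",
}


def item_size(type_name):
    """Size in bytes of one element, using struct's standard sizes."""
    return struct.calcsize("<" + STRUCT_CODES[type_name])


def has_collisions(codes):
    """True if two type names share the same character code."""
    return len(set(codes.values())) != len(codes)


if __name__ == "__main__":
    for name in sorted(STRUCT_CODES):
        print(name, STRUCT_CODES[name], item_size(name))
```

Unlike the typecode dictionary shown above, where Int16 and UInt16 both map to 's', has_collisions(STRUCT_CODES) is false for this table.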

-- 
Francesc Alted

From perry at stsci.edu Fri Jan 24 06:38:17 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 06:38:17 2003
Subject: [Numpy-discussion] typecodes in numarray
In-Reply-To: <200301241259.30243.falted@openlc.org>
Message-ID:

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of
> Francesc Alted
> Sent: Friday, January 24, 2003 7:00 AM
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] typecodes in numarray
>
>
> Maybe I'm becoming a bit tedious with this, but if you look at:
>

No, this sort of feedback is very valuable. We'll think about this a
bit, but I'd agree that consistency with Numeric codes is important.
Some of the history of the codes used by recarray arises from
conventions used in other software not related to Python or Numeric.
But if recarray is to be generic and used by others, we should hide,
remove or layer such conventions in a subclass. Let us think about how
we should do that.

Thanks, Perry

From perry at stsci.edu Fri Jan 24 09:04:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Fri Jan 24 09:04:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
Message-ID:

Todd Miller had some further comments that I thought were worth posting
as well (and I think he makes some very good points).

************************************************************************

My [i.e. Todd's] thoughts about it:

>Maybe I'm becoming a bit tedious with this, but if you look at:
>

No. It shows you're thinking about it carefully. Having looked at all
of the examples below, I have some comments:

1. The sparseness and obscurity of the typecode "wordspace" are both
demonstrated here. There are so few letters to choose from, they're
often already used in some other context. Even given the large number
of unused letters, it's often difficult to choose good ones and to
remember what has been chosen.
I think this is one of the reasons Perry chose to replace typecodes
with true type objects which have rich, regular, and predictable
symbolic names.

2. Typecodes were added as a backwards compatibility feature of
numarray, and I think it's probable that numarray beat Numeric to
supporting most of these types, because otherwise they'd have been
copied directly and there would be no problem. I'm not really trying to
play a blame-game here, but I am making an argument that perhaps
numarray should only go so far in the support of what I regard as an
obsolescent feature. If the Numeric developers choose to continue
extending the use of typecodes in ways that are incompatible with
numarray, one way of dealing with it is to "just say no". We are going
beyond the scope of backwards compatibility to on-going compatibility.
(Which we may still have to do but needs to be discussed and considered)

3. STSCI has layered other software on top of numarray and recarray
which astronomers use to do work. It is the friction of that interface
which makes correcting these consistency problems more difficult than
might be immediately apparent.

>I think it's important to agree with a definitive set of charcodes and use
>them uniformly throughout numarray.
>

I wish this were possible, but I'm thinking we should try to find an
alternative approach altogether, one which may be more verbose but
implicitly free of conflict.

A means for specifying a recarray format might be created from tuples,
type objects, and integer repetition factors.

The verbosity of this approach might be a little tedious, but it would
also be transparent, maintainable, and conflict free.

I think we should add an "obsolescent feature" warning to numarray and
recarray which flags any use of character typecodes when the appropriate
command line switches are set.

>Suggestion: if recarray charcodes are not necessary to match the Numeric
>ones, I propose that using the Python convention maybe a good idea.
>Look at the table in: >http://www.python.org/doc/current/lib/module-struct.html. > This sounds good to me, except that it will break an existing interface that I don't have control over. Therefore, I suggest we correct the problem by coming up with something better. From paul at pfdubois.com Fri Jan 24 09:43:07 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Jan 24 09:43:07 2003 Subject: [Numpy-discussion] typecodes in numarray In-Reply-To: Message-ID: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY> I don't understand this remark: but I am making an argument that perhaps > numarray should only go so far in the support of what I regard as an > obsolescent feature. If the Numeric developers choose to continue > extending the use of typecodes in ways that are incompatible with > numarray, one way of dealing with it is to "just say no". > We are going > beyond the scope of backwards compatability to on-going compatabilty. > (Which we may still have to do but needs to be discussed and > considered) > There is no "on-going" Numeric development. It stops the minute numarray is ready. Period. We developers all agreed on that. The whole reason for numarray is that Numeric was pronounced unmaintainable and unextendable by those who frequently had to work on it. To do anything else will fragment the entire numerical python community and software set. From falted at openlc.org Fri Jan 24 10:48:04 2003 From: falted at openlc.org (Francesc Alted) Date: Fri Jan 24 10:48:04 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: References: Message-ID: <200301241946.55398.falted@openlc.org> A Divendres 24 Gener 2003 18:02, Todd Miller va escriure: > > My [i.e. Todd's] thoughts about it: > > No. It shows you're thinking about it carefully. Having looked at all > of the examples below, I have some comments: I mostly agree with your comments, but let point out some thoughts > > 1. 
The sparseness and obscurity of the typecode "wordspace" are both
> demonstrated here. There are so few letters to choose from, they're
> often already used in some other context. Even given the large number
> of unused letters, it's often difficult to choose good ones and to
> remember what has been chosen. I think this is one of the reasons Perry
> chose to replace typecodes with true type objects which have rich,
> regular, and predictable symbolic names.

I completely agree that type objects are a brilliant idea.

> 3. STSCI has layered other software on top of numarray and recarray
> which astronomers use to do work. It is the friction of that interface
> which makes correcting these consistency problems more difficult than
> might be immediately apparent.

Yeah, I know...

>
> >I think it's important to agree with a definitive set of charcodes and use
> >them uniformly throughout numarray.
>
> I wish this were possible, but I'm thinking we should try to find an
> alternative approach altogether, one which may be more verbose but
> implicitly free of conflict.
>
> A means for specifying a recarray format might be created from tuples,
> type objects, and integer repetition factors.
>
> The verbosity of this approach might be a litte tedious, but it would
> also be transparent, maintainable, and conflict free.

I think this is a very good idea. In fact, while working on PyTables I
was lately pondering what would be the best way to define record arrays,
and I also think that a verbose approach should be the best.

After considering metaclasses and tuples, I ended up with a compromise
between the two: dictionaries combined with some function or class to
refine the definition.

My current thinking is something like:

recarrDescr = {
    "name"        : defineType(CharType, 16, ""),          # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),               # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),               # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),               # integer
    "grid_j"      : defineType(Int32, 1, 9),               # integer
    "pressure"    : defineType(Float32, 1, 1.),            # float (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),   # double[32]
    "idnumber"    : defineType(Int64, 1, 0),               # signed long long
    }

where defineType is a class that accepts (type, shape, default) parameters.
It can be extended safely in the future if more needs appear.

A dictionary has the advantage over a tuple that you can map column names
to their contents quite easily, and it is more flexible than defining the
fields with a metaclass descendant (see
http://pytables.sourceforge.net/html-doc/usersguide-html3.html#subsection3.1.2)
because dictionaries can be built up at run time (metaclass descendants
might allow that too, but in a more mysterious way that I think is not
worth the effort). In addition, the dictionary object is available in all
Python versions, whereas metaclasses are available only from 2.2 on.
However, I regard metaclasses as the most elegant solution (but elegance
is not always equivalent to convenience :().

Perhaps you may want to consider this for use in the recarray definition.

>
> I think we should add an "obsolescent feature" warning to numarray and
> recarray which flags any use of character typecodes when the appropriate
> command line switches are set.

Well, I don't fully agree with that. I do believe that type objects are a
more meaningful way of describing types, but charcodes can be quite
advantageous in certain situations, like describing the contents of a
record in a compact way, or passing this info to C routines to deal with
the data.

For example, consider the benefits of describing a recarray format as:

"3s4i20d"

instead of

((Int16, 3),
 (Int32, 4),
 (Float64, 20),
 )

the former being handier in lots of situations. I certainly believe that
a coexistence of both can be very beneficial, especially for 3rd party
extension makers (like me :).

>
> >Suggestion: if recarray charcodes are not necessary to match the Numeric
> >ones, I propose that using the Python convention maybe a good idea.
> >Look at the table in:
> >http://www.python.org/doc/current/lib/module-struct.html.
>
> This sounds good to me, except that it will break an existing interface
> that I don't have control over. Therefore, I suggest we correct the
> problem by coming up with something better.

Well, if charcodes finally stay in, this has an additional advantage in
that the Python crew has provided meaningful ways to express padding
(character "x"), endianness ("=", "<", ">") and alignment ("@"). So a
compact expression like "@3sx4i20d", apart from looking like Chinese to
Westerners, may give a lot of info in a handy way.

-- 
Francesc Alted

From jmiller at stsci.edu Fri Jan 24 11:20:05 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri Jan 24 11:20:05 2003
Subject: [Fwd: Re: [Numpy-discussion] typecodes in numarray]
Message-ID: <3E319543.8040101@stsci.edu>

-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 38
URL:

From jmiller at stsci.edu Fri Jan 24 14:01:31 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Fri, 24 Jan 2003 14:01:31 -0500
Subject: [Numpy-discussion] typecodes in numarray
References: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY>
Message-ID: <3E318D8B.1090403@stsci.edu>

Paul F Dubois wrote:

>I don't understand this remark:
>
>but I am making an argument that perhaps
>
>
>>numarray should only go so far in the support of what I regard as an
>>obsolescent feature.
If the Numeric developers choose to continue
>>extending the use of typecodes in ways that are incompatible with
>>numarray, one way of dealing with it is to "just say no".
>>We are going
>>beyond the scope of backwards compatability to on-going compatabilty.
>>(Which we may still have to do but needs to be discussed and
>>considered)
>>
>
>There is no "on-going" Numeric development. It stops the minute numarray is
>ready. Period. We developers all agreed on that. The whole reason for
>numarray is that Numeric was pronounced unmaintainable and unextendable by
>those who frequently had to work on it. To do anything else will fragment
>the entire numerical python community and software set.
>

Thanks for clarifying, Paul. My point didn't quite come out right. A
better way to put it might have been:

1. Numarray and Numeric are subject to accidental divergence. As long as
they both continue to change concurrently, they will probably differ
even in interface. Because numarray isn't quite ready yet, they are
both still changing.

2. Typecodes in particular are something numarray is superseding with
something better. Because of this, providing on-going compatibility
with Numeric typecodes may not make sense.

3. Numeric compatibility is not the only driver for the choice of
recarray typecodes, so I can't make arbitrary changes without affecting
other software and people.

4. I think there's a clearer, numarray type object based approach to
describing recarray formats which does not use typecodes at all. Thus,
instead of attempting to weed through and unify layers of conflicting
type codes, we might be able to end-run the whole problem with an
alternative approach.

Todd

>
>-------------------------------------------------------
>This SF.NET email is sponsored by:
>SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
>http://www.vasoftware.com >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > --Boundary_(ID_V53Q9uhCvVN46XJvLKOLLw)-- From perry at stsci.edu Fri Jan 24 11:34:02 2003 From: perry at stsci.edu (Perry Greenfield) Date: Fri Jan 24 11:34:02 2003 Subject: [Numpy-discussion] typecodes in numarray In-Reply-To: <000501c2c3cf$e10716e0$6601a8c0@NICKLEBY> Message-ID: I think Todd was referring to the recent addition of unsigned types to Numeric, along with came new typecodes. These types were already in numarray at the time. Perry > -----Original Message----- > From: Paul F Dubois [mailto:paul at pfdubois.com] > Sent: Friday, January 24, 2003 12:42 PM > To: 'Perry Greenfield'; falted at openlc.org; > numpy-discussion at lists.sourceforge.net > Subject: RE: [Numpy-discussion] typecodes in numarray > > > I don't understand this remark: > > but I am making an argument that perhaps > > numarray should only go so far in the support of what I regard as an > > obsolescent feature. If the Numeric developers choose to continue > > extending the use of typecodes in ways that are incompatible with > > numarray, one way of dealing with it is to "just say no". > > We are going > > beyond the scope of backwards compatability to on-going compatabilty. > > (Which we may still have to do but needs to be discussed and > > considered) > > > > There is no "on-going" Numeric development. It stops the minute > numarray is > ready. Period. We developers all agreed on that. The whole reason for > numarray is that Numeric was pronounced unmaintainable and unextendable by > those who frequently had to work on it. To do anything else will fragment > the entire numerical python community and software set. 
> > > > > > From jmiller at stsci.edu Fri Jan 24 12:01:32 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 24 12:01:32 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301241946.55398.falted@openlc.org> Message-ID: <3E319ED4.5060709@stsci.edu> > > >>A means for specifying a recarray format might be created from tuples, >>type objects, and integer repetition factors. >> >>The verbosity of this approach might be a litte tedious, but it would >>also be transparent, maintainable, and conflict free. >> >> > >I think this is a very good idea. In fact, while working in PyTables I was >lately pondering what would be the best way to define record arrays, and I >also think that a verbose approach should be the beast. > >After considering metaclasses, and tuples, I ended to a compromise solution >between both which are dictionaries combined with some function or class to >refine the definition. > >My current thinking is something like: > >recarrDescr = { > "name" : defineType(CharType, 16, ""), # 16-character String > "TDCcount" : defineType(UInt8, 1, 0), # unsigned byte > "ADCcount" : defineType(Int16, 1, 0), # signed short integer > "grid_i" : defineType(Int32, 1, 9), # integer > "grid_j" : defineType(Int32, 1, 9), # integer > "pressure" : defineType(Float32, 1, 1.), # float (single-precision) > "temperature" : defineType(Float64, 32, arange(32)), # double[32] > "idnumber" : defineType(Int64, 1, 0), # signed long long > } > >where defineType is a class that accepts (type, shape, default) parameters. >It can be extended safely in the future if more needs appear. > You're way ahead of me here. The only thing I don't like about this is the additional relative complexity because of the addition of field names and default values. It would be nice to layer this more. >Perhaps you may want to consider this for using in recarray definition. > We'll definitely consider it as we hash this out. 
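
The dictionary-based description quoted above could be backed by something as small as the following. This is a hypothetical sketch: defineType is the name used in the proposal, but its attributes, the use of plain strings for type names, the ITEM_SIZES table and the helper record_size are all assumptions made for illustration. None of it is PyTables or numarray code.

```python
# Assumed element sizes in bytes, keyed by type name (strings stand in
# for numarray type objects so the sketch runs without numarray).
ITEM_SIZES = {
    "CharType": 1, "UInt8": 1, "Int16": 2, "Int32": 4,
    "Int64": 8, "Float32": 4, "Float64": 8,
}


class defineType:
    """One field of a record: element type, element count, default value."""

    def __init__(self, type_name, shape=1, default=None):
        if type_name not in ITEM_SIZES:
            raise ValueError("unknown type %r" % (type_name,))
        self.type_name = type_name
        self.shape = shape
        self.default = default

    def nbytes(self):
        """Unpadded size of this field in bytes."""
        return ITEM_SIZES[self.type_name] * self.shape


def record_size(descr):
    """Unpadded size in bytes of one record described by descr."""
    return sum(field.nbytes() for field in descr.values())


recarrDescr = {
    "name": defineType("CharType", 16, ""),      # 16-character string
    "TDCcount": defineType("UInt8", 1, 0),       # unsigned byte
    "pressure": defineType("Float32", 1, 1.0),   # single-precision float
    "temperature": defineType("Float64", 32),    # double[32]
}
```

Because the description is an ordinary dictionary, it can be assembled at run time, and the field names and defaults can be ignored by any layer that only needs the types and repetition counts.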
> > > >>I think we should add an "obsolescent feature" warning to numarray and >>recarray which flags any use of character typecodes when the appropriate >>command line switches are set. >> >> > >Well, I don't fully agree with that. I do believe that classes typecodes to >be a more meaningful way for describing types, but charcodes can be quite >advantageous in certain situations, like in describing in compact way the >contents of a record, or passing this info to C-routines to deal with the >data. > Yeah, I know. >For example, consider the benefits of describing a recarray format as: > >"3s4i20d" > I know. > >instead of > >((Int16, 3), > (Int32, 4), > (Float64, 20), > ) > This is pretty much exactly what I was thinking. It is straightforward to imagine and difficult to forget. > >the former being more handy in lots of situations. > > Would you please name some of these so we can explore handling them both ways? >I certainly believe that a coexistence of both can be very beneficious, specially for 3rd party extension makers (like me :). > If there's a reasonable way to avoid supporting both, we should. >>>Suggestion: if recarray charcodes are not necessary to match the Numeric >>>ones, I propose that using the Python convention maybe a good idea. >>>Look at the table in: >>>http://www.python.org/doc/current/lib/module-struct.html. >>> >>> >>This sounds good to me, except that it will break an existing interface >>that I don't have control over. Therefore, I suggest we correct the >>problem by coming up with something better. >> >> > >Well, if charcodes finally stay in, this have an additional advantage in >that python crew has provided meaningful ways to express padding (character >"x"), endianess ("=", "<", ">") and alignment ("@"). > We might also add these to the type-repetition tuple. 
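
The two notations discussed above carry the same information, so one can be derived from the other. The sketch below parses a compact string such as "3s4i20d" into (type, repeat) tuples and back. The letter-to-type table follows the reading used in this thread ('s' as Int16, 'i' as Int32, 'd' as Float64); it is an assumption for illustration and is not recarray's actual format table. The names parse_format and to_format are hypothetical.

```python
import re

# Assumed mapping, taken from the example pair quoted in this thread.
CODE_TO_TYPE = {"s": "Int16", "i": "Int32", "d": "Float64"}
TYPE_TO_CODE = {name: code for code, name in CODE_TO_TYPE.items()}

# An optional repeat count followed by one letter.
_FIELD = re.compile(r"(\d*)([A-Za-z])")


def parse_format(fmt):
    """Turn "3s4i20d" into (("Int16", 3), ("Int32", 4), ("Float64", 20))."""
    fields = []
    pos = 0
    while pos < len(fmt):
        match = _FIELD.match(fmt, pos)
        if match is None or match.group(2) not in CODE_TO_TYPE:
            raise ValueError("bad format at position %d in %r" % (pos, fmt))
        count = int(match.group(1)) if match.group(1) else 1
        fields.append((CODE_TO_TYPE[match.group(2)], count))
        pos = match.end()
    return tuple(fields)


def to_format(fields):
    """Inverse of parse_format for (type name, repeat) tuples."""
    return "".join("%d%s" % (count, TYPE_TO_CODE[name]) for name, count in fields)
```

A layer like this would let the verbose tuple form be the primary interface while still accepting compact strings from callers that prefer them.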
Regards, Todd From hinsen at cnrs-orleans.fr Fri Jan 24 12:13:05 2003 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Jan 24 12:13:05 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: <20030124000759.GA6042@localhost.localdomain> References: <20030124000759.GA6042@localhost.localdomain> Message-ID: José Fonseca writes: > With the ability of subclassing types in recent versions of the Python > language, more people will be interested in subclassing Numeric arrays > for specific purposes. Still the use of functions instead of methods > takes away many of the advantages, the ability of being overloaded. True. On the other hand, there is also an advantage: NumPy routines can be used on standard Python data types such as number and sequence types. In the ideal world (which might come one day), core NumPy functionality would be part of standard Python, and then all these operations would work on other built-in types as well. Until then, I am not sure that changing NumPy functions to methods is a good idea. I need to call them on scalar numbers much more often than I subclass arrays. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From paul at pfdubois.com Fri Jan 24 12:36:03 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Jan 24 12:36:03 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: Message-ID: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY> Every time the subject of subclassing a numeric array comes up, it is as if nobody ever thought of it before. Been there, done that. It doesn't turn out to be all that useful.
To see why, consider a + b where a and b are Foo instances, and Foo inherits from numarray. a. a + b will be a numarray, not a Foo instance, unless you write a new + operator. b. Attempting to have numarray itself apply a subclass constructor to the result runs into the problem that numarray does not have any idea what the constructor's signature is or what information is needed to fill out that constructor. c. Even if the subclass accepts numarray's constructor signature, it would rarely produce satisfactory results just "losing" the Foo'ness details of a and b. This same argument applies to every method that returns a Foo instance, and every ufunc. So you end up redoing everything anyway. In short, worrying about subclassing is way down the list of things we ought to consider. > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Konrad Hinsen > Sent: Friday, January 24, 2003 12:07 PM > To: José Fonseca > Cc: numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] Extensive use of methods > instead of functions > > > José Fonseca writes: > > > With the ability of subclassing types in recent versions of > the Python > > language, more people will be interested in subclassing > Numeric arrays > > for specific purposes. Still the use of functions instead > of methods > > takes away many of the advantages, the ability of being overloaded. > > True. On the other hand, there is also an advantage: NumPy > routines can be used on standard Python data types such as > number and sequence types. > > In the ideal world (which might come one day), core NumPy > functionality would be part of standard Python, and then all > these operations would work on other built-in types as well. > > Until then, I am not sure that changing NumPy functions to > methods is a good idea. I need to call them on scalar numbers > much more often than I subclass arrays.
> > Konrad. > From perry at stsci.edu Fri Jan 24 13:11:05 2003 From: perry at stsci.edu (Perry Greenfield) Date: Fri Jan 24 13:11:05 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY> Message-ID: Paul Dubois writes: > > Every time the subject of subclassing a numeric array comes up, it is as if > nobody ever thought of it before. Been there, done that. It > doesn't turn out > to be all that useful. To see why, consider a + b where a and b are Foo > instances, and Foo inherits from numarray. > > a. a + b will be a numarray, not a Foo instance, unless you write a new + > operator. > b. Attempting to have numarray itself apply a subclass constructor to the > result runs into the problem that numarray does not have any idea what the > constructor's signature is or what information is needed to fill out that > constructor. > c. Even if the subclass accepts numarray's constructor signature, it would > rarely produce satisfactory results just "losing" the Foo'ness > details of a > and b. > > This same argument applies to every method that returns a Foo > instance, and > every ufunc.
So you end up redoing everything anyway. > > In short, worrying about subclassing is way down the list of > things we ought > to consider. > Paul illustrates some important points. While I'm not as down on the ability to subclass (more on that later), he is absolutely right that most think that subclassing is a breeze and don't realize that it is far from being so. The arguments for this would be helped immensely by a practical example of a desired subclass. This does far more to illustrate the issues than an abstract discussion. For most instances that I have considered or thought about it is unavoidable that one must override virtually all (if not all) the operators and functions. Nevertheless, subclassing can still save a great deal of work over implementing a completely new extension. But you'll have to deal with defining how all the operators and functions should behave. In our view, the most valuable subclassing in numarray comes from subclassing NDArray, which handles all the structural operations for arrays (recarray makes heavy use of this). But recarrays don't try to support numerical operations, and that makes it fairly easy. Subclassing numarrays is significantly more work for the reasons cited. 
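The point about a + b losing its subclass can be reproduced with nothing but built-in types. The sketch below uses a list subclass as a stand-in for a numarray subclass, which is purely an assumption for illustration: list addition concatenates rather than adding element-wise, but the loss of the subclass and of its extra state happens in the same way.

```python
class Foo(list):
    """A list subclass carrying extra state (the 'Foo-ness')."""

    def __init__(self, items=(), units=None):
        super().__init__(items)
        self.units = units


a = Foo([1, 2], units="m")
b = Foo([3, 4], units="m")

c = a + b                      # uses the inherited list.__add__
print(type(c).__name__)        # a plain list: the subclass and its units are gone


class Foo2(Foo):
    def __add__(self, other):
        # The subclass itself has to decide how the extra state combines.
        return Foo2(list.__add__(self, other), units=self.units)


d = Foo2([1], units="m") + Foo2([2], units="m")
print(type(d).__name__, d.units)
```

The same override would be needed for every other operator and method that returns a new object, which is the amount of work described above.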
Perry From jmiller at stsci.edu Fri Jan 24 13:56:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 24 13:56:01 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu> Message-ID: <3E31B9DB.7080603@stsci.edu> > > >> My current thinking is something like: >> >> recarrDescr = { >> "name" : defineType(CharType, 16, ""), # 16-character String >> "TDCcount" : defineType(UInt8, 1, 0), # unsigned byte >> "ADCcount" : defineType(Int16, 1, 0), # signed short integer >> "grid_i" : defineType(Int32, 1, 9), # integer >> "grid_j" : defineType(Int32, 1, 9), # integer >> "pressure" : defineType(Float32, 1, 1.), # float >> (single-precision) >> "temperature" : defineType(Float64, 32, arange(32)), # double[32] >> "idnumber" : defineType(Int64, 1, 0), # signed long long } >> >> where defineType is a class that accepts (type, shape, default) >> parameters. >> It can be extended safely in the future if more needs appear. >> > You're way ahead of me here. The only thing I don't like about this > is the additional relative complexity because of the addition of field > names and default values. It would be nice to layer this more. One more thing I don't understand looking at this: a dictionary is unordered. Todd From j_r_fonseca at yahoo.co.uk Fri Jan 24 14:00:03 2003 From: j_r_fonseca at yahoo.co.uk (José Fonseca) Date: Fri Jan 24 14:00:03 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: References: <20030124000759.GA6042@localhost.localdomain> Message-ID: <20030124215828.GA32437@localhost.localdomain> On Fri, Jan 24, 2003 at 09:07:21PM +0100, Konrad Hinsen wrote: > José Fonseca writes: > > > With the ability of subclassing types in recent versions of the Python > > language, more people will be interested in subclassing Numeric arrays > > for specific purposes.
Still the use of functions instead of methods > > takes away many of the advantages, the ability of being overloaded. > > True. On the other hand, there is also an advantage: NumPy routines > can be used on standard Python data types such as number and sequence > types. > > In the ideal world (which might come one day), core NumPy > functionality would be part of standard Python, and then all these > operations would work on other built-in types as well. > > Until then, I am not sure that changing NumPy functions to methods > is a good idea. I need to call them on scalar numbers much more > often than I subclass arrays. You've got a good point there. I often want to use them with other Numeric array-alike classes, but I've also used them with standard Python data types for convenience. Still, it's perfectly possible for both interfaces to co-exist. Of course, anyone using the .method version can't expect it to work with standard Python data types and has to make a choice, or call asarray() or something equivalent first. Regards, José Fonseca From j_r_fonseca at yahoo.co.uk Fri Jan 24 15:21:02 2003 From: j_r_fonseca at yahoo.co.uk ('José Fonseca') Date: Fri Jan 24 15:21:02 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY> References: <000401c2c3e8$0db770f0$6601a8c0@NICKLEBY> Message-ID: <20030124231900.GB32437@localhost.localdomain> On Fri, Jan 24, 2003 at 12:34:54PM -0800, Paul F Dubois wrote: > > Every time the subject of subclassing a numeric array comes up, it is as > if nobody ever thought of it before. Why do you treat me as if I was trying to sell the "Next Big Thing"!?
First, I must tell you that the first time I came across the idea of subclassing Numeric arrays was while reading the "Subclassing" subsection, in the "Special Topics" section of the Numeric Python manual. Your name, Paul, appears as one of the authors. Second, subclassing Numeric arrays may be useful. Again, the distribution of Numeric Python even has one big example: making a linear algebra oriented version of Numeric Python, where the operations would be the standard matrix and vector operations instead of the element-wise operations. > Been there, done that. It doesn't turn out to be all that useful. As seen from the examples above, it is obvious you did. Still, I don't see how you can possibly say it isn't useful... > To see why, consider a + b where a and b are Foo instances, and Foo > inherits from numarray. > > a. a + b will be a numarray, not a Foo instance, unless you write a > new + operator. b. Attempting to have numarray itself apply a > subclass constructor to the result runs into the problem that numarray > does not have any idea what the constructor's signature is or what > information is needed to fill out that constructor. c. Even if the > subclass accepts numarray's constructor signature, it would rarely > produce satisfactory results just "losing" the Foo'ness details of a > and b. > > This same argument applies to every method that returns a Foo > instance, and every ufunc. So you end up redoing everything anyway. [In general it may be useful to subclass Numeric arrays if one just wants to add/overload methods, but no new properties.] And third, if you read my thread you'd notice that the use of methods instead of functions has implications/benefits much beyond the subclassing issue. It's particularly important for Numeric-alike arrays. All objects in Python are virtual so you don't actually need to subclass to use different kinds of objects in the same piece of code.
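A small sketch of what "array-alike" objects sharing a method interface could look like. The two classes and the take() method below are hypothetical and written from scratch for this example; they are not Numeric or numarray classes.

```python
class DenseVector:
    def __init__(self, values):
        self.values = list(values)

    def take(self, indices):
        return [self.values[i] for i in indices]


class SparseVector:
    def __init__(self, size, nonzero):
        self.size = size
        self.nonzero = dict(nonzero)   # maps index -> value, zeros are implicit

    def take(self, indices):
        return [self.nonzero.get(i, 0) for i in indices]


def first_and_last(vec, n):
    # Works with any object that provides take(); no common base class is needed.
    return vec.take([0, n - 1])


print(first_and_last(DenseVector([5, 0, 0, 7]), 4))
print(first_and_last(SparseVector(4, {0: 5, 3: 7}), 4))
```

A module-level function such as take(vec, indices) would instead have to know about every supported class, which is the trade-off under discussion here.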
While you're right in the sense that for many practical applications there is little use for subclassing (a sparse matrix class is one of them, for instance), you can't deny that it is quite useful to have Numeric-alike arrays, on the same basis as is currently done with the file-alike objects in Python, i.e., they could be strings or web pages, as long as they define a given set of methods. > In short, worrying about subclassing is way down the list of things we > ought to consider. If so, then why did your comment focus only on the subclassing issue? The subclassing was a mere introduction [perhaps unfortunate, I confess] to the method overloading issue. Now, if you could (re)read my first post and comment on my actual suggestion I would appreciate it. Of course, I have no problem if the Numeric/numarray maintainers decide to turn it down. I'll most probably just use UserArray.py to create a "method-ized" version of Numeric, so that my algorithms can work with both Numeric arrays and sparse matrices. (I do have a real-world need for this.) BTW, there is an alternative to creating a fully method-ized Numeric array: just add an attribute which points to the module to which the class belongs, e.g., "myarray.module.take" would point to "Numeric.take" if it was a Numeric array, or "Sparse.take" if it was a sparse matrix. Regards, José Fonseca From bsder at allcaps.org Fri Jan 24 16:19:03 2003 From: bsder at allcaps.org (Andrew P. Lentvorski, Jr.) Date: Fri Jan 24 16:19:03 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: <20030124231900.GB32437@localhost.localdomain> Message-ID: On Fri, 24 Jan 2003, 'José Fonseca' wrote: > Of course that I have no problems if the Numeric/numarray maintainers > decide to turn it down.
I'll most probably just use UserArray.py to create a > "method-ized" version of Numeric, so that my algorithms can work with > both Numeric array and sparse matrices. (I do have a real case need of > for this.) Sparse matrices are common enough that they really should be a base part of Numeric rather than requiring subclassing/extending/etc. I know that Travis O. was working on some sparse matrix stuff a while back so you might want to contact him to get the current status of that work. -a From falted at openlc.org Sat Jan 25 04:43:02 2003 From: falted at openlc.org (Francesc Alted) Date: Sat Jan 25 04:43:02 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: <3E319ED4.5060709@stsci.edu> References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu> Message-ID: <200301251342.15164.falted@openlc.org> On Friday 24 January 2003 21:15, Todd Miller wrote: > > > >My current thinking is something like: > > > >recarrDescr = { > > "name" : defineType(CharType, 16, ""), # 16-character String > > "TDCcount" : defineType(UInt8, 1, 0), # unsigned byte > > "ADCcount" : defineType(Int16, 1, 0), # signed short integer > > "grid_i" : defineType(Int32, 1, 9), # integer > > "grid_j" : defineType(Int32, 1, 9), # integer > > "pressure" : defineType(Float32, 1, 1.), # float > > (single-precision) "temperature" : defineType(Float64, 32, arange(32)), > > # double[32] "idnumber" : defineType(Int64, 1, 0), # signed long > > long } > > > >where defineType is a class that accepts (type, shape, default) > > parameters. It can be extended safely in the future if more needs appear. > > You're way ahead of me here. The only thing I don't like about this is > the additional relative complexity because of the addition of field > names and default values. It would be nice to layer this more. > Well, I think a map between field names and values is valuable from the user's point of view. It may help him to label the different information on the recarray.
Moreover, if __getattr__ and __setattr__ methods (or __getitem__ and __setitem__) were implemented on recarray (as they are in my recarray2 version, for example), the field name could become a very convenient way to access a specific field by name (this introduces the limitation that a field name must be a valid Python identifier, but I think this is not a big restriction). By looking at the description dictionary, the user can get a quick idea of what he can find in every field (with no need for counting, which can be a big advantage especially for long records). With regard to default values, you can make this parameter (even the shape) a keyword parameter in order to make it optional. In that way, the definition can be as simple as "defineType(CharType)" (or even just "CharType", if you add a bit of code) or as complete as "defineType(CharType, shape, default, whatever_you_want)". I think this is quite a flexible approach. >One more thing I don't understand looking at this: a dictionary is >unordered. Yeah, but this can be regarded as an advantage rather than a drawback in the sense that you can choose the order you (the developer) prefer. For example, I was first using an alphanumerical order to arrange the data fields, but now I'm considering that an arrangement that optimizes the alignment of the fields could be far better. For instance, say that you have a (Int8, Int32, Float64) record; in principle it could be easy to create a routine that arranges this record in the form (Float64, Int32, Int8), which optimizes access to the different fields (it may even be possible to introduce automatic padding later on if recarrays support it in the future). Maybe you are getting confused in thinking that recarrDescr will create the recarray. Not at all, this is a *metadata* definition that can be passed to the actual recarray function for recarray creation.
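The reordering idea above (largest fields first) can be sketched with the standard struct module. The field names are invented, and the struct codes "b", "i" and "d" are used here as stand-ins for Int8, Int32 and Float64; this is not how recarray lays out its records, only an illustration of how ordering affects padding under native alignment.

```python
import struct

# (name, struct format character) pairs for a three-field record.
fields = [("flag", "b"), ("count", "i"), ("value", "d")]


def native_size(field_list):
    # "@" asks struct for native alignment, so padding between fields is counted.
    return struct.calcsize("@" + "".join(code for _, code in field_list))


# Sort by item size, largest first, so each field starts on a suitable boundary.
by_size = sorted(fields, key=lambda f: struct.calcsize("=" + f[1]), reverse=True)

print([name for name, _ in by_size])
print(native_size(fields), native_size(by_size))
```

On common platforms the original order needs padding after the one-byte field while the sorted order does not, although the exact sizes are platform dependent.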
Its function would be similar to the formats parameter (with typical values like "3a,4i,3w") in recarray.array, but with more verbosity and all the reported advantages. > >instead of > > > >((Int16, 3), > > (Int32, 4), > > (Float64, 20), > > ) > > This is pretty much exactly what I was thinking. It is straightforward > to imagine and difficult to forget. > > >the former being more handy in lots of situations. > > Would you please name some of these so we can explore handling them both > ways? > Well, I'm afraid that the best advantage would be when dealing with recarrays in C extension modules. In this kind of situation it would be far better to deal with a "3a4i3w" array than a tuple of python objects. But maybe I'm wrong and the latter is not so-complicated to manage; however, I used to work a lot with records (even before meeting recarray) and I was quite comfortable with formats in string mode. Or perhaps it would be enough to provide a method for converting from the standard metadata layout (dictionary or tuple or whatever), to a string format. This should be not very difficult. > > > >Well, if charcodes finally stay in, this have an additional advantage in > >that python crew has provided meaningful ways to express padding > > (character "x"), endianess ("=", "<", ">") and alignment ("@"). > > We might also add these to the type-repetition tuple. It would be nice, of course. 
-- Francesc Alted From jmiller at stsci.edu Sat Jan 25 11:16:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Sat Jan 25 11:16:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301241946.55398.falted@openlc.org> <3E319ED4.5060709@stsci.edu> <200301251342.15164.falted@openlc.org> Message-ID: <3E32E5E3.2020704@stsci.edu> Francesc Alted wrote: >A Divendres 24 Gener 2003 21:15, Todd Miller va escriure: > > >>>My current thinking is something like: >>> >>>recarrDescr = { >>> "name" : defineType(CharType, 16, ""), # 16-character String >>> "TDCcount" : defineType(UInt8, 1, 0), # unsigned byte >>> "ADCcount" : defineType(Int16, 1, 0), # signed short integer >>> "grid_i" : defineType(Int32, 1, 9), # integer >>> "grid_j" : defineType(Int32, 1, 9), # integer >>> "pressure" : defineType(Float32, 1, 1.), # float >>>(single-precision) "temperature" : defineType(Float64, 32, arange(32)), >>># double[32] "idnumber" : defineType(Int64, 1, 0), # signed long >>>long } >>> >>> Still think I'd prefer something seperable: recarrStruct = ( (CharType, 16), UInt8, Int16, Int32, Int32, Float32, (Float64, 32), Int64 ) recarrFields = ["name", "TDCcount", "ADCcount", "grid_i", "grid_j", "pressure", "temperature", "idnumber"] I guess it might not be quite as good for large structs. >>>where defineType is a class that accepts (type, shape, default) >>>parameters. It can be extended safely in the future if more needs appear. >>> >>> >>You're way ahead of me here. The only thing I don't like about this is >>the additional relative complexity because of the addition of field >>names and default values. It would be nice to layer this more. >> >> >> > >Well, I think a map between field names and values is valuable from the >user's point of view. It may help him to label the different information on >the recarray. 
Moreover, if __getattr__ and __setattr__ methods (or >__getitem__ and __setitem__) would get implemented on recarray (as they are >in my recarray2 version, for example), the field name can become a very >convenient manner to access a specific field by name (this introduce the >limitation that field name must be a valid python identifier, but I think >this is not a big restriction). By looking at the description dictionary, >the user can have a quick idea of what he can find in every field (with no >need of counting, which can be a big advantage specially for long records). > That's true and sounds nice. I'm just thinking records with named fields should be derived from records with positional fields. If the functionality is layered, you can use as much complexity as you need. It's a good sign that both you and I thought of an identical tuple format; it's the obvious minimal one. > >With regard to default values, you can make this parameter (even the shape) >a keyword parameter in order to make it optional. > OK. That's a good point. > > >>One more thing I don't understand looking at this: a dictionary is >>unordered. >> >> > >Yeah, but this can be regarded as an advantage rather than a drawback in the >sense that you can choose the order you (the developer) prefer. For example, >I was using first a alphanumerical order to arrange the data fields, but >now, I'm considering that a arrangement that optimizes the alignment of the >fields could be far better. As for one, say that you have a (Int8, Int32, >Float64) record; in principle it could be easy to create a routine that >arranges this record in the form (Float64,Int32, Int8) that optimizes the >different field access (it may be even possible to introduce automatic >padding later on if recarrays would support them in the future). > >Maybe you are getting confused > Yes and no. :) >in thinking that recarrDescr will create the >recarray. 
Not at all, this a *metadata* definition that can be passed to the >actual recarray funtion for recarray creation. > Just like the type repetition tuple except also including field names and default values. I don't think you lost me. For what we do, the exact physical layout of the "struct" is important, so order matters. I see order as part of the meta-data, but I don't usually deal with meta-entities so maybe I've got that part wrong. :) >Its function would be >similar to the formats parameter (with typical values like "3a,4i,3w") in >recarray.array, but with more verbosity and all the reported advantages. > > > >>>instead of >>> >>>((Int16, 3), >>>(Int32, 4), >>>(Float64, 20), >>>) >>> >>> >>This is pretty much exactly what I was thinking. It is straightforward >>to imagine and difficult to forget. >> >> >> >>>the former being more handy in lots of situations. >>> >>> >>Would you please name some of these so we can explore handling them both >>ways? >> >> >> > >Well, I'm afraid that the best advantage would be when dealing with >recarrays in C extension modules. In this kind of situation it would be far >better to deal with a "3a4i3w" array than a tuple of python objects. But >maybe I'm wrong and the latter is not so-complicated to manage; however, I >used to work a lot with records (even before meeting recarray) and I was >quite comfortable with formats in string mode. > I was thinking that if the above was an issue, we could write an API function(s) to "compile" the type-repetition tuple into arrays of ints which describe the type of each field and corresponding repetition factor. > >Or perhaps it would be enough to provide a method for converting from the >standard metadata layout (dictionary or tuple or whatever), to a string >format. This should be not very difficult. > > Almost exactly what I suggested above. 
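The "compile" step discussed above might look roughly like the sketch below. The integer type numbers are invented for the example, and the character codes are taken only from the "3s4i20d" example earlier in the thread; neither is claimed to match numarray's real tables, and type names are plain strings rather than type objects.

```python
# Hypothetical lookup tables, for illustration only.
TYPE_NUMBERS = {"Int16": 0, "Int32": 1, "Float64": 2}
TYPE_CHARCODES = {"Int16": "s", "Int32": "i", "Float64": "d"}


def normalize(item):
    # A bare type name means a repetition factor of 1.
    return item if isinstance(item, tuple) else (item, 1)


def compile_format(spec):
    """Split a type-repetition tuple into parallel lists of type numbers and repeats."""
    types, repeats = [], []
    for name, repeat in map(normalize, spec):
        types.append(TYPE_NUMBERS[name])
        repeats.append(repeat)
    return types, repeats


def to_charcode_string(spec):
    """Rebuild the compact string form from the same tuple."""
    return "".join("%d%s" % (repeat, TYPE_CHARCODES[name])
                   for name, repeat in map(normalize, spec))


spec = (("Int16", 3), ("Int32", 4), ("Float64", 20))
print(compile_format(spec))
print(to_charcode_string(spec))
```

Because the tuple is ordered, both derived forms preserve the physical field order without any extra bookkeeping.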
See you Monday, Todd From baecker at physik.tu-dresden.de Sun Jan 26 02:41:02 2003 From: baecker at physik.tu-dresden.de (baecker at physik.tu-dresden.de) Date: Sun Jan 26 02:41:02 2003 Subject: [Numpy-discussion] complex diagonal matrix Message-ID: Hi, I just wondered if there is a "nicer" way of generating a complex diagonal matrix than a) v=arange(10,typecode=Complex) mat=diag(v) b) v=arange(10) mat=diag(v)+0j Namely, wouldn't something like v=arange(10) mat=diag(v,typecode=Complex) be nicer? BTW: I somehow found that in the (excellent) documentation of Numeric the definitions from Mlab.py are a bit hidden. In my case I know nothing about matlab and I somehow expected that this type of routines are to be found in the section (together with zeros,ones etc. etc....) Also diag is not listed in the index http://www.pfdubois.com/numpy/html2/numpy-22.html#A or ? Arnd From hinsen at cnrs-orleans.fr Sun Jan 26 03:11:02 2003 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Sun Jan 26 03:11:02 2003 Subject: [Numpy-discussion] complex diagonal matrix In-Reply-To: References: Message-ID: baecker at physik.tu-dresden.de writes: > I just wondered if there is a "nicer" way of generating > a complex diagonal matrix than > a) > v=arange(10,typecode=Complex) > mat=diag(v) > b) > v=arange(10) > mat=diag(v)+0j > > Namely, wouldn't something like > v=arange(10) > mat=diag(v,typecode=Complex) > be nicer? Why would that be nicer? Personally, I prefer to have explicit typecodes limited to a very small number of array generators, and have all other functions apply the standard type-preservation rules. Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen at cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From list at jsaul.de Sun Jan 26 04:03:05 2003 From: list at jsaul.de (Joachim Saul) Date: Sun Jan 26 04:03:05 2003 Subject: [Numpy-discussion] complex diagonal matrix In-Reply-To: References: Message-ID: <20030126120117.GB869@jsaul.de> * baecker at physik.tu-dresden.de [26.01.2003 11:40]: > I just wondered if there is a "nicer" way of generating > a complex diagonal matrix than > a) > v=arange(10,typecode=Complex) > mat=diag(v) > b) > v=arange(10) > mat=diag(v)+0j > > Namely, wouldn't something like > v=arange(10) > mat=diag(v,typecode=Complex) > be nicer? No, because diag() is supposed to create a diagonal, but *not* to cast to another type. If you wanted to add that "functionality" to functions like diag(), you would also have to add it to functions like reshape() etc., i.e. practically everywhere. The way it is handled now is reasonably simple and flexible, and there is really no advantage of your suggestion compared to approach a). 
Cheers, Joachim From falted at openlc.org Mon Jan 27 04:02:02 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 27 04:02:02 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: <3E32E5E3.2020704@stsci.edu> References: <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu> Message-ID: <200301271301.01659.falted@openlc.org> On Saturday 25 January 2003 20:30, Todd Miller wrote: > > Still think I'd prefer something separable: > > recarrStruct = ( (CharType, 16), > UInt8, > Int16, > Int32, > Int32, > Float32, > (Float64, 32), > Int64 ) > > recarrFields = ["name", > "TDCcount", > "ADCcount", > "grid_i", > "grid_j", > "pressure", > "temperature", > "idnumber"] > > I guess it might not be quite as good for large structs. Me too... > > It's a good sign that both you and I thought of an identical tuple > format; it's the obvious > minimal one. Yeah. We just differ in the way to arrange this metadata to be passed to the recarray constructor. But I think this is secondary compared to the flexibility that a verbose approach offers compared with the current string format. In fact, more than one container might be supported to define the metadata; one can start with tuples as you suggest, but in the future other ways can be added (if considered convenient). For example, I think I'll stick with the dictionary option for PyTables, but a class declaration for the metadata would also be supported, as in: class Small(IsRecord): var1 = defineType(CharType, 2, "") var2 = defineType(Int32, 1) var3 = Float64 This would not be difficult to support because, by accessing Small().__dict__, you also get a dictionary. In addition, the latter will ensure (by construction) that you are not using a non-valid Python identifier, which is mandatory in my current implementation. I find these containers (dictionaries and classes) both elegant and convenient.
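A sketch of the class-based declaration, under the assumption that IsRecord is an ordinary base class and that field descriptions are plain tuples or strings (the real defineType and numarray type objects are not used here). One detail worth noting: attributes declared in the class body are stored in the class namespace, so in this sketch they are read from the class rather than from an instance's __dict__.

```python
class IsRecord:
    """Hypothetical base class for record declarations."""

    @classmethod
    def fields(cls):
        # Class-level attributes live in the class namespace; skip private names.
        return {name: value for name, value in vars(cls).items()
                if not name.startswith("_")}


class Small(IsRecord):
    var1 = ("CharType", 2, "")
    var2 = ("Int32", 1)
    var3 = "Float64"


print(Small().__dict__)          # empty: the instance holds no field entries
print(sorted(Small.fields()))    # the declared field names
```

Filtering out names that start with an underscore is one simple way to separate field declarations from the other entries a class namespace contains.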
> > Just like the type repetition tuple except also including field names > and default values. I don't think you lost me. For what we do, the > exact physical layout of the "struct" is important, so order matters. I > see order as part of the > meta-data, but I don't usually deal with meta-entities so maybe I've > got that part wrong. :) > Well, if you need positional fields, you may add an (optional) parameter, called for example "position", so that you can fix it. > > I was thinking that if the above was an issue, we could write an API > function(s) to "compile" the type-repetition tuple into arrays of ints > which describe the type of each field and corresponding repetition factor. Yeah, I agree that this would be the best solution. That way, the charcodes will be factored out of the code, and just providing such an API (both in Python and C) would be enough to reconstruct them, if needed. That will allow more consistent numarray internal code. > > See you Monday, Right, how did you know that? :) -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 06:44:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 06:44:03 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301251342.15164.falted@openlc.org> <3E32E5E3.2020704@stsci.edu> <200301271301.01659.falted@openlc.org> Message-ID: <3E354551.5090704@stsci.edu> Francesc Alted wrote: >Yeah. We just differ in the way to arrange this metadata to be passed to the >recarray constructor. But I think this is secondary compared to the >flexibility that a verbose approach offers compared with the current string >format. > Yes. So one question is: if we were to add type-repetition tuples to recarray as an alternative to the current character code strings, would that be any form of improvement to recarray from your perspective? As I see it, recarray currently has a clean separation between format and naming which permits the latter to be optional.
Before changing that, I'd need a clear argument why. (I didn't design and generally don't even maintain recarray). >In fact, more than one container might be supported to define the >metadata; one can start with tuples as you suggest, but in the future other >ways can be added (if considered convenient). > > >For example, I think I'll stick with the dictionary option for PyTables, but >also a class declaration for the metadata would be supported, like in : > >class Small(IsRecord): > var1 = defineType(CharType, 2, "") > var2 = defineType(Int32, 1) > var3 = Float64 > >This would not be difficult to support because, by accessing to the >Small().__dict__, you get also a dictionary. In addition, the latter will >ensure (by construction) that you are not using a non-valid python >identifier, which is mandatory in my current implementation. I find these >containers (dictionaries and classes) both elegant and convenient. > > I'm not trying to be Mr. Negative here, but one thing to keep in mind is this: >>> class C: ... pass ... >>> c = C() >>> dir(c.__dict__) ['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values'] Which is to say, the instance dictionary is a little cluttered, and it might not be that easy to determine which objects in it are there to define the data format. >>Just like the type repetition tuple except also including field names >>and default values. I don't think you lost me. For what we do, the >>exact physical layout of the "struct" is important, so order matters. 
I >>see order as part of the >>meta-data, but I don't usually deal with meta-entities so maybe I've >>got that part wrong. :) >> > >Well, if you need positional fields, you may add a (optional) parameter, >called for example, "position" so that you can fix it. > > I'm sure that's not the easiest way to capture struct layout, but I take your point. Since position matters to me, I'd prefer that capturing them was implicit. Since it doesn't to you, it seems OK for it to be explicit. Either default mode can support the other, but capturing order with tuples is free, while capturing order with a __dict__ will take some kind of extra work. >>I was thinking that if the above was an issue, we could write an API >>function(s) to "compile" the type-repetition tuple into arrays of ints >>which describe the type of each field and corresponding repetition factor. >> >> > >Yeah, I agree that this would be the best solution. That way, the charcodes >will be factored out from the code, and by just providing such and API (both >in Python and C), would be enough to reconstruct them, if needed. That will >allow a more consistent numarray internal code. > > I'm thinking the general format for this may be converting N-tuples of types and ints into N arrays of types and ints. And vice versa. It's obvious how this works with numarray types. I think the chararray types need work and need to be mapped into the same integer enumeration as the numeric types in a non-overlapping way. >See you Monday, > > > >Right, how did you know that? 
:) > > Insightful on weekends anyway, Todd From jmiller at stsci.edu Mon Jan 27 08:30:02 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 08:30:02 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> Message-ID: <3E355E35.9070805@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 15:42, Todd Miller va escriure: > > >>Yes. So one question is: if we were to add type-repetition tuples to >>recarray as an alternative to the current character code strings, would >>that be any form of improvement to recarray from your perspective? >> >> > >Well, at least, charcodes can be avoided. I think it's a big win... or maybe >not as big? > > I think that avoiding the charcodes would be an improvement. Type-repetition tuples provide a clear well defined way to define data formats. It's not so clear that it eliminates the requirement for on-going Numeric compatability, but it might. > > >>As I see it, recarray currently has a clean seperation between format >>and naming which permits the latter to be optional. Before changing >>that, I'd need a clear argument why. (I didn't design and generally >>don't even maintain recarray). >> >> > >One argument is the fact that a map is very clear to the user, although that >such a map can be built *after* the names and format are passed to the >recarray constructor and be accessible as an atribute. However, the latter >solution is worse IMO, because the user has to supply two separate pieces of >information when, actually, these should be regarded as a unity. Anyway, >this maybe a subjective perception. > > Well, I think there's truth to the danger of seperating names from data declarations, but it is easy to map keys(), values() to the seperate pieces in a different layer if necessary. >This would not be difficult to support because, by accessing to the >Small().__dict__, you get also a dictionary. 
In addition, the latter will >ensure (by construction) that you are not using a non-valid python >identifier, which is mandatory in my current implementation. I find these >containers (dictionaries and classes) both elegant and convenient. > > >>I'm not trying to be Mr. Negative here, but one thing to keep in mind >> >> > >Oh dear, you are right!. > For a few seconds there, I thought I was on a roll! >In fact, I forgot that to make this to work, you >need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's >post: http://mail.python.org/pipermail/python-list/2002-July/112007.html). >I was following this recipe, but I forgot that I was using Python 2.2. > >So, as numarray has to work with previous python versions, there is no point >to care about that. > > In truth, numarray-0.4 and up already require Python-2.2 and up. >I'm sure that's not the easiest way to capture struct layout, but I >take your point. Since position matters to me, I'd prefer that >capturing them was implicit. Since it doesn't to you, it seems OK for >it to be explicit. Either default mode can support the other, but >capturing order with tuples is free, while capturing order with a >__dict__ will take some kind of extra work. > > > >That's right. We have some different needs and priorities, and we should >take the approach better suited to each other. But exchanging points of view >is always a great thing. > > > >>I'm thinking the general format for this may be converting N-tuples of >>types and ints into N arrays of types and ints. And vice versa. >>It's obvious how this works with numarray types. I think the chararray >>types need work and need to be mapped into the same integer enumeration >>as the numeric types in a non-overlapping way. >> >> >> > >I can't catch your point here. Why there should be a problem with >chararrays?. > What I was trying to see is that chararray types are not as well designed as the numarray types, nor are they reflected in the C-API. 
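
The "compile" step discussed in this exchange (turning a type-repetition tuple
into parallel arrays of type codes and repetition factors, and back) might
look roughly like the sketch below. The integer codes in TYPE_ENUM are
invented for illustration and are not numarray's real NumarrayType values;
compile_format and decompile_format are hypothetical names. The character
type gets its own code in the same table so that it cannot collide with a
numeric type.

```python
# Invented enumeration, for illustration only; not numarray's actual values.
TYPE_ENUM = {"Int8": 1, "UInt8": 2, "Int16": 3, "Int32": 4,
             "Int64": 5, "Float32": 6, "Float64": 7, "CharType": 8}
ENUM_TYPE = {code: name for name, code in TYPE_ENUM.items()}


def compile_format(fmt):
    """Turn a type-repetition tuple into parallel lists of codes and counts."""
    codes, repeats = [], []
    for item in fmt:
        # A bare type means a repetition factor of 1.
        type_, repeat = item if isinstance(item, tuple) else (item, 1)
        codes.append(TYPE_ENUM[type_])
        repeats.append(repeat)
    return codes, repeats


def decompile_format(codes, repeats):
    """Rebuild the type-repetition tuple from the two parallel lists."""
    return tuple(ENUM_TYPE[c] if r == 1 else (ENUM_TYPE[c], r)
                 for c, r in zip(codes, repeats))


fmt = (("CharType", 16), "UInt8", "Int16", ("Float64", 32))
codes, repeats = compile_format(fmt)
print(codes)    # [8, 2, 3, 7]
print(repeats)  # [16, 1, 1, 32]
```

Field order is carried by list position, so the physical layout is captured
without any extra bookkeeping.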
> > From falted at openlc.org Mon Jan 27 08:39:05 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 27 08:39:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: <3E354551.5090704@stsci.edu> References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> Message-ID: <200301271717.19055.falted@openlc.org> A Dilluns 27 Gener 2003 15:42, Todd Miller va escriure: > Yes. So one question is: if we were to add type-repetition tuples to > recarray as an alternative to the current character code strings, would > that be any form of improvement to recarray from your perspective? Well, at least, charcodes can be avoided. I think it's a big win... or maybe not as big? > > As I see it, recarray currently has a clean seperation between format > and naming which permits the latter to be optional. Before changing > that, I'd need a clear argument why. (I didn't design and generally > don't even maintain recarray). One argument is the fact that a map is very clear to the user, although that such a map can be built *after* the names and format are passed to the recarray constructor and be accessible as an atribute. However, the latter solution is worse IMO, because the user has to supply two separate pieces of information when, actually, these should be regarded as a unity. Anyway, this maybe a subjective perception. > >This would not be difficult to support because, by accessing to the > >Small().__dict__, you get also a dictionary. In addition, the latter will > >ensure (by construction) that you are not using a non-valid python > >identifier, which is mandatory in my current implementation. I find these > >containers (dictionaries and classes) both elegant and convenient. > > I'm not trying to be Mr. Negative here, but one thing to keep in mind Oh dear, you are right!. 
In fact, I forgot that to make this to work, you need to use the metaclasses introduced in Python 2.2 (see Alex Martelli's post: http://mail.python.org/pipermail/python-list/2002-July/112007.html). I was following this recipe, but I forgot that I was using Python 2.2. So, as numarray has to work with previous python versions, there is no point to care about that. > > I'm sure that's not the easiest way to capture struct layout, but I > take your point. Since position matters to me, I'd prefer that > capturing them was implicit. Since it doesn't to you, it seems OK for > it to be explicit. Either default mode can support the other, but > capturing order with tuples is free, while capturing order with a > __dict__ will take some kind of extra work. That's right. We have some different needs and priorities, and we should take the approach better suited to each other. But exchanging points of view is always a great thing. > > I'm thinking the general format for this may be converting N-tuples of > types and ints into N arrays of types and ints. And vice versa. > It's obvious how this works with numarray types. I think the chararray > types need work and need to be mapped into the same integer enumeration > as the numeric types in a non-overlapping way. > I can't catch your point here. Why there should be a problem with chararrays?. -- Francesc Alted From Chris.Barker at noaa.gov Mon Jan 27 10:20:06 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Jan 27 10:20:06 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> Message-ID: <3E35768B.DD6454BE@noaa.gov> Francesc Alted wrote: > So, as numarray has to work with previous python versions, Why? Anyone using NumArray is either starting from scratch or porting from Numeric, so having to port to a newer version of Python is a very small deal. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jmiller at stsci.edu Mon Jan 27 10:34:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 10:34:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271301.01659.falted@openlc.org> <3E354551.5090704@stsci.edu> <200301271717.19055.falted@openlc.org> <3E35768B.DD6454BE@noaa.gov> Message-ID: <3E357B5F.9030908@stsci.edu> Chris Barker wrote: >Francesc Alted wrote: > > > >>So, as numarray has to work with previous python versions, >> >> > >Why? Anyone using NumArray is either starting from scratch or porting >from Numeric, so having to port to a newer version of Python is a very >small deal. > > Just to make it very clear: numarray-0.4 and up require Python-2.2 or higher. Up until numarray-0.4 (released in November), that was not the case, and numarray ran (and was tested!) on Python-2.0 and higher. The desire to increase C-level Numeric compatability and to improve simple indexing speed led us to a C baseclass, which is only supported in Python-2.2 and up. Todd From falted at openlc.org Mon Jan 27 11:23:01 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Jan 27 11:23:01 2003 Subject: FW: [Numpy-discussion] typecodes in numarray In-Reply-To: <3E355E35.9070805@stsci.edu> References: <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> Message-ID: <200301272021.47587.falted@openlc.org> A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure: > >So, as numarray has to work with previous python versions, there is no > > point to care about that. > > In truth, numarray-0.4 and up already require Python-2.2 and up. Oh!, I didn't know that. In such a case, I think it's worth to consider the possibility to define records as classes descendants from metaclasses. But, of course, you have the ultimate decision. 
> >>I'm thinking the general format for this may be converting N-tuples of > >>types and ints into N arrays of types and ints. And vice versa. > >>It's obvious how this works with numarray types. I think the chararray > >>types need work and need to be mapped into the same integer enumeration > >>as the numeric types in a non-overlapping way. > > > >I can't catch your point here. Why there should be a problem with > >chararrays?. > > What I was trying to see is that chararray types are not as well > designed as the numarray types, nor are they reflected in the C-API. I see. Well, is it really desirable such a unification? CharArray entities come from a module and NumArray from another one, and that should be ok. Why bother in creating a unified API or integer enumeration?. I think this should be not a big drawback for C-extension crafters (although, to say the truth, that would be very elegant if you manage to do that, but maybe it is not worth the effort, I don't know). -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 11:39:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 11:39:01 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <000001c2c635$624e9a40$6601a8c0@NICKLEBY> Message-ID: <3E358A72.6050400@stsci.edu> Paul F Dubois wrote: >IMHO you can assume any Python you want. Look to the long term here, not the >short. > You lost me. numarray-0.4 needs at least Python-2.2 or baseclasses don't exist. I had a slow Python equivalent for the baseclass as I refactored prior to numarray-0.4, but it's gone now. > >I'm a bit uncertain on MA as to whether my old design is right. Maybe I >should be inheriting from NDarray? So that MA is more of a sibling of >numarray rather than a wrapper of it? > > I asked Perry about this one. His points (salted a little by me) were: 1. If you inherit from NumArray, you also inherit from NDArray. If you only inherit from NDArray, all you get are the structural operations. 2. 
If you inherit from NumArray, you can use Liskov substitution to pass MA's directly into extensions expecting NumArrays. This substitution may or may not be good. Also, isinstance(anMA, numarray) will return True. 3. If you inherit from NumArray, you get numerical method definitions which may or may not be applicable to MA. With a little thrashing, we might also get MAs to work for ufuncs. In fact, ufuncs are the key to whether or not the NumArray numerical methods add any value. Todd > > From jmiller at stsci.edu Mon Jan 27 11:54:06 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 11:54:06 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301271717.19055.falted@openlc.org> <3E355E35.9070805@stsci.edu> <200301272021.47587.falted@openlc.org> Message-ID: <3E358DE0.7040501@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 17:28, Todd Miller va escriure: > > >>>So, as numarray has to work with previous python versions, there is no >>>point to care about that. >>> >>> >>In truth, numarray-0.4 and up already require Python-2.2 and up. >> >> > >Oh!, I didn't know that. In such a case, I think it's worth to consider the >possibility to define records as classes descendants from metaclasses. But, >of course, you have the ultimate decision. > > I don't know what you mean here. Please spell it out a little more. > > >>>>I'm thinking the general format for this may be converting N-tuples of >>>>types and ints into N arrays of types and ints. And vice versa. >>>>It's obvious how this works with numarray types. I think the chararray >>>>types need work and need to be mapped into the same integer enumeration >>>>as the numeric types in a non-overlapping way. >>>> >>>> >>>I can't catch your point here. Why there should be a problem with >>>chararrays?. >>> >>> >>What I was trying to see is that chararray types are not as well >>designed as the numarray types, nor are they reflected in the C-API. >> >> > >I see. 
Well, is it really desirable such a unification? CharArray entities
>come from a module and NumArray from another one, and that should be ok. Why
>bother in creating a unified API or integer enumeration?.
>
It may not be necessary. Int8 with repetition factors may work about
the same.

From falted at openlc.org Mon Jan 27 12:16:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Jan 27 12:16:02 2003
Subject: FW: [Numpy-discussion] typecodes in numarray
In-Reply-To: <3E358DE0.7040501@stsci.edu>
References: <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu>
Message-ID: <200301272114.53545.falted@openlc.org>

On Monday 27 January 2003 20:52, Todd Miller wrote:
> >
> >Oh!, I didn't know that. In such a case, I think it's worth to consider
> > the possibility to define records as classes descendants from
> > metaclasses. But, of course, you have the ultimate decision.
>
> I don't know what you mean here. Please spell it out a little more.

I meant that using something like:

class Small(IsRecord):
    field1 = defineType(CharType, 2, default="", position=1)
    field2 = defineType(Int32, 1, position=2)
    field3 = Float64

as a container for recarray metadata is definitely possible, instead of the
tuple (formats="2aid",names=("field1","field2", "field3")), if using
Python 2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows
you to effectively separate the declared attributes from the implicit ones
in normal classes.

Of course, you can tailor IsRecord so as to fulfill your needs.
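
A rough sketch of how such an IsRecord could separate declared fields from
implicit class attributes, written in modern Python syntax rather than Python
2.2. This is not the PyTables implementation: RecordMeta, the defineType class
and the field_names / field_types attributes are hypothetical, and strings
stand in for the numarray type objects. In this sketch IsRecord is a base
class whose metaclass does the work. The ordering rule is also an assumption:
fields with an explicit position come first, sorted by position, and the rest
follow in declaration order.

```python
class defineType:
    """Hypothetical field description: type, repetition, default, position."""

    def __init__(self, type_, repeat=1, default=None, position=None):
        self.type = type_
        self.repeat = repeat
        self.default = default
        self.position = position


class RecordMeta(type):
    """Collect the declared fields of a class when the class is created."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        declared = []
        for index, (key, value) in enumerate(namespace.items()):
            if key.startswith("__"):
                continue  # implicit entries such as __module__
            if not isinstance(value, defineType):
                value = defineType(value)  # a bare type: repetition of 1
            # Explicitly positioned fields first, then declaration order.
            sort_key = (value.position is None, value.position or 0, index)
            declared.append((sort_key, key, value))
        declared.sort(key=lambda item: item[0])
        cls.field_names = [key for _, key, _ in declared]
        cls.field_types = {key: value for _, key, value in declared}
        return cls


class IsRecord(metaclass=RecordMeta):
    pass


class Small(IsRecord):
    field1 = defineType("CharType", 2, default="", position=1)
    field2 = defineType("Int32", 1, position=2)
    field3 = "Float64"


print(Small.field_names)  # ['field1', 'field2', 'field3']
```

The sketch treats every non-dunder class attribute as a field, so it assumes
record classes declare data attributes only.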
I hope that I have expressed myself more clearly now, -- Francesc Alted From jmiller at stsci.edu Mon Jan 27 12:54:05 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Jan 27 12:54:05 2003 Subject: FW: [Numpy-discussion] typecodes in numarray References: <200301272021.47587.falted@openlc.org> <3E358DE0.7040501@stsci.edu> <200301272114.53545.falted@openlc.org> Message-ID: <3E359C2B.4070509@stsci.edu> Francesc Alted wrote: >A Dilluns 27 Gener 2003 20:52, Todd Miller va escriure: > > >>>Oh!, I didn't know that. In such a case, I think it's worth to consider >>>the possibility to define records as classes descendants from >>>metaclasses. But, of course, you have the ultimate decision. >>> >>> >>I don't know what you mean here. Please spell it out a little more. >> >> > >I was trying to mean that using something like : > >class Small(IsRecord): > field1 = defineType(CharType, 2, default="", position=1) > field2 = defineType(Int32, 1, position=2) > field3 = Float64 > >as as container for recarray metadata is definitely possible instead of the >tuple (formats="2aid",names=("field1","field2", "field3")), if using >Python2.2. IsRecord is a metaclass (introduced in Python 2.2) that allows >you to effectively separate the declared attributes from the implicit ones >in normal classes. > >Of course, you can taylor IsRecord so as to fulfill your needs. > >I hope that I have expressed myself more clearly now, > > > I looked at your docs here: http://pytables.sourceforge.net/html-doc/usersguide-html4.html#section4.2 and what you said above clicked. Thanks. 
Todd From Chris.Barker at noaa.gov Tue Jan 28 11:02:04 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Jan 28 11:02:04 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions References: <6AF88055-28D9-11D7-AE69-000A27B19B96@oratrix.com> <3E26EC9D.A0B7D173@noaa.gov> <3E288068.3070407@stsci.edu> Message-ID: <3E36D14D.C3238DFA@noaa.gov> Konrad Hinsen wrote: > > M = array(l) > > Mt = M.transpose() > > > > just isn't that much worse than: > > > > Mt = transpose(l) > > No, but the automatic conversion enables me to write functions that > accept any sequence type without even having to think about it. I've used that to, but I also frequently use something like this: def function(A): A = array(A) ... Which is pretty simple to. > Moreover, it is almost essential in many situations to accept scalars > in place of arrays, because scalars fulfill the role of rank-0 arrays. Yes, this is critical. Isn't there a plan to make the scalar -- rank-0 array dicotomy a little cleaner in NumArray ? > > I also agree that the point is not subclassing per se, it's > > polymorphism. It should be easy to write a class that acts like an array > > in all the ways that you need it to. > > True, and that is a weak point of NumPy. Is this getting any better with NumArray? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From falted at openlc.org Tue Jan 28 11:42:07 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Jan 28 11:42:07 2003 Subject: [Numpy-discussion] enum values visible in numeric types instances? Message-ID: <200301282041.21145.falted@openlc.org> Hi, A couple of points related with numarray type objects: 1.- When working with numeric types instances like UInt8 or Float64, is there a way to access to their enumeration NumarrayType C counterpart?. 
That can be handy when you want to map between these objects and integers.
For example, right now, I'm forced to use these mappings in Pyrex:

# Conversion tables from/to classes to the numarray enum types
toenum = {num.Int8:tInt8,       num.UInt8:tUInt8,
          num.Int16:tInt16,     num.UInt16:tUInt16,
          num.Int32:tInt32,     num.UInt32:tUInt32,
          num.Float32:tFloat32, num.Float64:tFloat64,
          CharType:97   # ascii(97) --> 'a' # Special case (to be corrected)
          }

toclass = {tInt8:num.Int8,       tUInt8:num.UInt8,
           tInt16:num.Int16,     tUInt16:num.UInt16,
           tInt32:num.Int32,     tUInt32:num.UInt32,
           tFloat32:num.Float32, tFloat64:num.Float64,
           97:CharType   # ascii(97) --> 'a' # Special case (to be corrected)
           }

(yes, Pyrex lets you do that kind of "miracles", like mappings between
Python objects and C integers)

but if I had this access directly from the object (for example
Int8.enumType), my code (and C-extensions in general) could look simpler.

2.- I understand now why Todd was worried about CharArray objects being
assigned to an enumerated type. In fact, if you look at the above maps, I
have to map this special object myself as the number 97 (which is the ascii
value for character "a"). 97 is ok for now because it can't collide (at
least for a while) with other enumeration types.

My suggestion is that it would be a good thing to have a reserved enum type
for CharArray. And I think that mapping CharArrays to Bool or Int8 would
not be a good solution, because chararray objects differ from them in ways
that would make it a mess to distinguish both objects in C code by just
looking at the enumeration type.

I don't know, but maybe recarrays also merit a place in the enumeration (?).

By the way, after the discussion with Todd I finally decided to remove all
the Numeric charcodes (and related codes) from PyTables. However, I can
still manage Numeric objects by converting them to numarray and accessing
the class type with the .type() method. And you know what?
the code looks much more logical and neat, and best of all, less error-prone (well, at least I hope so!). I definitely encourage you to do a similar transition in numarray (although I guess that would be more difficult because you still need to Numeric compatibility). Thanks, -- Francesc Alted From perry at stsci.edu Tue Jan 28 13:59:08 2003 From: perry at stsci.edu (Perry Greenfield) Date: Tue Jan 28 13:59:08 2003 Subject: [Numpy-discussion] Extensive use of methods instead of functions In-Reply-To: <3E36D14D.C3238DFA@noaa.gov> Message-ID: > Yes, this is critical. Isn't there a plan to make the scalar -- rank-0 > array dicotomy a little cleaner in NumArray ? > Hmmm, I'd like to say yes, but I'm not sure what exactly you are referring to. Please elaborate on how you think it should be changed. About the only thing that comes to mind is that repr() for rank-0 will be different for numarray than Numeric, and that it will never be the result of any reduction or similar selection. > > > I also agree that the point is not subclassing per se, it's > > > polymorphism. It should be easy to write a class that acts > like an array > > > in all the ways that you need it to. > > > > True, and that is a weak point of NumPy. > > Is this getting any better with NumArray? > Again, I hope so, but I find this too general to know if it satisfies anyone's specific goals. I'd like to see specific examples. I think it is often tricker than people initially think. Perry From jdhunter at ace.bsd.uchicago.edu Wed Jan 29 13:13:03 2003 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Wed Jan 29 13:13:03 2003 Subject: [Numpy-discussion] fastest way to make two vectors into an array Message-ID: I have two equal length 1D arrays of 256-4096 complex or floating point numbers which I need to put into a shape=(len(x),2) array. I need to do this a lot, so I would like to use the most efficient means. 
Currently I am doing:

def somefunc(x,y):
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y
    do_something_with(X)

Is this the fastest way?

Thanks,
John Hunter

From list at jsaul.de Thu Jan 30 01:20:04 2003
From: list at jsaul.de (Joachim Saul)
Date: Thu Jan 30 01:20:04 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To:
References:
Message-ID: <20030130091853.GA842@jsaul.de>

* John Hunter [2003-01-29 22:13]:
> def somefunc(x,y):
>     X = zeros( (len(x),2), typecode=x.typecode())
>     X[:,0] = x
>     X[:,1] = y
>     do_something_with(X)
>
> Is this the fastest way?

    X = transpose(array([x]+[y]))

It may not be the fastest possible way, but should be about a
factor of two faster; better than nothing.

Cheers,
Joachim

From karthik at james.hut.fi Thu Jan 30 01:47:03 2003
From: karthik at james.hut.fi (Karthikesh Raju)
Date: Thu Jan 30 01:47:03 2003
Subject: [Numpy-discussion] Object too deep for desired array
In-Reply-To:
Message-ID:

Hi,

i was trying out something like this

import Numeric
import LinearAlgebra
import cmath
import RandomArray
import copy

def sMatrix(pd, code, window):
    if window == 0:
        nprime = 1
    else:
        nprime = window
    K, C = Numeric.shape(code)
    K1, L = Numeric.shape(pd)
    # check if K == K1 and raise an exception here
    sCode = Numeric.zeros([nprime*C,K*L*(window+1)],'d')
    for k in range(K):
        for l in range(L):
            code1 = copy.deepcopy(Numeric.array(code[k,0:C-pd[k,l]]))
            code1.shape = (C-pd[k,l],1)
            sCode1= Numeric.concatenate((Numeric.zeros([pd[k,l],1]),Numeric.zeros([C*window,1]),code1))
            sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1)
    return sCode

if __name__ == "__main__":
    pd = Numeric.array([[2]])
    code = Numeric.array([[-1,1,-1,1,1]])
    np = sMatrix(pd,code,0)
    print np
    print "--"*30
    np = sMatrix(pd,code,1)
    print Numeric.shape(np)
    print np
    print "--"*30
    np = sMatrix(pd,code,2)
    print Numeric.shape(np)
    print np
    print "--"*30

------------------------------

And i get stuck with the following error message::

Traceback
(most recent call last): File "sMatrix.py", line 31, in ? np = sMatrix(pd,code,0) File "sMatrix.py", line 24, in sMatrix sCode[:, (window+1)*l+window*L*k] = copy.deepcopy(sCode1) ValueError: Object too deep for desired array ------------ i think it is due to the many deep copy operations taht i am performing. i want to be in a position where slices of matrices should not be references, but should be copies itself and i should be able to move these copies around. (May be it is inefficient, but that is what i did in Matlab and want some compatibility, till i learn more of python and till i migrate to python completely). Is there a way out? Why is this an problem? Am i missing something. Best regards, karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND ----------------------------------------------------------------------- From pearu at cens.ioc.ee Thu Jan 30 01:51:09 2003 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu Jan 30 01:51:09 2003 Subject: [Numpy-discussion] fastest way to make two vectors into an array In-Reply-To: Message-ID: On Wed, 29 Jan 2003, John Hunter wrote: > > I have two equal length 1D arrays of 256-4096 complex or floating > point numbers which I need to put into a shape=(len(x),2) array. > > I need to do this a lot, so I would like to use the most efficient > means. Currently I am doing: > > def somefunc(x,y): > X = zeros( (len(x),2), typecode=x.typecode()) > X[:,0] = x > X[:,1] = y > do_something_with(X) > > Is this the fastest way? May be you could arange your algorithm so that you first create X and then reference its columns by x,y without copying: # Allocate memory X = zeros( (n,2), typecode=.. 
)
# Get references to columns
x = X[:,0]
y = X[:,1]
while 1:
    do_something_inplace_with(x,y)
    do_something_with(X)

Pearu

From jdhunter at ace.bsd.uchicago.edu Thu Jan 30 11:26:05 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Thu Jan 30 11:26:05 2003
Subject: [Numpy-discussion] fastest way to make two vectors into an array
In-Reply-To: (John Hunter's message of "Wed, 29 Jan 2003 15:13:03 -0600")
References:
Message-ID:

>>>>> "John" == John Hunter writes:

    John> I have two equal length 1D arrays of 256-4096 complex or
    John> floating point numbers which I need to put into a
    John> shape=(len(x),2) array.

    John> I need to do this a lot, so I would like to use the most
    John> efficient means. Currently I am doing:

I tested all the suggested methods and the transpose with [x] and [y]
was the clear winner, with an 8 fold speed up over my original code.
The concatenate method was between 2-3 times faster.

Thanks to all who responded,
John Hunter

cruncher2:~/python/test> python test.py test_naive
test_naive 0.480427026749
cruncher2:~/python/test> python test.py test_concat
test_concat 0.189149975777
cruncher2:~/python/test> python test.py test_transpose
test_transpose 0.0698409080505

from Numeric import transpose, concatenate, reshape, array, zeros
from RandomArray import normal
import time, sys

def test_naive(x,y):
    "Naive approach"
    X = zeros( (len(x),2), typecode=x.typecode())
    X[:,0] = x
    X[:,1] = y

def test_concat(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = concatenate( ( reshape(x,(-1,1)), reshape(y,(-1,1)) ), 1)

def test_transpose(x,y):
    "Thanks to Joachim Saul"
    X = transpose(array([x]+[y]))

m = {'test_naive' : test_naive,
     'test_concat' : test_concat,
     'test_transpose' : test_transpose}

nse1 = normal(0.0, 1.0, (4096,))
nse2 = normal(0.0, 1.0, nse1.shape)

N = 1000
trials = range(N)
func = m[sys.argv[1]]
t1 = time.time()
for i in trials:
    func(nse1,nse2)
t2 = time.time()
print sys.argv[1], t2-t1

From jdhunter at ace.bsd.uchicago.edu Thu Jan 30 14:18:04 2003
From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Thu Jan 30 14:18:04 2003 Subject: [Numpy-discussion] mlab functions: psd, csd, cohere, corrcoef Message-ID: I needed some spectral analysis functions, and finding none available, wrote my own. I use matlab a lot, so I wrote them to be matlab compatible. If you all think these look OK, I'm happy to submit them for inclusion into MLab. ------------------------------------------------------------------- """ Spectral analysis functions for Numerical Python written for compatibility with matlab commands with the same names. psd - Power spectral density using Welch's average periodogram csd - Cross spectral density using Welch's average periodogram cohere - Coherence (normalized cross spectral density) corrcoef - The matrix of correlation coefficients The functions are designed to work for real and complex valued Numeric arrays. One of the major differences between this code and matlab's is that I use functions for 'detrend' and 'window', and matlab uses vectors. This can be easily changed, but I think the functional approach is a bit more elegant. Please send comments, questions and bugs to: Author: John D.
Hunter """ from __future__ import division from MLab import mean, hanning, cov from Numeric import zeros, ones, diagonal, transpose, matrixmultiply, \ resize, sqrt, divide, array, Float, Complex, concatenate, \ convolve, dot, conjugate, absolute, arange, reshape from FFT import fft def norm(x): return sqrt(dot(x,x)) def window_hanning(x): return hanning(len(x))*x def window_none(x): return x def detrend_mean(x): return x - mean(x) def detrend_none(x): return x def detrend_linear(x): """Remove the best fit line from x""" # I'm going to regress x on xx=range(len(x)) and return # x - (b*xx+a) xx = arange(len(x), typecode=x.typecode()) X = transpose(array([xx]+[x])) C = cov(X) b = C[0,1]/C[0,0] a = mean(x) - b*mean(xx) return x-(b*xx+a) def psd(x, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ The power spectral density by Welch's average periodogram method. The vector x is divided into NFFT length segments. Each segment is detrended by function detrend and windowed by function window. noverlap gives the length of the overlap between segments. The absolute(fft(segment))**2 of each segment are averaged to compute Pxx, with a scaling to correct for power loss due to windowing. Fs is the sampling frequency. -- NFFT must be a power of 2 -- detrend and window are functions, unlike in matlab where they are vectors. -- if length x < NFFT, it will be zero padded to NFFT Refs: Bendat & Piersol -- Random Data: Analysis and Measurement Procedures, John Wiley & Sons (1986) """ if NFFT % 2: raise ValueError, 'NFFT must be a power of 2' # zero pad x up to NFFT if it is shorter than NFFT if len(x)1: Pxx = mean(Pxx,1) Pxx = divide(Pxx, norm(windowVals)**2) freqs = Fs/NFFT*arange(0,numFreqs) return Pxx, freqs def csd(x, y, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ The cross spectral density Pxy by Welch's average periodogram method. The vectors x and y are divided into NFFT length segments.
Each segment is detrended by function detrend and windowed by function window. noverlap gives the length of the overlap between segments. The product of the direct FFTs of x and y is averaged over each segment to compute Pxy, with a scaling to correct for power loss due to windowing. Fs is the sampling frequency. NFFT must be a power of 2 Refs: Bendat & Piersol -- Random Data: Analysis and Measurement Procedures, John Wiley & Sons (1986) """ if NFFT % 2: raise ValueError, 'NFFT must be a power of 2' # zero pad x and y up to NFFT if they are shorter than NFFT if len(x)1: Pxy = mean(Pxy,1) Pxy = divide(Pxy, norm(windowVals)**2) freqs = Fs/NFFT*arange(0,numFreqs) return Pxy, freqs def cohere(x, y, NFFT=256, Fs=2, detrend=detrend_none, window=window_hanning, noverlap=0): """ cohere the coherence between x and y. Coherence is the normalized cross spectral density Cxy = |Pxy|^2/(Pxx*Pyy) The return value is (Cxy, f), where f are the frequencies of the coherence vector. See the docs for psd and csd for information about the function arguments NFFT, detrend, window, noverlap, as well as the methods used to compute Pxy, Pxx and Pyy. """ Pxx,f = psd(x, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Pyy,f = psd(y, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Pxy,f = csd(x, y, NFFT=NFFT, Fs=Fs, detrend=detrend, window=window, noverlap=noverlap) Cxy = divide(absolute(Pxy)**2, Pxx*Pyy) return Cxy, f def corrcoef(*args): """ corrcoef(X) where X is a matrix returns a matrix of correlation coefficients for each row of X. corrcoef(x,y) where x and y are vectors returns the matrix of correlation coefficients for x and y.
Numeric arrays can be real or complex The correlation matrix is defined from the covariance matrix C as r(i,j) = C[i,j] / sqrt(C[i,i]*C[j,j]) """ if len(args)==2: X = transpose(array([args[0]]+[args[1]])) elif len(args)==1: X = args[0] else: raise RuntimeError, 'Only expecting 1 or 2 arguments' C = cov(X) d = resize(diagonal(C), (2,1)) r = divide(C,sqrt(matrixmultiply(d,transpose(d))))[0,1] try: return r.real except AttributeError: return r ------------------------------------------------------------------- I wrote a little test code comparing the output with matlab's equivalent functions. Basically, I compute the psd or cohere in matlab and python and take the rms difference of the resultant vectors RMS cohere python/matlab difference 0.000854587104587 RMS psd python/matlab difference 0.00210783306638 I am not sure where these differences are arising, but they are quite small. I'm going to keep trying to track them down. For corrcoef, the answers are the same past 8 significant digits. Hope this helps! John Hunter From haase at msg.ucsf.edu Fri Jan 31 05:12:05 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Jan 31 05:12:05 2003 Subject: [Numpy-discussion] numarray 0.4 on osX/darwin Message-ID: <020a01c2c897$65bf2dc0$3b45da80@rodan> Hi everybody, I tried a 'python2.2 setup.py install' of numarray on a Mac running os-X (10.1; I have also Fink installed) It starts crunching until: /usr/bin/ld: Undefined symbols: _fclearexcept _fetestexcept Anyone out there, who uses numarray on osX ? I'm thankful for any pointer...
Sebastian Haase From jmiller at stsci.edu Fri Jan 31 07:31:01 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Jan 31 07:31:01 2003 Subject: [Numpy-discussion] numarray 0.4 on osX/darwin References: <020a01c2c897$65bf2dc0$3b45da80@rodan> Message-ID: <3E3A9628.3030704@stsci.edu> Sebastian Haase wrote: >Hi everybody, >I tried a 'python2.2 setup.py install' >of numarray on a Mac running os-X (10.1; I have also Fink installed) >It starts crunching until: >/usr/bin/ld: Undefined symbols: >_fclearexcept >_fetestexcept > >Anyone out there, who uses numarray on osX ? > >I'm thankful for any pointer... > >Sebastian Haase > > Hi Sebastian, I am very much a Mac-Amateur, but I have run numarray under osX by first installing a local UNIX version of Python using the source tarball. The steps were roughly as follows: 1. Obtain and unpack the Python source tarball in your home directory. cd there. 2. Configure Python using: ./configure --prefix=$HOME 3. Edit the Makefile for the following: 61c61 > LDFLAGS= --- < LDFLAGS= -framework System -framework CoreServices -framework Foundation This was the only (reasonable) way I could figure out how to tunnel link time options down through the distutils in the proper command line order. I'm not really sure this is a minimal set of frameworks, but it did at least work. 4. Build and install python: make ; make install 5. Obtain and unpack the numarray source tarball. cd there. 6. Build and install numarray: python setupall.py install 7. Put $HOME/bin on your PATH and rehash. Todd > > > >------------------------------------------------------- >This SF.NET email is sponsored by: >SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
>http://www.vasoftware.com >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From Chris.Barker at noaa.gov Fri Jan 31 12:44:02 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Jan 31 12:44:02 2003 Subject: [Numpy-discussion] fastest way to make two vectors into anarray References: Message-ID: <3E3ADC19.5566CB5A@noaa.gov> John Hunter wrote: > John> I have two equal length 1D arrays of 256-4096 complex or > John> floating point numbers which I need to put into a > John> shape=(len(x),2) array. > I tested all the suggested methods and the transpose with [x] and [y] > was the clear winner, with an 8 fold speed up over my original code. > The concatenate method was between 2-3 times faster. I was a little surprised by this, as I figured that the transpose method made an extra copy of the data (array() makes one copy, transpose() another). So I looked at the source for concatenate: def concatenate(a, axis=0): """concatenate(a, axis=0) joins the tuple of sequences in a into a single NumPy array. """ if axis == 0: return multiarray.concatenate(a) else: new_list = [] for m in a: new_list.append(swapaxes(m, axis, 0)) return swapaxes(multiarray.concatenate(new_list), axis, 0) So, if you are concatenating along anything other than the zero-th axis, you end up doing something similar to the transpose method. Seeing this, I tried something else: def test_concat2(x,y): x.shape = (1,-1) y.shape = (1,-1) X = transpose( concatenate( (x, y) ) ) x.shape = (-1,) y.shape = (-1,) This then uses the native concatenate, but requires an extra copy in the transpose.
Here's a somewhat cleaner version, though you get more copies: def test_concat3(x,y): "Thanks to Chris Barker and Bryan Cole" X = transpose( concatenate( ( reshape(x,(1,-1)), reshape(y,(1,-1)) ) ) ) Here are the test results: testing on vectors of length: 4096 test_concat 0.286280035973 test_transpose 0.100033998489 test_naive 0.805399060249 test_concat3 0.109319090843 test_concat2 0.136469960213 All the transpose methods are essentially a tie. Would it be that hard for concatenate to do its thing for any axis in C? It does seem like this is a fairly basic operation, and shouldn't require more than one copy. By the way, I realised that the transpose method had an extra call. transpose() can take an appropriate python sequence, so this works just fine: def test_transpose2(x,y): X = transpose([x]+[y]) However, it doesn't really save you the copy, as I'm pretty sure transpose makes a copy internally anyway. Test results: testing on vectors of length: 4096 test_transpose 0.104995965958 test_transpose2 0.103582024574 I think the winner is: X = transpose([x]+[y]) well, I learned a little bit more about Numeric today. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rob at hooft.net Fri Jan 31 13:36:03 2003 From: rob at hooft.net (Rob Hooft) Date: Fri Jan 31 13:36:03 2003 Subject: [Numpy-discussion] fastest way to make two vectors into anarray References: <3E3ADC19.5566CB5A@noaa.gov> Message-ID: <3E3AEC19.6020907@hooft.net> Chris Barker wrote: > > X = transpose([x]+[y]) > > > well, I learned a little bit more about Numeric today. > I've been skipping through a lot of messages today because I was getting behind on mailing list traffic, but I missed one thing in the discussion so far (sorry if it was marked already): transpose doesn't actually do any work.
Actually, transpose only sets the "strides" counts differently, and this is blazingly fast. What is NOT fast is using the transposed array later! The problem is that many routines actually require a contiguous array, and will make a temporary local contiguous copy. This may happen multiple times if the lifetime of the transposed array is long. Even routines that do not require a contiguous array and can actually use the strides may run significantly slower because the CPU cache is trashed a lot by the high strides. Moral: you can't test this code by looping 1000 times through it, you actually should take into account the time it takes to make a contiguous array immediately after the transpose call. Regards, Rob Hooft -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/
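Rob's point about strides can be illustrated with a toy model. The sketch below is plain Python and is not Numeric's implementation: `StridedView` and all of its methods are hypothetical names made up for this example. It only shows the idea that a transpose can swap shape and strides over a shared buffer, while producing a contiguous array afterwards costs a full copy.

```python
# Toy model of a strided 2-D array view. StridedView is a hypothetical
# illustration, not Numeric's (or any library's) real implementation.

class StridedView:
    def __init__(self, buf, shape, strides, offset=0):
        self.buf = buf          # flat storage, shared between views
        self.shape = shape      # (rows, cols)
        self.strides = strides  # steps in buf per row index / per column index
        self.offset = offset

    @classmethod
    def from_rows(cls, rows):
        """Copy nested lists into fresh row-major (contiguous) storage."""
        ncols = len(rows[0])
        buf = [value for row in rows for value in row]
        return cls(buf, (len(rows), ncols), (ncols, 1))

    def __getitem__(self, index):
        i, j = index
        return self.buf[self.offset + i * self.strides[0] + j * self.strides[1]]

    def transpose(self):
        """Constant time: swap shape and strides, share the same buffer."""
        return StridedView(self.buf, self.shape[::-1], self.strides[::-1],
                           self.offset)

    def is_contiguous(self):
        """True if the elements are laid out row-major with no gaps."""
        return self.strides == (self.shape[1], 1)

    def contiguous_copy(self):
        """Touches every element: the copy a routine may make internally."""
        rows, cols = self.shape
        return StridedView.from_rows(
            [[self[i, j] for j in range(cols)] for i in range(rows)])


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
stacked = StridedView.from_rows([x, y])  # shape (2, 3): one copy of the data
X = stacked.transpose()                  # shape (3, 2): no data is moved
C = X.contiguous_copy()                  # second copy, paid when it is needed

print(X.shape, X.strides, X.is_contiguous())  # (3, 2) (1, 3) False
print(C.shape, C.strides, C.is_contiguous())  # (3, 2) (2, 1) True
```

In this model a benchmark that stops after `transpose()` measures only the first copy. If the result is later handed to code that needs contiguous data, the cost of `contiguous_copy()` belongs in the timing as well, which is the adjustment Rob suggests.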