From falted at openlc.org Tue Feb 4 01:15:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Feb 4 01:15:02 2003
Subject: [Numpy-discussion] more than 2-D numarrays in recarray?
Message-ID: <200302041014.08462.falted@openlc.org>

Hi,

It seems that recarray doesn't support more than 1-D numarray arrays as fields. Is that a fundamental limitation? If not, do you plan to support arbitrary dimensions in the future?

Thanks,

-- Francesc Alted

From jmiller at stsci.edu Tue Feb 4 04:05:04 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Feb 4 04:05:04 2003
Subject: [Numpy-discussion] more than 2-D numarrays in recarray?
References: <200302041014.08462.falted@openlc.org>
Message-ID: <3E3FAFED.1050201@stsci.edu>

Francesc Alted wrote:
>Hi,
>
>It seems that recarray doesn't support more than 1-D numarray arrays as
>fields. Is that a fundamental limitation?
>
I don't think it is fundamental, merely a question of what is needed and works easily. I see two problems with multi-d numarray fields, both solvable:

1. Multidimensional numarrays must be described in the recarray spec.

2. Either numarray or recarray must be able to handle a (slightly) more complicated case of recomputing array strides from shape and (bytestride, record-length).

I didn't design or implement recarray so there may be other problems as well.

>If not, do you plan to support
>arbitrary dimensions in the future?
>
I don't think it's a priority now. What do you need them for?

>
>Thanks,
>

Regards, Todd

From falted at openlc.org Tue Feb 4 05:01:05 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Feb 4 05:01:05 2003
Subject: [Numpy-discussion] more than 2-D numarrays in recarray?
In-Reply-To: <3E3FAFED.1050201@stsci.edu>
References: <200302041014.08462.falted@openlc.org> <3E3FAFED.1050201@stsci.edu>
Message-ID: <200302041400.34961.falted@openlc.org>

On Tuesday 04 February 2003 13:19, Todd Miller wrote:
> I see two problems with multi-d numarray fields, both
> solvable:
>
> 1.
Multidimensional numarrays must be described in the recarray spec.
>
> 2. Either numarray or recarray must be able to handle a (slightly) more
> complicated case of recomputing array strides from shape and
> (bytestride,record-length).
>
> I didn't design or implement recarray so there may be other problems as
> well.

I had a look at the code and it seems like you are right.

> I don't think it's a priority now. What do you need them for?

Well, I've adopted the recarray object (actually a slightly modified version of it) to be a fundamental building block in the next release of PyTables. If arbitrary dimensionality were implemented, the resulting tables would be more general. Moreover, I'm thinking about implementing support for one unlimited array dimension (just one axis), and treating a degenerate recarray with just one column as a multidimensional numarray object would ease the implementation quite a lot.

Of course, I could implement my own recarray version with that support, but I just don't want to diverge so much from the reference implementation.

-- Francesc Alted

From perry at stsci.edu Tue Feb 4 07:41:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Feb 4 07:41:02 2003
Subject: [Numpy-discussion] more than 2-D numarrays in recarray?
In-Reply-To: <200302041400.34961.falted@openlc.org>
Message-ID:

> > A Dimarts 04 Febrer 2003 13:19, Todd Miller va escriure: > > I see two problems with multi-d numarray fields, both > > solvable: > > > > 1. Multidimensional numarrays must be described in the recarray spec. > > > > 2. Either numarray or recarray must be able to handle a (slightly) more > > complicated case of recomputing array strides from shape and > > (bytestride,record-length). > > > > I didn't design or implement recarray so there may be other problems as > > well. > > I had a look at the code and it seems like you are right. > > > I don't think it's a priority now. What do you need them for?
> > Well, I've adopted the recarray object (actually a slightly modified version > of it) to be a fundamental building block in next release of PyTables. If > arbitrary dimensionality were implemented, the resulting tables would be > more general. Moreover, I'm thinking about implementing unlimited > (just one > axis) array dimension support and having a degenerated recarray with just > one column as a multimensional numarray object would easy quite a lot the > implementation. > > Of course, I could implement my own recarray version with that > support, but > I just don't want to diverge so much from the reference implementation. > > -- > Francesc Alted >

As Todd says, the initial implementation was to support only 1-d cases. There is no fundamental reason why it shouldn't support the general case. We'd like to work with you on how that would best be implemented. Basically the issue is how we save the shape information for that field. I don't think it would be hard to implement.

Perry

From tim.hochberg at ieee.org Tue Feb 4 08:52:05 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Feb 4 08:52:05 2003
Subject: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
Message-ID: <3E3FEF8F.6000807@ieee.org>

I was inspired by Armin's latest Psyco version to try and see how well one could do with NumPy/NumArray implemented in Psycotic Python. I wrote a bare-bones, pure Python, Numeric array class loosely based on Jnumeric (which in turn was loosely based on Numeric). The buffer is just Python's array.array. At the moment, all that one can do to the arrays is add and index them, and the code is still a bit of a mess. I plan to clean things up over the next week in my copious free time <0.999 wink> and at that point it should be easy to add the remaining operations.

I benchmarked this code, which I'm calling Psymeric for the moment, against NumPy and Numarray to see how it did.
I used a variety of array sizes, but mostly relatively large arrays of shape (500,100) and of type Float64 and Int32 (mixed and with consistent types) as well as scalar values. Looking at the benchmark data one comes to three main conclusions:

* For small arrays NumPy always wins. Both Numarray and Psymeric have much larger overhead.
* For large, contiguous arrays, Numarray is about twice as fast as either of the other two.
* For large, noncontiguous arrays, Psymeric and NumPy are ~20% faster than Numarray.

The impressive thing is that Psymeric is generally slightly faster than NumPy when adding two arrays. It's slightly slower (~10%) when adding an array and a scalar, although I suspect that could be fixed by some special casing a la Numarray. Adding two (500,100) arrays of type Float64 together results in the following timings:

             psymeric    numpy       numarray
contiguous   0.0034 s    0.0038 s    0.0019 s
stride-2     0.0020 s    0.0023 s    0.0033 s

I'm not sure if this is important, but it is an impressive demonstration of Psyco! More later when I get the code a bit more cleaned up.

-tim

From falted at openlc.org Tue Feb 4 10:06:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Feb 4 10:06:03 2003
Subject: [Numpy-discussion] more than 2-D numarrays in recarray?
In-Reply-To:
References:
Message-ID: <200302041904.29807.falted@openlc.org>

On Tuesday 04 February 2003 16:40, Perry Greenfield wrote:
> We'd like to work with you about how that should be best implemented.
> Basically the issue is how we save the shape information for that field.
> I don't think it would be hard to implement.

Ok, great!
Well, my proposals for extended recarray syntax are:

1.- Extend the current formats to read something like:

    ['(1,)i1', '(3,4)i4', '(16,)a', '(2,3,4)i2']

Pros:
- It's the straightforward extension of the current format
- Should be easy to implement
- Note that the character codes have been replaced by a slightly more verbose version ('i2' instead of 's', for example)
- Short and simple
Cons:
- It is still string-code based
- Implicit field order

2.- Make use of the syntax I've been suggesting in past messages:

    class Particle(IsRecord):
        name     = Col(CharType, (16,), dflt="", pos=3)  # 16-character String
        ADCcount = Col(Int8, (1,), dflt=0, pos=1)        # signed byte
        TDCcount = Col(Int32, (3,4), dflt=0, pos=2)      # signed integer
        grid_i   = Col(Int16, (2,3,4), dflt=0, pos=4)    # signed short integer

Pros:
- It gets rid of charcodes or string codes
- The map between name and type is visually clear
- Explicit field order
- The columns can be defined as __slots__ in the class constructor, making it impossible to assign (through __setattr__, for example) values to non-existing columns
- It is elegant (IMO)
Cons:
- Requires more typing to define
- Not as concise as 1) (but a short representation can be made inside IsRecord!)
- Difficult to define dynamically

3.- Similar to 2), but with a dictionary like:

    Particle = {
        "name"     : Col(CharType, (16,), dflt="", pos=3),  # 16-character String
        "ADCcount" : Col(Int8, (1,), dflt=0, pos=1),        # signed byte
        "TDCcount" : Col(Int32, (3,4), dflt=0, pos=2),      # signed integer
        "grid_i"   : Col(Int16, (2,3,4), dflt=0, pos=4),    # signed short integer
    }

Pros:
- It gets rid of charcodes or string codes
- The map between name and type is visually clear
- Explicit field order
- Easy to build dynamically
Cons:
- No possibility to define __slots__
- Not as elegant as 2), but it looks fine
4.- List-based approach:

    Particle = [
        Col(Int8, (1,), dflt=0),        # signed byte
        Col(Int32, (3,4), dflt=0),      # signed integer
        Col(CharType, (16,), dflt=""),  # 16-character String
        Col(Int16, (2,3,4), dflt=0),    # signed short integer
    ]

Pros:
- Costs less to type (less verbose)
- Easy to build dynamically
Cons:
- Implicit field order
- Map between field names and contents not visually clear

Note: In the previous discussion explicit order was considered better than implicit, following the Python mantra, and although some people may think that this doesn't apply well here, I do (but, again, this is purely subjective).

Of course, a combination of two alternatives may be best. My current experience tells me that a combination of 2 and 3 could be very good. That way, a user can define recarrays as classes, but if he needs to define them dynamically, the recarray constructor can also accept a dictionary as in 3 (and, obviously, the same applies to case 4). In the end, the recarray instance should have a variable that points to this definition class, where the metadata is kept, but a shortcut in the form of 1) can also be constructed for convenience.

IMO, integrating options 2 and 3 (even 4) is not difficult to implement and, in fact, such a combination is already present in the PyTables CVS version. I might even provide a recarray version with 2 & 3 integrated for developers' evaluation.

Comments?

-- Francesc Alted

From perry at stsci.edu Wed Feb 5 07:06:08 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Wed Feb 5 07:06:08 2003
Subject: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
In-Reply-To: <3E3FEF8F.6000807@ieee.org>
Message-ID:

Tim Hochberg writes: > I was inspired by Armin's latest Psyco version to try and see how well > one could do with NumPy/NumArray implemented in Psycotic Python. I wrote > a bare bones, pure Python, Numeric array class loosely based on Jnumeric > (which in turn was loosely based on Numeric).
The buffer is just > Python's array.array. At the moment, all that one can do to the arrays > is add and index them and the code is still a bit of a mess. I plan to > clean things up over the next week in my copius free time <0.999 wink> > and at that point it should be easy add the remaining operations. > > I benchmarked this code, which I'm calling Psymeric for the moment, > against NumPy and Numarray to see how it did. I used a variety of array > sizes, but mostly relatively large arrays of shape (500,100) and of type > Float64 and Int32 (mixed and with consistent types) as well as scalar > values. Looking at the benchmark data one comes to three main conclusions: > * For small arrays NumPy always wins. Both Numarray and Psymeric have > much larger overhead. > * For large, contiguouse arrays, Numarray is about twice as fast as > either of the other two. > * For large, noncontiguous arrays, Psymeric and NumPy are ~20% faster > than Numarray > The impressive thing is that Psymeric is generally slightly faster than > NumPy when adding two arrays. It's slightly slower (~10%) when adding an > array and a scalar although I suspect that could be fixed by some > special casing a la Numarray. Adding two (500,100) arrays of type > Float64 together results in the following timings: > psymeric numpy numarray > contiguous 0.0034 s 0.0038 s 0.0019 s > stride-2 0.0020 s 0.0023 s 0.0033 s > > I'm not sure if this is important, but it is an impressive demonstration > of Psyco! More later when I get the code a bit more cleaned up. > > -tim > 0.002355 > > 0.002355 > The "psymeric" results are indeed interesting. However, I'd like to make some remarks about numarray benchmarks. At this stage, most of the focus has been on large, contiguous array performance (and as can be seen that is where numarray does best). There are a number of other improvements that can and will be made to numarray performance so some of the other benchmarks are bound to improve (how much is uncertain). 
For example, the current behavior with strided arrays results in looping over subblocks of the array, and that looping is done on relatively small blocks in Python. We haven't done any tuning yet to see what the optimum block size should be (it may be machine dependent as well), and it is likely that the loop will eventually be moved into C. Small array performance should improve quite a bit; we are looking into how to do that now and should have a better idea soon of whether we can beat Numeric's performance or not.

But the "psymeric" approach raises an obvious question (implied I guess, but not explicitly stated). With Psyco, is there a need for Numeric or numarray at all? I haven't thought this through in great detail, but at least one issue seems tough to address in this approach, and that is handling numeric types not supported by Python (e.g., Int8, Int16, UInt16, Float32, etc.). Are you offering the possibility of the "psymeric" approach as being the right way to go, and if so, how would you handle this issue?

On the other hand, there are lots of algorithms that cannot be handled well with array manipulations. It would seem that psyco would be a natural alternative in such cases (as long as one is content to use Float64 or Int32), but it isn't obvious that these require arrays as anything but data structures (e.g. places to obtain and store scalars).

Perry Greenfield

From tim.hochberg at ieee.org Wed Feb 5 08:54:05 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Wed Feb 5 08:54:05 2003
Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
In-Reply-To:
References:
Message-ID: <3E414181.6020302@ieee.org>

Perry Greenfield wrote: >The "psymeric" results are indeed interesting. However, I'd like to >make some remarks about numarray benchmarks. At this stage, most of >the focus has been on large, contiguous array performance (and as >can be seen that is where numarray does best).
There are a number >of other improvements that can and will be made to numarray performance >so some of the other benchmarks are bound to improve (how much is >uncertain). For example, the current behavior with strided arrays >results in looping over subblocks of the array, and that looping is >done on relatively small blocks in Python. We haven't done any tuning >yet to see what the optimum size of block should be (it may be machine >dependent as well), and it is likely that the loop will eventually be >moved into C. Small array performance should improve quite a bit, we >are looking into how to do that now and should have a better idea >soon of whether we can beat Numeric's performance or not. > >

I fully expect numarray to beat Numeric for large arrays eventually, just based on the fact that Psymeric tends to be slightly faster than Numeric now for many cases. However, for small arrays it seems that you're likely to be fighting the function call overhead of Python unless you go completely, or nearly completely, to C. But that would be a shame as it would make modifying/extending numarray that much harder.

>But "psymeric" approach raises an obvious question (implied I guess, but >not explicitly stated). With Psyco, is there a need for Numeric or >numarray at all? I haven't thought this through in great detail, but at >least one issue seems tough to address in this approach, and that is >handling numeric types not supported by Python (e.g., Int8, Int16 UInt16, >Float32, etc.). Are you offering the possiblity of the "pysmeric" >approach as being the right way to go, > >

I think there are too many open questions at this point for it to be a serious contender. It's interesting enough, and the payoff would be big enough, that I think it's worth throwing out some of the questions and seeing if anything interesting pops out.

> and if so, how would you handle >this issue? > >

The types issue may not be a problem.
Python's array.array supports a full set of types (http://www.python.org/doc/current/lib/module-array.html). However, psyco does not currently support fast operations on types 'f', 'I' and 'L'. I don't know if this is a technical problem, or something that's likely to be resolved in time. The 'f' (Float32) case is critical, the others less so. Armin, if you're reading this, perhaps you'd like to comment?

>On the other hand, there are lots of algorithms that cannot be handled
>well with array manipulations.
>
This is where the Psyco approach would shine. One occasionally runs into cases where some part of the computation just cannot be done naturally with array operations. A common case is the equivalent of this bit of C code: "A[i] = (C[i]

>It would seem that psyco would be a natural
>alternative in such cases (as long as one is content to use Float64 or
>Int32), but it isn't obivious that these require arrays as anything but
>data structures (e.g. places to obtain and store scalars).
>
That's not been my experience. When I've run into awkward cases like this it's been in situations where nearly all of my computations could be vectorized. Anyway, here are what I see as the issues with this type of approach:

* Types: I believe that this should not be a problem.
* Interfacing with C/Fortran: This seems necessary for any Numeric wannabe. It seems that it must be possible, but it may require a bit of C code, so it may not be possible to get completely away from C.
* Speed: It's not clear to me at this point whether psymeric would get any faster than it currently is. It's pretty fast now, but the factor of two difference between it and numarray for contiguous arrays (a common case) is nothing to sneeze at.
* Cross-platform: This is the real killer. Psyco only runs on x86 machines. I don't know if or when that's likely to change. Not being cross-platform seems to nix this from being a serious contender as a Numeric replacement for the time being.
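[The C one-liner above was truncated by the archiver; the pattern being described — an element-wise conditional update that vectorizes poorly — can be sketched in pure Python over array.array roughly as follows. The threshold t and the 0.0 fallback are assumptions for illustration, not the original snippet.]

```python
# Sketch of a loop that is awkward to express as whole-array operations:
# each output element depends on a per-element condition. The exact form
# of the original C one-liner is unknown; t and the 0.0 fallback are assumed.
from array import array

def conditional_update(c, b, t):
    """A[i] = B[i] if C[i] < t else 0.0, element by element."""
    a = array('d', [0.0] * len(c))
    for i in range(len(c)):
        a[i] = b[i] if c[i] < t else 0.0
    return a
```

This is exactly the kind of scalar loop that is slow in the plain interpreter but that Psyco can specialize into machine code.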
-tim

From falted at openlc.org Fri Feb 7 09:46:04 2003
From: falted at openlc.org (Francesc Alted)
Date: Fri Feb 7 09:46:04 2003
Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
In-Reply-To: <3E414181.6020302@ieee.org>
References: <3E414181.6020302@ieee.org>
Message-ID: <200302071843.09511.falted@openlc.org>

On Wednesday 05 February 2003 17:53, Tim Hochberg wrote:
> However, for small arrays it seems that you're likely to
> be fighting the function call overhead of Python unless you go
> completely, or nearly completely, to C. But that would be a shame as it
> would make modifying/extending numarray that much harder.

For this task it may be worth considering Pyrex (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/). From the website:

"""Pyrex lets you write code that mixes Python and C data types any way you want, and compiles it into a C extension for Python."""

i.e. if you have Python code and want to accelerate it, it's much easier to move it to Pyrex than to C, as Pyrex has Python syntax. In addition, it lets you call C routines very easily (just by declaring them) and provides transparent access to variables, functions and objects in the Python namespace.

Apart from the standard Python loop statement, Pyrex introduces a new kind of for-loop (in the form "for i from 0 <= i < n:") for iterating over ranges of integers at C speed, which can be very handy when optimizing many numarray loops.

Another advantage is that Pyrex compiles its own code to C, and you can distribute this C code with your package, so the final user of a Pyrex extension doesn't need to install Pyrex (it's just a kind of compiler).

I've been using it for more than six months and it's pretty stable and works very well (at least on UNIX machines; I don't have experience on Windows or OSX platforms).
Just my two cents,

-- Francesc Alted

From paul at pfdubois.com Fri Feb 7 09:59:08 2003
From: paul at pfdubois.com (Paul Dubois)
Date: Fri Feb 7 09:59:08 2003
Subject: [Numpy-discussion] Some bugs in Numeric fixed today in CVS
Message-ID: <000301c2ced2$70627570$6601a8c0@NICKLEBY>

[ 614808 ] Inconsistent use of tabs and spaces
Fixed as suggested by Jimmy Retzlaff: LinearAlgebra.py, Matrix.py, RNG/__init__.py, RNG/Statistics.py

[ 621032 ] needless work in multiarraymodule.c
Fixes suggested by Greg Smith applied. Also recoded OBJECT_DotProduct to eliminate a warning error.

[ 630584 ] generalized_inverse of complex array
Fix suggested by Greg Smith applied.

[ 652061 ] PyArray_As2D doesn't check pointer.
Fix suggested by Andrea Riciputi applied.

[ 655512 ] inverse_real_fft incorrect many sizes
Fix given by mbriest applied.

From Chris.Barker at noaa.gov Fri Feb 7 11:06:05 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Feb 7 11:06:05 2003
Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison
References: <3E414181.6020302@ieee.org> <200302071843.09511.falted@openlc.org>
Message-ID: <3E43FBFA.4B0C0FA1@noaa.gov>

Francesc Alted wrote:
> For this task may be is worth to consider using Pyrex
> (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/) for that. From the
> website:
>
> """Pyrex lets you write code that mixes Python and C data types any way you
> want, and compiles it into a C extension for Python."""

I've been keeping my eye on Pyrex for a while now, but have not yet had enough of a use for it to justify trying it out. I do have a question that I have not found the answer to on the web, which could make a big difference to how useful it is to me:

Is Pyrex aware of Numeric Arrays?

I imagine it could use them just fine, using the generic Python sequence get-item stuff, but that would be a whole lot lower performance than if it understood the Numeric API and could access the data array directly.
Also, how does it deal with multiple-dimension indexing ( array[3,6,2] ), which the standard Python sequence types do not support?

As I think about this, I think your suggestion is fabulous. Pyrex (or a Pyrex-like) language would be a fabulous way to write code for NumArray, if it really made use of the NumArray API.

Thanks for your input,

-Chris

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From paul at pfdubois.com Fri Feb 7 13:48:04 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Feb 7 13:48:04 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E43FBFA.4B0C0FA1@noaa.gov>
Message-ID: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY>

{ CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another language I used to use; Perl ? }

Perhaps knowledgeable persons could comment on the feasibility of coding MA (masked arrays) in straight Python and then using Psyco on it? The existing implementation is in pure Python and uses Numeric to represent the two arrays it holds (the data and sometimes a mask) in each object. A great deal of wasted motion is devoted to preparing Numeric arrays so as to avoid operations on masked elements. It could have been written a lot more simply if performance didn't dictate trying to leverage off Numeric. In straight Python one can imagine an add, for example, that was roughly:

    for k in range(len(a.data)):
        result.mask[k] = a.mask[k] or b.mask[k]
        result.data[k] = a.data[k] if result.mask[k] else a.data[k] + b.data[k]

(using the new if-expression PEP just to confuse the populace)

It seems to me that this might be competitive given the numbers someone posted before. Alas, I can't remember who was the original poster, but I'd guess they might have a good guess.
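[Paul's rough loop above can be made concrete; the following is an illustrative sketch only — MaskedPair and its attributes are hypothetical stand-ins mirroring the a.data / a.mask layout in his pseudocode, not MA's actual API.]

```python
# Hypothetical sketch of the masked add described above, over flat
# array.array buffers. MaskedPair is illustrative, not MA's real API.
from array import array

class MaskedPair:
    def __init__(self, data, mask):
        self.data = array('d', data)
        self.mask = array('b', mask)  # 1 = masked, 0 = valid

def masked_add(a, b):
    n = len(a.data)
    result = MaskedPair([0.0] * n, [0] * n)
    for k in range(n):
        result.mask[k] = a.mask[k] or b.mask[k]
        # masked slots just carry a's value; valid slots get the sum
        result.data[k] = a.data[k] if result.mask[k] else a.data[k] + b.data[k]
    return result
```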
> -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Chris Barker > Sent: Friday, February 07, 2003 10:34 AM > To: falted at openlc.org; numpy-discussion at lists.sourceforge.net > Subject: Re: [Psyco-devel] RE: [Numpy-discussion] Interesting > Psyco/Numeric/Numarray comparison > > > Francesc Alted wrote: > > > For this task may be is worth to consider using Pyrex > > (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/) for > that. From > > the > > website: > > > > """Pyrex lets you write code that mixes Python and C data types any > > way you want, and compiles it into a C extension for Python.""" > > I've been keeping my eye on Pyrex for a while now, but have > not yet had enough of a use for it to justify tryin git out. > I do ahve a question that I ahve not foudn the answer to on > the web, which could make a big difference to how useful it is to me: > > Is Pyrex aware of Numeric Arrays? > > I imagine it could use them just fine, using the generic > Python sequence get item stuff, but that would be a whole lot > lower performance than if it understood the Numeric API and > could access the data array directly. Also, how does it deal > with multiple dimension indexing ( array[3,6,2] ) which the > standard python sequence types do not support? > > As I think about this, I think your suggestion is fabulous. > Pyrex (or a > Pyrex-like) language would be a fabulous way to write code > for NumArray, if it really made use of the NumArray API. > > Thanks for your input, > > -Chris > > -- > Christopher Barker, Ph.D. > Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > ------------------------------------------------------- > This SF.NET email is sponsored by: > SourceForge Enterprise Edition + IBM + LinuxWorld = Something > 2 See! 
http://www.vasoftware.com > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion >

From Chris.Barker at noaa.gov Fri Feb 7 14:25:03 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Feb 7 14:25:03 2003
Subject: [Numpy-discussion] Psyco MA?
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY>
Message-ID: <3E442A77.413648CC@noaa.gov>

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Fri Feb 7 14:41:03 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Feb 7 14:41:03 2003
Subject: [Numpy-discussion] Psyco MA?
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY>
Message-ID: <3E442E5A.89334754@noaa.gov>

oops, sorry about the blank message.

Paul F Dubois wrote:
> { CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another

What the heck is the if-PEP ?

> Perhaps knowlegeable persons could comment on the feasibility of coding MA
> (masked arrays) in straight Python and then using Psyco on it?

Is there confusion between Psyco and Pyrex? Psyco runs regular old Python bytecode, and individually compiles little pieces of it as needed into machine code. As I understand it, this should make loops where the inner part is a pretty simple operation very fast. However, Psyco is pretty new, and I have no idea how robust and stable it is, but it is certainly not cross-platform. As it generates machine code, it needs to be carefully ported to each hardware platform, and it currently only works on x86.

Pyrex, on the other hand, is a "Python-like" language that is translated into C, and then the C is compiled. The C it generates is pretty darn platform independent, so it should be able to be used on all platforms.
In regard to your question about MA (and any other similar project): I think Psyco has the potential to be the next-generation Python VM, which will have much higher performance, and therefore greatly reduce the need to write extensions for the sake of performance. I suspect that it could do its best with large, multi-dimensional arrays of numbers if there were a Python-native object of such a type. Psyco, however, is not ready for general use on all platforms, so in the foreseeable future there is a need for other ways to get decent performance. My suggestion follows:

> It could have been written a lot simpler if performance didn't dictate
> trying to leverage off Numeric. In straight Python one can imagine an add,
> for example, that was roughly:
> for k in 0<= k < len(a.data):
> result.mask[k] = a.mask[k] or b.mask[k]
> result.data[k] = a.data[k] if result.mask[k] else a.data[k] +
> b.data[k]

This looks like it could be written in Pyrex. If Pyrex were suitably NumArray aware, then it could work great.

What this boils down to, in both the Pyrex and Psyco options, is that having a multi-dimensional homogeneous numeric data type that is "native" Python is a great idea! With Pyrex and/or Psyco, Numeric3 (NumArray2 ?) could be implemented by having only the smallest core in C, and the rest in Python (or Pyrex).

While the Psyco option is the rosy future of Python, Pyrex is here now, and maybe adapting it to handle NumArrays well would be easier than re-writing a bunch of NumArray in C.

-Chris

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From tchur at optushome.com.au Fri Feb 7 15:08:01 2003
From: tchur at optushome.com.au (Tim Churches)
Date: Fri Feb 7 15:08:01 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E442E5A.89334754@noaa.gov>
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov>
Message-ID: <1044659254.1290.128.camel@emilio>

On Sat, 2003-02-08 at 09:08, Chris Barker wrote:
> While the Psyco option is the rosy future of Python, Pyrex is here now,
> and maybe adopting it to handle NumArrays well would be easier than
> re-writing a bunch of NumArray in C.

Well, Psyco is already immediately useful for many problems on Intel platforms, but I take your point that its real future is as the next-generation VM for Python. However, I agree 100% about the potential for leveraging Pyrex in Numarray. Not just in Numarray, but around it, too. The Numarray team should open serious talks with Greg Ewing about Numarray-enabling Pyrex. And New Zealand is a very nice place to visit (seriously, not joking, even though I am an Australian [reference to the trans-Tasman rivalry between Australia and New Zealand there]).

Tim C

From tim.hochberg at ieee.org Fri Feb 7 15:09:04 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Fri Feb 7 15:09:04 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E442E5A.89334754@noaa.gov>
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov>
Message-ID: <3E443C83.7000209@ieee.org>

Chris Barker wrote:

>oops, sorry about the blank message.
>
>Paul F Dubois wrote:
>
>>{ CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another
>>
>>
>
>What the heck is the if-PEP ?
>
PEP 308. It's stirring up a bit of a ruckus on CLP as we speak.

>>Perhaps knowlegeable persons could comment on the feasibility of coding MA
>>(masked arrays) in straight Python and then using Psyco on it?
>>
>>
>
>Is there confusion between Psyco and Pyrex? Psyco runs regular old
>Python bytecode, and individually compiles little pieces of it as needed
>into machine code. AS I understand it, this should make loops where the
>inner part is a pretty simple operation very fast.
>However, Psyco is pretty new, and I have no idea how robust and stable it is,
>but it is certainly not cross platform. As it generates machine code, it needs
>to be carefully ported to each hardware platform, and it currently only
>works on x86.
>
Psyco seems fairly stable these days. However it's one of those things that probably needs to get a larger cabal of users to shake the bugs out of it. I still only use it to play around with because all things that I need speed from I end up doing in Numeric anyway.

>Pyrex, on the other hand, is a "Python-like" language that is translated
>into C, and then the C is compiled. It generates pretty darn platform
>independent C, so it should be able to be used on all platforms.
>
>In regard to your question about MA (and any other similar project): I
>think Psyco has the potential to be the next-generation Python VM, which
>will have much higher performance, and therefore greatly reduce the need
>to write extensions for the sake of performance. I suspect that it could
>do its best with large, multi-dimensional arrays of numbers if there is
>a Python native object of such a type. Psyco, however, is not ready for
>general use on all platforms, so in the foreseeable future, there is a
>need for other ways to get decent performance. My suggestion follows:
>
>>It could have been written a lot simpler if performance didn't dictate
>>trying to leverage off Numeric. In straight Python one can imagine an add,
>>for example, that was roughly:
>>    for k in range(len(a.data)):
>>        result.mask[k] = a.mask[k] or b.mask[k]
>>        result.data[k] = a.data[k] if result.mask[k] else a.data[k] + b.data[k]
>>
>This looks like it could be written in Pyrex. If Pyrex were suitably
>NumArray aware, then it could work great.
>
>What this boils down to, in both the Pyrex and Psyco options, is that
>having a multi-dimensional homogeneous numeric data type that is "Native"
>Python is a great idea! With Pyrex and/or Psyco, Numeric3 (NumArray2 ?)
>could be implemented by having only the smallest core in C, and then the
>rest in Python (or Pyrex)
>
For Psyco at least you don't need a multidimensional type. You can get good results with flat arrays, in particular array.array. The numbers I posted earlier showed comparable performance for Numeric and a multidimensional array type written all in Python and psycoized.

And since I suspect that I'm the mysterious person whose name Paul couldn't remember, let me say I suspect the MA would be faster in psycoized Python than what you're doing now, as long as a.data was an instance of array.array. However, there are at least three problems. Psyco doesn't fully support the floating point type ('f') right now (although it does support most of the various integral types in addition to 'd'). I assume that these masked arrays are multidimensional, so someone would have to build the basic multidimensional machinery around array.array to make them work. I have a good start on this, but I'm not sure when I'm going to have time to work on this more. The biggy though is that Psyco only works on x86 machines. What we really need to do is to clone Armin.

>While the Psyco option is the rosy future of Python, Pyrex is here now,
>and maybe adapting it to handle NumArrays well would be easier than
>re-writing a bunch of NumArray in C.
>
This sounds like you're conflating two different issues. The first issue is that Numarray is relatively slow for small arrays. Pyrex may indeed be an easier way to attack this, although I wouldn't know; I've only looked at it, not tried to use it. However, I think that this is something that can and should wait. Once use cases of numarray being _too_ slow for small arrays start piling up, then it will be time to attack the overhead. Premature optimization is the root of all evil and all that.

The second issue is how to deal with code that does not vectorize well.
However, isn't this what scipy.weave already does? Again, I haven't used weave, but as I understand it, it's another python-c bridge, but one that's more geared toward numerics stuff. -tim From list at jsaul.de Fri Feb 7 15:58:09 2003 From: list at jsaul.de (Joachim Saul) Date: Fri Feb 7 15:58:09 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <1044659254.1290.128.camel@emilio> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <1044659254.1290.128.camel@emilio> Message-ID: <20030207235736.GG842@jsaul.de> * Tim Churches [2003-02-08 00:07]: > However, I agree 100% about the potential for leveraging Pyrex in > Numarray. Not just in Numarray, but around it, too. The Numarray team > should open serious talks with Greg Ewing about Numarray-enabling Pyrex. What is it that needs to be "enabled"? Pyrex handles Numeric (see Pyrex FAQ), why should it not handle Numarray? AFAIK Pyrex contains no code to specifically support Numeric, and it should therefore be straightforward to use it with Numarray as well. Only drawback is currently lack of support for e.g. slicing operations in Pyrex. Cheers, Joachim From tchur at optushome.com.au Fri Feb 7 16:25:04 2003 From: tchur at optushome.com.au (Tim Churches) Date: Fri Feb 7 16:25:04 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <20030207235736.GG842@jsaul.de> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <1044659254.1290.128.camel@emilio> <20030207235736.GG842@jsaul.de> Message-ID: <1044663911.1266.180.camel@emilio> On Sat, 2003-02-08 at 10:57, Joachim Saul wrote: > * Tim Churches [2003-02-08 00:07]: > > However, I agree 100% about the potential for leveraging Pyrex in > > Numarray. Not just in Numarray, but around it, too. The Numarray team > > should open serious talks with Greg Ewing about Numarray-enabling Pyrex. > > What is it that needs to be "enabled"? Pyrex handles Numeric (see > Pyrex FAQ), why should it not handle Numarray? 
AFAIK Pyrex
> > contains no code to specifically support Numeric, and it should
> > therefore be straightforward to use it with Numarray as well.

Hmmm, maybe re-implementing MA in Pyrex is possible right now. Double hmmm....

Tim C

From paul at pfdubois.com Fri Feb 7 16:29:02 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Fri Feb 7 16:29:02 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E443C83.7000209@ieee.org>
Message-ID: <000801c2cf09$0cfabf10$6601a8c0@NICKLEBY>

Just to confirm the obvious, I don't know the difference between Psyco and Pyrex, and if I ever did, it is Friday night and I've lost it. Any two words that share two letters look the same to me right now.

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net
> [mailto:numpy-discussion-admin at lists.sourceforge.net] On
> Behalf Of Tim Hochberg
> Sent: Friday, February 07, 2003 3:09 PM
> To: Chris Barker
> Cc: Paul F Dubois; numpy-discussion at lists.sourceforge.net
> Subject: Re: [Numpy-discussion] Psyco MA?
>
> Chris Barker wrote:
>
> >oops, sorry about the blank message.
> >
> >Paul F Dubois wrote:
> >
> >>{ CC to GvR just to show why I'm +1 on the if-PEP. I liked this in
> >>another
> >>
> >
> >What the heck is the if-PEP ?
> >
> PEP 308. It's stirring up a bit of a ruckus on CLP as we speak.
>
> >>Perhaps knowledgeable persons could comment on the feasibility of
> >>coding MA (masked arrays) in straight Python and then using Psyco on
> >>it?
> >>
> >
> >Is there confusion between Psyco and Pyrex? Psyco runs regular old
> >Python bytecode, and individually compiles little pieces of it as
> >needed into machine code. As I understand it, this should make loops
> >where the inner part is a pretty simple operation very fast.
> >
> >However, Psyco is pretty new, and I have no idea how robust
> >and stable, but certainly not cross platform.
As it generates machine code, it
> >needs to be carefully ported to each hardware platform, and it
> >currently only works on x86.
> >
> Psyco seems fairly stable these days. However it's one of those things
> that probably needs to get a larger cabal of users to shake the bugs out
> of it. I still only use it to play around with because all things that I
> need speed from I end up doing in Numeric anyway.
>
> >Pyrex, on the other hand, is a "Python-like" language that is translated
> >into C, and then the C is compiled. It generates pretty darn platform
> >independent C, so it should be able to be used on all platforms.
> >
> >In regard to your question about MA (and any other similar project): I
> >think Psyco has the potential to be the next-generation Python VM,
> >which will have much higher performance, and therefore greatly reduce
> >the need to write extensions for the sake of performance. I suspect
> >that it could do its best with large, multi-dimensional arrays of
> >numbers if there is a Python native object of such a type. Psyco,
> >however, is not ready for general use on all platforms, so in the
> >foreseeable future, there is a need for other ways to get decent
> >performance. My suggestion follows:
> >
> >>It could have been written a lot simpler if performance didn't dictate
> >>trying to leverage off Numeric. In straight Python one can imagine an
> >>add, for example, that was roughly:
> >>    for k in range(len(a.data)):
> >>        result.mask[k] = a.mask[k] or b.mask[k]
> >>        result.data[k] = a.data[k] if result.mask[k] else a.data[k] + b.data[k]
> >>
> >This looks like it could be written in Pyrex. If Pyrex were suitably
> >NumArray aware, then it could work great.
> >
> >What this boils down to, in both the Pyrex and Psyco options, is that
> >having a multi-dimensional homogeneous numeric data type that is
> >"Native" Python is a great idea! With Pyrex and/or Psyco, Numeric3
> >(NumArray2 ?)
could be implemented by having only the
> >smallest core in C, and then the rest in Python (or Pyrex)
> >
> For Psyco at least you don't need a multidimensional type. You can get
> good results with flat arrays, in particular array.array. The numbers I
> posted earlier showed comparable performance for Numeric and a
> multidimensional array type written all in python and psycoized.
>
> And since I suspect that I'm the mysterious person whose name Paul
> couldn't remember, let me say I suspect the MA would be faster in
> psycoized python than what you're doing now as long as a.data was an
> instance of array.array. However, there are at least three problems.
> Psyco doesn't fully support the floating point type ('f') right now
> (although it does support most of the various integral types in
> addition to 'd'). I assume that these masked arrays are
> multidimensional, so someone would have to build the basic
> multidimensional machinery around array.array to make them work. I have
> a good start on this, but I'm not sure when I'm going to have time to
> work on this more. The biggy though is that Psyco only works on x86
> machines. What we really need to do is to clone Armin.
>
> >While the Psyco option is the rosy future of Python, Pyrex is here now,
> >and maybe adapting it to handle NumArrays well would be easier than
> >re-writing a bunch of NumArray in C.
> >
> This sounds like you're conflating two different issues. The first issue
> is that Numarray is relatively slow for small arrays. Pyrex may indeed
> be an easier way to attack this although I wouldn't know; I've only
> looked at it, not tried to use it. However, I think that this is
> something that can and should wait. Once use cases of numarray being
> _too_ slow for small arrays start piling up, then it will be time to
> attack the overhead. Premature optimization is the root of all evil and
> all that.
>
> The second issue is how to deal with code that does not vectorize well.
> Here Pyrex again might help if it were made Numarray aware. However,
> isn't this what scipy.weave already does? Again, I haven't used weave,
> but as I understand it, it's another Python-C bridge, but one that's
> more geared toward numerics stuff.
>
> -tim
>
> -------------------------------------------------------
> This SF.NET email is sponsored by:
> SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
> http://www.vasoftware.com
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

From Chris.Barker at noaa.gov Fri Feb 7 17:10:15 2003
From: Chris.Barker at noaa.gov (Chris Barker)
Date: Fri Feb 7 17:10:15 2003
Subject: [Numpy-discussion] Psyco MA?
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <3E443C83.7000209@ieee.org>
Message-ID: <3E445136.4791C978@noaa.gov>

Tim Hochberg wrote:
> Psyco seems fairly stable these days. However it's one of those things
> that probably needs to get a larger cabal of users to shake the bugs out
> of it. I still only use it to play around with because all things that I
> need speed from I end up doing in Numeric anyway.

Hmmm. It always just seemed too bleeding edge for me to want to drop it in in place of my current Python, but maybe I should try...

> For Psyco at least you don't need a multidimensional type. You can get
> good results with flat arrays, in particular array.array. The numbers I
> posted earlier showed comparable performance for Numeric and a
> multidimensional array type written all in python and psycoized.

What about non-contiguous arrays? Also, you pointed out yourself that you are still looking at a factor of two slowdown; it would be nice to get rid of that.
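(One way non-contiguous arrays can sit on top of a flat buffer is with explicit strides. The following is a hypothetical sketch, not psymeric's actual code; the class name and methods are made up for illustration.)

```python
from array import array

class View2D:
    """A 2-D strided view over a flat array.array buffer.

    Row-major element strides let one buffer back both contiguous
    and non-contiguous (e.g. sliced) views without copying.
    """
    def __init__(self, data, shape, strides, offset=0):
        self.data = data          # flat array.array buffer
        self.shape = shape        # (rows, cols)
        self.strides = strides    # element (not byte) strides
        self.offset = offset

    def __getitem__(self, idx):
        i, j = idx
        return self.data[self.offset + i * self.strides[0] + j * self.strides[1]]

    def col_slice(self, step):
        """Every step-th column: a non-contiguous view, no copy."""
        rows, cols = self.shape
        return View2D(self.data, (rows, (cols + step - 1) // step),
                      (self.strides[0], self.strides[1] * step), self.offset)

buf = array('d', range(12))       # flat buffer holding a 3x4 array
a = View2D(buf, (3, 4), (4, 1))   # contiguous view
b = a.col_slice(2)                # columns 0 and 2 only, same buffer
print(a[1, 2], b[2, 1])           # 6.0 10.0
```

Both views index into the same storage, so "building non-contiguous on top of contiguous" is just stride arithmetic.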
> >While the Psyco option is the rosy future of Python, Pyrex is here now,
> >and maybe adapting it to handle NumArrays well would be easier than
> >re-writing a bunch of NumArray in C.
> >
> This sounds like you're conflating two different issues. The first issue
> is that Numarray is relatively slow for small arrays. Pyrex may indeed
> be an easier way to attack this although I wouldn't know; I've only
> looked at it, not tried to use it. However, I think that this is
> something that can and should wait. Once use cases of numarray being
> _too_ slow for small arrays start piling up, then it will be time to
> attack the overhead. Premature optimization is the root of all evil and
> all that.

Quite true. I know I have a lot of use cases where I use a LOT of small arrays. That doesn't mean that performance is a huge problem; we'll see.

I'm talking about other things as well, however. There are a lot of functions in the current Numeric that are written in a combination of Python and C. Mostly they are written using the lower level Numeric functions. This includes concatenate, chop, etc. etc. While speeding up any individual one of those won't make much difference, speeding them all up might. If it were much easier to get C-speed functions like this, we'd have a higher performance package all around. I've personally re-written byteswap() and chop(). In this case, not to get them faster, but to get them to use less memory. It would be great if we could do them all.

> The second issue is how to deal with code that does not vectorize well.
> Here Pyrex again might help if it were made Numarray aware. However,
> isn't this what scipy.weave already does? Again, I haven't used weave,
> but as I understand it, it's another Python-C bridge, but one that's
> more geared toward numerics stuff.

Weave is another project that's on my list to check out, so I don't know why one would choose one over the other.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From list at jsaul.de Sat Feb 8 02:56:01 2003 From: list at jsaul.de (Joachim Saul) Date: Sat Feb 8 02:56:01 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <1044663911.1266.180.camel@emilio> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <1044659254.1290.128.camel@emilio> <20030207235736.GG842@jsaul.de> <1044663911.1266.180.camel@emilio> Message-ID: <20030208105418.GA842@jsaul.de> * Tim Churches [2003-02-08 01:25]: > On Sat, 2003-02-08 at 10:57, Joachim Saul wrote: > > * Tim Churches [2003-02-08 00:07]: > > > However, I agree 100% about the potential for leveraging Pyrex in > > > Numarray. Not just in Numarray, but around it, too. The Numarray team > > > should open serious talks with Greg Ewing about Numarray-enabling Pyrex. > > > > What is it that needs to be "enabled"? Pyrex handles Numeric (see > > Pyrex FAQ), why should it not handle Numarray? AFAIK Pyrex > > contains no code to specifically support Numeric, and it should > > therefore be straightforward to use it with Numarray as well. > > Hmmm, maybe re-implementing MA in Pyrex is possible right now. Double > hmmm.... Please check out the Pyrex doc. It's actually very easy right now, *if* you can live without "sequence operators" such as slicing, list comprehensions... but this is going to be supported, again according to the doc. 
Here is an excerpt from an extension module that I have built using Pyrex and Numeric, following the instructions in the Pyrex FAQ:

    cdef extern int decomp(int, double*, double*, double, double, double)

    cdef extern from "Numeric/arrayobject.h":
        struct PyArray_Descr:
            int type_num, elsize
            char type
        ctypedef class PyArrayObject [type PyArray_Type]:
            cdef char *data
            cdef int nd
            cdef int *dimensions, *strides
            cdef PyArray_Descr *descr
        object PyArray_FromDims(int, int*, int)
        void import_array()

    def _decomp(PyArrayObject z_arr, PyArrayObject r_arr,
                double p, double vs, double sigma):
        cdef double *z, *r
        cdef int n
        n = z_arr.dimensions[0]
        z, r = z_arr.data, r_arr.data
        decomp(n, z, r, p, vs, sigma)

This is rather crude code that doesn't check for the type of the arrays or their dimensions, but it does what I want right now, and if I find the time I'll certainly make it more general. Those checks are actually performed in yet another Python layer.

As you can see, the above looks like "strongly typed" Python. From a C programmer's perspective, I find this extremely cool. If one leaves the type out, then the argument can be any Python object.

What I like about Pyrex is that you can mix Python and C calls at your convenience. For example, I may call (C-like)

    arr = PyArray_FromDims(1, &n, PyArray_DOUBLE)

but could also have used a corresponding Python construct like

    from Numeric import zeros
    arr = zeros(n, 'd')

I expect the latter to be slower (not tested), but one can take Python code "as is" and "compile" it using Pyrex. This already increases performance, and one can then conveniently replace as much Python code as needed with the corresponding C functions, which (presumably) will again speed up the code significantly. The bottlenecks are finally moved to external C files and treated like a C library.
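(The "yet another Python layer" that performs the checks might look roughly like the sketch below. This is a hypothetical illustration: `_decomp` here is a pure-Python stand-in for the compiled Pyrex function, and it just scales `z` by `p` so the example runs on its own.)

```python
from array import array

def _decomp(z, r, p, vs, sigma):
    # Stand-in for the compiled routine; scales z by p for illustration.
    return array('d', (zi * p for zi in z))

def decomp(z, r, p, vs, sigma):
    """Thin checking layer over the fast, unchecked routine.

    The compiled function assumes 1-D double arrays of equal length,
    so the Python layer validates that before dispatching.
    """
    for name, arr in (("z", z), ("r", r)):
        if not isinstance(arr, array) or arr.typecode != 'd':
            raise TypeError("%s must be an array.array of type 'd'" % name)
    if len(z) != len(r):
        raise ValueError("z and r must have the same length")
    return _decomp(z, r, p, vs, sigma)

z = array('d', [1.0, 2.0, 3.0])
r = array('d', [0.5, 0.5, 0.5])
print(list(decomp(z, r, 2.0, 1.0, 0.25)))   # [2.0, 4.0, 6.0]
```

Keeping the validation in plain Python means the typed extension stays small, which is the division of labor described above.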
Cheers,
Joachim

From perry at stsci.edu Sat Feb 8 13:52:02 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Sat Feb 8 13:52:02 2003
Subject: [Numpy-discussion] Some observations or questions about psyco & pyrex
In-Reply-To: <20030208105418.GA842@jsaul.de>
Message-ID:

Both Psyco and Pyrex have some great aspects. But I think it is worth a little reflection on what can and can't be expected of them. I'm basically ignorant of both; I know a little about them, but haven't used them. If anything I say is wrong, please correct me. I'm going to make some comments based on inferred characteristics of them that could well be wrong.

Psyco is very cool and seems the answer to many dreams. But consider the cost. From what I can infer, it obtains its performance enhancements at least in part by constructing machine code on the fly from the Python code. In other words, it is performing aspects of targeting particular processors that are usually relegated to C compilers by Python.

I'd guess that the price is the far greater difficulty of maintaining such capability across many processor types. It also likely increases the complexity of the implementation of Python, perhaps making it much harder to change and enhance. Even without it handling things that are needed for array processing, how likely is it that it will be accepted as the standard implementation of Python, for these reasons alone?

I also am inclined to believe that adding support for array processing to a Psyco implementation is a significant undertaking. There are at least two issues that would have to be addressed: handling all the numeric types and exception handling behavior. Then there are aspects important to us that include handling byteswapped or non-aligned data. While having the Python VM handle the efficiency aspects of arrays simplifies aspects of their implementation as compared to the current implementations of Numeric and numarray, it doesn't eliminate the need to replicate much of it.
Having to deal with the implementation for several different processors is likely to outweigh any savings in the implementation. But maybe I misjudge.

Pyrex's goals are more realistic, I believe. But unless I'm mistaken, Pyrex cannot be a solution to the problems that Numeric and numarray solve. Writing something for Pyrex means committing to certain types. It's great for writing something that you would have written as a C extension, but it doesn't really solve the problem of implementing Ufuncs that can handle many different types of arrays, and particularly combinations of types. But perhaps I misunderstand here as well. It certainly would be nice if it could handle some of the aspects of the Numeric/numarray API automatically.

So I doubt that either really is a solution for masked arrays in general.

Perry

From jae at zhar.net Sat Feb 8 15:11:01 2003
From: jae at zhar.net (John Eikenberry)
Date: Sat Feb 8 15:11:01 2003
Subject: [Numpy-discussion] Some observations or questions about psyco & pyrex
In-Reply-To:
References: <20030208105418.GA842@jsaul.de>
Message-ID: <20030208230912.GA3764@kosh.zhar.net>

Perry Greenfield wrote:
> Both psyco and pyrex have some great aspects. But I think
> it is worth a little reflection on what can and can't be
> expected of them. I'm basically ignorant of both; I know
> a little about them, but haven't used them. If anything I
> say is wrong, please correct me. I'm going to make some
> comments based on inferred characteristics of them that
> could well be wrong.

I'd like to suggest to anyone interested in these ideas that they take a look at the pypython/minimal-python mailing list: http://codespeak.net/mailman/listinfo/pypy-dev

> Psyco is very cool and seems the answer to many dreams.
> But consider the cost. From what I can infer, it obtains
> its performance enhancements at least in part by constructing
> machine code on the fly from the Python code.
In other
> words it is performing aspects of running on particular
> processors that are usually relegated to C compilers by
> Python.
>
> I'd guess that the price is the far greater difficulty of
> maintaining such capability across many processor types.
> It also likely increases the complexity of the implementation
> of Python, perhaps making it much harder to change and
> enhance. Even without it handling things that are needed
> for array processing, how likely is it that it will be
> accepted as the standard implementation for Python for
> these reasons alone.

The hope is that quite the opposite of just about every one of these points will be true: that once Python is reimplemented in Python, with Psyco as a backend JIT-like compiler, it will decrease the complexity of the implementation, making it much easier to change and enhance.

I tend to be quite optimistic about the potential for pypython and Psyco. I think the added work of the platform-dependent Psyco modules will be offset by the rest of the system being written in Python.

--
John Eikenberry [jae at zhar.net - http://zhar.net]
______________________________________________________________
"Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away." -- Antoine de Saint-Exupery

From tim.hochberg at ieee.org Mon Feb 10 08:53:04 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Mon Feb 10 08:53:04 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E445136.4791C978@noaa.gov>
References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <3E443C83.7000209@ieee.org> <3E445136.4791C978@noaa.gov>
Message-ID: <3E47D8AC.4070108@ieee.org>

Chris Barker wrote:

>Tim Hochberg wrote:
>
>>Psyco seems fairly stable these days. However it's one of those things
>>that probably needs to get a larger cabal of users to shake the bugs out
>>of it.
>>I still only use it to play around with because all things that I
>>need speed from I end up doing in Numeric anyway.
>
>Hmmm. It always just seemed too bleeding edge for me to want to drop it
>in in place of my current Python, but maybe I should try...
>
I think Psyco was a reworked interpreter at some point, but it isn't any longer. Now it's just an extension module. You typically use it like this:

    def some_function_that_needs_to_be_fast(...):
        ....
    psyco.bind(some_function_that_needs_to_be_fast)

Of course, it's still possible to bomb the interpreter with Psyco, and it's a huge memory hog if you bind a lot of functions. On the other hand, in the course of playing with psymeric I found one way to crash the interpreter with Psyco, one way with Numeric, and one way to cause Numarray to fail, although this did not crash the interpreter. So if I was keeping a tally of evil bugs, they'd all be tied right now....

>>For Psyco at least you don't need a multidimensional type. You can get
>>good results with flat arrays, in particular array.array. The numbers I
>>posted earlier showed comparable performance for Numeric and a
>>multidimensional array type written all in python and psycoized.
>
>What about non-contiguous arrays? Also, you pointed out yourself that
>you are still looking at a factor of two slowdown, it would be nice to
>get rid of that.
>
Non-contiguous arrays are easy to build on top of contiguous arrays; psymeric works with noncontiguous arrays now. If you'd like, I can send you some code.

The factor of two slowdown is an issue. A bigger issue is that only x86 platforms are supported. Also there is no support for things like byteswapped and nonaligned arrays. There also might be problems getting the exception handling right. If this approach were to be done "right" for heavy duty number cruncher types, it would require a more capable, C-based core buffer object, with most other things written in Python and psycoized.
This begins to sound a lot like what you would get if you put a lot of psyco.bind calls into the Python parts of Numarray now. On the other hand, it's possible some interesting stuff will come out of the PyPy project that will make this thing possible in pure Python. I'm watching that project with interest.

I did some more tuning of the Psymeric code to reduce overhead, and this is what the speed situation is now. This is complicated to compare, since the relative speeds depend on both the array type and shapes, but one can get a general feel for things by looking at two things: the overhead, that is, the time it takes to operate on very small arrays, and the asymptotic time/element for large arrays. These numbers differ substantially for contiguous and noncontiguous arrays, but their relative values are fairly constant across types. That gives four numbers:

                Overhead (c)   Overhead (nc)   TimePerElement (c)   TimePerElement (nc)
    NumPy          10 us           10 us             85 ps                 95 ps
    NumArray      200 us          530 us             45 ps                135 ps
    Psymeric       50 us           65 us             80 ps                 80 ps

The times shown above are for Float64s and are pretty approximate, and they happen to be for a particularly favorable array shape for Psymeric. I have seen psymeric as much as 50% slower than NumPy for large arrays of certain shapes. The overhead for NumArray is surprisingly large. After doing this experiment I'm certainly more sympathetic to Konrad wanting less overhead for NumArray before he adopts it.

-tim

From magnus at hetland.org Mon Feb 10 12:38:04 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Feb 10 12:38:04 2003
Subject: [Numpy-discussion] Plain array performance
Message-ID: <20030210203736.GB13673@idi.ntnu.no>

Just curious: What is the main strength of the array module in the standard library? Is it footprint/memory usage? Is it speed? If so, at what sizes?
I ran some simple benchmarks (creating a list/array, iterating over them to sum up their elements, and extracting the slice foo[::2]) and got the following ratios (array_time/list_time) for various sizes:

    Size 100:      Creation: 1.13482142857  Sum: 1.54649265905   Slicing: 1.53736654804
    Size 1000:     Creation: 1.62444133147  Sum: 1.18439932835   Slicing: 1.56350184957
    Size 10000:    Creation: 1.61642712328  Sum: 1.47768567821   Slicing: 1.45889354599
    Size 100000:   Creation: 1.72711084285  Sum: 0.952593142445  Slicing: 1.05782341361
    Size 1000000:  Creation: 1.56617139425  Sum: 0.735687066032  Slicing: 0.773219364465
    Size 10000000: Creation: 1.57903195174  Sum: 0.727253180418  Slicing: 0.726005428022

These benchmarks are pretty naïve, but it seems to me that unless you're working with quite large arrays, there is no great advantage to using arrays rather than lists... (I'm not including numarray or Numeric in the equation here -- I just raise the issue because of the use of arrays in Psymeric...)

Just curious...

--
Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From magnus at hetland.org Mon Feb 10 13:03:05 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Feb 10 13:03:05 2003
Subject: [Numpy-discussion] Plain array performance
In-Reply-To: <3E481165.1080608@ieee.org>
References: <20030210203736.GB13673@idi.ntnu.no> <3E481165.1080608@ieee.org>
Message-ID: <20030210210156.GA16423@idi.ntnu.no>

Tim Hochberg :
> [snip]

In my continued quest, I found this: http://www.penguin.it/pipermail/python/2002-October/001917.html

It sums up (in Italian, though) the great memory advantage of arrays. (Might be a good idea to be explicit about this in the docs, perhaps... Hm.)

> The reasons I'm using arrays in psymeric are twofold. One is memory
> usage.

Right.

> The other reason is that Psyco likes arrays
> (http://arigo.tunes.org/psyco-preview/psycoguide/node26.html).

I sort of thought that might be a reason...
:)

> In fact it was this note "The speed of a complex algorithm using an
> array as buffer (like manipulating an image pixel-by-pixel) should
> be very high; closer to C than plain Python." that led me to start
> playing around with psymeric.

I see.

> Just for grins I disabled psyco and reran some tests on psymeric.
> Instead of comparable speed to NumPy, the speed drops to about 25x
> slower.

Yikes!

> I actually would have expected it to be worse, but the drop off is
> still pretty steep.

Indeed... Hm... If only we could have Psyco for non-x86 platforms... Oh, well. I guess we will, some day. :)

> -tim

--
Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From paul at pfdubois.com Mon Feb 10 13:40:09 2003
From: paul at pfdubois.com (Paul F Dubois)
Date: Mon Feb 10 13:40:09 2003
Subject: [Numpy-discussion] Plain array performance
In-Reply-To: <20030210203736.GB13673@idi.ntnu.no>
Message-ID: <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY>

The problem with naïve benchmarks is that they *are* naïve. In real applications you have a lot of arrays running around, and so a full cache shows up with smaller array sizes. Because of this, measuring performance is a really difficult matter.

From magnus at hetland.org Mon Feb 10 13:43:07 2003
From: magnus at hetland.org (Magnus Lie Hetland)
Date: Mon Feb 10 13:43:07 2003
Subject: [Numpy-discussion] Plain array performance
In-Reply-To: <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY>
References: <20030210203736.GB13673@idi.ntnu.no> <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY>
Message-ID: <20030210214214.GA20750@idi.ntnu.no>

Paul F Dubois :
> The problem with naïve benchmarks is that they *are* naïve.

Indeed. My request was for a more dependable analysis.

> In real applications you have a lot of arrays running around, and so
> a full cache shows up with smaller array sizes. Because of this,
> measuring performance is a really difficult matter.

Indeed.
I guess what I'm curious about is the motivation behind the array module... It seems to be mainly conserving memory -- or? -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From magnus at hetland.org Mon Feb 10 13:46:03 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 13:46:03 2003 Subject: [Numpy-discussion] array()? (Bug?) Message-ID: <20030210214544.GA21118@idi.ntnu.no> Is this a bug, or is there a motivation behind it?

>>> from numarray import array
>>> array()
>>>

IOW: Why is array callable without any arguments when it doesn't return anything? E.g. if I call array(**kwds) with some dictionary, I'd expect an exception (since a default array isn't really possible) if kwds were empty... Or? (I'm using 0.4 -- for some reason I can't get the cvs version to compile on Solaris.) -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From jmiller at stsci.edu Mon Feb 10 14:30:09 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Feb 10 14:30:09 2003 Subject: [Numpy-discussion] array()? (Bug?) References: <20030210214544.GA21118@idi.ntnu.no> Message-ID: <3E48277E.3030401@stsci.edu> Magnus Lie Hetland wrote: >Is this a bug, or is there a motivation behind it? > >>>>from numarray import array >>>>array() >>>> > >IOW: Why is array callable without any arguments when it doesn't >return anything? E.g. if I call array(**kwds) with some dictionary, >I'd expect an exception (since a default array isn't really possible) >if kwds were empty... Or? > >(I'm using 0.4 -- for some reason I can't get the cvs version to >compile on Solaris.) > It looks like a bug which resulted from Numeric compatibility additions. For backwards compatibility with Numeric, I added the "sequence" keyword as a synonym for the numarray "buffer" keyword. We're in the process of getting rid of (deprecating) "buffer".
When it's gone (a couple releases), we can remove the default parameter to sequence and the bug. Todd

From magnus at hetland.org Mon Feb 10 15:35:03 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 15:35:03 2003 Subject: [Numpy-discussion] array()? (Bug?) In-Reply-To: <3E48277E.3030401@stsci.edu> References: <20030210214544.GA21118@idi.ntnu.no> <3E48277E.3030401@stsci.edu> Message-ID: <20030210233422.GA321@idi.ntnu.no> Todd Miller : > [snip] > It looks like a bug which resulted from Numeric compatibility additions. > For backwards compatibility with Numeric, I added the "sequence" > keyword as a synonym for the numarray "buffer" keyword. We're in the > process of getting rid of (deprecating) "buffer". When it's gone (a > couple releases), we can remove the default parameter to sequence and > the bug. OK -- but even until then, wouldn't it be possible to add a simple check for whether any arguments have been supplied? (Not a big priority, I guess :) > Todd -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From magnus at hetland.org Mon Feb 10 19:38:05 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 19:38:05 2003 Subject: [Numpy-discussion] average(), again(?) Message-ID: <20030211033702.GA17429@idi.ntnu.no> I think perhaps I've asked this before -- but is there any reason why the average() function from MA can't be copied (without the mask stuff) to numarray? Maybe it's too trivial (unlike in the masked case)...? It just seems like a generally useful function to have... -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones

From falted at openlc.org Mon Feb 10 23:24:01 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Feb 10 23:24:01 2003 Subject: [Numpy-discussion] Psyco MA? Message-ID: <200302102054.33064.falted@openlc.org> On Saturday 08 February 2003 11:54, Joachim Saul wrote: > Please check out the Pyrex doc.
It's actually very easy right now, > *if* you can live without "sequence operators" such as slicing, > list comprehensions... but this is going to be supported, again > according to the doc. Why are you saying that slicing is not supported? I've checked slices (as Python expressions, of course) and they work well. Maybe you are referring to cdef'd C-typed arrays in Pyrex? I think that could be a dangerous thing, because it can make the pointer arithmetic slow down due to the additional checks required on the slice range. > For example, I may call (C-like) > > arr = PyArray_FromDims(1, &n, PyArray_DOUBLE) > > but could have also used a corresponding Python construct like > > from Numeric import zeros > arr = zeros(n, 'd') > > I expect the latter to be slower (not tested), but one can take Python code "as is" and "compile" it using Pyrex. I was curious about that and tested it on my Pentium 4 @ 2 GHz laptop for small n (just to look for overhead). The C-like call takes 26 us and the Python-like one takes 52 us. Generally speaking, you can expect an overhead of about 20 us (a bit more as you pass more parameters) when calling Python functions (or Python-like functions inside Pyrex) from Pyrex, compared to using the C API to call the corresponding C function. In fact, calling a C function (or a cdef Pyrex function) from Pyrex takes no more time than calling from C to C: on my laptop, both come in at 0.5 us. The fact that calling C functions from Pyrex has no significant overhead (compared with calls from C to C), plus the fact that Pyrex offers a C integer loop, makes Pyrex very appealing for linear algebra optimizations, not only as a "glue" language. Another advantage is that with Pyrex you can define classes with a mix of C-type and Python-type attributes. This can be very handy for obtaining a compact representation of objects (whenever you do not need to access the C-typed ones from Python; but anyway, you can always use accessors if needed).
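[Editor's note: the "compact objects" point has a rough pure-Python analogue in `__slots__` -- attributes live in fixed slots instead of a per-instance dictionary, much as Pyrex cdef attributes live in a C struct. This is only an analogy for readers without Pyrex, not Pyrex code.]

```python
import sys

class Plain:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Compact:
    __slots__ = ('x', 'y')  # no per-instance __dict__ is allocated
    def __init__(self, x, y):
        self.x, self.y = x, y

p, c = Plain(1.0, 2.0), Compact(1.0, 2.0)
# The plain instance carries a whole dictionary; the slotted one does not.
print(hasattr(p, '__dict__'), hasattr(c, '__dict__'))  # True False
print(sys.getsizeof(p) + sys.getsizeof(p.__dict__), ">", sys.getsizeof(c))
```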
Cheers, -- Francesc Alted

From falted at openlc.org Mon Feb 10 23:24:02 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Feb 10 23:24:02 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E43FBFA.4B0C0FA1@noaa.gov> References: <200302071843.09511.falted@openlc.org> <3E43FBFA.4B0C0FA1@noaa.gov> Message-ID: <200302102040.30486.falted@openlc.org> On Friday 07 February 2003 19:33, Chris Barker wrote: > > Is Pyrex aware of Numeric Arrays? Joachim Saul already answered that: it is. More exactly, Pyrex is not aware of any special object outside the Python standard types, but with a bit of cleverness and patience you can map any object you want into Pyrex. The Numeric array object mapping just happens to be documented in the FAQ, but I managed to access numarray objects as well. Here is the recipe.

First, define some enum types and headers:

# Structs and functions from numarray
cdef extern from "numarray/numarray.h":

    ctypedef enum NumRequirements:
        NUM_CONTIGUOUS
        NUM_NOTSWAPPED
        NUM_ALIGNED
        NUM_WRITABLE
        NUM_C_ARRAY
        NUM_UNCONVERTED

    ctypedef enum NumarrayByteOrder:
        NUM_LITTLE_ENDIAN
        NUM_BIG_ENDIAN

    cdef enum:
        UNCONVERTED
        C_ARRAY

    ctypedef enum NumarrayType:
        tAny
        tBool
        tInt8
        tUInt8
        tInt16
        tUInt16
        tInt32
        tUInt32
        tInt64
        tUInt64
        tFloat32
        tFloat64
        tComplex32
        tComplex64
        tObject
        tDefault
        tLong

    # Declaration for the PyArrayObject
    struct PyArray_Descr:
        int type_num, elsize
        char type

    ctypedef class PyArrayObject [type PyArray_Type]:
        # Compatibility with Numeric
        cdef char *data
        cdef int nd
        cdef int *dimensions, *strides
        cdef object base
        cdef PyArray_Descr *descr
        cdef int flags
        # New attributes for numarray objects
        cdef object _data       # object must meet buffer API
        cdef object _shadows    # ill-behaved original array
        cdef int nstrides       # elements in strides array
        cdef long byteoffset    # offset into buffer where array data begins
        cdef long bytestride    # basic separation of elements in bytes
        cdef long itemsize      # length of 1 element in bytes
        cdef char byteorder     # NUM_BIG_ENDIAN, NUM_LITTLE_ENDIAN
        cdef char _aligned      # test override flag
        cdef char _contiguous   # test override flag

    void import_array()

# The Numeric API requires this function to be called before
# using any Numeric facilities in an extension module.
import_array()

Then, declare the API routines you want to use:

cdef extern from "numarray/libnumarray.h":
    PyArrayObject NA_InputArray (object, NumarrayType, int)
    PyArrayObject NA_OutputArray (object, NumarrayType, int)
    PyArrayObject NA_IoArray (object, NumarrayType, int)
    PyArrayObject NA_Empty(int nd, int *d, NumarrayType type)
    object PyArray_FromDims(int nd, int *d, NumarrayType type)

Now define a couple of maps between the C enum types and the Python numarray type classes:

# Conversion tables from/to classes to the numarray enum types
toenum = {numarray.Int8:tInt8,       numarray.UInt8:tUInt8,
          numarray.Int16:tInt16,     numarray.UInt16:tUInt16,
          numarray.Int32:tInt32,     numarray.UInt32:tUInt32,
          numarray.Float32:tFloat32, numarray.Float64:tFloat64,
          }
toclass = {}
for (key, value) in toenum.items():
    toclass[value] = key

OK, you are on the way. We can finally define our user function; for example, I will show here a function to multiply a matrix by a vector (C double precision):

def multMatVec(object a, object b, object c):
    cdef PyArrayObject carra, carrb, carrc
    cdef double *da, *db, *dc
    cdef int i, j

    carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
    carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
    carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
    da = carra.data
    db = carrb.data
    dc = carrc.data
    dim1 = carra.dimensions[0]
    dim2 = carra.dimensions[1]
    for i from 0 <= i < dim1:
        dc[i] = 0.
        for j from 0 <= j < dim2:
            dc[i] = dc[i] + da[i*dim2+j] * db[j]

    return carrc

where NA_InputArray is a high-level numarray API call that ensures that the object retrieved is a well-behaved array, and not misaligned, discontiguous or whatever. Maybe at first glance such a procedure seems obscure, but it is not; I find it to be quite elegant. Look at the "for i from 0 <= i < dim1:" construction. We could have used the more pythonic form "for i in range(dim1):", but by using the former, the Pyrex compiler is able to produce a loop in plain C, achieving C speed on this piece of code. Of course, you must be careful not to introduce Python objects inside the loop, or all the potential speed-up will vanish. But, with a bit of practice, this is easy to avoid. For me Pyrex is like having Python but with the speed of C. This is why I'm so enthusiastic about it. > > I imagine it could use them just fine, using the generic Python sequence > get item stuff, but that would be a whole lot lower performance than if > it understood the Numeric API and could access the data array directly. > Also, how does it deal with multiple dimension indexing ( array[3,6,2] ) > which the standard python sequence types do not support? In general, you can access sequence objects like in Python (and I've just checked that extended slicing *is* supported; I don't know why Joachim said otherwise -- perhaps he meant Pyrex C-arrays?), but at Python speed. So, if you need speed, always use pointers to your data and a bit of pointer arithmetic to access the element you want (look at the example). Of course, you can also define C arrays if you know the bounds at compile time and let the compiler do the computations to access your desired element, but you will first need to copy the data from your buffers to the C array, and perhaps this is a bit inconvenient in some situations. > As I think about this, I think your suggestion is fabulous.
Pyrex (or a > Pyrex-like) language would be a fabulous way to write code for NumArray, > if it really made use of the NumArray API. There can be drawbacks, like the one stated by Perry related with how to construct general Ufuncs that can handle many different combinations of arrays and types, although I don't understand that very well because Numeric and numarray crews already achieved to do that in C, so why it cannot be possible with Pyrex?. Mmm, perhaps there is some pre-processor involved?. Cheers, -- Francesc Alted From Chris.Barker at noaa.gov Tue Feb 11 10:36:01 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Feb 11 10:36:01 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison References: <200302071843.09511.falted@openlc.org> <3E43FBFA.4B0C0FA1@noaa.gov> <200302102040.30486.falted@openlc.org> Message-ID: <3E4938E1.73B7831E@noaa.gov> Francesc Alted wrote: > First, define some enum types and headers: Could all this be put into Pyrex? (when NumArray becomes more stable anyway) It's well beyond me to understand it. > I will show here a function to multiply a matrix by a vector (C double > precision): > > def multMatVec(object a, object b, object c): > cdef PyArrayObject carra, carrb, carrc > cdef double *da, *db, *dc > cdef int i, j > > carra = NA_InputArray(a, toenum[a._type], C_ARRAY) > carrb = NA_InputArray(b, toenum[b._type], C_ARRAY) > carrc = NA_InputArray(c, toenum[c._type], C_ARRAY) > da = carra.data > db = carrb.data > dc = carrc.data > dim1 = carra.dimensions[0] > dim2 = carra.dimensions[1] > for i from 0<= i < dim1: > dc[i] = 0. > for j from 0<= j < dim2: > dc[i] = dc[i] + da[i*dim2+j] * db[j] > > return carrc > For me Pyrex is like having Python but with the speed of C. This is why I'm > so enthusiastic with it. That actually looks more like C than Python to me. As soon as I am doing pointer arithmetic, I don't feel like I'm writng Python. Would it be all that much more code in C? > speed. 
So, if you need speed, always use pointers to your data and use a bit > of pointer arithmetic to access the element you want (look at the example). Is there really no way to get this to work? > Of course, you can also define C arrays if you know the boundaries in > compilation time and let the compiler do the computations to access your > desired element, but you will need first to copy the data from your buffers > to the C-array, and perhaps this is a bit inconvenient in some situations. Couldn't you access the data array of the NumArray directly? I do this all the time with Numeric. > Why you are saying that slicing is not supported?. I've checked them (as > python expressions, of course) and work well. May be you are referring to > cdef'd c-typed arrays in Pyrex?. I think this should be a dangerous thing > that can make the pointer arithmetic to slow down because of the additional > required checks in the slice range. Well, there would need to be two value checks per slice. That would be significant for small slices, but not for large ones, I'd love to have it. It just doesn't feel like Python without slicing, and it doesn't feel like NumPy without multi-dimensional slicing. > There can be drawbacks, like the one stated by Perry related with how to > construct general Ufuncs that can handle many different combinations of > arrays and types, although I don't understand that very well because Numeric > and numarray crews already achieved to do that in C, so why it cannot be > possible with Pyrex?. Mmm, perhaps there is some pre-processor involved?. I was curious about this comment as well. I have only had success with writing any of my Numeric based extensions for pre-determined types. If I had to support additional types (and/or discontiguous and/or rank-N arrays), I ended up with a whole pile of case and/or if statements. Also kind of slow and inefficient code. It seems the only way to do this right is with C++ and templates (eg. 
Blitz++), but there are good reasons not to go that route. Would it really be any harder to use Pyrex than C for this kind of thing? Also, would it be possible to take a Pyrex-type approach and have it do something template-like: you write the generic code in Pyrex, and it generates all the type-specific C code for you. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From falted at openlc.org Tue Feb 11 11:34:04 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Feb 11 11:34:04 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E4938E1.73B7831E@noaa.gov> References: <200302102040.30486.falted@openlc.org> <3E4938E1.73B7831E@noaa.gov> Message-ID: <200302112033.30231.falted@openlc.org> On Tuesday 11 February 2003 18:54, Chris Barker wrote:

> > def multMatVec(object a, object b, object c):
> >     cdef PyArrayObject carra, carrb, carrc
> >     cdef double *da, *db, *dc
> >     cdef int i, j
> >
> >     carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
> >     carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
> >     carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
> >     da = carra.data
> >     db = carrb.data
> >     dc = carrc.data
> >     dim1 = carra.dimensions[0]
> >     dim2 = carra.dimensions[1]
> >     for i from 0 <= i < dim1:
> >         dc[i] = 0.
> >         for j from 0 <= j < dim2:
> >             dc[i] = dc[i] + da[i*dim2+j] * db[j]
> >
> >     return carrc
> >
> > For me Pyrex is like having Python but with the speed of C. This is why
> > I'm so enthusiastic about it.
>
> That actually looks more like C than Python to me. As soon as I am doing
> pointer arithmetic, I don't feel like I'm writing Python. Would it be all
> that much more code in C?

Doing that in C implies writing the "glue" code.
In the past example, multMatVec is a function *directly* accessible from Python, without any additional declaration. Moreover, you can do in Pyrex the same things you do in Python, so you could have written the last piece of code as:

def multMatVec(object a, object b, object c):
    for i in range(a.shape[0]):
        c[i] = 0.
        for j in range(a.shape[1]):
            c[i] = c[i] + a[i][j] * b[j]
    return c

but, of course, you get only Python speed. So, the moral is that C speed is only accessible in Pyrex if you use C-like types and constructions; it just doesn't come for free. I just find this way of coding to be more elegant than using SWIG or other approaches. But I'm most probably biased, because Pyrex is my first (and only) serious tool for doing Python extensions. > > speed. So, if you need speed, always use pointers to your data and use a > > bit of pointer arithmetic to access the element you want (look at the > > example). > > Is there really no way to get this to work? > > > Of course, you can also define C arrays if you know the boundaries in > > compilation time and let the compiler do the computations to access your > > desired element, but you will need first to copy the data from your > > buffers to the C-array, and perhaps this is a bit inconvenient in some > > situations. > > Couldn't you access the data array of the NumArray directly? I do this > all the time with Numeric. Yeah, you can: in both examples shown here (Numeric and numarray), you are accessing the array data buffer directly, with no copies (whenever your original array is well-behaved, of course). > > > Why you are saying that slicing is not supported?. I've checked them (as > > python expressions, of course) and work well. May be you are referring to > > cdef'd c-typed arrays in Pyrex?. I think this should be a dangerous thing > > that can make the pointer arithmetic to slow down because of the > > additional required checks in the slice range.
> > Well, there would need to be two value checks per slice. That would be > significant for small slices, but not for large ones, I'd love to have > it. It just doesn't feel like Python without slicing, and it doesn't > feel like NumPy without multi-dimensional slicing. Again, right now you can use slicing in Pyrex if you are dealing with Python objects, but from the moment you access the lower-level Numeric/numarray buffer and assign it to a Pyrex C pointer, you can't do that anymore. That's the price to pay for speed. About implementing slicing in Pyrex C-pointer arithmetic -- well, it may be worth asking Greg Ewing, the Pyrex author. I'll send him this particular question and forward his answer (if any) to the list. > > There can be drawbacks, like the one stated by Perry related with how to > > construct general Ufuncs that can handle many different combinations of > > arrays and types, although I don't understand that very well because > > Numeric and numarray crews already achieved to do that in C, so why it > > cannot be possible with Pyrex?. Mmm, perhaps there is some pre-processor > > involved?. > > I was curious about this comment as well. I have only had success with > writing any of my Numeric based extensions for pre-determined types. If > I had to support additional types (and/or discontiguous and/or rank-N > arrays), I ended up with a whole pile of case and/or if statements. Also > kind of slow and inefficient code. > > It seems the only way to do this right is with C++ and templates (eg. > Blitz++), but there are good reasons not to go that route. > > Would it really be any harder to use Pyrex than C for this kind of > thing? Also, would it be possible to take a Pyrex-type approach and have > it do something template-like: you write the generic code in Pyrex, it > generates all the type-specific C code for you. Well, this is another good question for Greg.
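[Editor's note: the "template-like" approach Chris asks about can at least be mocked up outside any compiler: keep one generic kernel as a text template and generate the type-specific C from a table of element types. The kernel, function names and type table below are invented for illustration only.]

```python
# One generic C kernel as a template; {name}/{ctype} get filled in per type.
KERNEL = """\
static void add_{name}({ctype} *a, {ctype} *b, {ctype} *out, int n)
{{
    int i;
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}}
"""

TYPES = {"int32": "int", "float32": "float", "float64": "double"}

def generate():
    """Emit one specialized C function per element type."""
    return "\n".join(KERNEL.format(name=n, ctype=t)
                     for n, t in sorted(TYPES.items()))

print(generate())
```

This is essentially what a preprocessor-based ufunc generator does; a real one would also emit the dispatch table that picks the right specialization at run time.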
I'll try to ask him, although since I don't have experience with this kind of issue, chances are that my question may turn out to be complete nonsense :). Cheers, -- Francesc Alted

From perry at stsci.edu Tue Feb 11 12:16:13 2003 From: perry at stsci.edu (Perry Greenfield) Date: Tue Feb 11 12:16:13 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E47D8AC.4070108@ieee.org> Message-ID: Tim Hochberg writes:

>            Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
> NumPy      10 us         10 us          85 ps               95 ps
> NumArray   200 us        530 us         45 ps               135 ps
> Psymeric   50 us         65 us          80 ps               80 ps
>
> The times shown above are for Float64s and are pretty approximate, and
> they happen to be a particularly favorable array shape for Psymeric. I
> have seen psymeric as much as 50% slower than NumPy for large arrays of
> certain shapes.
>
> The overhead for NumArray is surprisingly large. After doing this
> experiment I'm certainly more sympathetic to Konrad wanting less
> overhead for NumArray before he adopts it.

Wow! Do you really mean picoseconds? I never suspected that either Numeric or numarray were that fast. ;-) Anyway, this issue is timely [Err...]. As it turns out, we started looking at ways of improving small array performance a couple of weeks ago and are coming closer to trying out an approach that should reduce the overhead significantly. But I have some questions about your benchmarks. Could you show me the code that is used to generate the above timings? In particular, I'm interested in the kinds of arrays that are being operated on. It turns out that the numarray overhead depends on more than just contiguity, and it isn't obvious to me which case you are testing. For example, Todd's benchmarks indicate that numarray's overhead is about a factor of 5 larger than numpy when the input arrays are contiguous and of the same type. On the other hand, if the array is not contiguous or requires a type conversion, the overhead is much larger.
(Also, these cases require blocking loops over large arrays; we have done nothing yet to optimize the block size or the speed of that loop.) If you are doing the benchmark on contiguous, same type arrays, I'd like to get a copy of the benchmark program to try to see where the disagreement arises. The very preliminary indications are that we should be able to make numarray overheads approximately 3 times higher for all ufunc cases. That's still slower, but not by a factor of 20 as shown above. How much work it would take to reduce it further is unclear (the main bottleneck at that point appears to be how long it takes to create new output arrays) We are still mainly in the analysis and design phase of how to improve performance for small arrays and block looping. We believe that this first step will not require moving very much of the existing Python code into C (but some will be). Hopefully we will have some working code in a couple weeks. Thanks, Perry From tim.hochberg at ieee.org Tue Feb 11 13:05:05 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Tue Feb 11 13:05:05 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: References: Message-ID: <3E49653D.4050604@ieee.org> Perry Greenfield wrote: >Tim Hochberg writes: > > >> Overhead (c) Overhead (nc) >>TimePerElement (c) TimePerElement (nc) >>NumPy 10 us 10 >>us 85 ps 95 ps >>NumArray 200 us 530 us >>45 ps 135 ps >>Psymeric 50 us 65 >>us 80 ps 80 ps >> >> >>The times shown above are for Float64s and are pretty approximate, and >>they happen to be a particularly favorable array shape for Psymeric. I >>have seen pymeric as much as 50% slower than NumPy for large arrays of >>certain shapes. >> >>The overhead for NumArray is surprisingly large. After doing this >>experiment I'm certainly more sympathetic to Konrad wanting less >>overhead for NumArray before he adopts it. >> >> >> >Wow! Do you really mean picoseconds? I never suspected that >either Numeric or numarray were that fast. 
;-) > > My bad, I meant ns. What's a little factor of 10^3 among friends. >Anyway, this issue is timely [Err...]. As it turns out we started > > >looking at ways of improving small array performance a couple weeks >ago and are coming closer to trying out an approach that should >reduce the overhead significantly. > >But I have some questions about your benchmarks. Could you show me >the code that is used to generate the above timings? In particular >I'm interested in the kinds of arrays that are being operated on. >It turns out that the numarray overhead depends on more than >just contiguity and it isn't obvious to me which case you are testing. > > I'll send you psymeric, including all the tests, by private email to avoid cluttering up the list. (Don't worry, it's not huge -- only 750 lines of Python at this point.) You can let me know if you find any horrible issues with it. >For example, Todd's benchmarks indicate that numarray's overhead is >about a factor of 5 larger than numpy when the input arrays are >contiguous and of the same type. On the other hand, if the array >is not contiguous or requires a type conversion, the overhead is >much larger. (Also, these cases require blocking loops over large >arrays; we have done nothing yet to optimize the block size or >the speed of that loop.) If you are doing the benchmark on >contiguous, same type arrays, I'd like to get a copy of the benchmark >program to try to see where the disagreement arises. > > Basically, I'm operating on two random contiguous 3x3 Float64 arrays. In the noncontiguous case the arrays are indexed using [::2,::2] and [1::2,::2], so these arrays are 2x2 and 1x2. Hmmm, that wasn't intentional; I'm measuring axis stretching as well. However, using [::2,::2] for both axes doesn't change things a whole lot.
The core timing part looks like this:

    t0 = clock()
    if op == '+':
        c = a + b
    elif op == '-':
        c = a - b
    elif op == '*':
        c = a * b
    elif op == '/':
        c = a / b
    elif op == '==':
        c = a == b
    else:
        raise ValueError("unknown op %s" % op)
    t1 = clock()

This is done N times; the first M values are thrown away and the remaining values are averaged. Currently N is 3 and M is 1, so not a lot of averaging is taking place. >The very preliminary indications are that we should be able to make >numarray overheads approximately 3 times higher for all ufunc cases. >That's still slower, but not by a factor of 20 as shown above. How >much work it would take to reduce it further is unclear (the main >bottleneck at that point appears to be how long it takes to create >new output arrays) > > That's good. I think it's important to get people like Konrad on board, and that will require dropping the overhead. >We are still mainly in the analysis and design phase of how to >improve performance for small arrays and block looping. We believe >that this first step will not require moving very much of the >existing Python code into C (but some will be). Hopefully we >will have some working code in a couple weeks. > I hope it goes well. -tim

From falted at openlc.org Wed Feb 12 01:25:06 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Feb 12 01:25:06 2003 Subject: [Numpy-discussion] Fwd: Re: A couple of questions on Pyrex Message-ID: <200302121024.24406.falted@openlc.org> Hi, Here is Greg's reply to my questions. It seems like Pyrex is not going to change on these two issues. Well, at least he considered the first one to be an "interesting" idea.
Cheers,

---------- Forwarded Message ----------

From greg at cosc.canterbury.ac.nz Wed Feb 12 01:23:01 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 12 Feb 2003 19:23:01 +1300 (NZDT) Subject: A couple of questions on Pyrex Message-ID: > numbuf = data[2:30:4][1] > > in order to get a copy (in a new memory location) of the memory buffer in > the selected slice to work with it. Would that be interesting to > implement? It's an interesting idea, but I think it's getting somewhat beyond the scope of Pyrex. I don't think I'll be trying to implement anything like that in the foreseeable future. The Pyrex compiler is complicated enough already, and I don't want to add anything more that isn't really necessary. > Is (or will be) there any way in Pyrex to automagically create different > flavors of this function to deal with different datatypes? Same here, and even more so -- I'm *definitely* not going to re-implement C++ templates! :-)

Greg Ewing, Computer Science Dept,   +--------------------------------------+
University of Canterbury,            | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand            | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz        +--------------------------------------+

-------------------------------------------------------

--
Francesc Alted
Best regards karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND -----------------------------------------------------------------------

From falted at openlc.org Wed Feb 12 05:04:03 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Feb 12 05:04:03 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E49653D.4050604@ieee.org> References: <3E49653D.4050604@ieee.org> Message-ID: <200302121403.09480.falted@openlc.org> Hi, A few days ago I also did some benchmarks on this issue, and I think it could be good to share my results. I'm basically reproducing Tim's figures, although with an even bigger difference in favour of Numeric for small matrices (2x2). The benchmarks are made with a combination of Python and Pyrex (in order to also test some functions in the Numeric and numarray C APIs). The figures I'm getting are, roughly:

Matrix multiplication:
  In Python:
    matrixmultiply (double(2,2) x double(2,)) in Numeric:  70 us
    matrixmultiply (double(2,2) x double(2,)) in numarray: 4800 us
  In Pyrex:
    numarray multiply, using NA_InputArray:          620 us
    numarray multiply, using PyObject_AsWriteBuffer: 146 us

zeros:
  In Python:
    double(2,) in Numeric:  58 us
    double(2,) in numarray: 3100 us
  In Pyrex (using PyArray_FromDims):
    double(2,) with Numeric:  26 us
    double(2,) with numarray: 730 us

As you can see, in pure Python numarray has a factor of 50 (for zeros) and up to 70 (for matrix multiply) more overhead than Numeric. Increasing the matrix to 200x20, the overhead difference falls to a factor of 16 (for matrix multiply) and 50 (for zeros), always in favor of Numeric. With Pyrex (i.e.
making the C calls), the differences are not so big, but there is still a difference. In particular, when assuming a contiguous matrix and calling PyObject_AsWriteBuffer directly upon the object._data memory buffer, the factor falls to 2. Increasing the matrix to 200x20, the overhead for zeros (using PyArray_FromDims) is the same for numarray as for Numeric (around 700 us), while multiply in Pyrex can't beat the matrixmultiply in Numeric (Numeric is still 2 times faster). Hope that helps. I can also send you my testbeds if you are interested. -- Francesc Alted
From guido at python.org Thu Feb 13 13:46:11 2003 From: guido at python.org (Guido van Rossum) Date: Thu Feb 13 13:46:11 2003 Subject: [Numpy-discussion] OSCON / Python 11 proposals deadline is February 15th! Message-ID: <200302132114.h1DLE4x16909@odiug.zope.com> The Python 11 Conference is being held July 7-11 in Portland, Oregon as part of OSCON 2003. http://conferences.oreillynet.com/os2003/ The deadline for proposals is February 15th! You only need to have your proposal in this week; you don't need to worry about trying to put together the complete presentation or tutorial materials at this time. Proposal submissions page: http://conferences.oreillynet.com/cs/os2003/create/e_sess Few proposals have been submitted so far; we need many more to have a successful Python 11 conference. If you have submitted a proposal for one of the other Python conferences this year, such as PyCon, I encourage you to go ahead and submit the proposal to Python 11 as well. If you are presenting at the Python UK Conference or EuroPython, but are unable to attend Python 11, you should consider having another team member do the presentation. The theme of OSCON 2003 is "Embracing and Extending Proprietary Software".
Papers and presentations on how to successfully transition away from proprietary software would also be good, but it is not necessary for your proposal to cover the theme, proposals just need to be related to Python. COMPENSATION: Free registration for speakers (except lightning talks). Tutorial speakers also get: $500 honorarium; $50 per diem on day of tutorial; 1 night hotel; airfare. O'REILLY ANNOUNCEMENT: 2003 O'Reilly Open Source Convention Call For Participation Embracing and Extending Proprietary Software http://conferences.oreilly.com/oscon/ O'Reilly & Associates invites programmers, developers, strategists, and technical staff to submit proposals to lead tutorial and conference sessions at the 2003 Open Source Software Convention, slated for July 7-11 in Portland, OR. Proposals are due February 15, 2003. For more information please visit our OSCON website http://conferences.oreilly.com/oscon/ The theme this year is "Embracing and Extending Proprietary Software." Few companies use only one vendor's software on desktops, back office, and servers. Variety in operating systems and applications is becoming the norm, for sound financial and technical reasons. With variety comes the need for open unencumbered standards for data exchange and service interoperability. You can address the theme from any angle you like--for example, you might talk about migrating away from commercial software such as Microsoft Windows, or instead place your emphasis on coexistence. Convention Conferences Perl Conference 7 The Python 11 Conference PHP Conference 3 Convention Tracks Apache XML Applications MySQL and PostgreSQL Ruby --Guido van Rossum (home page: http://www.python.org/~guido/) From paul at pfdubois.com Thu Feb 13 19:37:06 2003 From: paul at pfdubois.com (Paul Dubois) Date: Thu Feb 13 19:37:06 2003 Subject: [Numpy-discussion] PEP-242 Numeric kinds -- disposition Message-ID: <000501c2d3da$461e0920$6601a8c0@NICKLEBY> PEP-242 should be closed. 
The kinds module will not be added to the standard library. There was no opposition to the proposal but only mild interest in using it, not enough to justify adding the module to the standard library. Instead, it will be made available as a separate distribution item at the Numerical Python site. At the next release of Numerical Python, it will no longer be a part of the Numeric distribution.
From falted at openlc.org Mon Feb 17 11:08:01 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Feb 17 11:08:01 2003 Subject: [Numpy-discussion] rank-0 chararrays? Message-ID: <200302171940.36278.falted@openlc.org> Hi, I'm trying to map the Numeric character typecode ('c') to chararrays, but I have a problem distinguishing between In [109]: chararray.array("qqqq") Out[109]: CharArray(['qqqq']) and In [110]: chararray.array(["qqqq"]) # Note the extra "[" "]" Out[110]: CharArray(['qqqq']) # The same result as 109 while in Numeric we have: In [113]: Numeric.array("qqqq") Out[113]: array([q, q, q, q],'c') In [114]: Numeric.array(["qqqq"]) Out[114]: array([ [q, q, q, q]],'c') # Differs from 113 even in numarray objects, rank-0 seems to work well: In [107]: numarray.array(1) Out[107]: array(1) In [108]: numarray.array([1]) Out[108]: array([1]) # Objects differ So, it seems that chararray does not support rank-0 objects well. Is this the expected behavior? If so, we have no way to distinguish between objects 109 and 110, and I'd like to distinguish between these two. What can be done to achieve this?
Thanks, -- Francesc Alted
From tim.hochberg at ieee.org Mon Feb 17 13:06:03 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon Feb 17 13:06:03 2003 Subject: [Numpy-discussion] Psymeric-update Message-ID: <3E514E78.10905@ieee.org> The good news is Psymeric now supports in-place addition and complex numbers (Complex32 and Complex64). Also, by doing some tuning, I got the overhead of Psymeric down to less than three times that of Numeric (versus 20 times in the version of Numarray that I have). Even without Psyco, the code only has overhead of five and a half times that of Numeric, so it seems that the Numarray folks should at least be able to get down to that level without throwing everything into C. I have not been able to increase the asymptotic speed and I think I'm probably stuck on that front for the time being. For the most part Psymeric is close to Numeric for large arrays, which makes it about 50% faster than numarray for noncontiguous arrays and half as fast for contiguous arrays. These timings are for Float64: for Int8, Psymeric is ~3x slower than Numeric, for Int16 it's 50% slower, and for Int32 2x slower. Psymeric is very slow for Float32 and Complex32 (~10x slower than Numeric) because of some Psyco issues with array.arrays and floats which I expect will be fixed at some point. And finally, for Complex64, Psymeric is comparable to Numeric for addition and subtraction, but almost half as fast for multiplication and almost a third as fast for division. Barring some further improvements in Psyco or some new insights on my part, this is probably as far as I'll go with this. At this point, it would probably not be hard to make this into a work-alike for Numeric or Numarray (excluding the various extension modules: FFT and the like). The one relatively hard part still outstanding is ufunc.accumulate/reduce.
However, the performance, while very impressive for an essentially pure Python solution, is not good enough to motivate me to use this in preference to Numeric. If anyone is interested in looking at the code, I'd be happy to send it to them. Regards, -tim
From jmiller at stsci.edu Tue Feb 18 04:28:03 2003 From: jmiller at stsci.edu (Todd Miller) Date: Tue Feb 18 04:28:03 2003 Subject: [Numpy-discussion] rank-0 chararrays? References: <200302171940.36278.falted@openlc.org> Message-ID: <3E522A95.1000601@stsci.edu> Francesc Alted wrote: >Hi, > >I'm trying to map Numeric character typecode ('c') to chararrays, but I have >a problem to distinguish between > >In [109]: chararray.array("qqqq") >Out[109]: CharArray(['qqqq']) > >and > >In [110]: chararray.array(["qqqq"]) # Note the extra "[" "]" >Out[110]: CharArray(['qqqq']) # The same result as 109 > > The chararray API pre-dates our awareness, ultimate implementation, and final rejection of rank-0 arrays. In retrospect, your usage above makes sense. Whether we change things now or not is another matter. You are giving me interface angst... :) You can create rank-0 arrays by specifying shape=() and itemsize=len(buffer). However, these do not repr correctly (unless you update from CVS). >while in Numeric we have: > >In [113]: Numeric.array("qqqq") >Out[113]: array([q, q, q, q],'c') > >In [114]: Numeric.array(["qqqq"]) >Out[114]: array([ [q, q, q, q]],'c') # Differs from 113 > >even in numarray objects, rank-0 seems to work well: > >In [107]: numarray.array(1) >Out[107]: array(1) > >In [108]: numarray.array([1]) >Out[108]: array([1]) # Objects differ > This was not always so, but we made it work when we thought rank-0 had something to offer. After some discussion on numpy-discussion-list, rank-0 went out of vogue. > > >So, it seems like if chararray does not support well rank-0 objects. > That is true. CharArray never caught up because rank-0 became vestigial even for NumArray. >Is this >the expected behavior?. > Yes.
But rank-0 support for chararray is not far off, with the possible exception of breaking the public interface. >If yes, we have no possibility to distinguish >between object 109 and 110, and I'd like to distinguish between this two. > Why exactly do you need rank-0? >What can be done to achieve this? > 1. Add a little special casing to chararray._charArrayToStringList() to handle rank-0. I did this already in CVS. 2. Debate whether or not to change chararray.array() to work as you've shown above. Proceed from there. > >Thanks, > > >
From falted at openlc.org Tue Feb 18 09:54:08 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Feb 18 09:54:08 2003 Subject: [Numpy-discussion] rank-0 chararrays? In-Reply-To: <3E522A95.1000601@stsci.edu> References: <200302171940.36278.falted@openlc.org> <3E522A95.1000601@stsci.edu> Message-ID: <200302181853.22563.falted@openlc.org> On Tuesday 18 February 2003 13:44, Todd Miller wrote: > You are giving me interface angst... :) Well, I don't know exactly what you mean by that, but I hope it is nothing too bad ;) > > This was not always so, be we made it work when we thought rank-0 had > something to offer. After some discussion on numpy-discussion-list, > rank-0 went out of vogue. Mmmm, do you mean that rank-0 is being deprecated in numarray? > Why exactly do you need rank-0? Apart from supporting chararrays in PyTables, I'm using them as a buffer to store homogeneous standard lists and tuples of characters, because it is very easy to obtain a contiguous C buffer from them. However, if I have no way to distinguish between "qqq" and ["qqq"] objects directly from the chararray instances obtained from them, I can't materialize them properly when reading the objects back from persistent storage. Perhaps using more metadata could solve the situation (for example, saving the original shape of the object), but I wouldn't like to unnecessarily clutter PyTables' metadata space. > > >What can be done to achieve this?
> > 1. Add a little special casing to chararray._charArrayToStringList() to > handle rank-0. I did this already in CVS. Ok. For the moment I'll be using numarray CVS, although I don't know if the next version of numarray will be out before the next PyTables release (planned in a couple of weeks). > 2. Debate whether or not to change chararray.array() to work as you've > shown above. Proceed from there. Well, the fact is that I needed rank-0 only for the reason stated before. But I'm not sure if this is reason enough to open such a debate. Thanks!, -- Francesc Alted
From falted at openlc.org Wed Feb 19 04:06:02 2003 From: falted at openlc.org (Francesc Alted) Date: Wed Feb 19 04:06:02 2003 Subject: [Numpy-discussion] range check: feature request for numarray Message-ID: <200302191305.00194.falted@openlc.org> Hi, I think it would be useful to provide some range checking in numarray. For example, right now, you can do: In [24]: a=numarray.array([1,2],numarray.Int8) In [25]: a[1] = 256 In [26]: a Out[26]: array([1, 0], type=Int8) and nothing happens. But I'm proposing to raise an OverflowWarning so that people can be aware of such range overflows. Maybe it is desirable for the default to be not to issue the warning, except when the user asks to know about it. So, my proposal is that the current behaviour should be maintained, but when you want to be aware of all the warnings, something like this could happen: In [28]: warnings.resetwarnings() In [29]: a=numarray.array([1,2],numarray.Int8) In [30]: a[1] = 256 OverflowWarning: value assignment not in the type range In [31]: a Out[31]: array([1, 0], type=Int8) But perhaps this feature might slow down assignments a bit. Regards, -- Francesc Alted `` We are shaped by our thoughts, we become what we think. When the mind is pure, joy follows like a shadow that never leaves.
'' -- Buddha, The Dhammapada
From haase at msg.ucsf.edu Wed Feb 19 12:21:38 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Wed Feb 19 12:21:38 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? Message-ID: <200302191220.26456.haase@msg.ucsf.edu> >>> xx = na.array((1,2,3)) >>> xx array([1, 2, 3]) >>> xx.byteswap() >>> xx array([1, 2, 3]) >>> xx.type() Int32 Hi all, I was reading the documentation for numarray 0.4 but I get the above results. How do I get the bytes swapped like it says in the manual: byteswap() The byteswap method performs a byte swapping operation on all the elements in the array, working in place (i.e. it returns None). >>> print a [1 2 3] >>> a.byteswap() >>> print a [16777216 33554432 50331648] Thanks, Sebastian
From jmiller at stsci.edu Wed Feb 19 12:45:02 2003 From: jmiller at stsci.edu (Todd Miller) Date: Wed Feb 19 12:45:02 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? References: <200302191220.26456.haase@msg.ucsf.edu> Message-ID: <3E53EC22.50109@stsci.edu> Sebastian Haase wrote: >>>>xx = na.array((1,2,3)) >>>>xx >>>> >>>> >array([1, 2, 3]) > > >>>>xx.byteswap() >>>>xx >>>> >>>> >array([1, 2, 3]) > > >>>>xx.type() >>>> >>>> >Int32 > >Hi all, >I was reading the documentation for numarray 0.4 >but I get the above results. >How do I get the bytes swapped like it says in the manual: > >byteswap() > The byteswap method performs a byte swapping > operation on all the elements in the array, > working in place (i.e. it returns None). > >>> print a > [1 2 3] > >>> a.byteswap() > >>> print a > [16777216 33554432 50331648] > This is a known bug/incompatibility. The behavior will be changed for the next release of numarray. Right now, _byteswap() does what you want.
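For reference, the byte swap the manual describes can be reproduced today with the standard-library array module, independently of numarray (a small sketch; it assumes 'i' is a 4-byte int, as it is on common platforms):

```python
import array

a = array.array('i', [1, 2, 3])  # 'i' is a 4-byte signed int on common platforms
assert a.itemsize == 4
a.byteswap()                     # swaps in place; returns None
print(list(a))                   # [16777216, 33554432, 50331648]
```

The swapped values are the same regardless of host endianness, since the operation just reverses the bytes of each element.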
> > >Thanks, >Sebastian > > > Todd From jmiller at stsci.edu Thu Feb 20 00:08:13 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 20 00:08:13 2003 Subject: [Numpy-discussion] range check: feature request for numarray References: <200302191305.00194.falted@openlc.org> Message-ID: <3E5490C1.2010707@stsci.edu> Francesc Alted wrote: >Hi, > > > Hi Francesc, I'm sorry about the slow response on this. I looked into what it would take to do this, and while I agree with you in principle, right now my hands are full trying to beat down numarray overhead. >I think it would be useful to provide some range checking in numarray. For >example, right now, you can do: > >In [24]: a=numarray.array([1,2],numarray.Int8) > >In [25]: a[1] = 256 > >In [26]: a >Out[26]: array([1, 0], type=Int8) > >and nothing happens. But I'm proposing to raise an OverflowWarning so that >people can be aware of such range overflows. > That sounds reasonable. If you'd care to do a patch, I think we would want it. If you don't have time, it may be a little while before we do. >Maybe it is desirable that the default would be to not issue the warning, >except when the user wanted to know about that. > > I think I'd rather see the warning on by default, even though it might "break" some existing code. >So, my proposal is that the actual behaviour should be mantained, but when >you want to be aware of all the warnings something like this could happen: > >In [28]: warnings.resetwarnings() > >In [29]: a=numarray.array([1,2],numarray.Int8) > >In [30]: a[1] = 256 >OverflowWarning: value assignment not in the type range > >In [31]: a >Out[31]: array([1, 0], type=Int8) > >But perhaps this feature might slow a bit the performance of assignments. > Yes, but probably not too much. 
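The proposed check can be sketched in plain Python with the standard warnings module (the Int8 bounds and the helper name here are illustrative, not numarray API; Python 3 has no OverflowWarning, so RuntimeWarning stands in):

```python
import warnings

INT8_MIN, INT8_MAX = -128, 127

def int8_store(buf, index, value):
    """Store value at buf[index], warning if it is outside the Int8 range."""
    if not INT8_MIN <= value <= INT8_MAX:
        warnings.warn("value assignment not in the type range", RuntimeWarning)
        value &= 0xFF            # then wrap around, matching the silent behavior
        if value > INT8_MAX:
            value -= 256
    buf[index] = value

a = [1, 2]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    int8_store(a, 1, 256)
print(a, len(caught))  # [1, 0] 1
```

The cost per assignment is one range comparison, which supports Todd's guess that the slowdown would probably not be too large.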
>Regards, > > > Todd
From jensj at fysik.dtu.dk Thu Feb 20 04:00:04 2003 From: jensj at fysik.dtu.dk (Jens Jorgen Mortensen) Date: Thu Feb 20 04:00:04 2003 Subject: [Numpy-discussion] BLAS Message-ID: <200302201258.37704.jensj@bose.fysik.dtu.dk> Hi, When doing matrix-matrix multiplications with large matrices, using the BLAS library (Basic Linear Algebra Subprograms) can speed up things a lot. I don't think Numeric takes advantage of this (is this correct?). Will numarray be able to do that? Jens
From falted at openlc.org Thu Feb 20 05:41:12 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Feb 20 05:41:12 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? In-Reply-To: <3E53EC22.50109@stsci.edu> References: <200302191220.26456.haase@msg.ucsf.edu> <3E53EC22.50109@stsci.edu> Message-ID: <200302201440.32504.falted@openlc.org> On Wednesday 19 February 2003 21:42, Todd Miller wrote: > > >>> print a > > [1 2 3] > > >>> a.byteswap() > > >>> print a > > [16777216 33554432 50331648] > This is a known bug/incompatibility. The behavior will be changed for > the next release of numarray. Right now, _byteswap() does what you want. Is this already decided? Because I like the present behaviour. At first I found this behaviour a bit strange, but after getting used to it, I admit that it is elegant, because you always see a sane representation of the data in the array, independently of the architecture on which the array was written. If you byteswap() an array, the _byteorder property is also changed, so you can check whether your array is byteswapped just by writing: if a._byteorder <> sys.byteorder: print "a is byteswapped" else: print "a is not byteswapped" And, as you said before, you can always call _byteswap() if you *really* want to *only* byteswap the array.
PyTables already makes use of byteswap() as it is now, and that's nice because an array can be restored from disk safely just by looking at the byte order on disk and then properly setting the ._byteorder attribute. That's all! This also allows working seamlessly with objects coming from a mixture of big-endian and little-endian machines. But anyway, if you plan to make the change, could you please tell us what the expected behaviour of the future .byteswap(), ._byteswap() and ._byteorder would be? Thanks, -- Francesc Alted
From jmiller at stsci.edu Thu Feb 20 06:20:13 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 20 06:20:13 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? References: <200302191220.26456.haase@msg.ucsf.edu> <3E53EC22.50109@stsci.edu> <200302201440.32504.falted@openlc.org> Message-ID: <3E54E7E8.3030705@stsci.edu> Francesc Alted wrote: >On Wednesday 19 February 2003 21:42, Todd Miller wrote: > > >>> >>> print a >>> >>> [1 2 3] >>> >>> >>> a.byteswap() >>> >>> print a >>> >>> [16777216 33554432 50331648] >>> >>> >>This is a known bug/incompatibility. The behavior will be changed for >>the next release of numarray. Right now, _byteswap() does what you want. >> >> > >Is this already decided? Because I like the present behaviour. > It's already in CVS. Let me know what you think about the stuff below. > >At first I found this behaviour a bit strange, but after getting used to it, >I admit that it is elegant, because you always see a sane representation >of the data in the array, independently of the architecture on which >the array was written. > > I think byteswap() came to be the way it is in numarray-0.4 as a result of my experiences with cross-platform pickling. It made sense to me at the time. However, it is definitely a new point of confusion, and not backwards compatible with Numeric, so I think the numarray-0.4 byteswap() behavior was a mistake.
>If you byteswap() an array, the _byteorder property is also changed, so you >can check if your array is bytswapped or not just by writing: > >if a._byteorder <> sys.byteorder: > print "a is byteswapped" >else: > print "a is not byteswapped" > >And, as you said before, you can always call _byteswap() if you *really* >want to *only* byteswap the array. > >PyTables already makes use of byteswap() as it is now, and that's nice >because an array can be restored from disk safely by just looking at the >byte order on disk and then setting properly the ._byteorder attribute. >That's all! This allows also to work seamlessly with objects coming from a >mixture of big-endian and low-endian machines. > >But, anyway, if you plan to do the change, may you please tell us what would >be the expected behaviour of the future .byteswap(), ._byteswap() and >._byteorder? > > The current "plan" is that byteswap() and _byteswap() will both behave as _byteswap() does now; i.e., they will be Numeric compatible synonyms. An explict (extra) call to togglebyteorder() will then produce the current behavior. The meaning of _byteorder will be unchanged. Please let me know if you see any snags in the plan. >Thanks, > > > From paul at pfdubois.com Thu Feb 20 08:51:02 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Feb 20 08:51:02 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <200302201258.37704.jensj@bose.fysik.dtu.dk> Message-ID: <000201c2d900$12107110$6601a8c0@NICKLEBY> > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Jens Jorgen Mortensen > Sent: Thursday, February 20, 2003 3:59 AM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] BLAS > > > Hi, > > When doing matrix-matrix multiplications with large matrices, > using the BLAS > library (Basic Linear Algebra Subprograms) can speed up > things a lot. 
I don't > think Numeric takes advantage of this (is this correct?). No. You can configure it at installation to use the BLAS of choice. > Will numarray be > able to do that? > > Jens > > > ------------------------------------------------------- > This SF.net email is sponsored by: SlickEdit Inc. Develop an > edge. The most comprehensive and flexible code editor you can > use. Code faster. C/C++, C#, Java, HTML, XML, many more. FREE > 30-Day Trial. www.slickedit.com/sourceforge > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion >
From falted at openlc.org Thu Feb 20 10:54:07 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Feb 20 10:54:07 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? In-Reply-To: <3E54E7E8.3030705@stsci.edu> References: <200302191220.26456.haase@msg.ucsf.edu> <200302201440.32504.falted@openlc.org> <3E54E7E8.3030705@stsci.edu> Message-ID: <200302201953.05991.falted@openlc.org> On Thursday 20 February 2003 15:36, Todd Miller wrote: > > The current "plan" is that byteswap() and _byteswap() will both behave > as _byteswap() does now; i.e., they will be Numeric compatible synonyms. > > An explicit (extra) call to togglebyteorder() will then produce the > current behavior. The meaning of _byteorder will be unchanged. > > Please let me know if you see any snags in the plan. Well, I've been doing some tests, and I think I'll be able to produce a version of my code that will be compatible with numarray 0.4 and future versions (I'm just not using byteswap() at all).
However, I've detected a side effect on this change: copy() method is broken now in CVS: In [131]: a=numarray.array([1,2]) In [132]: a.togglebyteorder() In [133]: b=a.copy() In [134]: a Out[134]: array([16777216, 33554432]) In [135]: b Out[135]: array([1, 2]) In [136]: a._byteorder Out[136]: 'big' In [137]: b._byteorder Out[137]: 'big' so, you don't get a well-behaved copy of original array a in b I think the next patch should cure it: --- numarray.py Tue Feb 18 16:35:16 2003 +++ /usr/local/lib/python2.2/site-packages/numarray/numarray.py Thu Feb 20 19:36:07 2003 @@ -609,6 +609,7 @@ c._type = self._type if self.isbyteswapped(): c.byteswap() + c.togglebyteorder() return c -- Francesc Alted From jmiller at stsci.edu Thu Feb 20 11:23:14 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 20 11:23:14 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? References: <200302191220.26456.haase@msg.ucsf.edu> <200302201440.32504.falted@openlc.org> <3E54E7E8.3030705@stsci.edu> <200302201953.05991.falted@openlc.org> Message-ID: <3E552EF9.9060507@stsci.edu> Francesc Alted wrote: >A Dijous 20 Febrer 2003 15:36, Todd Miller va escriure: > > >>The current "plan" is that byteswap() and _byteswap() will both behave >>as _byteswap() does now; i.e., they will be Numeric compatible synonyms. >> >>An explict (extra) call to togglebyteorder() will then produce the >>current behavior. The meaning of _byteorder will be unchanged. >> >>Please let me know if you see any snags in the plan. >> >> > >Well, I've been doing some tests, and I think I'll be able to produce a >version of my code that will be compatible with numarray 0.4 and future >versions (I'm just no using byteswap() at all). 
> >However, I've detected a side effect on this change: copy() method is >broken now in CVS: > >In [131]: a=numarray.array([1,2]) > >In [132]: a.togglebyteorder() > >In [133]: b=a.copy() > >In [134]: a >Out[134]: array([16777216, 33554432]) > >In [135]: b >Out[135]: array([1, 2]) > >In [136]: a._byteorder >Out[136]: 'big' > >In [137]: b._byteorder >Out[137]: 'big' > >so, you don't get a well-behaved copy of the original array a in b > > Doh! >I think the next patch should cure it: > >--- numarray.py Tue Feb 18 16:35:16 2003 >+++ /usr/local/lib/python2.2/site-packages/numarray/numarray.py Thu Feb 20 >19:36:07 2003 >@@ -609,6 +609,7 @@ > c._type = self._type > if self.isbyteswapped(): > c.byteswap() >+ c.togglebyteorder() > return c > > Thanks! Todd
From R.M.Everson at exeter.ac.uk Thu Feb 20 15:43:14 2003 From: R.M.Everson at exeter.ac.uk (R.M.Everson) Date: Thu Feb 20 15:43:14 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <000201c2d900$12107110$6601a8c0@NICKLEBY> References: <000201c2d900$12107110$6601a8c0@NICKLEBY> Message-ID: Hi, As Paul Dubois says, some Numeric functions can be configured to use the BLAS library. However, the BLAS is not used for perhaps the most common and important operation: matrix/vector multiplication. We have written a small patch that replaces the matrixproduct/dot/innerproduct functions in multiarraymodule.c with the appropriate BLAS calls. The patch (against Numeric 21.1b) can be found at http://www.dcs.ex.ac.uk/~aschmolc/Numeric and can give a speed-up of a factor of 40 on 1000 by 1000 matrices using the Atlas BLAS. More details of the (naive!) timings can be found there too. We had planned on making a general announcement of this patch (updated to suit Numeric 22) in a week or so. However, we have just noticed that Numeric.dot (=Numeric.innerproduct = Numeric.matrixmultiply) does not take the complex conjugate of its first argument.
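The distinction at issue can be shown with plain Python complex arithmetic (a sketch; the function names mtimes_dot and conj_dot are illustrative, mirroring the Matlab mtimes/dot split):

```python
# Two flavors of inner product on complex vectors (illustrative names):
# mtimes_dot is what Numeric.dot does today (no conjugation);
# conj_dot conjugates its first argument, i.e. the Hermitian inner product.
def mtimes_dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conj_dot(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

a = [1 + 2j, 3 - 1j]
b = [2 - 1j, 1 + 1j]
print(mtimes_dot(a, b))  # (8+5j)
print(conj_dot(a, b))    # (2-1j)
print(conj_dot(a, a))    # (15+0j): real and non-negative, as a norm should be
```

The last line is the practical argument for conjugation: conj_dot(a, a) is the squared norm of a, whereas mtimes_dot(a, a) is in general complex.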
Taking the complex conjugate seems to me to be the right thing for a routine named dot or innerproduct. Indeed, until we were bitten by it not taking the conjugate, I thought it did. Can someone here explain the rational behind having dot, innerproduct and matrixmultiply all do the same thing and none of them taking the conjugate? (Matlab dot() takes the conjugate, although Matlab mtimes() (called for A*B) does not). I would propose that innerproduct and dot be changed to take the conjugate and a new function that doesn't (say, mtimes) be introduced. I suspect, however, that this would break too much existing code. It would be nice to get it right in Numarray. Alternatively, can someone suggest how both functions can be conveniently and non-confusingly exposed? Richard. Paul F Dubois writes: >> -----Original Message----- From: >> numpy-discussion-admin at lists.sourceforge.net >> [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of >> Jens Jorgen Mortensen Sent: Thursday, February 20, 2003 3:59 AM To: >> numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] >> BLAS >> >> >> Hi, >> >> When doing matrix-matrix multiplications with large matrices, using >> the BLAS library (Basic Linear Algebra Subprograms) can speed up >> things a lot. I don't think Numeric takes advantage of this (is this >> correct?). > No. You can configure it at installation to use the BLAS of choice. >> Will numarray be able to do that? >> >> Jens >> From a.schmolck at gmx.net Thu Feb 20 17:22:18 2003 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Feb 20 17:22:18 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: References: <000201c2d900$12107110$6601a8c0@NICKLEBY> Message-ID: R.M.Everson at exeter.ac.uk (R.M.Everson) writes: > Hi, > > As Paul Dubois says, some Numeric functions can be configured to use the > BLAS library. However, the BLAS is not used for, perhaps the most common > and important operation: matrix/vector multiplication. 
> > We have written a small patch to replace the > matrixproduct/dot/innerproduct functions in multiarraymodule.c with the > appropriate BLAS calls. > > The patch (against Numeric 21.1b) can be found at > http://www.dcs.ex.ac.uk/~aschmolc/Numeric and can give a speed-up of a > factor of 40 on 1000 by 1000 matrices using the Atlas BLAS. More details > of the (naive!) timings can be found there too. > An addendum: the new version is no longer a patch against Numeric, but a separate module, currently called 'dotblas', which is a cleaner approach as it doesn't require using a modified version of Numeric. To use this fast dot instead of Numeric's dot, you can, e.g., do:

import Numeric
# no errors if dotblas isn't installed
try:
    import dotblas
    Numeric.dot = dotblas.dot
except ImportError:
    pass

I just put a prerelease (which still handles complex arrays DIFFERENTLY from Numeric!!!) online at: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/dotblas.html enjoy, alex
From falted at openlc.org Fri Feb 21 03:18:03 2003 From: falted at openlc.org (Francesc Alted) Date: Fri Feb 21 03:18:03 2003 Subject: [Numpy-discussion] Non-regular lists in numarray Message-ID: <200302211216.56247.falted@openlc.org> Hi, I've found that numarray.array doesn't check the input enough for non-regular objects. For example: In [95]: numarray.array([3., [4, 5.2]]) Out[95]: array([ 3. , 5.7096262]) but, In [96]: Numeric.array([3., [4, 5.2]]) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ? TypeError: bad argument type for built-in operation I find Numeric's behaviour more appropriate.
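The missing check can be sketched in pure Python: compute the shape of a nested list recursively and reject mismatched sub-shapes (an illustrative helper, not numarray code):

```python
def shape_of(obj):
    """Return the shape of a regularly nested list, or raise TypeError."""
    if not isinstance(obj, list):
        return ()                       # a scalar: empty shape
    shapes = set(shape_of(item) for item in obj)
    if len(shapes) > 1:                 # sub-shapes disagree: not regular
        raise TypeError("input is not regularly nested")
    return (len(obj),) + (shapes.pop() if shapes else ())

print(shape_of([[1, 2], [3, 4]]))  # (2, 2)
try:
    shape_of([3., [4, 5.2]])           # the irregular input from above
except TypeError as e:
    print("rejected:", e)
```

A check along these lines would make numarray.array fail loudly on the irregular input, as Numeric does, instead of silently producing garbage values.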
Regards, -- Francesc Alted

From jmiller at stsci.edu Fri Feb 21 08:09:18 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Feb 21 08:09:18 2003 Subject: [Numpy-discussion] Non-regular lists in numarray References: <200302211216.56247.falted@openlc.org> Message-ID: <3E564F0E.90400@stsci.edu> I logged this as a bug and I'll get to it as soon as I'm out of "numarray overhead reduction mode." Thanks! Todd

Francesc Alted wrote: >Hi, > >I've found that numarray.array doesn't check enough the input for >non-regular objects. For example: > >In [95]: numarray.array([3., [4, 5.2]]) >Out[95]: array([ 3. , 5.7096262]) > >but, > >In [96]: Numeric.array([3., [4, 5.2]]) >--------------------------------------------------------------------------- >TypeError Traceback (most recent call last) > >? > >TypeError: bad argument type for built-in operation > > >I find Numeric behaviour more appropriate. > >Regards, > > >

From paul at pfdubois.com Fri Feb 21 08:36:02 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 21 08:36:02 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <200302201759.44307.jensj@bose.fysik.dtu.dk> Message-ID: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> I had forgotten about this case. I think when these were done it was thought that it would be better if the Numeric core did not require use of LAPACK/BLAS. We were thinking back then of a core with other packages, and the blas we use by default is probably the same speed so it didn't seem important. I would have no problem with a patch to change this. > -----Original Message----- > From: Jens Jorgen Mortensen [mailto:jensj at fysik.dtu.dk] > Sent: Thursday, February 20, 2003 9:00 AM > To: Paul F Dubois > Subject: Re: [Numpy-discussion] BLAS > > > On Thursday, 20 February 2003 17:49, Paul F Dubois wrote: > > > > When doing matrix-matrix multiplications with large > matrices, using > > > the BLAS library (Basic Linear Algebra Subprograms) can speed up > > > things a lot.
I don't > > > think Numeric takes advantage of this (is this correct?). > > > > No. You can configure it at installation to use the BLAS of choice. > > > > I know that the stuff in LinearAlgebra can be configured to > use a BLAS of > choice, but what about the Numeric.dot function? > > Can I configure Numeric so that this: > > >>> a = Numeric.dot(b, c) > > will use BLAS? > > Jens > > From haase at msg.ucsf.edu Fri Feb 21 23:10:05 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Feb 21 23:10:05 2003 Subject: [Numpy-discussion] make C array accessible to python without copy Message-ID: Short follow up: 1) Is it planned to support this more directly? 2) How much does it cost to create a buffer object if it uses my already allocated memory ? 3) Can I change the pointer so that it points to a different memory space WITHOUT having to recreate any python objects? Or would that "confuse" the buffer or numarray? (We are hoping to aquire 30 images per second - the images should get written into a circular buffer so that the data can be written to disk in larger chunks - but the python array should always refer to the current image ) Thanks for all the nice toys (tools) ;-) Sebastian Haase On Fri, 17 Jan 2003 18:16:01 -0500 Todd Miller wrote: >Sebastian Haase wrote: > >>Hi, >>What is the C API to make an array that got allocated, >>let's say, by a = new short[512*512], >>accessible to python as numarray. >> >What you want to do is not currently supported well in C. > The way to do what you want is: > >1. Create a buffer object from your C++ array. The >buffer object can be built such that it refers to the >original copy of the data. > >2. Call back into Python (numarray.NumArray) with your >buffer object as the buffer parameter. > >You can scavenge the code in NA_newAll (Src/newarray.ch) >for most of the callback. > >>I tried NA_New - but that seems to make a copy. 
>>I would need it to use the original memory space >>so that I can "observe" the array from Python WHILE >>the underlying C array changes (it's actually a camera >>image) >> >That sounds cool! > >> >>Thanks, >>Sebastian Haase

From g_will at cyberus.ca Thu Feb 27 11:05:10 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Thu Feb 27 11:05:10 2003 Subject: [Numpy-discussion] filtering numeric arrays Message-ID: <003501c2de93$3c6b50e0$c456e640@wnt20337> Hi All, I have a 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would like to remove all the points in the array that don't meet the min/max point criteria. I will have several thousand points. With lists I can do it like

[(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax]

How do I get the same functionality and better speed using numeric. I have tried a bunch of things using compress and take but I am running up against a brick wall. Any ideas? Thanks Gordon Williams

From tim.hochberg at ieee.org (Tim Hochberg) Subject: [Numpy-discussion] filtering numeric arrays In-Reply-To: <003501c2de93$3c6b50e0$c456e640@wnt20337> References: <003501c2de93$3c6b50e0$c456e640@wnt20337> Message-ID: <3E5E65A6.8030409@ieee.org> Gordon Williams wrote: >Hi All, > >I have a 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would >like to remove all the points in the array that don't meet the min/max point >criteria. I will have several thousand points. With lists I can do it like > >[(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] > >How do I get the same functionality and better speed using numeric. I have >tried a bunch of things using compress and take but I am running up against >a brick wall. > > I think you want something like this:

>>> cond = (xMin < a[:,0]) & (a[:,0] < xMax) & (yMin < a[:,1]) & (a[:,1] < yMax)
>>> np.compress(cond, a, 0)

Where 'a' is your original Nx2 array. Unfortunately the obvious and prettier notation using (xMin < a[:,0] < xMax) fails because python treats that as "(xMin < a[:,0]) and (a[:,0] < xMax)" and "and" is not what you need here, '&' is. -tim > >Any ideas? > >Thanks > >Gordon Williams > > > > > > >------------------------------------------------------- >This sf.net email is sponsored by:ThinkGeek >Welcome to geek heaven.
>http://thinkgeek.com/sf >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > >

From Chris.Barker at noaa.gov Thu Feb 27 11:41:12 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Feb 27 11:41:12 2003 Subject: [Numpy-discussion] filtering numeric arrays In-Reply-To: <003501c2de93$3c6b50e0$c456e640@wnt20337> Message-ID: <5E98E540-4A8B-11D7-8A87-000393A96660@noaa.gov> On Thursday, February 27, 2003, at 11:05 AM, Gordon Williams wrote: > I have a 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I > would > like to remove all the points in the array that don't meet the min/max > point > criteria. I will have several thousand points. With lists I can do > it like > > [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax]

This should do it:

>>> a
array([[1, 3],
       [2, 4],
       [5, 6]])
>>> valid = (a[:,0] > minX) & (a[:,0] < maxX) & (a[:,1] > minY) & (a[:,1] < maxY)
>>> take(a,nonzero(valid))
array([ [2, 4]])

Note that & is a bitwise-and, not a logical and, but in this case, the result is the same. Unfortunately, the way Python works, overloading "and" is difficult. -Chris

Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From g_will at cyberus.ca Thu Feb 27 12:45:31 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Thu Feb 27 12:45:31 2003 Subject: [Numpy-discussion] Re: filtering numeric arrays Message-ID: <001601c2dea1$84d94e50$c456e640@wnt20337> Thanks to Tim and Chris. Just what I was looking for! I tested both along with some other tries that I had made.
For 10000 points -

G:\GPS\Python\GUI_Test\Filter>speed.py
time for <function listComp at ...> is 0.022809
time for <function arraykludge at ...> is 0.060303
time for <function arrayComp at ...> is 0.055692
time for <function arrayTimH at ...> is 0.003652
time for <function arrayChrisB at ...> is 0.003561

For 100 points -

G:\GPS\Python\GUI_Test\Filter>speed.py
time for <function listComp at ...> is 0.000238
time for <function arraykludge at ...> is 0.000784
time for <function arrayComp at ...> is 0.000678
time for <function arrayTimH at ...> is 0.000376
time for <function arrayChrisB at ...> is 0.000153

They scale slightly differently between Tim's and Chris' methods. Thanks again, Gordon Williams

Here is the code (since someone will ask):

'''Test the speed of different methods of getting points out of a list'''
import time
import Numeric as n

size= 100
maxNum=size/10.

#data
a= n.array(n.arange(0,maxNum,.05))
a.shape= (size,2)
#list
l= a.tolist()

(xMin,yMin)= (3,2)
(xMax,yMax)= (4,6)

def listComp(seq):
    '''using list comprehension'''
    return [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax]

[the definitions of arraykludge and arrayComp were lost in the archive]

def arrayTimH(seq):
    # reconstructed from Tim's post above
    cond= (xMin < seq[:,0]) & (seq[:,0] < xMax) & (yMin < seq[:,1]) & (seq[:,1] < yMax)
    return n.compress(cond, seq, 0)

def arrayChrisB(seq):
    valid= (seq[:,0] > xMin) & (seq[:,0] < xMax) & (seq[:,1] > yMin) & (seq[:,1] < yMax)
    return n.take(a,n.nonzero(valid))

#Tests
tests= [(listComp,l), (arraykludge,a),(arrayComp,a), (arrayTimH,a), (arrayChrisB,a)]
for fun,seq in tests:
    t=time.clock()
    apply(fun,(seq,))
    dt= time.clock()-t
    print "time for %s is %f" %(str(fun),dt)

----- Original Message ----- From: "Gordon Williams" To: Sent: Thursday, February 27, 2003 2:05 PM Subject: filtering numeric arrays > Hi All, > > I have a 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would > like to remove all the points in the array that don't meet the min/max point > criteria. I will have several thousand points. With lists I can do it like > > [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] > > How do I get the same functionality and better speed using numeric. I have > tried a bunch of things using compress and take but I am running up against > a brick wall. > > > Any ideas? > > Thanks > > Gordon Williams > > > >

From dubois1 at llnl.gov Thu Feb 27 16:44:05 2003 From: dubois1 at llnl.gov (Paul F. Dubois) Date: Thu Feb 27 16:44:05 2003 Subject: [Numpy-discussion] Last call for v. 23 Message-ID: I am going to make a release of Numeric, 23.0.
Fellow developers who are inspired to fix a bug are urged to do so immediately. This will be a bug fix release.

From jdhunter at ace.bsd.uchicago.edu Thu Feb 27 20:27:15 2003 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Thu Feb 27 20:27:15 2003 Subject: [Numpy-discussion] speedy remove mean of rows Message-ID: I have a large (40,000 x 128) Numeric array, X, with typecode Float. In some cases the number of rows may be approx 10x greater. I want to create an array Y with the same dimensions as X, where each element of Y is the corresponding element of X with the mean of the row on which it occurs subtracted away. Ie,

Y = X - transpose(resize(mean(X,1), (X.shape[1],X.shape[0])))

I am wondering if this is the most efficient way (speed and memory). Thanks for any suggestions, John Hunter

From eric at enthought.com Thu Feb 27 21:40:17 2003 From: eric at enthought.com (Eric Jones) Date: Thu Feb 27 21:40:17 2003 Subject: [Numpy-discussion] speedy remove mean of rows Message-ID: <20030228054158.0D0111050@www.enthought.com> Hey John, I think broadcasting is your best bet. Here is a snippet using scipy (Numeric will be pretty much the same).

>>> from scipy import *
>>> a = stats.random((4,3))
>>> a
array([[ 0.94058263, 0.24342623, 0.74673623],
       [ 0.53151542, 0.07523929, 0.49730805],
       [ 0.5161854 , 0.51049614, 0.70360875],
       [ 0.09470515, 0.60604334, 0.64941102]])
>>> stats.mean(a) # axis=-1 by default in scipy
array([ 0.6435817 , 0.36802092, 0.57676343, 0.45005317])
>>> a-stats.mean(a)[:,NewAxis]
array([[ 0.29700093, -0.40015546, 0.10315453],
       [ 0.1634945 , -0.29278163, 0.12928713],
       [-0.06057803, -0.06626729, 0.12684532],
       [-0.35534802, 0.15599017, 0.19935785]])

eric

John Hunter wrote .. > > I have a large (40,000 x 128) Numeric array, X, with typecode Float. > In some cases the number of rows may be approx 10x greater.
> > I want to create an array Y with the same dimensions as X, where each > element of Y is the corresponding element of X with the mean of the > row on which it occurs subtracted away. Ie, > > Y = X - transpose(resize(mean(X,1), (X.shape[1],X.shape[0]))) > > I am wondering if this is the most efficient way (speed and memory). > > Thanks for any suggestions, > John Hunter

From a.schmolck at gmx.net Fri Feb 28 09:54:04 2003 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Fri Feb 28 09:54:04 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> References: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > I had forgotten about this case. I think when these were done it was thought > that it would be better if the Numeric core did not require use of > LAPACK/BLAS. We were thinking back then of a core with other packages, and > the blas we use by default is probably the same speed so it didn't seem > important. I would have no problem with a patch to change this. Great. I submitted a patch just now. alex

From fperez at colorado.edu Fri Feb 28 14:38:02 2003 From: fperez at colorado.edu (Fernando Perez) Date: Fri Feb 28 14:38:02 2003 Subject: [Numpy-discussion] Tentative fix for Numtut's view.py In-Reply-To: References: Message-ID: <3E5FE482.4080204@colorado.edu> Hi all, > Subject: [Numpy-discussion] Last call for v. 23 > > I am going to make a release of Numeric, 23.0. Fellow developers who are inspired > to fix a bug are urged to do so immediately. > > This will be a bug fix release.
in the scipy mailing list there were some discussions about view.py as included in NumTut. It seems that many folks (including myself) have had problems with it, and they seem to be threading-related. The symptom is that once view is imported, the interactive interpreter essentially locks up, and typing becomes nearly impossible. I know next to nothing about threading, but in an attempt to fix the problem I stripped view.py bare of everything I didn't understand, until it worked :) Basically I removed all PIL and threading-related code, and left only the bare Tk code in place. Naive as this approach was, it seems to have worked. Some folks reported success, and David Ascher (the original author of view.py) suggested I submit this to the Numpy team as an update to the tutorial. There's a good chance the current view is just broken and nobody has bothered to use it in a long time.

I'm attaching the new view here as a file, but if there is a different protocol I should follow, please let me know (patch, etc). As I said, this was the most simple-minded thing I could do to make it work. So if you are interested in accepting this, it might be wise to have a look at it first. On the upside, pretty much all I did was to _remove_ code, not to add anything. So the analysis should be easy (the new code is far simpler and shorter than the original). I've tested it personally under python 2.2.1 (the stock Redhat 8.0 install). Best, Fernando Perez. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: view.py URL:

From perry at stsci.edu Tue Feb 4 07:41:02 2003 From: perry at stsci.edu (Perry Greenfield) Date: Tue Feb 4 07:41:02 2003 Subject: [Numpy-discussion] more than 2-D numarrays in recarray? In-Reply-To: <200302041400.34961.falted@openlc.org> Message-ID: > > A Dimarts 04 Febrer 2003 13:19, Todd Miller va escriure: > > I see two problems with multi-d numarray fields, both > > solvable: > > > > 1. Multidimensional numarrays must be described in the recarray spec. > > > > 2. Either numarray or recarray must be able to handle a (slightly) more > > complicated case of recomputing array strides from shape and > > (bytestride,record-length). > > > > I didn't design or implement recarray so there may be other problems as > > well. > > I had a look at the code and it seems like you are right. > > > I don't think it's a priority now. What do you need them for? > > Well, I've adopted the recarray object (actually a slightly > modified version > of it) to be a fundamental building block in next release of PyTables. If > arbitrary dimensionality were implemented, the resulting tables would be > more general.
> Moreover, I'm thinking about implementing unlimited > (just one > axis) array dimension support and having a degenerated recarray with just > one column as a multimensional numarray object would easy quite a lot the > implementation. > > Of course, I could implement my own recarray version with that > support, but > I just don't want to diverge so much from the reference implementation. > > -- > Francesc Alted >

As Todd says, the initial implementation was to support only 1-d cases. There is no fundamental reason why it shouldn't support the general case. We'd like to work with you about how that should be best implemented. Basically the issue is how we save the shape information for that field. I don't think it would be hard to implement. Perry

From tim.hochberg at ieee.org Tue Feb 4 08:52:05 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Tue Feb 4 08:52:05 2003 Subject: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison Message-ID: <3E3FEF8F.6000807@ieee.org> I was inspired by Armin's latest Psyco version to try and see how well one could do with NumPy/NumArray implemented in Psycotic Python. I wrote a bare bones, pure Python, Numeric array class loosely based on Jnumeric (which in turn was loosely based on Numeric). The buffer is just Python's array.array. At the moment, all that one can do to the arrays is add and index them and the code is still a bit of a mess. I plan to clean things up over the next week in my copious free time <0.999 wink> and at that point it should be easy to add the remaining operations.

I benchmarked this code, which I'm calling Psymeric for the moment, against NumPy and Numarray to see how it did. I used a variety of array sizes, but mostly relatively large arrays of shape (500,100) and of type Float64 and Int32 (mixed and with consistent types) as well as scalar values. Looking at the benchmark data one comes to three main conclusions:

* For small arrays NumPy always wins.
Both Numarray and Psymeric have much larger overhead.
* For large, contiguous arrays, Numarray is about twice as fast as either of the other two.
* For large, noncontiguous arrays, Psymeric and NumPy are ~20% faster than Numarray

The impressive thing is that Psymeric is generally slightly faster than NumPy when adding two arrays. It's slightly slower (~10%) when adding an array and a scalar, although I suspect that could be fixed by some special casing a la Numarray. Adding two (500,100) arrays of type Float64 together results in the following timings:

             psymeric   numpy      numarray
contiguous   0.0034 s   0.0038 s   0.0019 s
stride-2     0.0020 s   0.0023 s   0.0033 s

I'm not sure if this is important, but it is an impressive demonstration of Psyco! More later when I get the code a bit more cleaned up. -tim

From falted at openlc.org Tue Feb 4 10:06:03 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Feb 4 10:06:03 2003 Subject: [Numpy-discussion] more than 2-D numarrays in recarray? In-Reply-To: References: Message-ID: <200302041904.29807.falted@openlc.org> On Tuesday 04 February 2003 16:40, Perry Greenfield wrote: > We'd like to work with you about how that should be best implemented. > Basically the issue is how we save the shape information for that field. > I don't think it would be hard to implement. Ok, great!
Well, my proposals for extended recarray syntax are:

1.- Extend the actual formats to read something like:

['(1,)i1', '(3,4)i4', '(16,)a', '(2,3,4)i2']

Pro's:
- It's the straightforward extension of the actual format
- Should be easy to implement
- Note that the charcodes have been substituted by a slightly more verbose version ('i2' instead of 's', for example)
- Short and simple

Con's:
- It is still string-code based
- Implicit field order

2.- Make use of the syntax I'm suggesting in past messages:

class Particle(IsRecord):
    name = Col(CharType, (16,), dflt="", pos=3)   # 16-character String
    ADCcount = Col(Int8, (1,), dflt=0, pos=1)     # signed byte
    TDCcount = Col(Int32, (3,4), dflt=0, pos=2)   # signed integer
    grid_i = Col(Int16, (2,3,4), dflt=0, pos=4)   # signed short integer

Pro's:
- It gets rid of charcodes or string codes
- The map between name and type is visually clear
- Explicit field order
- The columns can be defined as __slots__ in the class constructor, making it impossible to assign (through __setattr__, for example) values to non-existing columns.
- Is elegant (IMO)

Con's:
- Requires more typing to define
- Not as concise as 1) (but a short representation can be made inside IsRecord!)
- Difficult to define dynamically

3.- Similar to 2), but with a dictionary like:

Particle = {
    "name" : Col(CharType, (16,), dflt="", pos=3),   # 16-character String
    "ADCcount" : Col(Int8, (1,), dflt=0, pos=1),     # signed byte
    "TDCcount" : Col(Int32, (3,4), dflt=0, pos=2),   # signed integer
    "grid_i" : Col(Int16, (2,3,4), dflt=0, pos=4),   # signed short integer
}

Pro's:
- It gets rid of charcodes or string codes
- The map between name and type is visually clear
- Explicit field order
- Easy to build dynamically

Con's:
- No possibility to define __slots__
- Not as elegant as 2), but it looks fine.
4.- List-based approach:

Particle = [
    Col(Int8, (1,), dflt=0),        # signed byte
    Col(Int32, (3,4), dflt=0),      # signed integer
    Col(CharType, (16,), dflt=""),  # 16-character String
    Col(Int16, (2,3,4), dflt=0),    # signed short integer
]

Pro's:
- Costs less to type (less verbose)
- Easy to build dynamically

Con's:
- Implicit field order
- Map between field names and contents not visually clear

Note: In the previous discussion explicit order has been considered better than implicit, following the Python mantra, and although some people may think that this doesn't apply well here, I do (but, again, this is purely subjective).

Of course, a combination of two alternatives can be the best. My current experience tells me that a combination of 2 and 3 may be very good. In that way, a user can define their recarrays as classes, but if he needs to define them dynamically, the recarray constructor can also accept a dictionary like 3 (but, obviously, the same applies to case 4). In the end, the recarray instance should have a variable that points to this definition class, where metadata is kept, but a shortcut in the form 1) can also be constructed for convenience.

IMO integrating options 2 and 3 (even 4) is not difficult to implement and, in fact, such a combination is already present in the PyTables CVS version. I might even provide a recarray version with 2 & 3 integrated for developers' evaluation. Comments?, -- Francesc Alted

From perry at stsci.edu Wed Feb 5 07:06:08 2003 From: perry at stsci.edu (Perry Greenfield) Date: Wed Feb 5 07:06:08 2003 Subject: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E3FEF8F.6000807@ieee.org> Message-ID: Tim Hochberg writes: > I was inspired by Armin's latest Psyco version to try and see how well > one could do with NumPy/NumArray implemented in Psycotic Python. I wrote > a bare bones, pure Python, Numeric array class loosely based on Jnumeric > (which in turn was loosely based on Numeric).
> The buffer is just > Python's array.array. At the moment, all that one can do to the arrays > is add and index them and the code is still a bit of a mess. I plan to > clean things up over the next week in my copious free time <0.999 wink> > and at that point it should be easy to add the remaining operations. > > I benchmarked this code, which I'm calling Psymeric for the moment, > against NumPy and Numarray to see how it did. I used a variety of array > sizes, but mostly relatively large arrays of shape (500,100) and of type > Float64 and Int32 (mixed and with consistent types) as well as scalar > values. Looking at the benchmark data one comes to three main conclusions: > * For small arrays NumPy always wins. Both Numarray and Psymeric have > much larger overhead. > * For large, contiguous arrays, Numarray is about twice as fast as > either of the other two. > * For large, noncontiguous arrays, Psymeric and NumPy are ~20% faster > than Numarray > The impressive thing is that Psymeric is generally slightly faster than > NumPy when adding two arrays. It's slightly slower (~10%) when adding an > array and a scalar although I suspect that could be fixed by some > special casing a la Numarray. Adding two (500,100) arrays of type > Float64 together results in the following timings:
>              psymeric   numpy      numarray
> contiguous   0.0034 s   0.0038 s   0.0019 s
> stride-2     0.0020 s   0.0023 s   0.0033 s
> > I'm not sure if this is important, but it is an impressive demonstration > of Psyco! More later when I get the code a bit more cleaned up. > > -tim >

The "psymeric" results are indeed interesting. However, I'd like to make some remarks about numarray benchmarks. At this stage, most of the focus has been on large, contiguous array performance (and as can be seen that is where numarray does best).
For example, the current behavior with strided arrays results in looping over subblocks of the array, and that looping is done on relatively small blocks in Python. We haven't done any tuning yet to see what the optimum size of block should be (it may be machine dependent as well), and it is likely that the loop will eventually be moved into C. Small array performance should improve quite a bit; we are looking into how to do that now and should have a better idea soon of whether we can beat Numeric's performance or not.

But the "psymeric" approach raises an obvious question (implied I guess, but not explicitly stated). With Psyco, is there a need for Numeric or numarray at all? I haven't thought this through in great detail, but at least one issue seems tough to address in this approach, and that is handling numeric types not supported by Python (e.g., Int8, Int16, UInt16, Float32, etc.). Are you offering the possibility of the "psymeric" approach as being the right way to go, and if so, how would you handle this issue?

On the other hand, there are lots of algorithms that cannot be handled well with array manipulations. It would seem that psyco would be a natural alternative in such cases (as long as one is content to use Float64 or Int32), but it isn't obvious that these require arrays as anything but data structures (e.g. places to obtain and store scalars). Perry Greenfield

From tim.hochberg at ieee.org Wed Feb 5 08:54:05 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Wed Feb 5 08:54:05 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: References: Message-ID: <3E414181.6020302@ieee.org> Perry Greenfield wrote: >The "psymeric" results are indeed interesting. However, I'd like to >make some remarks about numarray benchmarks. At this stage, most of >the focus has been on large, contiguous array performance (and as >can be seen that is where numarray does best).
>There are a number >of other improvements that can and will be made to numarray performance >so some of the other benchmarks are bound to improve (how much is >uncertain). For example, the current behavior with strided arrays >results in looping over subblocks of the array, and that looping is >done on relatively small blocks in Python. We haven't done any tuning >yet to see what the optimum size of block should be (it may be machine >dependent as well), and it is likely that the loop will eventually be >moved into C. Small array performance should improve quite a bit, we >are looking into how to do that now and should have a better idea >soon of whether we can beat Numeric's performance or not. > >

I fully expect numarray to beat Numeric for large arrays eventually, just based on the fact that psymeric tends to be slightly faster than Numeric now for many cases. However, for small arrays it seems that you're likely to be fighting the function call overhead of Python unless you go completely, or nearly completely, to C. But that would be a shame as it would make modifying/extending numarray that much harder.

>But the "psymeric" approach raises an obvious question (implied I guess, but >not explicitly stated). With Psyco, is there a need for Numeric or >numarray at all? I haven't thought this through in great detail, but at >least one issue seems tough to address in this approach, and that is >handling numeric types not supported by Python (e.g., Int8, Int16, UInt16, >Float32, etc.). Are you offering the possibility of the "psymeric" >approach as being the right way to go, > >

I think there are too many open questions at this point to be a serious contender. It's interesting enough and the payoff would be big enough that I think it's worth throwing out some of the questions and see if anything interesting pops out.

> and if so, how would you handle >this issue? > >

The types issue may not be a problem.
Python's array.array supports a full set of types (http://www.python.org/doc/current/lib/module-array.html). However, psyco does not currently support fast operations on types 'f', 'I' and 'L'. I don't know if this is a technical problem, or something that's likely to be resolved in time. The 'f' (Float32) case is critical, the others less so. Armin, if you're reading this perhaps you'd like to comment?

>On the other hand, there are lots of algorithms that cannot be handled >well with array manipulations. >

This is where the Psyco approach would shine. One occasionally runs into cases where some part of the computation just cannot be done naturally with array operations. A common case is the equivalent of this bit of C code: "A[i] = (C[i] < [...]" [the rest of this example was lost in the archive].

>It would seem that psyco would be a natural >alternative in such cases (as long as one is content to use Float64 or >Int32), but it isn't obvious that these require arrays as anything but >data structures (e.g. places to obtain and store scalars). > >

That's not been my experience. When I've run into awkward cases like this it's been in situations where nearly all of my computations could be vectorized. Anyway, here are the issues as I see them with this type of approach:

* Types: I believe that this should not be a problem
* Interfacing with C/Fortran: This seems necessary for any Numeric wannabe. It seems that it must be possible, but it may require a bit of C-code, so it may not be possible to get completely away from C.
* Speed: It's not clear to me at this point whether psymeric would get any faster than it currently is. It's pretty fast now, but the factor of two difference between it and numarray for contiguous arrays (a common case) is nothing to sneeze at.
* Cross-platform: This is the real killer. Psyco only runs on x86 machines. I don't know if or when that's likely to change. Not being cross platform seems to nix this from being a serious contender as a Numeric replacement for the time being.
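Tim's point about element types can be checked against the standard library directly. The following sketch (mine, not from the thread) shows that array.array does give fixed-width buffers even for types Python has no scalar for, including the 32-bit 'f' case he singles out as critical:

```python
# Stdlib check of the typecodes discussed above: array.array stores
# fixed-width elements; whether psyco can operate on them quickly is
# the separate, open question raised in the message.
from array import array

f32 = array('f', [1.0, 2.0, 3.0])  # 32-bit floats ('f' -- the critical case)
f64 = array('d', [1.0, 2.0, 3.0])  # 64-bit floats
i16 = array('h', [1, 2, 3])        # signed 16-bit integers
u32 = array('I', [1, 2, 3])        # unsigned ints ('I' -- one psyco can't yet speed up)

for buf in (f32, f64, i16, u32):
    print(buf.typecode, buf.itemsize)
```

So the storage side of "Int8, Int16, UInt16, Float32, etc." is already covered by the stdlib; only the fast-arithmetic side depends on psyco.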
-tim From falted at openlc.org Fri Feb 7 09:46:04 2003 From: falted at openlc.org (Francesc Alted) Date: Fri Feb 7 09:46:04 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E414181.6020302@ieee.org> References: <3E414181.6020302@ieee.org> Message-ID: <200302071843.09511.falted@openlc.org> On Wednesday 05 February 2003 17:53, Tim Hochberg wrote: > However, for small arrays it seems that you're likely to > be fighting the function call overhead of Python unless you go > completely, or nearly completely, to C. But that would be a shame as it > would make modifying/extending numarray that much harder. For this task it may be worth considering Pyrex (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/). From the website: """Pyrex lets you write code that mixes Python and C data types any way you want, and compiles it into a C extension for Python.""" That is, if you have code in Python and want to accelerate it, it's much easier to move it to Pyrex than to C, since Pyrex has Python syntax. In addition, it lets you call C routines very easily (just by declaring them) and provides transparent access to variables, functions and objects in the Python namespace. Apart from the standard Python loop statement, Pyrex introduces a new kind of for-loop (in the form "for i from 0 <= i < n:") for iterating over ranges of integers at C speed, which can be very handy when optimizing many numarray loops. Another advantage is that Pyrex compiles its own code to C, and you can distribute this C code in your package without forcing the final extension user to install Pyrex (because it's just a kind of compiler). I've been using it for more than six months and it's pretty stable and works very well (at least on UNIX machines; I don't have experience on Windows or OSX platforms).
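[As a rough illustration of the kind of loop Pyrex's integer for-loop targets, here is a plain-Python version (a hypothetical example, not from the original message); in Pyrex one would declare `cdef int i` and write `for i from 0 <= i < n:` so the loop body compiles to C:]

```python
from array import array

def dot(a, b):
    # Inner-product loop over flat buffers. Pure Python here; the same
    # code with C type declarations is what Pyrex would run at C speed.
    n = len(a)
    s = 0.0
    for i in range(n):
        s += a[i] * b[i]
    return s
```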
Just my two cents, -- Francesc Alted From paul at pfdubois.com Fri Feb 7 09:59:08 2003 From: paul at pfdubois.com (Paul Dubois) Date: Fri Feb 7 09:59:08 2003 Subject: [Numpy-discussion] Some bugs in Numeric fixed today in CVS Message-ID: <000301c2ced2$70627570$6601a8c0@NICKLEBY> [ 614808 ] Inconsistent use of tabs and spaces Fixed as suggested by Jimmy Retzlaff LinearAlgebra.py Matrix.py RNG/__init__.py RNG/Statistics.py [ 621032 ] needless work in multiarraymodule.c Fixes suggested by Greg Smith applied. Also recoded OBJECT_DotProduct to eliminate a warning error. [ 630584 ] generalized_inverse of complex array Fix suggested by Greg Smith applied. [ 652061 ] PyArray_As2D doesn't check pointer. Fix suggested by Andrea Riciputi applied. [ 655512 ] inverse_real_fft incorrect many sizes Fix given by mbriest applied. From Chris.Barker at noaa.gov Fri Feb 7 11:06:05 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Feb 7 11:06:05 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison References: <3E414181.6020302@ieee.org> <200302071843.09511.falted@openlc.org> Message-ID: <3E43FBFA.4B0C0FA1@noaa.gov> Francesc Alted wrote: > For this task it may be worth considering Pyrex > (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/). From the > website: > > """Pyrex lets you write code that mixes Python and C data types any way you > want, and compiles it into a C extension for Python.""" I've been keeping my eye on Pyrex for a while now, but have not yet had enough of a use for it to justify trying it out. I do have a question that I have not found the answer to on the web, which could make a big difference to how useful it is to me: Is Pyrex aware of Numeric Arrays? I imagine it could use them just fine, using the generic Python sequence get-item stuff, but that would be a whole lot lower performance than if it understood the Numeric API and could access the data array directly.
Also, how does it deal with multiple dimension indexing ( array[3,6,2] ), which the standard Python sequence types do not support? As I think about this, I think your suggestion is fabulous. Pyrex (or a Pyrex-like) language would be a fabulous way to write code for NumArray, if it really made use of the NumArray API. Thanks for your input, -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From paul at pfdubois.com Fri Feb 7 13:48:04 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 7 13:48:04 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E43FBFA.4B0C0FA1@noaa.gov> Message-ID: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> { CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another language I used to use; Perl? } Perhaps knowledgeable persons could comment on the feasibility of coding MA (masked arrays) in straight Python and then using Psyco on it? The existing implementation is in pure Python and uses Numeric to represent the two arrays it holds (the data and sometimes a mask) in each object. A great deal of wasted motion is devoted to preparing Numeric arrays so as to avoid operations on masked elements. It could have been written a lot more simply if performance didn't dictate trying to leverage off Numeric. In straight Python one can imagine an add, for example, that was roughly: for k in range(len(a.data)): result.mask[k] = a.mask[k] or b.mask[k] result.data[k] = a.data[k] if result.mask[k] else a.data[k] + b.data[k] (using the new if-expression PEP just to confuse the populace) It seems to me that this might be competitive given the numbers someone posted before. Alas, I can't remember who the original poster was, but I'd guess they might have a good guess.
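[A runnable sketch of the masked add described above, assuming plain array.array buffers for data and Python lists for masks; the result.mask/result.data attributes from the message are flattened into a simple function here:]

```python
from array import array

def masked_add(a_data, a_mask, b_data, b_mask):
    # Where either operand is masked, the result is masked and the data
    # value is carried through unchanged; otherwise add element-wise.
    n = len(a_data)
    r_data = array('d', [0.0] * n)
    r_mask = [False] * n
    for k in range(n):
        r_mask[k] = a_mask[k] or b_mask[k]
        r_data[k] = a_data[k] if r_mask[k] else a_data[k] + b_data[k]
    return r_data, r_mask
```

[This uses the PEP 308 conditional-expression syntax the message anticipates, which is now standard Python.]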
> -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Chris Barker > Sent: Friday, February 07, 2003 10:34 AM > To: falted at openlc.org; numpy-discussion at lists.sourceforge.net > Subject: Re: [Psyco-devel] RE: [Numpy-discussion] Interesting > Psyco/Numeric/Numarray comparison > > > Francesc Alted wrote: > > > For this task it may be worth considering Pyrex > > (http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/). > From > > the > > website: > > > > """Pyrex lets you write code that mixes Python and C data types any > > way you want, and compiles it into a C extension for Python.""" > > I've been keeping my eye on Pyrex for a while now, but have > not yet had enough of a use for it to justify trying it out. > I do have a question that I have not found the answer to on > the web, which could make a big difference to how useful it is to me: > > Is Pyrex aware of Numeric Arrays? > > I imagine it could use them just fine, using the generic > Python sequence get-item stuff, but that would be a whole lot > lower performance than if it understood the Numeric API and > could access the data array directly. Also, how does it deal > with multiple dimension indexing ( array[3,6,2] ) which the > standard python sequence types do not support? > > As I think about this, I think your suggestion is fabulous. > Pyrex (or a > Pyrex-like) language would be a fabulous way to write code > for NumArray, if it really made use of the NumArray API. > > Thanks for your input, > > -Chris > > -- > Christopher Barker, Ph.D. > Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > ------------------------------------------------------- > This SF.NET email is sponsored by: > SourceForge Enterprise Edition + IBM + LinuxWorld = Something > 2 See!
http://www.vasoftware.com > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From Chris.Barker at noaa.gov Fri Feb 7 14:25:03 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Feb 7 14:25:03 2003 Subject: [Numpy-discussion] Psyco MA? References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> Message-ID: <3E442A77.413648CC@noaa.gov> -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Fri Feb 7 14:41:03 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Feb 7 14:41:03 2003 Subject: [Numpy-discussion] Psyco MA? References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> Message-ID: <3E442E5A.89334754@noaa.gov> oops, sorry about the blank message. Paul F Dubois wrote: > { CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another What the heck is the if-PEP? > Perhaps knowledgeable persons could comment on the feasibility of coding MA > (masked arrays) in straight Python and then using Psyco on it? Is there confusion between Psyco and Pyrex? Psyco runs regular old Python bytecode, and individually compiles little pieces of it as needed into machine code. As I understand it, this should make loops where the inner part is a pretty simple operation very fast. However, Psyco is pretty new, and I have no idea how robust and stable it is, but it is certainly not cross-platform. As it generates machine code, it needs to be carefully ported to each hardware platform, and it currently only works on x86. Pyrex, on the other hand, is a "Python-like" language that is translated into C, and then the C is compiled. The C it generates is pretty darn platform independent, so it should be able to be used on all platforms.
In regard to your question about MA (and any other similar project): I think Psyco has the potential to be the next-generation Python VM, which will have much higher performance, and therefore greatly reduce the need to write extensions for the sake of performance. I suspect that it could do its best with large, multi-dimensional arrays of numbers if there were a Python-native object of such a type. Psyco, however, is not ready for general use on all platforms, so for the foreseeable future there is a need for other ways to get decent performance. My suggestion follows: > It could have been written a lot more simply if performance didn't dictate > trying to leverage off Numeric. In straight Python one can imagine an add, > for example, that was roughly: > for k in range(len(a.data)): > result.mask[k] = a.mask[k] or b.mask[k] > result.data[k] = a.data[k] if result.mask[k] else a.data[k] + > b.data[k] This looks like it could be written in Pyrex. If Pyrex were suitably NumArray aware, then it could work great. What this boils down to, in both the Pyrex and Psyco options, is that having a multi-dimensional homogeneous numeric data type that is "native" Python is a great idea! With Pyrex and/or Psyco, Numeric3 (NumArray2?) could be implemented by having only the smallest core in C, and then the rest in Python (or Pyrex). While the Psyco option is the rosy future of Python, Pyrex is here now, and maybe adapting it to handle NumArrays well would be easier than re-writing a bunch of NumArray in C. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tchur at optushome.com.au Fri Feb 7 15:08:01 2003 From: tchur at optushome.com.au (Tim Churches) Date: Fri Feb 7 15:08:01 2003 Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E442E5A.89334754@noaa.gov> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> Message-ID: <1044659254.1290.128.camel@emilio> On Sat, 2003-02-08 at 09:08, Chris Barker wrote: > While the Psyco option is the rosy future of Python, Pyrex is here now, > and maybe adapting it to handle NumArrays well would be easier than > re-writing a bunch of NumArray in C. Well, Psyco is already immediately useful for many problems on Intel platforms, but I take your point that its real future is as the next-generation VM for Python. However, I agree 100% about the potential for leveraging Pyrex in Numarray. Not just in Numarray, but around it, too. The Numarray team should open serious talks with Greg Ewing about Numarray-enabling Pyrex. And New Zealand is a very nice place to visit (seriously, not joking, even though I am an Australian [reference to the trans-Tasman rivalry between Australia and New Zealand there]). Tim C From tim.hochberg at ieee.org Fri Feb 7 15:09:04 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Fri Feb 7 15:09:04 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E442E5A.89334754@noaa.gov> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> Message-ID: <3E443C83.7000209@ieee.org> Chris Barker wrote: >oops, sorry about the blank message. > >Paul F Dubois wrote: > > >>{ CC to GvR just to show why I'm +1 on the if-PEP. I liked this in another >> >> > >What the heck is the if-PEP? > > PEP 308. It's stirring up a bit of a ruckus on CLP as we speak. >>Perhaps knowledgeable persons could comment on the feasibility of coding MA >>(masked arrays) in straight Python and then using Psyco on it? >> >> > >Is there confusion between Psyco and Pyrex? Psyco runs regular old >Python bytecode, and individually compiles little pieces of it as needed >into machine code. As I understand it, this should make loops where the >inner part is a pretty simple operation very fast.
> >However, Psyco is pretty new, and I have no idea how robust and stable >it is, but it is certainly not cross-platform. As it generates machine code, it needs >to be carefully ported to each hardware platform, and it currently only >works on x86. > > Psyco seems fairly stable these days. However it's one of those things that probably needs to get a larger cabal of users to shake the bugs out of it. I still only use it to play around with, because all the things that I need speed from I end up doing in Numeric anyway. >Pyrex, on the other hand, is a "Python-like" language that is translated >into C, and then the C is compiled. The C it generates is pretty darn platform >independent, so it should be able to be used on all platforms. > > >In regard to your question about MA (and any other similar project): I >think Psyco has the potential to be the next-generation Python VM, which >will have much higher performance, and therefore greatly reduce the need >to write extensions for the sake of performance. I suspect that it could >do its best with large, multi-dimensional arrays of numbers if there is >a Python native object of such a type. Psyco, however, is not ready for >general use on all platforms, so in the foreseeable future, there is a >need for other ways to get decent performance. My suggestion follows: > > > >>It could have been written a lot more simply if performance didn't dictate >>trying to leverage off Numeric. In straight Python one can imagine an add, >>for example, that was roughly: >> for k in range(len(a.data)): >> result.mask[k] = a.mask[k] or b.mask[k] >> result.data[k] = a.data[k] if result.mask[k] else a.data[k] + >>b.data[k] >> >> > >This looks like it could be written in Pyrex. If Pyrex were suitably >NumArray aware, then it could work great. > >What this boils down to, in both the Pyrex and Psyco options, is that >having a multi-dimensional homogeneous numeric data type that is "native" >Python is a great idea! With Pyrex and/or Psyco, Numeric3 (NumArray2?)
>could be implemented by having only the smallest core in >C, and then the rest in Python (or Pyrex) > > For Psyco at least you don't need a multidimensional type. You can get good results with a flat array, in particular array.array. The numbers I posted earlier showed comparable performance for Numeric and a multidimensional array type written all in Python and psycoized. And since I suspect that I'm the mysterious person whose name Paul couldn't remember, let me say I suspect the MA would be faster in psycoized Python than what you're doing now, as long as a.data was an instance of array.array. However, there are at least three problems. Psyco doesn't fully support the floating point type ('f') right now (although it does support most of the various integral types in addition to 'd'). I assume that these masked arrays are multidimensional, so someone would have to build the basic multidimensional machinery around array.array to make them work. I have a good start on this, but I'm not sure when I'm going to have time to work on this more. The biggy, though, is that Psyco only works on x86 machines. What we really need to do is to clone Armin. >While the Psyco option is the rosy future of Python, Pyrex is here now, >and maybe adapting it to handle NumArrays well would be easier than >re-writing a bunch of NumArray in C. > > This sounds like you're conflating two different issues. The first issue is that Numarray is relatively slow for small arrays. Pyrex may indeed be an easier way to attack this, although I wouldn't know; I've only looked at it, not tried to use it. However, I think that this is something that can and should wait. Once use cases of numarray being _too_ slow for small arrays start piling up, then it will be time to attack the overhead. Premature optimization is the root of all evil and all that. The second issue is how to deal with code that does not vectorize well.
However, isn't this what scipy.weave already does? Again, I haven't used weave, but as I understand it, it's another Python-C bridge, but one that's more geared toward numerics stuff. -tim From list at jsaul.de Fri Feb 7 15:58:09 2003 From: list at jsaul.de (Joachim Saul) Date: Fri Feb 7 15:58:09 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <1044659254.1290.128.camel@emilio> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <1044659254.1290.128.camel@emilio> Message-ID: <20030207235736.GG842@jsaul.de> * Tim Churches [2003-02-08 00:07]: > However, I agree 100% about the potential for leveraging Pyrex in > Numarray. Not just in Numarray, but around it, too. The Numarray team > should open serious talks with Greg Ewing about Numarray-enabling Pyrex. What is it that needs to be "enabled"? Pyrex handles Numeric (see the Pyrex FAQ), so why should it not handle Numarray?
AFAIK Pyrex > contains no code to specifically support Numeric, and it should > therefore be straightforward to use it with Numarray as well. Hmmm, maybe re-implementing MA in Pyrex is possible right now. Double hmmm.... Tim C From paul at pfdubois.com Fri Feb 7 16:29:02 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 7 16:29:02 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E443C83.7000209@ieee.org> Message-ID: <000801c2cf09$0cfabf10$6601a8c0@NICKLEBY> Just to confirm the obvious, I don't know the difference between Psyco and Pyrex and if I ever did it is Friday night and I've lost it. Any two words that share two letters look the same to me right now.
From Chris.Barker at noaa.gov Fri Feb 7 17:10:15 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Fri Feb 7 17:10:15 2003 Subject: [Numpy-discussion] Psyco MA? References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <3E443C83.7000209@ieee.org> Message-ID: <3E445136.4791C978@noaa.gov> Tim Hochberg wrote: > Psyco seems fairly stable these days. However it's one of those things > that probably needs to get a larger cabal of users to shake the bugs out > of it. I still only use it to play around with because all things that I > need speed from I end up doing in Numeric anyway. Hmmm. It always just seemed too bleeding edge for me to want to drop it in, in place of my current Python, but maybe I should try... > For Psyco at least you don't need a multidimensional type. You can get > good results with a flat array, in particular array.array. The numbers I > posted earlier showed comparable performance for Numeric and a > multidimensional array type written all in Python and psycoized. What about non-contiguous arrays? Also, you pointed out yourself that you are still looking at a factor of two slowdown; it would be nice to get rid of that.
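[The "basic multidimensional machinery around array.array" that Tim mentions could be sketched as a small shape/strides wrapper over a flat buffer. This is a hypothetical illustration, not code from the thread; class and attribute names are invented:]

```python
from array import array

class Flat2D:
    """Minimal 2-D view over a flat array.array using row-major strides."""
    def __init__(self, data, shape):
        self.data = data                 # flat array.array buffer
        self.shape = shape               # (nrows, ncols)
        self.strides = (shape[1], 1)     # element strides, row-major

    def __getitem__(self, idx):
        i, j = idx
        return self.data[i * self.strides[0] + j * self.strides[1]]

    def __setitem__(self, idx, value):
        i, j = idx
        self.data[i * self.strides[0] + j * self.strides[1]] = value
```

[Under Psyco, the index arithmetic in such a wrapper is exactly the kind of integer code that specializes well; non-contiguous views would need independent strides rather than the row-major default assumed here.]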
> >While the Psyco option is the rosy future of Python, Pyrex is here now, > >and maybe adapting it to handle NumArrays well would be easier than > >re-writing a bunch of NumArray in C. > > > This sounds like you're conflating two different issues. The first issue > is that Numarray is relatively slow for small arrays. Pyrex may indeed > be an easier way to attack this although I wouldn't know, I've only > looked at it not tried to use it. However, I think that this is > something that can and should wait. Once use cases of numarray being > _too_ slow for small arrays start piling up, then it will be time to > attack the overhead. Premature optimization is the root of all evil and > all that. Quite true. I know I have a lot of use cases where I use a LOT of small arrays. That doesn't mean that performance is a huge problem; we'll see. I'm talking about other things as well, however. There are a lot of functions in the current Numeric that are written in a combination of Python and C. Mostly they are written using the lower-level Numeric functions. This includes concatenate, chop, etc., etc. While speeding up any individual one of those won't make much difference, speeding them all up might. If it were much easier to get C-speed functions like this, we'd have a higher-performance package all around. I've personally re-written byteswap() and chop(). In this case, not to get them faster, but to get them to use less memory. It would be great if we could do them all. > The second issue is how to deal with code that does not vectorize well. > Here Pyrex again might help if it were made Numarray aware. However, > isn't this what scipy.weave already does? Again, I haven't used weave, > but as I understand it, it's another Python-C bridge, but one that's > more geared toward numerics stuff. Weave is another project that's on my list to check out, so I can't say why one would choose one over the other. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From list at jsaul.de Sat Feb 8 02:56:01 2003 From: list at jsaul.de (Joachim Saul) Date: Sat Feb 8 02:56:01 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <1044663911.1266.180.camel@emilio> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <1044659254.1290.128.camel@emilio> <20030207235736.GG842@jsaul.de> <1044663911.1266.180.camel@emilio> Message-ID: <20030208105418.GA842@jsaul.de> * Tim Churches [2003-02-08 01:25]: > On Sat, 2003-02-08 at 10:57, Joachim Saul wrote: > > * Tim Churches [2003-02-08 00:07]: > > > However, I agree 100% about the potential for leveraging Pyrex in > > > Numarray. Not just in Numarray, but around it, too. The Numarray team > > > should open serious talks with Greg Ewing about Numarray-enabling Pyrex. > > > > What is it that needs to be "enabled"? Pyrex handles Numeric (see > > Pyrex FAQ), why should it not handle Numarray? AFAIK Pyrex > > contains no code to specifically support Numeric, and it should > > therefore be straightforward to use it with Numarray as well. > > Hmmm, maybe re-implementing MA in Pyrex is possible right now. Double > hmmm.... Please check out the Pyrex doc. It's actually very easy right now, *if* you can live without "sequence operators" such as slicing, list comprehensions... but this is going to be supported, again according to the doc. 
Here is an excerpt from an extension module that I have built using Pyrex and Numeric, following the instructions in the Pyrex FAQ:

cdef extern int decomp(int, double*, double*, double, double, double)

cdef extern from "Numeric/arrayobject.h":
    struct PyArray_Descr:
        int type_num, elsize
        char type
    ctypedef class PyArrayObject [type PyArray_Type]:
        cdef char *data
        cdef int nd
        cdef int *dimensions, *strides
        cdef PyArray_Descr *descr
    object PyArray_FromDims(int, int*, int)
    void import_array()

def _decomp(PyArrayObject z_arr, PyArrayObject r_arr, double p, double vs, double sigma):
    cdef double *z, *r
    cdef int n
    n = z_arr.dimensions[0]
    z, r = z_arr.data, r_arr.data
    decomp(n, z, r, p, vs, sigma)

This is rather crude code that doesn't check the type of the arrays nor their dimension, but it does what I want right now, and if I find the time I'll certainly make it more general. Those checks are actually performed in yet another Python layer. As you can see, the above looks like "strongly typed" Python. From a C programmer's perspective, I find this extremely cool. If one leaves the type out, then the argument can be any Python object. What I like about Pyrex is that you can mix Python and C calls at your convenience. For example, I may call (C-like) arr = PyArray_FromDims(1, &n, PyArray_DOUBLE) but could also have used a corresponding Python construct like from Numeric import zeros arr = zeros(n, 'd') I expect the latter to be slower (not tested), but one can take Python code "as is" and "compile" it using Pyrex. This already increases performance, and one can then conveniently replace as much Python code as needed with the corresponding C functions, which (presumably) will again speed up the code significantly. The bottlenecks are finally moved to external C files and treated like a C library.
Cheers, Joachim From perry at stsci.edu Sat Feb 8 13:52:02 2003 From: perry at stsci.edu (Perry Greenfield) Date: Sat Feb 8 13:52:02 2003 Subject: [Numpy-discussion] Some observations or questions about psyco & pyrex In-Reply-To: <20030208105418.GA842@jsaul.de> Message-ID: Both psyco and pyrex have some great aspects. But I think it is worth a little reflection on what can and can't be expected of them. I'm basically ignorant of both; I know a little about them, but haven't used them. If anything I say is wrong, please correct me. I'm going to make some comments based on inferred characteristics of them that could well be wrong. Psyco is very cool and seems the answer to many dreams. But consider the cost. From what I can infer, it obtains its performance enhancements at least in part by constructing machine code on the fly from the Python code. In other words, it is performing aspects of running on particular processors that are usually relegated to C compilers by Python. I'd guess that the price is the far greater difficulty of maintaining such capability across many processor types. It also likely increases the complexity of the implementation of Python, perhaps making it much harder to change and enhance. Even without it handling things that are needed for array processing, how likely is it that it will be accepted as the standard implementation of Python for these reasons alone? I am also inclined to believe that adding support for array processing to a psyco implementation is a significant undertaking. There are at least two issues that would have to be addressed: handling all the numeric types, and exception handling behavior. Then there are aspects important to us that include handling byteswapped or non-aligned data. While having the Python VM handle the efficiency aspects of arrays would simplify aspects of their implementation as compared to the current implementations of Numeric and numarray, it wouldn't eliminate the need to replicate much of it.
Having to deal with the implementation for several different processors is likely to outweigh any savings in the implementation. But maybe I misjudge. Pyrex's goals are more realistic, I believe. But unless I'm mistaken, Pyrex cannot be a solution to the problems that Numeric and numarray solve. Writing something for Pyrex means committing to certain types. It's great for writing something that you would have written as a C extension, but it doesn't really solve the problem of implementing Ufuncs that can handle many different types of arrays, and particularly combinations of types. But perhaps I misunderstand here as well. It certainly would be nice if it could handle some of the aspects of the Numeric/numarray API automatically. So I doubt that either really is a solution for masked arrays in general. Perry From jae at zhar.net Sat Feb 8 15:11:01 2003 From: jae at zhar.net (John Eikenberry) Date: Sat Feb 8 15:11:01 2003 Subject: [Numpy-discussion] Some observations or questions about psyco & pyrex In-Reply-To: References: <20030208105418.GA842@jsaul.de> Message-ID: <20030208230912.GA3764@kosh.zhar.net> Perry Greenfield wrote:

> Both psyco and pyrex have some great aspects. But I think
> it is worth a little reflection on what can and can't be
> expected of them. I'm basically ignorant of both; I know
> a little about them, but haven't used them. If anything I
> say is wrong, please correct me. I'm going to make some
> comments based on inferred characteristics of them that
> could well be wrong.

I'd like to suggest to anyone interested in these ideas that they take a look at the pypy/minimal-python mailing list: http://codespeak.net/mailman/listinfo/pypy-dev

> Psyco is very cool and seems the answer to many dreams.
> But consider the cost. From what I can infer, it obtains
> its performance enhancements at least in part by constructing
> machine code on the fly from the Python code.
> In other words, it is performing aspects of running on particular
> processors that are usually relegated to C compilers by
> Python.
>
> I'd guess that the price is the far greater difficulty of
> maintaining such capability across many processor types.
> It also likely increases the complexity of the implementation
> of Python, perhaps making it much harder to change and
> enhance. Even without it handling things that are needed
> for array processing, how likely is it that it will be
> accepted as the standard implementation for Python for
> these reasons alone?

The hope is that quite the opposite of just about every one of these points will be true: that once Python is reimplemented in Python, with Psyco as a backend JIT-like compiler, it will decrease the complexity of the implementation, making it much easier to change and enhance. I tend to be quite optimistic about the potential for PyPy and Psyco. I think the added work of the platform-dependent Psyco modules will be offset by the rest of the system being written in Python. -- John Eikenberry [jae at zhar.net - http://zhar.net] ______________________________________________________________ "Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away." -- Antoine de Saint-Exupery From tim.hochberg at ieee.org Mon Feb 10 08:53:04 2003 From: tim.hochberg at ieee.org (Tim Hochberg) Date: Mon Feb 10 08:53:04 2003 Subject: [Numpy-discussion] Psyco MA? In-Reply-To: <3E445136.4791C978@noaa.gov> References: <000101c2cef2$7ad25f50$6601a8c0@NICKLEBY> <3E442E5A.89334754@noaa.gov> <3E443C83.7000209@ieee.org> <3E445136.4791C978@noaa.gov> Message-ID: <3E47D8AC.4070108@ieee.org> Chris Barker wrote:

>Tim Hochberg wrote:
>
>>Psyco seems fairly stable these days. However it's one of those things
>>that probably needs to get a larger cabal of users to shake the bugs out
>>of it.
>> I still only use it to play around with because all things that I
>> need speed from I end up doing in Numeric anyway.
>
> Hmmm. It always just seemed too bleeding edge for me to want to drop it
> in in place of my current Python, but maybe I should try...

I think Psyco was a reworked interpreter at some point, but it isn't any longer. Now it's just an extension module. You typically use it like this:

    def some_function_that_needs_to_be_fast(...):
        ....

    psyco.bind(some_function_that_needs_to_be_fast)

Of course, it's still possible to bomb the interpreter with Psyco, and it's a huge memory hog if you bind a lot of functions. On the other hand, in the course of playing with psymeric I found one way to crash the interpreter with Psyco, one way with Numeric, and one way to cause Numarray to fail, although this did not crash the interpreter. So if I was keeping a tally of evil bugs, they'd all be tied right now....

>> For Psyco at least you don't need a multidimensional type. You can get
>> good results with a flat array, in particular array.array. The numbers I
>> posted earlier showed comparable performance for Numeric and a
>> multidimensional array type written all in Python and psycoized.
>
> What about non-contiguous arrays? Also, you pointed out yourself that
> you are still looking at a factor of two slowdown; it would be nice to
> get rid of that.

Non-contiguous arrays are easy to build on top of contiguous arrays; psymeric works with noncontiguous arrays now. If you'd like, I can send you some code. The factor of two slowdown is an issue. A bigger issue is that only x86 platforms are supported. Also, there is no support for things like byteswapped and nonaligned arrays. There also might be problems getting the exception handling right. If this approach were to be done "right" for heavy-duty number-cruncher types, it would require a more capable, C-based core buffer object, with most other things written in Python and psycoized.
This begins to sound a lot like what you would get if you put a lot of psyco.bind calls into the Python parts of Numarray now. On the other hand, it's possible some interesting stuff will come out of the PyPy project that will make this thing possible in pure Python. I'm watching that project with interest. I did some more tuning of the Psymeric code to reduce overhead, and this is what the speed situation is now. This is complicated to compare, since the relative speeds depend on both the array type and shape, but one can get a general feel for things by looking at two things: the overhead, that is, the time it takes to operate on very small arrays, and the asymptotic time/element for large arrays. These numbers differ substantially for contiguous and noncontiguous arrays, but their relative values are fairly constant across types. That gives four numbers:

             Overhead (c)   Overhead (nc)   TimePerElement (c)   TimePerElement (nc)
NumPy           10 us           10 us             85 ps                95 ps
NumArray       200 us          530 us             45 ps               135 ps
Psymeric        50 us           65 us             80 ps                80 ps

The times shown above are for Float64s and are pretty approximate, and they happen to be a particularly favorable array shape for Psymeric. I have seen Psymeric as much as 50% slower than NumPy for large arrays of certain shapes. The overhead for NumArray is surprisingly large. After doing this experiment I'm certainly more sympathetic to Konrad wanting less overhead for NumArray before he adopts it. -tim From magnus at hetland.org Mon Feb 10 12:38:04 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 12:38:04 2003 Subject: [Numpy-discussion] Plain array performance Message-ID: <20030210203736.GB13673@idi.ntnu.no> Just curious: What is the main strength of the array module in the standard library? Is it footprint/memory usage? Is it speed? If so, at what sizes?
I ran some simple benchmarks (creating a list/array, iterating over them to sum up their elements, and extracting the slice foo[::2]) and got the following ratios (array_time/list_time) for various sizes:

Size 100:      Creation: 1.13482142857   Sum: 1.54649265905   Slicing: 1.53736654804
Size 1000:     Creation: 1.62444133147   Sum: 1.18439932835   Slicing: 1.56350184957
Size 10000:    Creation: 1.61642712328   Sum: 1.47768567821   Slicing: 1.45889354599
Size 100000:   Creation: 1.72711084285   Sum: 0.952593142445  Slicing: 1.05782341361
Size 1000000:  Creation: 1.56617139425   Sum: 0.735687066032  Slicing: 0.773219364465
Size 10000000: Creation: 1.57903195174   Sum: 0.727253180418  Slicing: 0.726005428022

These benchmarks are pretty naïve, but it seems to me that unless you're working with quite large arrays, there is no great advantage to using arrays rather than lists... (I'm not including numarray or Numeric in the equation here -- I just raise the issue because of the use of arrays in Psymeric...) Just curious... -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From magnus at hetland.org Mon Feb 10 13:03:05 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 13:03:05 2003 Subject: [Numpy-discussion] Plain array performance In-Reply-To: <3E481165.1080608@ieee.org> References: <20030210203736.GB13673@idi.ntnu.no> <3E481165.1080608@ieee.org> Message-ID: <20030210210156.GA16423@idi.ntnu.no> Tim Hochberg :

> [snip]

In my continued quest, I found this: http://www.penguin.it/pipermail/python/2002-October/001917.html It sums up (in Italian, though) the great memory advantage of arrays. (Might be a good idea to be explicit about this in the docs, perhaps... Hm.)

> The reason I'm using arrays in psymeric is twofold. One is memory
> usage.

Right.

> The other reason is that Psyco likes arrays
> (http://arigo.tunes.org/psyco-preview/psycoguide/node26.html).

I sort of thought that might be a reason...
:)

> In fact it was this note " The speed of a complex algorithm using an
> array as buffer (like manipulating an image pixel-by-pixel) should
> be very high; closer to C than plain Python." that led me to start
> playing around with psymeric.

I see.

> Just for grins I disabled psyco and reran some tests on psymeric.
> Instead of comparable speed to NumPy, the speed drops to about 25x
> slower.

Yikes!

> I actually would have expected it to be worse, but the drop-off is
> still pretty steep.

Indeed... Hm... If only we could have Psyco for non-x86 platforms... Oh, well. I guess we will, some day. :)

> -tim

-- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From paul at pfdubois.com Mon Feb 10 13:40:09 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Mon Feb 10 13:40:09 2003 Subject: [Numpy-discussion] Plain array performance In-Reply-To: <20030210203736.GB13673@idi.ntnu.no> Message-ID: <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY> The problem with naïve benchmarks is that they *are* naïve. In real applications you have a lot of arrays running around, and so a full cache shows up with smaller array sizes. Because of this, measuring performance is a really difficult matter. From magnus at hetland.org Mon Feb 10 13:43:07 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 13:43:07 2003 Subject: [Numpy-discussion] Plain array performance In-Reply-To: <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY> References: <20030210203736.GB13673@idi.ntnu.no> <000a01c2d14c$e15ea0b0$6601a8c0@NICKLEBY> Message-ID: <20030210214214.GA20750@idi.ntnu.no> Paul F Dubois :

> The problem with naïve benchmarks is that they *are* naïve.

Indeed. My request was for a more dependable analysis.

> In real applications you have a lot of arrays running around, and so
> a full cache shows up with smaller array sizes. Because of this,
> measuring performance is a really difficult matter.

Indeed.
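One thing timing ratios alone don't show is footprint; the memory side of the array-vs-list question can be sketched roughly like this (the sizes computed here are estimates and vary across Python versions and platforms):

```python
import array
import sys

n = 100000
arr = array.array("d", [0.0] * n)
lst = [0.0] * n

# array.array packs raw 8-byte C doubles into a single buffer.
arr_bytes = arr.buffer_info()[1] * arr.itemsize

# A list stores one pointer per slot, each referring to a boxed float
# object; this is a rough estimate assuming distinct float objects.
lst_bytes = sys.getsizeof(lst) + n * sys.getsizeof(0.0)

print(arr_bytes, lst_bytes)
```

On a typical build this shows the list costing several times more memory than the packed array, which is the advantage the Italian post above is getting at.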
I guess what I'm curious about is the motivation behind the array module... It seems to be mainly conserving memory -- or? -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From magnus at hetland.org Mon Feb 10 13:46:03 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 13:46:03 2003 Subject: [Numpy-discussion] array()? (Bug?) Message-ID: <20030210214544.GA21118@idi.ntnu.no> Is this a bug, or is there a motivation behind it?

>>> from numarray import array
>>> array()
>>>

IOW: Why is array callable without any arguments when it doesn't return anything? E.g. if I call array(**kwds) with some dictionary, I'd expect an exception (since a default array isn't really possible) if kwds were empty... Or? (I'm using 0.4 -- for some reason I can't get the cvs version to compile on Solaris.) -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From jmiller at stsci.edu Mon Feb 10 14:30:09 2003 From: jmiller at stsci.edu (Todd Miller) Date: Mon Feb 10 14:30:09 2003 Subject: [Numpy-discussion] array()? (Bug?) References: <20030210214544.GA21118@idi.ntnu.no> Message-ID: <3E48277E.3030401@stsci.edu> Magnus Lie Hetland wrote:

>Is this a bug, or is there a motivation behind it?
>
>>>>from numarray import array
>>>>array()
>>>>
>
>IOW: Why is array callable without any arguments when it doesn't
>return anything? E.g. if I call array(**kwds) with some dictionary,
>I'd expect an exception (since a default array isn't really possible)
>if kwds were empty... Or?
>
>(I'm using 0.4 -- for some reason I can't get the cvs version to
>compile on Solaris.)

It looks like a bug which resulted from Numeric compatibility additions. For backwards compatibility with Numeric, I added the "sequence" keyword as a synonym for the numarray "buffer" keyword. We're in the process of getting rid of (deprecating) "buffer".
When it's gone (a couple releases), we can remove the default parameter to sequence and the bug. Todd From magnus at hetland.org Mon Feb 10 15:35:03 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 15:35:03 2003 Subject: [Numpy-discussion] array()? (Bug?) In-Reply-To: <3E48277E.3030401@stsci.edu> References: <20030210214544.GA21118@idi.ntnu.no> <3E48277E.3030401@stsci.edu> Message-ID: <20030210233422.GA321@idi.ntnu.no> Todd Miller : > [snip] > It looks like a bug which resulted from Numeric compatability additions. > For backwards compatability with Numeric, I added the "sequence" > keyword as a synonym for the numarray "buffer" keyword. We're in the > process of getting rid of (deprecating) "buffer". When it's gone (a > couple releases), we can remove the default parameter to sequence and > the bug. OK -- but even until then, wouldn't it be possible to add a simple check for whether any arguments have been supplied? (Not a big priority, I guess :) > Todd -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From magnus at hetland.org Mon Feb 10 19:38:05 2003 From: magnus at hetland.org (Magnus Lie Hetland) Date: Mon Feb 10 19:38:05 2003 Subject: [Numpy-discussion] average(), again(?) Message-ID: <20030211033702.GA17429@idi.ntnu.no> I think perhaps I've asked this before -- but is there any reason why the average() function from MA can't be copied (without the mask stuff) to numarray? Maybe it's too trivial (unlike in the masked case)...? It just seems like a generally useful function to have... -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones From falted at openlc.org Mon Feb 10 23:24:01 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Feb 10 23:24:01 2003 Subject: [Numpy-discussion] Psyco MA? Message-ID: <200302102054.33064.falted@openlc.org> A Dissabte 08 Febrer 2003 11:54, Joachim Saul va escriure: > Please check out the Pyrex doc. 
> It's actually very easy right now,
> *if* you can live without "sequence operators" such as slicing,
> list comprehensions... but this is going to be supported, again
> according to the doc.

Why are you saying that slicing is not supported? I've checked it (as Python expressions, of course) and it works well. Maybe you are referring to cdef'd C-typed arrays in Pyrex? I think this could be a dangerous thing that could make the pointer arithmetic slow down because of the additional checks required on the slice range.

> For example, I may call (C-like)
>
> arr = PyArray_FromDims(1, &n, PyArray_DOUBLE)
>
> but could have also used a corresponding Python construct like
>
> from Numeric import zeros
> arr = zeros(n, 'd')
>
> I expect the latter to be slower (not tested), but one can take
> Python code "as is" and "compile" it using Pyrex.

I was curious about that and tested it on my Pentium 4 @ 2 GHz laptop for small n (just to look for overhead). The C-like call takes 26 us and the Python-like one takes 52 us. Generally speaking, you can expect an overhead of 20 us (a bit more as you pass more parameters) when calling Python functions (or Python-like functions inside Pyrex) from Pyrex, compared to using the C API to call the corresponding C function. In fact, calling a C function (or a cdef Pyrex function) from Pyrex takes no more time than calling from C to C: on my laptop both score at 0.5 us. The fact that calling C functions from Pyrex does not have a significant overhead (compared with calls from C to C), plus the fact that Pyrex offers a C integer loop, makes Pyrex very appealing for linear algebra optimizations, not only as a "glue" language. Another advantage is that with Pyrex you can define classes with a mix of C-type and Python-type attributes. This can be very handy for obtaining a compact representation of objects (whenever you do not need to access the C-typed ones from Python; but anyway, you can always use accessors if needed).
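The Python-side call cost is easy to estimate from plain Python with the timeit module (a sketch; absolute figures depend entirely on the machine and interpreter, so don't compare them too literally with the microsecond numbers above):

```python
import timeit

def noop():
    pass

# Time a Python-level call against an empty statement; the difference
# approximates the interpreter's per-call overhead.
n = 200000
t_call = timeit.timeit("noop()", globals={"noop": noop}, number=n) / n
t_pass = timeit.timeit("pass", number=n) / n
overhead = t_call - t_pass
print("per-call overhead: %.1f ns" % (overhead * 1e9))
```

The same approach works for timing the C-like versus Python-like array constructions, once the relevant modules are importable.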
Cheers, -- Francesc Alted From falted at openlc.org Mon Feb 10 23:24:02 2003 From: falted at openlc.org (Francesc Alted) Date: Mon Feb 10 23:24:02 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E43FBFA.4B0C0FA1@noaa.gov> References: <200302071843.09511.falted@openlc.org> <3E43FBFA.4B0C0FA1@noaa.gov> Message-ID: <200302102040.30486.falted@openlc.org> A Divendres 07 Febrer 2003 19:33, Chris Barker va escriure:

> Is Pyrex aware of Numeric Arrays?

Joachim Saul already answered that: it is. More exactly, Pyrex is not aware of any special object outside the Python standard types, but with a bit of cleverness and patience, you can map any object you want to Pyrex. The Numeric array object map just happens to be documented in the FAQ, but I managed to access numarray objects as well. Here is the recipe.

First, define some enum types and headers:

# Structs and functions from numarray
cdef extern from "numarray/numarray.h":

    ctypedef enum NumRequirements:
        NUM_CONTIGUOUS
        NUM_NOTSWAPPED
        NUM_ALIGNED
        NUM_WRITABLE
        NUM_C_ARRAY
        NUM_UNCONVERTED

    ctypedef enum NumarrayByteOrder:
        NUM_LITTLE_ENDIAN
        NUM_BIG_ENDIAN

    cdef enum:
        UNCONVERTED
        C_ARRAY

    ctypedef enum NumarrayType:
        tAny
        tBool
        tInt8
        tUInt8
        tInt16
        tUInt16
        tInt32
        tUInt32
        tInt64
        tUInt64
        tFloat32
        tFloat64
        tComplex32
        tComplex64
        tObject
        tDefault
        tLong

    # Declaration for the PyArrayObject
    struct PyArray_Descr:
        int type_num, elsize
        char type

    ctypedef class PyArrayObject [type PyArray_Type]:
        # Compatibility with Numeric
        cdef char *data
        cdef int nd
        cdef int *dimensions, *strides
        cdef object base
        cdef PyArray_Descr *descr
        cdef int flags
        # New attributes for numarray objects
        cdef object _data       # object must meet buffer API
        cdef object _shadows    # ill-behaved original array
        cdef int nstrides       # elements in strides array
        cdef long byteoffset    # offset into buffer where array data begins
        cdef long bytestride    # basic separation of elements in bytes
        cdef long itemsize      # length of 1 element in bytes
        cdef char byteorder     # NUM_BIG_ENDIAN, NUM_LITTLE_ENDIAN
        cdef char _aligned      # test override flag
        cdef char _contiguous   # test override flag

    void import_array()

# The Numeric API requires this function to be called before
# using any Numeric facilities in an extension module.
import_array()

Then, declare the API routines you want to use:

cdef extern from "numarray/libnumarray.h":
    PyArrayObject NA_InputArray (object, NumarrayType, int)
    PyArrayObject NA_OutputArray (object, NumarrayType, int)
    PyArrayObject NA_IoArray (object, NumarrayType, int)
    PyArrayObject NA_Empty(int nd, int *d, NumarrayType type)
    object PyArray_FromDims(int nd, int *d, NumarrayType type)

Now define a couple of maps between the C enum types and the Python numarray type classes:

# Conversion tables from/to classes to the numarray enum types
toenum = {numarray.Int8: tInt8, numarray.UInt8: tUInt8,
          numarray.Int16: tInt16, numarray.UInt16: tUInt16,
          numarray.Int32: tInt32, numarray.UInt32: tUInt32,
          numarray.Float32: tFloat32, numarray.Float64: tFloat64}
toclass = {}
for (key, value) in toenum.items():
    toclass[value] = key

OK, you are on the way. We can finally define our user function; for example, I will show here a function to multiply a matrix by a vector (C double precision):

def multMatVec(object a, object b, object c):
    cdef PyArrayObject carra, carrb, carrc
    cdef double *da, *db, *dc
    cdef int i, j

    carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
    carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
    carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
    da = carra.data
    db = carrb.data
    dc = carrc.data
    dim1 = carra.dimensions[0]
    dim2 = carra.dimensions[1]
    for i from 0 <= i < dim1:
        dc[i] = 0.
        for j from 0 <= j < dim2:
            dc[i] = dc[i] + da[i*dim2+j] * db[j]

    return carrc

where NA_InputArray is a high-level numarray API call that ensures that the object retrieved is a well-behaved array, and not misaligned, discontiguous or whatever. Maybe at first glance such a procedure would seem obscure, but it is not; I find it to be quite elegant. Look at the "for i from 0 <= i < dim1:" construction. We could have used the more Pythonic form "for i in range(dim1):", but by using the former, the Pyrex compiler is able to produce a loop in plain C, achieving C speed on this piece of code. Of course, you must be careful not to introduce Python objects inside the loop, or all the potential speed-up will vanish. But, with a bit of practice, this is easy to avoid. For me Pyrex is like having Python but with the speed of C. This is why I'm so enthusiastic about it.

> I imagine it could use them just fine, using the generic Python sequence
> get item stuff, but that would be a whole lot lower performance than if
> it understood the Numeric API and could access the data array directly.
> Also, how does it deal with multiple dimension indexing ( array[3,6,2] )
> which the standard python sequence types do not support?

In general, you can access sequence objects like in Python (and I've just checked that extended slicing *is* supported; I don't know why Joachim said it wasn't; perhaps he meant Pyrex C arrays?), but at Python speed. So, if you need speed, always use pointers to your data and a bit of pointer arithmetic to access the element you want (look at the example). Of course, you can also define C arrays if you know the boundaries at compilation time and let the compiler do the computations to access your desired element, but you will first need to copy the data from your buffers to the C array, and perhaps this is a bit inconvenient in some situations.

> As I think about this, I think your suggestion is fabulous.
> Pyrex (or a Pyrex-like) language would be a fabulous way to write code
> for NumArray, if it really made use of the NumArray API.

There can be drawbacks, like the one stated by Perry related to how to construct general Ufuncs that can handle many different combinations of arrays and types, although I don't understand that very well, because the Numeric and numarray crews already managed to do that in C, so why couldn't it be possible with Pyrex? Mmm, perhaps there is some pre-processor involved? Cheers, -- Francesc Alted From Chris.Barker at noaa.gov Tue Feb 11 10:36:01 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Feb 11 10:36:01 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison References: <200302071843.09511.falted@openlc.org> <3E43FBFA.4B0C0FA1@noaa.gov> <200302102040.30486.falted@openlc.org> Message-ID: <3E4938E1.73B7831E@noaa.gov> Francesc Alted wrote:

> First, define some enum types and headers:

Could all this be put into Pyrex? (when NumArray becomes more stable, anyway) It's well beyond me to understand it.

> I will show here a function to multiply a matrix by a vector (C double
> precision):
>
> def multMatVec(object a, object b, object c):
>     cdef PyArrayObject carra, carrb, carrc
>     cdef double *da, *db, *dc
>     cdef int i, j
>
>     carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
>     carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
>     carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
>     da = carra.data
>     db = carrb.data
>     dc = carrc.data
>     dim1 = carra.dimensions[0]
>     dim2 = carra.dimensions[1]
>     for i from 0 <= i < dim1:
>         dc[i] = 0.
>         for j from 0 <= j < dim2:
>             dc[i] = dc[i] + da[i*dim2+j] * db[j]
>
>     return carrc
>
> For me Pyrex is like having Python but with the speed of C. This is why
> I'm so enthusiastic about it.

That actually looks more like C than Python to me. As soon as I am doing pointer arithmetic, I don't feel like I'm writing Python. Would it be all that much more code in C?

> speed. So, if you need speed, always use pointers to your data and use a bit
> of pointer arithmetic to access the element you want (look at the example).

Is there really no way to get this to work?

> Of course, you can also define C arrays if you know the boundaries at
> compilation time and let the compiler do the computations to access your
> desired element, but you will first need to copy the data from your buffers
> to the C array, and perhaps this is a bit inconvenient in some situations.

Couldn't you access the data array of the NumArray directly? I do this all the time with Numeric.

> Why are you saying that slicing is not supported? I've checked it (as
> Python expressions, of course) and it works well. Maybe you are referring to
> cdef'd C-typed arrays in Pyrex? I think this could be a dangerous thing
> that can make the pointer arithmetic slow down because of the additional
> checks required on the slice range.

Well, there would need to be two value checks per slice. That would be significant for small slices, but not for large ones; I'd love to have it. It just doesn't feel like Python without slicing, and it doesn't feel like NumPy without multi-dimensional slicing.

> There can be drawbacks, like the one stated by Perry related to how to
> construct general Ufuncs that can handle many different combinations of
> arrays and types, although I don't understand that very well, because the
> Numeric and numarray crews already managed to do that in C, so why couldn't
> it be possible with Pyrex? Mmm, perhaps there is some pre-processor involved?

I was curious about this comment as well. I have only had success writing my Numeric-based extensions for pre-determined types. If I had to support additional types (and/or discontiguous and/or rank-N arrays), I ended up with a whole pile of case and/or if statements. Also kind of slow and inefficient code. It seems the only way to do this right is with C++ and templates (e.g.
Blitz++), but there are good reasons not to go that route. Would it really be any harder to use Pyrex than C for this kind of thing? Also, would it be possible to take a Pyrex-type approach and have it do something template-like: you write the generic code in Pyrex, it generates all the type-specific C code for you. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From falted at openlc.org Tue Feb 11 11:34:04 2003 From: falted at openlc.org (Francesc Alted) Date: Tue Feb 11 11:34:04 2003 Subject: [Psyco-devel] RE: [Numpy-discussion] Interesting Psyco/Numeric/Numarray comparison In-Reply-To: <3E4938E1.73B7831E@noaa.gov> References: <200302102040.30486.falted@openlc.org> <3E4938E1.73B7831E@noaa.gov> Message-ID: <200302112033.30231.falted@openlc.org> A Dimarts 11 Febrer 2003 18:54, Chris Barker va escriure:

> > def multMatVec(object a, object b, object c):
> >     cdef PyArrayObject carra, carrb, carrc
> >     cdef double *da, *db, *dc
> >     cdef int i, j
> >
> >     carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
> >     carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
> >     carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
> >     da = carra.data
> >     db = carrb.data
> >     dc = carrc.data
> >     dim1 = carra.dimensions[0]
> >     dim2 = carra.dimensions[1]
> >     for i from 0 <= i < dim1:
> >         dc[i] = 0.
> >         for j from 0 <= j < dim2:
> >             dc[i] = dc[i] + da[i*dim2+j] * db[j]
> >
> >     return carrc
> >
> > For me Pyrex is like having Python but with the speed of C. This is why
> > I'm so enthusiastic about it.
>
> That actually looks more like C than Python to me. As soon as I am doing
> pointer arithmetic, I don't feel like I'm writing Python. Would it be all
> that much more code in C?

Doing that in C implies writing the "glue" code.
In the past example, multMatVec is a function *directly* accessible in Python, without any additional declaration. Moreover, you can do in Pyrex the same things you do in Python, so you could have written the last piece of code as:

def multMatVec(object a, object b, object c):
    for i in range(a.shape[0]):
        c[i] = 0.
        for j in range(a.shape[1]):
            c[i] = c[i] + a[i][j] * b[j]
    return c

but, of course, you get only Python speed. So, the moral is that C speed is only accessible in Pyrex if you use C-like types and constructions; it just doesn't come for free. I just find this way of coding more elegant than using SWIG or other approaches. But I'm most probably biased, because Pyrex is my first (and only) serious tool for doing Python extensions.

> > speed. So, if you need speed, always use pointers to your data and use a
> > bit of pointer arithmetic to access the element you want (look at the
> > example).
>
> Is there really no way to get this to work?
>
> > Of course, you can also define C arrays if you know the boundaries at
> > compilation time and let the compiler do the computations to access your
> > desired element, but you will first need to copy the data from your
> > buffers to the C array, and perhaps this is a bit inconvenient in some
> > situations.
>
> Couldn't you access the data array of the NumArray directly? I do this
> all the time with Numeric.

Yeah, you can; in both examples shown here (in Numeric and numarray) you are accessing the array data buffer directly, with no copies (whenever your original array is well-behaved, of course).

> > Why are you saying that slicing is not supported? I've checked it (as
> > Python expressions, of course) and it works well. Maybe you are referring to
> > cdef'd C-typed arrays in Pyrex? I think this could be a dangerous thing
> > that can make the pointer arithmetic slow down because of the
> > additional checks required on the slice range.
> Well, there would need to be two value checks per slice. That would be
> significant for small slices, but not for large ones. I'd love to have
> it. It just doesn't feel like Python without slicing, and it doesn't
> feel like NumPy without multi-dimensional slicing.

Again, right now you can use slicing in Pyrex if you are dealing with Python objects, but from the moment you access the lower-level Numeric/numarray buffer and assign it to a Pyrex C pointer, you can't do that anymore. That's the price to pay for speed. About implementing slicing over Pyrex C-pointer arithmetic, it may be worth asking Greg Ewing, the Pyrex author. I'll send him this particular question and forward his answer (if any) to the list.

> > There can be drawbacks, like the one stated by Perry related to how to
> > construct general ufuncs that can handle many different combinations of
> > arrays and types, although I don't understand that very well, because
> > the Numeric and numarray crews already managed to do that in C, so why
> > shouldn't it be possible with Pyrex? Mmm, perhaps there is some
> > pre-processor involved?
>
> I was curious about this comment as well. I have only had success with
> writing my Numeric-based extensions for pre-determined types. If
> I had to support additional types (and/or discontiguous and/or rank-N
> arrays), I ended up with a whole pile of case and/or if statements, and
> also kind of slow and inefficient code.
>
> It seems the only way to do this right is with C++ and templates (e.g.
> Blitz++), but there are good reasons not to go that route.
>
> Would it really be any harder to use Pyrex than C for this kind of
> thing? Also, would it be possible to take a Pyrex-type approach and have
> it do something template-like: you write the generic code in Pyrex, and
> it generates all the type-specific C code for you.

Well, this is another good question for Greg.
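For what it's worth, the "two value checks per slice" mentioned above amount to clamping the slice's start and stop into the valid index range, which is exactly what Python itself does via slice.indices(). A small illustrative sketch of that check (assuming a positive step; this is not Pyrex or numarray code):

```python
def check_slice(start, stop, n):
    """The two per-slice value checks: wrap negative indices,
    then clamp start and stop into the valid range [0, n]."""
    if start < 0:
        start += n
    if stop < 0:
        stop += n
    start = min(max(start, 0), n)   # check 1
    stop = min(max(stop, 0), n)     # check 2
    return start, stop

print(check_slice(2, 30, 10))    # (2, 10) -- stop clamped to the length
print(check_slice(-3, 99, 10))   # (7, 10) -- negative start wrapped
```

The same normalization is available as `slice(2, 30).indices(10)`, which is what makes slicing safe on Python objects but adds per-access cost on raw C pointers.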
I'll try to ask him, although since I don't have experience with that kind of issue, chances are that my question will turn out to be complete nonsense :).

Cheers,

--
Francesc Alted

From perry at stsci.edu Tue Feb 11 12:16:13 2003
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Feb 11 12:16:13 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E47D8AC.4070108@ieee.org>
Message-ID:

Tim Hochberg writes:

>            Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
> NumPy      10 us         10 us          85 ps               95 ps
> NumArray   200 us        530 us         45 ps               135 ps
> Psymeric   50 us         65 us          80 ps               80 ps
>
> The times shown above are for Float64s and are pretty approximate, and
> they happen to be a particularly favorable array shape for Psymeric. I
> have seen Psymeric as much as 50% slower than NumPy for large arrays of
> certain shapes.
>
> The overhead for NumArray is surprisingly large. After doing this
> experiment I'm certainly more sympathetic to Konrad wanting less
> overhead for NumArray before he adopts it.

Wow! Do you really mean picoseconds? I never suspected that either Numeric or numarray were that fast. ;-)

Anyway, this issue is timely [Err...]. As it turns out, we started looking at ways of improving small-array performance a couple of weeks ago and are coming closer to trying out an approach that should reduce the overhead significantly.

But I have some questions about your benchmarks. Could you show me the code that is used to generate the above timings? In particular, I'm interested in the kinds of arrays that are being operated on. It turns out that the numarray overhead depends on more than just contiguity, and it isn't obvious to me which case you are testing. For example, Todd's benchmarks indicate that numarray's overhead is about a factor of 5 larger than NumPy's when the input arrays are contiguous and of the same type. On the other hand, if the array is not contiguous or requires a type conversion, the overhead is much larger.
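Numbers like those in the table above are typically obtained by assuming a linear cost model, t(n) = overhead + n * per_element, and solving it from timings at two different array sizes. A minimal illustrative sketch of that decomposition (the timings below are synthetic, chosen to match the NumPy row, with the "ps" read as ns per Tim's later correction; this is not the actual benchmark code):

```python
def decompose(t_small, n_small, t_large, n_large):
    """Split timings into a fixed per-call overhead and a per-element
    cost, assuming t(n) = overhead + n * per_element."""
    per_element = (t_large - t_small) / (n_large - n_small)
    overhead = t_small - n_small * per_element
    return overhead, per_element

# Synthetic timings (seconds): a 10 us call overhead plus 85 ns/element.
t_3x3 = 10e-6 + 9 * 85e-9          # a 3x3 array (9 elements)
t_big = 10e-6 + 10**6 * 85e-9      # a 1,000,000-element array
overhead, per_element = decompose(t_3x3, 9, t_big, 10**6)
print(overhead * 1e6, "us overhead;", per_element * 1e9, "ns per element")
```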
(Also, these cases require blocking loops over large arrays; we have done nothing yet to optimize the block size or the speed of that loop.) If you are doing the benchmark on contiguous, same-type arrays, I'd like to get a copy of the benchmark program to try to see where the disagreement arises.

The very preliminary indications are that we should be able to make numarray overheads approximately 3 times higher than Numeric's for all ufunc cases. That's still slower, but not by a factor of 20 as shown above. How much work it would take to reduce it further is unclear (the main bottleneck at that point appears to be how long it takes to create new output arrays).

We are still mainly in the analysis and design phase of how to improve performance for small arrays and block looping. We believe that this first step will not require moving very much of the existing Python code into C (but some will be). Hopefully we will have some working code in a couple of weeks.

Thanks,
Perry

From tim.hochberg at ieee.org Tue Feb 11 13:05:05 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Tue Feb 11 13:05:05 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To:
References:
Message-ID: <3E49653D.4050604@ieee.org>

Perry Greenfield wrote:

> Tim Hochberg writes:
>
> >            Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
> > NumPy      10 us         10 us          85 ps               95 ps
> > NumArray   200 us        530 us         45 ps               135 ps
> > Psymeric   50 us         65 us          80 ps               80 ps
> >
> > The times shown above are for Float64s and are pretty approximate, and
> > they happen to be a particularly favorable array shape for Psymeric. I
> > have seen Psymeric as much as 50% slower than NumPy for large arrays of
> > certain shapes.
> >
> > The overhead for NumArray is surprisingly large. After doing this
> > experiment I'm certainly more sympathetic to Konrad wanting less
> > overhead for NumArray before he adopts it.
>
> Wow! Do you really mean picoseconds? I never suspected that
> either Numeric or numarray were that fast.
> ;-)

My bad, I meant ns. What's a little factor of 10^3 among friends.

> Anyway, this issue is timely [Err...]. As it turns out, we started
> looking at ways of improving small-array performance a couple of weeks
> ago and are coming closer to trying out an approach that should
> reduce the overhead significantly.
>
> But I have some questions about your benchmarks. Could you show me
> the code that is used to generate the above timings? In particular,
> I'm interested in the kinds of arrays that are being operated on.
> It turns out that the numarray overhead depends on more than
> just contiguity and it isn't obvious to me which case you are testing.

I'll send you Psymeric, including all the tests, by private email to avoid cluttering up the list. (Don't worry, it's not huge -- only 750 lines of Python at this point.) You can let me know if you find any horrible issues with it.

> For example, Todd's benchmarks indicate that numarray's overhead is
> about a factor of 5 larger than numpy when the input arrays are
> contiguous and of the same type. On the other hand, if the array
> is not contiguous or requires a type conversion, the overhead is
> much larger. (Also, these cases require blocking loops over large
> arrays; we have done nothing yet to optimize the block size or
> the speed of that loop.) If you are doing the benchmark on
> contiguous, same-type arrays, I'd like to get a copy of the benchmark
> program to try to see where the disagreement arises.

Basically, I'm operating on two random, contiguous, 3x3, Float64 arrays. In the noncontiguous case the arrays are indexed using [::2,::2] and [1::2,::2], so these arrays are 2x2 and 1x2. Hmmm, that wasn't intentional; I'm measuring axis stretching as well. However, using [::2,::2] for both axes doesn't change things a whole lot.
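The measurement scheme Tim goes on to describe (run the op N times, throw away the first M warm-up timings, average the rest) can be sketched as a stand-alone harness. This is an illustrative reimplementation using `time.perf_counter`, with plain Python lists standing in for the 3x3 Float64 arrays; it is not the actual Psymeric test code:

```python
import time

def bench(op, a, b, n=3, m=1):
    """Time op(a, b) n times, discard the first m timings (warm-up),
    and return the mean of the remaining ones."""
    times = []
    for _ in range(n):
        t0 = time.perf_counter()
        op(a, b)
        t1 = time.perf_counter()
        times.append(t1 - t0)
    kept = times[m:]
    return sum(kept) / len(kept)

# Two small "arrays" (flat lists standing in for 3x3 Float64 arrays).
a = [float(i) for i in range(9)]
b = [float(i) for i in range(9)]
mean = bench(lambda x, y: [p + q for p, q in zip(x, y)], a, b)
print("mean seconds per call: %.3g" % mean)
```

With N=3 and M=1, as in the thread, only two timings are averaged, so the result is quite noisy; larger N would smooth it out.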
The core timing part looks like this:

    t0 = clock()
    if op == '+':
        c = a + b
    elif op == '-':
        c = a - b
    elif op == '*':
        c = a * b
    elif op == '/':
        c = a / b
    elif op == '==':
        c = a == b
    else:
        raise ValueError("unknown op %s" % op)
    t1 = clock()

This is done N times; the first M values are thrown away and the remaining values are averaged. Currently N is 3 and M is 1, so not a lot of averaging is taking place.

> The very preliminary indications are that we should be able to make
> numarray overheads approximately 3 times higher for all ufunc cases.
> That's still slower, but not by a factor of 20 as shown above. How
> much work it would take to reduce it further is unclear (the main
> bottleneck at that point appears to be how long it takes to create
> new output arrays).

That's good. I think it's important to get people like Konrad on board, and that will require dropping the overhead.

> We are still mainly in the analysis and design phase of how to
> improve performance for small arrays and block looping. We believe
> that this first step will not require moving very much of the
> existing Python code into C (but some will be). Hopefully we
> will have some working code in a couple of weeks.

I hope it goes well.

-tim

From falted at openlc.org Wed Feb 12 01:25:06 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Feb 12 01:25:06 2003
Subject: [Numpy-discussion] Fwd: Re: A couple of questions on Pyrex
Message-ID: <200302121024.24406.falted@openlc.org>

Hi,

Here is Greg's reply to my questions. It seems like Pyrex is not going to change on these two issues. Well, at least he considered the first to be an "interesting" idea.
Cheers,

---------- Forwarded message ----------

From greg at cosc.canterbury.ac.nz Wed Feb 12 01:23:01 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Feb 2003 19:23:01 +1300 (NZDT)
Subject: A couple of questions on Pyrex
Message-ID:

> numbuf = data[2:30:4][1]
>
> in order to get a copy (in a new memory location) of the memory buffer in
> the selected slice to work with it. Would that be interesting to
> implement?

It's an interesting idea, but I think it's getting somewhat beyond the scope of Pyrex. I don't think I'll be trying to implement anything like that in the foreseeable future. The Pyrex compiler is complicated enough already, and I don't want to add anything more that isn't really necessary.

> Is (or will there be) any way in Pyrex to automagically create different
> flavors of this function to deal with different datatypes?

Same here, and even more so -- I'm *definitely* not going to re-implement C++ templates! :-)

Greg Ewing, Computer Science Dept,  +--------------------------------------+
University of Canterbury,           | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand           | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz      +--------------------------------------+

-------------------------------------------------------

--
Francesc Alted

From karthik at james.hut.fi Wed Feb 12 01:32:02 2003
From: karthik at james.hut.fi (Karthikesh Raju)
Date: Wed Feb 12 01:32:02 2003
Subject: [Numpy-discussion] Problems with view.py
Message-ID:

Hi All,

I was using view.py, which comes with NumPy. Somehow view.py slows down the whole IPython shell, and most of the time the program crashes, dumping core. The version of my Numeric is 21.3. Is there some new version of view.py? Is something else needed? Or rather, is there some better viewer for viewing images in Python (specifically for viewing image-processing images)?
Best regards,

karthik

-----------------------------------------------------------------------
Karthikesh Raju,                    email: karthik at james.hut.fi
Researcher,                         http://www.cis.hut.fi/karthik
Helsinki University of Technology,  Tel: +358-9-451 5389
Laboratory of Comp. & Info. Sc.,    Fax: +358-9-451 3277
Department of Computer Sc.,
P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND
-----------------------------------------------------------------------

From falted at openlc.org Wed Feb 12 05:04:03 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Feb 12 05:04:03 2003
Subject: [Numpy-discussion] Psyco MA?
In-Reply-To: <3E49653D.4050604@ieee.org>
References: <3E49653D.4050604@ieee.org>
Message-ID: <200302121403.09480.falted@openlc.org>

Hi,

Some days ago I also did some benchmarks on this issue, and I think it could be good to share my results. I'm basically reproducing Tim's figures, although with an even bigger difference in favour of Numeric for small matrices (2x2). The benchmarks are made with a combination of Python and Pyrex (in order to also test some functions of the Numeric and numarray C APIs). The figures I'm getting are, roughly:

Matrix multiplication:

  In Python:
    matrixmultiply (double(2,2) x double(2,)) in Numeric:   70 us
    matrixmultiply (double(2,2) x double(2,)) in numarray:  4800 us

  In Pyrex:
    numarray multiply in Pyrex, using NA_InputArray:           620 us
    numarray multiply in Pyrex, using PyObject_AsWriteBuffer:  146 us

zeros:

  In Python:
    double(2,) in Numeric:   58 us
    double(2,) in numarray:  3100 us

  In Pyrex (using PyArray_FromDims):
    double(2,) with Numeric:   26 us
    double(2,) with numarray:  730 us

As you can see, in pure Python numarray has a factor of 50 (for zeros) and up to 70 (for matrix multiply) times more overhead than Numeric. Increasing the matrix to 200x20, the overhead difference falls to a factor of 16 (for matrix multiply) and 50 (for zeros), always in favor of Numeric. With Pyrex (i.e.
making the C calls), the differences are not so big, but there is still a difference. In particular, when assuming a contiguous matrix and calling PyObject_AsWriteBuffer directly on the object._data memory buffer, the factor falls to 2. Increasing the matrix to 200x20, the overhead for zeros (using PyArray_FromDims) is the same for numarray as for Numeric (around 700 us), while multiply in Pyrex can't beat matrixmultiply in Numeric (Numeric is still 2 times faster).

Hope that helps. I can also send you my testbeds if you are interested.

--
Francesc Alted

From guido at python.org Thu Feb 13 13:46:11 2003
From: guido at python.org (Guido van Rossum)
Date: Thu Feb 13 13:46:11 2003
Subject: [Numpy-discussion] OSCON / Python 11 proposals deadline is February 15th!
Message-ID: <200302132114.h1DLE4x16909@odiug.zope.com>

The Python 11 Conference is being held July 7-11 in Portland, Oregon as part of OSCON 2003.

http://conferences.oreillynet.com/os2003/

The deadline for proposals is February 15th! You only need to have your proposal in this week; you don't need to worry about trying to put together the complete presentation or tutorial materials at this time.

Proposal submissions page: http://conferences.oreillynet.com/cs/os2003/create/e_sess

Few proposals have been submitted so far; we need many more to have a successful Python 11 conference. If you have submitted a proposal for one of the other Python conferences this year, such as PyCon, I encourage you to go ahead and submit the proposal to Python 11 as well. If you are presenting at the Python UK Conference or EuroPython but are unable to attend Python 11, you should consider having another team member do the presentation.

The theme of OSCON 2003 is "Embracing and Extending Proprietary Software".
Papers and presentations on how to successfully transition away from proprietary software would also be good, but it is not necessary for your proposal to cover the theme; proposals just need to be related to Python.

COMPENSATION: Free registration for speakers (except lightning talks). Tutorial speakers also get: $500 honorarium; $50 per diem on the day of the tutorial; 1 night hotel; airfare.

O'REILLY ANNOUNCEMENT:

2003 O'Reilly Open Source Convention
Call For Participation
Embracing and Extending Proprietary Software
http://conferences.oreilly.com/oscon/

O'Reilly & Associates invites programmers, developers, strategists, and technical staff to submit proposals to lead tutorial and conference sessions at the 2003 Open Source Software Convention, slated for July 7-11 in Portland, OR. Proposals are due February 15, 2003. For more information please visit our OSCON website: http://conferences.oreilly.com/oscon/

The theme this year is "Embracing and Extending Proprietary Software." Few companies use only one vendor's software on desktops, back office, and servers. Variety in operating systems and applications is becoming the norm, for sound financial and technical reasons. With variety comes the need for open, unencumbered standards for data exchange and service interoperability. You can address the theme from any angle you like--for example, you might talk about migrating away from commercial software such as Microsoft Windows, or instead place your emphasis on coexistence.

Convention Conferences:
  Perl Conference 7
  The Python 11 Conference
  PHP Conference 3

Convention Tracks:
  Apache
  XML
  Applications
  MySQL and PostgreSQL
  Ruby

--Guido van Rossum (home page: http://www.python.org/~guido/)

From paul at pfdubois.com Thu Feb 13 19:37:06 2003
From: paul at pfdubois.com (Paul Dubois)
Date: Thu Feb 13 19:37:06 2003
Subject: [Numpy-discussion] PEP-242 Numeric kinds -- disposition
Message-ID: <000501c2d3da$461e0920$6601a8c0@NICKLEBY>

PEP-242 should be closed.
The kinds module will not be added to the standard library. There was no opposition to the proposal, but only mild interest in using it, not enough to justify adding the module to the standard library. Instead, it will be made available as a separate distribution item at the Numerical Python site. As of the next release of Numerical Python, it will no longer be a part of the Numeric distribution.

From falted at openlc.org Mon Feb 17 11:08:01 2003
From: falted at openlc.org (Francesc Alted)
Date: Mon Feb 17 11:08:01 2003
Subject: [Numpy-discussion] rank-0 chararrays?
Message-ID: <200302171940.36278.falted@openlc.org>

Hi,

I'm trying to map the Numeric character typecode ('c') to chararrays, but I have a problem distinguishing between

In [109]: chararray.array("qqqq")
Out[109]: CharArray(['qqqq'])

and

In [110]: chararray.array(["qqqq"])  # Note the extra "[" "]"
Out[110]: CharArray(['qqqq'])        # The same result as 109

while in Numeric we have:

In [113]: Numeric.array("qqqq")
Out[113]: array([q, q, q, q],'c')

In [114]: Numeric.array(["qqqq"])
Out[114]: array([ [q, q, q, q]],'c')  # Differs from 113

Even for numarray objects, rank-0 seems to work well:

In [107]: numarray.array(1)
Out[107]: array(1)

In [108]: numarray.array([1])
Out[108]: array([1])  # Objects differ

So it seems that chararray does not support rank-0 objects well. Is this the expected behavior? If so, we have no way to distinguish between objects 109 and 110, and I'd like to distinguish between the two. What can be done to achieve this?
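The ambiguity above boils down to rank inference: treating a whole string as a single element (as chararray does, unlike Numeric's 'c' typecode, which splits it into characters), "qqqq" should map to a rank-0 array with shape (), while ["qqqq"] should have shape (1,). A pure-Python sketch of that inference rule (illustrative only, not the chararray implementation); recording the inferred shape as metadata next to the flat data is also what would let a package like PyTables round-trip the two cases distinctly:

```python
def infer_shape(obj):
    """Shape that chararray-style rank inference would assign,
    treating a whole string as one scalar element."""
    if isinstance(obj, str):
        return ()                          # a bare string is rank-0
    return (len(obj),) + infer_shape(obj[0])   # a list nests one rank

print(infer_shape("qqqq"))        # ()    -- the In [109] case
print(infer_shape(["qqqq"]))      # (1,)  -- the In [110] case
print(infer_shape([["a", "b"]]))  # (1, 2)
```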
Thanks,

--
Francesc Alted

From tim.hochberg at ieee.org Mon Feb 17 13:06:03 2003
From: tim.hochberg at ieee.org (Tim Hochberg)
Date: Mon Feb 17 13:06:03 2003
Subject: [Numpy-discussion] Psymeric-update
Message-ID: <3E514E78.10905@ieee.org>

The good news is that Psymeric now supports in-place addition and complex numbers (Complex32 and Complex64). Also, by doing some tuning, I got the overhead of Psymeric down to less than three times that of Numeric (versus 20 times in the version of Numarray that I have). Even without Psyco, the code only has an overhead of five and a half times that of Numeric, so it seems that the Numarray folks should at least be able to get down to that level without throwing everything into C.

I have not been able to increase the asymptotic speed, and I think I'm probably stuck on that front for the time being. For the most part Psymeric is close to Numeric for large arrays, which makes it about 50% faster than numarray for noncontiguous arrays and half as fast for contiguous ones. These timings are for Float64: for Int8, Psymeric is ~3x slower than Numeric, for Int16 it's 50% slower, and for Int32 2x slower. Psymeric is very slow for Float32 and Complex32 (~10x slower than Numeric) because of some Psyco issues with array.arrays and floats, which I expect will be fixed at some point. And finally, for Complex64, Psymeric is comparable to Numeric for addition and subtraction, but almost half as fast for multiplication and almost a third as fast for division.

Barring some further improvements in Psyco or some new insights on my part, this is probably as far as I'll go with this. At this point, it would probably not be hard to make this into a work-alike for Numeric or Numarray (excluding the various extension modules: FFT and the like). The one relatively hard part still outstanding is ufunc.accumulate/reduce.
However, the performance, while very impressive for an essentially pure-Python solution, is not good enough to motivate me to use this in preference to Numeric. If anyone is interested in looking at the code, I'd be happy to send it to them.

Regards,

-tim

From jmiller at stsci.edu Tue Feb 18 04:28:03 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Tue Feb 18 04:28:03 2003
Subject: [Numpy-discussion] rank-0 chararrays?
References: <200302171940.36278.falted@openlc.org>
Message-ID: <3E522A95.1000601@stsci.edu>

Francesc Alted wrote:

> Hi,
>
> I'm trying to map the Numeric character typecode ('c') to chararrays, but
> I have a problem distinguishing between
>
> In [109]: chararray.array("qqqq")
> Out[109]: CharArray(['qqqq'])
>
> and
>
> In [110]: chararray.array(["qqqq"])  # Note the extra "[" "]"
> Out[110]: CharArray(['qqqq'])        # The same result as 109

The chararray API pre-dates our awareness, ultimate implementation, and final rejection of rank-0 arrays. In retrospect, your usage above makes sense. Whether we change things now or not is another matter. You are giving me interface angst... :)

You can create rank-0 arrays by specifying shape=() and itemsize=len(buffer). However, these do not repr correctly (unless you update from CVS).

> while in Numeric we have:
>
> In [113]: Numeric.array("qqqq")
> Out[113]: array([q, q, q, q],'c')
>
> In [114]: Numeric.array(["qqqq"])
> Out[114]: array([ [q, q, q, q]],'c')  # Differs from 113
>
> even in numarray objects, rank-0 seems to work well:
>
> In [107]: numarray.array(1)
> Out[107]: array(1)
>
> In [108]: numarray.array([1])
> Out[108]: array([1])  # Objects differ

This was not always so, but we made it work when we thought rank-0 had something to offer. After some discussion on numpy-discussion-list, rank-0 went out of vogue.

> So, it seems that chararray does not support rank-0 objects well.

That is true. CharArray never caught up because rank-0 became vestigial even for NumArray.

> Is this the expected behavior?

Yes.
But rank-0 support for chararray is not far off, with the possible exception of breaking the public interface.

> If so, we have no way to distinguish between objects 109 and 110, and
> I'd like to distinguish between the two.

Why exactly do you need rank-0?

> What can be done to achieve this?

1. Add a little special casing to chararray._charArrayToStringList() to handle rank-0. I did this already in CVS.

2. Debate whether or not to change chararray.array() to work as you've shown above. Proceed from there.

> Thanks,

From falted at openlc.org Tue Feb 18 09:54:08 2003
From: falted at openlc.org (Francesc Alted)
Date: Tue Feb 18 09:54:08 2003
Subject: [Numpy-discussion] rank-0 chararrays?
In-Reply-To: <3E522A95.1000601@stsci.edu>
References: <200302171940.36278.falted@openlc.org> <3E522A95.1000601@stsci.edu>
Message-ID: <200302181853.22563.falted@openlc.org>

On Tuesday, 18 February 2003 at 13:44, Todd Miller wrote:
> You are giving me interface angst... :)

Well, I don't know exactly what you mean by that, but I hope it is nothing too bad ;)

> This was not always so, but we made it work when we thought rank-0 had
> something to offer. After some discussion on numpy-discussion-list,
> rank-0 went out of vogue.

Mmmm, do you mean that rank-0 is being deprecated in numarray?

> Why exactly do you need rank-0?

Apart from supporting chararrays in PyTables, I'm using them as a buffer to save homogeneous character standard lists and tuples, because it is very easy to obtain a contiguous C buffer from them. However, if I have no way to distinguish between "qqq" and ["qqq"] objects directly from the chararray instances obtained from them, I can't materialize them properly when reading the objects back from persistent storage. Perhaps using more metadata could solve the situation (for example, saving the original shape of the object), but I wouldn't like to clutter the PyTables metadata space unnecessarily.

> > What can be done to achieve this?
> 1. Add a little special casing to chararray._charArrayToStringList() to
> handle rank-0. I did this already in CVS.

Ok. For the moment I'll be using the numarray CVS version, although I don't know if the next version of numarray will be out before the next PyTables release (planned in a couple of weeks).

> 2. Debate whether or not to change chararray.array() to work as you've
> shown above. Proceed from there.

Well, the fact is that I needed rank-0 only for the reason stated before, and I'm not sure that is reason enough to open such a debate.

Thanks!,

--
Francesc Alted

From falted at openlc.org Wed Feb 19 04:06:02 2003
From: falted at openlc.org (Francesc Alted)
Date: Wed Feb 19 04:06:02 2003
Subject: [Numpy-discussion] range check: feature request for numarray
Message-ID: <200302191305.00194.falted@openlc.org>

Hi,

I think it would be useful to provide some range checking in numarray. For example, right now you can do:

In [24]: a=numarray.array([1,2],numarray.Int8)

In [25]: a[1] = 256

In [26]: a
Out[26]: array([1, 0], type=Int8)

and nothing happens. But I'm proposing to raise an OverflowWarning so that people can be aware of such range overflows.

Maybe it is desirable for the default to be not to issue the warning, except when the user wants to know about it. So my proposal is that the actual behaviour should be maintained, but when you want to be aware of all the warnings, something like this could happen:

In [28]: warnings.resetwarnings()

In [29]: a=numarray.array([1,2],numarray.Int8)

In [30]: a[1] = 256
OverflowWarning: value assignment not in the type range

In [31]: a
Out[31]: array([1, 0], type=Int8)

But perhaps this feature might slow down the performance of assignments a bit.

Regards,

--
Francesc Alted

`` We are shaped by our thoughts, we become what we think. When the mind is pure, joy follows like a shadow that never leaves.
'' -- Buddha, The Dhammapada

From haase at msg.ucsf.edu Wed Feb 19 12:21:38 2003
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed Feb 19 12:21:38 2003
Subject: [Numpy-discussion] byteswap() not doing anything !?
Message-ID: <200302191220.26456.haase@msg.ucsf.edu>

>>> xx = na.array((1,2,3))
>>> xx
array([1, 2, 3])
>>> xx.byteswap()
>>> xx
array([1, 2, 3])
>>> xx.type()
Int32

Hi all,

I was reading the documentation for numarray 0.4, but I get the above results. How do I get the bytes swapped like it says in the manual:

byteswap()
    The byteswap method performs a byte swapping operation on all the
    elements in the array, working in place (i.e. it returns None).

    >>> print a
    [1 2 3]
    >>> a.byteswap()
    >>> print a
    [16777216 33554432 50331648]

Thanks,
Sebastian

From jmiller at stsci.edu Wed Feb 19 12:45:02 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Wed Feb 19 12:45:02 2003
Subject: [Numpy-discussion] byteswap() not doing anything !?
References: <200302191220.26456.haase@msg.ucsf.edu>
Message-ID: <3E53EC22.50109@stsci.edu>

Sebastian Haase wrote:

> >>> xx = na.array((1,2,3))
> >>> xx
> array([1, 2, 3])
> >>> xx.byteswap()
> >>> xx
> array([1, 2, 3])
> >>> xx.type()
> Int32
>
> Hi all,
> I was reading the documentation for numarray 0.4, but I get the above
> results. How do I get the bytes swapped like it says in the manual:
>
> byteswap()
>     The byteswap method performs a byte swapping operation on all the
>     elements in the array, working in place (i.e. it returns None).
>
>     >>> print a
>     [1 2 3]
>     >>> a.byteswap()
>     >>> print a
>     [16777216 33554432 50331648]

This is a known bug/incompatibility. The behavior will be changed for the next release of numarray. Right now, _byteswap() does what you want.
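The numbers in the quoted manual entry are just each 32-bit value with its four bytes reversed: 0x00000001 read in the opposite byte order is 0x01000000 = 16777216. A quick pure-Python check of that arithmetic with the struct module (not using numarray at all):

```python
import struct

def byteswapped32(values):
    """Reinterpret 32-bit ints with their byte order reversed:
    pack little-endian, then unpack big-endian."""
    n = len(values)
    return list(struct.unpack('>%di' % n, struct.pack('<%di' % n, *values)))

print(byteswapped32([1, 2, 3]))   # [16777216, 33554432, 50331648]
```

Swapping twice is the identity, which is why a naive look at an array that swaps on store and again on display appears to "do nothing".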
> Thanks,
> Sebastian

Todd

From jmiller at stsci.edu Thu Feb 20 00:08:13 2003
From: jmiller at stsci.edu (Todd Miller)
Date: Thu Feb 20 00:08:13 2003
Subject: [Numpy-discussion] range check: feature request for numarray
References: <200302191305.00194.falted@openlc.org>
Message-ID: <3E5490C1.2010707@stsci.edu>

Francesc Alted wrote:

> Hi,

Hi Francesc,

I'm sorry about the slow response on this. I looked into what it would take to do this, and while I agree with you in principle, right now my hands are full trying to beat down numarray overhead.

> I think it would be useful to provide some range checking in numarray. For
> example, right now, you can do:
>
> In [24]: a=numarray.array([1,2],numarray.Int8)
>
> In [25]: a[1] = 256
>
> In [26]: a
> Out[26]: array([1, 0], type=Int8)
>
> and nothing happens. But I'm proposing to raise an OverflowWarning so that
> people can be aware of such range overflows.

That sounds reasonable. If you'd care to do a patch, I think we would want it. If you don't have time, it may be a little while before we do.

> Maybe it is desirable that the default would be to not issue the warning,
> except when the user wanted to know about that.

I think I'd rather see the warning on by default, even though it might "break" some existing code.

> So, my proposal is that the actual behaviour should be maintained, but when
> you want to be aware of all the warnings something like this could happen:
>
> In [28]: warnings.resetwarnings()
>
> In [29]: a=numarray.array([1,2],numarray.Int8)
>
> In [30]: a[1] = 256
> OverflowWarning: value assignment not in the type range
>
> In [31]: a
> Out[31]: array([1, 0], type=Int8)
>
> But perhaps this feature might slow a bit the performance of assignments.

Yes, but probably not too much.
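The proposed check can be prototyped in a few lines of pure Python: test the value against the type's range on assignment, warn, then wrap the way the C store would. A hedged sketch (OverflowWarning existed as a builtin in Python 2 but is defined locally here; `store_int8` is a hypothetical helper, not numarray code):

```python
import warnings

class OverflowWarning(UserWarning):
    """Stand-in for the warning class the proposal asks for."""

def store_int8(buf, index, value):
    """Assign into a list standing in for an Int8 buffer, warning on
    out-of-range values and wrapping them like a C signed-char store."""
    if not -128 <= value <= 127:
        warnings.warn("value assignment not in the type range",
                      OverflowWarning, stacklevel=2)
    buf[index] = (value + 128) % 256 - 128   # two's-complement wrap

a = [1, 2]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    store_int8(a, 1, 256)
print(a)            # [1, 0], the same result the thread shows
print(len(caught))  # 1 -- the OverflowWarning was issued
```

The range test is one extra comparison per scalar assignment, which is the "probably not too much" slowdown Todd refers to.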
>Regards, > > > Todd From jensj at fysik.dtu.dk Thu Feb 20 04:00:04 2003 From: jensj at fysik.dtu.dk (Jens Jorgen Mortensen) Date: Thu Feb 20 04:00:04 2003 Subject: [Numpy-discussion] BLAS Message-ID: <200302201258.37704.jensj@bose.fysik.dtu.dk> Hi, When doing matrix-matrix multiplications with large matrices, using the BLAS library (Basic Linear Algebra Subprograms) can speed up things a lot. I don't think Numeric takes advantage of this (is this correct?). Will numarray be able to do that? Jens From falted at openlc.org Thu Feb 20 05:41:12 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Feb 20 05:41:12 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? In-Reply-To: <3E53EC22.50109@stsci.edu> References: <200302191220.26456.haase@msg.ucsf.edu> <3E53EC22.50109@stsci.edu> Message-ID: <200302201440.32504.falted@openlc.org> A Dimecres 19 Febrer 2003 21:42, Todd Miller va escriure: > > >>> print a > > > > [1 2 3] > > > > >>> a.byteswap() > > >>> print a > > > > [16777216 33554432 50331648] > > This is a known bug/incompatability. The behavior will be changed for > the next release of numarray. Right now, _byteswap() does what you want. This is already decided?. Because I like the present behaviour. At first, I've found this behaviour a bit strange, but after get used to it, I admit that it is elegant because you can always see a sane representation of the data in array independently of which architecture you have written the array. If you byteswap() an array, the _byteorder property is also changed, so you can check if your array is bytswapped or not just by writing: if a._byteorder <> sys.byteorder: print "a is byteswapped" else: print "a is not byteswapped" And, as you said before, you can always call _byteswap() if you *really* want to *only* byteswap the array. 
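Francesc's `_byteorder` check translates directly to modern NumPy, where the byte order lives on the dtype rather than on a private attribute. A sketch (the helper name is ours):

```python
import sys
import numpy as np

def is_byteswapped(a):
    """True when a's data is stored opposite to the host byte order
    (the NumPy analogue of comparing numarray's _byteorder to
    sys.byteorder)."""
    bo = a.dtype.byteorder                      # '=', '<', '>' or '|'
    native = '<' if sys.byteorder == 'little' else '>'
    return bo in '<>' and bo != native

a = np.zeros(2, dtype='i4')                          # native storage
b = np.zeros(2, dtype=np.dtype('i4').newbyteorder())  # opposite storage
print(is_byteswapped(a), is_byteswapped(b))          # False True
```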
PyTables already makes use of byteswap() as it is now, and that's nice because an array can be restored from disk safely by just looking at the byte order on disk and then setting properly the ._byteorder attribute. That's all! This allows also to work seamlessly with objects coming from a mixture of big-endian and low-endian machines. But, anyway, if you plan to do the change, may you please tell us what would be the expected behaviour of the future .byteswap(), ._byteswap() and ._byteorder? Thanks, -- Francesc Alted From jmiller at stsci.edu Thu Feb 20 06:20:13 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 20 06:20:13 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? References: <200302191220.26456.haase@msg.ucsf.edu> <3E53EC22.50109@stsci.edu> <200302201440.32504.falted@openlc.org> Message-ID: <3E54E7E8.3030705@stsci.edu> Francesc Alted wrote: >A Dimecres 19 Febrer 2003 21:42, Todd Miller va escriure: > > >>> >>> print a >>> >>> [1 2 3] >>> >>> >>> a.byteswap() >>> >>> print a >>> >>> [16777216 33554432 50331648] >>> >>> >>This is a known bug/incompatability. The behavior will be changed for >>the next release of numarray. Right now, _byteswap() does what you want. >> >> > >This is already decided?. Because I like the present behaviour. > It's already in CVS. Let me know what you think about the stuff below. > >At first, I've found this behaviour a bit strange, but after get used to it, >I admit that it is elegant because you can always see a sane representation >of the data in array independently of which architecture you have written >the array. > > I think byteswap() came to be the way it is in numarray-0.4 as a result of my experiences with cross-platform pickling. It made sense to me at the time. However, it is definitely a new point of confusion, and not backwards compatible with Numeric, so I think the numarray-0.4 byteswap() behavior was a mistake. 
>If you byteswap() an array, the _byteorder property is also changed, so you >can check if your array is bytswapped or not just by writing: > >if a._byteorder <> sys.byteorder: > print "a is byteswapped" >else: > print "a is not byteswapped" > >And, as you said before, you can always call _byteswap() if you *really* >want to *only* byteswap the array. > >PyTables already makes use of byteswap() as it is now, and that's nice >because an array can be restored from disk safely by just looking at the >byte order on disk and then setting properly the ._byteorder attribute. >That's all! This allows also to work seamlessly with objects coming from a >mixture of big-endian and low-endian machines. > >But, anyway, if you plan to do the change, may you please tell us what would >be the expected behaviour of the future .byteswap(), ._byteswap() and >._byteorder? > > The current "plan" is that byteswap() and _byteswap() will both behave as _byteswap() does now; i.e., they will be Numeric compatible synonyms. An explict (extra) call to togglebyteorder() will then produce the current behavior. The meaning of _byteorder will be unchanged. Please let me know if you see any snags in the plan. >Thanks, > > > From paul at pfdubois.com Thu Feb 20 08:51:02 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Thu Feb 20 08:51:02 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <200302201258.37704.jensj@bose.fysik.dtu.dk> Message-ID: <000201c2d900$12107110$6601a8c0@NICKLEBY> > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net] On > Behalf Of Jens Jorgen Mortensen > Sent: Thursday, February 20, 2003 3:59 AM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] BLAS > > > Hi, > > When doing matrix-matrix multiplications with large matrices, > using the BLAS > library (Basic Linear Algebra Subprograms) can speed up > things a lot. 
I don't > think Numeric takes advantage of this (is this correct?). No. You can configure it at installation to use the BLAS of choice. > Will numarray be > able to do that? > > Jens > > > ------------------------------------------------------- > This SF.net email is sponsored by: SlickEdit Inc. Develop an > edge. The most comprehensive and flexible code editor you can > use. Code faster. C/C++, C#, Java, HTML, XML, many more. FREE > 30-Day Trial. www.slickedit.com/sourceforge > _______________________________________________ > Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From falted at openlc.org Thu Feb 20 10:54:07 2003 From: falted at openlc.org (Francesc Alted) Date: Thu Feb 20 10:54:07 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? In-Reply-To: <3E54E7E8.3030705@stsci.edu> References: <200302191220.26456.haase@msg.ucsf.edu> <200302201440.32504.falted@openlc.org> <3E54E7E8.3030705@stsci.edu> Message-ID: <200302201953.05991.falted@openlc.org> A Dijous 20 Febrer 2003 15:36, Todd Miller va escriure: > > The current "plan" is that byteswap() and _byteswap() will both behave > as _byteswap() does now; i.e., they will be Numeric compatible synonyms. > > An explict (extra) call to togglebyteorder() will then produce the > current behavior. The meaning of _byteorder will be unchanged. > > Please let me know if you see any snags in the plan. Well, I've been doing some tests, and I think I'll be able to produce a version of my code that will be compatible with numarray 0.4 and future versions (I'm just no using byteswap() at all). 
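Todd's plan (swap the raw bytes with `byteswap()`, flip the recorded order with a separate `togglebyteorder()` call) maps onto modern NumPy as a byte swap plus a dtype view; doing both together changes the storage without changing the values, which is the numarray-0.4 behaviour Francesc liked:

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
# byteswap() alone moves the bytes but keeps the byte-order label, so the
# values you see change; viewing with the flipped byte order is the
# togglebyteorder() step, and the two together leave the values intact.
b = a.byteswap().view(a.dtype.newbyteorder())
print(b)                                       # [1 2 3]
print(b.dtype.byteorder != a.dtype.byteorder)  # True
```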
However, I've detected a side effect on this change: copy() method is broken now in CVS: In [131]: a=numarray.array([1,2]) In [132]: a.togglebyteorder() In [133]: b=a.copy() In [134]: a Out[134]: array([16777216, 33554432]) In [135]: b Out[135]: array([1, 2]) In [136]: a._byteorder Out[136]: 'big' In [137]: b._byteorder Out[137]: 'big' so, you don't get a well-behaved copy of original array a in b I think the next patch should cure it: --- numarray.py Tue Feb 18 16:35:16 2003 +++ /usr/local/lib/python2.2/site-packages/numarray/numarray.py Thu Feb 20 19:36:07 2003 @@ -609,6 +609,7 @@ c._type = self._type if self.isbyteswapped(): c.byteswap() + c.togglebyteorder() return c -- Francesc Alted From jmiller at stsci.edu Thu Feb 20 11:23:14 2003 From: jmiller at stsci.edu (Todd Miller) Date: Thu Feb 20 11:23:14 2003 Subject: [Numpy-discussion] byteswap() not doing anything !? References: <200302191220.26456.haase@msg.ucsf.edu> <200302201440.32504.falted@openlc.org> <3E54E7E8.3030705@stsci.edu> <200302201953.05991.falted@openlc.org> Message-ID: <3E552EF9.9060507@stsci.edu> Francesc Alted wrote: >A Dijous 20 Febrer 2003 15:36, Todd Miller va escriure: > > >>The current "plan" is that byteswap() and _byteswap() will both behave >>as _byteswap() does now; i.e., they will be Numeric compatible synonyms. >> >>An explict (extra) call to togglebyteorder() will then produce the >>current behavior. The meaning of _byteorder will be unchanged. >> >>Please let me know if you see any snags in the plan. >> >> > >Well, I've been doing some tests, and I think I'll be able to produce a >version of my code that will be compatible with numarray 0.4 and future >versions (I'm just no using byteswap() at all). 
> >However, I've detected a side effect on this change: copy() method is >broken now in CVS: > >In [131]: a=numarray.array([1,2]) > >In [132]: a.togglebyteorder() > >In [133]: b=a.copy() > >In [134]: a >Out[134]: array([16777216, 33554432]) > >In [135]: b >Out[135]: array([1, 2]) > >In [136]: a._byteorder >Out[136]: 'big' > >In [137]: b._byteorder >Out[137]: 'big' > >so, you don't get a well-behaved copy of original array a in b > > Doh! >I think the next patch should cure it: > >--- numarray.py Tue Feb 18 16:35:16 2003 >+++ /usr/local/lib/python2.2/site-packages/numarray/numarray.py Thu Feb 20 >19:36:07 2003 >@@ -609,6 +609,7 @@ > c._type = self._type > if self.isbyteswapped(): > c.byteswap() >+ c.togglebyteorder() > return c > > Thanks! Todd From R.M.Everson at exeter.ac.uk Thu Feb 20 15:43:14 2003 From: R.M.Everson at exeter.ac.uk (R.M.Everson) Date: Thu Feb 20 15:43:14 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <000201c2d900$12107110$6601a8c0@NICKLEBY> References: <000201c2d900$12107110$6601a8c0@NICKLEBY> Message-ID: Hi, As Paul Dubois says, some Numeric functions can be configured to use the BLAS library. However, the BLAS is not used for, perhaps the most common and important operation: matrix/vector multiplication. We have written a small patch to interface to replace the matrixproduct/dot/innerproduct functions in multiarraymodule.c with the appropriate BLAS calls. The patch (against Numeric 21.1b) can be found at http://www.dcs.ex.ac.uk/~aschmolc/Numeric and can give a speed up of a factor of 40 on 1000 by 1000 matrices using the Atlas BLAS. More details of the (naive!) timings can be found there too. We had planned on making a general announcement of this patch (updated to suit Numeric 22) in a week or so. However, we have just noticed that Numeric.dot (=Numeric.innerproduct = Numeric.matrixmultiply) does not take the complex conjugate of its first argument. 
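The split Richard asks for is essentially what NumPy eventually adopted: `dot` stays conjugate-free, and a separate `vdot` conjugates its first argument. A sketch of the two behaviours:

```python
import numpy as np

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1 + 1j])

plain = np.dot(x, y)   # sum of x[i]*y[i], no conjugation (Numeric-style dot)
conj = np.vdot(x, y)   # sum of conj(x[i])*y[i], the complex inner product
print(plain, conj)     # (8+5j) (2-1j)
```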
Taking the complex conjugate seems to me to be the right thing for a routine named dot or innerproduct. Indeed, until we were bitten by it not taking the conjugate, I thought it did. Can someone here explain the rational behind having dot, innerproduct and matrixmultiply all do the same thing and none of them taking the conjugate? (Matlab dot() takes the conjugate, although Matlab mtimes() (called for A*B) does not). I would propose that innerproduct and dot be changed to take the conjugate and a new function that doesn't (say, mtimes) be introduced. I suspect, however, that this would break too much existing code. It would be nice to get it right in Numarray. Alternatively, can someone suggest how both functions can be conveniently and non-confusingly exposed? Richard. Paul F Dubois writes: >> -----Original Message----- From: >> numpy-discussion-admin at lists.sourceforge.net >> [mailto:numpy-discussion-admin at lists.sourceforge.net] On Behalf Of >> Jens Jorgen Mortensen Sent: Thursday, February 20, 2003 3:59 AM To: >> numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] >> BLAS >> >> >> Hi, >> >> When doing matrix-matrix multiplications with large matrices, using >> the BLAS library (Basic Linear Algebra Subprograms) can speed up >> things a lot. I don't think Numeric takes advantage of this (is this >> correct?). > No. You can configure it at installation to use the BLAS of choice. >> Will numarray be able to do that? >> >> Jens >> From a.schmolck at gmx.net Thu Feb 20 17:22:18 2003 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Thu Feb 20 17:22:18 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: References: <000201c2d900$12107110$6601a8c0@NICKLEBY> Message-ID: R.M.Everson at exeter.ac.uk (R.M.Everson) writes: > Hi, > > As Paul Dubois says, some Numeric functions can be configured to use the > BLAS library. However, the BLAS is not used for, perhaps the most common > and important operation: matrix/vector multiplication. 
> > We have written a small patch to interface to replace the > matrixproduct/dot/innerproduct functions in multiarraymodule.c with the > appropriate BLAS calls. > > The patch (against Numeric 21.1b) can be found at > http://www.dcs.ex.ac.uk/~aschmolc/Numeric and can give a speed up of a > factor of 40 on 1000 by 1000 matrices using the Atlas BLAS. More details > of the (naive!) timings can be found there too. > An addendum: the new version is no longer a patch against Numeric, but a separate module, currently called 'dotblas', which is a cleaner approach as it doesn't require using a modified version of Numeric. To use this fast dot instaed of Numeric's dot, you can e.g do: import Numeric # no errors if dotblas isn't installed try: import dotblas Numeric.dot = dotblas.dot except ImportError: pass I just put a prerelease (which still handles complex arrays DIFFERENTLY from Numeric!!!) online at: http://www.dcs.ex.ac.uk/~aschmolc/Numeric/dotblas.html enjoy, alex From falted at openlc.org Fri Feb 21 03:18:03 2003 From: falted at openlc.org (Francesc Alted) Date: Fri Feb 21 03:18:03 2003 Subject: [Numpy-discussion] Non-regular lists in numarray Message-ID: <200302211216.56247.falted@openlc.org> Hi, I've found that numarray.array doesn't check enough the input for non-regular objects. For example: In [95]: numarray.array([3., [4, 5.2]]) Out[95]: array([ 3. , 5.7096262]) but, In [96]: Numeric.array([3., [4, 5.2]]) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ? TypeError: bad argument type for built-in operation I find Numeric behaviour more appropriate. 
Regards, -- Francesc Alted From jmiller at stsci.edu Fri Feb 21 08:09:18 2003 From: jmiller at stsci.edu (Todd Miller) Date: Fri Feb 21 08:09:18 2003 Subject: [Numpy-discussion] Non-regular lists in numarray References: <200302211216.56247.falted@openlc.org> Message-ID: <3E564F0E.90400@stsci.edu> I logged this as a bug and I'll get to it as soon as I'm out of "numarray overhead reduction mode." Thanks! Todd Francesc Alted wrote: >Hi, > >I've found that numarray.array doesn't check enough the input for >non-regular objects. For example: > >In [95]: numarray.array([3., [4, 5.2]]) >Out[95]: array([ 3. , 5.7096262]) > >but, > >In [96]: Numeric.array([3., [4, 5.2]]) >--------------------------------------------------------------------------- >TypeError Traceback (most recent call last) > >? > >TypeError: bad argument type for built-in operation > > >I find Numeric behaviour more appropriate. > >Regards, > > > From paul at pfdubois.com Fri Feb 21 08:36:02 2003 From: paul at pfdubois.com (Paul F Dubois) Date: Fri Feb 21 08:36:02 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <200302201759.44307.jensj@bose.fysik.dtu.dk> Message-ID: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> I had forgotten about this case. I think when these were done it was thought that it would be better if the Numeric core did not require use of LAPACK/BLAS. We were thinking back then of a core with other packages, and the blas we use by default is probably the same speed so it didn't seem important. I would have no problem with a patch to change this. > -----Original Message----- > From: Jens Jorgen Mortensen [mailto:jensj at fysik.dtu.dk] > Sent: Thursday, February 20, 2003 9:00 AM > To: Paul F Dubois > Subject: Re: [Numpy-discussion] BLAS > > > On Torsdag den 20. februar 2003 17:49, Paul F Dubois wrote: > > > > When doing matrix-matrix multiplications with large > matrices, using > > > the BLAS library (Basic Linear Algebra Subprograms) can speed up > > > things a lot. 
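What the BLAS buys can be seen by comparing the textbook triple loop (roughly what Numeric's C `dot` did, one multiply-add at a time) against a call that routes to BLAS `dgemm`, as `np.dot`/`@` does in modern NumPy; on large matrices the BLAS path is orders of magnitude faster thanks to cache blocking and vectorization. A small correctness sketch:

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)

# textbook triple loop: no blocking, no vectorization
naive = np.zeros((2, 4))
for i in range(2):
    for j in range(4):
        for k in range(3):
            naive[i, j] += a[i, k] * b[k, j]

fast = a @ b          # dispatches to the linked BLAS *gemm routine
assert np.allclose(naive, fast)
```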
I don't > > > think Numeric takes advantage of this (is this correct?). > > > > No. You can configure it at installation to use the BLAS of choice. > > > > I know that the stuff in LinearAlgebra can be configured to > use a BLAS of > choice, but what about the Numeric.dot function? > > Can I configure Numeric so that this: > > >>> a = Numeric.dot(b, c) > > will use BLAS? > > Jens > > From haase at msg.ucsf.edu Fri Feb 21 23:10:05 2003 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri Feb 21 23:10:05 2003 Subject: [Numpy-discussion] make C array accessible to python without copy Message-ID: Short follow up: 1) Is it planned to support this more directly? 2) How much does it cost to create a buffer object if it uses my already allocated memory ? 3) Can I change the pointer so that it points to a different memory space WITHOUT having to recreate any python objects? Or would that "confuse" the buffer or numarray? (We are hoping to aquire 30 images per second - the images should get written into a circular buffer so that the data can be written to disk in larger chunks - but the python array should always refer to the current image ) Thanks for all the nice toys (tools) ;-) Sebastian Haase On Fri, 17 Jan 2003 18:16:01 -0500 Todd Miller wrote: >Sebastian Haase wrote: > >>Hi, >>What is the C API to make an array that got allocated, >>let's say, by a = new short[512*512], >>accessible to python as numarray. >> >What you want to do is not currently supported well in C. > The way to do what you want is: > >1. Create a buffer object from your C++ array. The >buffer object can be built such that it refers to the >original copy of the data. > >2. Call back into Python (numarray.NumArray) with your >buffer object as the buffer parameter. > >You can scavenge the code in NA_newAll (Src/newarray.ch) >for most of the callback. > >>I tried NA_New - but that seems to make a copy. 
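Todd's buffer-object recipe is, in modern NumPy terms, `np.frombuffer`: wrap already-allocated memory in an array that views it, so writes to the memory are visible through the array with no copy — exactly the "observe a changing camera buffer" use case. A sketch (a `bytearray` stands in for the C `new short[512*512]` block, and an explicit little-endian int16 dtype keeps the example host-independent):

```python
import numpy as np

raw = bytearray(512 * 512 * 2)   # stand-in for externally allocated memory
img = np.frombuffer(raw, dtype='<i2').reshape(512, 512)  # a view, not a copy

raw[0] = 7             # mutate the underlying buffer directly...
print(img[0, 0])       # ...and the array sees the change: 7
```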
>>I would need it to use the original memory space >>so that I can "observe" the array from Python WHILE >>the underlying C array changes (it's actually a camera >>image) >> >That sounds cool! > >> >>Thanks, >>Sebastian Haase From g_will at cyberus.ca Thu Feb 27 11:05:10 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Thu Feb 27 11:05:10 2003 Subject: [Numpy-discussion] filtering numeric arrays Message-ID: <003501c2de93$3c6b50e0$c456e640@wnt20337> Hi All, I have an 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would like to remove all the points in the array that don't meet the min/max point criteria. I will have several thousand points. With lists I can do it like [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] How do I get the same functionality and better speed using numeric. I have tried a bunch of things using compress and take but I am running up against a brick wall. Any ideas? Thanks, Gordon Williams References: <003501c2de93$3c6b50e0$c456e640@wnt20337> Message-ID: <3E5E65A6.8030409@ieee.org> Gordon Williams wrote: >Hi All, > >I have an 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would >like to remove all the points in the array that don't meet the min/max point >criteria. I will have several thousand points. With lists I can do it like > >[(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] > >How do I get the same functionality and better speed using numeric. I have >tried a bunch of things using compress and take but I am running up against >a brick wall. > > I think you want something like this: >>> cond = (xMin < a[:,0]) & (a[:,0] < xMax) & (yMin < a[:,1]) & (a[:,1] < yMax) >>> np.compress(cond, a, 0) Where 'a' is your original Nx2 array. Unfortunately the obvious, prettier notation (xMin < a[:,0] < xMax) fails because python treats that as "(xMin < a[:,0]) and (a[:,0] < xMax)" and "and" is not what you need here, '&' is. -tim > >Any ideas? > >Thanks > >Gordon Williams > > > > > > >------------------------------------------------------- >This sf.net email is sponsored by:ThinkGeek >Welcome to geek heaven.
>http://thinkgeek.com/sf >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From Chris.Barker at noaa.gov Thu Feb 27 11:41:12 2003 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu Feb 27 11:41:12 2003 Subject: [Numpy-discussion] filtering numeric arrays In-Reply-To: <003501c2de93$3c6b50e0$c456e640@wnt20337> Message-ID: <5E98E540-4A8B-11D7-8A87-000393A96660@noaa.gov> On Thursday, February 27, 2003, at 11:05 AM, Gordon Williams wrote: > I have an 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I > would > like to remove all the points in the array that don't meet the min/max > point > criteria. I will have several thousand points. With lists I can do > it like > > [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] This should do it: >>> a array([[1, 3], [2, 4], [5, 6]]) >>> valid = (a[:,0] > minX) & (a[:,0] < maxX) & (a[:,1] > minY) & (a[:,1] < maxY) >>> take(a,nonzero(valid)) array([ [2, 4]]) Note that & is a bitwise-and, not a logical and, but in this case, the result is the same. Unfortunately, the way Python works, overloading "and" is difficult. -Chris Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From g_will at cyberus.ca Thu Feb 27 12:45:31 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Thu Feb 27 12:45:31 2003 Subject: [Numpy-discussion] Re: filtering numeric arrays Message-ID: <001601c2dea1$84d94e50$c456e640@wnt20337> Thanks to Tim and Chris. Just what I was looking for! I tested both along with some other tries that I had made.
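For readers of the archive: the idioms Tim and Chris gave collapse, in modern NumPy, into a single boolean-mask index that filters and copies in one step:

```python
import numpy as np

a = np.array([(1, 3), (2, 4), (5, 6)])
xMin, xMax, yMin, yMax = 1, 5, 3, 7

# one boolean mask replaces compress()/take(nonzero()) entirely
mask = (a[:, 0] > xMin) & (a[:, 0] < xMax) & (a[:, 1] > yMin) & (a[:, 1] < yMax)
print(a[mask])        # [[2 4]]
```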
For 10000 points - G:\GPS\Python\GUI_Test\Filter>speed.py time for <function listComp at 0x...> is 0.022809 time for <function arraykludge at 0x...> is 0.060303 time for <function arrayComp at 0x...> is 0.055692 time for <function arrayTimH at 0x...> is 0.003652 time for <function arrayChrisB at 0x...> is 0.003561 For 100 points - G:\GPS\Python\GUI_Test\Filter>speed.py time for <function listComp at 0x...> is 0.000238 time for <function arraykludge at 0x...> is 0.000784 time for <function arrayComp at 0x...> is 0.000678 time for <function arrayTimH at 0x...> is 0.000376 time for <function arrayChrisB at 0x...> is 0.000153 They scale slightly differently between Tim's and Chris' methods. Thanks again, Gordon Williams Here is the code (since someone will ask):

'''Test the speed of different methods of getting points out of a list'''
import time
import Numeric as n

size = 100
maxNum = size/10.

# data
a = n.array(n.arange(0, maxNum, .05))
a.shape = (size, 2)
# list
l = a.tolist()

(xMin, yMin) = (3, 2)
(xMax, yMax) = (4, 6)

def listComp(seq):
    '''using list comprehension'''
    return [(x, y) for x, y in seq if xMin < x < xMax and yMin < y < yMax]

...

def arrayTimH(seq):
    '''Tim's compress() method'''
    cond = (xMin < seq[:,0]) & (seq[:,0] < xMax) & (yMin < seq[:,1]) & (seq[:,1] < yMax)
    return n.compress(cond, seq, 0)

def arrayChrisB(seq):
    '''Chris' take()/nonzero() method'''
    valid = (seq[:,0] > xMin) & (seq[:,0] < xMax) & (seq[:,1] > yMin) & (seq[:,1] < yMax)
    return n.take(a, n.nonzero(valid))

# Tests
tests = [(listComp, l), (arraykludge, a), (arrayComp, a), (arrayTimH, a), (arrayChrisB, a)]
for fun, seq in tests:
    t = time.clock()
    apply(fun, (seq,))
    dt = time.clock() - t
    print "time for %s is %f" % (str(fun), dt)

----- Original Message ----- From: "Gordon Williams" To: Sent: Thursday, February 27, 2003 2:05 PM Subject: filtering numeric arrays > Hi All, > > I have an 2D numeric array of x,y points eg [(1,3),(2,4),(5,6)] and I would > like to remove all the points in the array that don't meet the min/max point > criteria. I will have several thousand points. With lists I can do it like > > [(x,y) for x,y in seq if xMin < x < xMax and yMin < y < yMax] > > How do I get the same functionality and better speed using numeric. I have > tried a bunch of things using compress and take but I am running up against > a brick wall. > > > Any ideas? > > Thanks > > Gordon Williams > > > > From dubois1 at llnl.gov Thu Feb 27 16:44:05 2003 From: dubois1 at llnl.gov (Paul F. Dubois) Date: Thu Feb 27 16:44:05 2003 Subject: [Numpy-discussion] Last call for v. 23 Message-ID: I am going to make a release of Numeric, 23.0.
Fellow developers who are inspired to fix a bug are urged to do so immediately. This will be a bug fix release. From jdhunter at ace.bsd.uchicago.edu Thu Feb 27 20:27:15 2003 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Thu Feb 27 20:27:15 2003 Subject: [Numpy-discussion] speedy remove mean of rows Message-ID: I have a large (40,000 x 128) Numeric array, X, with typecode Float. In some cases the number of rows may be approx 10x greater. I want to create an array Y with the same dimensions as X, where each element of Y is the corresponding element of X with the mean of the row on which it occurs subtracted away. Ie, Y = X - transpose(resize(mean(X,1), (X.shape[1],X.shape[0]))) I am wondering if this is the most efficient way (speed and memory). Thanks for any suggestions, John Hunter From eric at enthought.com Thu Feb 27 21:40:17 2003 From: eric at enthought.com (Eric Jones) Date: Thu Feb 27 21:40:17 2003 Subject: [Numpy-discussion] speedy remove mean of rows Message-ID: <20030228054158.0D0111050@www.enthought.com> Hey John, I think broadcasting is your best bet. Here is a snippet using scipy (Numeric will be pretty much the same). >>> from scipy import * >>> a = stats.random((4,3)) a array([[ 0.94058263, 0.24342623, 0.74673623], [ 0.53151542, 0.07523929, 0.49730805], [ 0.5161854 , 0.51049614, 0.70360875], [ 0.09470515, 0.60604334, 0.64941102]]) >>> stats.mean(a) # axis=-1 by default in scipy array([ 0.6435817 , 0.36802092, 0.57676343, 0.45005317]) >>> a-stats.mean(a)[:,NewAxis] array([[ 0.29700093, -0.40015546, 0.10315453], [ 0.1634945 , -0.29278163, 0.12928713], [-0.06057803, -0.06626729, 0.12684532], [-0.35534802, 0.15599017, 0.19935785]]) eric John Hunter wrote .. > > I have a large (40,000 x 128) Numeric array, X, with typecode Float. > In some cases the number of rows may be approx 10x greater. 
> > I want to create an array Y with the same dimensions as X, where each > element of Y is the corresponding element of X with the mean of the > row on which it occurs subtracted away. Ie, > > Y = X - transpose(resize(mean(X,1), (X.shape[1],X.shape[0]))) > > I am wondering if this is the most efficient way (speed and memory). > > Thanks for any suggestions, > John Hunter > > > ------------------------------------------------------- > This sf.net email is sponsored by:ThinkGeek > Welcome to geek heaven. > http://thinkgeek.com/sf > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From a.schmolck at gmx.net Fri Feb 28 09:54:04 2003 From: a.schmolck at gmx.net (Alexander Schmolck) Date: Fri Feb 28 09:54:04 2003 Subject: [Numpy-discussion] BLAS In-Reply-To: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> References: <001d01c2d9c7$249a81a0$6601a8c0@NICKLEBY> Message-ID: "Paul F Dubois" writes: > I had forgotten about this case. I think when these were done it was thought > that it would be better if the Numeric core did not require use of > LAPACK/BLAS. We were thinking back then of a core with other packages, and > the blas we use by default is probably the same speed so it didn't seem > important. I would have no problem with a patch to change this. Great. I submitted a patch just now. alex From fperez at colorado.edu Fri Feb 28 14:38:02 2003 From: fperez at colorado.edu (Fernando Perez) Date: Fri Feb 28 14:38:02 2003 Subject: [Numpy-discussion] Tentative fix for Numtut's view.py In-Reply-To: References: Message-ID: <3E5FE482.4080204@colorado.edu> Hi all, > Subject: [Numpy-discussion] Last call for v. 23 > > I am going to make a release of Numeric, 23.0. Fellow developers who are inspired > to fix a bug are urged to do so immediately. > > This will be a bug fix release. 
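Returning to John Hunter's row-demean question: Eric's broadcasting answer, in modern NumPy spelling, is a `mean` along axis 1 followed by a `newaxis` so the row means broadcast back over the columns — no full-size `resize`/`transpose` intermediate needed:

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# row means broadcast over the columns of each row
Y = X - X.mean(axis=1)[:, np.newaxis]
print(Y)    # each row of Y now has mean 0
```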
in the scipy mailing list there were some discussions about view.py as included in NumTut. It seems that many folks (including myself) have had problems with it, and they seem to be threading-related. The symptom is that once view is imported, the interactive interpreter essentially locks up, and typing becomes nearly impossible. I know next to nothing about threading, but in an attempt to fix the problem I stripped view.py bare of everything I didn't understand, until it worked :) Basically I removed all PIL and threading-related code, and left only the bare Tk code in place. Naive as this approach was, it seems to have worked. Some folks reported success, and David Ascher (the original author of view.py) suggested I submit this to the Numpy team as an update to the tutorial. There's a good chance the current view is just broken and nobody has bothered to use it in a long time. I'm attaching the new view here as a file, but if there is a different protocol I should follow, please let me know (patch, etc). As I said, this was the most simple-minded thing I could do to make it work. So if you are interested in accepting this, it might be wise to have a look at it first. On the upside, pretty much all I did was to _remove_ code, not to add anything. So the analysis should be easy (the new code is far simpler and shorter than the original). I've tested it personally under python 2.2.1 (the stock Redhat 8.0 install). Best, Fernando Perez. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: view.py URL: