From chan_dhf at yahoo.de  Wed Aug  1 02:55:10 2007
From: chan_dhf at yahoo.de (Danny Chan)
Date: Wed, 1 Aug 2007 08:55:10 +0200 (CEST)
Subject: [Numpy-discussion] reading 10 bit raw data into an array
In-Reply-To: <46AE17C2.80301@ieee.org>
Message-ID: <51633.82348.qm@web26214.mail.ukl.yahoo.com>

Hi Travis!
I guess I will still have to pad my data to full bytes before reading it, correct?

Travis Oliphant <oliphant.travis at ieee.org> schrieb: Danny Chan wrote:
> Hi all!
> I'm trying to read a data file that contains a raw image file. Every 
> pixel is assigned a value from 0 to 1023, and all pixels are stored from 
> top left to bottom right pixel in binary format in this file. I know the 
> width and the height of the image, so all that would be required is to 
> read 10 bits at a time and store it these as an integer. I played around 
> with the fromstring and fromfile function, and I read the documentation 
> for dtype objects, but I'm still confused. It seems simple enough to 
> read data in a format with a standard bitwidth, but how can I read data 
> in a non-standard format. Can anyone help?
> 

This kind of bit-manipulation must be done using bit operations on 
standard size data types even in C.  The file reading and writing 
libraries use bytes as their common denominator.

I would read in the entire image into a numpy array of unsigned bytes 
and then use slicing, masking, and bit-shifting to take 5 bytes at a 
time and convert them to 4 values of a 16-bit unsigned image.

Basically, you would do something like

# read in entire image into 1-d unsigned byte array
# create 16-bit array of the correct 2-D size
# use flat indexing to store into the new array
#   new.flat[::4] = old[::5] + bitwise_or(old[1::5], MASK1b) << SHIFT1b
#   new.flat[1::4] = bitwise_or(old[1::5], MASK2a) << SHIFT2a
                     + bitwise_or(old[2::5], MASK2b) << SHIFT2b

#   etc.


The exact MASKS and shifts to use is left as an exercise for the reader :-)


-Travis
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


---------------------------------
Jetzt Mails schnell in einem Vorschaufenster ?berfliegen. Dies und viel mehr bietet das  neue Yahoo! Mail. 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070801/58d054c2/attachment.html>

From haase at msg.ucsf.edu  Wed Aug  1 05:03:27 2007
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Wed, 1 Aug 2007 11:03:27 +0200
Subject: [Numpy-discussion] reading 10 bit raw data into an array
In-Reply-To: <51633.82348.qm@web26214.mail.ukl.yahoo.com>
References: <46AE17C2.80301@ieee.org>
	<51633.82348.qm@web26214.mail.ukl.yahoo.com>
Message-ID: <bc657ead0708010203n64ac453end6ebc5db2a38b16b@mail.gmail.com>

On 8/1/07, Danny Chan <chan_dhf at yahoo.de> wrote:
> Hi Travis!
> I guess I will still have to pad my data to full bytes before reading it,
> correct?
>
> Travis Oliphant <oliphant.travis at ieee.org> schrieb:
> Danny Chan wrote:
> > Hi all!
> > I'm trying to read a data file that contains a raw image file. Every
> > pixel is assigned a value from 0 to 1023, and all pixels are stored from
> > top left to bottom right pixel in binary format in this file. I know the
> > width and the height of the image, so all that would be required is to
> > read 10 bits at a time and store it these as an integer. I played around
> > with the fromstring and fromfile function, and I read the documentation
> > for dtype objects, but I'm still confused. It seems simple enough to
> > read data in a format with a standard bitwidth, but how can I read data
> > in a non-standard format. Can anyone help?
> >
>
> This kind of bit-manipulation must be done using bit operations on
> standard size data types even in C. The file reading and writing
> libraries use bytes as their common denominator.
>
> I would read in the entire image into a numpy array of unsigned bytes
> and then use slicing, masking, and bit-shifting to take 5 bytes at a
> time and convert them to 4 values of a 16-bit unsigned image.
>
> Basically, you would do something like
>
> # read in entire image into 1-d unsigned byte array
> # create 16-bit array of the correct 2-D size
> # use flat indexing to store into the new array
> # new.flat[::4] = old[::5] + bitwise_or(old[1::5], MASK1b) << SHIFT1b
> # new.flat[1::4] = bitwise_or(old[1::5], MASK2a) << SHIFT2a
> + bitwise_or(old[2::5], MASK2b) << SHIFT2b
>
> # etc.
>
>
> The exact MASKS and shifts to use is left as an exercise for the reader :-)


Quick comment :  are you really sure your  camera produces the 12 bit
data in a "12 bit stream" --- all I have ever seen is that cameras
would just use 16 bit for each pixel.  (All you had to know if it uses
the left or the right part of those.  In other words, you might have
to divide (or use  bit shifting) the data by 16.)
Wasteful yes, but much simpler to handel !?


-Sebastian Haase


From lfriedri at imtek.de  Wed Aug  1 10:09:37 2007
From: lfriedri at imtek.de (Lars Friedrich)
Date: Wed, 01 Aug 2007 16:09:37 +0200
Subject: [Numpy-discussion] fourier with single precision
Message-ID: <46B09421.3020807@imtek.de>

Hello,

is there a way to tell numpy.fft.fft2 to use complex64 instead of 
complex128 as output dtype to speed the up transformation?

Thanks
Lars


From vincent.nijs at gmail.com  Wed Aug  1 11:30:21 2007
From: vincent.nijs at gmail.com (Vincent)
Date: Wed, 01 Aug 2007 15:30:21 -0000
Subject: [Numpy-discussion] How to implement a 'pivot table?'
In-Reply-To: <fcedc2970707310800j78bc563apaba4a228ef558bd@mail.gmail.com>
References: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
	<e4412d6b0707301012x20c28096nb1afaa5fae2028b1@mail.gmail.com>
	<fcedc2970707301232tf7908d8v749b45040fc6473f@mail.gmail.com>
	<e4412d6b0707310740j66e9b347sfe24275b03321e82@mail.gmail.com>
	<fcedc2970707310800j78bc563apaba4a228ef558bd@mail.gmail.com>
Message-ID: <1185982221.939389.53900@g12g2000prg.googlegroups.com>

I do a lot of this kind of things in SAS. In don't like SAS that much
so it would be great to have functionality like this for numpy
recarray's.

To transplant the approach that SAS takes to a numpy setting you'd
have something like the following 4 steps:
1. Sort the data by date and region
2. Determine the indices for the blocks (e.g., East, 1/1)
3. calculate the summary stats per block

SAS is very efficient at these types of operations i believe. Since it
assumes that the data is sorted, and throws and error if the data is
not sorted appropriately, i assume the indexing can be more efficient.
However, given the earlier comments i am wonder if this approach would
enhance performance.

I would be very interested to see what you come up with so please post
some of the code and/or timing tests to the list if possible.

Best,

Vincent


From chan_dhf at yahoo.de  Wed Aug  1 13:38:44 2007
From: chan_dhf at yahoo.de (Danny Chan)
Date: Wed, 1 Aug 2007 19:38:44 +0200
Subject: [Numpy-discussion] reading 10 bit raw data into an array
In-Reply-To: <bc657ead0708010203n64ac453end6ebc5db2a38b16b@mail.gmail.com>
References: <46AE17C2.80301@ieee.org>
	<51633.82348.qm@web26214.mail.ukl.yahoo.com>
	<bc657ead0708010203n64ac453end6ebc5db2a38b16b@mail.gmail.com>
Message-ID: <200708011938.45196.chan_dhf@yahoo.de>

>
> Quick comment :  are you really sure your  camera produces the 12 bit
> data in a "12 bit stream" --- all I have ever seen is that cameras
> would just use 16 bit for each pixel.  (All you had to know if it uses
> the left or the right part of those.  In other words, you might have
> to divide (or use  bit shifting) the data by 16.)
> Wasteful yes, but much simpler to handel !?
>

10 bit, but yes, I am sure. It is an embedded camera system, in fact, I get 
the data stream even before it is handled to any ISP for further processing 
of the picture. In the end, the ISP will convert the data stream to another 
format, but I have to simulate some of the algorithms that will be 
implemented in hardware.


From bsouthey at gmail.com  Wed Aug  1 15:02:24 2007
From: bsouthey at gmail.com (Bruce Southey)
Date: Wed, 1 Aug 2007 14:02:24 -0500
Subject: [Numpy-discussion] How to implement a 'pivot table?'
In-Reply-To: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
References: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
Message-ID: <bbcd77d00708011202g600835d3t2c8c2813ff04c1e2@mail.gmail.com>

Hi,
The hard part is knowing what aggregate function that you want. So a
hard way, even after cheating, to take the data provided is given
below. (The Numpy Example List was very useful especially on the where
function)!

I tried to be a little generic so you can replace the sum by any
suitable function and probably the array type as well. Of course it is
not complete because you still need to know the levels of the 'rows'
and 'columns' and also is not efficient as it has loops.

Bruce

from numpy import *
A=array([[1,1,10],
         [1,1,20],
         [1,2,30],
         [2,1,40],
         [2,2,50],
         [2,2,60] ])
C = zeros((2,2))

for i in range(2):
      crit1 = (A[:,0]==1+i)
      subA=A[crit1,1:]
      for j in range(2):
            crit2 = (subA[:,0]==1+j)
            subB=subA[crit2,1:]
            C[i,j]=subB.sum()


print C

On 7/30/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
> Hi Everyone,
>
> I am wondering what is the best (and fast) way to build a pivot table
> aside from the 'brute force way?'
>
> I want to transform an numpy array into a pivot table. For example, if
> I have a numpy array like below:
>
> Region     Date          # of Units
> ----------    ----------        --------------
> East        1/1             10
> East        1/1             20
> East        1/2             30
> West       1/1             40
> West       1/2             50
> West       1/2             60
>
> I want  to transform this into the following table, where f() is a
> given aggregate function:
>
>            Date
> Region           1/1          1/2
> ----------
> East         f(10,20)         f(30)
> West        f(40)             f(50,60)
>
>
> I can regroup them into 'sets' and do it the brute force way, but that
> is kind of slow to execute. Does anyone know a better way?
>
>
> Thanks,
> Geoffrey
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From rhdireen at gmail.com  Wed Aug  1 19:24:46 2007
From: rhdireen at gmail.com (Randy Direen)
Date: Wed, 1 Aug 2007 17:24:46 -0600
Subject: [Numpy-discussion] f2py self documenting not working
Message-ID: <b0f937080708011624t22629681w5b8b22312e840c3@mail.gmail.com>

Im using f2py under numpy.  I've written several simple examples and f2py
has not generated any documentations for the routines I have made.  Any help
would be great, I am very new to f2py and I would like to use the tool to
wrap a rather large program written in Fortran90.   Thanks!

Randy Direen
NIST
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070801/7ed9860f/attachment.html>

From travis at enthought.com  Thu Aug  2 00:22:31 2007
From: travis at enthought.com (Travis Vaught)
Date: Wed, 1 Aug 2007 23:22:31 -0500
Subject: [Numpy-discussion] How to implement a 'pivot table?'
In-Reply-To: <bbcd77d00708011202g600835d3t2c8c2813ff04c1e2@mail.gmail.com>
References: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
	<bbcd77d00708011202g600835d3t2c8c2813ff04c1e2@mail.gmail.com>
Message-ID: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com>

Greetings,

Speaking of brute force... I've attached a rather ugly module that  
let's you do things with a pretty simple interface (session shown  
below).  I haven't fully tested the performance, but a million  
records with 5 fields takes about 11 seconds on my Mac to do a  
'mean'.  I'm not sure what your performance considerations are, but  
this may be useful.  Record arrays are really nice if they make sense  
for your data.

Travis


(from an ipython command prompt)

In [1]: import testpivot as p

In [2]: a = p.sample_data()

In [3]: a
Out[3]:
recarray([('ACorp', 'Region 1', 'Q1', 20000.0),
        ('ACorp', 'Region 1', 'Q2', 22000.0),
        ('ACorp', 'Region 1', 'Q3', 21000.0),
        ('ACorp', 'Region 1', 'Q4', 26000.0),
        ('ACorp', 'Region 2', 'Q1', 23000.0),
        ('ACorp', 'Region 2', 'Q2', 20000.0),
        ('ACorp', 'Region 2', 'Q3', 22000.0),
        ('ACorp', 'Region 2', 'Q4', 21000.0),
        ('ACorp', 'Region 3', 'Q1', 26000.0),
        ('ACorp', 'Region 3', 'Q2', 23000.0),
        ('ACorp', 'Region 3', 'Q3', 29000.0),
        ('ACorp', 'Region 3', 'Q4', 27000.0),
        ('BCorp', 'Region 1', 'Q1', 20000.0),
        ('BCorp', 'Region 1', 'Q2', 20000.0),
        ('BCorp', 'Region 1', 'Q3', 24000.0),
        ('BCorp', 'Region 1', 'Q4', 24000.0),
        ('BCorp', 'Region 2', 'Q1', 21000.0),
        ('BCorp', 'Region 2', 'Q2', 21000.0),
        ('BCorp', 'Region 2', 'Q3', 22000.0),
        ('BCorp', 'Region 2', 'Q4', 29000.0),
        ('BCorp', 'Region 3', 'Q1', 28000.0),
        ('BCorp', 'Region 3', 'Q2', 25000.0),
        ('BCorp', 'Region 3', 'Q3', 22000.0),
        ('BCorp', 'Region 3', 'Q4', 21000.0)],
       dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '| 
S2'), ('income', '<f8')])

In [4]: p.pivot(a, 'company', 'region', 'income', p.psum)
######## Summary by company and region ##########
cols:['ACorp' 'BCorp']
rows:['Region 1' 'Region 2' 'Region 3']
[[  89000.   88000.]
[  86000.   93000.]
[ 105000.   96000.]]

In [5]: p.pivot(a, 'company', 'quarter', 'income', p.psum)
######## Summary by company and quarter ##########
cols:['ACorp' 'BCorp']
rows:['Q1' 'Q2' 'Q3' 'Q4']
[[ 69000.  69000.]
[ 65000.  66000.]
[ 72000.  68000.]
[ 74000.  74000.]]

In [6]: p.pivot(a, 'company', 'quarter', 'income', p.pmean)
######## Summary by company and quarter ##########
cols:['ACorp' 'BCorp']
rows:['Q1' 'Q2' 'Q3' 'Q4']
[[ 23000.          23000.        ]
[ 21666.66666667  22000.        ]
[ 24000.          22666.66666667]
[ 24666.66666667  24666.66666667]]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: testpivot.py
Type: text/x-python-script
Size: 3833 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070801/82cd7c40/attachment.bin>
-------------- next part --------------


On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote:

> Hi,
> The hard part is knowing what aggregate function that you want. So a
> hard way, even after cheating, to take the data provided is given
> below. (The Numpy Example List was very useful especially on the where
> function)!
>
> I tried to be a little generic so you can replace the sum by any
> suitable function and probably the array type as well. Of course it is
> not complete because you still need to know the levels of the 'rows'
> and 'columns' and also is not efficient as it has loops.
>
> Bruce
>
> from numpy import *
> A=array([[1,1,10],
>          [1,1,20],
>          [1,2,30],
>          [2,1,40],
>          [2,2,50],
>          [2,2,60] ])
> C = zeros((2,2))
>
> for i in range(2):
>       crit1 = (A[:,0]==1+i)
>       subA=A[crit1,1:]
>       for j in range(2):
>             crit2 = (subA[:,0]==1+j)
>             subB=subA[crit2,1:]
>             C[i,j]=subB.sum()
>
>
> print C
>
> On 7/30/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
>> Hi Everyone,
>>
>> I am wondering what is the best (and fast) way to build a pivot table
>> aside from the 'brute force way?'
>>
>> I want to transform an numpy array into a pivot table. For  
>> example, if
>> I have a numpy array like below:
>>
>> Region     Date          # of Units
>> ----------    ----------        --------------
>> East        1/1             10
>> East        1/1             20
>> East        1/2             30
>> West       1/1             40
>> West       1/2             50
>> West       1/2             60
>>
>> I want  to transform this into the following table, where f() is a
>> given aggregate function:
>>
>>            Date
>> Region           1/1          1/2
>> ----------
>> East         f(10,20)         f(30)
>> West        f(40)             f(50,60)
>>
>>
>> I can regroup them into 'sets' and do it the brute force way, but  
>> that
>> is kind of slow to execute. Does anyone know a better way?
>>
>>
>> Thanks,
>> Geoffrey
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From david at ar.media.kyoto-u.ac.jp  Thu Aug  2 00:59:47 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 02 Aug 2007 13:59:47 +0900
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B09421.3020807@imtek.de>
References: <46B09421.3020807@imtek.de>
Message-ID: <46B164C3.6040504@ar.media.kyoto-u.ac.jp>

Lars Friedrich wrote:
> Hello,
>
> is there a way to tell numpy.fft.fft2 to use complex64 instead of 
> complex128 as output dtype to speed the up transformation?
>   
As far as I can read from the fft code in numpy, only double is 
supported at the moment, unfortunately. Note that you can get some speed 
by using scipy.fftpack methods instead, if scipy is an option for you.

David


From goddard at cgl.ucsf.edu  Thu Aug  2 01:43:01 2007
From: goddard at cgl.ucsf.edu (Tom Goddard)
Date: Wed, 01 Aug 2007 22:43:01 -0700
Subject: [Numpy-discussion] Memory efficient equality test for arrays
Message-ID: <46B16EE5.5010900@cgl.ucsf.edu>

Is there a numpy call to test if two large arrays (say 1 Gbyte each) are 
equal (same shape and elements) without creating another large array of 
booleans as happens with "a == b", numpy.equal(a,b), or 
numpy.array_equal(a,b)?

I want a memory efficient and fast comparison.

    Tom


From haase at msg.ucsf.edu  Thu Aug  2 06:20:32 2007
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Thu, 2 Aug 2007 12:20:32 +0200
Subject: [Numpy-discussion] rant against from numpy import * / from
	pylab import *
In-Reply-To: <45FA4377.6010201@hawaii.edu>
References: <bc657ead0703152157y59cd3bf9uff9d4b298697b970@mail.gmail.com>
	<45FA4377.6010201@hawaii.edu>
Message-ID: <bc657ead0708020320v73f87db8u79211f5056cef001@mail.gmail.com>

Hi all,
Here a quick update:
I'm trying to have a concise / sparse module with containing only
pylab-specific names and not all names I already have in numpy.
To easy typing I want to call numpy "N" and my pylab "P".

I'm now using this code:
<code snipplet for importing matplotlib>
    import matplotlib, new
    matplotlib.use('WXAgg')
    from  matplotlib import pylab
    P = new.module("pylab_sparse","""pylab module minus stuff alreay
in numpy""")
    for k,v in pylab.__dict__.iteritems():
        try:
           if v is N.__dict__[k]:
               continue
        except KeyError:
           pass
        P.__dict__[k] = v

    P.ion()
    del matplotlib, new, pylab
</code sniplet for importing matplotlib>

The result is "some" reduction in the number of non-pylab-specific
names in my "P"-module. However there seem to be still many extra
names left, like e.g.:
alltrue, amax, array, ...
look at this:
    # 20070802
    # >>> len(dir(pylab))
    # 441
    # >>> len(dir(P))
    # 346
    # >>> P.nx.numpy.__version__
    # '1.0.1'
    # >>> N.__version__
    # '1.0.1'
    # >>> N.alltrue
    # <function alltrue at 0x01471B70>
    # >>> P.alltrue
    # <function alltrue at 0x019142F0>
    # >>> N.alltrue.__doc__
    # 'Perform a logical_and over the given axis.'
    # >>> P.alltrue.__doc__
    # >>> #N.alltrue(x, axis=None, out=None)
    # >>> #P.alltrue(x, axis=0)

I'm using matplotlib with
__version__  = '0.90.0'
__revision__ = '$Revision: 3003 $'
__date__     = '$Date: 2007-02-06 22:24:06 -0500 (Tue, 06 Feb 2007) $'


Any hint how to further reduce the number of names in "P" ?
My ideal would be that the "P" module (short for pylab) would only
contain the stuff described in the __doc__ strings of `pylab.py` and
`__init__.py`(in matplotlib)  (+ plus some extra, undocumented, yet
pylab specific things)

Thanks
-Sebastian


On 3/16/07, Eric Firing <efiring at hawaii.edu> wrote:
> Sebastian Haase wrote:
> > Hi!
> > I use the wxPython PyShell.
> > I like especially the feature that when typing a module and then the
> > dot "." I get a popup list of all available functions (names) inside
> > that module.
> >
> > Secondly,  I think it really makes code clearer when one can see where
> > a function comes from.
> >
> > I have a default
> > import numpy as N
> > executed before my shell even starts.
> > In fact I have a bunch of my "standard" modules imported as <some
> > single capital letter>.
> >
> > This - I think - is a good compromise to the commonly used "extra
> > typing" and "unreadable"  argument.
> >
> > a = sin(b) * arange(10,50, .1) * cos(d)
> > vs.
> > a = N.sin(b) * N.arange(10,50, .1) * N.cos(d)
>
> I generally do the latter, but really, all those "N." bits are still
> visual noise when it comes to reading the code--that is, seeing the
> algorithm rather than where the functions come from.  I don't think
> there is anything wrong with explicitly importing commonly-used names,
> especially things like sin and cos.
>
> >
> > I would like to hear some comments by others.
> >
> >
> > On a different note: I just started using pylab, so I did added an
> > automatic  "from matplotlib import pylab as P" -- but now P contains
> > everything that I already have in N.  It makes it really hard to
> > *find* (as in *see* n the popup-list) the pylab-only functions. --
> > what can I do about this ?
>
> A quick and dirty solution would be to comment out most of the imports
> in pylab.py; they are not needed for the pylab functions and are there
> only to give people lots of functionality in a single namespace.
>
> I am cross-posting this to matplotlib-users because it involves mpl, and
> an alternative solution would be for us to add an rcParam entry to allow
> one to turn off all of the namespace consolidation.  A danger is that if
> someone is using "from pylab import *" in a script, then whether it
> would run would depend on the matplotlibrc file.  To get around that,
> another possibility would be to break pylab.py into two parts, with
> pylab.py continuing to do the namespace consolidation and importing the
> second part, which would contain the actual pylab functions.  Then if
> you don't want the namespace consolidation, you could simply import the
> second part instead of pylab.  There may be devils in the details, but
> it seems to me that this last alternative--splitting pylab.py--might
> make a number of people happier while having no adverse effects on
> everyone else.
>
> Eric
> >
> >
> > Thanks,
> > Sebastian


From lfriedri at imtek.de  Thu Aug  2 12:40:13 2007
From: lfriedri at imtek.de (Lars Friedrich)
Date: Thu, 02 Aug 2007 18:40:13 +0200
Subject: [Numpy-discussion] fourier with single precision
Message-ID: <46B208ED.1010403@imtek.de>

Hello,

David Cournapeau wrote:
> As far as I can read from the fft code in numpy, only double is 
> supported at the moment, unfortunately. Note that you can get some speed 
> by using scipy.fftpack methods instead, if scipy is an option for you.

What I understood is that numpy uses FFTPACK's algorithms. From 
www.netlib.org/fftpack (is this the right address?) I took that there is 
a single-precision and double-precision-version of the algorithms. How 
hard would it be (for example for me...) to add the single-precision 
versions to numpy? I am not a decent C-hacker, but if someone tells me, 
that this task is not *too* hard, I would start looking more closely at 
the code...

Would it make sense, that if one passes an array of dtype = 
numpy.float32 to the fft function, a complex64 is returned, and if one 
passes an array of dtype = numpy.float64, a complex128 is returned?

Lars


From rmay at ou.edu  Thu Aug  2 15:18:49 2007
From: rmay at ou.edu (Ryan May)
Date: Thu, 02 Aug 2007 14:18:49 -0500
Subject: [Numpy-discussion] 16bit Integer Array/Scalar Inconsistency
Message-ID: <46B22E19.2090707@ou.edu>

Hi,

I ran into this while debugging a script today:

In [1]: import numpy as N

In [2]: N.__version__
Out[2]: '1.0.3'

In [3]: d = N.array([32767], dtype=N.int16)

In [4]: d + 32767
Out[4]: array([-2], dtype=int16)

In [5]: d[0] + 32767
Out[5]: 65534

In [6]: type(d[0] + 32767)
Out[6]: <type 'numpy.int64'>

In [7]: type(d[0])
Out[7]: <type 'numpy.int16'>

It seems that numpy will automatically promote the scalar to avoid
overflow, but not in the array case.  Is this inconsistency a bug, just
a (known) gotcha?

I myself don't have any problems with the array not being promoted
automatically, but the inconsistency with scalar operation made
debugging my problem more difficult.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma


From dalcinl at gmail.com  Thu Aug  2 15:22:07 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 2 Aug 2007 16:22:07 -0300
Subject: [Numpy-discussion] reference leacks in numpy.asarray
Message-ID: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>

using numpy-1.0.3, I believe there are a reference leak somewhere.
Using a debug build of Python 2.5.1 (--with-pydebug), I get the
following

import sys, gc
import numpy

def testleaks(func, args=(), kargs={}, repeats=5):
    for i in xrange(repeats):
        r1 = sys.gettotalrefcount()
        func(*args,**kargs)
        r2 = sys.gettotalrefcount()
        rd = r2-r1
        print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd)

def npy_asarray_1():
    a = numpy.zeros(5, dtype=int)
    b = numpy.asarray(a, dtype=float)
    del a, b

def npy_asarray_2():
    a = numpy.zeros(5, dtype=float)
    b = numpy.asarray(a, dtype=float)
    del a, b

if __name__ == '__main__':
    testleaks(npy_asarray_1)
    testleaks(npy_asarray_2)


$ python npyleaktest.py
before: 84531, after: 84532, diff: [1]
before: 84534, after: 84534, diff: [0]
before: 84534, after: 84534, diff: [0]
before: 84534, after: 84534, diff: [0]
before: 84534, after: 84534, diff: [0]
before: 84531, after: 84533, diff: [2]
before: 84535, after: 84536, diff: [1]
before: 84536, after: 84537, diff: [1]
before: 84537, after: 84538, diff: [1]
before: 84538, after: 84539, diff: [1]

It seems npy_asarray_2() is leaking a reference. I am  missing
something here?. The same problem is found in C, using
PyArray_FROM_OTF (no time to go inside to see what's going on, sorry)

If this is know and solved in SVN, please forget me.

Regards,


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From robert.kern at gmail.com  Thu Aug  2 15:31:07 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 02 Aug 2007 14:31:07 -0500
Subject: [Numpy-discussion] 16bit Integer Array/Scalar Inconsistency
In-Reply-To: <46B22E19.2090707@ou.edu>
References: <46B22E19.2090707@ou.edu>
Message-ID: <46B230FB.5090201@gmail.com>

Ryan May wrote:
> Hi,
> 
> I ran into this while debugging a script today:
> 
> In [1]: import numpy as N
> 
> In [2]: N.__version__
> Out[2]: '1.0.3'
> 
> In [3]: d = N.array([32767], dtype=N.int16)
> 
> In [4]: d + 32767
> Out[4]: array([-2], dtype=int16)
> 
> In [5]: d[0] + 32767
> Out[5]: 65534
> 
> In [6]: type(d[0] + 32767)
> Out[6]: <type 'numpy.int64'>
> 
> In [7]: type(d[0])
> Out[7]: <type 'numpy.int16'>
> 
> It seems that numpy will automatically promote the scalar to avoid
> overflow, but not in the array case.  Is this inconsistency a bug, just
> a (known) gotcha?

Known feature. When arrays and scalars are mixed and the types are within the
same kind (e.g. both are integer types just at different precisions), the type
of the scalar is ignored. This solves one of the usability issues with trying to
use lower precisions; you still want to be able to divide by 2.0, for example,
without automatically up-casting your very large float32 array.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From focke at slac.stanford.edu  Thu Aug  2 15:51:57 2007
From: focke at slac.stanford.edu (Warren Focke)
Date: Thu, 2 Aug 2007 12:51:57 -0700 (PDT)
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B208ED.1010403@imtek.de>
References: <46B208ED.1010403@imtek.de>
Message-ID: <Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>


On Thu, 2 Aug 2007, Lars Friedrich wrote:

> What I understood is that numpy uses FFTPACK's algorithms.

Sort of.  It appears to be a hand translation from F77 to C.

> From www.netlib.org/fftpack (is this the right address?) I took that 
> there is a single-precision and double-precision-version of the 
> algorithms. How hard would it be (for example for me...) to add the 
> single-precision versions to numpy? I am not a decent C-hacker, but if 
> someone tells me, that this task is not *too* hard, I would start 
> looking more closely at the code...

It shouldn't be hard.  fftpack.c will make a single-precision version if 
DOUBLE is not defined at compile time.

> Would it make sense, that if one passes an array of dtype =
> numpy.float32 to the fft function, a complex64 is returned, and if one
> passes an array of dtype = numpy.float64, a complex128 is returned?

Sounds like reasonable default behavior.  Might be useful if the caller 
could overrride it.


w


From tim.hochberg at ieee.org  Thu Aug  2 17:15:03 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Thu, 2 Aug 2007 14:15:03 -0700
Subject: [Numpy-discussion] reference leacks in numpy.asarray
In-Reply-To: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>
References: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>
Message-ID: <e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>

On 8/2/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
>
> using numpy-1.0.3, I believe there are a reference leak somewhere.
> Using a debug build of Python 2.5.1 (--with-pydebug), I get the
> following
>
> import sys, gc
> import numpy
>
> def testleaks(func, args=(), kargs={}, repeats=5):
>     for i in xrange(repeats):
>         r1 = sys.gettotalrefcount()
>         func(*args,**kargs)
>         r2 = sys.gettotalrefcount()
>         rd = r2-r1
>         print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd)
>
> def npy_asarray_1():
>     a = numpy.zeros(5, dtype=int)
>     b = numpy.asarray(a, dtype=float)
>     del a, b
>
> def npy_asarray_2():
>     a = numpy.zeros(5, dtype=float)
>     b = numpy.asarray(a, dtype=float)
>     del a, b
>
> if __name__ == '__main__':
>     testleaks(npy_asarray_1)
>     testleaks(npy_asarray_2)
>
>
> $ python npyleaktest.py
> before: 84531, after: 84532, diff: [1]
> before: 84534, after: 84534, diff: [0]
> before: 84534, after: 84534, diff: [0]
> before: 84534, after: 84534, diff: [0]
> before: 84534, after: 84534, diff: [0]
> before: 84531, after: 84533, diff: [2]
> before: 84535, after: 84536, diff: [1]
> before: 84536, after: 84537, diff: [1]
> before: 84537, after: 84538, diff: [1]
> before: 84538, after: 84539, diff: [1]
>
> It seems npy_asarray_2() is leaking a reference. I am  missing
> something here?. The same problem is found in C, using
> PyArray_FROM_OTF (no time to go inside to see what's going on, sorry)
>
> If this is know and solved in SVN, please forget me.


I don't have a debug build handy to test this on, but this might not be a
reference leak. Since you are checking the count before and after each
cycle, it could be that there are cycles being created that are subsequently
cleaned up by the garbage collector.

Can you try instead to look at the difference between the reference count at
the end of each cycle with the reference count before the first cycle? If
that goes up indefinitely, then it's probably a leak. If it bounces around
or levels off, then probably not. You'd probably want to run a bunch of
repeats just to be sure.
regards,
-tim


Regards,
>
>
> --
> Lisandro Dalc?n
> ---------------
> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
> PTLC - G?emes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070802/07d92347/attachment.html>

From dalcinl at gmail.com  Thu Aug  2 18:03:19 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 2 Aug 2007 19:03:19 -0300
Subject: [Numpy-discussion] reference leacks in numpy.asarray
In-Reply-To: <e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>
References: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>
	<e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>
Message-ID: <e7ba66e40708021503v1fbf4b6bs8f75675cd1c47182@mail.gmail.com>

Ups, I forgot to mention I was using gc.collect(), I accidentally
cleaned it my mail

Anyway, the following

import sys, gc
import numpy

def test():
    a = numpy.zeros(5, dtype=float)
    while 1:
        gc.collect()
        b = numpy.asarray(a, dtype=float); del b
        gc.collect()
        print sys.gettotalrefcount()

test()

shows in mi box alway 1 more totalrefcount in each pass, so always
increasing. IMHO, I still think there is a leak somewere.

And now, I am not sure if PyArray_FromAny is the source of the problem.


On 8/2/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/2/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > using numpy-1.0.3, I believe there are a reference leak somewhere.
> > Using a debug build of Python 2.5.1 (--with-pydebug), I get the
> > following
> >
> > import sys, gc
> > import numpy
> >
> > def testleaks(func, args=(), kargs={}, repeats=5):
> >     for i in xrange(repeats):
> >         r1 = sys.gettotalrefcount()
> >         func(*args,**kargs)
> >         r2 = sys.gettotalrefcount()
> >         rd = r2-r1
> >         print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd)
> >
> > def npy_asarray_1():
> >     a = numpy.zeros(5, dtype=int)
> >     b = numpy.asarray(a, dtype=float)
> >     del a, b
> >
> > def npy_asarray_2():
> >     a = numpy.zeros(5, dtype=float)
> >     b = numpy.asarray(a, dtype=float)
> >     del a, b
> >
> > if __name__ == '__main__':
> >     testleaks(npy_asarray_1)
> >     testleaks(npy_asarray_2)
> >
> >
> > $ python npyleaktest.py
> > before: 84531, after: 84532, diff: [1]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84531, after: 84533, diff: [2]
> > before: 84535, after: 84536, diff: [1]
> > before: 84536, after: 84537, diff: [1]
> > before: 84537, after: 84538, diff: [1]
> > before: 84538, after: 84539, diff: [1]
> >
> > It seems npy_asarray_2() is leaking a reference. I am  missing
> > something here?. The same problem is found in C, using
> > PyArray_FROM_OTF (no time to go inside to see what's going on, sorry)
> >
> > If this is know and solved in SVN, please forget me.
>
> I don't have a debug build handy to test this on, but this might not be a
> reference leak. Since you are checking the count before and after each
> cycle, it could be that there are cycles being created that are subsequently
> cleaned up by the garbage collector.
>
> Can you try instead to look at the difference between the reference count at
> the end of each cycle with the reference count before the first cycle? If
> that goes up indefinitely, then it's probably a leak. If it bounces around
> or levels off, then probably not. You'd probably want to run a bunch of
> repeats just to be sure.
> regards,
> -tim
>
> > Regards,
> >
> >
> > --
> > Lisandro Dalc?n
> > ---------------
> > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
> > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
> > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
> > PTLC - G?emes 3450, (3000) Santa Fe, Argentina
> > Tel/Fax: +54-(0)342-451.1594
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> >
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
>
> --
> .  __
> .   |-\
> .
> .   tim.hochberg at ieee.org
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From dalcinl at gmail.com  Thu Aug  2 18:20:22 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 2 Aug 2007 19:20:22 -0300
Subject: [Numpy-discussion] reference leacks in numpy.asarray
In-Reply-To: <e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>
References: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>
	<e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>
Message-ID: <e7ba66e40708021520ifda1ddapbf7ea33175d65df8@mail.gmail.com>

I think the problem is in  _array_fromobject (seen as numpy.array in Python)

This function parses its arguments by using the convertor
PyArray_DescrConverter2. which RETURNS A NEW REFERENCE!!! This
reference is never DECREF'ed.

BTW, A lesson I've learned of the pattern

if (!PyArg_ParseXXX(....)) return NULL

is that convertor functions should NEVER return new references to
PyObject*'s,  because if the conversion fails (because of latter wrong
argument), you leak a reference to the 'converted' object.

If this pattern is used everywhere in numpy, well, there are big
chances of leaking references in the case of bad args to C functions.

Regards,


On 8/2/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/2/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > using numpy-1.0.3, I believe there are a reference leak somewhere.
> > Using a debug build of Python 2.5.1 (--with-pydebug), I get the
> > following
> >
> > import sys, gc
> > import numpy
> >
> > def testleaks(func, args=(), kargs={}, repeats=5):
> >     for i in xrange(repeats):
> >         r1 = sys.gettotalrefcount()
> >         func(*args,**kargs)
> >         r2 = sys.gettotalrefcount()
> >         rd = r2-r1
> >         print 'before: %d, after: %d, diff: [%d]' % (r1, r2, rd)
> >
> > def npy_asarray_1():
> >     a = numpy.zeros(5, dtype=int)
> >     b = numpy.asarray(a, dtype=float)
> >     del a, b
> >
> > def npy_asarray_2():
> >     a = numpy.zeros(5, dtype=float)
> >     b = numpy.asarray(a, dtype=float)
> >     del a, b
> >
> > if __name__ == '__main__':
> >     testleaks(npy_asarray_1)
> >     testleaks(npy_asarray_2)
> >
> >
> > $ python npyleaktest.py
> > before: 84531, after: 84532, diff: [1]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84534, after: 84534, diff: [0]
> > before: 84531, after: 84533, diff: [2]
> > before: 84535, after: 84536, diff: [1]
> > before: 84536, after: 84537, diff: [1]
> > before: 84537, after: 84538, diff: [1]
> > before: 84538, after: 84539, diff: [1]
> >
> > It seems npy_asarray_2() is leaking a reference. I am  missing
> > something here?. The same problem is found in C, using
> > PyArray_FROM_OTF (no time to go inside to see what's going on, sorry)
> >
> > If this is know and solved in SVN, please forget me.
>
> I don't have a debug build handy to test this on, but this might not be a
> reference leak. Since you are checking the count before and after each
> cycle, it could be that there are cycles being created that are subsequently
> cleaned up by the garbage collector.
>
> Can you try instead to look at the difference between the reference count at
> the end of each cycle with the reference count before the first cycle? If
> that goes up indefinitely, then it's probably a leak. If it bounces around
> or levels off, then probably not. You'd probably want to run a bunch of
> repeats just to be sure.
> regards,
> -tim
>
> > Regards,
> >
> >
> > --
> > Lisandro Dalc?n
> > ---------------
> > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
> > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
> > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
> > PTLC - G?emes 3450, (3000) Santa Fe, Argentina
> > Tel/Fax: +54-(0)342-451.1594
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> >
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
>
> --
> .  __
> .   |-\
> .
> .   tim.hochberg at ieee.org
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From charlesr.harris at gmail.com  Thu Aug  2 19:01:55 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 2 Aug 2007 17:01:55 -0600
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>
References: <46B208ED.1010403@imtek.de>
	<Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>
Message-ID: <e06186140708021601n5d11077fta0c8ae2bf3f6bfbe@mail.gmail.com>

On 8/2/07, Warren Focke <focke at slac.stanford.edu> wrote:
>
>
>
> On Thu, 2 Aug 2007, Lars Friedrich wrote:
>
> > What I understood is that numpy uses FFTPACK's algorithms.
>
> Sort of.  It appears to be a hand translation from F77 to C.
>
> > From www.netlib.org/fftpack (is this the right address?) I took that
> > there is a single-precision and double-precision-version of the
> > algorithms. How hard would it be (for example for me...) to add the
> > single-precision versions to numpy? I am not a decent C-hacker, but if
> > someone tells me, that this task is not *too* hard, I would start
> > looking more closely at the code...
>
> It shouldn't be hard.  fftpack.c will make a single-precision version if
> DOUBLE is not defined at compile time.
>
> > Would it make sense, that if one passes an array of dtype =
> > numpy.float32 to the fft function, a complex64 is returned, and if one
> > passes an array of dtype = numpy.float64, a complex128 is returned?
>
> Sounds like reasonable default behavior.  Might be useful if the caller
> could overrride it.


On X86 machines the main virtue would be smaller and more cache friendly
arrays because double precision arithmetic is about the same speed as single
precision, sometimes even a bit faster. The PPC architecture does have
faster single than double precision, so there it could make a difference.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070802/70a4dd17/attachment.html>

From dalcinl at gmail.com  Thu Aug  2 19:42:07 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 2 Aug 2007 20:42:07 -0300
Subject: [Numpy-discussion] reference leacks in numpy.asarray
In-Reply-To: <e7ba66e40708021520ifda1ddapbf7ea33175d65df8@mail.gmail.com>
References: <e7ba66e40708021222m380062bs7b66673ef9fa2b69@mail.gmail.com>
	<e4412d6b0708021415x3a7399c4w27ff07a3781c1623@mail.gmail.com>
	<e7ba66e40708021520ifda1ddapbf7ea33175d65df8@mail.gmail.com>
Message-ID: <e7ba66e40708021642r343b52fg7ebcc3af5ab5d92f@mail.gmail.com>

This patch corrected the problem for me, numpy test pass...


On 8/2/07, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> I think the problem is in  _array_fromobject (seen as numpy.array in Python)


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
-------------- next part --------------
A non-text attachment was scrubbed...
Name: array.patch
Type: application/octet-stream
Size: 700 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070802/f5a29aae/attachment.obj>

From dalcinl at gmail.com  Thu Aug  2 20:01:43 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 2 Aug 2007 21:01:43 -0300
Subject: [Numpy-discussion] reference leaks in array() and arange()
Message-ID: <e7ba66e40708021701u314a7fedve8a4ce1e58c6be06@mail.gmail.com>

As PyArray_DescrConverter return new references, I think there could
be many places were PyArray_Descr* objects get its reference count
incremented.

Here, I send a patch correcting this for array() and arange(), but not
sure if this is the more general solution.

BTW, please see my previous comments in previous mail on using
convertor functions (returning new refs) and the (very common) idiom
"if(!PyArg_PaseXXX(,,,) return NULL", as this seems to be used in
almost all places in numpy C sources and is a potential source of ref
leaks.

Regards,

-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
-------------- next part --------------
A non-text attachment was scrubbed...
Name: array-arange.patch
Type: application/octet-stream
Size: 1229 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070802/5a2ffeae/attachment.obj>

From focke at slac.stanford.edu  Thu Aug  2 22:11:36 2007
From: focke at slac.stanford.edu (Warren Focke)
Date: Thu, 2 Aug 2007 19:11:36 -0700 (PDT)
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <e06186140708021601n5d11077fta0c8ae2bf3f6bfbe@mail.gmail.com>
References: <46B208ED.1010403@imtek.de>
	<Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>
	<e06186140708021601n5d11077fta0c8ae2bf3f6bfbe@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0708021906560.838@circinus.slac.stanford.edu>


On Thu, 2 Aug 2007, Charles R Harris wrote:

> On X86 machines the main virtue would be smaller and more cache friendly
> arrays because double precision arithmetic is about the same speed as single
> precision, sometimes even a bit faster. The PPC architecture does have
> faster single than double precision, so there it could make a difference.

Yeah, I was wondering if I should mention that.  I think SSE has real 
single precision, if you can convince the compiler to do it that way. 
Even better if it could be vectorized with SSE.

w


From vincent.nijs at gmail.com  Fri Aug  3 01:57:06 2007
From: vincent.nijs at gmail.com (Vincent)
Date: Fri, 03 Aug 2007 05:57:06 -0000
Subject: [Numpy-discussion] How to implement a 'pivot table?'
In-Reply-To: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com>
References: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
	<bbcd77d00708011202g600835d3t2c8c2813ff04c1e2@mail.gmail.com>
	<57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com>
Message-ID: <1186120626.433696.242920@i38g2000prf.googlegroups.com>

What is ugly about the module? I like it!

What do you mean about recarray's? Do you think they are they not
appropriate for this type of thing?

When i get some time i'll run some tests versus SAS for the same
operations and do a speed comparison.

Question: Would there be an easy way to merge the summary stats back
into the recarray?

Best,

Vincent

On Aug 1, 11:22 pm, Travis Vaught <tra... at enthought.com> wrote:
> Greetings,
>
> Speaking of brute force... I've attached a rather ugly module that
> let's you do things with a pretty simple interface (session shown
> below).  I haven't fully tested the performance, but a million
> records with 5 fields takes about 11 seconds on my Mac to do a
> 'mean'.  I'm not sure what your performance considerations are, but
> this may be useful.  Record arrays are really nice if they make sense
> for your data.
>
> Travis
>
> (from an ipython command prompt)
>
> In [1]: import testpivot as p
>
> In [2]: a = p.sample_data()
>
> In [3]: a
> Out[3]:
> recarray([('ACorp', 'Region 1', 'Q1', 20000.0),
>         ('ACorp', 'Region 1', 'Q2', 22000.0),
>         ('ACorp', 'Region 1', 'Q3', 21000.0),
>         ('ACorp', 'Region 1', 'Q4', 26000.0),
>         ('ACorp', 'Region 2', 'Q1', 23000.0),
>         ('ACorp', 'Region 2', 'Q2', 20000.0),
>         ('ACorp', 'Region 2', 'Q3', 22000.0),
>         ('ACorp', 'Region 2', 'Q4', 21000.0),
>         ('ACorp', 'Region 3', 'Q1', 26000.0),
>         ('ACorp', 'Region 3', 'Q2', 23000.0),
>         ('ACorp', 'Region 3', 'Q3', 29000.0),
>         ('ACorp', 'Region 3', 'Q4', 27000.0),
>         ('BCorp', 'Region 1', 'Q1', 20000.0),
>         ('BCorp', 'Region 1', 'Q2', 20000.0),
>         ('BCorp', 'Region 1', 'Q3', 24000.0),
>         ('BCorp', 'Region 1', 'Q4', 24000.0),
>         ('BCorp', 'Region 2', 'Q1', 21000.0),
>         ('BCorp', 'Region 2', 'Q2', 21000.0),
>         ('BCorp', 'Region 2', 'Q3', 22000.0),
>         ('BCorp', 'Region 2', 'Q4', 29000.0),
>         ('BCorp', 'Region 3', 'Q1', 28000.0),
>         ('BCorp', 'Region 3', 'Q2', 25000.0),
>         ('BCorp', 'Region 3', 'Q3', 22000.0),
>         ('BCorp', 'Region 3', 'Q4', 21000.0)],
>        dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '|
> S2'), ('income', '<f8')])
>
> In [4]: p.pivot(a, 'company', 'region', 'income', p.psum)
> ######## Summary by company and region ##########
> cols:['ACorp' 'BCorp']
> rows:['Region 1' 'Region 2' 'Region 3']
> [[  89000.   88000.]
> [  86000.   93000.]
> [ 105000.   96000.]]
>
> In [5]: p.pivot(a, 'company', 'quarter', 'income', p.psum)
> ######## Summary by company and quarter ##########
> cols:['ACorp' 'BCorp']
> rows:['Q1' 'Q2' 'Q3' 'Q4']
> [[ 69000.  69000.]
> [ 65000.  66000.]
> [ 72000.  68000.]
> [ 74000.  74000.]]
>
> In [6]: p.pivot(a, 'company', 'quarter', 'income', p.pmean)
> ######## Summary by company and quarter ##########
> cols:['ACorp' 'BCorp']
> rows:['Q1' 'Q2' 'Q3' 'Q4']
> [[ 23000.          23000.        ]
> [ 21666.66666667  22000.        ]
> [ 24000.          22666.66666667]
> [ 24666.66666667  24666.66666667]]
>
>  testpivot.py
> 3KDownload
>
>
>
> On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote:
>
> > Hi,
> > The hard part is knowing what aggregate function that you want. So a
> > hard way, even after cheating, to take the data provided is given
> > below. (The Numpy Example List was very useful especially on the where
> > function)!
>
> > I tried to be a little generic so you can replace the sum by any
> > suitable function and probably the array type as well. Of course it is
> > not complete because you still need to know the levels of the 'rows'
> > and 'columns' and also is not efficient as it has loops.
>
> > Bruce
>
> > from numpy import *
> > A=array([[1,1,10],
> >          [1,1,20],
> >          [1,2,30],
> >          [2,1,40],
> >          [2,2,50],
> >          [2,2,60] ])
> > C = zeros((2,2))
>
> > for i in range(2):
> >       crit1 = (A[:,0]==1+i)
> >       subA=A[crit1,1:]
> >       for j in range(2):
> >             crit2 = (subA[:,0]==1+j)
> >             subB=subA[crit2,1:]
> >             C[i,j]=subB.sum()
>
> > print C
>
> > On 7/30/07, Geoffrey Zhu <zyzhu2... at gmail.com> wrote:
> >> Hi Everyone,
>
> >> I am wondering what is the best (and fast) way to build a pivot table
> >> aside from the 'brute force way?'
>
> >> I want to transform an numpy array into a pivot table. For
> >> example, if
> >> I have a numpy array like below:
>
> >> Region     Date          # of Units
> >> ----------    ----------        --------------
> >> East        1/1             10
> >> East        1/1             20
> >> East        1/2             30
> >> West       1/1             40
> >> West       1/2             50
> >> West       1/2             60
>
> >> I want  to transform this into the following table, where f() is a
> >> given aggregate function:
>
> >>            Date
> >> Region           1/1          1/2
> >> ----------
> >> East         f(10,20)         f(30)
> >> West        f(40)             f(50,60)
>
> >> I can regroup them into 'sets' and do it the brute force way, but
> >> that
> >> is kind of slow to execute. Does anyone know a better way?
>
> >> Thanks,
> >> Geoffrey
> >> _______________________________________________
> >> Numpy-discussion mailing list
> >> Numpy-discuss... at scipy.org
> >>http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discuss... at scipy.org
> >http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From david at ar.media.kyoto-u.ac.jp  Fri Aug  3 02:06:58 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 03 Aug 2007 15:06:58 +0900
Subject: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement
Message-ID: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>

Hi,

   Following an ongoing discussion with S. Johnson, one of the developer 
of fftw3, I would be interested in what people think about adding 
infrastructure in numpy related to SIMD alignement (that is 16 bytes 
alignement for SSE/ALTIVEC, I don't know anything about other archs). 
The problem is that right now, it is difficult to get information for 
alignement in numpy (by alignement here, I mean something different than 
what is normally meant in numpy context; whether, in my understanding, 
NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I 
am talking about arbitrary alignement).
  For example, for fftw3, we need to know whether a given data buffer is 
16 bytes aligned to get optimal performances; generally, SSE needs 16 
byte alignement for optimal performances, as well as altivec. I think it 
would be nice to get some infrastructure to help developers to get those 
kind of information, and maybe to be able to request 16 aligned buffers.
   Here is what I can think of:
      - adding an API to know whether a given PyArrayObject has its data 
buffer 16 bytes aligned, and requesting a 16 bytes aligned 
PyArrayObject. Something like NPY_ALIGNED, basically.
      - forcing data allocation to be 16 bytes aligned in numpy (eg 
define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). 
This would mean that many arrays would be "naturally" 16 bytes aligned 
without effort.

Point 2 is really easy to implement I think: actually, on some platforms 
(Mac OS X and FreeBSD), malloc returning 16 bytes aligned buffers 
anyway, so I don't think the wasted space is a real problem. Linux with 
glibc is 8 bytes aligned, I don't know about windows. Implementing our 
own 16 bytes aligned memory allocator for cross platform compatibility 
should be relatively easy. I don't see any drawback, but I guess other 
people will.

Point 1 is more tricky, as this requires much more changes in the code.

Do main developers of numpy have an opinion on this ?

   cheers,

   David


From matthew.brett at gmail.com  Fri Aug  3 09:38:36 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 3 Aug 2007 14:38:36 +0100
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
Message-ID: <1e2af89e0708030638l1c91885bt7769c1c45bdf341d@mail.gmail.com>

Hi,

>    Following an ongoing discussion with S. Johnson, one of the developer
> of fftw3, I would be interested in what people think about adding
> infrastructure in numpy related to SIMD alignement (that is 16 bytes
> alignement for SSE/ALTIVEC, I don't know anything about other archs).
> The problem is that right now, it is difficult to get information for
> alignement in numpy (by alignement here, I mean something different than
> what is normally meant in numpy context; whether, in my understanding,
> NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I
> am talking about arbitrary alignement).

Excellent idea if practical...

Matthew


From strawman at astraw.com  Fri Aug  3 11:12:44 2007
From: strawman at astraw.com (Andrew Straw)
Date: Fri, 03 Aug 2007 08:12:44 -0700
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
Message-ID: <46B345EC.6090503@astraw.com>

Dear David,

Both ideas, particularly the 2nd, would be excellent additions to numpy. 
I often use the Intel IPP (Integrated Performance Primitives) Library 
together with numpy, but I have to do all my memory allocation with the 
IPP to ensure fastest operation. I then create numpy views of the data. 
All this works brilliantly, but it would be really nice if I could 
allocate the memory directly in numpy.

IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. 
http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given 
that fftw3 apparently wants 16 byte aligned memory, my feeling is that, 
  if the effort is made, the alignment width should be specified at 
run-time, rather than hard-coded.

In terms of implementation of your 1st point, I'm not aware of how much 
effort your idea would take (and it does sound nice), but some benefit 
would be had just from a simple function numpy.is_mem_aligned( ndarray, 
width=16 ) which returns a bool.

Cheers!
Andrew

David Cournapeau wrote:
> Hi,
> 
>    Following an ongoing discussion with S. Johnson, one of the developer 
> of fftw3, I would be interested in what people think about adding 
> infrastructure in numpy related to SIMD alignement (that is 16 bytes 
> alignement for SSE/ALTIVEC, I don't know anything about other archs). 
> The problem is that right now, it is difficult to get information for 
> alignement in numpy (by alignement here, I mean something different than 
> what is normally meant in numpy context; whether, in my understanding, 
> NPY_ALIGNED refers to a pointer which is aligned wrt his type, here, I 
> am talking about arbitrary alignement).
>   For example, for fftw3, we need to know whether a given data buffer is 
> 16 bytes aligned to get optimal performances; generally, SSE needs 16 
> byte alignement for optimal performances, as well as altivec. I think it 
> would be nice to get some infrastructure to help developers to get those 
> kind of information, and maybe to be able to request 16 aligned buffers.
>    Here is what I can think of:
>       - adding an API to know whether a given PyArrayObject has its data 
> buffer 16 bytes aligned, and requesting a 16 bytes aligned 
> PyArrayObject. Something like NPY_ALIGNED, basically.
>       - forcing data allocation to be 16 bytes aligned in numpy (eg 
> define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc). 
> This would mean that many arrays would be "naturally" 16 bytes aligned 
> without effort.
> 
> Point 2 is really easy to implement I think: actually, on some platforms 
> (Mac OS X and FreeBSD), malloc returning 16 bytes aligned buffers 
> anyway, so I don't think the wasted space is a real problem. Linux with 
> glibc is 8 bytes aligned, I don't know about windows. Implementing our 
> own 16 bytes aligned memory allocator for cross platform compatibility 
> should be relatively easy. I don't see any drawback, but I guess other 
> people will.
> 
> Point 1 is more tricky, as this requires much more changes in the code.
> 
> Do main developers of numpy have an opinion on this ?
> 
>    cheers,
> 
>    David
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion


From focke at slac.stanford.edu  Fri Aug  3 11:17:30 2007
From: focke at slac.stanford.edu (Warren Focke)
Date: Fri, 3 Aug 2007 08:17:30 -0700 (PDT)
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>
References: <46B208ED.1010403@imtek.de>
	<Pine.LNX.4.64.0708021245190.31476@circinus.slac.stanford.edu>
Message-ID: <Pine.LNX.4.64.0708030813100.5237@circinus.slac.stanford.edu>


On Thu, 2 Aug 2007, Warren Focke wrote:

>
>
> On Thu, 2 Aug 2007, Lars Friedrich wrote:
>
>> versions to numpy? I am not a decent C-hacker, but if someone tells me, 
>> that this task is not *too* hard, I would start looking more closely at the 
>> code...
>
> It shouldn't be hard.  fftpack.c will make a single-precision version if 
> DOUBLE is not defined at compile time.

Of course, it's even less hard to use FFTW or MKL.

w


From david at ar.media.kyoto-u.ac.jp  Fri Aug  3 23:28:34 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 04 Aug 2007 12:28:34 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B345EC.6090503@astraw.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B345EC.6090503@astraw.com>
Message-ID: <46B3F262.80000@ar.media.kyoto-u.ac.jp>

Andrew Straw wrote:
> Dear David,
>
> Both ideas, particularly the 2nd, would be excellent additions to numpy. 
> I often use the Intel IPP (Integrated Performance Primitives) Library 
> together with numpy, but I have to do all my memory allocation with the 
> IPP to ensure fastest operation. I then create numpy views of the data. 
> All this works brilliantly, but it would be really nice if I could 
> allocate the memory directly in numpy.
>
> IPP allocates, and says it wants, 32 byte aligned memory (see, e.g. 
> http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given 
> that fftw3 apparently wants 16 byte aligned memory, my feeling is that, 
>   if the effort is made, the alignment width should be specified at 
> run-time, rather than hard-coded.
I think that doing it at runtime would be overkill, no ? I was thinking 
about making it a compile option. Generally, at the ASM level, you need 
16 bytes alignment (for instructions like movaps, which takes 16 bytes 
in memory and put it in the SSE registers), this is not just fftw. Maybe 
the 32 bytes alignment is useful for cache reasons, I don't know.

I don't think it would be difficult to implement and validate; what I 
don't know at all is the implication of this at the binary level, if any.

cheers,

David


From charlesr.harris at gmail.com  Sat Aug  4 01:30:46 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 3 Aug 2007 23:30:46 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B3F262.80000@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp>
Message-ID: <e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>

On 8/3/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> Andrew Straw wrote:
> > Dear David,
> >
> > Both ideas, particularly the 2nd, would be excellent additions to numpy.
> > I often use the Intel IPP (Integrated Performance Primitives) Library
> > together with numpy, but I have to do all my memory allocation with the
> > IPP to ensure fastest operation. I then create numpy views of the data.
> > All this works brilliantly, but it would be really nice if I could
> > allocate the memory directly in numpy.
> >
> > IPP allocates, and says it wants, 32 byte aligned memory (see, e.g.
> > http://www.intel.com/support/performancetools/sb/CS-021418.htm ). Given
> > that fftw3 apparently wants 16 byte aligned memory, my feeling is that,
> >   if the effort is made, the alignment width should be specified at
> > run-time, rather than hard-coded.
> I think that doing it at runtime would be overkill, no ? I was thinking
> about making it a compile option. Generally, at the ASM level, you need
> 16 bytes alignment (for instructions like movaps, which takes 16 bytes
> in memory and put it in the SSE registers), this is not just fftw. Maybe
> the 32 bytes alignment is useful for cache reasons, I don't know.
>
> I don't think it would be difficult to implement and validate; what I
> don't know at all is the implication of this at the binary level, if any.


Here's a hack that google turned up:

(1) Use static variables instead of dynamic (stack) variables
(2) Use in-line assembly code that explicitly aligns data
(3) In C code, use "*malloc*" to explicitly allocate variables

Here is Intel's example of (2):

; procedure prologue
push ebp
mov esp, ebp
and ebp, -8
sub esp, 12

; procedure epilogue
add esp, 12
pop ebp
ret

Intel's example of (3), slightly modified:

double *p, *newp;
p = (double*)*malloc* ((sizeof(double)*NPTS)+4);
newp = (p+4) & (~7);

This assures that newp is 8-*byte* aligned even if p is not. However,
*malloc*() may already follow Intel's recommendation that a *32*-*byte* or
greater data structures be aligned on a *32* *byte* boundary. In that case,
increasing the requested memory by 4 bytes and computing newp are
superfluous.


Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070803/6eb3641a/attachment.html>

From charlesr.harris at gmail.com  Sat Aug  4 02:06:15 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 4 Aug 2007 00:06:15 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp>
	<e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>
Message-ID: <e06186140708032306q7f43db5dv7418650f4a79160a@mail.gmail.com>

On 8/3/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 8/3/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> >
> > Andrew Straw wrote:
> > > Dear David,
> > >
> > > Both ideas, particularly the 2nd, would be excellent additions to
> > numpy.
> > > I often use the Intel IPP (Integrated Performance Primitives) Library
> > > together with numpy, but I have to do all my memory allocation with
> > the
> > > IPP to ensure fastest operation. I then create numpy views of the
> > data.
> > > All this works brilliantly, but it would be really nice if I could
> > > allocate the memory directly in numpy.
> > >
> > > IPP allocates, and says it wants, 32 byte aligned memory (see, e.g.
> > > http://www.intel.com/support/performancetools/sb/CS-021418.htm ).
> > Given
> > > that fftw3 apparently wants 16 byte aligned memory, my feeling is
> > that,
> > >   if the effort is made, the alignment width should be specified at
> > > run-time, rather than hard-coded.
> > I think that doing it at runtime would be overkill, no ? I was thinking
> > about making it a compile option. Generally, at the ASM level, you need
> > 16 bytes alignment (for instructions like movaps, which takes 16 bytes
> > in memory and put it in the SSE registers), this is not just fftw. Maybe
> > the 32 bytes alignment is useful for cache reasons, I don't know.
> >
> > I don't think it would be difficult to implement and validate; what I
> > don't know at all is the implication of this at the binary level, if
> > any.
>
>
>
> Here's a hack that google turned up:
>
> (1) Use static variables instead of dynamic (stack) variables
> (2) Use in-line assembly code that explicitly aligns data
> (3) In C code, use "*malloc*" to explicitly allocate variables
>
> Here is Intel's example of (2):
>
> ; procedure prologue
> push ebp
> mov esp, ebp
> and ebp, -8
> sub esp, 12
>
> ; procedure epilogue
> add esp, 12
> pop ebp
> ret
>
> Intel's example of (3), slightly modified:
>
> double *p, *newp;
> p = (double*)*malloc* ((sizeof(double)*NPTS)+4);
> newp = (p+4) & (~7);
>
> This assures that newp is 8-*byte* aligned even if p is not. However,
> *malloc*() may already follow Intel's recommendation that a *32*-* byte*or
> greater data structures be aligned on a * 32* *byte* boundary. In that
> case,
> increasing the requested memory by 4 bytes and computing newp are
> superfluous.
>

I think that for numpy arrays it should be possible to define the offset so
that the result is 32 byte aligned. However, this might break some peoples'
code if they haven't payed attention to the offset. Another possibility is
to allocate an oversized array, check the pointer, and take a range out of
it. For instance:

In [32]: a = zeros(10)

In [33]: a.ctypes.data % 32
Out[33]: 16

The array alignment is 16 bytes, consequently

In [34]: a[2:].ctypes.data % 32
Out[34]: 0

Voila, 32 byte alignment. I think a short python routine could do this,
which ought to serve well for 1D fft's. Multidimensional arrays will be
trickier if you want the rows to be aligned. Aligning the columns just isn't
going to work.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070804/fab625e1/attachment.html>

From david at ar.media.kyoto-u.ac.jp  Sat Aug  4 02:25:38 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 04 Aug 2007 15:25:38 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708032306q7f43db5dv7418650f4a79160a@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<46B345EC.6090503@astraw.com>
	<46B3F262.80000@ar.media.kyoto-u.ac.jp>	<e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>
	<e06186140708032306q7f43db5dv7418650f4a79160a@mail.gmail.com>
Message-ID: <46B41BE2.7020003@ar.media.kyoto-u.ac.jp>


>
>
>     Here's a hack that google turned up:
>
>     (1) Use static variables instead of dynamic (stack) variables
>     (2) Use in-line assembly code that explicitly aligns data
>     (3) In C code, use "*malloc*" to explicitly allocate variables
>
>     Here is Intel's example of (2):
>
>     ; procedure prologue
>     push ebp
>     mov esp, ebp
>     and ebp, -8
>     sub esp, 12
>
>     ; procedure epilogue
>     add esp, 12
>     pop ebp
>     ret
>
>     Intel's example of (3), slightly modified:
>
>     double *p, *newp;
>     p = (double*)*malloc* ((sizeof(double)*NPTS)+4);
>     newp = (p+4) & (~7);
>
>     This assures that newp is 8-*byte* aligned even if p is not. However,
>     *malloc*() may already follow Intel's recommendation that a *32*-*
>     byte* or
>     greater data structures be aligned on a *32* *byte* boundary. In
>     that case,
>     increasing the requested memory by 4 bytes and computing newp are
>     superfluous.
>
>
> I think that for numpy arrays it should be possible to define the 
> offset so that the result is 32 byte aligned. However, this might 
> break some peoples' code if they haven't payed attention to the offset.
Why ? I really don't see how it can break anything at the source code 
level. You don't have to care about things you didn't care before: the 
best proof of that if that numpy runs on different platforms where the 
malloc has different alignment guarantees (mac OS X already aligned to 
16 bytes, for the very reason of making optimizing with SIMD easier, 
whereas glibc malloc only aligns to 8 bytes, at least on Linux).
> Another possibility is to allocate an oversized array, check the 
> pointer, and take a range out of it. For instance:
>
> In [32]: a = zeros(10)
>
> In [33]: a.ctypes.data % 32
> Out[33]: 16
>
> The array alignment is 16 bytes, consequently
>
> In [34]: a[2:].ctypes.data % 32
> Out[34]: 0
>
> Voila, 32 byte alignment. I think a short python routine could do 
> this, which ought to serve well for 1D fft's. Multidimensional arrays 
> will be trickier if you want the rows to be aligned. Aligning the 
> columns just isn't going to work.
I am not suggesting realigning existing arrays. What I would like numpy 
to support are the following cases:

 - Check whether a given a numpy array is simd aligned:

/* Simple case: if aligned, use optimized func, use non optimized 
otherwise */
int simd_func(double* in, size_t n);
int nosimd_func(double* in, size_t n);

if (PyArray_ISALIGNED_SIMD(a)) {
    simd_func((double *)a->data, a->size);
} else {
    nosimd_func((double *)a->data, a->size);
}
 - Request explicitely an aligned arrays from any PyArray_* functions 
which create a ndarray, eg: ar = PyArray_FROM_OF(a, NPY_SIMD_ALIGNED);

Allocating a buffer aligned to a given alignment is not the problem: 
there is a posix functions to do it, and we can implement easily a 
function for the OS who do not support it. This would be done in C, not 
in python.

cheers,

David


From peridot.faceted at gmail.com  Sat Aug  4 03:24:55 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Sat, 4 Aug 2007 03:24:55 -0400
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B41BE2.7020003@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B345EC.6090503@astraw.com> <46B3F262.80000@ar.media.kyoto-u.ac.jp>
	<e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>
	<e06186140708032306q7f43db5dv7418650f4a79160a@mail.gmail.com>
	<46B41BE2.7020003@ar.media.kyoto-u.ac.jp>
Message-ID: <ce557a360708040024h3719982ao2d269e119aca2eb6@mail.gmail.com>

On 04/08/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> >     Here's a hack that google turned up:

I'd avoid hacks in favour of posix_memalign (which allows arbitrary
degrees of alignment. For one thing, freeing becomes a headache (you
can't free a pointer you've jiggered!).

>  - Check whether a given a numpy array is simd aligned:
>
> /* Simple case: if aligned, use optimized func, use non optimized
> otherwise */
> int simd_func(double* in, size_t n);
> int nosimd_func(double* in, size_t n);
>
> if (PyArray_ISALIGNED_SIMD(a)) {
>     simd_func((double *)a->data, a->size);
> } else {
>     nosimd_func((double *)a->data, a->size);
> }
>  - Request explicitely an aligned arrays from any PyArray_* functions
> which create a ndarray, eg: ar = PyArray_FROM_OF(a, NPY_SIMD_ALIGNED);
>
> Allocating a buffer aligned to a given alignment is not the problem:
> there is a posix functions to do it, and we can implement easily a
> function for the OS who do not support it. This would be done in C, not
> in python.

I'd just like to point out that PyArray_ISALIGNED_SIMD(a) can be a
macro which aligns to something like "!((a->datapointer)&0xf)"; this
avoids any change to the array objects and allows checking for
arbitrary degrees of alignment - somebody mentioned the Intel
Performance Primitives need 32-byte aligned data? One might also want
page-aligned data or data aligned in some way with cache lines.

It seems to me two things are needed:

* A mechanism for requesting numpy arrays with buffers aligned to an
arbitrary power-of-two size (basically just using posix_memalign or
some horrible hack on platforms that don't have it).

* A macro (in C, and some way to get the same information from python,
perhaps just "a.ctypes.data % 16") to test for common alignment cases;
SIMD alignment and arbitrary power-of-two alignment are probably
sufficient.

Does this fail to cover any important cases?

Anne


From adam.powell at ucl.ac.uk  Fri Aug  3 07:50:04 2007
From: adam.powell at ucl.ac.uk (adam.powell at ucl.ac.uk)
Date: Fri, 03 Aug 2007 12:50:04 +0100
Subject: [Numpy-discussion] multinomial error?
Message-ID: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk>

Hi,

I appear to be having a problem with the random.multinomial function. For some
reason if i attempt to loop over a large number of single-trial multinomial
picks then the function begins to ignore some non-zero entries in my 1-D array
of multinomial probabilities... Is seems that there is no upper limit on the
size of the probability array for a one off multinomial pick, but if looping
over the process multiple times the function can't handle the whole array and
seems to truncate it arbitrarily before performing the trial with only the
remaining probabilities.

There is a reason why i need to loop over a large number of single-trial events,
rather than just replacing the loop with a large number of trials in one single
multinomial pick (annoying, as that's so much quicker!).

Thanks for any help,

Adam


From stevenj at alum.mit.edu  Sat Aug  4 23:20:31 2007
From: stevenj at alum.mit.edu (Steven G. Johnson)
Date: Sat, 04 Aug 2007 20:20:31 -0700
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708040024h3719982ao2d269e119aca2eb6@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B345EC.6090503@astraw.com>
	<46B3F262.80000@ar.media.kyoto-u.ac.jp>
	<e06186140708032230i65b77d63g4b001b6ac60e66d@mail.gmail.com>
	<e06186140708032306q7f43db5dv7418650f4a79160a@mail.gmail.com>
	<46B41BE2.7020003@ar.media.kyoto-u.ac.jp>
	<ce557a360708040024h3719982ao2d269e119aca2eb6@mail.gmail.com>
Message-ID: <1186284031.873276.35610@19g2000hsx.googlegroups.com>

On Aug 4, 3:24 am, "Anne Archibald" <peridot.face... at gmail.com> wrote:

> It seems to me two things are needed:
>
> * A mechanism for requesting numpy arrays with buffers aligned to an
> arbitrary power-of-two size (basically just using posix_memalign or
> some horrible hack on platforms that don't have it).

Right, you might as well allow the alignment (to a power-of-two size)
to be specified at runtime, as there is really no cost to implementing
an arbitrary alignment once you have any alignment.

Although you should definitely use posix_memalign (or the old
memalign) where it is available, unfortunately it's not implemented on
all systems.  e.g. MacOS X and FreeBSD don't have it, last I checked
(although in both cases their malloc is 16-byte aligned).  Microsoft VC
++ has a function called _aligned_malloc which is equivalent.

However, since MinGW (www.mingw.org) didn't have an _aligned_malloc
function, I wrote one for them a few years ago and put it in the
public domain (I use MinGW to cross-compile to Windows from Linux and
need the alignment).  You are free to use it as a fallback on systems
that don't have a memalign function if you want.  It should work on
any system where sizeof(void*) is a power of two (i.e. every extant
architecture, that I know of).  You can download it and its test
program from:
           ab-initio.mit.edu/~stevenj/align.c
           ab-initio.mit.edu/~stevenj/tstalign.c
It just uses malloc with a little extra padding as needed to align the
data, plus a copy of the original pointer so that you can still free
and realloc (using _aligned_free and _aligned_realloc).  It could be
made a bit more efficient, but it probably doesn't matter.

> * A macro (in C, and some way to get the same information from python,
> perhaps just "a.ctypes.data % 16") to test for common alignment cases;
> SIMD alignment and arbitrary power-of-two alignment are probably
> sufficient.

In C this is easy, just ((uintptr_t) pointer) % 16 == 0.

You might also consider a way to set the default alignment of numpy
arrays at runtime, rather than requesting aligned arrays
individually.  e.g. so that someone could come along at a later date
to a large program and just add one function call to make all the
arrays 16-byte aligned to improve performance using SIMD libraries.

Regards,
Steven G. Johnson


From aisaac at american.edu  Sun Aug  5 10:42:17 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Sun, 5 Aug 2007 10:42:17 -0400
Subject: [Numpy-discussion] multinomial error?
In-Reply-To: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk>
References: <20070803125004.3sauiyr34g444048@www.webmail.ucl.ac.uk>
Message-ID: <Mahogany-0.67.0-432-20070805-104217.00@american.edu>

On Fri, 03 Aug 2007, adam.powell at ucl.ac.uk apparently wrote:
> I appear to be having a problem with the random.multinomial function. For some 
> reason if i attempt to loop over a large number of single-trial multinomial 
> picks then the function begins to ignore some non-zero entries in my 1-D array 
> of multinomial probabilities... Is seems that there is no upper limit on the 
> size of the probability array for a one off multinomial pick, but if looping 
> over the process multiple times the function can't handle the whole array and 
> seems to truncate it arbitrarily before performing the trial with only the 
> remaining probabilities. 

Minimal example?

Cheers,
Alan Isaac


From lfriedri at imtek.de  Mon Aug  6 02:53:55 2007
From: lfriedri at imtek.de (Lars Friedrich)
Date: Mon, 06 Aug 2007 08:53:55 +0200
Subject: [Numpy-discussion] fourier with single precision
Message-ID: <46B6C583.7080701@imtek.de>

Hello,

thanks for your comments. If I got you right, I should look for a 
FFT-code that uses SSE (what does this actually stand for?), which means 
that it vectorizes 32bit-single-operations into larger chunks that make 
efficient use of recent CPUs.

You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math 
kernel library'? If I would like to use one of them, is numpy the right 
place to put it in?

Does anyone know, if it is possible to switch on SSE support (at compile 
time) in the fftpack.c that numpy uses?

Thanks

Lars


From nwagner at iam.uni-stuttgart.de  Mon Aug  6 03:09:54 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 06 Aug 2007 09:09:54 +0200
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B6C583.7080701@imtek.de>
References: <46B6C583.7080701@imtek.de>
Message-ID: <46B6C942.50106@iam.uni-stuttgart.de>

Lars Friedrich wrote:
> Hello,
>
> thanks for your comments. If I got you right, I should look for a 
> FFT-code that uses SSE (what does this actually stand for?), which means 
> that it vectorizes 32bit-single-operations into larger chunks that make 
> efficient use of recent CPUs.
>
>   
http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
 
Nils


From david at ar.media.kyoto-u.ac.jp  Mon Aug  6 03:43:13 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 06 Aug 2007 16:43:13 +0900
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B6C583.7080701@imtek.de>
References: <46B6C583.7080701@imtek.de>
Message-ID: <46B6D111.7090106@ar.media.kyoto-u.ac.jp>

Lars Friedrich wrote:
> Hello,
>
> thanks for your comments. If I got you right, I should look for a 
> FFT-code that uses SSE (what does this actually stand for?), which means 
> that it vectorizes 32bit-single-operations into larger chunks that make 
> efficient use of recent CPUs.
>
> You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math 
> kernel library'? If I would like to use one of them, is numpy the right 
> place to put it in?
>
> Does anyone know, if it is possible to switch on SSE support (at compile 
> time) in the fftpack.c that numpy uses?
>
MKL is from Intel (free as in beer on Linux and for academic purpose I 
think, but of course, you should check whether this applies to you). 
FFTW is GPL, and AFAIK is considered to be the fastest general purpose 
open source FFT.

Here are your options as far as I understand:

    - if you care about speed (that is, faster than numpy), then use 
scipy.fftpack with fftw3: there are wrappers in scipy for it. There is 
no float support (yet), but it is planned. Even with double, it will be 
faster (how much is really platform dependent). There is also MKL 
support, which may be faster (never used it).
    - if you care also about memory, then maybe you will have no choice 
but using your own routines for float support. FFTW support both single 
and double precision, but only double is available in scipy.

    cheers,

    David
> Thanks
>
> Lars
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>


From david at ar.media.kyoto-u.ac.jp  Mon Aug  6 03:51:29 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 06 Aug 2007 16:51:29 +0900
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B6C583.7080701@imtek.de>
References: <46B6C583.7080701@imtek.de>
Message-ID: <46B6D301.3080208@ar.media.kyoto-u.ac.jp>

Lars Friedrich wrote:
> Hello,
>
> thanks for your comments. If I got you right, I should look for a 
> FFT-code that uses SSE (what does this actually stand for?), which means 
> that it vectorizes 32bit-single-operations into larger chunks that make 
> efficient use of recent CPUs.
>
> You mentioned FFTW and MKL. Is this www.fftw.org and the 'intel math 
> kernel library'? If I would like to use one of them, is numpy the right 
> place to put it in?
>
> Does anyone know, if it is possible to switch on SSE support (at compile 
> time) in the fftpack.c that numpy uses?
>   
I don't think it will have much impact, because to use SSE efficiently, 
you need some constraints wrt memory allocation which cannot be met 
easily now in numpy arrays, AND good compiler support (intel compiler, 
basically) for automatic vectorization. Even then, FFT may have specific 
patterns which mean that only hand tuned routines can get most of the 
CPU horsepower: both mkl and fftw use SIMD instructions to get their 
maximum efficiency.

David


From matthieu.brucher at gmail.com  Mon Aug  6 04:04:18 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Mon, 6 Aug 2007 10:04:18 +0200
Subject: [Numpy-discussion] fourier with single precision
In-Reply-To: <46B6D111.7090106@ar.media.kyoto-u.ac.jp>
References: <46B6C583.7080701@imtek.de>
	<46B6D111.7090106@ar.media.kyoto-u.ac.jp>
Message-ID: <e76aa17f0708060104m54326cbfy4bd3fa95f006baef@mail.gmail.com>

>
> MKL is from Intel (free as in beer on Linux and for academic purpose I
> think, but of course, you should check whether this applies to you).


AFAIK, the MKL is free for non-commercial purposes under Linux only, and
there is a special license for academics.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070806/e401cc35/attachment.html>

From tim.hochberg at ieee.org  Mon Aug  6 09:54:12 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Mon, 6 Aug 2007 06:54:12 -0700
Subject: [Numpy-discussion] How to implement a 'pivot table?'
In-Reply-To: <57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com>
References: <fcedc2970707300812p58c343fbq852a8de09dbd4420@mail.gmail.com>
	<bbcd77d00708011202g600835d3t2c8c2813ff04c1e2@mail.gmail.com>
	<57273045-FDD5-4454-A1DE-9CB12387DA88@enthought.com>
Message-ID: <e4412d6b0708060654j372e9898m6b406c543941d63b@mail.gmail.com>

Nicely done Travis. Working code is always better than theory. I copied your
interface and used the brute-force, non-numpy approach to construct the
pivot table. On the one hand, it doesn't preserve the order that the entires
are discovered in as the original does. On the other hand, it's about 40%
faster for large files on my machine (see pivot2). Probably because you
don't have to loop through the data so many times. You can get further
improvements if you know the operation in advance as shown in pivotsum,
although this won't work on median ASAIK.

regards,

-tim


On 8/1/07, Travis Vaught <travis at enthought.com> wrote:
>
> Greetings,
>
> Speaking of brute force... I've attached a rather ugly module that
> let's you do things with a pretty simple interface (session shown
> below).  I haven't fully tested the performance, but a million
> records with 5 fields takes about 11 seconds on my Mac to do a
> 'mean'.  I'm not sure what your performance considerations are, but
> this may be useful.  Record arrays are really nice if they make sense
> for your data.
>
> Travis
>
>
> (from an ipython command prompt)
>
> In [1]: import testpivot as p
>
> In [2]: a = p.sample_data()
>
> In [3]: a
> Out[3]:
> recarray([('ACorp', 'Region 1', 'Q1', 20000.0),
>         ('ACorp', 'Region 1', 'Q2', 22000.0),
>         ('ACorp', 'Region 1', 'Q3', 21000.0),
>         ('ACorp', 'Region 1', 'Q4', 26000.0 ),
>         ('ACorp', 'Region 2', 'Q1', 23000.0),
>         ('ACorp', 'Region 2', 'Q2', 20000.0),
>         ('ACorp', 'Region 2', 'Q3', 22000.0),
>         ('ACorp', 'Region 2', 'Q4', 21000.0),
>         ('ACorp', 'Region 3', 'Q1', 26000.0),
>         ('ACorp', 'Region 3', 'Q2', 23000.0),
>         ('ACorp', 'Region 3', 'Q3', 29000.0),
>         ('ACorp', 'Region 3', 'Q4', 27000.0),
>         ('BCorp', 'Region 1', 'Q1', 20000.0),
>         ('BCorp', 'Region 1', 'Q2', 20000.0),
>         ('BCorp', 'Region 1', 'Q3', 24000.0),
>         ('BCorp', 'Region 1', 'Q4', 24000.0),
>         ('BCorp', 'Region 2', 'Q1', 21000.0 ),
>         ('BCorp', 'Region 2', 'Q2', 21000.0),
>         ('BCorp', 'Region 2', 'Q3', 22000.0),
>         ('BCorp', 'Region 2', 'Q4', 29000.0),
>         ('BCorp', 'Region 3', 'Q1', 28000.0),
>         ('BCorp', 'Region 3', 'Q2', 25000.0),
>         ('BCorp', 'Region 3', 'Q3', 22000.0),
>         ('BCorp', 'Region 3', 'Q4', 21000.0)],
>        dtype=[('company', '|S5'), ('region', '|S8'), ('quarter', '|
> S2'), ('income', '<f8')])
>
> In [4]: p.pivot(a, 'company', 'region', 'income', p.psum)
> ######## Summary by company and region ##########
> cols:['ACorp' 'BCorp']
> rows:['Region 1' 'Region 2' 'Region 3']
> [[  89000.   88000.]
> [  86000.   93000.]
> [ 105000.   96000.]]
>
> In [5]: p.pivot(a, 'company', 'quarter', 'income', p.psum)
> ######## Summary by company and quarter ##########
> cols:['ACorp' 'BCorp']
> rows:['Q1' 'Q2' 'Q3' 'Q4']
> [[ 69000.  69000.]
> [ 65000.  66000.]
> [ 72000.  68000.]
> [ 74000.  74000.]]
>
> In [6]: p.pivot(a, 'company', 'quarter', 'income', p.pmean)
> ######## Summary by company and quarter ##########
> cols:['ACorp' 'BCorp']
> rows:['Q1' 'Q2' 'Q3' 'Q4']
> [[ 23000.          23000.        ]
> [ 21666.66666667   22000.        ]
> [ 24000.          22666.66666667]
> [ 24666.66666667  24666.66666667]]
>
>
>
>
> On Aug 1, 2007, at 2:02 PM, Bruce Southey wrote:
>
> > Hi,
> > The hard part is knowing what aggregate function that you want. So a
> > hard way, even after cheating, to take the data provided is given
> > below. (The Numpy Example List was very useful especially on the where
> > function)!
> >
> > I tried to be a little generic so you can replace the sum by any
> > suitable function and probably the array type as well. Of course it is
> > not complete because you still need to know the levels of the 'rows'
> > and 'columns' and also is not efficient as it has loops.
> >
> > Bruce
> >
> > from numpy import *
> > A=array([[1,1,10],
> >          [1,1,20],
> >          [1,2,30],
> >          [2,1,40],
> >          [2,2,50],
> >          [2,2,60] ])
> > C = zeros((2,2))
> >
> > for i in range(2):
> >       crit1 = (A[:,0]==1+i)
> >       subA=A[crit1,1:]
> >       for j in range(2):
> >             crit2 = (subA[:,0]==1+j)
> >             subB=subA[crit2,1:]
> >             C[i,j]=subB.sum()
> >
> >
> > print C
> >
> > On 7/30/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
> >> Hi Everyone,
> >>
> >> I am wondering what is the best (and fast) way to build a pivot table
> >> aside from the 'brute force way?'
> >>
> >> I want to transform an numpy array into a pivot table. For
> >> example, if
> >> I have a numpy array like below:
> >>
> >> Region     Date          # of Units
> >> ----------    ----------        --------------
> >> East        1/1             10
> >> East        1/1             20
> >> East        1/2             30
> >> West       1/1             40
> >> West       1/2             50
> >> West       1/2             60
> >>
> >> I want  to transform this into the following table, where f() is a
> >> given aggregate function:
> >>
> >>            Date
> >> Region           1/1          1/2
> >> ----------
> >> East         f(10,20)         f(30)
> >> West        f(40)             f(50,60)
> >>
> >>
> >> I can regroup them into 'sets' and do it the brute force way, but
> >> that
> >> is kind of slow to execute. Does anyone know a better way?
> >>
> >>
> >> Thanks,
> >> Geoffrey
> >> _______________________________________________
> >> Numpy-discussion mailing list
> >> Numpy-discussion at scipy.org
> >> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >>
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
>


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070806/079cdf0a/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testpivot.py
Type: text/x-python
Size: 6604 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070806/079cdf0a/attachment.py>

From dalcinl at gmail.com  Mon Aug  6 16:08:50 2007
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Mon, 6 Aug 2007 17:08:50 -0300
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
Message-ID: <e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>

On 8/3/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>  Here is what I can think of:
>       - adding an API to know whether a given PyArrayObject has its data
> buffer 16 bytes aligned, and requesting a 16 bytes aligned
> PyArrayObject. Something like NPY_ALIGNED, basically.
>       - forcing data allocation to be 16 bytes aligned in numpy (eg
> define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc).

All this sounds pretty similar to sdt::allocator we can found in C++
STL (http://www.sgi.com/tech/stl/Allocators.html). Perhaps a NumPy
array could be associated with an instance of an 'allocator' object
(surely written in C, perhaps subclassable in Python) providing
appropriate methos for
alloc/dealloc(/realloc?/initialize(memset)?/copy(memcpy)?) memory.

This would be really nice, as it is extensible (you could even write a
custom allocator, perhaps making use of a preallocated,static pool;
use of C++ new/delete; use of any C++ std::allocator, shared memory,
etc. etc.). I think this is the direction to go but no idea how much
difficult it could be to implement.


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From david at ar.media.kyoto-u.ac.jp  Mon Aug  6 23:41:03 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 07 Aug 2007 12:41:03 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
Message-ID: <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>

Lisandro Dalcin wrote:
> On 8/3/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>>  Here is what I can think of:
>>       - adding an API to know whether a given PyArrayObject has its data
>> buffer 16 bytes aligned, and requesting a 16 bytes aligned
>> PyArrayObject. Something like NPY_ALIGNED, basically.
>>       - forcing data allocation to be 16 bytes aligned in numpy (eg
>> define PyDataMem_Mem to a 16 bytes aligned allocator instead of malloc).
>
> All this sounds pretty similar to sdt::allocator we can found in C++
> STL (http://www.sgi.com/tech/stl/Allocators.html). Perhaps a NumPy
> array could be associated with an instance of an 'allocator' object
> (surely written in C, perhaps subclassable in Python) providing
> appropriate methos for
> alloc/dealloc(/realloc?/initialize(memset)?/copy(memcpy)?) memory.
>
> This would be really nice, as it is extensible (you could even write a
> custom allocator, perhaps making use of a preallocated,static pool;
> use of C++ new/delete; use of any C++ std::allocator, shared memory,
> etc. etc.). I think this is the direction to go but no idea how much
> difficult it could be to implement.
>
Well, when I proposed the SIMD extension, I was willing to implement the 
proposal, and this was for a simple goal: enabling better integration 
with many numeric libraries which need SIMD alignment.

As nice as a custom allocator might be, I will certainly not implement 
it myself. For SIMD, I think the weight adding complexity / benefit 
worth it (since there is not much change to the API and implementation), 
and I know more or less how to do it; for custom allocator, that's an 
entirely different story. That's really more complex; static pools may 
be useful in some cases (but that's not obvious, since only the data are 
allocated with this buffer, everything else being allocated through the 
python memory allocator, and numpy arrays have pretty simple memory 
allocation patterns).

David


From peridot.faceted at gmail.com  Tue Aug  7 00:49:09 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 7 Aug 2007 00:49:09 -0400
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
Message-ID: <ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>

On 06/08/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> Well, when I proposed the SIMD extension, I was willing to implement the
> proposal, and this was for a simple goal: enabling better integration
> with many numeric libraries which need SIMD alignment.
>
> As nice as a custom allocator might be, I will certainly not implement
> it myself. For SIMD, I think the weight adding complexity / benefit
> worth it (since there is not much change to the API and implementation),
> and I know more or less how to do it; for custom allocator, that's an
> entirely different story. That's really more complex; static pools may
> be useful in some cases (but that's not obvious, since only the data are
> allocated with this buffer, everything else being allocated through the
> python memory allocator, and numpy arrays have pretty simple memory
> allocation patterns).

I have to agree. I can hardly volunteer David for anything, and I
don't have time to implement this myself, but I think a custom
allocator is a rather special-purpose tool; if one were to implement
one, I think the way to go would be to implement a subclass of ndarray
(or just a constructor) that allocated the memory. This could be done
from python, since you can make an ndarray from scratch using a given
memory array. Of course, making temporaries be allocated with the
correct allocator will be very complicated, since it's unclear which
allocator should be used.

Adding SIMD alignment should be a very small modification; it can be
done as simply as using ctypes to wrap posix_memalign (or a portable
version, possibly written in python) and writing a simple python
function that checks the beginning data address. There's really no
need to make it complicated.

Anne


From david at ar.media.kyoto-u.ac.jp  Tue Aug  7 01:00:20 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 07 Aug 2007 14:00:20 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
Message-ID: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp>

Anne Archibald wrote:
>
> I have to agree. I can hardly volunteer David for anything, and I
> don't have time to implement this myself, but I think a custom
> allocator is a rather special-purpose tool; if one were to implement
> one, I think the way to go would be to implement a subclass of ndarray
> (or just a constructor) that allocated the memory. This could be done
> from python, since you can make an ndarray from scratch using a given
> memory array. Of course, making temporaries be allocated with the
> correct allocator will be very complicated, since it's unclear which
> allocator should be used.
>
> Adding SIMD alignment should be a very small modification; it can be
> done as simply as using ctypes to wrap posix_memalign (or a portable
> version, possibly written in python) and writing a simple python
> function that checks the beginning data address. There's really no
> need to make it complicated.
>
Anne, you said previously that it was easy to allocate buffers for a 
given alignment at runtime. Could you point me to a document which 
explains how ? For platforms without posix_memalign, I don't see how to 
implement a memory allocator with an arbitrary alignment (more 
precisely, I don't see how to free it if I cannot assume a fixed 
alignement: how do I know where the "real" pointer is ?).

David


From peridot.faceted at gmail.com  Tue Aug  7 01:33:24 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 7 Aug 2007 01:33:24 -0400
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
Message-ID: <ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>

On 07/08/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:

> Anne, you said previously that it was easy to allocate buffers for a
> given alignment at runtime. Could you point me to a document which
> explains how ? For platforms without posix_memalign, I don't see how to
> implement a memory allocator with an arbitrary alignment (more
> precisely, I don't see how to free it if I cannot assume a fixed
> alignement: how do I know where the "real" pointer is ?).

Well, it can be done in Python: just allocate a too-big ndarray and
take a slice that's the right shape and has the right alignment. But
this sucks. Stephen G. Johnson posted code earlier in this thread that
provides a portable aligned-memory allocator - it handles the freeing
by (always) storing enough information to recover the original pointer
in the padding space. (This means you always need to pad, which is a
pain, but there's not much you can do about that.) His implementation
stores the original pointer just before the beginning of the aligned
data, so _aligned_free is free(((void**)ptr)[-1]). If you were worried
about space (rather than time) you could store a single byte just
before the pointer whose value indicated how much padding was done, or
whatever.

These schemes all waste space, but unless malloc's internal structures
are the size of the alignment block, it's almost unavoidable to waste
some space; the only way around it I can see is if the program also
allocates lots of small, odd-shaped, unaligned blocks of memory that
can be used to fill the gaps (and even then I doubt any sensible
malloc implementation fills in little gaps like this, since it seems
likely to lead to memory fragmentation). A posix_memalign that is
built into malloc can do better than any implementation that isn't,
though, with the possible exception of a specialized pool allocator
built with aligned allocation in mind.

Anne


From matthieu.brucher at gmail.com  Tue Aug  7 02:23:15 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 7 Aug 2007 08:23:15 +0200
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
Message-ID: <e76aa17f0708062323g6d82686em473c91c95729093f@mail.gmail.com>

>
> For platforms without posix_memalign, I don't see how to
> implement a memory allocator with an arbitrary alignment (more
> precisely, I don't see how to free it if I cannot assume a fixed
> alignement: how do I know where the "real" pointer is ?).


Visual Studio seems to offer a counter part (also note that malloc is
supposed to return a pointer on a 16bits boundary) which is called
_aligned_malloc (
http://msdn2.microsoft.com/en-us/library/8z34s9c6(VS.80).aspx). It should be
what you need, at least for Windows/MSVC.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070807/bfb43dd5/attachment.html>

From david at ar.media.kyoto-u.ac.jp  Tue Aug  7 02:11:52 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 07 Aug 2007 15:11:52 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
Message-ID: <46B80D28.8060005@ar.media.kyoto-u.ac.jp>

Anne Archibald wrote:
> Well, it can be done in Python: just allocate a too-big ndarray and
> take a slice that's the right shape and has the right alignment. But
> this sucks. Stephen G. Johnson posted code earlier in this thread that
> provides a portable aligned-memory allocator - it handles the freeing
> by (always) storing enough information to recover the original pointer
> in the padding space. (This means you always need to pad, which is a
> pain, but there's not much you can do about that.) 
This is indeed no rocket science, I feel a bit ashamed :) I don't see 
the problem with padding (except wasted time) ?
> His implementation
> stores the original pointer just before the beginning of the aligned
> data, so _aligned_free is free(((void**)ptr)[-1]). If you were worried
> about space (rather than time) you could store a single byte just
> before the pointer whose value indicated how much padding was done, or
> whatever.
I really don't see how space would be a problem in our situation: it is 
not like we will pad more than a few bytes; in the case it is, I don't 
see how python would be the right choice anymore anyway. I will try to 
prepare a patch the next few days, then.

cheers,

David


From gerard.vermeulen at grenoble.cnrs.fr  Tue Aug  7 04:54:28 2007
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Tue, 7 Aug 2007 10:54:28 +0200
Subject: [Numpy-discussion] ANN: PyQwt3D-0.1.5 released
Message-ID: <20070807105428.3aeb7f19@zombie.grenoble.cnrs.fr>

What is PyQwt3D ( http://pyqwt3d.sourceforge.net) ?

- it is a set of Python bindings for the QwtPlot3D C++ class library
  which extends the Qt framework with widgets for 3D data visualization.
  PyQwt3D inherits the snappy feel from QwtPlot3D.
  The examples at http://pyqwt.sourceforge.net/pyqwt3d-examples.html
  show how easy it is to make a 3D plot and how to save a 3D plot to
  an image or an (E)PS/PDF/PGF/SVG file.

- it requires and extends PyQt, a set of Python bindings for Qt.

- it supports the use of PyQt, Qt, QwtPlot3D, and NumPy or SciPy in a
  GUI Python application or in an interactive Python session.

- it runs on POSIX, Mac OS X and Windows platforms (practically any
  platform supported by Qt and Python).

The home page of PyQwt3D is http://pyqwt.sourceforge.net.

New features and bugfixes in PyQwt3D-0.1.5:
- Added support for QwtPlot3D-0.2.7
- Added support for SIP-4.7, PyQt-4.3 and PyQt-3.17.3.
- Added support for SVG and PGF vector output.
- Added Qwt3D.save() to facilitate saving plots to a file.
- Added Qwt3D.plot() to facilitate function plotting with nicely scaled axes.
- Fixed the type of the result of IO.outputHandler(format).
- Fixed saving to pixmap formats in qt4examples/Grab.py.

PyQwt3D-0.1.5 supports:
1. Python-2.5, or -2.4.
2. PyQt-4.3, -4.2, -4.1, or -3.17.
3. SIP-4.7, -4.6, or -4.5.
4. Qt-4.3, -4.2, Qt-3.3, or -3.2.
5. QwtPlot3D-0.2.7.


Enjoy -- Gerard Vermeulen


From lfriedri at imtek.de  Tue Aug  7 05:22:24 2007
From: lfriedri at imtek.de (Lars Friedrich)
Date: Tue, 07 Aug 2007 11:22:24 +0200
Subject: [Numpy-discussion] fourier with single precision
Message-ID: <46B839D0.9020905@imtek.de>

Thank you for your comments!

I will try this fftw3-scipy approach and see how much faster I can get. 
Maybe this is enough for me...?

Lars


From nwagner at iam.uni-stuttgart.de  Tue Aug  7 08:02:16 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 07 Aug 2007 14:02:16 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer in a
	list of integers
Message-ID: <46B85F48.3000700@iam.uni-stuttgart.de>

Hi all,

I have a list of integer numbers. The entries can vary between 0 and 19.
How can I count the occurrence of any number. Consider

 >>> data
[9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]


Is there a better way than using, e.g.

>>> shape(where(array(data)==10))[1]
2
 

to compute the occurrence of 10 in the list which is 2 in this case ?

Nils


From matthieu.brucher at gmail.com  Tue Aug  7 08:13:49 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 7 Aug 2007 14:13:49 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
	a list of integers
In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
Message-ID: <e76aa17f0708070513i17c2d758m72882312b9838ce7@mail.gmail.com>

You can try using hist() with the correct range and number of bins.

Matthieu

2007/8/7, Nils Wagner <nwagner at iam.uni-stuttgart.de>:
>
> Hi all,
>
> I have a list of integer numbers. The entries can vary between 0 and 19.
> How can I count the occurrence of any number. Consider
>
> >>> data
> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7,
> 9, 7, 8, 9, 8, 7, 9]
>
>
> Is there a better way than using, e.g.
>
> >>> shape(where(array(data)==10))[1]
> 2
>
>
> to compute the occurrence of 10 in the list which is 2 in this case ?
>
> Nils
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070807/a6dd9f44/attachment.html>

From kwgoodman at gmail.com  Tue Aug  7 08:19:47 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 7 Aug 2007 14:19:47 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
	a list of integers
In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
Message-ID: <f4f93d420708070519rc1b1476ndde06fb55b2f1609@mail.gmail.com>

On 8/7/07, Nils Wagner <nwagner at iam.uni-stuttgart.de> wrote:
> I have a list of integer numbers. The entries can vary between 0 and 19.
> How can I count the occurrence of any number. Consider
>
>  >>> data
> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]
>
>
> Is there a better way than using, e.g.
>
> >>> shape(where(array(data)==10))[1]
> 2
>
>
> to compute the occurrence of 10 in the list which is 2 in this case ?

Would list comprehension work?

len([z for z in data if z == 10])


From kwgoodman at gmail.com  Tue Aug  7 08:24:13 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 7 Aug 2007 14:24:13 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
	a list of integers
In-Reply-To: <f4f93d420708070519rc1b1476ndde06fb55b2f1609@mail.gmail.com>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
	<f4f93d420708070519rc1b1476ndde06fb55b2f1609@mail.gmail.com>
Message-ID: <f4f93d420708070524t5d2fafe5hb0747009e6b34d85@mail.gmail.com>

On 8/7/07, Keith Goodman <kwgoodman at gmail.com> wrote:
> On 8/7/07, Nils Wagner <nwagner at iam.uni-stuttgart.de> wrote:
> > I have a list of integer numbers. The entries can vary between 0 and 19.
> > How can I count the occurrence of any number. Consider
> >
> >  >>> data
> > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]
> >
> >
> > Is there a better way than using, e.g.
> >
> > >>> shape(where(array(data)==10))[1]
> > 2
> >
> >
> > to compute the occurrence of 10 in the list which is 2 in this case ?
>
> Would list comprehension work?
>
> len([z for z in data if z == 10])

Or is this faster?

(array(x)==10).sum()


From cimrman3 at ntc.zcu.cz  Tue Aug  7 08:24:22 2007
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 07 Aug 2007 14:24:22 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
 a	list of integers
In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
Message-ID: <46B86476.3080009@ntc.zcu.cz>

Nils Wagner wrote:
> Hi all,
> 
> I have a list of integer numbers. The entries can vary between 0 and 19.
> How can I count the occurrence of any number. Consider
> 
>  >>> data
> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9]
> 
> 
> Is there a better way than using, e.g.
> 
>>>> shape(where(array(data)==10))[1]
> 2
>  
> 
> to compute the occurrence of 10 in the list which is 2 in this case ?

Your way is ok if you want to count just a few numbers. If you want all,
you may sort the array and use searchorted:

b = sort( a )
count = searchsorted( b, 7, side = 'right' ) - searchsorted( b, 7, side
= 'left' )


r.


From lorrmann at physik.uni-wuerzburg.de  Tue Aug  7 08:51:53 2007
From: lorrmann at physik.uni-wuerzburg.de (volker)
Date: Tue, 7 Aug 2007 12:51:53 +0000 (UTC)
Subject: [Numpy-discussion]
	=?utf-8?q?Count_the_occurrence_of_a_certain_in?=
	=?utf-8?q?teger_in=09a_list_of_integers?=
References: <46B85F48.3000700@iam.uni-stuttgart.de>
	<f4f93d420708070519rc1b1476ndde06fb55b2f1609@mail.gmail.com>
	<f4f93d420708070524t5d2fafe5hb0747009e6b34d85@mail.gmail.com>
Message-ID: <loom.20070807T144536-458@post.gmane.org>

Keith Goodman <kwgoodman <at> gmail.com> writes:

> 
> On 8/7/07, Keith Goodman <kwgoodman <at> gmail.com> wrote:
> > On 8/7/07, Nils Wagner <nwagner <at> iam.uni-stuttgart.de> wrote:
> > > I have a list of integer numbers. The entries can vary between 0 and 19.
> > > How can I count the occurrence of any number. Consider
> > >
> > >  >>> data
> > > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7,
9, 7, 8, 9, 8, 7, 9]
> > >
> > >
> > > Is there a better way than using, e.g.
> > >
> > > >>> shape(where(array(data)==10))[1]
> > > 2
> > >
> > >
> > > to compute the occurrence of 10 in the list which is 2 in this case ?
> >
> > Would list comprehension work?
> >
> > len([z for z in data if z == 10])
> 
> Or is this faster?
> 
> (array(x)==10).sum()
> 

Lets test ;)

In [34]: data = array(data).repeat(1e6)


In [35]: %time shape(where(array(data)==10))[1]
CPU times: user 1.27 s, sys: 0.16 s, total: 1.44 s
Wall time: 1.65


In [36]: %time ([z for z in data if z == 10])
CPU times: user 18.06 s, sys: 0.52 s, total: 18.58 s
Wall time: 18.59


In [37]: %time (array(data)==10).sum()
CPU times: user 0.68 s, sys: 0.20 s, total: 0.88 s
Wall time: 1.36


From aisaac at american.edu  Tue Aug  7 09:11:34 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Tue, 7 Aug 2007 09:11:34 -0400
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
	a list of integers
In-Reply-To: <46B85F48.3000700@iam.uni-stuttgart.de>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
Message-ID: <Mahogany-0.67.0-1092-20070807-091134.00@american.edu>

On Tue, 07 Aug 2007, Nils Wagner apparently wrote:
> I have a list of integer numbers. The entries can vary between 0 and 19. 
> How can I count the occurrence of any number. Consider 
>  >>> data
> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] 
> Is there a better way than using, e.g. 
>>>> shape(where(array(data)==10))[1] 
> 2


You did not say why data.count(10) is unsatisfactory ...

Cheers,
Alan Isaac


From aisaac at american.edu  Tue Aug  7 09:19:13 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Tue, 7 Aug 2007 09:19:13 -0400
Subject: [Numpy-discussion] Count the occurrence of a certain integer in
	a list of integers
In-Reply-To: <Mahogany-0.67.0-1092-20070807-091134.00@american.edu>
References: <46B85F48.3000700@iam.uni-stuttgart.de><Mahogany-0.67.0-1092-20070807-091134.00@american.edu>
Message-ID: <Mahogany-0.67.0-1092-20070807-091913.00@american.edu>

By the way, you can get all the frequencies pretty fast 
using a defaultdict:
http://docs.python.org/lib/defaultdict-examples.html

Cheers,
Alan Isaac


From nwagner at iam.uni-stuttgart.de  Tue Aug  7 09:22:37 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Tue, 07 Aug 2007 15:22:37 +0200
Subject: [Numpy-discussion] Count the occurrence of a certain integer
 in	a list of integers
In-Reply-To: <Mahogany-0.67.0-1092-20070807-091134.00@american.edu>
References: <46B85F48.3000700@iam.uni-stuttgart.de>
	<Mahogany-0.67.0-1092-20070807-091134.00@american.edu>
Message-ID: <46B8721D.8000709@iam.uni-stuttgart.de>

Alan G Isaac wrote:
> On Tue, 07 Aug 2007, Nils Wagner apparently wrote:
>   
>> I have a list of integer numbers. The entries can vary between 0 and 19. 
>> How can I count the occurrence of any number. Consider 
>>  >>> data
>> [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9] 
>> Is there a better way than using, e.g. 
>>     
>>>>> shape(where(array(data)==10))[1] 
>>>>>           
>> 2
>>     
>
>
> You did not say why data.count(10) is unsatisfactory ...
>
> Cheers,
> Alan Isaac
>
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>   
 
Thank you for all your input. To be honest I was not aware of all these
possibilities to solve my problem.
If you distribute a task among different people you will obtain
different methods of resolution.

Nils


From charlesr.harris at gmail.com  Tue Aug  7 16:26:27 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 7 Aug 2007 14:26:27 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
Message-ID: <e06186140708071326y52608713u1bedc98b154b4cc8@mail.gmail.com>

On 8/6/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 06/08/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> > Well, when I proposed the SIMD extension, I was willing to implement the
> > proposal, and this was for a simple goal: enabling better integration
> > with many numeric libraries which need SIMD alignment.
> >
> > As nice as a custom allocator might be, I will certainly not implement
> > it myself. For SIMD, I think the weight adding complexity / benefit
> > worth it (since there is not much change to the API and implementation),
> > and I know more or less how to do it; for custom allocator, that's an
> > entirely different story. That's really more complex; static pools may
> > be useful in some cases (but that's not obvious, since only the data are
> > allocated with this buffer, everything else being allocated through the
> > python memory allocator, and numpy arrays have pretty simple memory
> > allocation patterns).
>
> I have to agree. I can hardly volunteer David for anything, and I
> don't have time to implement this myself, but I think a custom
> allocator is a rather special-purpose tool; if one were to implement
> one, I think the way to go would be to implement a subclass of ndarray
> (or just a constructor) that allocated the memory. This could be done
> from python, since you can make an ndarray from scratch using a given
> memory array. Of course, making temporaries be allocated with the
> correct allocator will be very complicated, since it's unclear which
> allocator should be used.


Maybe I'm missing something, but handling the temporaries is automatic. Just
return the appropriate slice from an array created in a subroutine. The
original array gets its reference count decremented when the routine exits
but the slice will still hold one. When the slice is deleted all the
allocated memory will get garbage collected.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070807/0b60ad69/attachment.html>

From john at saponara.net  Tue Aug  7 22:50:05 2007
From: john at saponara.net (john saponara)
Date: Tue, 07 Aug 2007 22:50:05 -0400
Subject: [Numpy-discussion] spurious IndexError?
Message-ID: <46B92F5D.1010600@saponara.net>

Using numpy-1.0.2/python-2.5/winxp pro sp2:  in the following, the only 
array is 'a', and I'm not using it as an index, so why do I get the 
IndexError below?

--- start python session ---
 >>> a=array([[1,3],[2,4]])
 >>> a
array([[1, 3],
        [2, 4]])
 >>> f=lambda i,j: a[i,j]
 >>> f(1,1)
4
 >>> fromfunction(f,(2,2))
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "C:\Python25\Lib\site-packages\numpy\core\numeric.py", line 514, 
in fromfunction
	return function(*args,**kwargs)
   File "<stdin>", line 1, in <lambda>
IndexError: arrays used as indices must be of integer (or boolean) type
--- end python session ---

The upstream maple is written in 'fromfunction' style, and I have no 
control over that but want to port it to python in the most natural way 
possible.

The session suggests that lambda has no trouble with an array, so the 
problem seems to be related to the way 'fromfunction' works.  What am I 
missing?

Thanks!


From robert.kern at gmail.com  Tue Aug  7 23:04:12 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 07 Aug 2007 22:04:12 -0500
Subject: [Numpy-discussion] spurious IndexError?
In-Reply-To: <46B92F5D.1010600@saponara.net>
References: <46B92F5D.1010600@saponara.net>
Message-ID: <46B932AC.80206@gmail.com>

john saponara wrote:
> Using numpy-1.0.2/python-2.5/winxp pro sp2:  in the following, the only 
> array is 'a', and I'm not using it as an index, so why do I get the 
> IndexError below?
> 
> --- start python session ---
>  >>> a=array([[1,3],[2,4]])
>  >>> a
> array([[1, 3],
>         [2, 4]])
>  >>> f=lambda i,j: a[i,j]
>  >>> f(1,1)
> 4
>  >>> fromfunction(f,(2,2))
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "C:\Python25\Lib\site-packages\numpy\core\numeric.py", line 514, 
> in fromfunction
> 	return function(*args,**kwargs)
>    File "<stdin>", line 1, in <lambda>
> IndexError: arrays used as indices must be of integer (or boolean) type
> --- end python session ---
> 
> The upstream maple is written in 'fromfunction' style, and I have no 
> control over that but want to port it to python in the most natural way 
> possible.
> 
> The session suggests that lambda has no trouble with an array, so the 
> problem seems to be related to the way 'fromfunction' works.  What am I 
> missing?

fromfunction() takes the (2, 2) and forms arrays of indices. It then calls your
function with those arrays as arguments. It does not loop. The default dtype of
these arrays is float, not int. You must use "dtype=int" in your call to
fromfunction().

def fromfunction(function, shape, **kwargs):
    """Returns an array constructed by calling a function on a tuple of number
    grids.

    The function should accept as many arguments as the length of shape and
    work on array inputs.  The shape argument is a sequence of numbers
    indicating the length of the desired output for each axis.

    The function can also accept keyword arguments (except dtype), which will
    be passed through fromfunction to the function itself.  The dtype argument
    (default float) determines the data-type of the index grid passed to the
    function.
    """

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From gerard.vermeulen at grenoble.cnrs.fr  Wed Aug  8 01:36:44 2007
From: gerard.vermeulen at grenoble.cnrs.fr (Gerard Vermeulen)
Date: Wed, 8 Aug 2007 07:36:44 +0200
Subject: [Numpy-discussion] ANN: PyQwt3D-0.1.6 released
Message-ID: <20070808073644.265d3e2d@zombie.grenoble.cnrs.fr>

What is PyQwt3D ( http://pyqwt3d.sourceforge.net) ?

- it is a set of Python bindings for the QwtPlot3D C++ class library
  which extends the Qt framework with widgets for 3D data visualization.
  PyQwt3D inherits the snappy feel from QwtPlot3D.
  The examples at http://pyqwt.sourceforge.net/pyqwt3d-examples.html
  show how easy it is to make a 3D plot and how to save a 3D plot to
  an image or an (E)PS/PDF/PGF/SVG file.

- it requires and extends PyQt, a set of Python bindings for Qt.

- it supports the use of PyQt, Qt, QwtPlot3D, and NumPy or SciPy in a
  GUI Python application or in an interactive Python session.

- it runs on POSIX, Mac OS X and Windows platforms (practically any
  platform supported by Qt and Python).

- it is licensed under the GPL with an exception to allow dynamic linking
  with non-free releases of Qt and PyQt. 

The home page of PyQwt3D is http://pyqwt.sourceforge.net.

PyQwt3D-0.1.6 is a bug fix release:
- Improved text display on screen and in pixmaps with Qt-4 and X (requires
  the use of the patched QwtPlot3D-0.2.7 library included in PyQwt3D).


PyQwt3D-0.1.6 supports:
1. Python-2.5, or -2.4.
2. PyQt-4.3, -4.2, -4.1, or -3.17.
3. SIP-4.7, -4.6, or -4.5.
4. Qt-4.3, -4.2, Qt-3.3, or -3.2.
5. QwtPlot3D-0.2.7.


Enjoy -- Gerard Vermeulen


From lbolla at gmail.com  Wed Aug  8 03:35:32 2007
From: lbolla at gmail.com (lorenzo bolla)
Date: Wed, 8 Aug 2007 09:35:32 +0200
Subject: [Numpy-discussion] numpy installation problem
In-Reply-To: <7fd38bfa0707301818kd280ce2vdb1e1cb0b0a23111@mail.gmail.com>
References: <7fd38bfa0707301818kd280ce2vdb1e1cb0b0a23111@mail.gmail.com>
Message-ID: <80c99e790708080035m57c2e186s760d07d8adf24c45@mail.gmail.com>

sorry for the silly question: have you done
"python setup.py install"
from the numpy src directory, after untarring?
then cd out from the src directory and try to import numpy from python.
L.


On 7/31/07, kingshuk ghosh <sfkings at gmail.com> wrote:
>
> Hi,
> I downloaded numpy1.0.3-2.tar and unzipped and untared.
> However somehow new numpy does not work. It invokes
> the old numpy 0.9.6 when i import numpy from python
> and type in numpy.version.version .
> I tried to change path and once I do that and when I do
> import numpy it says
> "running from source directory" and then if I try
> numpy.version.version it gives some error.
>
> Is there something obvious I am missing after unzipping
> and untaring the numpy source file ? For example do I need
> to do something to install the new numpy1.0.3 ?
>
> Or do I also need to download full python package ?
> I am trying to run this on Red Hat Linux 3.2.2-5 which
> has a gcc 3.2.2 and the version of python is 2.4 .
>
> Any help will be greatly appreciated.
>
> Cheers
> Kings
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/eb0306a6/attachment.html>

From stefan at sun.ac.za  Wed Aug  8 05:53:30 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 8 Aug 2007 11:53:30 +0200
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
Message-ID: <20070808095330.GO30988@mentat.za.net>

On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote:
> Well, it can be done in Python: just allocate a too-big ndarray and
> take a slice that's the right shape and has the right alignment. But
> this sucks.

Could you explain to me why is this such a bad idea?

St?fan


From markbak at gmail.com  Wed Aug  8 06:44:56 2007
From: markbak at gmail.com (mark)
Date: Wed, 08 Aug 2007 10:44:56 -0000
Subject: [Numpy-discussion] simple slicing question
Message-ID: <1186569896.722070.91690@g4g2000hsf.googlegroups.com>

Consider the array d:

d = linspace( -10, 10, 10 )

If I want to multiply every value above -5 by 100 I can do

d[ d>-5 ] *= 100

But what if I want to multiply every value between -5 and +5 by 100.
This does NOT work:

d[ d>-5 and d<5 ] *= 100

Any ideas?

Thanks, Mark


From kwgoodman at gmail.com  Wed Aug  8 06:53:07 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Wed, 8 Aug 2007 12:53:07 +0200
Subject: [Numpy-discussion] simple slicing question
In-Reply-To: <1186569896.722070.91690@g4g2000hsf.googlegroups.com>
References: <1186569896.722070.91690@g4g2000hsf.googlegroups.com>
Message-ID: <f4f93d420708080353p6ed7c886sfff49c20479da889@mail.gmail.com>

On 8/8/07, mark <markbak at gmail.com> wrote:
> But what if I want to multiply every value between -5 and +5 by 100.
> This does NOT work:
>
> d[ d>-5 and d<5 ] *= 100

 d[(d>-5) & (d<5)] *= 100


From markbak at gmail.com  Wed Aug  8 07:07:15 2007
From: markbak at gmail.com (mark)
Date: Wed, 08 Aug 2007 11:07:15 -0000
Subject: [Numpy-discussion] simple slicing question
In-Reply-To: <f4f93d420708080353p6ed7c886sfff49c20479da889@mail.gmail.com>
References: <1186569896.722070.91690@g4g2000hsf.googlegroups.com>
	<f4f93d420708080353p6ed7c886sfff49c20479da889@mail.gmail.com>
Message-ID: <1186571235.266377.136330@22g2000hsm.googlegroups.com>

Life is so simple.
Thanks Keith,
Mark

On Aug 8, 12:53 pm, "Keith Goodman" <kwgood... at gmail.com> wrote:
> On 8/8/07, mark <mark... at gmail.com> wrote:
>
> > But what if I want to multiply every value between -5 and +5 by 100.
> > This does NOT work:
>
> > d[ d>-5 and d<5 ] *= 100
>
>  d[(d>-5) & (d<5)] *= 100
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From peridot.faceted at gmail.com  Wed Aug  8 11:29:55 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 8 Aug 2007 11:29:55 -0400
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <20070808095330.GO30988@mentat.za.net>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
Message-ID: <ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>

On 08/08/2007, Stefan van der Walt <stefan at sun.ac.za> wrote:
> On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote:
> > Well, it can be done in Python: just allocate a too-big ndarray and
> > take a slice that's the right shape and has the right alignment. But
> > this sucks.
>
> Could you explain to me why is this such a bad idea?

Oh. Well, it's not *terrible*; it gets you an aligned array. But you
have to allocate the original array as a 1D byte array (to allow for
arbitrary realignments) and then align it, reshape it, and reinterpret
it as a new type. Plus you're allocating an extra ndarray structure,
which will live as long as the new array does; this not only wastes
even more memory than the portable alignment solutions, it clogs up
python's garbage collector.

It's not outrageous, if you need aligned arrays *now*, on a released
version of numpy, but numpy itself should do better.

Anne


From markbak at gmail.com  Wed Aug  8 11:37:09 2007
From: markbak at gmail.com (mark)
Date: Wed, 08 Aug 2007 15:37:09 -0000
Subject: [Numpy-discussion] vectorized function inside a class
Message-ID: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>

I am trying to figure out a way to define a vectorized function inside
a class.
This is what I tried:

class test:
	def __init__(self):
		self.x = 3.0
	def func(self,y):
		rv = self.x
		if y > self.x: rv = y
		return rv
	f = vectorize(func)


>>> m = test()
>>> m.f( m, [-20,4,6] )
array([ 3.,  4.,  6.])

But as you can see, I can only call the m.f function when I also pass
it the instance m again.
I really want to call it as
m.f( [-20,4,6] )
But then I get an error
ValueError: mismatch between python function inputs and received
arguments

Any ideas how to do this better?

Mark


From stefan at sun.ac.za  Wed Aug  8 11:50:00 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 8 Aug 2007 17:50:00 +0200
Subject: [Numpy-discussion] vectorized function inside a class
In-Reply-To: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
Message-ID: <20070808155000.GC29100@mentat.za.net>

Hi Mark

On Wed, Aug 08, 2007 at 03:37:09PM -0000, mark wrote:
> I am trying to figure out a way to define a vectorized function inside
> a class.
> This is what I tried:
> 
> class test:
> 	def __init__(self):
> 		self.x = 3.0
> 	def func(self,y):
> 		rv = self.x
> 		if y > self.x: rv = y
> 		return rv
> 	f = vectorize(func)
> 
> 
> >>> m = test()
> >>> m.f( m, [-20,4,6] )
> array([ 3.,  4.,  6.])

Maybe you don't need to use vectorize.  How about

def func(self,y):
    y = y.copy()
    y[y <= self.x] = self.x
    return y

Cheers
St?fan


From tim.hochberg at ieee.org  Wed Aug  8 11:54:18 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Wed, 8 Aug 2007 08:54:18 -0700
Subject: [Numpy-discussion] vectorized function inside a class
In-Reply-To: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
Message-ID: <e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>

On 8/8/07, mark <markbak at gmail.com> wrote:
>
> I am trying to figure out a way to define a vectorized function inside
> a class.
> This is what I tried:
>
> class test:
>         def __init__(self):
>                 self.x = 3.0
>         def func(self,y):
>                 rv = self.x
>                 if y > self.x: rv = y
>                 return rv
>         f = vectorize(func)
>
>
> >>> m = test()
> >>> m.f( m, [-20,4,6] )
> array([ 3.,  4.,  6.])
>
> But as you can see, I can only call the m.f function when I also pass
> it the instance m again.
> I really want to call it as
> m.f( [-20,4,6] )
> But then I get an error
> ValueError: mismatch between python function inputs and received
> arguments
>
> Any ideas how to do this better?


Don't use vectorize? Something like:

def f(self,y):
    return np.where(y > self.x, y, self.x)

You could also use vectorize by wrapping the result in a real method like
this:

            _f = vectorize(func)
            def f(self, y):
               return self._f(self, y)

That seems kind of silly in this instance though.

-tim


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/7a277262/attachment.html>

From charlesr.harris at gmail.com  Wed Aug  8 12:04:27 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 8 Aug 2007 10:04:27 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
Message-ID: <e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>

On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 08/08/2007, Stefan van der Walt <stefan at sun.ac.za> wrote:
> > On Tue, Aug 07, 2007 at 01:33:24AM -0400, Anne Archibald wrote:
> > > Well, it can be done in Python: just allocate a too-big ndarray and
> > > take a slice that's the right shape and has the right alignment. But
> > > this sucks.
> >
> > Could you explain to me why is this such a bad idea?
>
> Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> have to allocate the original array as a 1D byte array (to allow for
> arbitrary realignments) and then align it, reshape it, and reinterpret
> it as a new type. Plus you're allocating an extra ndarray structure,
> which will live as long as the new array does; this not only wastes
> even more memory than the portable alignment solutions, it clogs up
> python's garbage collector.


The ndarray structure doesn't take up much memory, it is the data that is
large and the data is shared between the original array and the slice. Nor
does the data type of the slice need changing, one simply uses the desired
type to begin with, or at least a type of the right size so that a view will
do the job without copies. Nor do I see how the garbage collector will get
clogged up, slices are a common feature of using numpy. The slice method
also has the advantage of being compiler and operating system independent,
there is a reason Intel used that approach.

Aligning multidimensional arrays might indeed be complicated, but I suspect
those complications will be easier to handle in Python than in C.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/29a9b388/attachment.html>

From stefan at sun.ac.za  Wed Aug  8 12:08:19 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 8 Aug 2007 18:08:19 +0200
Subject: [Numpy-discussion] vectorized function inside a class
In-Reply-To: <e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>
References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
	<e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>
Message-ID: <20070808160819.GD29100@mentat.za.net>

On Wed, Aug 08, 2007 at 08:54:18AM -0700, Timothy Hochberg wrote:
> Don't use vectorize? Something like:
> 
> def f(self,y):
>     return np.where(y > self.x, y, self.x)

A one-liner, cool.  Benchmarks on some other methods:

Method 1: N.where

100 loops, best of 3: 9.32 ms per loop

Method 2: N.clip

10000000 loops, best of 3: 112 ns per loop

100 loops, best of 3: 3.33 ms per loop

Method 3: N.putmask

100 loops, best of 3: 5.95 ms per loop

Method 4: fancy indexing

100 loops, best of 3: 5.09 ms per loop

Cheers
St?fan


From mpmusu at cc.usu.edu  Wed Aug  8 12:26:24 2007
From: mpmusu at cc.usu.edu (Mark.Miller)
Date: Wed, 08 Aug 2007 10:26:24 -0600
Subject: [Numpy-discussion] Count the occurrence of a certain integer in a
 list of integers
Message-ID: <46B9EEB0.40105@cc.usu.edu>

A late entry, but here's something that gets you an array of counts for 
each unique integer:

 >>> data = numpy.array([9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 
8, 11, 9, 6, 7, 10, 9, 7, 9, 7, 8, 9, 8, 7, 9])
 >>> unique=numpy.unique(data)
 >>> unique
array([ 6,  7,  8,  9, 10, 11])
 >>> histo=numpy.histogram(data,unique)
 >>> histo
(array([ 4,  7,  4, 12,  2,  1]), array([ 6,  7,  8,  9, 10, 11]))
 >>>

So histo[0] includes the counts of each integer in data.

-Mark


2007/8/7, Nils Wagner <nwagner at iam.uni-stuttgart.de>:
 >
 > Hi all,
 >
 > I have a list of integer numbers. The entries can vary between 0 and 19.
 > How can I count the occurrence of any number. Consider
 >
 > >>> data
 > [9, 6, 9, 6, 7, 9, 9, 10, 7, 9, 9, 6, 7, 9, 8, 8, 11, 9, 6, 7, 10, 9, 7,
 > 9, 7, 8, 9, 8, 7, 9]
 >
 >
 > Is there a better way than using, e.g.
 >
 > >>> shape(where(array(data)==10))[1]
 > 2
 >
 >
 > to compute the occurrence of 10 in the list which is 2 in this case ?
 >
 > Nils


From peridot.faceted at gmail.com  Wed Aug  8 14:23:44 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 8 Aug 2007 14:23:44 -0400
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
Message-ID: <ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>

On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
> On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> > have to allocate the original array as a 1D byte array (to allow for
> > arbitrary realignments) and then align it, reshape it, and reinterpret
> > it as a new type. Plus you're allocating an extra ndarray structure,
> > which will live as long as the new array does; this not only wastes
> > even more memory than the portable alignment solutions, it clogs up
> > python's garbage collector.
>
> The ndarray structure doesn't take up much memory, it is the data that is
> large and the data is shared between the original array and the slice. Nor
> does the data type of the slice need changing, one simply uses the desired
> type to begin with, or at least a type of the right size so that a view will
> do the job without copies. Nor do I see how the garbage collector will get
> clogged up, slices are a common feature of using numpy. The slice method
> also has the advantage of being compiler and operating system independent,
> there is a reason Intel used that approach.
>
> Aligning multidimensional arrays might indeed be complicated, but I suspect
> those complications will be easier to handle in Python than in C.

Can we assume that numpy arrays allocated to contain (say) complex64s
are aligned to a 16-byte boundary? I don't think they will
necessarily, so the shift we need may not be an integer number of
complex64s. float96s pose even more problems. So to ensure alignment,
we do need to do type conversion; if we're doing it anyway, byte
arrays require the least trust in malloc().

The ndarray object isn't too big, probably some twenty or thirty
bytes, so I'm not talking about a huge waste. But it is a python
object, and the garbage collector needs to walk the whole tree of
accessible python objects every time it runs, so this is one more
object on the list.

As an aside: numpy's handling of ndarray objects is actually not
ideal; if you want to exhaust memory on your system, do:

a = arange(5)
while True:
    a = a[::-1]

Each ndarray object keeps alive the ndarray object it is a slice of,
so this operation creates an ever-growing linked list of ndarray
objects. Seems to me it would be better to keep a pointer only to the
original object that holds the address of the buffer (so it can be
freed).

Aligning multidimensional arrays is an interesting question. To first
order, aligning the first element should be enough. If the dimensions
of the array are not divisible by the alignment, though, this means
that lower-dimensional complete slices may not be aligned:

A = aligned_empty((7,5),dtype=float,alignment=16)

Then A is aligned, as is A[0,:], but A[1,:] is not.

So in this case we might want to actually allocate an 8-by-5 array and
take a slice. This does mean it won't be contiguous in memory, so that
flattening it requires a copy (which may not wind up aligned). This is
something we might want to do - that is, make available as an option -
in python.

Anne


From charlesr.harris at gmail.com  Wed Aug  8 14:58:13 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 8 Aug 2007 12:58:13 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
Message-ID: <e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>

Anne,

On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > > Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> > > have to allocate the original array as a 1D byte array (to allow for
> > > arbitrary realignments) and then align it, reshape it, and reinterpret
> > > it as a new type. Plus you're allocating an extra ndarray structure,
> > > which will live as long as the new array does; this not only wastes
> > > even more memory than the portable alignment solutions, it clogs up
> > > python's garbage collector.
> >
> > The ndarray structure doesn't take up much memory, it is the data that
> is
> > large and the data is shared between the original array and the slice.
> Nor
> > does the data type of the slice need changing, one simply uses the
> desired
> > type to begin with, or at least a type of the right size so that a view
> will
> > do the job without copies. Nor do I see how the garbage collector will
> get
> > clogged up, slices are a common feature of using numpy. The slice method
> > also has the advantage of being compiler and operating system
> independent,
> > there is a reason Intel used that approach.
> >
> > Aligning multidimensional arrays might indeed be complicated, but I
> suspect
> > those complications will be easier to handle in Python than in C.
>
> Can we assume that numpy arrays allocated to contain (say) complex64s
> are aligned to a 16-byte boundary? I don't think they will
> necessarily, so the shift we need may not be an integer number of
> complex64s. float96s pose even more problems. So to ensure alignment,
> we do need to do type conversion; if we're doing it anyway, byte
> arrays require the least trust in malloc().


I think that is a safe assumption, it is probably almost as safe as assuming
binary and two's complement, likely more safe than assuming ieee 784.  I
expect almost all 32 bit OS's to align on 4 byte boundaries at worst, 64 bit
machines to align on 8 byte boundaries. Even C structures are typically
filled out with blanks to preserve some sort of alignment. That is because
of addressing efficiency, or even the impossibility of odd addressing --
depends on the architecture. Sometimes even byte addressing is easier to get
by putting a larger integer on the bus and extracting the relevant part. In
addition, I expect the heap implementation to make some alignment decisions
for efficiency.

My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte
boundaries. It might be interesting to see what happens with the Intel and
MSVC comipilers, but I expect similar results. PPC's, Sun and SGI need to be
checked, but I don't expect problems. I think that will cover almost all
architectures numpy is likely to run on.


> The ndarray object isn't too big, probably some twenty or thirty
> bytes, so I'm not talking about a huge waste. But it is a python
> object, and the garbage collector needs to walk the whole tree of
> accessible python objects every time it runs, so this is one more
> object on the list.
>
> As an aside: numpy's handling of ndarray objects is actually not
> ideal; if you want to exhaust memory on your system, do:
>
> a = arange(5)
> while True:
>     a = a[::-1]


Well, that's a pathological case present in numpy. Fixing it doesn't seem to
be a high priority although there is a ticket somewhere.

Each ndarray object keeps alive the ndarray object it is a slice of,
> so this operation creates an ever-growing linked list of ndarray
> objects. Seems to me it would be better to keep a pointer only to the
> original object that holds the address of the buffer (so it can be
> freed).
>
> Aligning multidimensional arrays is an interesting question. To first
> order, aligning the first element should be enough. If the dimensions
> of the array are not divisible by the alignment, though, this means
> that lower-dimensional complete slices may not be aligned:
>
> A = aligned_empty((7,5),dtype=float,alignment=16)
>
> Then A is aligned, as is A[0,:], but A[1,:] is not.
>
> So in this case we might want to actually allocate an 8-by-5 array and
> take a slice. This does mean it won't be contiguous in memory, so that
> flattening it requires a copy (which may not wind up aligned). This is
> something we might want to do - that is, make available as an option -
> in python.


I think that is better viewed as need based. I suspect that if you really
need such alignment it is better to start with array dimensions that will
naturally align the rows. It will be impossible to naturally align all the
columnes unless the data type is the correct size.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/af32b181/attachment.html>

From matthieu.brucher at gmail.com  Wed Aug  8 15:01:59 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 8 Aug 2007 21:01:59 +0200
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
Message-ID: <e76aa17f0708081201n16dadb35o6bbd0c62e7845000@mail.gmail.com>

>
> My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte
> boundaries. It might be interesting to see what happens with the Intel and
> MSVC comipilers, but I expect similar results.
>

According to the doc on the msdn, the data should be 16-bits aligned.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/211d7243/attachment.html>

From charlesr.harris at gmail.com  Wed Aug  8 15:16:41 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 8 Aug 2007 13:16:41 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e76aa17f0708081201n16dadb35o6bbd0c62e7845000@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<e76aa17f0708081201n16dadb35o6bbd0c62e7845000@mail.gmail.com>
Message-ID: <e06186140708081216t22135e87l62c5a96d897dfc3f@mail.gmail.com>

On 8/8/07, Matthieu Brucher <matthieu.brucher at gmail.com> wrote:
>
> My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte
> > boundaries. It might be interesting to see what happens with the Intel and
> > MSVC comipilers, but I expect similar results.
> >
>
> According to the doc on the msdn, the data should be 16-bits aligned.


Shades of DOS and 16 bit machines. Have you checked what actually happens on
modern hardware?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070808/d81ef962/attachment.html>

From markbak at gmail.com  Wed Aug  8 16:38:32 2007
From: markbak at gmail.com (mark)
Date: Wed, 08 Aug 2007 20:38:32 -0000
Subject: [Numpy-discussion] vectorized function inside a class
In-Reply-To: <e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>
References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
	<e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>
Message-ID: <1186605512.397478.24150@22g2000hsm.googlegroups.com>

Thanks for the ideas to circumvent vectorization.
But the real function I need to vectorize is quite a bit more
complicated.
So I would really like to use vectorize.
Are there any reasons against vectorization? Is it slow?
The way Tim suggests I expect to be slow as there are two functions
calls.
Thanks,
Mark

On Aug 8, 5:54 pm, "Timothy Hochberg" <tim.hochb... at ieee.org> wrote:
> On 8/8/07, mark <mark... at gmail.com> wrote:
>
>
>
>
>
> > I am trying to figure out a way to define a vectorized function inside
> > a class.
> > This is what I tried:
>
> > class test:
> >         def __init__(self):
> >                 self.x = 3.0
> >         def func(self,y):
> >                 rv = self.x
> >                 if y > self.x: rv = y
> >                 return rv
> >         f = vectorize(func)
>
> > >>> m = test()
> > >>> m.f( m, [-20,4,6] )
> > array([ 3.,  4.,  6.])
>
> > But as you can see, I can only call the m.f function when I also pass
> > it the instance m again.
> > I really want to call it as
> > m.f( [-20,4,6] )
> > But then I get an error
> > ValueError: mismatch between python function inputs and received
> > arguments
>
> > Any ideas how to do this better?
>
> Don't use vectorize? Something like:
>
> def f(self,y):
>     return np.where(y > self.x, y, self.x)
>
> You could also use vectorize by wrapping the result in a real method like
> this:
>
>             _f = vectorize(func)
>             def f(self, y):
>                return self._f(self, y)
>
> That seems kind of silly in this instance though.
>
> -tim
>
> --
> .  __
> .   |-\
> .
> .  tim.hochb... at ieee.org
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From peridot.faceted at gmail.com  Wed Aug  8 16:53:28 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 8 Aug 2007 16:53:28 -0400
Subject: [Numpy-discussion] vectorized function inside a class
In-Reply-To: <1186605512.397478.24150@22g2000hsm.googlegroups.com>
References: <1186587429.134603.120980@q75g2000hsh.googlegroups.com>
	<e4412d6b0708080854o707c77e2uf862d40554e952d1@mail.gmail.com>
	<1186605512.397478.24150@22g2000hsm.googlegroups.com>
Message-ID: <ce557a360708081353q1249c83aifc175473f621449b@mail.gmail.com>

On 08/08/2007, mark <markbak at gmail.com> wrote:
> Thanks for the ideas to circumvent vectorization.
> But the real function I need to vectorize is quite a bit more
> complicated.
> So I would really like to use vectorize.
> Are there any reasons against vectorization? Is it slow?
> The way Tim suggests I expect to be slow as there are two functions
> calls.

vectorize() is just shorthand for a for loop, basically; it won't win
you anything on speed over looping yourself. It does win on
convenience, but if you can write your function to act on arrays it
will run much faster.

Anne


From john at saponara.net  Wed Aug  8 18:13:35 2007
From: john at saponara.net (john saponara)
Date: Wed, 08 Aug 2007 18:13:35 -0400
Subject: [Numpy-discussion] fromfunction question
Message-ID: <46BA400F.3070505@saponara.net>

Thinking I could use fromfunction to generate the x,y,z coordinates of a 
3D surface, I instead got separate arrays of x, y, and z coordinates (as 
I should have expected) and needed to use a nested listcomp to produce 
the unified array of 3D points:

x,y,z=fromfunction( lambda i,j: (xfun(i,j),yfun(i,j),zfun(i,j)), (1,3), 
dtype=int )
result=[[(a,b,c) for  a,b,c in zip(p,q,r)] for p,q,r in zip(x,y,z)]

Is it possible to compute the unified array of 3-tuples in a single 
step?  Below is working code.

Thanks.


--- start python session ---
r=array([[0,1],[1,0],[2,1]])
c=array([[0,1],[1,0],[2,1]])
p=array([[-1,0]])
rLen=len(r)
cLen=len(c)

# functions to compute x,y,z coordinates of 3d points (the exact 
expressions are not important)
def xfun(i,j):
     return r[j,0]

def yfun(i,j):
     return r[j,1]*p[i,0]+c[i+1,0]

def zfun(i,j):
     return r[j,1]*p[i,1]+c[i+1,1]

# the fromfunction and an extra step to arrange coordinates into 3-tuples
x,y,z=fromfunction( lambda i,j: (xfun(i,j),yfun(i,j),zfun(i,j)), (1,3), 
dtype=int )
result=[[(a,b,c) for  a,b,c in zip(p,q,r)] for p,q,r in zip(x,y,z)]

print result   # prints [[(0, 0, 0), (1, 1, 0), (2, 0, 0)]]
--- end python session ---


From david at ar.media.kyoto-u.ac.jp  Wed Aug  8 23:08:45 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 09 Aug 2007 12:08:45 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<e7ba66e40708061308q31b3f2a0ga04306f3e95f03d5@mail.gmail.com>	<46B7E9CF.1080005@ar.media.kyoto-u.ac.jp>	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>	<20070808095330.GO30988@mentat.za.net>	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
Message-ID: <46BA853D.90901@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
> Anne,
>
> On 8/8/07, *Anne Archibald* <peridot.faceted at gmail.com 
> <mailto:peridot.faceted at gmail.com>> wrote:
>
>     On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com
>     <mailto:charlesr.harris at gmail.com>> wrote:
>     >
>     >
>     > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com
>     <mailto:peridot.faceted at gmail.com>> wrote:
>     > > Oh. Well, it's not *terrible*; it gets you an aligned array.
>     But you
>     > > have to allocate the original array as a 1D byte array (to
>     allow for
>     > > arbitrary realignments) and then align it, reshape it, and
>     reinterpret
>     > > it as a new type. Plus you're allocating an extra ndarray
>     structure,
>     > > which will live as long as the new array does; this not only
>     wastes
>     > > even more memory than the portable alignment solutions, it
>     clogs up
>     > > python's garbage collector.
>     >
>     > The ndarray structure doesn't take up much memory, it is the
>     data that is
>     > large and the data is shared between the original array and the
>     slice. Nor
>     > does the data type of the slice need changing, one simply uses
>     the desired
>     > type to begin with, or at least a type of the right size so that
>     a view will
>     > do the job without copies. Nor do I see how the garbage
>     collector will get
>     > clogged up, slices are a common feature of using numpy. The
>     slice method
>     > also has the advantage of being compiler and operating system
>     independent,
>     > there is a reason Intel used that approach.
>
I am not sure to understand which approach to which problem you are 
talking about here ?

IMHO, the discussion is becoming a bit carried away. What I was 
suggesting is
    - being able to check whether a given data buffer is aligned to a 
given alignment (easy)
    - being able to request an aligned data buffer: requires aligned 
memory allocators, and some additions to the API for creating arrays.

This all boils down to the following case: I have a C function which 
requires N bytes aligned data, I want the numpy API to provide this 
capability. I don't understand the discussion on doing it in python: 
first, this means you cannot request a data buffer at the C level, and I 
don't understand the whole discussion on slice, multi dimension and so 
on either: at the C level, different libraries may need different arrays 
formats, and in the case of fftw, all it cares about is the alignment of 
the data pointer. For contiguous, C order arrays, as long as the data 
pointer is aligned, I don't think we need more; are some people familiar 
with the MKL, who could tell whether we need more ?

cheers,

David


From charlesr.harris at gmail.com  Thu Aug  9 03:17:28 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 9 Aug 2007 01:17:28 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46BA853D.90901@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
Message-ID: <e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>

On 8/8/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> Charles R Harris wrote:
> > Anne,
> >
> > On 8/8/07, *Anne Archibald* <peridot.faceted at gmail.com
> > <mailto:peridot.faceted at gmail.com>> wrote:
> >
> >     On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com
> >     <mailto:charlesr.harris at gmail.com>> wrote:
> >     >
> >     >
> >     > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com
> >     <mailto:peridot.faceted at gmail.com>> wrote:
> >     > > Oh. Well, it's not *terrible*; it gets you an aligned array.
> >     But you
> >     > > have to allocate the original array as a 1D byte array (to
> >     allow for
> >     > > arbitrary realignments) and then align it, reshape it, and
> >     reinterpret
> >     > > it as a new type. Plus you're allocating an extra ndarray
> >     structure,
> >     > > which will live as long as the new array does; this not only
> >     wastes
> >     > > even more memory than the portable alignment solutions, it
> >     clogs up
> >     > > python's garbage collector.
> >     >
> >     > The ndarray structure doesn't take up much memory, it is the
> >     data that is
> >     > large and the data is shared between the original array and the
> >     slice. Nor
> >     > does the data type of the slice need changing, one simply uses
> >     the desired
> >     > type to begin with, or at least a type of the right size so that
> >     a view will
> >     > do the job without copies. Nor do I see how the garbage
> >     collector will get
> >     > clogged up, slices are a common feature of using numpy. The
> >     slice method
> >     > also has the advantage of being compiler and operating system
> >     independent,
> >     > there is a reason Intel used that approach.
> >
> I am not sure to understand which approach to which problem you are
> talking about here ?
>
> IMHO, the discussion is becoming a bit carried away. What I was
> suggesting is
>     - being able to check whether a given data buffer is aligned to a
> given alignment (easy)
>     - being able to request an aligned data buffer: requires aligned
> memory allocators, and some additions to the API for creating arrays.
>
> This all boils down to the following case: I have a C function which
> requires N bytes aligned data, I want the numpy API to provide this
> capability. I don't understand the discussion on doing it in python:


Well, what you want might be very easy to do in python, we just need to
check the default alignments for doubles and floats for some of the other
compilers, architectures, and OS's out there. On the other hand, you might
not be able to request a c malloc that is aligned in a portable way without
resorting to the same tricks as you do in python. So why not use python and
get the reference counting and garbage collection along with it? What we
want are doubles 8 byte aligned and floats 4 byte aligned. That seems to be
the case with gcc, linux, and the Intel architecture. The idea is to create
a slightly oversize array, then use a slice of the proper size that is 16
byte aligned.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070809/5752cba8/attachment.html>

From david at ar.media.kyoto-u.ac.jp  Thu Aug  9 03:52:38 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 09 Aug 2007 16:52:38 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<ce557a360708062149i4618c39cv2339c5c4b9bcb2c@mail.gmail.com>	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>	<20070808095330.GO30988@mentat.za.net>	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
Message-ID: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
>
> Well, what you want might be very easy to do in python, we just need 
> to check the default alignments for doubles and floats for some of the 
> other compilers, architectures, and OS's out there. On the other hand, 
> you might not be able to request a c malloc that is aligned in a 
> portable way without resorting to the same tricks as you do in python. 
> So why not use python and get the reference counting and garbage 
> collection along with it?
First, doing it in python means that I cannot use the facility from C 
easily. But this is exactly where I need it, and where I would guess 
most people need it. People want to interface numpy with the mkl ? They 
will do it in C, right ? And maybe I am just too dumb to see the 
problem, but I don't see the need for garbage collection and so on :) 
Again, what is needed is:
    - aligned allocator -> we can use the one from Steven Johnson, used 
in fftw, which support more or less the same archs than numpy
    - Refactor the array creation functions in C such as the 
implementation takes one additional alignement argument, and the 
original functions are kept identical to before
    - Add a few utilities function to check whether it is SSE aligned, 
arbitrary aligned, etc...

The only non trivial point is 2 . Actually, when I first thought about 
it, I thought about fixing alignement at compile time, which would have 
made it totally avoidable: it would have been a simple change of the 
definition of PyDataMem_New to an aligned malloc with a constant. I have 
already the code for this, and besides aligned malloc code, it is like a 
5 lines change of numpy code, nothing terrible, really.

David


From charlesr.harris at gmail.com  Thu Aug  9 03:55:50 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 9 Aug 2007 01:55:50 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
Message-ID: <e06186140708090055ieed6342h5bb330241e1d9588@mail.gmail.com>

On 8/9/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 8/8/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> >
> > Charles R Harris wrote:
> > > Anne,
> > >
> > > On 8/8/07, *Anne Archibald* <peridot.faceted at gmail.com
> > > <mailto: peridot.faceted at gmail.com>> wrote:
> > >
> > >     On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com
> > >     <mailto: charlesr.harris at gmail.com>> wrote:
> > >     >
> > >     >
> > >     > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com
> > >     <mailto: peridot.faceted at gmail.com>> wrote:
> > >     > > Oh. Well, it's not *terrible*; it gets you an aligned array.
> > >     But you
> > >     > > have to allocate the original array as a 1D byte array (to
> > >     allow for
> > >     > > arbitrary realignments) and then align it, reshape it, and
> > >     reinterpret
> > >     > > it as a new type. Plus you're allocating an extra ndarray
> > >     structure,
> > >     > > which will live as long as the new array does; this not only
> > >     wastes
> > >     > > even more memory than the portable alignment solutions, it
> > >     clogs up
> > >     > > python's garbage collector.
> > >     >
> > >     > The ndarray structure doesn't take up much memory, it is the
> > >     data that is
> > >     > large and the data is shared between the original array and the
> > >     slice. Nor
> > >     > does the data type of the slice need changing, one simply uses
> > >     the desired
> > >     > type to begin with, or at least a type of the right size so that
> > >     a view will
> > >     > do the job without copies. Nor do I see how the garbage
> > >     collector will get
> > >     > clogged up, slices are a common feature of using numpy. The
> > >     slice method
> > >     > also has the advantage of being compiler and operating system
> > >     independent,
> > >     > there is a reason Intel used that approach.
> > >
> > I am not sure to understand which approach to which problem you are
> > talking about here ?
> >
> > IMHO, the discussion is becoming a bit carried away. What I was
> > suggesting is
> >     - being able to check whether a given data buffer is aligned to a
> > given alignment (easy)
> >     - being able to request an aligned data buffer: requires aligned
> > memory allocators, and some additions to the API for creating arrays.
> >
> > This all boils down to the following case: I have a C function which
> > requires N bytes aligned data, I want the numpy API to provide this
> > capability. I don't understand the discussion on doing it in python:
>
>
> Well, what you want might be very easy to do in python, we just need to
> check the default alignments for doubles and floats for some of the other
> compilers, architectures, and OS's out there. On the other hand, you might
> not be able to request a c malloc that is aligned in a portable way without
> resorting to the same tricks as you do in python. So why not use python and
> get the reference counting and garbage collection along with it? What we
> want are doubles 8 byte aligned and floats 4 byte aligned. That seems to be
> the case with gcc, linux, and the Intel architecture. The idea is to create
> a slightly oversize array, then use a slice of the proper size that is 16
> byte aligned.
>
> Chuck
>

For instance, in the case of  linux-x86 and linux-x86_64, the following
should work:

In [68]: def align16(n,dtype=float64) :
   ....:     size = dtype().dtype.itemsize
   ....:     over = 16/size
   ....:     data = empty(n + over, dtype=dtype)
   ....:     skip = (- data.ctypes.data % 16)/size
   ....:     return data[skip:skip + n]

Of course, now you need to fill in the data.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070809/c2ff6b5c/attachment.html>

From charlesr.harris at gmail.com  Thu Aug  9 04:05:03 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 9 Aug 2007 02:05:03 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
Message-ID: <e06186140708090105y7f25e412o24ac07df493b5b59@mail.gmail.com>

On 8/9/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> Charles R Harris wrote:
> >
> > Well, what you want might be very easy to do in python, we just need
> > to check the default alignments for doubles and floats for some of the
> > other compilers, architectures, and OS's out there. On the other hand,
> > you might not be able to request a c malloc that is aligned in a
> > portable way without resorting to the same tricks as you do in python.
> > So why not use python and get the reference counting and garbage
> > collection along with it?
> First, doing it in python means that I cannot use the facility from C
> easily. But this is exactly where I need it, and where I would guess
> most people need it. People want to interface numpy with the mkl ? They
> will do it in C, right ? And maybe I am just too dumb to see the
> problem, but I don't see the need for garbage collection and so on :)
> Again, what is needed is:
>     - aligned allocator -> we can use the one from Steven Johnson, used
> in fftw, which support more or less the same archs than numpy
>     - Refactor the array creation functions in C such as the
> implementation takes one additional alignement argument, and the
> original functions are kept identical to before
>     - Add a few utilities function to check whether it is SSE aligned,
> arbitrary aligned, etc...
>
> The only non trivial point is 2 . Actually, when I first thought about
> it, I thought about fixing alignement at compile time, which would have
> made it totally avoidable: it would have been a simple change of the
> definition of PyDataMem_New to an aligned malloc with a constant. I have
> already the code for this, and besides aligned malloc code, it is like a
> 5 lines change of numpy code, nothing terrible, really.


Ah, you want it in C. Well, I think it would not be too difficult to change
PyDataMem_New, however, the function signature would change and all the code
that used it would break. That is pretty drastic. Better to define
PyDataMem_New_Aligned, then redefine PyDataMem_New to use the new function.
That way nothing breaks and you get the function you need. I don't think
Travis would get upset if you added such a function and documented it.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070809/24e4e522/attachment.html>

From david at ar.media.kyoto-u.ac.jp  Thu Aug  9 04:26:12 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 09 Aug 2007 17:26:12 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708090105y7f25e412o24ac07df493b5b59@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>	<20070808095330.GO30988@mentat.za.net>	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>	<46BA853D.90901@ar.media.kyoto-u.ac.jp>	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
	<e06186140708090105y7f25e412o24ac07df493b5b59@mail.gmail.com>
Message-ID: <46BACFA4.5010707@ar.media.kyoto-u.ac.jp>

Charles R Harris wrote:
>
> Ah, you want it in C.
What would be the use to get SIMD aligned arrays in python ?

David


From charlesr.harris at gmail.com  Thu Aug  9 04:58:24 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 9 Aug 2007 02:58:24 -0600
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46BACFA4.5010707@ar.media.kyoto-u.ac.jp>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
	<e06186140708090105y7f25e412o24ac07df493b5b59@mail.gmail.com>
	<46BACFA4.5010707@ar.media.kyoto-u.ac.jp>
Message-ID: <e06186140708090158h62267548uf27c4572dd1a494@mail.gmail.com>

On 8/9/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
>
> Charles R Harris wrote:
> >
> > Ah, you want it in C.
> What would be the use to get SIMD aligned arrays in python ?


If I wanted a fairly specialized routine and didn't want to touch the guts
of numpy, I would pass the aligned array to a C function and use the data
pointer. The python code would be just a high level wrapper. You might even
be able to use ctypes to pass the pointer into a library function. It's not
necessary to code everything in C using the python C API.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070809/958430d7/attachment.html>

From stefan at sun.ac.za  Thu Aug  9 05:40:23 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Thu, 9 Aug 2007 11:40:23 +0200
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
References: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<ce557a360708062233p5c4cdf4fw5bcf18ed4a199533@mail.gmail.com>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
Message-ID: <20070809094023.GI9452@mentat.za.net>

On Thu, Aug 09, 2007 at 04:52:38PM +0900, David Cournapeau wrote:
> Charles R Harris wrote:
> >
> > Well, what you want might be very easy to do in python, we just need 
> > to check the default alignments for doubles and floats for some of the 
> > other compilers, architectures, and OS's out there. On the other hand, 
> > you might not be able to request a c malloc that is aligned in a 
> > portable way without resorting to the same tricks as you do in python. 
> > So why not use python and get the reference counting and garbage 
> > collection along with it?
> First, doing it in python means that I cannot use the facility from C 
> easily. But this is exactly where I need it, and where I would guess 
> most people need it. People want to interface numpy with the mkl ? They 
> will do it in C, right ?

It doesn't really matter where the memory allocation occurs, does it?
As far as I understand, the underlying fftw function has some flag to
indicate when the data is aligned.  If so, we could expose that flag
in Python, and do something like

x = align16(data)
_fft(x, is_aligned=True)

I am not intimately familiar with the fft wrappers, so maybe I'm
missing something more fundamental.

Cheers
St?fan


From millman at berkeley.edu  Thu Aug  9 07:29:29 2007
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 9 Aug 2007 04:29:29 -0700
Subject: [Numpy-discussion] I am volunteering to be the release manager for
	NumPy 1.0.3.1 and SciPy 0.5.2
Message-ID: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>

I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
0.5.3.  In order to actually get them both released I will obviously
need some help.  But given the amount of work required and the number
of people who have offered to help, I believe this will be doable.

Given the extensive discussion about what is needed for these
releases, I am fairly confident that I know what needs to be done.  I
will try to be very specific about what I will do and what I will need
help with.  Basically, I am just rewriting the plan described by
Robert Kern last month.  Please let me know if you have any
suggestions/comments/problems with this plan and please let me know if
you can commit to helping in any way.

[[NOTE:  I just (on Monday) hired 2 full-time programmers to work on
the neuroimaging in python (NIPY) project, so they will be able to
help out with bug fixing as well as testing the pre-releases on
different platforms.]]

Releasing NumPy 1.0.3.1
===================
On July 24th, Robert suggested making a numpy 1.0.3.1 point release.
He was concerned that there were some changes in numpy.distutils that
needed to cook a little longer.  So I am offering to make a 1.0.3.1
release.  If Travis or one of the other core NumPy developers want to
make a 1.0.4 release in the next week or so, then there won't be a
need for a 1.0.3.1 release.

First, I will branch from the 1.0.3 tag:
svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3
http://svn.scipy.org/svn/numpy/branches/1.0.3

Second, I will apply all the patches necessary to build scipy from
svn, but nothing else.  Then I will just follow the NumPy release
instructions:  http://projects.scipy.org/scipy/numpy/wiki/MakingReleases
I will make the tarball and source rpm; but will need help with
everything else.  Things will go faster if someone else can build the
Windows binaries.  If not, my new programmers and I will make the
binaries.  Finally, one of the sourceforge admins will need upload
those files once we are done.  (I am happy to be made an admin and
upload the files myself, if it would be more convenient.)

Releasing SciPy 0.5.3
=================
I will make a 0.5.3 scipy branch:
svn cp http://svn.scipy.org/svn/scipy/trunk
http://svn.scipy.org/svn/scipy/branches/0.5.3

>From then on normal development will continue on the trunk, but only
bug fixes will be allowed on the branch.  I will ask everyone to test
the branch for at least 1 week depending on whether we get any bug
reports.  Once we are able to get the most serious bugs fixed, I will
start working with everyone to build as many binaries as possible. I
will rely on David Cournapeau and Andrew Straw to provide RPMs and
DEBs.  Again, things will go faster if someone else can build the
Windows binaries.  But if not, my new programmers and I will figure
out how to make the binaries for Windows.  We can also make the OS X
binaries especially if Robert Kern is stilling willing to help.

I will also draft a release announcement and give everyone time to
comment on it.  I will either need to get access to the sourceforge
site and the PyPi records or someone will have to update them for me.

Timeline
=======
If this is agreeable to everyone, I will make the NumPy branch on
Friday and apply the relevant patches.  Then if I can get someone else
to make the Windows executables and upload the files, we should be
able to have a new NumPy release before the beginning of the SciPy
conference.

As for the 0.5.3 SciPy branch, we can discuss this in some detail if
everyone is OK with the basic plan.

In general, I hope that I will be able to have a 1.0.3.1 NumPy release
before August 20th.  Perhaps we could even make the 0.5.3 branch by
the 20th.  Fortunately, as David said earlier the main issue is
getting a new release of NumPy out.

Resources
========
As I mentioned I just hired 2 full-time programmers to work on NIPY
who will be able to help me get the binaries built and tested for the
different platforms.  All 3 of us will be at the SciPy conference next
week.  So we will hopefully be able to solve whatever problems we run
into very quickly given that it will be so easy to get help.

Additionally, David Cournapeau has said that he is willing to help get
a new release of SciPy out.  He has already been busy at work
squashing bugs.

Sincerely,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


From steve at shrogers.com  Thu Aug  9 08:34:47 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Thu, 09 Aug 2007 06:34:47 -0600
Subject: [Numpy-discussion] APL2007 Update
Message-ID: <46BB09E7.7070208@shrogers.com>

Attached is an updated announcement for APL2007: Arrays and Objects.

21-23 October 2007
Montreal, Canada

APL = Array Programming Languages


-------------- next part --------------
A non-text attachment was scrubbed...
Name: APL2007Ann-2-1.pdf
Type: application/pdf
Size: 18084 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070809/bca1d5ca/attachment.pdf>

From nwagner at iam.uni-stuttgart.de  Thu Aug  9 09:47:04 2007
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 09 Aug 2007 15:47:04 +0200
Subject: [Numpy-discussion] Working with lists
Message-ID: <46BB1AD8.2090009@iam.uni-stuttgart.de>

Hi all,

I have a list e.g.
>>> bounds
[(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0),
(1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0),
(16.0, 18.0), (18.0, 20)]

How can I extract the first value of each pair given in parenthesis i.e.
1950,1800,1600,1400,... ?

Nils
 

From kwgoodman at gmail.com  Thu Aug  9 09:53:00 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 9 Aug 2007 15:53:00 +0200
Subject: [Numpy-discussion] Working with lists
In-Reply-To: <46BB1AD8.2090009@iam.uni-stuttgart.de>
References: <46BB1AD8.2090009@iam.uni-stuttgart.de>
Message-ID: <f4f93d420708090653r6e5f7723t2f0f991c30dd49f9@mail.gmail.com>

On 8/9/07, Nils Wagner <nwagner at iam.uni-stuttgart.de> wrote:
> [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0),
> (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0),
> (16.0, 18.0), (18.0, 20)]
>
> How can I extract the first value of each pair given in parenthesis i.e.
> 1950,1800,1600,1400,... ?

Here's one way:

[z[0] for z in bounds]


From lorrmann at physik.uni-wuerzburg.de  Thu Aug  9 09:54:21 2007
From: lorrmann at physik.uni-wuerzburg.de (volker)
Date: Thu, 9 Aug 2007 13:54:21 +0000 (UTC)
Subject: [Numpy-discussion] Working with lists
References: <46BB1AD8.2090009@iam.uni-stuttgart.de>
Message-ID: <loom.20070809T155209-775@post.gmane.org>

Nils Wagner <nwagner <at> iam.uni-stuttgart.de> writes:

> 
> Hi all,
> 
> I have a list e.g.
> >>> bounds
> [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0),
> (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0),
> (16.0, 18.0), (18.0, 20)]
> 
> How can I extract the first value of each pair given in parenthesis i.e.
> 1950,1800,1600,1400,... ?
> 
> Nils
> 

Its easy i think:

bounds_0 = (array(bounds)[:,0]).tolist()


volker


From gruben at bigpond.net.au  Thu Aug  9 10:06:49 2007
From: gruben at bigpond.net.au (Gary Ruben)
Date: Fri, 10 Aug 2007 00:06:49 +1000
Subject: [Numpy-discussion] Working with lists
In-Reply-To: <f4f93d420708090653r6e5f7723t2f0f991c30dd49f9@mail.gmail.com>
References: <46BB1AD8.2090009@iam.uni-stuttgart.de>
	<f4f93d420708090653r6e5f7723t2f0f991c30dd49f9@mail.gmail.com>
Message-ID: <46BB1F79.2040204@bigpond.net.au>

FWIW,
The list comprehension is faster than using map()

In [7]: %timeit map(lambda x:x[0],bounds)
10000 loops, best of 3: 49.6 -?s per loop

In [8]: %timeit [x[0] for x in bounds]
10000 loops, best of 3: 20.8 -?s per loop

Gary R.

Keith Goodman wrote:
> On 8/9/07, Nils Wagner <nwagner at iam.uni-stuttgart.de> wrote:
>> [(1950.0, 2100.0), (1800.0, 1850.0), (1600.0, 1630.0), (1400.0, 1420.0),
>> (1200.0, 1210.0), (990, 1018.0), (10, 12), (12.0, 14.0), (14.0, 16.0),
>> (16.0, 18.0), (18.0, 20)]
>>
>> How can I extract the first value of each pair given in parenthesis i.e.
>> 1950,1800,1600,1400,... ?
> 
> Here's one way:
> 
> [z[0] for z in bounds]


From cournape at gmail.com  Thu Aug  9 11:00:21 2007
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 10 Aug 2007 00:00:21 +0900
Subject: [Numpy-discussion] Where to put misc C function in numpy ?
Message-ID: <5b8d13220708090800y1993abd4j74ddc64618b17873@mail.gmail.com>

Hi,

Following the thread on facilities for SIMD friendly allocations, I
have a basic private branch ready for review, but I have one problem:
where to put the allocation functions ? The problem is the following:
data buffers are allocated/deallocated with functions defined in
ndarrayobject,h

PyMemData_NEW(ptr) malloc(ptr)
...

Which I would replace with

PyMemData_NEW(ptr) npy_aligned_alloc(ptr, DEF_ALIGNMENT)

Where to define npy_aligned_alloc ? As PyMemData_NEW is used outside
numpy.core, the function needs to be available somewhat "publically",
but as far as I understand numpy code structure, there is no such
facility available (eg a pure C library, totally unaware of python,
which would contain some useful tools for numpy), right ?

David


From cournape at gmail.com  Thu Aug  9 11:03:31 2007
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 10 Aug 2007 00:03:31 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <e06186140708090158h62267548uf27c4572dd1a494@mail.gmail.com>
References: <46B2C602.7010205@ar.media.kyoto-u.ac.jp>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
	<e06186140708090105y7f25e412o24ac07df493b5b59@mail.gmail.com>
	<46BACFA4.5010707@ar.media.kyoto-u.ac.jp>
	<e06186140708090158h62267548uf27c4572dd1a494@mail.gmail.com>
Message-ID: <5b8d13220708090803g7c062b67he6735aa9f2d43d89@mail.gmail.com>

On 8/9/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
> On 8/9/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> > Charles R Harris wrote:
> > >
> > > Ah, you want it in C.
> > What would be the use to get SIMD aligned arrays in python ?
>
> If I wanted a fairly specialized routine and didn't want to touch the guts
> of numpy, I would pass the aligned array to a C function and use the data
> pointer. The python code would be just a high level wrapper. You might even
> be able to use ctypes to pass the pointer into a library function. It's not
> necessary to code everything in C using the python C API.
I certainly do not argue on this point. But if it was specialized,
there would be no point putting in in numpy in the first place. What I
hope is that at some point, the aligned allocators can be used inside
core numpy to optimize things internally (ufunc, etc...). Those
facilities would be really useful for many optimized libraries, which
are all C: as such, doing it in C makes sense, no ?

David


From cournape at gmail.com  Thu Aug  9 11:14:05 2007
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 10 Aug 2007 00:14:05 +0900
Subject: [Numpy-discussion] numpy arrays,
	data allocation and SIMD alignement
In-Reply-To: <20070809094023.GI9452@mentat.za.net>
References: <46B7FC64.7000109@ar.media.kyoto-u.ac.jp>
	<20070808095330.GO30988@mentat.za.net>
	<ce557a360708080829pb6d348am93d41e826882b1b5@mail.gmail.com>
	<e06186140708080904g659dfc00h805fd5d9d33929c4@mail.gmail.com>
	<ce557a360708081123o73881b27h6a769b31cf480b21@mail.gmail.com>
	<e06186140708081158m32f4f380t6a7b77fa7db880d9@mail.gmail.com>
	<46BA853D.90901@ar.media.kyoto-u.ac.jp>
	<e06186140708090017o360d4936x5f31044ee6137a4e@mail.gmail.com>
	<46BAC7C6.8030004@ar.media.kyoto-u.ac.jp>
	<20070809094023.GI9452@mentat.za.net>
Message-ID: <5b8d13220708090814k74a47c09h867b68d9b4fab19f@mail.gmail.com>

On 8/9/07, Stefan van der Walt <stefan at sun.ac.za> wrote:

>
> It doesn't really matter where the memory allocation occurs, does it?
> As far as I understand, the underlying fftw function has some flag to
> indicate when the data is aligned.  If so, we could expose that flag
> in Python, and do something like
>
> x = align16(data)
> _fft(x, is_aligned=True)
>
> I am not intimately familiar with the fft wrappers, so maybe I'm
> missing something more fundamental.

You can do that, but this is only a special case of what I have in
mind. For example, what if you want to call functions which are
relatively cheap, but called many times, and want an aligned array ?
Going back and forth would be a huge waste. Also, having aligned
buffers internally (in C_, even for non array data, can be useful (eg
filters, and maybe even core numpy functionalities like ufunc,
etc...). Another point I forgot to mention before is that we can
define a default alignment which would already be SIMD friendly (as
done on Mac OS X or FreeBSD by default malloc) for *all* numpy arrays
at 0 cost: for fft, this means that most arrays would already by as
wanted, meaning a huge boost of performances for free.

Basically, the functionalities would be more usable in C, without too
much constraint, because frankly, the implementation is not difficult:
I have something almost ready, and the patch is 7kb, including code to
detect platform dependent aligned allocator. The C code can be tested
really easily (since it is independent of python).

David


From kwgoodman at gmail.com  Thu Aug  9 12:28:42 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Thu, 9 Aug 2007 18:28:42 +0200
Subject: [Numpy-discussion] Working with lists
In-Reply-To: <46BB1F79.2040204@bigpond.net.au>
References: <46BB1AD8.2090009@iam.uni-stuttgart.de>
	<f4f93d420708090653r6e5f7723t2f0f991c30dd49f9@mail.gmail.com>
	<46BB1F79.2040204@bigpond.net.au>
Message-ID: <f4f93d420708090928x6586bbd7n832c906aef8f267@mail.gmail.com>

On 8/9/07, Gary Ruben <gruben at bigpond.net.au> wrote:
> FWIW,
> The list comprehension is faster than using map()
>
> In [7]: %timeit map(lambda x:x[0],bounds)
> 10000 loops, best of 3: 49.6 -?s per loop
>
> In [8]: %timeit [x[0] for x in bounds]
> 10000 loops, best of 3: 20.8 -?s per loop

zip is even faster on my computer:

>> timeit map(lambda x:x[0], bounds)
100000 loops, best of 3: 5.48 ?s per loop
>> timeit [x[0] for x in bounds]
100000 loops, best of 3: 2.69 ?s per loop
>> timeit a, b = zip(*bounds)
100000 loops, best of 3: 2.57 ?s per loop


From Chris.Barker at noaa.gov  Thu Aug  9 12:58:26 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 09 Aug 2007 09:58:26 -0700
Subject: [Numpy-discussion] I am volunteering to be the release manager
	for	NumPy 1.0.3.1 and SciPy 0.5.2
In-Reply-To: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
References: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
Message-ID: <46BB47B2.20601@noaa.gov>


Jarrod Millman wrote:
> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
> 0.5.3. 

Wonderful! Thanks.

> Releasing SciPy 0.5.3
> We can also make the OS X
> binaries especially if Robert Kern is stilling willing to help.

What form will these take? It would be great if we could have Universal 
binaries, with no dependencies (other than Python and numpy, of course) 
-- I think that is now possible, Robert would certainly know.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From paul at rudin.co.uk  Thu Aug  9 13:14:41 2007
From: paul at rudin.co.uk (Paul Rudin)
Date: Thu, 09 Aug 2007 18:14:41 +0100
Subject: [Numpy-discussion] Working with lists
References: <46BB1AD8.2090009@iam.uni-stuttgart.de>
	<f4f93d420708090653r6e5f7723t2f0f991c30dd49f9@mail.gmail.com>
	<46BB1F79.2040204@bigpond.net.au>
	<f4f93d420708090928x6586bbd7n832c906aef8f267@mail.gmail.com>
Message-ID: <87sl6s8wam.fsf@rudin.co.uk>

"Keith Goodman" <kwgoodman at gmail.com> writes:

> On 8/9/07, Gary Ruben <gruben at bigpond.net.au> wrote:
>> FWIW,
>> The list comprehension is faster than using map()
>>
>> In [7]: %timeit map(lambda x:x[0],bounds)
>> 10000 loops, best of 3: 49.6 -?s per loop
>>
>> In [8]: %timeit [x[0] for x in bounds]
>> 10000 loops, best of 3: 20.8 -?s per loop
>
> zip is even faster on my computer:
>
>>> timeit map(lambda x:x[0], bounds)
> 100000 loops, best of 3: 5.48 ?s per loop
>>> timeit [x[0] for x in bounds]
> 100000 loops, best of 3: 2.69 ?s per loop
>>> timeit a, b = zip(*bounds)
> 100000 loops, best of 3: 2.57 ?s per loop

itertools.izip is faster yet on mine.


From robert.kern at gmail.com  Thu Aug  9 14:07:36 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 09 Aug 2007 13:07:36 -0500
Subject: [Numpy-discussion] I am volunteering to be the release manager
 for	NumPy 1.0.3.1 and SciPy 0.5.2
In-Reply-To: <46BB47B2.20601@noaa.gov>
References: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
	<46BB47B2.20601@noaa.gov>
Message-ID: <46BB57E8.7060100@gmail.com>

Christopher Barker wrote:
> 
> Jarrod Millman wrote:
>> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
>> 0.5.3. 
> 
> Wonderful! Thanks.
> 
>> Releasing SciPy 0.5.3
>> We can also make the OS X
>> binaries especially if Robert Kern is stilling willing to help.
> 
> What form will these take? It would be great if we could have Universal 
> binaries, with no dependencies (other than Python and numpy, of course) 
> -- I think that is now possible, Robert would certainly know.

Yes.

http://mail.python.org/pipermail/pythonmac-sig/2007-June/018986.html

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From efiring at hawaii.edu  Thu Aug  9 14:12:00 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Thu, 09 Aug 2007 08:12:00 -1000
Subject: [Numpy-discussion] rant against from numpy import * /
 from	pylab import *
In-Reply-To: <bc657ead0708020320v73f87db8u79211f5056cef001@mail.gmail.com>
References: <bc657ead0703152157y59cd3bf9uff9d4b298697b970@mail.gmail.com>
	<45FA4377.6010201@hawaii.edu>
	<bc657ead0708020320v73f87db8u79211f5056cef001@mail.gmail.com>
Message-ID: <46BB58F0.60507@hawaii.edu>

Sebastian,

I am trying to move things in the direction of simpler and cleaner 
namespaces, but I think that to do it well requires a systematic 
approach to the continuing numpification of mpl, so I have been working 
on mlab.py before tackling pylab.  I hope everything can be done via 
reorganization, without requiring any import tricks, but that remains to 
be seen. I'm sorry this is taking a long time, but it is in the works.

Eric

Sebastian Haase wrote:
> Hi all,
> Here a quick update:
> I'm trying to have a concise / sparse module with containing only
> pylab-specific names and not all names I already have in numpy.
> To easy typing I want to call numpy "N" and my pylab "P".
> 
> I'm now using this code:
> <code snipplet for importing matplotlib>
>     import matplotlib, new
>     matplotlib.use('WXAgg')
>     from  matplotlib import pylab
>     P = new.module("pylab_sparse","""pylab module minus stuff alreay
> in numpy""")
>     for k,v in pylab.__dict__.iteritems():
>         try:
>            if v is N.__dict__[k]:
>                continue
>         except KeyError:
>            pass
>         P.__dict__[k] = v
> 
>     P.ion()
>     del matplotlib, new, pylab
> </code sniplet for importing matplotlib>
> 
> The result is "some" reduction in the number of non-pylab-specific
> names in my "P"-module. However there seem to be still many extra
> names left, like e.g.:
> alltrue, amax, array, ...
> look at this:
>     # 20070802
>     # >>> len(dir(pylab))
>     # 441
>     # >>> len(dir(P))
>     # 346
>     # >>> P.nx.numpy.__version__
>     # '1.0.1'
>     # >>> N.__version__
>     # '1.0.1'
>     # >>> N.alltrue
>     # <function alltrue at 0x01471B70>
>     # >>> P.alltrue
>     # <function alltrue at 0x019142F0>
>     # >>> N.alltrue.__doc__
>     # 'Perform a logical_and over the given axis.'
>     # >>> P.alltrue.__doc__
>     # >>> #N.alltrue(x, axis=None, out=None)
>     # >>> #P.alltrue(x, axis=0)
> 
> I'm using matplotlib with
> __version__  = '0.90.0'
> __revision__ = '$Revision: 3003 $'
> __date__     = '$Date: 2007-02-06 22:24:06 -0500 (Tue, 06 Feb 2007) $'
> 
> 
> Any hint how to further reduce the number of names in "P" ?
> My ideal would be that the "P" module (short for pylab) would only
> contain the stuff described in the __doc__ strings of `pylab.py` and
> `__init__.py`(in matplotlib)  (+ plus some extra, undocumented, yet
> pylab specific things)
> 
> Thanks
> -Sebastian
> 
> 
> On 3/16/07, Eric Firing <efiring at hawaii.edu> wrote:
>> Sebastian Haase wrote:
>>> Hi!
>>> I use the wxPython PyShell.
>>> I like especially the feature that when typing a module and then the
>>> dot "." I get a popup list of all available functions (names) inside
>>> that module.
>>>
>>> Secondly,  I think it really makes code clearer when one can see where
>>> a function comes from.
>>>
>>> I have a default
>>> import numpy as N
>>> executed before my shell even starts.
>>> In fact I have a bunch of my "standard" modules imported as <some
>>> single capital letter>.
>>>
>>> This - I think - is a good compromise to the commonly used "extra
>>> typing" and "unreadable"  argument.
>>>
>>> a = sin(b) * arange(10,50, .1) * cos(d)
>>> vs.
>>> a = N.sin(b) * N.arange(10,50, .1) * N.cos(d)
>> I generally do the latter, but really, all those "N." bits are still
>> visual noise when it comes to reading the code--that is, seeing the
>> algorithm rather than where the functions come from.  I don't think
>> there is anything wrong with explicitly importing commonly-used names,
>> especially things like sin and cos.
>>
>>> I would like to hear some comments by others.
>>>
>>>
>>> On a different note: I just started using pylab, so I did added an
>>> automatic  "from matplotlib import pylab as P" -- but now P contains
>>> everything that I already have in N.  It makes it really hard to
>>> *find* (as in *see* n the popup-list) the pylab-only functions. --
>>> what can I do about this ?
>> A quick and dirty solution would be to comment out most of the imports
>> in pylab.py; they are not needed for the pylab functions and are there
>> only to give people lots of functionality in a single namespace.
>>
>> I am cross-posting this to matplotlib-users because it involves mpl, and
>> an alternative solution would be for us to add an rcParam entry to allow
>> one to turn off all of the namespace consolidation.  A danger is that if
>> someone is using "from pylab import *" in a script, then whether it
>> would run would depend on the matplotlibrc file.  To get around that,
>> another possibility would be to break pylab.py into two parts, with
>> pylab.py continuing to do the namespace consolidation and importing the
>> second part, which would contain the actual pylab functions.  Then if
>> you don't want the namespace consolidation, you could simply import the
>> second part instead of pylab.  There may be devils in the details, but
>> it seems to me that this last alternative--splitting pylab.py--might
>> make a number of people happier while having no adverse effects on
>> everyone else.
>>
>> Eric
>>>
>>> Thanks,
>>> Sebastian
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion


From robert.kern at gmail.com  Thu Aug  9 17:07:02 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 09 Aug 2007 16:07:02 -0500
Subject: [Numpy-discussion] I am volunteering to be the release manager
 for	NumPy 1.0.3.1 and SciPy 0.5.2
In-Reply-To: <46BB47B2.20601@noaa.gov>
References: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
	<46BB47B2.20601@noaa.gov>
Message-ID: <46BB81F6.2060100@gmail.com>

Christopher Barker wrote:
> 
> Jarrod Millman wrote:
>> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
>> 0.5.3. 
> 
> Wonderful! Thanks.
> 
>> Releasing SciPy 0.5.3
>> We can also make the OS X
>> binaries especially if Robert Kern is stilling willing to help.
> 
> What form will these take? It would be great if we could have Universal 
> binaries, with no dependencies (other than Python and numpy, of course) 
> -- I think that is now possible, Robert would certainly know.

Whoops, the email I referenced is missing an important bit: use this gfortran
binary instead of the one from hpc.sf.net.

  http://r.research.att.com/tools/

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From david at ar.media.kyoto-u.ac.jp  Thu Aug  9 22:42:05 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 10 Aug 2007 11:42:05 +0900
Subject: [Numpy-discussion] I am volunteering to be the release manager
 for	NumPy 1.0.3.1 and SciPy 0.5.2
In-Reply-To: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
References: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
Message-ID: <46BBD07D.4000502@ar.media.kyoto-u.ac.jp>

Jarrod Millman wrote:
> I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
> 0.5.3.  In order to actually get them both released I will obviously
> need some help.  But given the amount of work required and the number
> of people who have offered to help, I believe this will be doable.
>
> Given the extensive discussion about what is needed for these
> releases, I am fairly confident that I know what needs to be done.  I
> will try to be very specific about what I will do and what I will need
> help with.  Basically, I am just rewriting the plan described by
> Robert Kern last month.  Please let me know if you have any
> suggestions/comments/problems with this plan and please let me know if
> you can commit to helping in any way.
>
> [[NOTE:  I just (on Monday) hired 2 full-time programmers to work on
> the neuroimaging in python (NIPY) project, so they will be able to
> help out with bug fixing as well as testing the pre-releases on
> different platforms.]]
>
> Releasing NumPy 1.0.3.1
> ===================
> On July 24th, Robert suggested making a numpy 1.0.3.1 point release.
> He was concerned that there were some changes in numpy.distutils that
> needed to cook a little longer.  So I am offering to make a 1.0.3.1
> release.  If Travis or one of the other core NumPy developers want to
> make a 1.0.4 release in the next week or so, then there won't be a
> need for a 1.0.3.1 release.
>
> First, I will branch from the 1.0.3 tag:
> svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3
> http://svn.scipy.org/svn/numpy/branches/1.0.3
>
> Second, I will apply all the patches necessary to build scipy from
> svn, but nothing else.  Then I will just follow the NumPy release
> instructions:  http://projects.scipy.org/scipy/numpy/wiki/MakingReleases
> I will make the tarball and source rpm; but will need help with
> everything else.  Things will go faster if someone else can build the
> Windows binaries. 
For windows, I understand the main problem is ATLAS, right ? I have 
discussed a bit the issue with Clint Whaley (the main developer of 
ATLAS), and I think I got a way to build ATLAS without using SSE (which 
caused trouble for some "old" ATHLON last time, AFAIK). I can provide 
the informations to you; I would just need someone to test the binaries 
on a non SSE machine, since I don't have any myself.

cheers,

David


From javier.maria.torres at ericsson.com  Fri Aug 10 04:01:27 2007
From: javier.maria.torres at ericsson.com (Javier Maria Torres (MI/EEM))
Date: Fri, 10 Aug 2007 10:01:27 +0200
Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin
Message-ID: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se>

Hi,

I get the following output when trying to compile the latest Numpy SVN
snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I
also have the Windows version installed, this might cause problems?). I
also include the (meager) site.cfg used. I would appreciate any comment.

Thanks a lot, and greetings,

Javier Torres


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output.txt
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070810/f814b6e0/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: site.cfg
Type: application/octet-stream
Size: 84 bytes
Desc: site.cfg
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070810/f814b6e0/attachment.obj>

From pearu at cens.ioc.ee  Fri Aug 10 04:23:11 2007
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Fri, 10 Aug 2007 11:23:11 +0300 (EEST)
Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin
In-Reply-To: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se
	>
References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se>
Message-ID: <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee>

On Fri, August 10, 2007 11:01 am, Javier Maria Torres (MI/EEM) wrote:
> Hi,
>
> I get the following output when trying to compile the latest Numpy SVN
> snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I
> also have the Windows version installed, this might cause problems?). I
> also include the (meager) site.cfg used. I would appreciate any comment.

In your build command

  python setup.py config --compiler=mingw32 build --compiler=mingw32 install

you are not using cygwin gcc compiler but mingw32. I think you cannot
do this - don't ask why, some time ago I failed to determine the cause.

Anyway, under cygwin just try

  python setup.py build

then it should just pick up cygwin compiler.

Or, execute

  python setup.py config --compiler=mingw32 build --compiler=mingw32 install

from Windows cmd line.

HTH,
Pearu


From javier.maria.torres at ericsson.com  Fri Aug 10 04:27:59 2007
From: javier.maria.torres at ericsson.com (Javier Maria Torres (MI/EEM))
Date: Fri, 10 Aug 2007 10:27:59 +0200
Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin
In-Reply-To: <57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee>
References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se>
	<57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee>
Message-ID: <262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se>

Hi Pearu,

Using just "python setup.py build" I get the following change in the
same error:

...
compile options: '-I/usr/local/include/python2.5 -Inumpy/core/src
-Inumpy/core/include -I/usr/local/include/python2.5 -c'
gcc: _configtest.c
gcc _configtest.o -L/usr/local/lib -L/usr/lib -o _configtest.exe
/usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld:
crt0.o: No such file: No such file or directory
collect2: ld returned 1 exit status
/usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld:
crt0.o: No such file: No such file or directory
collect2: ld returned 1 exit status
failure.
removing: _configtest.c _configtest.o
Traceback (most recent call last):
... 

I completely remove the build directory between builds, just in case
this helps.

Thanks a lot, and greetings,

Javier

-----Original Message-----
From: numpy-discussion-bounces at scipy.org
[mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Pearu Peterson
Sent: viernes, 10 de agosto de 2007 10:23
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Fail to compile Numpy on Cygwin

On Fri, August 10, 2007 11:01 am, Javier Maria Torres (MI/EEM) wrote:
> Hi,
>
> I get the following output when trying to compile the latest Numpy SVN

> snapshot on Cygwin (gcc 3.4.4) and Python (Cygwin-installed, 2.5.1; I 
> also have the Windows version installed, this might cause problems?). 
> I also include the (meager) site.cfg used. I would appreciate any
comment.

In your build command

  python setup.py config --compiler=mingw32 build --compiler=mingw32
install

you are not using cygwin gcc compiler but mingw32. I think you cannot do
this - don't ask why, some time ago I failed to determine the cause.

Anyway, under cygwin just try

  python setup.py build

then it should just pick up cygwin compiler.

Or, execute

  python setup.py config --compiler=mingw32 build --compiler=mingw32
install

from Windows cmd line.

HTH,
Pearu

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion at scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


From pearu at cens.ioc.ee  Fri Aug 10 04:40:24 2007
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Fri, 10 Aug 2007 11:40:24 +0300 (EEST)
Subject: [Numpy-discussion] Fail to compile Numpy on Cygwin
In-Reply-To: <262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se
	>
References: <262621D9AF021741B219F859256CDB7D012D19A2@eesmdmw020.eemea.ericsson.se>
	<57874.129.240.228.53.1186734191.squirrel@cens.ioc.ee>
	<262621D9AF021741B219F859256CDB7D012D19A3@eesmdmw020.eemea.ericsson.se>
Message-ID: <51065.129.240.228.53.1186735224.squirrel@cens.ioc.ee>

On Fri, August 10, 2007 11:27 am, Javier Maria Torres (MI/EEM) wrote:
> Hi Pearu,
>
> Using just "python setup.py build" I get the following change in the
> same error:
>
> ...
> compile options: '-I/usr/local/include/python2.5 -Inumpy/core/src
> -Inumpy/core/include -I/usr/local/include/python2.5 -c'
> gcc: _configtest.c
> gcc _configtest.o -L/usr/local/lib -L/usr/lib -o _configtest.exe
> /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld:
> crt0.o: No such file: No such file or directory
> collect2: ld returned 1 exit status
> /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld:
> crt0.o: No such file: No such file or directory
> collect2: ld returned 1 exit status
> failure.

Check were is the crt0.o file in your system, if it exists then
your cygwin environment is not set up properly, I guess.
If not, you might need to install compiler development libraries to cygwin
system.

Pearu


From millman at berkeley.edu  Fri Aug 10 04:45:07 2007
From: millman at berkeley.edu (Jarrod Millman)
Date: Fri, 10 Aug 2007 01:45:07 -0700
Subject: [Numpy-discussion] NumPy-1.0.3.x
Message-ID: <c7009a550708100145s4f9841eanebc3842026c5e405@mail.gmail.com>

Hello everyone,

I made a mumpy-1.0.3.x branch from the 1.0.3 tag and tried to get
everything working (see changesets 3957-3961).  I added back get_path
to numpy/distutils/misc_util.py, which is used by Lib/odr/setup.py in
scipy 0.5.2.  I also tried to clean up a few issues by doing the same
thing that was done to the trunk in:
http://projects.scipy.org/scipy/numpy/changeset/3845
http://projects.scipy.org/scipy/numpy/changeset/3848

I am still seeing 2 problems:
1) http://projects.scipy.org/scipy/numpy/ticket/535
2) when I run scipy.test(1,10), I get:
check_cosine_weighted_infinite
(scipy.integrate.tests.test_quadpack.test_quad)Illegal instruction

If anyone has any ideas as to what is wrong, please let me know.

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


From ryanlists at gmail.com  Fri Aug 10 09:47:56 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Fri, 10 Aug 2007 08:47:56 -0500
Subject: [Numpy-discussion] I am volunteering to be the release manager
	for NumPy 1.0.3.1 and SciPy 0.5.2
In-Reply-To: <46BBD07D.4000502@ar.media.kyoto-u.ac.jp>
References: <c7009a550708090429s7ca6f6d0y13ba52cf28c4e03f@mail.gmail.com>
	<46BBD07D.4000502@ar.media.kyoto-u.ac.jp>
Message-ID: <c5b438120708100647l4ac594a1u4aa31996e9464baa@mail.gmail.com>

I have access to one non-SSE (or at least non-SSE2) machine that I can test on.

I sort of championed this cause the last time this came up out of fear
that my students would have these problems.  No one did.  So, I don't
know how many non-SSE machines are really out there.  This may not be
a big problem.

If we can support non-SSE machines without too much trouble or create
one windows binary that works for "everyone" without performance loss,
great.

I am still willing to test.

Ryan

On 8/9/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> Jarrod Millman wrote:
> > I volunteer to be the release manager for NumPy 1.0.3.1 and SciPy
> > 0.5.3.  In order to actually get them both released I will obviously
> > need some help.  But given the amount of work required and the number
> > of people who have offered to help, I believe this will be doable.
> >
> > Given the extensive discussion about what is needed for these
> > releases, I am fairly confident that I know what needs to be done.  I
> > will try to be very specific about what I will do and what I will need
> > help with.  Basically, I am just rewriting the plan described by
> > Robert Kern last month.  Please let me know if you have any
> > suggestions/comments/problems with this plan and please let me know if
> > you can commit to helping in any way.
> >
> > [[NOTE:  I just (on Monday) hired 2 full-time programmers to work on
> > the neuroimaging in python (NIPY) project, so they will be able to
> > help out with bug fixing as well as testing the pre-releases on
> > different platforms.]]
> >
> > Releasing NumPy 1.0.3.1
> > ===================
> > On July 24th, Robert suggested making a numpy 1.0.3.1 point release.
> > He was concerned that there were some changes in numpy.distutils that
> > needed to cook a little longer.  So I am offering to make a 1.0.3.1
> > release.  If Travis or one of the other core NumPy developers want to
> > make a 1.0.4 release in the next week or so, then there won't be a
> > need for a 1.0.3.1 release.
> >
> > First, I will branch from the 1.0.3 tag:
> > svn cp http://svn.scipy.org/svn/numpy/tags/1.0.3
> > http://svn.scipy.org/svn/numpy/branches/1.0.3
> >
> > Second, I will apply all the patches necessary to build scipy from
> > svn, but nothing else.  Then I will just follow the NumPy release
> > instructions:  http://projects.scipy.org/scipy/numpy/wiki/MakingReleases
> > I will make the tarball and source rpm; but will need help with
> > everything else.  Things will go faster if someone else can build the
> > Windows binaries.
> For windows, I understand the main problem is ATLAS, right ? I have
> discussed a bit the issue with Clint Whaley (the main developer of
> ATLAS), and I think I got a way to build ATLAS without using SSE (which
> caused trouble for some "old" ATHLON last time, AFAIK). I can provide
> the informations to you; I would just need someone to test the binaries
> on a non SSE machine, since I don't have any myself.
>
> cheers,
>
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From Glen.Mabey at swri.org  Fri Aug 10 12:20:16 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Fri, 10 Aug 2007 11:20:16 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to	close()
In-Reply-To: <20070607214620.GM6116@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
Message-ID: <20070810162016.GA12992@bams.ccf.swri.edu>

Hello,

I posted this a while back and didn't get any replies.  I'm running in
to this issue again from a different aspect, and today I've been trying
to figure out which method of ndarray needs to be overloaded for memmap
so that the the ._mmap attribute gets handled appropriately.

But, I have not been able to figure out what methods of ndarray are
getting used in code such as this:

>>> import numpy
>>> amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32,
>>> shape=(4,5), mode='w+' )
>>> b = amemmap[2:3]
>>> b
>>> Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in <bound method memmap.__del__ of memmap([ 0.,  0.,  0.,  0.,  0.], dtype=float32)> ignored memmap([[ 0.,  0.,  0.,  0.,  0.]], dtype=float32)


Furthermore, can anyone enlighten me as to why an AttributeError
exception would be ignored?

Am I using numpy.memmap instances appropriately?

Thank you,
Glen Mabey


On Thu, Jun 07, 2007 at 04:46:20PM -0500, Glen W. Mabey wrote:
> Hello,
> 
> When assigning a variable that is the transpose() of a memmap array, the
> ._mmap member doesn't get copied, I guess:
> 
> In [1]:import numpy
> 
> In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' )
> 
> In [3]:bmemmap = amemmap.transpose()
> 
> In [4]:bmemmap.close()
> ---------------------------------------------------------------------------
> <type 'exceptions.AttributeError'>        Traceback (most recent call last)
> 
> /home/gmabey/src/R9619_dev_acqlibweb/Projects/R9619_NChannelDetection/NED/<ipython console> in <module>()
> 
> /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py
> in close(self)
>      86 
>      87     def close(self):
> ---> 88         self._mmap.close()
>      89 
>      90     def __del__(self):
> 
> <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'
> > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py(88)close()
>      87     def close(self):
> ---> 88         self._mmap.close()
>      89 
> 
> 
> 
> 
> This is an issue when the data is accessed in an order that is different
> from how it is stored on disk, as:
> 
> bmemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ).transpose()
> 
> So the object that was originally produced not accessible.  I imagine
> there is some better way to indicate order of dimensions, but
> regardless, doing
> 
> In [4]:bmemmap._mmap = amemmap._mmap
> 
> is a hack workaround.
> 
> Best regards,
> Glen Mabey
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion


From Glen.Mabey at swri.org  Fri Aug 10 12:30:05 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Fri, 10 Aug 2007 11:30:05 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070810162016.GA12992@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
Message-ID: <20070810163005.GB13557@bams.ccf.swri.edu>

On Fri, Aug 10, 2007 at 11:20:16AM -0500, Glen W. Mabey wrote:
> I posted this a while back and didn't get any replies.  I'm running in
> to this issue again from a different aspect, and today I've been trying
> to figure out which method of ndarray needs to be overloaded for memmap
> so that the the ._mmap attribute gets handled appropriately.


Oh, and 

Python 2.5.1
numpy svn as of yesterday  ...
AMD opteron, Linux/Debian

Glen


From robert.kern at gmail.com  Fri Aug 10 14:26:10 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 10 Aug 2007 13:26:10 -0500
Subject: [Numpy-discussion] NumPy-1.0.3.x
In-Reply-To: <c7009a550708100145s4f9841eanebc3842026c5e405@mail.gmail.com>
References: <c7009a550708100145s4f9841eanebc3842026c5e405@mail.gmail.com>
Message-ID: <46BCADC2.30104@gmail.com>

Jarrod Millman wrote:

> 2) when I run scipy.test(1,10), I get:
> check_cosine_weighted_infinite
> (scipy.integrate.tests.test_quadpack.test_quad)Illegal instruction
> 
> If anyone has any ideas as to what is wrong, please let me know.

What platform are you on and what underlying libraries (ATLAS, etc.) did you
compile with? "Illegal instruction" usually comes from using an ATLAS library
that was compiled with a higher level of SSE than your CPU supports.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From Glen.Mabey at swri.org  Fri Aug 10 17:14:38 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Fri, 10 Aug 2007 16:14:38 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070810162016.GA12992@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
Message-ID: <20070810211438.GF13557@bams.ccf.swri.edu>


[I keep posting hoping that someone knowledgeable in these things will
take notice ...]

Just a couple of more notes regarding this numpy.memmap issue.

It seems that any slice of a numpy.memmap that is greater than 1-d has 
a similar problem.


In [1]:import numpy

In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32,
shape=(4,5), mode='w+' )

In [3]:amemmap[1,3:4]
Out[3]:memmap([ 0.], dtype=float32)

In [4]:amemmap[0:1,3:4]
Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in <bound method memmap.__del__ of memmap([ 0.], dtype=float32)> ignored
Out[4]:memmap([[ 0.]], dtype=float32)


A very naive hack-fix of overloading the __getitem__ method of the
numpy.memmap class such that the result of ndarray.__getitem__ gets the
._mmap attribute added didn't work ...

I tried to follow the program flow into the bowels of
multiarraymodule.c, but that was beyond me ...

This problem started showing up when I changed to python 2.5 and
persists in 2.5.1.  I've considered switching back to 2.4 but I really
need 64-bit array indexing ...

Best Regards,
Glen Mabey


On Fri, Aug 10, 2007 at 11:20:16AM -0500, Glen W. Mabey wrote:
> Hello,
> 
> I posted this a while back and didn't get any replies.  I'm running in
> to this issue again from a different aspect, and today I've been trying
> to figure out which method of ndarray needs to be overloaded for memmap
> so that the the ._mmap attribute gets handled appropriately.
> 
> But, I have not been able to figure out what methods of ndarray are
> getting used in code such as this:
> 
> >>> import numpy
> >>> amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32,
> >>> shape=(4,5), mode='w+' )
> >>> b = amemmap[2:3]
> >>> b
> >>> Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in <bound method memmap.__del__ of memmap([ 0.,  0.,  0.,  0.,  0.], dtype=float32)> ignored memmap([[ 0.,  0.,  0.,  0.,  0.]], dtype=float32)
> 
> 
> Furthermore, can anyone enlighten me as to why an AttributeError
> exception would be ignored?
> 
> Am I using numpy.memmap instances appropriately?
> 
> Thank you,
> Glen Mabey
> 
> 
> 
> 
> On Thu, Jun 07, 2007 at 04:46:20PM -0500, Glen W. Mabey wrote:
> > Hello,
> > 
> > When assigning a variable that is the transpose() of a memmap array, the
> > ._mmap member doesn't get copied, I guess:
> > 
> > In [1]:import numpy
> > 
> > In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' )
> > 
> > In [3]:bmemmap = amemmap.transpose()
> > 
> > In [4]:bmemmap.close()
> > ---------------------------------------------------------------------------
> > <type 'exceptions.AttributeError'>        Traceback (most recent call last)
> > 
> > /home/gmabey/src/R9619_dev_acqlibweb/Projects/R9619_NChannelDetection/NED/<ipython console> in <module>()
> > 
> > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py
> > in close(self)
> >      86 
> >      87     def close(self):
> > ---> 88         self._mmap.close()
> >      89 
> >      90     def __del__(self):
> > 
> > <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'
> > > /usr/local/stow/numpy-20070605_svn-py2.5/lib/python2.5/site-packages/numpy/core/memmap.py(88)close()
> >      87     def close(self):
> > ---> 88         self._mmap.close()
> >      89 
> > 
> > 
> > 
> > 
> > This is an issue when the data is accessed in an order that is different
> > from how it is stored on disk, as:
> > 
> > bmemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32, shape=(4,5), mode='w+' ).transpose()
> > 
> > So the object that was originally produced not accessible.  I imagine
> > there is some better way to indicate order of dimensions, but
> > regardless, doing
> > 
> > In [4]:bmemmap._mmap = amemmap._mmap
> > 
> > is a hack workaround.
> > 
> > Best regards,
> > Glen Mabey
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion


From david at ar.media.kyoto-u.ac.jp  Sat Aug 11 03:06:23 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 11 Aug 2007 16:06:23 +0900
Subject: [Numpy-discussion] SIMD friendly allocators: first patch
Message-ID: <46BD5FEF.7060602@ar.media.kyoto-u.ac.jp>

Hi,

    I put a first version of aligned allocators for numpy here: 
http://projects.scipy.org/scipy/numpy/ticket/568 (sorry for the 
duplicate, but I had problems connecting with the trac server). I have 
tested it on linux only for now, but if problem arise, it should not be 
too difficult to solve (will test it soon on windows and mac os X).
    It does not provide yet high level interface (eg requestion python 
arrays with given alignment), but if people agree with the current 
design, those should not be too difficult to implement.

    cheers,

    David


From david at ar.media.kyoto-u.ac.jp  Mon Aug 13 01:43:52 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 13 Aug 2007 14:43:52 +0900
Subject: [Numpy-discussion] [f2py] Adding custom code in module
	initialization code
Message-ID: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp>

Hi,

    I would like to know if it is possible to tell f2py to call some 
functions inside the initialization function of a module ? I found a 
mention to add some function to the module function list, but nothing 
about the initialization function.

    cheers,

    David


From pearu at cens.ioc.ee  Mon Aug 13 01:57:59 2007
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Mon, 13 Aug 2007 08:57:59 +0300 (EEST)
Subject: [Numpy-discussion] [f2py] Adding custom code in module
 initialization code
In-Reply-To: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp>
References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp>
Message-ID: <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee>

On Mon, August 13, 2007 8:43 am, David Cournapeau wrote:
> Hi,
>
>     I would like to know if it is possible to tell f2py to call some
> functions inside the initialization function of a module ? I found a
> mention to add some function to the module function list, but nothing
> about the initialization function.

Yes, it is possible. Look for `usercode` statement in

  http://cens.ioc.ee/projects/f2py2e/usersguide/index.html

In particular, see the `Extended F2PY usage` section for an example.

HTH,
Pearu


From david at ar.media.kyoto-u.ac.jp  Mon Aug 13 05:24:28 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 13 Aug 2007 18:24:28 +0900
Subject: [Numpy-discussion] [f2py] Adding custom code in module
 initialization code
In-Reply-To: <60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee>
References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp>
	<60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee>
Message-ID: <46C0234C.4090802@ar.media.kyoto-u.ac.jp>

Pearu Peterson wrote:
> On Mon, August 13, 2007 8:43 am, David Cournapeau wrote:
>   
>> Hi,
>>
>>     I would like to know if it is possible to tell f2py to call some
>> functions inside the initialization function of a module ? I found a
>> mention to add some function to the module function list, but nothing
>> about the initialization function.
>>     
>
> Yes, it is possible. Look for `usercode` statement in
>
>   http://cens.ioc.ee/projects/f2py2e/usersguide/index.html
>
> In particular, see the `Extended F2PY usage` section for an example.
>   
I see how to use usercode to add C function to the module, but how to 
tell f2py to call a given C function in the init* function ? For 
example, let's say I have a module foo, and I want the function init_foo 
to call the function bar() ?

David


From david at ar.media.kyoto-u.ac.jp  Mon Aug 13 05:47:47 2007
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 13 Aug 2007 18:47:47 +0900
Subject: [Numpy-discussion] [f2py] Adding custom code in module
 initialization code
In-Reply-To: <46C0234C.4090802@ar.media.kyoto-u.ac.jp>
References: <46BFEF98.2020201@ar.media.kyoto-u.ac.jp>	<60543.85.166.31.64.1186984679.squirrel@cens.ioc.ee>
	<46C0234C.4090802@ar.media.kyoto-u.ac.jp>
Message-ID: <46C028C3.3040502@ar.media.kyoto-u.ac.jp>

David Cournapeau wrote:
> Pearu Peterson wrote:
>   
>> On Mon, August 13, 2007 8:43 am, David Cournapeau wrote:
>>   
>>     
>>> Hi,
>>>
>>>     I would like to know if it is possible to tell f2py to call some
>>> functions inside the initialization function of a module ? I found a
>>> mention to add some function to the module function list, but nothing
>>> about the initialization function.
>>>     
>>>       
>> Yes, it is possible. Look for `usercode` statement in
>>
>>   http://cens.ioc.ee/projects/f2py2e/usersguide/index.html
>>
>> In particular, see the `Extended F2PY usage` section for an example.
>>   
>>     
> I see how to use usercode to add C function to the module, but how to 
> tell f2py to call a given C function in the init* function ? For 
> example, let's say I have a module foo, and I want the function init_foo 
> to call the function bar() ?
>   
Sorry, it was in front of my eyes for sometimes, and I didn't see it.

David


From listservs at mac.com  Mon Aug 13 09:15:29 2007
From: listservs at mac.com (Chris Fonnesbeck)
Date: Mon, 13 Aug 2007 13:15:29 +0000 (UTC)
Subject: [Numpy-discussion] Vectorize leaks
Message-ID: <loom.20070813T151031-703@post.gmane.org>

I have narrowed a memory leak in PyMC down to the vectorize() function 
in numpy. I have a simple inverse logit transformation function:

invlogit = lambda x: 1.0 / (1.0 + exp(-1.0 * x))

which runs without leaking when used iteratively during simulations. 
However, when I try to vectorize it, the process' rsize grows each 
iteration of the simulation.

Using a recent (<2 days old) svn build of numpy on OS X 10.4.

C.


From mpmusu at cc.usu.edu  Mon Aug 13 11:04:32 2007
From: mpmusu at cc.usu.edu (Mark.Miller)
Date: Mon, 13 Aug 2007 09:04:32 -0600
Subject: [Numpy-discussion] f2py and string arrays
Message-ID: <46C07300.3060902@cc.usu.edu>

Quick question...

I have a Fortran function for f2py declared as follows (just an example).

module test1
contains
     subroutine manip(length, array, a, b)
          integer   :: length, a,b
          character(length) :: array(0:a-1,0:b-1)
!f2py intent(in) length,a,b
!f2py intent(inout) array
          array(0,0) = "1111"
     end subroutine manip
end module test1

It compiles using f2py without issue.  The f2py-generated docstring 
correctly lists the required arguments:

manip - Function signature:
   manip(length,array,[a,b])
Required arguments:
   length : input int
   array : in/output rank-2 array('S') with bounds (a,b)


However, I'm getting some type conversion errors when using the function 
in python:

 >>> from mymodule import test1
 >>> import numpy
 >>> a=numpy.empty((3,3,'S4',order='F')
 >>> a[:,:]='2222'
 >>> test1.manip(4,a,3,3)
Traceback (most recent call last):
   File "(stdin)", line 1, in (module)
ValueError: failed to initialize intent(inout) array --expected elsize = 
1 but got 4 -- input 'S' not compatible to 'c'

Does f2py not work with numpy string arrays?  I have some excellent 
alternate implementations for this type of thing, but would prefer an 
approach similar to that outlined above.

Thanks,

-Mark


From aisaac at american.edu  Mon Aug 13 11:51:49 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 13 Aug 2007 11:51:49 -0400
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070810211438.GF13557@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu><20070810162016.GA12992@bams.ccf.swri.edu><20070810211438.GF13557@bams.ccf.swri.edu>
Message-ID: <Mahogany-0.67.0-992-20070813-115149.00@american.edu>

On Fri, 10 Aug 2007, "Glen W. Mabey" apparently wrote:
> It seems that any slice of a numpy.memmap that is greater than 1-d has 
> a similar problem.  
> In [1]:import numpy 
> In [2]:amemmap = numpy.memmap( '/tmp/afile', dtype=numpy.float32,  shape=(4,5), mode='w+' )
> In [3]:amemmap[1,3:4] 
> Out[3]:memmap([ 0.], dtype=float32) 
> In [4]:amemmap[0:1,3:4] 
> Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in <bound method memmap.__del__ of memmap([ 0.], dtype=float32)> ignored 
> Out[4]:memmap([[ 0.]], dtype=float32)


You have not heard from anyone on this yet, right?
Please continue to post your findings.

Cheers,
Alan Isaac


From Glen.Mabey at swri.org  Mon Aug 13 12:19:45 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Mon, 13 Aug 2007 11:19:45 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to	close()
In-Reply-To: <Mahogany-0.67.0-992-20070813-115149.00@american.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
Message-ID: <20070813161945.GA12360@bams.ccf.swri.edu>

On Mon, Aug 13, 2007 at 11:51:49AM -0400, Alan G Isaac wrote:
> You have not heard from anyone on this yet, right?

Nope, but I'm glad to hear even this response.

> Please continue to post your findings.

At this point, I'm guessing that the __getitem__() method of ndarray
returns a numpy.memmap instance instead of a ndarray instance, but that
numpy.memmap.__new__() is not getting executed, resulting in ._mmap not
getting initialized, so that when numpy.memmap.__del__() gets called, it
chokes because ._mmap doesn't exist.

For my purposes, I am mostly opening these files read-only, so I don't
need to have flush() called.  For the returned valued of __getitem__, it
is not appropriate to have ._mmap.close() called (the other operation in
numpy.memmap.__del__().  So, I just commented out the __del__()
overloaded function.

When I do open memmap'ed files read-write, I can manually perform a
flush() operation before I'm done, and things seem to work out okay even
though .close() isn't called.

As I have tried to think through what should be the appropriate
behavior for the returned value of __getitem__, I have not been able to
see an appropriate solution (let alone know how to implement it) to this
issue.

Thank you,
Glen Mabey


From david.huard at gmail.com  Mon Aug 13 20:14:12 2007
From: david.huard at gmail.com (David Huard)
Date: Mon, 13 Aug 2007 20:14:12 -0400
Subject: [Numpy-discussion] Vectorize leaks
In-Reply-To: <loom.20070813T151031-703@post.gmane.org>
References: <loom.20070813T151031-703@post.gmane.org>
Message-ID: <91cf711d0708131714w7b143bc6r12435687d4b7e711@mail.gmail.com>

Hi Chris,

Same problem for ubuntu linux.

Darn, I spent an hour tracking this bug and now I see you found it before...

2007/8/13, Chris Fonnesbeck <listservs at mac.com>:
>
> I have narrowed a memory leak in PyMC down to the vectorize() function
> in numpy. I have a simple inverse logit transformation function:
>
> invlogit = lambda x: 1.0 / (1.0 + exp(-1.0 * x))
>
> which runs without leaking when used iteratively during simulations.
> However, when I try to vectorize it, the process' rsize grows each
> iteration of the simulation.
>
> Using a recent (<2 days old) svn build of numpy on OS X 10.4.
>
> C.
>
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070813/f20ec381/attachment.html>

From peridot.faceted at gmail.com  Tue Aug 14 00:23:26 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 14 Aug 2007 00:23:26 -0400
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070813161945.GA12360@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
Message-ID: <ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>

On 13/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:

> As I have tried to think through what should be the appropriate
> behavior for the returned value of __getitem__, I have not been able to
> see an appropriate solution (let alone know how to implement it) to this
> issue.

Is the problem one of finalization? That is, making sure the memory
map gets (flushed and) closed exactly once? In this case the
numpythonic solution is to have only the original mmap object do any
finalization; any slices contain a reference to it anyway, so they
cannot be kept after it is collected. If the problem is that you want
to do an explicit close/flush on a slice object, you could just always
apply the close/flush to the base object of the slice if it has one or
the slice itself if it doesn't.

I'm afraid I don't really understand the problem but it seems like
nobody who just knows the answer is about to speak up...

Anne


From josh.p.marshall at gmail.com  Tue Aug 14 08:01:34 2007
From: josh.p.marshall at gmail.com (Josh Marshall)
Date: Tue, 14 Aug 2007 22:01:34 +1000
Subject: [Numpy-discussion] OS X universal SciPy: success!
In-Reply-To: <mailman.13.1186851614.3859.scipy-dev@scipy.org>
References: <mailman.13.1186851614.3859.scipy-dev@scipy.org>
Message-ID: <5D925F6F-6090-417D-8845-60459601777B@gmail.com>

I've been trying to easily build an OS X universal SciPy for some  
time now. [1] My prior efforts consisted of lipo-ing together the PPC  
and x86 'Superpacks' put together by Chris Fonnesbeck [2], which  
worked for distributing my image processing app locally. However,  
this isn't really a good way to do it, specially not for putting up  
for general use.

I came across a successful Universal build of gfortran [4,5], and  
then shortly found this message [5] claiming it could possibly be  
used to build a universal SciPy.

So, I gave it a shot and it works! (with some tricks...)

The build has both ppc and i386 architectures in every .so in the  
scipy install. The tests run fine on my G4, but I haven't yet had the  
chance to try it on an Intel Mac. If anyone is keen to do so, please  
let me know.

There will need to be some modifications to numpy distutils, since it  
presumes that there isn't a universal Fortran. The patch below is  
just a hack to get it working, and will break any non-universal  
gfortran. Essentially, all that needs to happen is to have '-arch ppc  
-arch i386' added to any call to gfortran (both compile and link) ,  
and the '-march' flags removed.

What's the best way to add this functionality to numpy distutils? I  
couldn't think of any way to test for a universal compiler, other  
than trying to compile a test file and seeing if it dies with the  
multiple arch flags.

Regards,
Josh Marshall


[1] http://mail.python.org/pipermail/pythonmac-sig/2006-December/ 
018556.html
[2] http://trichech.us/?page_id=5
[3] http://r.research.att.com/tools/
[4] http://r.research.att.com/gfortran-4.2.1.dmg
[5] http://mail.python.org/pipermail/pythonmac-sig/2007-May/018975.html

isengard:~/Development/Python/numpy-svn/numpy/distutils/fcompiler Josh 
$ svn diff gnu.py
Index: gnu.py
===================================================================
--- gnu.py      (revision 3964)
+++ gnu.py      (working copy)
@@ -102,6 +102,7 @@
                  minor)

              opt.extend(['-undefined', 'dynamic_lookup', '-bundle'])
+            opt.extend(['-arch ppc -arch i386'])
          else:
              opt.append("-shared")
          if sys.platform.startswith('sunos'):
@@ -183,12 +184,13 @@
              # Since Apple doesn't distribute a GNU Fortran  
compiler, we
              # can't add -arch ppc or -arch i386, as only their version
              # of the GNU compilers accepts those.
-            for a in '601 602 603 603e 604 604e 620 630 740 7400  
7450 750'\
-                    '403 505 801 821 823 860'.split():
-                if getattr(cpu,'is_ppc%s'%a)():
-                    opt.append('-mcpu='+a)
-                    opt.append('-mtune='+a)
-                    break
+            #for a in '601 602 603 603e 604 604e 620 630 740 7400  
7450 750'\
+            #        '403 505 801 821 823 860'.split():
+            #    if getattr(cpu,'is_ppc%s'%a)():
+            #        opt.append('-mcpu='+a)
+            #        opt.append('-mtune='+a)
+            #        break
+            opt.append('-arch ppc -arch i386')
              return opt


From markbak at gmail.com  Wed Aug 15 05:07:26 2007
From: markbak at gmail.com (mark)
Date: Wed, 15 Aug 2007 02:07:26 -0700
Subject: [Numpy-discussion] deleting value from array
Message-ID: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>

I am trying to delete a value from an array
This seems to work as follows

>>> a = array([1,2,3,4])
>>> a = delete( a, 1 )
>>> a
array([1, 3, 4])

But wouldn't it make more sense to have a function like

a.delete(1) ?

I now get the feeling the delete command needs to copy the entire
array with exception of the deleted item. I guess this is a hard thing
to do efficiently?

Thanks,

Mark


From matthieu.brucher at gmail.com  Wed Aug 15 05:19:35 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 15 Aug 2007 11:19:35 +0200
Subject: [Numpy-discussion] deleting value from array
In-Reply-To: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
Message-ID: <e76aa17f0708150219lfa2ae9bx30c4eaa024307591@mail.gmail.com>

>
> I now get the feeling the delete command needs to copy the entire
> array with exception of the deleted item. I guess this is a hard thing
> to do efficiently?
>

Well, if you don't copy the array, the value will always remain present.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070815/f3c58786/attachment.html>

From Andy.cheesman at bristol.ac.uk  Wed Aug 15 05:53:05 2007
From: Andy.cheesman at bristol.ac.uk (Andy Cheesman)
Date: Wed, 15 Aug 2007 10:53:05 +0100
Subject: [Numpy-discussion] Finding a row match within a numpy array
Message-ID: <46C2CD01.5030307@bristol.ac.uk>

Dear nice people

I'm trying to match a row (b) within a large numpy array (a). My most
successful attempt is below

hit = equal(b, a)
total_hits = add.reduce(hit, 1)
max_hit = argmax(total_hits, 0)
answer = a[max_hit]

where ...
a = array([[ 0,  1,  2,  3],
    	   [ 4,  5,  6,  7],
	   [ 8,  9, 10, 11],
	   [12, 13, 14, 15]])

b = array([8,  9, 10, 11])


I was wondering if people could suggest a possible more efficient route
as there seems to be numerous steps.

Thanks
Andy


From markbak at gmail.com  Wed Aug 15 06:09:16 2007
From: markbak at gmail.com (mark)
Date: Wed, 15 Aug 2007 03:09:16 -0700
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <46C2CD01.5030307@bristol.ac.uk>
References: <46C2CD01.5030307@bristol.ac.uk>
Message-ID: <1187172556.613122.207400@r29g2000hsg.googlegroups.com>

I think you can create an array with a true value in the right spot as
folows:

row = all( equal(a,b), 1 )

Then you can either find the row (but you already knew that one, as it
is b)

a[row]

or the row index

find(row==True)

Mark

On Aug 15, 11:53 am, Andy Cheesman <Andy.chees... at bristol.ac.uk>
wrote:
> Dear nice people
>
> I'm trying to match a row (b) within a large numpy array (a). My most
> successful attempt is below
>
> hit = equal(b, a)
> total_hits = add.reduce(hit, 1)
> max_hit = argmax(total_hits, 0)
> answer = a[max_hit]
>
> where ...
> a = array([[ 0,  1,  2,  3],
>            [ 4,  5,  6,  7],
>            [ 8,  9, 10, 11],
>            [12, 13, 14, 15]])
>
> b = array([8,  9, 10, 11])
>
> I was wondering if people could suggest a possible more efficient route
> as there seems to be numerous steps.
>
> Thanks
> Andy
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From markbak at gmail.com  Wed Aug 15 06:11:59 2007
From: markbak at gmail.com (mark)
Date: Wed, 15 Aug 2007 03:11:59 -0700
Subject: [Numpy-discussion] deleting value from array
In-Reply-To: <e76aa17f0708150219lfa2ae9bx30c4eaa024307591@mail.gmail.com>
References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
	<e76aa17f0708150219lfa2ae9bx30c4eaa024307591@mail.gmail.com>
Message-ID: <1187172719.626721.255050@r34g2000hsd.googlegroups.com>

Yeah, I can see the copying is essential.
I just think the syntax
a = delete(a,1)
confusing, as I would expect the deleted value back, rather than the
updated array.
As in the 'pop' function for lists.
No 'pop' in numpy? (I presume this may have been debated extensively
in the past).
I find the syntax
a.delete(1) more logical.
Mark

On Aug 15, 11:19 am, "Matthieu Brucher" <matthieu.bruc... at gmail.com>
wrote:
> > I now get the feeling the delete command needs to copy the entire
> > array with exception of the deleted item. I guess this is a hard thing
> > to do efficiently?
>
> Well, if you don't copy the array, the value will always remain present.
>
> Matthieu
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From Andy.cheesman at bristol.ac.uk  Wed Aug 15 06:26:15 2007
From: Andy.cheesman at bristol.ac.uk (Andy Cheesman)
Date: Wed, 15 Aug 2007 11:26:15 +0100
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <1187172556.613122.207400@r29g2000hsg.googlegroups.com>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187172556.613122.207400@r29g2000hsg.googlegroups.com>
Message-ID: <46C2D4C7.4010305@bristol.ac.uk>

Thanks for the speedy response but where can I locate the find function
as it isn't in numpy.

Andy

mark wrote:
> I think you can create an array with a true value in the right spot as
> folows:
> 
> row = all( equal(a,b), 1 )
> 
> Then you can either find the row (but you already knew that one, as it
> is b)
> 
> a[row]
> 
> or the row index
> 
> find(row==True)
> 
> Mark
> 
> On Aug 15, 11:53 am, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> wrote:
>> Dear nice people
>>
>> I'm trying to match a row (b) within a large numpy array (a). My most
>> successful attempt is below
>>
>> hit = equal(b, a)
>> total_hits = add.reduce(hit, 1)
>> max_hit = argmax(total_hits, 0)
>> answer = a[max_hit]
>>
>> where ...
>> a = array([[ 0,  1,  2,  3],
>>            [ 4,  5,  6,  7],
>>            [ 8,  9, 10, 11],
>>            [12, 13, 14, 15]])
>>
>> b = array([8,  9, 10, 11])
>>
>> I was wondering if people could suggest a possible more efficient route
>> as there seems to be numerous steps.
>>
>> Thanks
>> Andy
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> 


From markbak at gmail.com  Wed Aug 15 06:59:48 2007
From: markbak at gmail.com (mark)
Date: Wed, 15 Aug 2007 03:59:48 -0700
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <46C2D4C7.4010305@bristol.ac.uk>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187172556.613122.207400@r29g2000hsg.googlegroups.com>
	<46C2D4C7.4010305@bristol.ac.uk>
Message-ID: <1187175588.007436.125720@w3g2000hsg.googlegroups.com>

Oops, 'find' is in pylab (matplotlib).
I guess in numpy you have to use 'where', which does almost the same,
but it returns a Tuple.
Is there a function that is more like the find in matplotlib?
Mark


On Aug 15, 12:26 pm, Andy Cheesman <Andy.chees... at bristol.ac.uk>
wrote:
> Thanks for the speedy response but where can I locate the find function
> as it isn't in numpy.
>
> Andy
>
>
>
> mark wrote:
> > I think you can create an array with a true value in the right spot as
> > folows:
>
> > row = all( equal(a,b), 1 )
>
> > Then you can either find the row (but you already knew that one, as it
> > is b)
>
> > a[row]
>
> > or the row index
>
> > find(row==True)
>
> > Mark
>
> > On Aug 15, 11:53 am, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> > wrote:
> >> Dear nice people
>
> >> I'm trying to match a row (b) within a large numpy array (a). My most
> >> successful attempt is below
>
> >> hit = equal(b, a)
> >> total_hits = add.reduce(hit, 1)
> >> max_hit = argmax(total_hits, 0)
> >> answer = a[max_hit]
>
> >> where ...
> >> a = array([[ 0,  1,  2,  3],
> >>            [ 4,  5,  6,  7],
> >>            [ 8,  9, 10, 11],
> >>            [12, 13, 14, 15]])
>
> >> b = array([8,  9, 10, 11])
>
> >> I was wondering if people could suggest a possible more efficient route
> >> as there seems to be numerous steps.
>
> >> Thanks
> >> Andy
> >> _______________________________________________
> >> Numpy-discussion mailing list
> >> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discuss... at scipy.org
> >http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From matthieu.brucher at gmail.com  Wed Aug 15 07:38:14 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 15 Aug 2007 13:38:14 +0200
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <1187175588.007436.125720@w3g2000hsg.googlegroups.com>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187172556.613122.207400@r29g2000hsg.googlegroups.com>
	<46C2D4C7.4010305@bristol.ac.uk>
	<1187175588.007436.125720@w3g2000hsg.googlegroups.com>
Message-ID: <e76aa17f0708150438n60234b8cg5665a33ba04f904@mail.gmail.com>

The where function ?

Matthieu

2007/8/15, mark <markbak at gmail.com>:
>
> Oops, 'find' is in pylab (matplotlib).
> I guess in numpy you have to use 'where', which does almost the same,
> but it returns a Tuple.
> Is there a function that is more like the find in matplotlib?
> Mark
>
>
> On Aug 15, 12:26 pm, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> wrote:
> > Thanks for the speedy response but where can I locate the find function
> > as it isn't in numpy.
> >
> > Andy
> >
> >
> >
> > mark wrote:
> > > I think you can create an array with a true value in the right spot as
> > > folows:
> >
> > > row = all( equal(a,b), 1 )
> >
> > > Then you can either find the row (but you already knew that one, as it
> > > is b)
> >
> > > a[row]
> >
> > > or the row index
> >
> > > find(row==True)
> >
> > > Mark
> >
> > > On Aug 15, 11:53 am, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> > > wrote:
> > >> Dear nice people
> >
> > >> I'm trying to match a row (b) within a large numpy array (a). My most
> > >> successful attempt is below
> >
> > >> hit = equal(b, a)
> > >> total_hits = add.reduce(hit, 1)
> > >> max_hit = argmax(total_hits, 0)
> > >> answer = a[max_hit]
> >
> > >> where ...
> > >> a = array([[ 0,  1,  2,  3],
> > >>            [ 4,  5,  6,  7],
> > >>            [ 8,  9, 10, 11],
> > >>            [12, 13, 14, 15]])
> >
> > >> b = array([8,  9, 10, 11])
> >
> > >> I was wondering if people could suggest a possible more efficient
> route
> > >> as there seems to be numerous steps.
> >
> > >> Thanks
> > >> Andy
> > >> _______________________________________________
> > >> Numpy-discussion mailing list
> > >> Numpy-discuss... at scipy
> .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> > > _______________________________________________
> > > Numpy-discussion mailing list
> > > Numpy-discuss... at scipy.org
> > >http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discuss... at scipy
> .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070815/12eb315c/attachment.html>

From gael.varoquaux at normalesup.org  Wed Aug 15 09:30:53 2007
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 15 Aug 2007 15:30:53 +0200
Subject: [Numpy-discussion] deleting value from array
In-Reply-To: <1187172719.626721.255050@r34g2000hsd.googlegroups.com>
References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
	<e76aa17f0708150219lfa2ae9bx30c4eaa024307591@mail.gmail.com>
	<1187172719.626721.255050@r34g2000hsd.googlegroups.com>
Message-ID: <20070815133053.GA15510@clipper.ens.fr>

On Wed, Aug 15, 2007 at 03:11:59AM -0700, mark wrote:
> Yeah, I can see the copying is essential.
> I just think the syntax
> a = delete(a,1)
> confusing, as I would expect the deleted value back, rather than the
> updated array.
> As in the 'pop' function for lists.
> No 'pop' in numpy? (I presume this may have been debated extensively
> in the past).
> I find the syntax
> a.delete(1) more logical.

It is often considered in OO language that foo.method() modifies the foo
object, while function(foo) returns a new object, not modifying foo. This
is not always true in Python. Sometimes (eg strings) this is not true
because the object is immutable, sometimes there isn't this good reason.
I would be happy if we sticked to this convention. I find it makes the
language easier to guess.

Ga?l


From Shawn.Gong at drdc-rddc.gc.ca  Wed Aug 15 10:47:08 2007
From: Shawn.Gong at drdc-rddc.gc.ca (Gong, Shawn (Contractor))
Date: Wed, 15 Aug 2007 10:47:08 -0400
Subject: [Numpy-discussion] memory error caused by astype()
Message-ID: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca>

Hi list,

When I do large array manipulations, I get out-of-memory errors.
If the array size is 5000 by 6000, the following codes use nearly 1G.
Then my PC displays a Python error box.  The try/except won't catch it
if the memory error happens in "astype" instead of "array1* array2"

    try:
        if ( array1.typecode() in cplx_types ):
            array1 = abs(array1.astype(Numeric.Complex32))
        else:
            array1 = array1.astype(Numeric.Float32)

        if ( array2.typecode() in cplx_types ):
            array2 = abs(array2.astype(Numeric.Complex32))
        else:
            array2 = array2.astype(Numeric.Float32)

        array1 = Numeric.sqrt(array1) * Numeric.sqrt(array2)
        return array1

    except:
        gvutils.error("Memory error occurred\nPlease select a smaller
array")
        return None

My questions are:
1)	Is there a more memory efficient way of doing this?
2)	How do I deal with exception if astype is the only way to go
3)	Is there a way in Python that detects the available RAM and
limits the array size before he/she can go ahead with the array
multiplications?  
			i.e. detects the available RAM, say 800K 
			Assume worst case - Complex32
			Figure out how many temp_arrays used by numpy
			Calculate array size limit = ??
4)	If there is no 3) Is there something in Python that monitors
memory and warns the user. I have these "astype" at a number functions.
Do I have to put try/except at each location?

Thanks,
Shaw Gong


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070815/e876c448/attachment.html>

From markbak at gmail.com  Wed Aug 15 11:01:11 2007
From: markbak at gmail.com (mark)
Date: Wed, 15 Aug 2007 15:01:11 -0000
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <e76aa17f0708150438n60234b8cg5665a33ba04f904@mail.gmail.com>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187172556.613122.207400@r29g2000hsg.googlegroups.com>
	<46C2D4C7.4010305@bristol.ac.uk>
	<1187175588.007436.125720@w3g2000hsg.googlegroups.com>
	<e76aa17f0708150438n60234b8cg5665a33ba04f904@mail.gmail.com>
Message-ID: <1187190071.384881.240470@w3g2000hsg.googlegroups.com>

Maybe this is not the intended use of where, but it seems to work:
>>> from numpy import * # No complaining now
>>> a = arange(12)
>>> a.shape = (4,3)
>>> a
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
>>> b = array([6,7,8])
>>> row = all( equal(a,b), 1 )
>>> where(row==True)
(array([2]),)

On Aug 15, 1:38 pm, "Matthieu Brucher" <matthieu.bruc... at gmail.com>
wrote:
> The where function ?
>
> Matthieu
>
> 2007/8/15, mark <mark... at gmail.com>:
>
>
>
> > Oops, 'find' is in pylab (matplotlib).
> > I guess in numpy you have to use 'where', which does almost the same,
> > but it returns a Tuple.
> > Is there a function that is more like the find in matplotlib?
> > Mark
>
> > On Aug 15, 12:26 pm, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> > wrote:
> > > Thanks for the speedy response but where can I locate the find function
> > > as it isn't in numpy.
>
> > > Andy
>
> > > mark wrote:
> > > > I think you can create an array with a true value in the right spot as
> > > > folows:
>
> > > > row = all( equal(a,b), 1 )
>
> > > > Then you can either find the row (but you already knew that one, as it
> > > > is b)
>
> > > > a[row]
>
> > > > or the row index
>
> > > > find(row==True)
>
> > > > Mark
>
> > > > On Aug 15, 11:53 am, Andy Cheesman <Andy.chees... at bristol.ac.uk>
> > > > wrote:
> > > >> Dear nice people
>
> > > >> I'm trying to match a row (b) within a large numpy array (a). My most
> > > >> successful attempt is below
>
> > > >> hit = equal(b, a)
> > > >> total_hits = add.reduce(hit, 1)
> > > >> max_hit = argmax(total_hits, 0)
> > > >> answer = a[max_hit]
>
> > > >> where ...
> > > >> a = array([[ 0,  1,  2,  3],
> > > >>            [ 4,  5,  6,  7],
> > > >>            [ 8,  9, 10, 11],
> > > >>            [12, 13, 14, 15]])
>
> > > >> b = array([8,  9, 10, 11])
>
> > > >> I was wondering if people could suggest a possible more efficient
> > route
> > > >> as there seems to be numerous steps.
>
> > > >> Thanks
> > > >> Andy
> > > >> _______________________________________________
> > > >> Numpy-discussion mailing list
> > > >> Numpy-discuss... at scipy
> > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> > > > _______________________________________________
> > > > Numpy-discussion mailing list
> > > > Numpy-discuss... at scipy.org
> > > >http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> > > _______________________________________________
> > > Numpy-discussion mailing list
> > > Numpy-discuss... at scipy
> > .orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discuss... at scipy.org
> >http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From millman at berkeley.edu  Wed Aug 15 12:22:38 2007
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 15 Aug 2007 09:22:38 -0700
Subject: [Numpy-discussion] NumPy 1.0.3.x and SciPy 0.5.2.x
Message-ID: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>

Hello,

I am hoping to release NumPy 1.0.3.1 and SciPy 0.5.2.1 this weekend.
These releases will work with each other and get rid of the annoying
deprecation warning about SciPyTest.

They are both basically ready to release.  If you have some time,
please build and install the stable branches and let me know if you
have any errors.

You can check out the code here:
svn co http://svn.scipy.org/svn/numpy/branches/1.0.3.x
svn co http://svn.scipy.org/svn/scipy/branches/0.5.2.x

Below is a list of the changes I have made.

NumPy 1.0.3.1
============
* Adds back get_path to numpy.distutils.misc_utils

SciPy 0.5.2.1
==========
* Replaces ScipyTest with NumpyTest
* Fixes mio5.py as per revision 2893
* Adds missing test definition in scipy.cluster as per revision 2941
* Synchs odr module with trunk since odr is broken in 0.5.2
* Updates for SWIG > 1.3.29 and fixes memory leak of type 'void *'


Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


From Glen.Mabey at swri.org  Wed Aug 15 12:36:12 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Wed, 15 Aug 2007 11:36:12 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to	close()
In-Reply-To: <ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
	<ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
Message-ID: <20070815163612.GB23855@bams.ccf.swri.edu>

On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote:
> On 13/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> 
> > As I have tried to think through what should be the appropriate
> > behavior for the returned value of __getitem__, I have not been able to
> > see an appropriate solution (let alone know how to implement it) to this
> > issue.
> 
> Is the problem one of finalization? That is, making sure the memory
> map gets (flushed and) closed exactly once? In this case the
> numpythonic solution is to have only the original mmap object do any
> finalization; any slices contain a reference to it anyway, so they
> cannot be kept after it is collected. If the problem is that you want
> to do an explicit close/flush on a slice object, you could just always
> apply the close/flush to the base object of the slice if it has one or
> the slice itself if it doesn't.

The immediate problem is that when a numpy.memmap instance is created as
another view of the original array, then __del__ on that new view fails.

flush()ing and closing aren't an issue for me, but they can't be
performed at all on derived views right now.  It seems to me that any
derived view ought to be able to flush(), and ideally in my mind,
close() would be called [automatically] only just before the reference 
count gets decremented to zero.  

That doesn't seem to match the numpythonic philosophy you described,
Anne, but seems logical to me, while still allowing for both manual
flush() and close() operations.

Thanks for your response.

Glen


From matthew.brett at gmail.com  Wed Aug 15 15:02:31 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 15 Aug 2007 20:02:31 +0100
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070815163612.GB23855@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
	<ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
	<20070815163612.GB23855@bams.ccf.swri.edu>
Message-ID: <1e2af89e0708151202q68e2973co1fe407df07af2df2@mail.gmail.com>

Hi,

Thanks for looking into this because we (neuroimaging.scipy.org) use
mmaps a lot. I am very away from my desk at the moment but please do
keep us all informed, and we'll try and pitch in if we can...

Matthew


On 8/15/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote:
> > On 13/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> >
> > > As I have tried to think through what should be the appropriate
> > > behavior for the returned value of __getitem__, I have not been able to
> > > see an appropriate solution (let alone know how to implement it) to this
> > > issue.
> >
> > Is the problem one of finalization? That is, making sure the memory
> > map gets (flushed and) closed exactly once? In this case the
> > numpythonic solution is to have only the original mmap object do any
> > finalization; any slices contain a reference to it anyway, so they
> > cannot be kept after it is collected. If the problem is that you want
> > to do an explicit close/flush on a slice object, you could just always
> > apply the close/flush to the base object of the slice if it has one or
> > the slice itself if it doesn't.
>
> The immediate problem is that when a numpy.memmap instance is created as
> another view of the original array, then __del__ on that new view fails.
>
> flush()ing and closing aren't an issue for me, but they can't be
> performed at all on derived views right now.  It seems to me that any
> derived view ought to be able to flush(), and ideally in my mind,
> close() would be called [automatically] only just before the reference
> count gets decremented to zero.
>
> That doesn't seem to match the numpythonic philosophy you described,
> Anne, but seems logical to me, while still allowing for both manual
> flush() and close() operations.
>
> Thanks for your response.
>
> Glen
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From peridot.faceted at gmail.com  Wed Aug 15 20:50:28 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 15 Aug 2007 20:50:28 -0400
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070815163612.GB23855@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
	<ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
	<20070815163612.GB23855@bams.ccf.swri.edu>
Message-ID: <ce557a360708151750o46b43eb5ke948b9ed0f5eff8a@mail.gmail.com>

On 15/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote:
> > On 13/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> >
> > > As I have tried to think through what should be the appropriate
> > > behavior for the returned value of __getitem__, I have not been able to
> > > see an appropriate solution (let alone know how to implement it) to this
> > > issue.
> >
> > Is the problem one of finalization? That is, making sure the memory
> > map gets (flushed and) closed exactly once? In this case the
> > numpythonic solution is to have only the original mmap object do any
> > finalization; any slices contain a reference to it anyway, so they
> > cannot be kept after it is collected. If the problem is that you want
> > to do an explicit close/flush on a slice object, you could just always
> > apply the close/flush to the base object of the slice if it has one or
> > the slice itself if it doesn't.
>
> The immediate problem is that when a numpy.memmap instance is created as
> another view of the original array, then __del__ on that new view fails.

Yes, this is definitely broken.

> flush()ing and closing aren't an issue for me, but they can't be
> performed at all on derived views right now.  It seems to me that any
> derived view ought to be able to flush(), and ideally in my mind,
> close() would be called [automatically] only just before the reference
> count gets decremented to zero.
>
> That doesn't seem to match the numpythonic philosophy you described,
> Anne, but seems logical to me, while still allowing for both manual
> flush() and close() operations.

You have to be a bit careful, because a view really is just a view
into the array - the original is still around. So you can't really
delete the array contents when the view is deleted. Really, if you do:
B = A[::2]
del B
nothing at all should happen to A.

But to be pythonic, or numpythonic, when the original A is
garbage-collected, the garbage collection should certainly close the
mmap.

Being able to apply flush() or whatever to slices is not necessarily
unpythonic, but it's probably a lot simpler to reliably implement
slices of mmap()s as simple slices of ordinary arrays. It means you
need to keep the original mmap object around (or traverse up the tree
of bases:
T = A
while T.base is not None: T = T.base
T.flush()
)

(Note that this would be simpler if when you did
A = arange(100)
B = A[::2]
C = B[::2]
you found that C.base were A rather than B.)

Anne


From nadavh at visionsense.com  Thu Aug 16 06:04:35 2007
From: nadavh at visionsense.com (Nadav Horesh)
Date: Thu, 16 Aug 2007 13:04:35 +0300
Subject: [Numpy-discussion] deleting value from array
In-Reply-To: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
References: <1187168846.711948.138530@w3g2000hsg.googlegroups.com>
Message-ID: <1187258675.27158.1.camel@nadav.envision.co.il>

The closest I can think of is:

a =  a[range(len(a)) != 1]

   Nadav.

On Wed, 2007-08-15 at 02:07 -0700, mark wrote:

> I am trying to delete a value from an array
> This seems to work as follows
> 
> >>> a = array([1,2,3,4])
> >>> a = delete( a, 1 )
> >>> a
> array([1, 3, 4])
> 
> But wouldn't it make more sense to have a function like
> 
> a.delete(1) ?
> 
> I now get the feeling the delete command needs to copy the entire
> array with exception of the deleted item. I guess this is a hard thing
> to do efficiently?
> 
> Thanks,
> 
> Mark
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070816/288580e7/attachment.html>

From Glen.Mabey at swri.org  Thu Aug 16 10:17:10 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Thu, 16 Aug 2007 09:17:10 -0500
Subject: [Numpy-discussion] .transpose() of memmap array fails to	close()
In-Reply-To: <ce557a360708151750o46b43eb5ke948b9ed0f5eff8a@mail.gmail.com>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
	<ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
	<20070815163612.GB23855@bams.ccf.swri.edu>
	<ce557a360708151750o46b43eb5ke948b9ed0f5eff8a@mail.gmail.com>
Message-ID: <20070816141710.GA3154@bams.ccf.swri.edu>

On Wed, Aug 15, 2007 at 08:50:28PM -0400, Anne Archibald wrote:
> You have to be a bit careful, because a view really is just a view
> into the array - the original is still around. So you can't really
> delete the array contents when the view is deleted. Really, if you do:
> B = A[::2]
> del B
> nothing at all should happen to A.

Okay, right.  I was muddling those two concepts.

> But to be pythonic, or numpythonic, when the original A is
> garbage-collected, the garbage collection should certainly close the
> mmap.

Humm, this would be less than ideal for my use case, when the data on
disk is organized in a different dimensional order than I want to refer
to it in my code.  For example:

p_data = numpy.memmap( datafilename, shape=( 10, 1024, 20 ), dtype=numpy.float32, mode='r')
u_data = p_data.transpose( [ 2, 0, 1 ] )

and I don't want to have to keep track of p_data because its only u_data
that I care about and want to use.  And I promise, this is not a
contrived example.  I have data that I really do want to be ordered in a
certain way on disk, for I/O efficiency reasons, yet when I logically
index into it in my code, I want the dimensions to be in a different
order.

> Being able to apply flush() or whatever to slices is not necessarily
> unpythonic, but it's probably a lot simpler to reliably implement
> slices of mmap()s as simple slices of ordinary arrays. 

I considered this approach, but what happens if you want to instantiate
a slice that is very large, e.g., larger than the size of your physical
RAM?  In that case, you can't afford to make simple slices be ordinary
arrays, besides the case where you want to change values.  Making them
functionally memmap-arrays, but without .sync() and .close() doesn't
seem right either.  

> It means you
> need to keep the original mmap object around (or traverse up the tree
> of bases:
> T = A
> while T.base is not None: T = T.base
> T.flush()
> )
> 
> (Note that this would be simpler if when you did
> A = arange(100)
> B = A[::2]
> C = B[::2]
> you found that C.base were A rather than B.)

Okay, this would make it so that I didn't have to explicitly keep track
of p_data, in my example.  Not bad, although I'd never noticed a .base
member before ...

Thank you,
Glen Mabey


From Andy.cheesman at bristol.ac.uk  Tue Aug 14 06:53:03 2007
From: Andy.cheesman at bristol.ac.uk (Andy Cheesman)
Date: Tue, 14 Aug 2007 11:53:03 +0100
Subject: [Numpy-discussion] Finding a row match within a numpy array
Message-ID: <46C1898F.6020107@bristol.ac.uk>

Dear nice people

I'm trying to match a row (b) within a large numpy array (a). My most
successful attempt is below

hit = equal(b, a)
total_hits = add.reduce(hit, 1)
max_hit = argmax(total_hits, 0)
answer = a[max_hit]

where ...
a = array([[ 0,  1,  2,  3],
    	   [ 4,  5,  6,  7],
	   [ 8,  9, 10, 11],
	   [12, 13, 14, 15]])

b = array([8,  9, 10, 11])


I was wondering if people could suggest a possible more efficient route
as there seems to be numerous steps.

Thanks
Andy


From efiring at hawaii.edu  Thu Aug 16 15:20:41 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Thu, 16 Aug 2007 09:20:41 -1000
Subject: [Numpy-discussion] fast putmask implementation
Message-ID: <46C4A389.5020108@hawaii.edu>

In looking at maskedarray performance, I found that the filled() 
function or method is a bottleneck.  I think it can be sped up by using 
putmask instead of indexed assignment, but I found that putmask itself 
is slower than it needs to be.  So I followed David Cournapeau's example 
of fastclip and made a similar fastputmask.  The diff relative to 
current svn (3967) is attached.

The faster version makes a factor-of-ten or larger improvement in 
putmask speed.  numpy.test() still passes.

With 10000-element integer arrays the new version reduces the times from 
136 to 15 usec for 1000 masked elements, and 445 to 18 usec for 5000 
masked elements, with a scalar value argument.  It is only slightly 
slower with an array value argument.  (Times are for Intel Core2, 2 GH, 
linux.)

I hope someone will take a look and either tell me what I need to fix or 
commit it as-is.

Thanks.

Eric
-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpy_putmask.diff
Type: text/x-patch
Size: 32047 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070816/30ff93db/attachment.bin>

From zyzhu2000 at gmail.com  Thu Aug 16 21:26:34 2007
From: zyzhu2000 at gmail.com (Geoffrey Zhu)
Date: Thu, 16 Aug 2007 20:26:34 -0500
Subject: [Numpy-discussion] numpy.array does not take generators
Message-ID: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>

Hi All,

I want to construct a numpy array based on Python objects. In the
below code, opts is a list of tuples.

For example,

opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

If I use a generator like the following:

K=numpy.array(o[2]/1000.0 for o in opts)

It does not work.

I have to use:

numpy.array([o[2]/1000.0 for o in opts])

Is this behavior intended?

By the way, it is quite inefficient to create numpy array this way,
because I have to create a regular python first, and then construct a
numpy array. But I do not want to store everything in vector form
initially, as it is more natural to store them in objects, and easier
to use when organizing the data. Does anyone know any better way?

Thanks,
Geoffrey


From aisaac at american.edu  Thu Aug 16 22:04:11 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Thu, 16 Aug 2007 22:04:11 -0400
Subject: [Numpy-discussion] numpy.array does not take generators
In-Reply-To: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
References: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
Message-ID: <Mahogany-0.67.0-1540-20070816-220411.00@american.edu>

On Thu, 16 Aug 2007, Geoffrey Zhu apparently wrote:
> K=numpy.array(o[2]/1000.0 for o in opts)
> It does not work. 

K=numpy.fromiter((o[2]/1000.0 for o in opts),'float')

hth,
Alan Isaac


From cournape at gmail.com  Thu Aug 16 22:05:15 2007
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 17 Aug 2007 11:05:15 +0900
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <46C4A389.5020108@hawaii.edu>
References: <46C4A389.5020108@hawaii.edu>
Message-ID: <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>

On 8/17/07, Eric Firing <efiring at hawaii.edu> wrote:
> In looking at maskedarray performance, I found that the filled()
> function or method is a bottleneck.  I think it can be sped up by using
> putmask instead of indexed assignment, but I found that putmask itself
> is slower than it needs to be.  So I followed David Cournapeau's example
> of fastclip and made a similar fastputmask.  The diff relative to
> current svn (3967) is attached.

Great ! putmask was actually the function I wanted to improve after
clip, because it is the second bottleneck for matplotlib imagesc :) I
would not be suprised if now imagesc has descent speed compared to
matlab.

>
> I hope someone will take a look and either tell me what I need to fix or
> commit it as-is.

It looks like there are a lot of spurious diff in you patch (space vs
tab, or endline problems ?). Could you regenerate a patch without
them, since half of the patch is "garbage" ? It would be much easier
to see the changes you actually made.

cheers,

David


From efiring at hawaii.edu  Thu Aug 16 22:39:02 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Thu, 16 Aug 2007 16:39:02 -1000
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
Message-ID: <46C50A46.1070208@hawaii.edu>

David Cournapeau wrote:
> On 8/17/07, Eric Firing <efiring at hawaii.edu> wrote:
>> In looking at maskedarray performance, I found that the filled()
>> function or method is a bottleneck.  I think it can be sped up by using
>> putmask instead of indexed assignment, but I found that putmask itself
>> is slower than it needs to be.  So I followed David Cournapeau's example
>> of fastclip and made a similar fastputmask.  The diff relative to
>> current svn (3967) is attached.
> 
> Great ! putmask was actually the function I wanted to improve after
> clip, because it is the second bottleneck for matplotlib imagesc :) I
> would not be suprised if now imagesc has descent speed compared to
> matlab.
> 
>> I hope someone will take a look and either tell me what I need to fix or
>> commit it as-is.
> 
> It looks like there are a lot of spurious diff in you patch (space vs
> tab, or endline problems ?). Could you regenerate a patch without
> them, since half of the patch is "garbage" ? It would be much easier
> to see the changes you actually made.

Agreed.  This is because my editor deletes spurious whitespace that was 
already in the file.  If I ruled the world, the spurious whitespace and 
hard tabs would never be there in the first place. (If I were younger I 
might use smileys in places like this, but they just don't come 
naturally to me.) As far as I can see there is no way of using svn diff 
to deal with this automatically, so in the attached revision I have 
manually removed chunks resulting solely from whitespace.  Some of the 
remaining chunks unfortunately have a mixture of whitespace and 
substantive differences.  And manually removing chunks is risky.

Is there a better way to handle this problem?  A better way to make 
diffs?  Or any possibility of routinely cleaning the junk out of the svn 
source files?  (Yes, I know--what is junk to me probably results from 
what others consider good behavior of the editor.)

Eric

> 
> cheers,
> 
> David
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpy_putmask.diff
Type: text/x-patch
Size: 8716 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070816/712f5b9f/attachment.bin>

From cookedm at physics.mcmaster.ca  Thu Aug 16 23:07:19 2007
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Thu, 16 Aug 2007 23:07:19 -0400
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <46C50A46.1070208@hawaii.edu>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
	<46C50A46.1070208@hawaii.edu>
Message-ID: <20070817030719.GA5542@arbutus.physics.mcmaster.ca>

On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote:
> As far as I can see there is no way of using svn diff to deal with 
> this automatically, so in the attached revision I have manually removed 
> chunks resulting solely from whitespace.
>
> Is there a better way to handle this problem?  A better way to make diffs?  
> Or any possibility of routinely cleaning the junk out of the svn source 
> files?  (Yes, I know--what is junk to me probably results from what others 
> consider good behavior of the editor.)

'svn diff -x -b' might work better (-b gets passed to diff, which makes
it ignore space changes). Or svn diff -x -w to ignore all whitespace.

Me, I hate trailing ws too (I've got Emacs set up so that gets
highlighted as red, which makes me angry :). The hard tabs in C code is
keeping with the style used in the C Python sources (Emacs even has a
'python' C style -- do "C-c . python").

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca


From efiring at hawaii.edu  Fri Aug 17 01:32:51 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Thu, 16 Aug 2007 19:32:51 -1000
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <20070817030719.GA5542@arbutus.physics.mcmaster.ca>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
	<46C50A46.1070208@hawaii.edu>
	<20070817030719.GA5542@arbutus.physics.mcmaster.ca>
Message-ID: <46C53303.4000806@hawaii.edu>

David M. Cooke wrote:
> On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote:
>> As far as I can see there is no way of using svn diff to deal with 
>> this automatically, so in the attached revision I have manually removed 
>> chunks resulting solely from whitespace.
>>
>> Is there a better way to handle this problem?  A better way to make diffs?  
>> Or any possibility of routinely cleaning the junk out of the svn source 
>> files?  (Yes, I know--what is junk to me probably results from what others 
>> consider good behavior of the editor.)
> 
> 'svn diff -x -b' might work better (-b gets passed to diff, which makes
> it ignore space changes). Or svn diff -x -w to ignore all whitespace.
> 
> Me, I hate trailing ws too (I've got Emacs set up so that gets
> highlighted as red, which makes me angry :). The hard tabs in C code is
> keeping with the style used in the C Python sources (Emacs even has a
> 'python' C style -- do "C-c . python").
> 

David,

Thank you.  I had tried something like that a while ago without success, 
and now I know why: the '-w' has to be quoted to keep it out of the 
clutches of the shell, so it is "svn diff -x '-w'".  The result is 
attached.  Much better.

As for hard tabs in C Python sources--it is still a bad idea even if the 
BDFL himself does it--very bad for Python, not quite as bad for C, but 
still bad.  Too fragile, too dependent on editor configuration, and in 
numpy, not done consistently--it's a complete mishmash of tabs and 
spaces.  OK, enough of that.

Eric
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: putmask.diff_w
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070816/cd1d3648/attachment.ksh>

From robert.kern at gmail.com  Fri Aug 17 02:19:53 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 16 Aug 2007 23:19:53 -0700
Subject: [Numpy-discussion] numpy.array does not take generators
In-Reply-To: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
References: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
Message-ID: <46C53E09.9020306@gmail.com>

Geoffrey Zhu wrote:
> Hi All,
> 
> I want to construct a numpy array based on Python objects. In the
> below code, opts is a list of tuples.
> 
> For example,
> 
> opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> 
> If I use a generator like the following:
> 
> K=numpy.array(o[2]/1000.0 for o in opts)
> 
> It does not work.
> 
> I have to use:
> 
> numpy.array([o[2]/1000.0 for o in opts])
> 
> Is this behavior intended?

Yes. With arbitrary generators, there is no good way to do the kind of
mind-reading that numpy.array() usually does with sequences. It would have to
unroll the whole generator anyways. fromiter() works for this, but you are
restricted to 1-D arrays which is a lot easier to implement the mind-reading for.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From efiring at hawaii.edu  Fri Aug 17 02:40:01 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Thu, 16 Aug 2007 20:40:01 -1000
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <20070817030719.GA5542@arbutus.physics.mcmaster.ca>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
	<46C50A46.1070208@hawaii.edu>
	<20070817030719.GA5542@arbutus.physics.mcmaster.ca>
Message-ID: <46C542C1.8060907@hawaii.edu>

David M. Cooke wrote:
> On Thu, Aug 16, 2007 at 04:39:02PM -1000, Eric Firing wrote:
>> As far as I can see there is no way of using svn diff to deal with 
>> this automatically, so in the attached revision I have manually removed 
>> chunks resulting solely from whitespace.
>>
>> Is there a better way to handle this problem?  A better way to make diffs?  
>> Or any possibility of routinely cleaning the junk out of the svn source 
>> files?  (Yes, I know--what is junk to me probably results from what others 
>> consider good behavior of the editor.)
> 
> 'svn diff -x -b' might work better (-b gets passed to diff, which makes
> it ignore space changes). Or svn diff -x -w to ignore all whitespace.
> 
> Me, I hate trailing ws too (I've got Emacs set up so that gets
> highlighted as red, which makes me angry :). The hard tabs in C code is
> keeping with the style used in the C Python sources (Emacs even has a
> 'python' C style -- do "C-c . python").
> 

Not any more! See the revised PEP 007, 
http://www.python.org/dev/peps/pep-0007/

     In Python 3000 (and in the 2.x series, in new source files),
     we'll switch to a different indentation style: 4 spaces per indent,
     all spaces (no tabs in any file).  The rest will remain the same.

I would love to see this as the standard in numpy as well.  Then files 
obey WYSIWYG regardless of editor.  (Except for unicode woes, but that 
is another topic.)

Eric


From peridot.faceted at gmail.com  Fri Aug 17 04:11:12 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Fri, 17 Aug 2007 04:11:12 -0400
Subject: [Numpy-discussion] .transpose() of memmap array fails to close()
In-Reply-To: <20070816141710.GA3154@bams.ccf.swri.edu>
References: <20070607214620.GM6116@bams.ccf.swri.edu>
	<20070810162016.GA12992@bams.ccf.swri.edu>
	<20070810211438.GF13557@bams.ccf.swri.edu>
	<Mahogany-0.67.0-992-20070813-115149.00@american.edu>
	<20070813161945.GA12360@bams.ccf.swri.edu>
	<ce557a360708132123k5b373b6l2eda66913d6e0048@mail.gmail.com>
	<20070815163612.GB23855@bams.ccf.swri.edu>
	<ce557a360708151750o46b43eb5ke948b9ed0f5eff8a@mail.gmail.com>
	<20070816141710.GA3154@bams.ccf.swri.edu>
Message-ID: <ce557a360708170111k35096c9fkcae465f83299635c@mail.gmail.com>

On 16/08/07, Glen W. Mabey <Glen.Mabey at swri.org> wrote:
> On Wed, Aug 15, 2007 at 08:50:28PM -0400, Anne Archibald wrote:
> > But to be pythonic, or numpythonic, when the original A is
> > garbage-collected, the garbage collection should certainly close the
> > mmap.
>
> Humm, this would be less than ideal for my use case, when the data on
> disk is organized in a different dimensional order than I want to refer
> to it in my code.  For example:
>
> p_data = numpy.memmap( datafilename, shape=( 10, 1024, 20 ), dtype=numpy.float32, mode='r')
> u_data = p_data.transpose( [ 2, 0, 1 ] )
>
> and I don't want to have to keep track of p_data because its only u_data
> that I care about and want to use.  And I promise, this is not a
> contrived example.  I have data that I really do want to be ordered in a
> certain way on disk, for I/O efficiency reasons, yet when I logically
> index into it in my code, I want the dimensions to be in a different
> order.

Perfectly reasonable. Note that p_data cannot be collected until
u_data goes away too, so the mmap is safe. And transpose()ing doesn't
copy any data, so even if you get an ndarray, you haven't lost the
ability to modify things on disk.

> > Being able to apply flush() or whatever to slices is not necessarily
> > unpythonic, but it's probably a lot simpler to reliably implement
> > slices of mmap()s as simple slices of ordinary arrays.
>
> I considered this approach, but what happens if you want to instantiate
> a slice that is very large, e.g., larger than the size of your physical
> RAM?  In that case, you can't afford to make simple slices be ordinary
> arrays, besides the case where you want to change values.  Making them
> functionally memmap-arrays, but without .sync() and .close() doesn't
> seem right either.

I was a bit ambiguous. An ordinary numpy array is an ndarray object,
which contains some housekeeping data (dimension, shape, stride
lengths, some flags, what have you) and a pointer to a hunk of memory.
That hunk of memory can be pretty much any directly-addressable
memory, for example a contiguous block of malloc()ed RAM, the
beginning of a (possibly strided) subblock of an existing piece of
malloc()ed RAM, a pointer to an array statically allocated in some C
or Fortran library... or a piece of memory in an mmap()ed region.
Numpy doesn't care at all about the difference. In fact this is the
beauty of numpy: because all it cares about is where the elements
start, what they look like, how many there are, and how far apart they
are, it can cheaply create subarrays without copying any data.

So naively, one might implement mmap()ed arrays with a factory
function that called mmap(), got back a pointer to the place in
virtual memory where the file's contents appear to live, and whipped
up a perfectly ordinary ndarray to point to the contents. It would
work, thanks to the magic of the OS's mmap() call. The only problem is
you would have to figure out when it was safe to close the mmap()
(invalidating the array's memory!) and you would have no convenient
way to flush() the mmap() out to disk.

So the mmap() objects exist. All they are is ndarrays that keep track
of how the mmap() was done and provide flush() and close() methods;
they also (I hope!) make sure close() gets called when they get
garbage-collected. Note that the safety-scissors way to do this would
be to *not* provide a close() method, since a close() leaves the
object's data unusable, just waiting for an unwise attempt to index
into the object. It's probably better not to ever close() an mmap()
object.

What should happen when you take a slice of an mmap() object? (this
includes transposes and other non-copying ways to get at its
contents). You get a fresh new ndarray object that does all the numpy
magic. But should it also do the mmap() magic? It doesn't need the
mmap() creation magic, since the mmap() already exists. flush() would
be sort of nice, since that's meaningful (though it might take a long
time, if it flushes the whole mmap). close() is just asking to shoot
yourself in the foot, since it not only invalidates the slice you took
but the whole mmap()!

It seems to me - remember, I don't use mmap or develop numpy, so give
this opinion the corresponding weight - that the Right Answer for
mmap() is to provide flush(), but not to provide close() except on
finalization (you can ensure finalization happens by deleting all
references to the array). Finally, if you take a slice of an mmap(), I
think you should get a simple ndarray. This ensures you don't have to
thread type-duplication code into everywhere that might make a slice.
But if you do make slices themselves mmap()s, providing flush() to
slices too, great. Just don't provide close(), and particularly
*don't* invoke it on finalization of slices, or things will die
horribly.


Anne


From matthew.brett at gmail.com  Fri Aug 17 08:11:41 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 17 Aug 2007 13:11:41 +0100
Subject: [Numpy-discussion] Error in allclose with inf values
Message-ID: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com>

Hi,

I noticed that allclose does not always behave correctly for arrays with infs.

I've attached a test script for allclose, and here's an alternative
implementation that I believe behaves correctly.

Obviously the test script could be a test case in core/tests/test_numeric.py

I wonder if we should allow nans in the test arrays - possibly with an
optional keyword arg like allownan.  After all inf-inf is nan - but we
allow that in the test.

Best,

Matthew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: my_allclose.py
Type: text/x-python-script
Size: 758 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070817/55c80aa6/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_allclose.py
Type: text/x-python-script
Size: 1463 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070817/55c80aa6/attachment-0001.bin>

From zyzhu2000 at gmail.com  Fri Aug 17 18:26:55 2007
From: zyzhu2000 at gmail.com (Geoffrey Zhu)
Date: Fri, 17 Aug 2007 17:26:55 -0500
Subject: [Numpy-discussion] numpy.array does not take generators
In-Reply-To: <46C53E09.9020306@gmail.com>
References: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
	<46C53E09.9020306@gmail.com>
Message-ID: <fcedc2970708171526x5c286179u5475a7f2405e0f96@mail.gmail.com>

On 8/17/07, Robert Kern <robert.kern at gmail.com> wrote:
> Geoffrey Zhu wrote:
> > Hi All,
> >
> > I want to construct a numpy array based on Python objects. In the
> > below code, opts is a list of tuples.
> >
> > For example,
> >
> > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> >
> > If I use a generator like the following:
> >
> > K=numpy.array(o[2]/1000.0 for o in opts)
> >
> > It does not work.
> >
> > I have to use:
> >
> > numpy.array([o[2]/1000.0 for o in opts])
> >
> > Is this behavior intended?
>
> Yes. With arbitrary generators, there is no good way to do the kind of
> mind-reading that numpy.array() usually does with sequences. It would have to
> unroll the whole generator anyways. fromiter() works for this, but you are
> restricted to 1-D arrays which is a lot easier to implement the mind-reading for.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
>  that is made terrible by our own mad attempt to interpret it as though it had
>  an underlying truth."
>  -- Umberto Eco
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

I see. Thanks for explaining.


From barrywark at gmail.com  Fri Aug 17 19:00:58 2007
From: barrywark at gmail.com (Barry Wark)
Date: Fri, 17 Aug 2007 16:00:58 -0700
Subject: [Numpy-discussion] numpy.array does not take generators
In-Reply-To: <46C53E09.9020306@gmail.com>
References: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
	<46C53E09.9020306@gmail.com>
Message-ID: <cd7634ce0708171600s2a08c3a0q413fc756dc6dd00c@mail.gmail.com>

Is there a reason not to add an argument to fromiter that specifies
the final size of the n-d array? Reading this discussion, I realized
that there are several places in my code where I create 2-D arrays
like this:

arr = N.array([d.data() for d in list_of_data_containers]),

where d.data() returns a buffer object.

I would guess that this paradigm causes lots of memory copying. The
more efficient solution, I think, would be to preallocate the array
and then assign each row in a loop. It's so much clearer this way,
however, that I've kept it as is in the code.

So, what if I could do something like

arr = N.fromiter(d.data() for d in list_of_data_containers, shape=(x,y)),

with the contract that fromiter will throw an exception if any of the
d.data() are not of size y or if there are more than x elements in
list_of_data_containers?

Just a thought for discussion.

barry

On 8/16/07, Robert Kern <robert.kern at gmail.com> wrote:
> Geoffrey Zhu wrote:
> > Hi All,
> >
> > I want to construct a numpy array based on Python objects. In the
> > below code, opts is a list of tuples.
> >
> > For example,
> >
> > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> >
> > If I use a generator like the following:
> >
> > K=numpy.array(o[2]/1000.0 for o in opts)
> >
> > It does not work.
> >
> > I have to use:
> >
> > numpy.array([o[2]/1000.0 for o in opts])
> >
> > Is this behavior intended?
>
> Yes. With arbitrary generators, there is no good way to do the kind of
> mind-reading that numpy.array() usually does with sequences. It would have to
> unroll the whole generator anyways. fromiter() works for this, but you are
> restricted to 1-D arrays which is a lot easier to implement the mind-reading for.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
>  that is made terrible by our own mad attempt to interpret it as though it had
>  an underlying truth."
>   -- Umberto Eco
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From tim.hochberg at ieee.org  Fri Aug 17 20:00:24 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Fri, 17 Aug 2007 17:00:24 -0700
Subject: [Numpy-discussion] numpy.array does not take generators
In-Reply-To: <cd7634ce0708171600s2a08c3a0q413fc756dc6dd00c@mail.gmail.com>
References: <fcedc2970708161826u12c8aa78mecdc3f55b8504338@mail.gmail.com>
	<46C53E09.9020306@gmail.com>
	<cd7634ce0708171600s2a08c3a0q413fc756dc6dd00c@mail.gmail.com>
Message-ID: <e4412d6b0708171700h1b771d2fw379f82a953e5f95a@mail.gmail.com>

On 8/17/07, Barry Wark <barrywark at gmail.com> wrote:
>
> Is there a reason not to add an argument to fromiter that specifies
> the final size of the n-d array? Reading this discussion, I realized
> that there are several places in my code where I create 2-D arrays
> like this:
>
> arr = N.array([d.data() for d in list_of_data_containers]),
>
> where d.data() returns a buffer object.
>
> I would guess that this paradigm causes lots of memory copying. The
> more efficient solution, I think, would be to preallocate the array
> and then assign each row in a loop. It's so much clearer this way,
> however, that I've kept it as is in the code.
>
> So, what if I could do something like
>
> arr = N.fromiter(d.data() for d in list_of_data_containers, shape=(x,y)),


I don't know that there's any theoretical problem in terms of doing
something like this. There are a couple of practical issues though. One is
that it would significantly increase the implementation complexity of
fromiter, which right now is about as simple as it can reasonably be.
Someone would need to step forward and write and test the code. The second
issue is with the interface. The interface that you propose isn't really
right. The current interface is:

   fromiter(iterable, dtype, count=-1)

where count indicates how many items to extract from the iterable (-1
iterates until it is empty). 'shape' as you propose would couple to this in
an unnatural way. Adding another keyword argument that indicates just the
shape of the elements would make more sense, but it starts to seem a bit
clunky.

  fromiter(iterable, dtype, count-1, itemshape=())

For this particular application, there doesn't seem to be any problem simply
defining yourself a little utility function to do this for you.

def from_shaped_iter(iterable, dtype, shape):
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x
    return a

I expect this would have decent performance if y dimension is reasonably
large.

regards,


-tim

with the contract that fromiter will throw an exception if any of the
> d.data() are not of size y or if there are more than x elements in
> list_of_data_containers?
>
> Just a thought for discussion.
>
> barry
>
> On 8/16/07, Robert Kern <robert.kern at gmail.com> wrote:
> > Geoffrey Zhu wrote:
> > > Hi All,
> > >
> > > I want to construct a numpy array based on Python objects. In the
> > > below code, opts is a list of tuples.
> > >
> > > For example,
> > >
> > > opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
> > >
> > > If I use a generator like the following:
> > >
> > > K=numpy.array(o[2]/1000.0 for o in opts)
> > >
> > > It does not work.
> > >
> > > I have to use:
> > >
> > > numpy.array([o[2]/1000.0 for o in opts])
> > >
> > > Is this behavior intended?
> >
> > Yes. With arbitrary generators, there is no good way to do the kind of
> > mind-reading that numpy.array() usually does with sequences. It would
> have to
> > unroll the whole generator anyways. fromiter() works for this, but you
> are
> > restricted to 1-D arrays which is a lot easier to implement the
> mind-reading for.
> >
> > --
> > Robert Kern
> >
> > "I have come to believe that the whole world is an enigma, a harmless
> enigma
> >  that is made terrible by our own mad attempt to interpret it as though
> it had
> >  an underlying truth."
> >   -- Umberto Eco
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070817/5cfb6339/attachment.html>

From oliphant.travis at ieee.org  Sat Aug 18 03:51:50 2007
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat, 18 Aug 2007 01:51:50 -0600
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <46C542C1.8060907@hawaii.edu>
References: <46C4A389.5020108@hawaii.edu>	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>	<46C50A46.1070208@hawaii.edu>	<20070817030719.GA5542@arbutus.physics.mcmaster.ca>
	<46C542C1.8060907@hawaii.edu>
Message-ID: <46C6A516.4000607@ieee.org>

> Not any more! See the revised PEP 007, 
> http://www.python.org/dev/peps/pep-0007/
> 
>      In Python 3000 (and in the 2.x series, in new source files),
>      we'll switch to a different indentation style: 4 spaces per indent,
>      all spaces (no tabs in any file).  The rest will remain the same.
> 
> I would love to see this as the standard in numpy as well.  Then files 
> obey WYSIWYG regardless of editor.  (Except for unicode woes, but that 
> is another topic.)
> 

I'm fine with this.  Some information on how to make sure emacs (and 
other editors) does this would be helpful.

-Travis


From jensj at fysik.dtu.dk  Sat Aug 18 05:00:56 2007
From: jensj at fysik.dtu.dk (Jens =?ISO-8859-1?Q?J=F8rgen?= Mortensen)
Date: Sat, 18 Aug 2007 11:00:56 +0200
Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing
Message-ID: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk>

I would like all these arrays to be contiguous:

>>> import numpy as npy
>>> npy.__version__
'1.0.4.dev3967'
>>> x = npy.arange(4)
>>> y = x[npy.newaxis, :]
>>> z = x.reshape((1, 4))
>>> for a in [x, y, z]:
...     print a.shape, a.strides, a.flags.contiguous
... 
(4,) (4,) True
(1, 4) (0, 4) False
(1, 4) (16, 4) True

But y is not contiguous according to y.flags.contiguous - why not and
why does y and z not have the same strides?

I found this comment just before the _IsContiguous function in
arrayobject.c:

/* 0-strided arrays are not contiguous (even if dimension == 1) */

Is this correct?

Jens J?rgen Mortensen


From stefan at sun.ac.za  Sat Aug 18 06:21:07 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Sat, 18 Aug 2007 12:21:07 +0200
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <46C1898F.6020107@bristol.ac.uk>
References: <46C1898F.6020107@bristol.ac.uk>
Message-ID: <20070818102107.GL2977@mentat.za.net>

On Tue, Aug 14, 2007 at 11:53:03AM +0100, Andy Cheesman wrote:
> Dear nice people
> 
> I'm trying to match a row (b) within a large numpy array (a). My most
> successful attempt is below
> 
> hit = equal(b, a)
> total_hits = add.reduce(hit, 1)
> max_hit = argmax(total_hits, 0)
> answer = a[max_hit]
> 
> where ...
> a = array([[ 0,  1,  2,  3],
>     	   [ 4,  5,  6,  7],
> 	   [ 8,  9, 10, 11],
> 	   [12, 13, 14, 15]])
> 
> b = array([8,  9, 10, 11])
> 
> 
> 
> I was wondering if people could suggest a possible more efficient route
> as there seems to be numerous steps.

Another way to do it:

a[N.apply_along_axis(N.all,1,a==b)]

Cheers
St?fan


From stefan at sun.ac.za  Sat Aug 18 06:25:21 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Sat, 18 Aug 2007 12:25:21 +0200
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <46C1898F.6020107@bristol.ac.uk>
References: <46C1898F.6020107@bristol.ac.uk>
Message-ID: <20070818102521.GM2977@mentat.za.net>

On Tue, Aug 14, 2007 at 11:53:03AM +0100, Andy Cheesman wrote:
> Dear nice people
> 
> I'm trying to match a row (b) within a large numpy array (a). My most
> successful attempt is below
> 
> hit = equal(b, a)
> total_hits = add.reduce(hit, 1)
> max_hit = argmax(total_hits, 0)
> answer = a[max_hit]
> 
> where ...
> a = array([[ 0,  1,  2,  3],
>     	   [ 4,  5,  6,  7],
> 	   [ 8,  9, 10, 11],
> 	   [12, 13, 14, 15]])
> 
> b = array([8,  9, 10, 11])
> 
> 
> 
> I was wondering if people could suggest a possible more efficient route
> as there seems to be numerous steps.

For large arrays, you may not want to calculate a == b, so you could
also do

[row for row in a if N.all(row == b)]

or find the indices using

[r for r,row in enumerate(a) if N.all(row == b)]

Cheers
St?fan


From oliphant.travis at ieee.org  Sat Aug 18 06:51:53 2007
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat, 18 Aug 2007 04:51:53 -0600
Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing
In-Reply-To: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk>
References: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk>
Message-ID: <46C6CF49.6020202@ieee.org>

Jens J?rgen Mortensen wrote:
> I would like all these arrays to be contiguous:
> 
>>>> import numpy as npy
>>>> npy.__version__
> '1.0.4.dev3967'
>>>> x = npy.arange(4)
>>>> y = x[npy.newaxis, :]
>>>> z = x.reshape((1, 4))
>>>> for a in [x, y, z]:
> ...     print a.shape, a.strides, a.flags.contiguous
> ... 
> (4,) (4,) True
> (1, 4) (0, 4) False
> (1, 4) (16, 4) True
> 
> But y is not contiguous according to y.flags.contiguous - why not and
> why does y and z not have the same strides?
> 
> I f

We've tried a few times to let them be contiguous, but it breaks code in 
various ways because NumPy takes advantage of 0-striding to accomplish 
broadcasting.   In theory, it might be able to be fixed, but the fact 
that simple fixes don't work makes me wonder.


ound this comment just before the _IsContiguous function in
> arrayobject.c:
> 
> /* 0-strided arrays are not contiguous (even if dimension == 1) */
> 
> Is this correct?

Yes.

-Travis


From gael.varoquaux at normalesup.org  Sat Aug 18 11:11:49 2007
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 18 Aug 2007 17:11:49 +0200
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <46C6A516.4000607@ieee.org>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
	<46C50A46.1070208@hawaii.edu>
	<20070817030719.GA5542@arbutus.physics.mcmaster.ca>
	<46C542C1.8060907@hawaii.edu> <46C6A516.4000607@ieee.org>
Message-ID: <20070818151149.GB16053@clipper.ens.fr>

On Sat, Aug 18, 2007 at 01:51:50AM -0600, Travis Oliphant wrote:
> I'm fine with this.  Some information on how to make sure emacs (and 
> other editors) does this would be helpful.

Under vim, put in your .vimrc:

autocmd FileType python set autoindent tabstop=4 shiftwidth=4 smarttab expandtab

Ga?l


From faltet at carabos.com  Sat Aug 18 16:04:39 2007
From: faltet at carabos.com (Francesc Altet)
Date: Sat, 18 Aug 2007 22:04:39 +0200
Subject: [Numpy-discussion] Fwd: Request for Use Cases - h5import and text
	data
Message-ID: <200708182204.39674.faltet@carabos.com>

Hi,

This has been sent to the hdf-forum at hdfgroup.org list, but it should of 
interest to NumPy/SciPy lists too.  Remember that you can access most of 
the HDF5 files from Python by using PyTables.

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   C?rabos Coop. V.   Enjoy Data
 "-"

----------- Original message -------------------------------------

Request for Use Cases - h5import and text data

h5import is an HDF5 tool that converts floating point or integer data 
stored in ASCII or binary files into the HDF5 format. Currently h5import 
only processes numeric data. The HDF Group plans to add support for 
importing text data into HDF5 using h5import.

We are now soliciting use cases that will guide the design of the text 
to dataset import feature in h5import. Please consider text you might 
want to import and how you would want to access that text once it is in 
the HDF5 file, and send your use cases to help at hdfgroup.org before 
September 17, 2007.

Thank you for your help as we strive to improve our tools.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Pytables-users mailing list
Pytables-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users

-------------------------------------------------------

-- 
>0,0<   Francesc Altet ? ? http://www.carabos.com/
V   V   C?rabos Coop. V. ??Enjoy Data
 "-"


From matthew.brett at gmail.com  Sun Aug 19 07:57:50 2007
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 19 Aug 2007 12:57:50 +0100
Subject: [Numpy-discussion] Error in allclose with inf values
In-Reply-To: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com>
References: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com>
Message-ID: <1e2af89e0708190457o7e0269b9ta2b8b0aff787e029@mail.gmail.com>

Hi again,

> I noticed that allclose does not always behave correctly for arrays with infs.

Sorry, perhaps I should have been more specific; this is the behavior
of allclose that I was referring to (documented in the tests I
attached):

In [6]:N.allclose([N.inf, 1, 2], [10, 10, N.inf])
Out[6]:array([ True], dtype=bool)

In [7]:N.allclose([N.inf, N.inf, N.inf], [10, 10, N.inf])
Warning: invalid value encountered in subtract
Out[7]:True

In [9]:N.allclose([N.inf, N.inf], [10, 10])
---------------------------------------------------------------------------
exceptions.AttributeError                            Traceback (most
recent call last)

/home/mb312/<ipython console>

/home/mb312/lib/python2.4/site-packages/numpy/core/numeric.py in
allclose(a, b, rtol, atol)
    843     d3 = (x[xinf] == y[yinf])
    844     d4 = (~xinf & ~yinf)
--> 845     if d3.size < 2:
    846         if d3.size==0:
    847             return False

AttributeError: 'bool' object has no attribute 'size'


Matthew


From fperez.net at gmail.com  Sun Aug 19 20:51:11 2007
From: fperez.net at gmail.com (Fernando Perez)
Date: Sun, 19 Aug 2007 18:51:11 -0600
Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x
In-Reply-To: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>
References: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>
Message-ID: <db6b5ecc0708191751n65bd47fat74c244bb7dd222eb@mail.gmail.com>

Hey,

On 8/15/07, Jarrod Millman <millman at berkeley.edu> wrote:
> Hello,
>
> I am hoping to release NumPy 1.0.3.1 and SciPy 0.5.2.1 this weekend.
> These releases will work with each other and get rid of the annoying
> deprecation warning about SciPyTest.

I just wanted to give you a public, huge thank you for tackling this
most thankless but important problem.  Many people at the just
finished SciPy'07 conference mentioned better deployment/installation
support as their main issue with scipy.  Our tools are maturing, but
we won't get very far if they don't actually get in the hands of
users.

Regards,

f


From markbak at gmail.com  Mon Aug 20 09:05:54 2007
From: markbak at gmail.com (mark)
Date: Mon, 20 Aug 2007 13:05:54 -0000
Subject: [Numpy-discussion] selecting part of an array like a[ a<5 ]
Message-ID: <1187615154.941614.314430@50g2000hsm.googlegroups.com>

Hello - I am wondering what the better way is to select part of an
array.

Say I have an array a:

a = arange(10)

Now I want to select the values larger than 5

a[ a>5 ]

and later I need the values smaller or equal to 5

a[ a<=5 ]

It seems that doing the comparison twice is extra work (especially if
the array is large).

So I thought I store the comparison

b = a>5

Now I can do

a[b]

But how do I get the others?

a[not b] or a[!b] don't work. So it's gotta be something different.

Besides, is it a good idea to store b like I suggest?

Thanks for the help,

Mark


From kwgoodman at gmail.com  Mon Aug 20 09:19:34 2007
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 20 Aug 2007 15:19:34 +0200
Subject: [Numpy-discussion] selecting part of an array like a[ a<5 ]
In-Reply-To: <1187615154.941614.314430@50g2000hsm.googlegroups.com>
References: <1187615154.941614.314430@50g2000hsm.googlegroups.com>
Message-ID: <f4f93d420708200619y7ce1f358ved8e32483f5b221b@mail.gmail.com>

On 8/20/07, mark <markbak at gmail.com> wrote:

> b = a>5
>
> a[not b] or a[!b] don't work. So it's gotta be something different.

a[~b]


From sameerslists at gmail.com  Mon Aug 20 09:34:53 2007
From: sameerslists at gmail.com (Sameer DCosta)
Date: Mon, 20 Aug 2007 08:34:53 -0500
Subject: [Numpy-discussion] Setting numpy record array elements
Message-ID: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com>

Hi,

In the example below I have a record array *a* that has a column
*col1". I am trying to set the first element of a.col1 to zero  in two
different ways.

1. a[0].col1 = 0 (This fails silently)
2. a.col1[0] = 0 (This works fine)

I am using the latest svn version of numpy. Is this a bug? or is the
first method supposed to fail? If it is supposed to fail should it
fail silently?

Thanks in advance for any replies.

Cheers,
Sameer


%%%% Example Code %%%

import numpy
print numpy.__version__
a = numpy.rec.fromrecords( [ (1,2,3), (4,5,6)],
        dtype=[("col1", "<i4"), ("col2", "<i4"), ("col3", "<i4")])
print "a = %s" % a.__repr__()

print "First element of col1 is %ld" % a.col1[0]
print "Setting first element of col1 to zero using a[0].col1 = 0"
a[0].col1 = 0
print "Result: a.col1[0] = %ld" % a[0].col1

print "Setting first element of col1 to zero using a.col1[0] = 0"
a.col1[0] = 0
print "Result: a.col1[0] = %ld" % a[0].col1


%%%% Example Code Output %%%

1.0.4.dev3975
a = recarray([(1, 2, 3), (4, 5, 6)],
      dtype=[('col1', '<i4'), ('col2', '<i4'), ('col3', '<i4')])
First element of col1 is 1
Setting first element of col1 to zero using a[0].col1 = 0
Result: a.col1[0] = 1
Setting first element of col1 to zero using a.col1[0] = 0
Result: a.col1[0] = 0


From stefan at sun.ac.za  Mon Aug 20 10:26:19 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 20 Aug 2007 16:26:19 +0200
Subject: [Numpy-discussion] fast putmask implementation
In-Reply-To: <46C6A516.4000607@ieee.org>
References: <46C4A389.5020108@hawaii.edu>
	<5b8d13220708161905l76f2ca74p5eedb8ca5ded1d18@mail.gmail.com>
	<46C50A46.1070208@hawaii.edu>
	<20070817030719.GA5542@arbutus.physics.mcmaster.ca>
	<46C542C1.8060907@hawaii.edu> <46C6A516.4000607@ieee.org>
Message-ID: <20070820142619.GD7531@mentat.za.net>

On Sat, Aug 18, 2007 at 01:51:50AM -0600, Travis Oliphant wrote:
> > Not any more! See the revised PEP 007, 
> > http://www.python.org/dev/peps/pep-0007/
> > 
> >      In Python 3000 (and in the 2.x series, in new source files),
> >      we'll switch to a different indentation style: 4 spaces per indent,
> >      all spaces (no tabs in any file).  The rest will remain the same.
> > 
> > I would love to see this as the standard in numpy as well.  Then files 
> > obey WYSIWYG regardless of editor.  (Except for unicode woes, but that 
> > is another topic.)
> > 
> 
> I'm fine with this.  Some information on how to make sure emacs (and 
> other editors) does this would be helpful.

Here are some of my .emacs snippets.  I assume that .el files are
placed in ~/elisp and that the following line is in your emacs
configuration:

(add-to-list 'load-path "~/elisp")

Highlight unnecessary whitespace
================================

Download: http://www.emacswiki.org/cgi-bin/wiki/show-wspace.el

; Show whitespace
(require 'show-wspace)
(add-hook 'python-mode-hook 'highlight-tabs)
(add-hook 'font-lock-mode-hook 'highlight-trailing-whitespace)

Clean up tabs and trailing spaces
=================================

M-x untabify

and

M-x whitespace-cleanup

Tell emacs never to use tabs
============================

(setq-default indent-tabs-mode nil)

Highlight all text after column 80
==================================

Download: http://www.emacswiki.org/cgi-bin/wiki/column-marker.el

(require 'column-marker)
(add-hook 'font-lock-mode-hook (lambda () (interactive) (column-marker-1 80)))

Show a ruler with the current column position
=============================================

(require 'ruler-mode)
(add-hook 'font-lock-mode-hook 'ruler-mode)

Enable restructured text (ReST) editing
=======================================

(require 'rst)
(add-hook 'text-mode-hook 'rst-text-mode-bindings)

Fix outline-mode to work with Python
====================================

(add-hook 'python-mode-hook 'my-python-hook)
(defun py-outline-level ()
  "This is so that `current-column` DTRT in otherwise-hidden text"
  ;; from ada-mode.el
  (let (buffer-invisibility-spec)
    (save-excursion
      (skip-chars-forward "\t ")
      (current-column))))

; this fragment originally came from the web somewhere, but the outline-regexp
; was horribly broken and is broken in all instances of this code floating
; around.  Finally fixed by Charl P. Botha
; <<a href="http://cpbotha.net/">http://cpbotha.net/</a>>
(defun my-python-hook ()
  (setq outline-regexp "[^ \t\n]\\|[ \t]*\\(def[ \t]+\\|class[ \t]+\\)")
  ; enable our level computation
  (setq outline-level 'py-outline-level)
  ; do not use their \C-c@ prefix, too hard to type. Note this overides 
  ;some python mode bindings
  ;(setq outline-minor-mode-prefix "\C-c")
  ; turn on outline mode
  (outline-minor-mode t)
  ; initially hide all but the headers
  ; (hide-body)
  (show-paren-mode 1)
)


Cheers
St?fan


From stefan at sun.ac.za  Mon Aug 20 10:54:55 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 20 Aug 2007 16:54:55 +0200
Subject: [Numpy-discussion] Error in allclose with inf values
In-Reply-To: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com>
References: <1e2af89e0708170511v26495ecbge022b8efd51a9d9c@mail.gmail.com>
Message-ID: <20070820145455.GF7531@mentat.za.net>

Hi Matthew

On Fri, Aug 17, 2007 at 01:11:41PM +0100, Matthew Brett wrote:
> I noticed that allclose does not always behave correctly for arrays with infs.
> 
> I've attached a test script for allclose, and here's an alternative
> implementation that I believe behaves correctly.

Thanks for the patch --  I applied it in r3977 with the appropriate
tests.

> I wonder if we should allow nans in the test arrays - possibly with an
> optional keyword arg like allownan.  After all inf-inf is nan - but we
> allow that in the test.

I'm happy with both allclose([Inf],[-Inf]) and
allclose([anything],[NaN]) returning False.

Cheers
St?fan


From stefan at sun.ac.za  Mon Aug 20 12:10:25 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 20 Aug 2007 18:10:25 +0200
Subject: [Numpy-discussion] Setting numpy record array elements
In-Reply-To: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com>
References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com>
Message-ID: <20070820161025.GH7531@mentat.za.net>

On Mon, Aug 20, 2007 at 08:34:53AM -0500, Sameer DCosta wrote:
> In the example below I have a record array *a* that has a column
> *col1". I am trying to set the first element of a.col1 to zero  in two
> different ways.
> 
> 1. a[0].col1 = 0 (This fails silently)
> 2. a.col1[0] = 0 (This works fine)
> 
> I am using the latest svn version of numpy. Is this a bug? or is the
> first method supposed to fail? If it is supposed to fail should it
> fail silently?

This looks like a bug, since

a[0][0] = 0

works fine.  I'll take a closer look and make sure.

Cheers
St?fan


From zyzhu2000 at gmail.com  Mon Aug 20 17:36:52 2007
From: zyzhu2000 at gmail.com (Geoffrey Zhu)
Date: Mon, 20 Aug 2007 16:36:52 -0500
Subject: [Numpy-discussion] "Extended" Outer Product
Message-ID: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>

Hi Everyone,

I am wondering if there is an "extended" outer product. Take the
example in "Guide to Numpy." Instead of doing an multiplication, I
want to call a custom function for each pair.

>>> print outer([1,2,3],[10,100,1000])

[[ 10 100 1000]
[ 20 200 2000]
[ 30 300 3000]]


So I want:

[
 [f(1,10), f(1,100), f(1,1000)],
 [f(2,10), f(2, 100), f(2, 1000)],
 [f(3,10), f(3, 100), f(3,1000)]
]

Does anyone know how to do this without using a double loop?

Thanks,
Geoffrey


From robert.kern at gmail.com  Mon Aug 20 18:37:01 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 20 Aug 2007 17:37:01 -0500
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
Message-ID: <46CA178D.5020901@gmail.com>

Geoffrey Zhu wrote:
> Hi Everyone,
> 
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing an multiplication, I
> want to call a custom function for each pair.
> 
>>>> print outer([1,2,3],[10,100,1000])
> 
> [[ 10 100 1000]
> [ 20 200 2000]
> [ 30 300 3000]]
> 
> 
> So I want:
> 
> [
>  [f(1,10), f(1,100), f(1,1000)],
>  [f(2,10), f(2, 100), f(2, 1000)],
>  [f(3,10), f(3, 100), f(3,1000)]
> ]
> 
> Does anyone know how to do this without using a double loop?

If you can code your function such that it only uses operations that broadcast
(i.e. operators and ufuncs) and avoids things like branching or loops, then you
can just use numpy.newaxis on the first array.

  from numpy import array, newaxis
  x = array([1, 2, 3])
  y = array([10, 100, 1000])
  f(x[:,newaxis], y)

Otherwise, you can use numpy.vectorize() to turn your function into one that
will do that broadcasting for you.

  from numpy import array, newaxis, vectorize
  x = array([1, 2, 3])
  y = array([10, 100, 1000])
  f = vectorize(f)
  f(x[:,newaxis], y)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From sameerslists at gmail.com  Mon Aug 20 19:26:30 2007
From: sameerslists at gmail.com (Sameer DCosta)
Date: Mon, 20 Aug 2007 18:26:30 -0500
Subject: [Numpy-discussion] Setting numpy record array elements
In-Reply-To: <20070820161025.GH7531@mentat.za.net>
References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com>
	<20070820161025.GH7531@mentat.za.net>
Message-ID: <8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com>

On 8/20/07, Stefan van der Walt <stefan at sun.ac.za> wrote:
>
> This looks like a bug, since
>
> a[0][0] = 0
>
> works fine.  I'll take a closer look and make sure.
>

Thanks Stefan for offering to take a closer look. I have attached a
patch against the latest svn which fixes this problem.

Both this patched version and the current subversion source do not
throw an AttributeError exception if you do something like

a[0].non_existent_column = 10

That is a different  problem that probably should be fixed.

Cheers,
Sameer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpy_patch.svn.diff
Type: application/octet-stream
Size: 988 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070820/4d1be907/attachment.obj>

From Chris.Barker at noaa.gov  Mon Aug 20 19:51:55 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 20 Aug 2007 16:51:55 -0700
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <46CA178D.5020901@gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<46CA178D.5020901@gmail.com>
Message-ID: <46CA291B.30800@noaa.gov>

Robert Kern wrote:
> If you can code your function such that it only uses operations that broadcast
> (i.e. operators and ufuncs) and avoids things like branching or loops, then you
> can just use numpy.newaxis on the first array.
> 
>   from numpy import array, newaxis
>   x = array([1, 2, 3])
>   y = array([10, 100, 1000])
>   f(x[:,newaxis], y)

in fact, it may make sense to just have your x be column vector anyway:
 >>> x
array([1, 2, 3])
 >>> y
array([10, 11, 12])
 >>> x.shape = (-1,1)
 >>> x
array([[1],
        [2],
        [3]])
 >>> x * y
array([[10, 11, 12],
        [20, 22, 24],
        [30, 33, 36]])

Broadcasting is VERY cool!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From charlesr.harris at gmail.com  Tue Aug 21 01:14:52 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 20 Aug 2007 23:14:52 -0600
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
Message-ID: <e06186140708202214j523e67b7v7fb5b08b57e4e1d3@mail.gmail.com>

On 8/20/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
>
> Hi Everyone,
>
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing an multiplication, I
> want to call a custom function for each pair.
>
> >>> print outer([1,2,3],[10,100,1000])
>
> [[ 10 100 1000]
> [ 20 200 2000]
> [ 30 300 3000]]
>
>
> So I want:
>
> [
> [f(1,10), f(1,100), f(1,1000)],
> [f(2,10), f(2, 100), f(2, 1000)],
> [f(3,10), f(3, 100), f(3,1000)]
> ]


You could make two matrices like so:

In [46]: a = arange(3)

In [47]: b = a.reshape(1,3).repeat(3,0)

In [48]: c = a.reshape(3,1).repeat(3,1)

In [49]: b
Out[49]:
array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

In [50]: c
Out[50]:
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2]])

 which will give you all pairs. You can then make a function of these in
various ways, for example

In [52]: c**b
Out[52]:
array([[1, 0, 0],
       [1, 1, 1],
       [1, 2, 4]])

That is a bit clumsy, though. I don't know how to do what you want in a
direct way.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070820/e2fb4e02/attachment.html>

From millman at berkeley.edu  Tue Aug 21 01:50:21 2007
From: millman at berkeley.edu (Jarrod Millman)
Date: Mon, 20 Aug 2007 22:50:21 -0700
Subject: [Numpy-discussion] NumPy 1.0.3.1 released
Message-ID: <c7009a550708202250p6337be62n80d5b4f84e7a21ab@mail.gmail.com>

I'm pleased to announce the release of NumPy 1.0.3.1

This a minor bug fix release, which enables the latest release of
SciPy to build.

Bug-fixes
===============
* Add back get_path to numpy.distutils.misc_utils
* Fix 64-bit zgeqrf
* Add parenthesis around GETPTR macros

Thank you to everybody who contributed to the recent release.

Best regards,

NumPy Developers
http://numpy.scipy.org


From stefan at sun.ac.za  Tue Aug 21 02:40:16 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 21 Aug 2007 08:40:16 +0200
Subject: [Numpy-discussion] Setting numpy record array elements
In-Reply-To: <8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com>
References: <8fb8cc060708200634n66d58be1ibbaaff069dd45ab3@mail.gmail.com>
	<20070820161025.GH7531@mentat.za.net>
	<8fb8cc060708201626v6c256480k6dd8a0828c8d6a74@mail.gmail.com>
Message-ID: <20070821064015.GB14999@mentat.za.net>

Hi Sameer

On Mon, Aug 20, 2007 at 06:26:30PM -0500, Sameer DCosta wrote:
> On 8/20/07, Stefan van der Walt <stefan at sun.ac.za> wrote:
> Thanks Stefan for offering to take a closer look. I have attached a
> patch against the latest svn which fixes this problem.

Yup, right on the money.  The __setattr__ call sets the value of
a[0].col1, but a[0].col1 is in fact a pointer to a[0][0].  It is
therefore necessary to use the setfields method.

I cannot think of any situation where you would need to call
__setattr__ on another member of "void", so I'm going to apply the
patch unless anyone objects.

Cheers
St?fan


From mpmusu at cc.usu.edu  Tue Aug 21 10:23:30 2007
From: mpmusu at cc.usu.edu (Mark.Miller)
Date: Tue, 21 Aug 2007 08:23:30 -0600
Subject: [Numpy-discussion] Finding a row match within a numpy array
In-Reply-To: <1187190071.384881.240470@w3g2000hsg.googlegroups.com>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187172556.613122.207400@r29g2000hsg.googlegroups.com>
	<46C2D4C7.4010305@bristol.ac.uk>
	<1187175588.007436.125720@w3g2000hsg.googlegroups.com>
	<e76aa17f0708150438n60234b8cg5665a33ba04f904@mail.gmail.com>
	<1187190071.384881.240470@w3g2000hsg.googlegroups.com>
Message-ID: <46CAF562.9060009@cc.usu.edu>

A slightly related question on this topic...

Is there a good loopless way to identify all of the unique rows in an 
array?  Something like numpy.unique() is ideal, but capable of 
extracting unique subarrays along an axis.

Thanks,

-Mark

mark wrote:
> Maybe this is not the intended use of where, but it seems to work:
>>>> from numpy import * # No complaining now
>>>> a = arange(12)
>>>> a.shape = (4,3)
>>>> a
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 6,  7,  8],
>        [ 9, 10, 11]])
>>>> b = array([6,7,8])
>>>> row = all( equal(a,b), 1 )
>>>> where(row==True)
> (array([2]),)
> 


From charlesr.harris at gmail.com  Tue Aug 21 12:44:09 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 21 Aug 2007 10:44:09 -0600
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
Message-ID: <e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>

On 8/20/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
>
> Hi Everyone,
>
> I am wondering if there is an "extended" outer product. Take the
> example in "Guide to Numpy." Instead of doing an multiplication, I
> want to call a custom function for each pair.
>
> >>> print outer([1,2,3],[10,100,1000])
>
> [[ 10 100 1000]
> [ 20 200 2000]
> [ 30 300 3000]]
>
>
> So I want:
>
> [
> [f(1,10), f(1,100), f(1,1000)],
> [f(2,10), f(2, 100), f(2, 1000)],
> [f(3,10), f(3, 100), f(3,1000)]
> ]


Maybe something like

In [15]: f = lambda x,y : x*sin(y)

In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])

In [17]: a
Out[17]:
array([[ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.84147098,  1.68294197],
       [ 0.        ,  0.90929743,  1.81859485]])

I don't know if nested list comprehensions are faster than two nested loops,
but at least they avoid array indexing.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/62174881/attachment.html>

From tim.hochberg at ieee.org  Tue Aug 21 13:59:56 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Tue, 21 Aug 2007 10:59:56 -0700
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
Message-ID: <e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>

On 8/21/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 8/20/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
> >
> > Hi Everyone,
> >
> > I am wondering if there is an "extended" outer product. Take the
> > example in "Guide to Numpy." Instead of doing an multiplication, I
> > want to call a custom function for each pair.
> >
> > >>> print outer([1,2,3],[10,100,1000])
> >
> > [[ 10 100 1000]
> > [ 20 200 2000]
> > [ 30 300 3000]]
> >
> >
> > So I want:
> >
> > [
> > [f(1,10), f(1,100), f(1,1000)],
> > [f(2,10), f(2, 100), f(2, 1000)],
> > [f(3,10), f(3, 100), f(3,1000)]
> > ]
>
>
> Maybe something like
>
> In [15]: f = lambda x,y : x*sin(y)
>
> In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
>
> In [17]: a
> Out[17]:
> array([[ 0.        ,  0.        ,  0.        ],
>        [ 0.        ,  0.84147098,  1.68294197],
>        [ 0.        ,  0.90929743,  1.81859485]])
>
> I don't know if nested list comprehensions are faster than two nested
> loops, but at least they avoid array indexing.
>

This is just a general comment on recent threads of this type and not
directed specifically at Chuck or anyone else.

IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
often more memory friendly and thus faster to vectorize only the inner loop
and leave outer loops alone. Everything varies with the specific case of
course, but trying to avoid FOR loops on principle is not a good strategy.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/d74333b4/attachment.html>

From zyzhu2000 at gmail.com  Tue Aug 21 15:46:28 2007
From: zyzhu2000 at gmail.com (Geoffrey Zhu)
Date: Tue, 21 Aug 2007 14:46:28 -0500
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
Message-ID: <fcedc2970708211246s52c4223fqfe3c890c0f83de82@mail.gmail.com>

On 8/21/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/21/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> >
> > On 8/20/07, Geoffrey Zhu < zyzhu2000 at gmail.com> wrote:
> > > Hi Everyone,
> > >
> > > I am wondering if there is an "extended" outer product. Take the
> > > example in "Guide to Numpy." Instead of doing an multiplication, I
> > > want to call a custom function for each pair.
> > >
> > > >>> print outer([1,2,3],[10,100,1000])
> > >
> > > [[ 10 100 1000]
> > > [ 20 200 2000]
> > > [ 30 300 3000]]
> > >
> > >
> > > So I want:
> > >
> > > [
> > > [f(1,10), f(1,100), f(1,1000)],
> > > [f(2,10), f(2, 100), f(2, 1000)],
> > > [f(3,10), f(3, 100), f(3,1000)]
> > > ]
> >
> >
> > Maybe something like
> >
> > In [15]: f = lambda x,y : x*sin(y)
> >
> > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
> >
> > In [17]: a
> > Out[17]:
> > array([[ 0.        ,  0.        ,  0.        ],
> >        [ 0.        ,  0.84147098,  1.68294197],
> >        [ 0.        ,  0.90929743,  1.81859485]])
> >
> > I don't know if nested list comprehensions are faster than two nested
> loops, but at least they avoid array indexing.
>
> This is just a general comment on recent threads of this type and not
> directed specifically at Chuck or anyone else.
>
> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> often more memory friendly and thus faster to vectorize only the inner loop
> and leave outer loops alone. Everything varies with the specific case of
> course, but trying to avoid FOR loops on principle is not a good strategy.
>

I agree. My original post asked for solutions without using two nested
for loops because I already know the two for loop solution. Besides, I
was hoping that some version of 'outer' will take in a function
reference and call the function instead of doing multiplifcation.


From tim.hochberg at ieee.org  Tue Aug 21 15:56:03 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Tue, 21 Aug 2007 12:56:03 -0700
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <fcedc2970708211246s52c4223fqfe3c890c0f83de82@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
	<fcedc2970708211246s52c4223fqfe3c890c0f83de82@mail.gmail.com>
Message-ID: <e4412d6b0708211256y3d212a2cy1f225f00745d32f3@mail.gmail.com>

On 8/21/07, Geoffrey Zhu <zyzhu2000 at gmail.com> wrote:
>
> On 8/21/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
> >
> >
> >
> > On 8/21/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
> > >
> > >
> > >
> > > On 8/20/07, Geoffrey Zhu < zyzhu2000 at gmail.com> wrote:
> > > > Hi Everyone,
> > > >
> > > > I am wondering if there is an "extended" outer product. Take the
> > > > example in "Guide to Numpy." Instead of doing an multiplication, I
> > > > want to call a custom function for each pair.
> > > >
> > > > >>> print outer([1,2,3],[10,100,1000])
> > > >
> > > > [[ 10 100 1000]
> > > > [ 20 200 2000]
> > > > [ 30 300 3000]]
> > > >
> > > >
> > > > So I want:
> > > >
> > > > [
> > > > [f(1,10), f(1,100), f(1,1000)],
> > > > [f(2,10), f(2, 100), f(2, 1000)],
> > > > [f(3,10), f(3, 100), f(3,1000)]
> > > > ]
> > >
> > >
> > > Maybe something like
> > >
> > > In [15]: f = lambda x,y : x*sin(y)
> > >
> > > In [16]: a = array([[f(i,j) for i in range(3)] for j in range(3)])
> > >
> > > In [17]: a
> > > Out[17]:
> > > array([[ 0.        ,  0.        ,  0.        ],
> > >        [ 0.        ,  0.84147098,  1.68294197],
> > >        [ 0.        ,  0.90929743,  1.81859485]])
> > >
> > > I don't know if nested list comprehensions are faster than two nested
> > loops, but at least they avoid array indexing.
> >
> > This is just a general comment on recent threads of this type and not
> > directed specifically at Chuck or anyone else.
> >
> > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> > often more memory friendly and thus faster to vectorize only the inner
> loop
> > and leave outer loops alone. Everything varies with the specific case of
> > course, but trying to avoid FOR loops on principle is not a good
> strategy.
> >
>
> I agree. My original post asked for solutions without using two nested
> for loops because I already know the two for loop solution. Besides, I
> was hoping that some version of 'outer' will take in a function
> reference and call the function instead of doing multiplifcation.


A specific example would help here. There are ways to deal with certain
subclasses of problems that won't necessarily generalize. For example, are
you aware of the outer methods on ufuncs (add.outer, substract.outer, etc)?
Typical dimensions also matter, since some approaches work well for certain
shapes, but are pretty miserable for others. FWIW, I often have very good
luck with removing the inner for-loop in favor of vector operations. This
tends to be simpler than trying to vectorize everything and often has better
performance since it's often more memory friendly. However, it all depends
on specifics of the problem.

Regards,

-tim


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/f6210ccc/attachment.html>

From peridot.faceted at gmail.com  Tue Aug 21 16:32:49 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Tue, 21 Aug 2007 16:32:49 -0400
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
Message-ID: <ce557a360708211332m26e364f2k30f0903103591595@mail.gmail.com>

On 21/08/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:

> This is just a general comment on recent threads of this type and not
> directed specifically at Chuck or anyone else.
>
> IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> often more memory friendly and thus faster to vectorize only the inner loop
> and leave outer loops alone. Everything varies with the specific case of
> course, but trying to avoid FOR loops on principle is not a good strategy.

Yes and no. From a performance point of view, you are certainly right;
vectorizing is definitely not always a speedup. But for me, the main
advantage of vectorized operations is generally clarity: C = A*B is
clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
not clearer and simpler, I feel no compunction about falling back to
list comprehensions and for loops.

That said, it would often be nice to have something like
map(f,arange(10)) for arrays; the best I've found is
vectorize(f)(arange(10)).

vectorize, of course, is a good example of my point above: it really
just loops, in python IIRC, but conceptually it's extremely handy for
doing exactly what the OP wanted. Unfortunately vectorize() does not
yield a sufficiently ufunc-like object to support .outer(), as that
would be extremely tidy.

Anne


From tim.hochberg at ieee.org  Tue Aug 21 17:14:00 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Tue, 21 Aug 2007 14:14:00 -0700
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <ce557a360708211332m26e364f2k30f0903103591595@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
	<ce557a360708211332m26e364f2k30f0903103591595@mail.gmail.com>
Message-ID: <e4412d6b0708211414l5438e8b6h630be3a91edeb17@mail.gmail.com>

On 8/21/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 21/08/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
> > This is just a general comment on recent threads of this type and not
> > directed specifically at Chuck or anyone else.
> >
> > IMO, the emphasis on avoiding FOR loops at all costs is misplaced. It is
> > often more memory friendly and thus faster to vectorize only the inner
> loop
> > and leave outer loops alone. Everything varies with the specific case of
> > course, but trying to avoid FOR loops on principle is not a good
> strategy.
>
> Yes and no. From a performance point of view, you are certainly right;
> vectorizing is definitely not always a speedup. But for me, the main
> advantage of vectorized operations is generally clarity: C = A*B is
> clearer and simpler than C = [a*b for (a,b) in zip(A,B)]. When it's
> not clearer and simpler, I feel no compunction about falling back to
> list comprehensions and for loops.


I always assume that in these cases performance is a driver of the question.
It would be straightforward to code an outer equivalent in Python to hide
this for anyone who cares. Since no one who asks these questions ever does,
I assume they must be primarily motivated by performance.

That said, it would often be nice to have something like
> map(f,arange(10)) for arrays; the best I've found is
> vectorize(f)(arange(10)).
>
> vectorize, of course, is a good example of my point above: it really
> just loops, in python IIRC,


I used to think that too, but then I looked at it and I believe it actually
grabs the code object out of the function and loops in C. You still have to
run the code object at each point though so it's not that fast. It's been a
while since I did that looking so I may be totally wrong.

but conceptually it's extremely handy for
> doing exactly what the OP wanted. Unfortunately vectorize() does not
> yield a sufficiently ufunc-like object to support .outer(), as that
> would be extremely tidy.


I suppose someone should fix that someday. However, I still think vectorize
is an attractive nuisance in the sense that someone has a function that they
want to apply to an array and they get sucked into throwing vectorize at the
problem. More often than not, vectorize makes things slower than they need
to be. If you don't care about performance, that's fine, but I live in fear
of code like:

   def f(a, b):
       return sin(a*b + a**2)
   f = vectorize(f)

The original function f is a perfectly acceptable vectorized function
(assuming one uses numpy.sin), but now it's been replaced by a slower
version by passing it through vectorize. To be sure, this isn't always the
case; in cases where you have to make choices, things get messier. Still,
I'm not convinced that vectorize doesn't hurt more than it helps.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/67a298d2/attachment.html>

From robert.kern at gmail.com  Tue Aug 21 18:00:27 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 21 Aug 2007 17:00:27 -0500
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <e4412d6b0708211414l5438e8b6h630be3a91edeb17@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>	<ce557a360708211332m26e364f2k30f0903103591595@mail.gmail.com>
	<e4412d6b0708211414l5438e8b6h630be3a91edeb17@mail.gmail.com>
Message-ID: <46CB607B.4050400@gmail.com>

Timothy Hochberg wrote:
> On 8/21/07, *Anne Archibald* <peridot.faceted at gmail.com
> <mailto:peridot.faceted at gmail.com>> wrote:

>     but conceptually it's extremely handy for
>     doing exactly what the OP wanted. Unfortunately vectorize() does not
>     yield a sufficiently ufunc-like object to support .outer(), as that
>     would be extremely tidy.
> 
> I suppose someone should fix that someday.

Not much to fix. There is already frompyfunc() which does make a real ufunc.
However, (and it's a big "however"), those ufuncs only output object arrays.
That's why I didn't mention it earlier.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From gael.varoquaux at normalesup.org  Wed Aug 22 03:45:46 2007
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 22 Aug 2007 09:45:46 +0200
Subject: [Numpy-discussion] "Extended" Outer Product
In-Reply-To: <e4412d6b0708211414l5438e8b6h630be3a91edeb17@mail.gmail.com>
References: <fcedc2970708201436u4a3a1837ke13f8633dc9b0d4c@mail.gmail.com>
	<e06186140708210944p50fd17eq782bafb4fbae4678@mail.gmail.com>
	<e4412d6b0708211059l86d6315i466cdd2f3f397765@mail.gmail.com>
	<ce557a360708211332m26e364f2k30f0903103591595@mail.gmail.com>
	<e4412d6b0708211414l5438e8b6h630be3a91edeb17@mail.gmail.com>
Message-ID: <20070822074546.GA18548@clipper.ens.fr>

On Tue, Aug 21, 2007 at 02:14:00PM -0700, Timothy Hochberg wrote:
>    I suppose someone should fix that someday. However, I still think
>    vectorize is an attractive nuisance in the sense that someone has a
>    function that they want to apply to an array and they get sucked into
>    throwing vectorize at the problem. More often than not, vectorize makes
>    things slower than they need to be. If you don't care about performance,
>    that's fine, but I live in fear of code like:

>       def f(a, b):
>           return sin(a*b + a**2)
>       f = vectorize(f)

>    The original function f is a perfectly acceptable vectorized function
>    (assuming one uses numpy.sin), but now it's been replaced by a slower
>    version by passing it through vectorize. To be sure, this isn't always the
>    case; in cases where you have to make choices, things get messier. Still,
>    I'm not convinced that vectorize doesn't hurt more than it helps.

I often have code where I am going to loop over a large amount of nested
loops, some thing like:

# A function to return the optical field in each point:

def optical_field( (x, y, z) ):
    loop over an array of laser wave-vector
    return optical field

# Evaluate the optical field on a grid to plot it :

x, y z = mgrid[-10:10, -10:10, -10:10]
field = optical_field( (x, y, z) )

In such a code every single operation could be vectorized, but the
problem is that each function assumes the input array to be of a certain
dimension: I may be using some code like:
r = c_[x, y, z]
cross(r, r_o) 

So implementing loops with arrays is not that convenient, because I have
to add dimensions to my arrays, and to make sure that my inner functions
are robust to these extra dimensions.

Looking at some of my code where I had this kind of problems, I see
functions similar to:

def delta(r, v, k):
    return  dot(r, transpose(k))      
            + Gaussian_beam(r)
            + dot(v, transpose(k))

I am starting to realize that the real problem is that there is no info
of what the expected size for the input and output arguments should be.
Given such info, the function could resize its input and output
arguments.

Maybe some clever decorators could be written to address this issue,
something like:

@inputsize( (3, -1), (3, -1), (3, -1) )

which would reshape every input positional argument to the shape given in
the list of shapes, and reshape the output argument to the shape of the
first input argument.

As I worked around these problems in my code I cannot say whether these
decorators would get rid of them (I had not had the idea at the time), I
like the idea, and I will try next time I run into these problems.

I just wanted to point out that replacing for loops with arrays was not
always that simple and that using "vectorize" sometimes was a quick and a
dirty way to get things done.

Ga?l


From faltet at carabos.com  Wed Aug 22 05:11:16 2007
From: faltet at carabos.com (Francesc Altet)
Date: Wed, 22 Aug 2007 11:11:16 +0200
Subject: [Numpy-discussion] Finding unique rows in an array [Was: Finding a
	row match within a numpy array]
In-Reply-To: <46CAF562.9060009@cc.usu.edu>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187190071.384881.240470@w3g2000hsg.googlegroups.com>
	<46CAF562.9060009@cc.usu.edu>
Message-ID: <200708221111.17141.faltet@carabos.com>

A Tuesday 21 August 2007, Mark.Miller escrigu?:
> A slightly related question on this topic...
>
> Is there a good loopless way to identify all of the unique rows in an
> array?  Something like numpy.unique() is ideal, but capable of
> extracting unique subarrays along an axis.

You can always do a view of the rows as strings and then use unique().
Here is an example:

In [1]: import numpy
In [2]: a=numpy.arange(12).reshape(4,3)
In [3]: a[2]=(3,4,5)
In [4]: a
Out[4]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])

now, create the view and select the unique rows:

In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4')

and finally restore the shape:

In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
Out[6]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

If you want to find unique columns instead of rows, do a tranpose first 
on the initial array.

Cheers,

-- 
>0,0<   Francesc Altet ? ? http://www.carabos.com/
V   V   C?rabos Coop. V. ??Enjoy Data
 "-"


From jensj at fysik.dtu.dk  Wed Aug 22 05:36:21 2007
From: jensj at fysik.dtu.dk (Jens =?iso-8859-1?Q?J=F8rgen_Mortensen?=)
Date: Wed, 22 Aug 2007 11:36:21 +0200 (CEST)
Subject: [Numpy-discussion] Non-contiguous array from newaxis indexing
In-Reply-To: <46C6CF49.6020202@ieee.org>
References: <1187427656.8294.11.camel@b307-242.fysik.dtu.dk>
	<46C6CF49.6020202@ieee.org>
Message-ID: <10306.85.81.43.249.1187775381.squirrel@webmail.fysik.dtu.dk>

> Jens J?rgen Mortensen wrote:
>> I would like all these arrays to be contiguous:
>>
>>>>> import numpy as npy
>>>>> npy.__version__
>> '1.0.4.dev3967'
>>>>> x = npy.arange(4)
>>>>> y = x[npy.newaxis, :]
>>>>> z = x.reshape((1, 4))
>>>>> for a in [x, y, z]:
>> ...     print a.shape, a.strides, a.flags.contiguous
>> ...
>> (4,) (4,) True
>> (1, 4) (0, 4) False
>> (1, 4) (16, 4) True
>>
>> But y is not contiguous according to y.flags.contiguous - why not and
>> why does y and z not have the same strides?
>>
>> I f
>
> We've tried a few times to let them be contiguous, but it breaks code in
> various ways because NumPy takes advantage of 0-striding to accomplish
> broadcasting.   In theory, it might be able to be fixed, but the fact
> that simple fixes don't work makes me wonder.

OK, then how about giving y the strides (16, 4) like z?  Then
_IsContiguous() will say thay y is contiguous.  Will that break any code?

I can take a look at how to fix the strides for newaxis-indexed arrays if
this is way to go.

Jens J?rgen

> ound this comment just before the _IsContiguous function in
>> arrayobject.c:
>>
>> /* 0-strided arrays are not contiguous (even if dimension == 1) */
>>
>> Is this correct?
>
> Yes.
>
> -Travis
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From Shawn.Gong at drdc-rddc.gc.ca  Wed Aug 22 12:36:09 2007
From: Shawn.Gong at drdc-rddc.gc.ca (Gong, Shawn (Contractor))
Date: Wed, 22 Aug 2007 12:36:09 -0400
Subject: [Numpy-discussion] memory error caused by astype()
In-Reply-To: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca>
References: <2E58C246F17003499C141D334794D049027683D8@ottawaex02.Ottawa.drdc-rddc.gc.ca>
Message-ID: <2E58C246F17003499C141D334794D049027683FA@ottawaex02.Ottawa.drdc-rddc.gc.ca>

Hi list,

When I do large array manipulations, I get out-of-memory errors.
For instance if the array size is 5000 by 6000, the following codes use
nearly 1G of RAM. 
Then my PC displays a Python error box.  The try/except won't even catch
it if the error happens in "astype" instead of "array1* array2"

    try:
        if ( array1.typecode() in cplx_types ):
            array1 = abs(array1.astype(Numeric.Complex32))
        else:
            array1 = array1.astype(Numeric.Float32)

        if ( array2.typecode() in cplx_types ):
            array2 = abs(array2.astype(Numeric.Complex32))
        else:
            array2 = array2.astype(Numeric.Float32)

        array1 = Numeric.sqrt(array1) * Numeric.sqrt(array2)
        return array1

    except:
        gvutils.error("Memory error occurred\nPlease select a smaller
array")
        return None

My questions are:
1)	Is there a more memory efficient way instead of using astype?
2)	If not, then how do I catch error during astype?
3)	Is there a way in Python that detects the available RAM and
limits the array size before he/she can go ahead with the array
multiplications?  
			i.e. detects the available RAM, say 1G
			Assume the worst case - Complex32
			Figure out the array size limit and warn user

Thanks,
Shaw Gong


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070822/d865b777/attachment.html>

From chanley at stsci.edu  Wed Aug 22 13:25:26 2007
From: chanley at stsci.edu (Christopher Hanley)
Date: Wed, 22 Aug 2007 13:25:26 -0400
Subject: [Numpy-discussion] latest svn version fails on Solaris
Message-ID: <46CC7186.7010109@stsci.edu>

Hi,

The latest version of numpy has a unit test failure on big endian machines.

======================================================================
FAIL: test_record_array (numpy.core.tests.test_multiarray.test_putmask)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/basil5/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 450, in test_record_array
    assert_array_equal(rec['x'],[10,5])
  File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 223, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 215, in assert_array_compare
    assert cond, msg
AssertionError: 
Arrays are not equal

(mismatch 50.0%)
 x: array([  4.58492919e-320,   5.00000000e+000])
 y: array([10,  5])

----------------------------------------------------------------------
Ran 670 tests in 47.182s


Chris


From millman at berkeley.edu  Wed Aug 22 17:35:24 2007
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 22 Aug 2007 14:35:24 -0700
Subject: [Numpy-discussion] Branch and Tag Maintenance
Message-ID: <c7009a550708221435t3c845579if14699c7c55c4146@mail.gmail.com>

Hello,

I deleted any old (2+ years since modified) branches and tags.
Nothing is actually deleted so if you need to access the old code
simply use the relevant revision number with svn checkout, svn switch,
or svn list.

It is also very easy to restore if you are planning to continue
working on some of the code.  For example if you need to restore the
numarray branch just use:
svn copy -r 3988 http://svn.scipy.org/svn/numpy/branches/numarray \
                           http://svn.scipy.org/svn/numpy/branches/numarray

I have attached a text file with the deletions I made and what
revision they were committed on.  You can also view the changes using
the trac site:
http://projects.scipy.org/scipy/numpy/timeline

Thanks,

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
-------------- next part --------------
NumPy Branch Maintenance
========================

svn delete http://svn.scipy.org/svn/numpy/branches/build_src -m "Removing old branch"
Committed revision 3985.

svn delete http://svn.scipy.org/svn/numpy/branches/kiva_window_branch -m "Removing old branch"
Committed revision 3986.

svn delete http://svn.scipy.org/svn/numpy/branches/newunicode -m "Removing old branch"
Committed revision 3987.

svn delete http://svn.scipy.org/svn/numpy/branches/numarray -m "Removing old branch"
Committed revision 3988.

svn delete http://svn.scipy.org/svn/numpy/branches/oldcore -m "Removing old branch"
Committed revision 3989.

svn delete http://svn.scipy.org/svn/numpy/branches/v0_3_2 -m "Removing old branch"
Committed revision 3990.

NumPy Tag Maintenance
========================

svn delete http://svn.scipy.org/svn/numpy/tags/beta-0.4.2 -m "Removing old tag"
Committed revision 3991.

svn delete http://svn.scipy.org/svn/numpy/tags/Daily_Snapshot_01-11-2002 -m "Removing old tag"
Committed revision 3992.

svn delete http://svn.scipy.org/svn/numpy/tags/kiva_window -m "Removing old tag"
Committed revision 3993.

svn delete http://svn.scipy.org/svn/numpy/tags/post_numarray_merge -m "Removing old tag"
Committed revision 3994.

svn delete http://svn.scipy.org/svn/numpy/tags/pre_classify_conversion -m "Removing old tag"
Committed revision 3995.

svn delete http://svn.scipy.org/svn/numpy/tags/pre_compiler_removal -m "Removing old tag"
Committed revision 3996.

svn delete http://svn.scipy.org/svn/numpy/tags/pre_numarray -m "Removing old tag"
Committed revision 3997.

svn delete http://svn.scipy.org/svn/numpy/tags/pre_numarray_merge -m "Removing old tag"
Committed revision 3998.

svn delete http://svn.scipy.org/svn/numpy/tags/pre_org -m "Removing old tag"
Committed revision 3999.

svn delete http://svn.scipy.org/svn/numpy/tags/release_0_2_0 -m "Removing old tag"
Committed revision 4000.

svn delete http://svn.scipy.org/svn/numpy/tags/v0_2_0 -m "Removing old tag"
Committed revision 4001.

svn delete http://svn.scipy.org/svn/numpy/tags/v0_2_2 -m "Removing old tag"
Committed revision 4002.

svn delete http://svn.scipy.org/svn/numpy/tags/v0_3_0 -m "Removing old tag"
Committed revision 4003.


From stefan at sun.ac.za  Wed Aug 22 18:36:48 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Thu, 23 Aug 2007 00:36:48 +0200
Subject: [Numpy-discussion] latest svn version fails on Solaris
In-Reply-To: <46CC7186.7010109@stsci.edu>
References: <46CC7186.7010109@stsci.edu>
Message-ID: <20070822223648.GA8884@mentat.za.net>

Hi Chris

Do you have a Solaris machine that we can use as a client for the
buildbot (this can be a desktop machine)?  I didn't see this problem
earlier, since all the other platforms built without problems.

I also noticed that not all platforms execute the same number of
tests, which is worrisome.

Cheers
St?fan

On Wed, Aug 22, 2007 at 01:25:26PM -0400, Christopher Hanley wrote:
> Hi,
> 
> The latest version of numpy has a unit test failure on big endian machines.
> 
> ======================================================================
> FAIL: test_record_array (numpy.core.tests.test_multiarray.test_putmask)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/data/basil5/site-packages/lib/python/numpy/core/tests/test_multiarray.py", line 450, in test_record_array
>     assert_array_equal(rec['x'],[10,5])
>   File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 223, in assert_array_equal
>     verbose=verbose, header='Arrays are not equal')
>   File "/data/basil5/site-packages/lib/python/numpy/testing/utils.py", line 215, in assert_array_compare
>     assert cond, msg
> AssertionError: 
> Arrays are not equal
> 
> (mismatch 50.0%)
>  x: array([  4.58492919e-320,   5.00000000e+000])
>  y: array([10,  5])
> 
> ----------------------------------------------------------------------
> Ran 670 tests in 47.182s


From Chris.Barker at noaa.gov  Thu Aug 23 20:34:20 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Thu, 23 Aug 2007 17:34:20 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
Message-ID: <46CE278C.6010804@noaa.gov>

Hi all,

I was just trying to write a unit test for something where I was 
expecting to get some NaN's in the array. However, since NaN == NaN 
returns false, the simple test:

assert(alltrue(a == b))

 >>> a = N.array((1,2,3,N.nan))
 >>> b = N.array((1,2,3,N.nan))
 >>> a == b
array([ True,  True,  True, False], dtype=bool)

 >>> assert(N.alltrue(a == b))
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
AssertionError
 >>>

So is there any way to test is two arrays are the same, when there may 
be a NaN or two mixed in???

With a bit of thought -- this works:
 >>> N.alltrue(a[~N.isnan(a)] == b[~N.isnan(b)])
True

but that feels like a kludge. maybe some sort of "TheseArrays are binary 
equal" would be useful.

-CHB


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From focke at slac.stanford.edu  Thu Aug 23 22:51:50 2007
From: focke at slac.stanford.edu (Warren Focke)
Date: Thu, 23 Aug 2007 19:51:50 -0700 (PDT)
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <46CE278C.6010804@noaa.gov>
References: <46CE278C.6010804@noaa.gov>
Message-ID: <Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>


On Thu, 23 Aug 2007, Christopher Barker wrote:

> but that feels like a kludge. maybe some sort of "TheseArrays are binary
> equal" would be useful.

But there are multiple possible NaNs, so you couldn't rely on the bits 
comparing.

Maybe something with masked arrays?

w


From matthieu.brucher at gmail.com  Fri Aug 24 04:41:18 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 24 Aug 2007 10:41:18 +0200
Subject: [Numpy-discussion] Error code of NumpyTest()
Message-ID: <e76aa17f0708240141k973e168gd47ef9f48b95113a@mail.gmail.com>

Hi,

I wondered if there was a way of returning another error code than 0 when
executing the test suite so that a parent process can immediately know if
all the tests passed or not.
The numpy buildbot seems to have the same behaviour BTW.
I don't know if it is possible, but it would be great.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/5a758399/attachment.html>

From hoeffken at ipk-gatersleben.de  Fri Aug 24 04:46:05 2007
From: hoeffken at ipk-gatersleben.de (=?UTF-8?B?TWF0dGhpYXMgSMO2ZmZrZW4=?=)
Date: Fri, 24 Aug 2007 10:46:05 +0200
Subject: [Numpy-discussion] Reference counter of builtin descriptor objects
Message-ID: <46CE9ACD.3080607@ipk-gatersleben.de>

Greetings,

I struggling with the numpy C-API (version 1.0.3).
Now I have obscurities concerning the reference counter of builtin
descriptor objects. In some situation, when running my own code, the
reference counter fall to zero an I get warning messages. In some other
samples the reference counter increases more and more while the program
is running  and the average number of used object is keeping constant.

Now I would like to know when I have to take care about the reference
counter of builtin descriptor objects. Especially when using
"PyArray_SimpleNew", "PyArray_SimpleNewFromData", "PyArray_NewFromDescr"
and "PyArray_SimpleNewFromDescr".
Up to now I never touched the counters in my code after using this
functions resulting in the described problems.

Another case concerns parsing the arguments of functions. I often use
such kind of expressions:

PyArg_ParseTupleAndKeywords(args, kwds, "O!O!", kwlist, &PyArray_Type,
&array1, &PyArray_Type, &array1))

Normally I would expect, that no reference counter is changed. Is that
really true?

Many thanks in advance!

Matthias


-------------- next part --------------
A non-text attachment was scrubbed...
Name: hoeffken.vcf
Type: text/x-vcard
Size: 315 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/5b4eedd3/attachment.vcf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 252 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/5b4eedd3/attachment.sig>

From haase at msg.ucsf.edu  Fri Aug 24 05:22:53 2007
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Fri, 24 Aug 2007 11:22:53 +0200
Subject: [Numpy-discussion] pyOpenGL with numpy support
Message-ID: <bc657ead0708240222o16d2bd03ub44c6fec016cfc4f@mail.gmail.com>

Hi,
The latest release notes of pyOpenGL (Feb 15, 2007) say that "Numarray
support [was] reenabled".
The current version is 3.0.0a6.

Does anyone here know the status of the new (ctypes based) pyOpenGL ?
How is the binding to ("modern") numpy ?

I'm especially interested in fast memory access.  So far I have to
SWIG my own call to
glVertexPointer to reduce the execution from about 160ms to a few ms.
I think without the numpy support arrays are accessed through a very
slow list protocol. (I'm just guessing in the dark.)

I use pyOpenGL with great pleasure to display medical/microscopy
images on mutli-color, color-maps using 2d-textures.  It works very
fast.

Thanks,
Sebastian


From markbak at gmail.com  Fri Aug 24 11:06:35 2007
From: markbak at gmail.com (mark)
Date: Fri, 24 Aug 2007 15:06:35 -0000
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
Message-ID: <1187967995.784494.4110@x35g2000prf.googlegroups.com>

There may be multiple nan-s, but what Chris did is simply create one
with the same nan's

>>> a = N.array((1,2,3,N.nan))
>>> b = N.array((1,2,3,N.nan))

I think these should be the same.
Can anybody give me a good reason why they shouldn't, because it could
confuse a lot of people?

Thanks, Mark

ps. I have to admit though, that matlab does the same thing. nan==nan
is false.

On Aug 24, 4:51 am, Warren Focke <fo... at slac.stanford.edu> wrote:
> On Thu, 23 Aug 2007, Christopher Barker wrote:
> > but that feels like a kludge. maybe some sort of "TheseArrays are binary
> > equal" would be useful.
>
> But there are multiple possible NaNs, so you couldn't rely on the bits
> comparing.
>
> Maybe something with masked arrays?
>
> w
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion


From matthieu.brucher at gmail.com  Fri Aug 24 11:25:43 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 24 Aug 2007 17:25:43 +0200
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <1187967995.784494.4110@x35g2000prf.googlegroups.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
Message-ID: <e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>

2007/8/24, mark <markbak at gmail.com>:
>
> There may be multiple nan-s, but what Chris did is simply create one
> with the same nan's
>
> >>> a = N.array((1,2,3,N.nan))
> >>> b = N.array((1,2,3,N.nan))
>
> I think these should be the same.
> Can anybody give me a good reason why they shouldn't, because it could
> confuse a lot of people?
>
> Thanks, Mark
>

It's the IEEE norm for flotting point numbers. You can have sevaral
different NaN, although in this case, they are the same kind.
Even if they are the same kind, the norm tells that NaN != NaN.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/c934dd3f/attachment.html>

From Glen.Mabey at swri.org  Fri Aug 24 11:46:39 2007
From: Glen.Mabey at swri.org (Glen W. Mabey)
Date: Fri, 24 Aug 2007 10:46:39 -0500
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
Message-ID: <20070824154639.GA21230@bams.ccf.swri.edu>

On Fri, Aug 24, 2007 at 05:25:43PM +0200, Matthieu Brucher wrote:
> It's the IEEE norm for flotting point numbers. You can have sevaral
> different NaN, although in this case, they are the same kind.
> Even if they are the same kind, the norm tells that NaN != NaN.

Someone mentioned using masked arrays.  There is one "standard" mask
that comes with numpy.ma (dunno about maskedarray -- is that still in
the scipy sandbox?).

Anyway, there could be another standard mask for NaN, which would serve
to simplify the answer to those who encounter this in the future ...

Glen


From cournape at gmail.com  Fri Aug 24 12:04:38 2007
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 25 Aug 2007 01:04:38 +0900
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
Message-ID: <5b8d13220708240904x3ea17c08u5f5fbb2130b39bd8@mail.gmail.com>

On 8/25/07, Matthieu Brucher <matthieu.brucher at gmail.com> wrote:
>
>
> 2007/8/24, mark <markbak at gmail.com>:
> > There may be multiple nan-s, but what Chris did is simply create one
> > with the same nan's
> >
> > >>> a = N.array((1,2,3,N.nan))
> > >>> b = N.array((1,2,3,N.nan))
> >
> > I think these should be the same.
> > Can anybody give me a good reason why they shouldn't, because it could
> > confuse a lot of people?
> >
> > Thanks, Mark
> >
>
> It's the IEEE norm for flotting point numbers. You can have sevaral
> different NaN, although in this case, they are the same kind.
> Even if they are the same kind, the norm tells that NaN != NaN.
>
AFAIK, this is the definition of Nan, eg on a system which FPU is IEEE
compatible, a number is x is NAN iff x != x. A Nan is defined at the
binary level as having the exponent to 1 everywhere, and any non zero
value in the mantissa:

 http://en.wikipedia.org/wiki/NaN

Personaly, I would simply compare the non Nan numbers if Nan is a
possible outcome of the operation. Checking at the binary level may
make sense, but it really depends on the cases.

David


From Chris.Barker at noaa.gov  Fri Aug 24 12:08:05 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 24 Aug 2007 09:08:05 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
Message-ID: <46CF0265.7040005@noaa.gov>

Matthieu Brucher wrote:
> 2007/8/24, mark <markbak at gmail.com <mailto:markbak at gmail.com>>:
>     There may be multiple nan-s, but what Chris did is simply create one
>     with the same nan's
> 
>      >>> a = N.array((1,2,3,N.nan))
>      >>> b = N.array((1,2,3,N.nan))
> 
>     I think these should be the same.

I'm the OP, but It depends what you mean by "the same". Yes, these two 
arrays are the same, and that's what I want to test for in this case. 
However, in the mathematical sense, I do understand what NaN == NaN 
should be false -- if you're doing math, those NaN's could have been 
arrived at by very different calculations, so you really wouldn't want 
them to compare equal, so the IEEE standard that NaN does not compare 
equal to anything makes sense to me.

However, what I'm doing is testing to make sure I got the result I 
expected, so I want to know if two arrays are the same, including NaN's 
in the same places. If I wasn't working with an array package, I guess 
I'd be testing for NaN specifically where I expect it, so the solution I 
came up with before makes the most sense:

N.alltrue(a[~N.isnan(a)] == b[~N.isnan(b)])

However, it's not likely, but that could give a true result if the NaN's 
were in different places, but there were the same number and everything 
happened to work out right. So maybe there is a need for a:

nanequal, to go with:

nanargmax
nanargmin
nanmax
nanmin
nansum

> You can have several  different NaN, 

You can? I thought NaN was defined by IEEE 754 as a particular bit 
pattern (one for each precision, anyway).

Warren Focke wrote:
> Maybe something with masked arrays?

In this case, I'm using NaN to mean: "no valid data", so masked arrays 
are probably a better solution anyway. However, I like the simplicity of 
storing a non-value in the same binary array.

However, if I do go with masked arrays:

What's the status of the two masked array implementations? Which should 
I use? Unless there are huge feature differences (which I don't think 
there are), then I want to use the one that's going to get maintained 
into the future -- do we know yet which that will be?

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From tim.hochberg at ieee.org  Fri Aug 24 12:15:09 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Fri, 24 Aug 2007 09:15:09 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <46CF0265.7040005@noaa.gov>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
	<46CF0265.7040005@noaa.gov>
Message-ID: <e4412d6b0708240915p507d9c10sc0930c121b65293a@mail.gmail.com>

On 8/24/07, Christopher Barker <Chris.Barker at noaa.gov> wrote:
[SNIP]


> You can have several  different NaN,
>
> You can? I thought NaN was defined by IEEE 754 as a particular bit
> pattern (one for each precision, anyway).


There's more than one way to spell NaN in binary and they tend to mean
different things IIRC. Signalling NaNs and quiet NaNs and all of that. (Can
you tell how superficial my knowledge is here, good).

However, if you are inserting the NaNs yourself as placeholders, then they
should all be the same kind and a binary comparison should be fine.


[SNIP]

-Chris
>
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/ffec7344/attachment.html>

From David.L.Goldsmith at noaa.gov  Fri Aug 24 12:23:23 2007
From: David.L.Goldsmith at noaa.gov (David Goldsmith)
Date: Fri, 24 Aug 2007 09:23:23 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <1187967995.784494.4110@x35g2000prf.googlegroups.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
Message-ID: <46CF05FB.4040700@noaa.gov>

What is meant by "multiple nan-s"?

DG

mark wrote:
> There may be multiple nan-s, but what Chris did is simply create one
> with the same nan's
>
>   
>>>> a = N.array((1,2,3,N.nan))
>>>> b = N.array((1,2,3,N.nan))
>>>>         
>
> I think these should be the same.
> Can anybody give me a good reason why they shouldn't, because it could
> confuse a lot of people?
>
> Thanks, Mark
>
> ps. I have to admit though, that matlab does the same thing. nan==nan
> is false.
>
> On Aug 24, 4:51 am, Warren Focke <fo... at slac.stanford.edu> wrote:
>   
>> On Thu, 23 Aug 2007, Christopher Barker wrote:
>>     
>>> but that feels like a kludge. maybe some sort of "TheseArrays are binary
>>> equal" would be useful.
>>>       
>> But there are multiple possible NaNs, so you couldn't rely on the bits
>> comparing.
>>
>> Maybe something with masked arrays?
>>
>> w
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>>     
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>   

-- 
ERD/ORR/NOS/NOAA <http://response.restoration.noaa.gov/emergencyresponse/>


From David.L.Goldsmith at noaa.gov  Fri Aug 24 12:33:04 2007
From: David.L.Goldsmith at noaa.gov (David Goldsmith)
Date: Fri, 24 Aug 2007 09:33:04 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <46CF05FB.4040700@noaa.gov>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<46CF05FB.4040700@noaa.gov>
Message-ID: <46CF0840.400@noaa.gov>

Never mind.  (Posted that before finishing the thread, sorry).

DG

David Goldsmith wrote:
> What is meant by "multiple nan-s"?
>
> DG
>
> mark wrote:
>   
>> There may be multiple nan-s, but what Chris did is simply create one
>> with the same nan's
>>
>>   
>>     
>>>>> a = N.array((1,2,3,N.nan))
>>>>> b = N.array((1,2,3,N.nan))
>>>>>         
>>>>>           
>> I think these should be the same.
>> Can anybody give me a good reason why they shouldn't, because it could
>> confuse a lot of people?
>>
>> Thanks, Mark
>>
>> ps. I have to admit though, that matlab does the same thing. nan==nan
>> is false.
>>
>> On Aug 24, 4:51 am, Warren Focke <fo... at slac.stanford.edu> wrote:
>>   
>>     
>>> On Thu, 23 Aug 2007, Christopher Barker wrote:
>>>     
>>>       
>>>> but that feels like a kludge. maybe some sort of "TheseArrays are binary
>>>> equal" would be useful.
>>>>       
>>>>         
>>> But there are multiple possible NaNs, so you couldn't rely on the bits
>>> comparing.
>>>
>>> Maybe something with masked arrays?
>>>
>>> w
>>>
>>> _______________________________________________
>>> Numpy-discussion mailing list
>>> Numpy-discuss... at scipy.orghttp://projects.scipy.org/mailman/listinfo/numpy-discussion
>>>     
>>>       
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at scipy.org
>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>>   
>>     
>
>   

-- 
ERD/ORR/NOS/NOAA <http://response.restoration.noaa.gov/emergencyresponse/>


From tim.hochberg at ieee.org  Fri Aug 24 12:40:51 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Fri, 24 Aug 2007 09:40:51 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <e4412d6b0708240915p507d9c10sc0930c121b65293a@mail.gmail.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
	<46CF0265.7040005@noaa.gov>
	<e4412d6b0708240915p507d9c10sc0930c121b65293a@mail.gmail.com>
Message-ID: <e4412d6b0708240940k515ba9c9t6bdd23ab768a99c3@mail.gmail.com>

On 8/24/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/24/07, Christopher Barker <Chris.Barker at noaa.gov> wrote:
> [SNIP]
>
>
> > You can have several  different NaN,
> >
> > You can? I thought NaN was defined by IEEE 754 as a particular bit
> > pattern (one for each precision, anyway).
>
>
> There's more than one way to spell NaN in binary and they tend to mean
> different things IIRC. Signalling NaNs and quiet NaNs and all of that. (Can
> you tell how superficial my knowledge is here, good).
>
> However, if you are inserting the NaNs yourself as placeholders, then they
> should all be the same kind and a binary comparison should be fine.
>

To beat this horse a little more:

IEEE 754 NaNs are represented with the exponential field filled with ones
and some non-zero number in the mantissa. A bit-wise example of a IEEE
floating-point standard<http://en.wikipedia.org/wiki/IEEE_floating-point_standard>single
precision NaN: x11111111axxxxxxxxxxxxxxxxxxxxxx. x = undefined. If a
= 1, it is a *quiet NaN*, otherwise it is a *signalling NaN*.

That's from http://en.wikipedia.org/wiki/NaN#NaN_encodings.

So there a bunch of undefined bits that could be set for the private use of
whoever is producing the NaNs for their own purposes. I don't know how often
those bits vary in practice, but in principle it's not safe to rely on NaNs
being bitwise equal.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/dd08928b/attachment.html>

From Chris.Barker at noaa.gov  Fri Aug 24 12:53:51 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Fri, 24 Aug 2007 09:53:51 -0700
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <e4412d6b0708240940k515ba9c9t6bdd23ab768a99c3@mail.gmail.com>
References: <46CE278C.6010804@noaa.gov>
	<Pine.LNX.4.64.0708231950010.23936@circinus.slac.stanford.edu>
	<1187967995.784494.4110@x35g2000prf.googlegroups.com>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
	<46CF0265.7040005@noaa.gov>
	<e4412d6b0708240915p507d9c10sc0930c121b65293a@mail.gmail.com>
	<e4412d6b0708240940k515ba9c9t6bdd23ab768a99c3@mail.gmail.com>
Message-ID: <46CF0D1F.8030503@noaa.gov>

Timothy Hochberg wrote:
> in principle it's not safe to rely on NaNs being bitwise equal.

Thanks Tim, I always learn a lot on this list.

Anyway, I think my suggestion of "binary equal" wasn't really what I 
want. What I want is essentially a NaN-safe comparison, much like the 
NaN-safe functions like nanmax, nanmin, etc.

I guess what that would involve is looping through the arrays, checking 
for "==", then checking if both are NaN if it returns false. (or 
checking for the NaN's first).

I'm not sure NaNifying the other comparisons would make any sense: NaN > 
NaN (and all the other comparison's would have to return False anyway.

So, would a nanequal function be useful? would it be hard to write?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From pgmdevlist at gmail.com  Fri Aug 24 13:03:40 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 24 Aug 2007 13:03:40 -0400
Subject: [Numpy-discussion] comparing arrays with NaN in them.
In-Reply-To: <46CF0265.7040005@noaa.gov>
References: <46CE278C.6010804@noaa.gov>
	<e76aa17f0708240825v5a659a86n430e99c86a4d4436@mail.gmail.com>
	<46CF0265.7040005@noaa.gov>
Message-ID: <200708241303.41843.pgmdevlist@gmail.com>

All,

Using the maskedarray package:
>>>import maskedarray as ma
>>>x = numpy.array([1,numpy.nan,3])
>>>y = numpy.array([1,numpy.nan,3])
>>>ma.allclose(ma.array(x,mask=numpy.isnan(x)),ma.array(y,mask=numpy.isnan(y)) )
True

or even simpler:
>>> maskedarray.testutils.assert_equal(x,y)

#........................................

> What's the status of the two masked array implementations? 

One is official but no longer really supported (numpy.ma), one is still 
unofficial but fully functional (maskedarray), and supported (by me at 
least). My understanding is that maskedarray will stay in the sandbox as long 
as we don't have enough feedback from users.


> Which should 
> I use? Unless there are huge feature differences (which I don't think
> there are), 

Actually there is at least one big difference:

the masked arrays you get from numpy.ma are NOT ndarrays. Therefore, a code 
like:
>>>numpy.asanyarray(numpy.ma.array([1,2,3],mask=[0,1,0]))
array([1, 2, 3])
loses your mask.

On the other side, the maskedarray package (still in the sandbox) implements 
masked arrays as a subclass of ndarrays, so:
>>>numpy.asanyarray(maskedarray.array([1,2,3],mask=[0,1,0]))
masked_array(data = [1 -- 3],
      mask = [False  True False],
      fill_value=999999)

Apart from that, maskedarray implements more functions and methods than are 
available in numpy.ma.

> then I want to use the one that's going to get maintained 
> into the future -- do we know yet which that will be?

I've already committed myself to the support of maskedarray for the time 
being.

Eric Firing and I have been in contact over the last few weeks about how to 
optimize maskedarray, for example by porting part of the code to C. There are 
still a couple of conceptual issues we need to address first, as presented in 
another thread


From robert.kern at gmail.com  Fri Aug 24 15:46:16 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 24 Aug 2007 14:46:16 -0500
Subject: [Numpy-discussion] pyOpenGL with numpy support
In-Reply-To: <bc657ead0708240222o16d2bd03ub44c6fec016cfc4f@mail.gmail.com>
References: <bc657ead0708240222o16d2bd03ub44c6fec016cfc4f@mail.gmail.com>
Message-ID: <46CF3588.8020206@gmail.com>

Sebastian Haase wrote:
> Hi,
> The latest release notes of pyOpenGL (Feb 15, 2007) say that "Numarray
> support [was] reenabled".
> The current version is 3.0.0a6.
> 
> Does anyone here know the status of the new (ctypes based) pyOpenGL ?
> How is the binding to ("modern") numpy ?

numpy is the primary array type in PyOpenGL 3.0.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From seandavi at gmail.com  Fri Aug 24 17:03:20 2007
From: seandavi at gmail.com (Sean Davis)
Date: Fri, 24 Aug 2007 17:03:20 -0400
Subject: [Numpy-discussion] Dict of lists to numpy recarray
Message-ID: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com>

I have a simple question (I assume), but I can't quite get a handle on the
answer.  I have a dict with each member a list having a long (>5M
elements).  I would like to convert that into a numpy recarray.  So far, my
only thought is to loop over the length of the lists and convert to a list
of tuples--this is SLOW.  What I really need to be able to do is to supply
columns of data to create a recarray, but I haven't found an example of how
to do that.

Thanks,
Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/8a715bc5/attachment.html>

From seandavi at gmail.com  Fri Aug 24 17:45:18 2007
From: seandavi at gmail.com (Sean Davis)
Date: Fri, 24 Aug 2007 17:45:18 -0400
Subject: [Numpy-discussion] Dict of lists to numpy recarray
In-Reply-To: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com>
References: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com>
Message-ID: <264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com>

On 8/24/07, Sean Davis <seandavi at gmail.com> wrote:
>
> I have a simple question (I assume), but I can't quite get a handle on the
> answer.  I have a dict with each member a list having a long (>5M
> elements).  I would like to convert that into a numpy recarray.  So far, my
> only thought is to loop over the length of the lists and convert to a list
> of tuples--this is SLOW.  What I really need to be able to do is to supply
> columns of data to create a recarray, but I haven't found an example of how
> to do that.


Sorry for the noise.  Found it.

newrecarray = numpy.rec.fromarrays
([x1,x2,x3],names='x1,x2,x3',formats='f8,i8,i8')

Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070824/a5ecb274/attachment.html>

From aisaac at american.edu  Fri Aug 24 17:50:12 2007
From: aisaac at american.edu (Alan Isaac)
Date: Fri, 24 Aug 2007 17:50:12 -0400
Subject: [Numpy-discussion] additional thanks
Message-ID: <Mahogany-0.67.0-1728-20070824-175012.00@CASC64043305.american.edu>

I know thanks have already been offered,
but I hope one more on the list will be acceptable.

I start classes next week, in Economics.
It is easy to discourage some of my students,
if the "getting started" part of new software is rough.
The new compatible NumPy and SciPy binaries are VERY HELPFUL!!!

Thanks!
Alan Isaac

PS Just a warning to others in my position:
students using VISTA are reporting install difficulties
for Python 2.5.1.  It sounds like a fix is to proceed as at
<URL:http://www.sephiroth.it/weblog/archives/2006/06/installing_python_msi_on_windo.php>
but as I do not have access to a VISTA machine I have not been able to test this.


From ryanlists at gmail.com  Fri Aug 24 17:55:46 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Fri, 24 Aug 2007 16:55:46 -0500
Subject: [Numpy-discussion] additional thanks
In-Reply-To: <Mahogany-0.67.0-1728-20070824-175012.00@CASC64043305.american.edu>
References: <Mahogany-0.67.0-1728-20070824-175012.00@CASC64043305.american.edu>
Message-ID: <c5b438120708241455h29e1fd35q7636f03154886a41@mail.gmail.com>

I helped a coulpe of my students install on Vista.  It was enough to
right click on the exe and choose "Run as Administrator".  A pop-up
window then comes up asking you if you trust the file or something and
you have to chose an option that is something like, "yes, let it
proceed".

On 8/24/07, Alan Isaac <aisaac at american.edu> wrote:
> I know thanks have already been offered,
> but I hope one more on the list will be acceptable.
>
> I start classes next week, in Economics.
> It is easy to discourage some of my students,
> if the "getting started" part of new software is rough.
> The new compatible NumPy and SciPy binaries are VERY HELPFUL!!!
>
> Thanks!
> Alan Isaac
>
> PS Just a warning to others in my position:
> students using VISTA are reporting install difficulties
> for Python 2.5.1.  It sounds like a fix is to proceed as at
> <URL:http://www.sephiroth.it/weblog/archives/2006/06/installing_python_msi_on_windo.php>
> but as I do not have access to a VISTA machine I have not been able to test this.
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From tom.denniston at alum.dartmouth.org  Fri Aug 24 17:59:53 2007
From: tom.denniston at alum.dartmouth.org (Tom Denniston)
Date: Fri, 24 Aug 2007 16:59:53 -0500
Subject: [Numpy-discussion] Dict of lists to numpy recarray
In-Reply-To: <264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com>
References: <264855a00708241403w3783703dg37f832bee66c2507@mail.gmail.com>
	<264855a00708241445w6b10e0faga2761f63927e0ac1@mail.gmail.com>
Message-ID: <d6f2d3dd0708241459r7b41b787r2a20ee064063bb97@mail.gmail.com>

Try itertools.izipping the lists and then use numpy.fromiter.

On 8/24/07, Sean Davis <seandavi at gmail.com> wrote:
>
>
> On 8/24/07, Sean Davis <seandavi at gmail.com> wrote:
> > I have a simple question (I assume), but I can't quite get a handle on the
> answer.  I have a dict with each member a list having a long (>5M elements).
>  I would like to convert that into a numpy recarray.  So far, my only
> thought is to loop over the length of the lists and convert to a list of
> tuples--this is SLOW.  What I really need to be able to do is to supply
> columns of data to create a recarray, but I haven't found an example of how
> to do that.
>
> Sorry for the noise.  Found it.
>
> newrecarray =
> numpy.rec.fromarrays([x1,x2,x3],names='x1,x2,x3',formats='f8,i8,i8')
>
> Sean
>
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>


From pgmdevlist at gmail.com  Fri Aug 24 20:27:41 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 24 Aug 2007 20:27:41 -0400
Subject: [Numpy-discussion] Maskedarray implementations
Message-ID: <200708242027.41896.pgmdevlist@gmail.com>

All,

As you might be aware, there are currently two concurrent implementations of 
masked arrays in numpy:
* numpy.ma is the official implementation, but it is unclear whether it is 
still actively maintained.
* maskedarray is the alternative I've been developing initially for my own 
purpose from numpy.ma. It is available in the scipy svn sandbox, but is 
already fully functional

The main difference between numpy.ma and maskedarray is that the objects 
created by numpy.ma are NOT ndarrays, while maskedarray.MaskedArray is a full 
subclass of ndarrays. For example:

>>>import numpy, maskedarray
>>>x = numpy.ma.array([1,2], mask=[0,1])
>>>isinstance(x, numpy.ndarray)
False
>>>numpy.asanyarray(x)
array([1,2])
Note that we just lost the mask...

>>>x = maskedarray.array([1,2], mask=[0,1])
>>>isinstance(x, numpy.ndarray)
True
>>>numpy.asanyarray(x)
masked_array(data = [1 --],
? ? ? mask = [False ?True],
? ? ? fill_value=999999)
Note that the mask is conserved.

Having the masked array be a subclass of ndarray makes masked arrays easier to 
mix with other ndarray types and to subclass. An example of application is 
the TimeSeries package, where the main TimeSeries class is a subclass of 
maskedarray.MaskedArray.

* Does anyone see any *disadvantages* to this aspect of maskedarray relative 
to numpy.ma?

* What would be the requisites to move maskedarray out of the sandbox ? We 
hope to be able in the short term to either replace or at least merge the two 
implementations, once a couple of issues are addressed (but we can talk about 
that later...)

Thanks a lot in advance for your feedback
Pierre


From aisaac at american.edu  Fri Aug 24 20:34:48 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Fri, 24 Aug 2007 20:34:48 -0400
Subject: [Numpy-discussion] additional thanks
In-Reply-To: <c5b438120708241455h29e1fd35q7636f03154886a41@mail.gmail.com>
References: <Mahogany-0.67.0-1728-20070824-175012.00@CASC64043305.american.edu><c5b438120708241455h29e1fd35q7636f03154886a41@mail.gmail.com>
Message-ID: <Mahogany-0.67.0-1568-20070824-203448.00@american.edu>

On Fri, 24 Aug 2007, Ryan Krauss apparently wrote:
> I helped a couple of my students install on Vista.  It was 
> enough to right click on the exe and choose "Run as 
> Administrator".  A pop-up window then comes up asking you 
> if you trust the file or something and you have to chose 
> an option that is something like, "yes, let it proceed". 

OK.  I was not present for the installs.  (Our classes start 
next week.) I did of course check that they were installing 
as Administrator, and they claimed "yes".  I'll know more 
next week.

Thanks,
Alan Isaac


From ryanlists at gmail.com  Fri Aug 24 20:44:15 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Fri, 24 Aug 2007 19:44:15 -0500
Subject: [Numpy-discussion] additional thanks
In-Reply-To: <Mahogany-0.67.0-1568-20070824-203448.00@american.edu>
References: <Mahogany-0.67.0-1728-20070824-175012.00@CASC64043305.american.edu>
	<c5b438120708241455h29e1fd35q7636f03154886a41@mail.gmail.com>
	<Mahogany-0.67.0-1568-20070824-203448.00@american.edu>
Message-ID: <c5b438120708241744pdb6600eyfe426487d00f5181@mail.gmail.com>

I think in Vista this is different from being logged into an
administrator account (which I think is different from XP).  I don't
actually have a Vista machine to test on, but did help my students do
it on theirs.  No idea how their user accounts were set up.  So, I
think you have to right click and choose "Run as Administrator"
regardless of what account you are logged into (I think, based on my
limited Vista experience).

Ryan

On 8/24/07, Alan G Isaac <aisaac at american.edu> wrote:
> On Fri, 24 Aug 2007, Ryan Krauss apparently wrote:
> > I helped a couple of my students install on Vista.  It was
> > enough to right click on the exe and choose "Run as
> > Administrator".  A pop-up window then comes up asking you
> > if you trust the file or something and you have to chose
> > an option that is something like, "yes, let it proceed".
>
> OK.  I was not present for the installs.  (Our classes start
> next week.) I did of course check that they were installing
> as Administrator, and they claimed "yes".  I'll know more
> next week.
>
> Thanks,
> Alan Isaac
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From David.L.Goldsmith at noaa.gov  Fri Aug 24 20:49:17 2007
From: David.L.Goldsmith at noaa.gov (David Goldsmith)
Date: Fri, 24 Aug 2007 17:49:17 -0700
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
Message-ID: <46CF7C8D.90005@noaa.gov>

Pierre GM wrote:
> * Does anyone see any *disadvantages* to this aspect of maskedarray relative 
> to numpy.ma?
>   
What *is* numpy.ma derived from?

DG
-- 
ERD/ORR/NOS/NOAA <http://response.restoration.noaa.gov/emergencyresponse/>


From pgmdevlist at gmail.com  Fri Aug 24 21:08:48 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 24 Aug 2007 21:08:48 -0400
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <46CF7C8D.90005@noaa.gov>
References: <200708242027.41896.pgmdevlist@gmail.com> <46CF7C8D.90005@noaa.gov>
Message-ID: <200708242108.51833.pgmdevlist@gmail.com>

On Friday 24 August 2007 20:49:17 David Goldsmith wrote:
> Pierre GM wrote:
> > * Does anyone see any *disadvantages* to this aspect of maskedarray
> > relative to numpy.ma?
>
> What *is* numpy.ma derived from?

If you're talking about numpy.ma arrays:

A numpy.ma.MaskedArray is an independent object consisting of two ndarrays 
(one for the data, one for the mask). 
A maskedarray.MaskedArray is a ndarray with another ndarray as attribute (the 
mask). Therefore, it inherits the methods of a ndarray.

>>>import numpy, maskedarray
>>>x = numpy.ma.array([1,2,3],mask=[1,0,0])
>>>type(x._data),type(x._mask)
(<type 'numpy.ndarray'>, <type 'numpy.ndarray'>)
>>>x.view(numpy.ndarray)
NotImplementedError: not yet implemented for numpy.ma arrays

>>>x = maskedarray.array([1,2,3],mask=[1,0,0])
(<type 'numpy.ndarray'>, <type 'numpy.ndarray'>)
>>>x.view(numpy.ndarray)
array([1, 2, 3])

If you're talking about the package itself:
numpy.ma derives from the corresponding Numeric module, written by Paul 
Dubois. The maskedarray implementation relies quite heavily on Paul's work, I 
can't thank him enough.


From jks at iki.fi  Fri Aug 24 23:08:33 2007
From: jks at iki.fi (=?iso-8859-1?Q?Jouni_K=2E_Sepp=E4nen?=)
Date: Sat, 25 Aug 2007 06:08:33 +0300
Subject: [Numpy-discussion] Finding unique rows in an array
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187190071.384881.240470@w3g2000hsg.googlegroups.com>
	<46CAF562.9060009@cc.usu.edu> <200708221111.17141.faltet@carabos.com>
Message-ID: <m2sl6871la.fsf@iki.fi>

Francesc Altet <faltet at carabos.com> writes:

> A Tuesday 21 August 2007, Mark.Miller escrigu?:
>> Is there a good loopless way to identify all of the unique rows in an
>> array?  Something like numpy.unique() is ideal, but capable of
>> extracting unique subarrays along an axis.
>
> You can always do a view of the rows as strings and then use unique().

For large arrays it probably makes sense to hash the rows by taking a
dot product with a random vector. Then sort the hash values and identify
blocks of equal values (allowing for rounding errors). Rows with
different hash values are guaranteed to be different; for blocks of rows
with the same hash value, you'll have to check, but this will probably
be much less work than checking every row, and (I hope) BLAS makes the
dot-product phase go fast.

-- 
Jouni K. Sepp?nen
http://www.iki.fi/jks


From lxander.m at gmail.com  Sat Aug 25 11:43:32 2007
From: lxander.m at gmail.com (Alexander Michael)
Date: Sat, 25 Aug 2007 11:43:32 -0400
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
Message-ID: <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com>

Is there any documentation available for your maskedarray? I would
like to get a feel for the basics, like how do I take the dot product,
do elementwise multiplication, etc, with your implementation.

Thanks,
Alex


From v.tini at tu-bs.de  Tue Aug 21 08:11:37 2007
From: v.tini at tu-bs.de (Vivian Tini)
Date: Tue, 21 Aug 2007 14:11:37 +0200
Subject: [Numpy-discussion] Installation problem NumPy-1.0.3 on Linux x86_64
	Python 2.4.2
Message-ID: <1187698297.46cad67939e01@webmail.tu-bs.de>

Dear All, 
 
I am trying to install the package NumPy-1.0.3 on Linux x86_64 with Python 
version 2.4.2 and after using the standard installation command : 
 
 python setup.py install 
 
I received the following error message: 
error: could not create '/usr/local/lib64/python2.4/site-packages/numpy': 
Permission denied 
 
What is the cause of this problem? What kind of permission do I need? 
 
The python executable is located in /usr/bin however the numpy directory from 
where I tried to install is located in /home/../numpy-1.0.3. 
 
I hope someone could help me to figure out how shall I proceed. Thank you very 
much in advance. 
 
Regards 
 
Vivian Tini 

-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 3163
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/bf5f09da/attachment.mht>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: unnamed
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070821/bf5f09da/attachment.ksh>

From v.tini at tu-bs.de  Fri Aug 24 11:46:55 2007
From: v.tini at tu-bs.de (Vivian Tini)
Date: Fri, 24 Aug 2007 17:46:55 +0200
Subject: [Numpy-discussion] problem on testing numpy
Message-ID: <1187970415.46cefd6f7805f@webmail.tu-bs.de>

Dear All,

I have just installed NumPy and I am excited to test it.
Since I have no access as root then I installed Numpy in my home directory. 
The following messages appears as I tried some commands:

>>> import numpy 
Running from numpy source directory

>>> from numpy import *
>>> a = array([1,2,3])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'array' is not defined

>>> import Numeric 
>>> from Numeric import *
>>> a = array([1,2,3]) # this works fine 
>>> b = array([4,5,6])
>>> c = inner(a,b)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'inner' is not defined

How should I proceed to make either the numpy or Numeric works? Is it the 
problem from the installation?

Thanks a lot in advance.

Regards,

Vivian Tini
     

From ondrej at certik.cz  Sun Aug 19 23:30:19 2007
From: ondrej at certik.cz (Ondrej Certik)
Date: Sun, 19 Aug 2007 20:30:19 -0700
Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x
In-Reply-To: <db6b5ecc0708191751n65bd47fat74c244bb7dd222eb@mail.gmail.com>
References: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>
	<db6b5ecc0708191751n65bd47fat74c244bb7dd222eb@mail.gmail.com>
Message-ID: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com>

> I just wanted to give you a public, huge thank you for tackling this
> most thankless but important problem.  Many people at the just
> finished SciPy'07 conference mentioned better deployment/installation
> support as their main issue with scipy.  Our tools are maturing, but
> we won't get very far if they don't actually get in the hands of
> users.

I think all of the developers should make sure, that scipy and numpy
installs natively in their own favourite distribution. So for example
I am using Debian, so I'll try to keep an eye on it and help the
maintainer of the deb package. This way it should cover the most
distributions.

Ondrej

P.S. I don't know what the native way of installing packages on Mac OS
X is, but I know of the fink project, that basically allows to use
debian packages:

http://finkproject.org/


From oliphant at enthought.com  Fri Aug 24 21:43:23 2007
From: oliphant at enthought.com (Travis Oliphant)
Date: Fri, 24 Aug 2007 19:43:23 -0600
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
Message-ID: <46CF893B.8080007@enthought.com>

Pierre GM wrote:
> All,
>
>
>   
> * Does anyone see any *disadvantages* to this aspect of maskedarray relative 
> to numpy.ma?
>
> * What would be the requisites to move maskedarray out of the sandbox ? We 
> hope to be able in the short term to either replace or at least merge the two 
> implementations, once a couple of issues are addressed (but we can talk about 
> that later...)
>   

I like the direction of this work.  For me, the biggest issue is whether 
or not matplotlib (and other code depending on numpy.ma) works with it.  
I'm pretty sure this can be handled and so, I'd personally like to see it.

Best,

-Travis


From matthieu.brucher at gmail.com  Sat Aug 25 12:31:08 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 25 Aug 2007 18:31:08 +0200
Subject: [Numpy-discussion] Installation problem NumPy-1.0.3 on Linux
	x86_64 Python 2.4.2
In-Reply-To: <1187698297.46cad67939e01@webmail.tu-bs.de>
References: <1187698297.46cad67939e01@webmail.tu-bs.de>
Message-ID: <e76aa17f0708250931h45b554f6oc509c466d791e5d6@mail.gmail.com>

2007/8/21, Vivian Tini <v.tini at tu-bs.de>:
>
> Dear All,
>
> I am trying to install the package NumPy-1.0.3 on Linux x86_64 with Python
> version 2.4.2 and after using the standard installation command :
>
> python setup.py install
>
> I received the following error message:
> error: could not create '/usr/local/lib64/python2.4/site-packages/numpy':
> Permission denied
>
> What is the cause of this problem? What kind of permission do I need?


Root permission. If you want to install it in a local folder, try
--prefix=/home/something/local and set PYTHONPATH to
/home/something/local/lib/python2.4/site-packages

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070825/6782febd/attachment.html>

From matthieu.brucher at gmail.com  Sat Aug 25 12:31:45 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 25 Aug 2007 18:31:45 +0200
Subject: [Numpy-discussion] problem on testing numpy
In-Reply-To: <1187970415.46cefd6f7805f@webmail.tu-bs.de>
References: <1187970415.46cefd6f7805f@webmail.tu-bs.de>
Message-ID: <e76aa17f0708250931p5b277e41n67b70abae196220a@mail.gmail.com>

Where did you launch Python from ?

Matthieu

2007/8/24, Vivian Tini <v.tini at tu-bs.de>:
>
> Dear All,
>
> I have just installed NumPy and I am excited to test it.
> Since I have no access as root then I installed Numpy in my home
> directory.
> The following messages appears as I tried some commands:
>
> >>> import numpy
> Running from numpy source directory
>
> >>> from numpy import *
> >>> a = array([1,2,3])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'array' is not defined
>
> >>> import Numeric
> >>> from Numeric import *
> >>> a = array([1,2,3]) # this works fine
> >>> b = array([4,5,6])
> >>> c = inner(a,b)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'inner' is not defined
>
> How should I proceed to make either the numpy or Numeric works? Is it the
> problem from the installation?
>
> Thanks a lot in advance.
>
> Regards,
>
> Vivian Tini
>
>
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070825/879cd2d4/attachment.html>

From efiring at hawaii.edu  Sat Aug 25 12:50:38 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Sat, 25 Aug 2007 06:50:38 -1000
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com>
Message-ID: <46D05DDE.5060006@hawaii.edu>

Alexander Michael wrote:
> Is there any documentation available for your maskedarray? I would
> like to get a feel for the basics, like how do I take the dot product,
> do elementwise multiplication, etc, with your implementation.
> 
> Thanks,
> Alex
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

Alex,

Pierre wrote some notes about maskedarray here: 
http://projects.scipy.org/scipy/numpy/wiki/MaskedArray
starting half-way down the page.

For normal use, do whatever you would do with numpy.ma; the maskedarray 
implementation is highly compatible, so the same functions and methods 
are available with the same signatures.

Eric


From jdh2358 at gmail.com  Sat Aug 25 14:06:50 2007
From: jdh2358 at gmail.com (John Hunter)
Date: Sat, 25 Aug 2007 13:06:50 -0500
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <46CF893B.8080007@enthought.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<46CF893B.8080007@enthought.com>
Message-ID: <88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com>

On 8/24/07, Travis Oliphant <oliphant at enthought.com> wrote:
> I like the direction of this work.  For me, the biggest issue is whether
> or not matplotlib (and other code depending on numpy.ma) works with it.
> I'm pretty sure this can be handled and so, I'd personally like to see it.

mpl already supports it (both ma and masked array via a config
setting) and we would be very happy to just maskedarray so we don't
have to support both.   Eric Firing added support for this a couple of
months back...  Things like having support for masked record arrays
are a big incentive to use maskedarray for me....

JDH


From pgmdevlist at gmail.com  Sat Aug 25 15:06:06 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sat, 25 Aug 2007 15:06:06 -0400
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <46D05DDE.5060006@hawaii.edu>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com>
	<46D05DDE.5060006@hawaii.edu>
Message-ID: <200708251506.08568.pgmdevlist@gmail.com>

On Saturday 25 August 2007 12:50:38 Eric Firing wrote:
> Alexander Michael wrote:
> > Is there any documentation available for your maskedarray?
>
> Pierre wrote some notes about maskedarray here:
> http://projects.scipy.org/scipy/numpy/wiki/MaskedArray
> starting half-way down the page.

Please note that I should probably edit the page, as it starts to be a bit 
old. We could also start another wiki page for masked arrays...
In addition, there are a lot of unittest available, which can give you a first 
taste.
The 'dot' function in maskedarray takes an additional parameter, strict.
If strict is True, the masked values are propagated: if a masked value appears  
in a row or column, the whole row or column is considered masked. That's 
basically what you would have if masked values were nan (on a float array).
If strict is False, masked values are considered as 0.

> For normal use, do whatever you would do with numpy.ma; the maskedarray
> implementation is highly compatible, so the same functions and methods
> are available with the same signatures.

Please don't hesitate to let me know where the doc is lacking, I'll fix that.

As noted by Eric and John, mpl is fully compatible w/ maskedarray. Until 
recently, one had to edit the matplotlib.numerix.ma manually. Thanks to Eric, 
rcParams now accept a parameter that sets whether numpy.ma or maskedarray 
should be used.

John, masked records are still experimental. I wrote the basis for the code 
(the mrecords module), tweaked it here and there according to the feedback I 
received (not so much so far, I want to thank Matt Knox (with whom we wrote 
TimeSeries) for starting to use mrecords on a regular basis), I'd be of 
course more than happy to fix any problem we may/will run into. An 
interesting feature of masked records is that individual fields can be masked 
(instead of masking full records).


From efiring at hawaii.edu  Sat Aug 25 15:48:00 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Sat, 25 Aug 2007 09:48:00 -1000
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <200708251506.08568.pgmdevlist@gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<525f23e80708250843x722ecfa2h9a61337c457c0923@mail.gmail.com>
	<46D05DDE.5060006@hawaii.edu> <200708251506.08568.pgmdevlist@gmail.com>
Message-ID: <46D08770.7050007@hawaii.edu>

Pierre GM wrote:
> On Saturday 25 August 2007 12:50:38 Eric Firing wrote:
>> Alexander Michael wrote:
>>> Is there any documentation available for your maskedarray?
>> Pierre wrote some notes about maskedarray here:
>> http://projects.scipy.org/scipy/numpy/wiki/MaskedArray
>> starting half-way down the page.
> 
> Please note that I should probably edit the page, as it starts to be a bit 
> old. We could also start another wiki page for masked arrays...

I've made a couple of small "emergency" edits, but a separate page would 
make things much more visible and less confusing.

Eric


From mattknox_ca at hotmail.com  Sat Aug 25 17:13:44 2007
From: mattknox_ca at hotmail.com (Matt Knox)
Date: Sat, 25 Aug 2007 21:13:44 +0000 (UTC)
Subject: [Numpy-discussion] Maskedarray implementations
References: <200708242027.41896.pgmdevlist@gmail.com>
	<46CF893B.8080007@enthought.com>
	<88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com>
Message-ID: <loom.20070825T224942-553@post.gmane.org>

I think it's reasonably safe to say at this point that most people are in 
favor of the new maskedarray implementation becoming the default numpy.ma at 
some point in the future. So the question is, when/how will the migration 
process be done?

 - just swap the whole thing as is and hope for the best?
 - start a numpy 1.1 branch and put it in there as a replacement for numpy.ma?
 - put it in numpy as a separate module from numpy.ma initially? 
(eg. "numpy.ma_new" ?)
 - some other approach?

The first option would be perfectly fine by me since I don't use the standard 
numpy.ma anyway, but I suspect some people might have a problem with that. So 
what is the best way to do this?

- Matt


From pgmdevlist at gmail.com  Sat Aug 25 20:21:26 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sat, 25 Aug 2007 20:21:26 -0400
Subject: [Numpy-discussion] Maskedarray implementations : new developer zone
	wiki page
In-Reply-To: <46D08770.7050007@hawaii.edu>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<200708251506.08568.pgmdevlist@gmail.com>
	<46D08770.7050007@hawaii.edu>
Message-ID: <200708252021.26965.pgmdevlist@gmail.com>

On Saturday 25 August 2007 15:48:00 Eric Firing wrote:
> I've made a couple of small "emergency" edits, but a separate page would
> make things much more visible and less confusing.

So here it is:
http://projects.scipy.org/scipy/numpy/wiki/MaskedArrayAlternative

Please note the section : Optimizing maskedarray. You'll find the quick 
description of a test case (three implementations of divide) that emerged 
from on off-list discussion with Eric Firing.
The problem can be formulated as "do we need to fill masked arrays before 
processing or not ?". Eric is in favor of the second solution (prefilling 
according to the domain mask),  while the more it goes, the more I'm leaning 
towards the third one "bah, let numpy take care of that."

I would be very grateful if you could post your comments/ideas/suggestions 
about the three implementations on that list. This is an issue I'd like to 
solve ASAP. 

Thanks a lot in advance
Pierre


From stefan at sun.ac.za  Sun Aug 26 08:04:19 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Sun, 26 Aug 2007 14:04:19 +0200
Subject: [Numpy-discussion] problem on testing numpy
In-Reply-To: <1187970415.46cefd6f7805f@webmail.tu-bs.de>
References: <1187970415.46cefd6f7805f@webmail.tu-bs.de>
Message-ID: <20070826120419.GB20731@mentat.za.net>

On Fri, Aug 24, 2007 at 05:46:55PM +0200, Vivian Tini wrote:
> Dear All,
> 
> I have just installed NumPy and I am excited to test it.
> Since I have no access as root then I installed Numpy in my home directory. 
> The following messages appears as I tried some commands:
> 
> >>> import numpy 
> Running from numpy source directory

^^^ You shouldn't be running from the source directory.  Change to
another directory and try again -- it should work.

Cheers
St?fan


From mnandris at btinternet.com  Sun Aug 26 08:45:55 2007
From: mnandris at btinternet.com (Michael Nandris)
Date: Sun, 26 Aug 2007 13:45:55 +0100 (BST)
Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle zero's
Message-ID: <442992.24353.qm@web86509.mail.ird.yahoo.com>

Hi,
Is there an easy way around this problem, that does not involve fixing the API (like using NaN instead of 0.0)?

>>> from numpy.random import multinomial 
>>> multinomial(100,[ 0.2, 0.4, 0.1, 0.3 ])
array([19, 45, 10, 26])
>>> multinomial( 100, [0.2, 0.0, 0.8, 0.0] )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 1173, in mtrand.RandomState.multinomial
TypeError: exceptions must be strings, classes, or instances, not exceptions.ValueError

I found a similar problem in scipy.stats.rv_discrete() which was fixed by adding sort to a dictionary handler: 

"""
def reverse_dict(dict):
    newdict = {}
    for key in dict.keys():            #  DUFF
        newdict[dict[key]] = key
    return newdict
"""

def reverse_dict(dict): 
    newdict = {} 
    for key in dict.keys(): 
        sorted_keys = copy(dict.keys()) 
        sorted_keys.sort() 
    for key in sorted_keys[::-1]:         # NEW
        newdict[dict[key]] = key 
    return newdict 

Obviously this cannot be done with numpy since it runs in C or something which I don't understand. Can anyone help? Numpy is great and the simulation I want to code requires speed.

Thanks for any advice given

Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070826/92d087b5/attachment.html>

From stefan at sun.ac.za  Sun Aug 26 19:36:27 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Mon, 27 Aug 2007 01:36:27 +0200
Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle
	zero's
In-Reply-To: <442992.24353.qm@web86509.mail.ird.yahoo.com>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
Message-ID: <20070826233627.GF14395@mentat.za.net>

Hi Michael

On Sun, Aug 26, 2007 at 01:45:55PM +0100, Michael Nandris wrote:
> Is there an easy way around this problem, that does not involve fixing the API
> (like using NaN instead of 0.0)?
> 
> >>> from numpy.random import multinomial
> >>> multinomial(100,[ 0.2, 0.4, 0.1, 0.3 ])
> array([19, 45, 10, 26])
> >>> multinomial( 100, [0.2, 0.0, 0.8, 0.0] )
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 1173, in mtrand.RandomState.multinomial
> TypeError: exceptions must be strings, classes, or instances, not
> exceptions.ValueError

For some reason, the kahan_sum of [0.2,0.0,0.8,0.0] is ever so
slightly larger than 1.0 (in the order of 1e-16), but I'm not sure
why, yet (this isn't specific to kahan summation -- normal summation
shows the same behaviour).

As a quick workaround, you can subtract 1e-16 from all your
probabilities.

Regards
St?fan


From martin.wiechert at gmx.de  Mon Aug 27 08:57:28 2007
From: martin.wiechert at gmx.de (Martin Wiechert)
Date: Mon, 27 Aug 2007 14:57:28 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
Message-ID: <200708271457.28679.martin.wiechert@gmx.de>

Hi list,

I'm suffering from a strange segfault and would appreciate your help.

I'm calling a small C function using ctypes / numpy.ctypeslib. The function 
works in the sense that it returns correct results. After calling the 
function however I can reliably evoke a segfault by using readline tab 
completion.

I'm not very experienced, but this smells like a memory management bug to me, 
which is strange, because I'm not doing any mallocing/freeing at all in the C 
code.

I could not reproduce the bug in a debug build of python (--without-pymalloc) 
or on another machine. The crashing machine is an eight-way opteron.

Maybe I should mention that the C function calls two lapack fortran functions. 
Can this cause problems?

Anyway, I'm at a loss. Please help!

I've attached the files for reference.

Thanks Martin

P.S.: Here is what valgrind finds in the debug build:

-bash-3.1$ valgrind ~/local/debug/bin/python
==16266== Memcheck, a memory error detector.
==16266== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==16266== Using LibVEX rev 1658, a library for dynamic binary translation.
==16266== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==16266== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==16266== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==16266== For more details, rerun with: -v
==16266==
Python 2.5.1 (r251:54863, Aug 24 2007, 16:13:26)
[GCC 4.1.1 20070105 (Red Hat 4.1.1-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> execfile ('recttest.py')
--16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10
--16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10
78.6006829739 4
78.6006829739
[92353 refs]
>>>
==16266== Conditional jump or move depends on uninitialised value(s)
==16266==    at 0x41361F: parsetok (parsetok.c:189)
==16266==    by 0x4131B6: PyParser_ParseFileFlags (parsetok.c:89)
==16266==    by 0x4E01D2: PyParser_ASTFromFile (pythonrun.c:1381)
==16266==    by 0x4DE15C: PyRun_InteractiveOneFlags (pythonrun.c:770)
==16266==    by 0x4DDF15: PyRun_InteractiveLoopFlags (pythonrun.c:721)
==16266==    by 0x4DDD70: PyRun_AnyFileExFlags (pythonrun.c:690)
==16266==    by 0x412E05: Py_Main (main.c:523)
==16266==    by 0x411D62: main (python.c:23)
[92353 refs]
[37870 refs]
==16266==
==16266== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 98 from 1)
==16266== malloc/free: in use at exit: 2,791,249 bytes in 17,090 blocks.
==16266== malloc/free: 174,713 allocs, 157,623 frees, 376,495,377 bytes 
allocated.
==16266== For counts of detected errors, rerun with: -v
==16266== searching for pointers to 17,090 not-freed blocks.
==16266== checked 5,156,624 bytes.
==16266==
==16266== LEAK SUMMARY:
==16266==    definitely lost: 0 bytes in 0 blocks.
==16266==      possibly lost: 35,400 bytes in 103 blocks.
==16266==    still reachable: 2,755,849 bytes in 16,987 blocks.
==16266==         suppressed: 0 bytes in 0 blocks.
==16266== Reachable blocks (those to which a pointer was found) are not shown.
==16266== To see them, rerun with: --show-reachable=yes


-------------- next part --------------
A non-text attachment was scrubbed...
Name: solver.c
Type: text/x-csrc
Size: 4892 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070827/dd147e91/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cMontecarlo.py
Type: application/x-python
Size: 2657 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070827/dd147e91/attachment.bin>

From seandavi at gmail.com  Mon Aug 27 13:23:34 2007
From: seandavi at gmail.com (Sean Davis)
Date: Mon, 27 Aug 2007 13:23:34 -0400
Subject: [Numpy-discussion] Issue with converting from numpy record to
	list/tuple
Message-ID: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com>

I have a numpy recarray that I want to load into a database using insert
statements.  To do so, I need to convert each record to a tuple.  Here is
what I get (using psycopg2)

In [1]: a[1]
Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:76.00
;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
'chr3:1-199501827',
'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L,
171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L,
23L, 1L)

In [2]: type(a[1])
Out[2]: <class 'numpy.core.records.record'>

In [3]: sqlcommand
Out[3]: 'insert into nbl_tmp values
(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);'

In [4]: cur.execute(sqlcommand,tuple(a[1]))
---------------------------------------------------------------------------
<class 'psycopg2.ProgrammingError'>       Traceback (most recent call last)

/sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/<ipython
console> in <module>()

<class 'psycopg2.ProgrammingError'>: can't adapt

In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:76.00
;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
'chr3:1-199501827',
'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L,
171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L,
23L, 1L)

In [6]: cur.execute(sqlcommand,b)

In [7]: a[1].dtype
Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'),
('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', '|S40'),
('PROBE_SEQUENCE', '|S100'), ('MISMATCH', '<u8'), ('MATCH_INDEX', '<u8'),
('FEATURE_ID', '<u8'), ('ROW_NUM', '<u8'), ('COL_NUM', '<u8'),
('PROBE_CLASS', '|S40'), ('PROBE_ID', '|S40'), ('POSITION', '<u8'),
('DESIGN_ID', '<u8'), ('X', '<u8'), ('Y', '<u8')])

Why does the casting using tuple() not work while cut-and-paste of the a[1]
record into a new variable works just fine?

Thanks,
Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070827/a5c46573/attachment.html>

From Chris.Barker at noaa.gov  Mon Aug 27 13:31:38 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 27 Aug 2007 10:31:38 -0700
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <200708242027.41896.pgmdevlist@gmail.com>
References: <200708242027.41896.pgmdevlist@gmail.com>
Message-ID: <46D30A7A.7080403@noaa.gov>

Pierre GM wrote:
> * Does anyone see any *disadvantages* to this aspect of maskedarray relative 
> to numpy.ma?

Nope, but I sure do see the advantages!

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From Chris.Barker at noaa.gov  Mon Aug 27 14:07:00 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 27 Aug 2007 11:07:00 -0700
Subject: [Numpy-discussion] numpy.random.multinomial() cannot
	handle	zero's
In-Reply-To: <20070826233627.GF14395@mentat.za.net>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net>
Message-ID: <46D312C4.4080504@noaa.gov>

Stefan van der Walt wrote:
> For some reason, the kahan_sum of [0.2,0.0,0.8,0.0] is ever so
> slightly larger than 1.0 (in the order of 1e-16), but I'm not sure
> why, yet (this isn't specific to kahan summation -- normal summation
> shows the same behavior).

Just to make sure -- is the khan_sum "compensated summation"?

Is the kahan_sum closer? -- it should be, though compensated summation 
  is really for adding LOTS of numbers, for 4, it's pointless at best. 
Anyway, binary floating point has its errors, and compensated summation 
can help, but it's still not exact for numbers that can't be exactly 
represented by binary.

i.e. if your result is within 15 decimal digits of the exact result, 
that's as good as it gets.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From Chris.Barker at noaa.gov  Mon Aug 27 14:09:33 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 27 Aug 2007 11:09:33 -0700
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <loom.20070825T224942-553@post.gmane.org>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<46CF893B.8080007@enthought.com>
	<88e473830708251106m10322565g7dcdc1c200ecb6f9@mail.gmail.com>
	<loom.20070825T224942-553@post.gmane.org>
Message-ID: <46D3135D.6000109@noaa.gov>

Matt Knox wrote:
>  - put it in numpy as a separate module from numpy.ma initially? 
> (eg. "numpy.ma_new" ?)

This is the best bet, or we could call the new one ma, and the old one 
ma_old. In any case, the old one needs to stick around until the new one 
has been fully tested for compatibility (and otherwise).

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From l.mastrodomenico at gmail.com  Mon Aug 27 14:21:43 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Mon, 27 Aug 2007 20:21:43 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <200708271457.28679.martin.wiechert@gmx.de>
References: <200708271457.28679.martin.wiechert@gmx.de>
Message-ID: <cc93256f0708271121m67fdd62cyc36cc1565dc5aa1e@mail.gmail.com>

Hi Martin,

2007/8/27, Martin Wiechert <martin.wiechert at gmx.de>:
> I could not reproduce the bug in a debug build of python (--without-pymalloc)
> or on another machine. The crashing machine is an eight-way opteron.

Not sure if it's related to your problem, but on 64-bit architectures
sizeof(ssize_t) is 8.

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com


From seandavi at gmail.com  Mon Aug 27 16:07:33 2007
From: seandavi at gmail.com (Sean Davis)
Date: Mon, 27 Aug 2007 16:07:33 -0400
Subject: [Numpy-discussion] Issue with converting from numpy record to
	list/tuple
In-Reply-To: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com>
References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com>
Message-ID: <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com>

On 8/27/07, Sean Davis <seandavi at gmail.com> wrote:
>
> I have a numpy recarray that I want to load into a database using insert
> statements.  To do so, I need to convert each record to a tuple.  Here is
> what I get (using psycopg2)
>
> In [1]: a[1]
> Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: 76.00
> ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> 'chr3:1-199501827',
> 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L,
> 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L,
> 23L, 1L)
>
> In [2]: type(a[1])
> Out[2]: <class 'numpy.core.records.record'>
>
> In [3]: sqlcommand
> Out[3]: 'insert into nbl_tmp values
> (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);'
>
> In [4]: cur.execute(sqlcommand,tuple(a[1]))
>
> ---------------------------------------------------------------------------
> <class 'psycopg2.ProgrammingError'>       Traceback (most recent call
> last)
>
> /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/<ipython
> console> in <module>()
>
> <class 'psycopg2.ProgrammingError'>: can't adapt
>
> In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm: 76.00
> ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> 'chr3:1-199501827',
> 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L, 171449529L,
> 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104', 6149104L, 5151L,
> 23L, 1L)
>
> In [6]: cur.execute(sqlcommand,b)
>
> In [7]: a[1].dtype
> Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'),
> ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID', '|S40'),
> ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', '<u8'), ('MATCH_INDEX', '<u8'),
> ('FEATURE_ID', '<u8'), ('ROW_NUM', '<u8'), ('COL_NUM', '<u8'),
> ('PROBE_CLASS', '|S40'), ('PROBE_ID', '|S40'), ('POSITION', '<u8'),
> ('DESIGN_ID', '<u8'), ('X', '<u8'), ('Y', '<u8')])
>
> Why does the casting using tuple() not work while cut-and-paste of the
> a[1] record into a new variable works just fine?


I answered part of the question myself.  In the coercion back to tuple from
a record, the datatypes remain numpy datatypes.  Is there a way to convert
back from numpy datatypes to standard python types (string, int, float,
etc.) without needing to check every numpy type and determine the
appropriate python type?  In other words, is there a single function that I
can feed a numpy type to (or a variable that has a numpy type) and have the
standard python type (or an appropriately-coerced variable)?

Thanks,
Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070827/b140133b/attachment.html>

From stefan at sun.ac.za  Mon Aug 27 19:22:53 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 28 Aug 2007 01:22:53 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <cc93256f0708271121m67fdd62cyc36cc1565dc5aa1e@mail.gmail.com>
References: <200708271457.28679.martin.wiechert@gmx.de>
	<cc93256f0708271121m67fdd62cyc36cc1565dc5aa1e@mail.gmail.com>
Message-ID: <20070827232253.GU14395@mentat.za.net>

On Mon, Aug 27, 2007 at 08:21:43PM +0200, Lino Mastrodomenico wrote:
> Hi Martin,
> 
> 2007/8/27, Martin Wiechert <martin.wiechert at gmx.de>:
> > I could not reproduce the bug in a debug build of python (--without-pymalloc)
> > or on another machine. The crashing machine is an eight-way opteron.
> 
> Not sure if it's related to your problem, but on 64-bit architectures
> sizeof(ssize_t) is 8.

You should be able to circumvent this problem by referring to
ctypes.c_size_t or ctypes.int instead of specifying the width
explicitly.

Regards
St?fan


From stefan at sun.ac.za  Mon Aug 27 19:29:56 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 28 Aug 2007 01:29:56 +0200
Subject: [Numpy-discussion] numpy.random.multinomial() cannot
	handle	zero's
In-Reply-To: <46D312C4.4080504@noaa.gov>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
Message-ID: <20070827232956.GV14395@mentat.za.net>

Hi Chris

On Mon, Aug 27, 2007 at 11:07:00AM -0700, Christopher Barker wrote:
> Is the kahan_sum closer? -- it should be, though compensated summation 
>   is really for adding LOTS of numbers, for 4, it's pointless at best. 
> Anyway, binary floating point has its errors, and compensated summation 
> can help, but it's still not exact for numbers that can't be exactly 
> represented by binary.
> 
> i.e. if your result is within 15 decimal digits of the exact result, 
> that's as good as it gets.

I find this behaviour odd for addition.  Under python:

In [7]: 0.8+0.2 > 1.0
Out[7]: False

but using the Pyrex module, it yields true.  You can find the code at

http://mentat.za.net/html/refer/somesumbug.tar.bz2

and compile it using

pyrexc sum.pyx ; python setup.py build_ext -i

When you run the test, it illustrates the problem:

Sum: 1.00000000000000000000000000000000000000000000000000
Is greater than 1.0? True

Cheers
St?fan


From Chris.Barker at noaa.gov  Mon Aug 27 19:46:45 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 27 Aug 2007 16:46:45 -0700
Subject: [Numpy-discussion] numpy.random.multinomial()
	cannot	handle	zero's
In-Reply-To: <20070827232956.GV14395@mentat.za.net>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
	<20070827232956.GV14395@mentat.za.net>
Message-ID: <46D36265.8010208@noaa.gov>

Stefan van der Walt wrote:
> I find this behaviour odd for addition.  Under python:
> 
> In [7]: 0.8+0.2 > 1.0
> Out[7]: False
> 
> but using the Pyrex module, it yields true.

odd. I wonder if one is using extended floating point in the FPU, and 
the other not? What hardware/OS/compiler are you using?

I'm no numerical analyst, I just know enough not to expect floating 
point calculations to be accurate to the last couple digits.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From Chris.Barker at noaa.gov  Mon Aug 27 19:54:21 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Mon, 27 Aug 2007 16:54:21 -0700
Subject: [Numpy-discussion] numpy.random.multinomial()
	cannot	handle	zero's
In-Reply-To: <20070827232956.GV14395@mentat.za.net>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
	<20070827232956.GV14395@mentat.za.net>
Message-ID: <46D3642D.6000100@noaa.gov>

Stefan van der Walt wrote:
> but using the Pyrex module, it yields true.  You can find the code at
> 
> http://mentat.za.net/html/refer/somesumbug.tar.bz2

That link appears to be broken.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From l.mastrodomenico at gmail.com  Mon Aug 27 20:02:53 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Tue, 28 Aug 2007 02:02:53 +0200
Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle
	zero's
In-Reply-To: <46D3642D.6000100@noaa.gov>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
	<20070827232956.GV14395@mentat.za.net> <46D3642D.6000100@noaa.gov>
Message-ID: <cc93256f0708271702u69c87278q3ac0612177066f86@mail.gmail.com>

2007/8/28, Christopher Barker <Chris.Barker at noaa.gov>:
> Stefan van der Walt wrote:
> > but using the Pyrex module, it yields true.  You can find the code at
> >
> > http://mentat.za.net/html/refer/somesumbug.tar.bz2
>
> That link appears to be broken.

The correct one is probably:

    http://mentat.za.net/refer/somesumbug.tar.bz2

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com


From l.mastrodomenico at gmail.com  Mon Aug 27 20:39:51 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Tue, 28 Aug 2007 02:39:51 +0200
Subject: [Numpy-discussion] numpy.random.multinomial() cannot handle
	zero's
In-Reply-To: <20070827232956.GV14395@mentat.za.net>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
	<20070827232956.GV14395@mentat.za.net>
Message-ID: <cc93256f0708271739u11f80e50td39130809ce0530e@mail.gmail.com>

2007/8/28, Stefan van der Walt <stefan at sun.ac.za>:
> I find this behaviour odd for addition.  Under python:
>
> In [7]: 0.8+0.2 > 1.0
> Out[7]: False

Keep in mind that both 0.2 and 0.8 cannot be represented exactly as
floating-point numbers (unless you use decimal floating points, like
the "decimal" module), so the starting point isn't what it appears to
be.

> Sum: 1.00000000000000000000000000000000000000000000000000
> Is greater than 1.0? True

I get True on a x86, gcc-3.3.1, numpy-1.0b5, GNU/Linux box, and False
on x86_64, gcc-4.1.1, numpy-1.0.3.1, GNU/Linux.

YMMV ;-)

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com


From stefan at sun.ac.za  Tue Aug 28 03:07:59 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 28 Aug 2007 09:07:59 +0200
Subject: [Numpy-discussion] numpy.random.multinomial()
	cannot	handle	zero's
In-Reply-To: <46D3642D.6000100@noaa.gov>
References: <442992.24353.qm@web86509.mail.ird.yahoo.com>
	<20070826233627.GF14395@mentat.za.net> <46D312C4.4080504@noaa.gov>
	<20070827232956.GV14395@mentat.za.net> <46D3642D.6000100@noaa.gov>
Message-ID: <20070828070759.GW14395@mentat.za.net>

On Mon, Aug 27, 2007 at 04:54:21PM -0700, Christopher Barker wrote:
> Stefan van der Walt wrote:
> > but using the Pyrex module, it yields true.  You can find the code at
> > 
> > http://mentat.za.net/html/refer/somesumbug.tar.bz2
> 
> That link appears to be broken.

Sorry, http://mentat.za.net/refer/somesumbug.tar.bz2

Cheers
St?fan


From martin.wiechert at gmx.de  Tue Aug 28 04:58:06 2007
From: martin.wiechert at gmx.de (Martin Wiechert)
Date: Tue, 28 Aug 2007 10:58:06 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <20070827232253.GU14395@mentat.za.net>
References: <200708271457.28679.martin.wiechert@gmx.de>
	<cc93256f0708271121m67fdd62cyc36cc1565dc5aa1e@mail.gmail.com>
	<20070827232253.GU14395@mentat.za.net>
Message-ID: <200708281058.06578.martin.wiechert@gmx.de>

Lino and Stefan,

thanks for your suggestion. However, I doubt this is the problem because as 
far as I know numpy.intp is actually ssize_t.

Thanks, Martin


On Tuesday 28 August 2007 01:22, Stefan van der Walt wrote:
> On Mon, Aug 27, 2007 at 08:21:43PM +0200, Lino Mastrodomenico wrote:
> > Hi Martin,
> >
> > 2007/8/27, Martin Wiechert <martin.wiechert at gmx.de>:
> > > I could not reproduce the bug in a debug build of python
> > > (--without-pymalloc) or on another machine. The crashing machine is an
> > > eight-way opteron.
> >
> > Not sure if it's related to your problem, but on 64-bit architectures
> > sizeof(ssize_t) is 8.
>
> You should be able to circumvent this problem by referring to
> ctypes.c_size_t or ctypes.int instead of specifying the width
> explicitly.
>
> Regards
> St?fan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion


From stefan at sun.ac.za  Tue Aug 28 07:11:38 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 28 Aug 2007 13:11:38 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <200708271457.28679.martin.wiechert@gmx.de>
References: <200708271457.28679.martin.wiechert@gmx.de>
Message-ID: <20070828111138.GB14395@mentat.za.net>

Hi Martin

On Mon, Aug 27, 2007 at 02:57:28PM +0200, Martin Wiechert wrote:
> I'm suffering from a strange segfault and would appreciate your help.
> 
> I'm calling a small C function using ctypes / numpy.ctypeslib. The function 
> works in the sense that it returns correct results. After calling the 
> function however I can reliably evoke a segfault by using readline tab 
> completion.
> 
> I'm not very experienced, but this smells like a memory management bug to me, 
> which is strange, because I'm not doing any mallocing/freeing at all in the C 
> code.
> 
> I could not reproduce the bug in a debug build of python (--without-pymalloc) 
> or on another machine. The crashing machine is an eight-way opteron.

I had to #include <unistd.h> in solver, and modify cMonteCarlo not to
depend on GV.  Then, I used

gcc -o solver.os -c -O2 -ggdb -Wall -ansi -pedantic -fPIC solver.c
gcc -o librectify.so -shared solver.os -llapack

to compile.

Please send me the script that excercises the solver, then I will test
on my machines here.

> --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10
> --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10

This could be a valgrind issue.

Cheers
St?fan


From faltet at carabos.com  Tue Aug 28 06:14:22 2007
From: faltet at carabos.com (Francesc Altet)
Date: Tue, 28 Aug 2007 12:14:22 +0200
Subject: [Numpy-discussion] Issue with converting from numpy record to
	list/tuple
In-Reply-To: <264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com>
References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com>
	<264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com>
Message-ID: <200708281214.23333.faltet@carabos.com>

A Monday 27 August 2007, Sean Davis escrigu?:
> On 8/27/07, Sean Davis <seandavi at gmail.com> wrote:
> > I have a numpy recarray that I want to load into a database using
> > insert statements.  To do so, I need to convert each record to a
> > tuple.  Here is what I get (using psycopg2)
> >
> > In [1]: a[1]
> > Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:
> > 76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> > 'chr3:1-199501827',
> > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L,
> > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104',
> > 6149104L, 5151L, 23L, 1L)
> >
> > In [2]: type(a[1])
> > Out[2]: <class 'numpy.core.records.record'>
> >
> > In [3]: sqlcommand
> > Out[3]: 'insert into nbl_tmp values
> > (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);'
> >
> > In [4]: cur.execute(sqlcommand,tuple(a[1]))
> >
> > -------------------------------------------------------------------
> >-------- <class 'psycopg2.ProgrammingError'>       Traceback (most
> > recent call last)
> >
> > /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/<ip
> >ython console> in <module>()
> >
> > <class 'psycopg2.ProgrammingError'>: can't adapt
> >
> > In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank',
> > 'target_tm: 76.00
> > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> > 'chr3:1-199501827',
> > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L,
> > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104',
> > 6149104L, 5151L, 23L, 1L)
> >
> > In [6]: cur.execute(sqlcommand,b)
> >
> > In [7]: a[1].dtype
> > Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'),
> > ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID',
> > '|S40'), ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', '<u8'),
> > ('MATCH_INDEX', '<u8'), ('FEATURE_ID', '<u8'), ('ROW_NUM', '<u8'),
> > ('COL_NUM', '<u8'), ('PROBE_CLASS', '|S40'), ('PROBE_ID', '|S40'),
> > ('POSITION', '<u8'), ('DESIGN_ID', '<u8'), ('X', '<u8'), ('Y',
> > '<u8')])
> >
> > Why does the casting using tuple() not work while cut-and-paste of
> > the a[1] record into a new variable works just fine?
>
> I answered part of the question myself.  In the coercion back to
> tuple from a record, the datatypes remain numpy datatypes.  Is there
> a way to convert back from numpy datatypes to standard python types
> (string, int, float, etc.) without needing to check every numpy type
> and determine the appropriate python type?  In other words, is there
> a single function that I can feed a numpy type to (or a variable that
> has a numpy type) and have the standard python type (or an
> appropriately-coerced variable)?

Use .tolist() method.  Here is an example:

In [92]: r=numpy.empty(5, 'f8,i4,f8')

In [93]: type(tuple(r[0])[0])
Out[93]: <type 'numpy.float64'>

In [94]: type(r[0].tolist()[0])
Out[94]: <type 'float'>

HTH,

-- 
>0,0<   Francesc Altet ? ? http://www.carabos.com/
V   V   C?rabos Coop. V. ??Enjoy Data
 "-"


From seandavi at gmail.com  Tue Aug 28 07:38:29 2007
From: seandavi at gmail.com (Sean Davis)
Date: Tue, 28 Aug 2007 07:38:29 -0400
Subject: [Numpy-discussion] Issue with converting from numpy record to
	list/tuple
In-Reply-To: <200708281214.23333.faltet@carabos.com>
References: <264855a00708271023n9c0076bl1957ebead751ca70@mail.gmail.com>
	<264855a00708271307r75c4b57fufcb19a43316b446d@mail.gmail.com>
	<200708281214.23333.faltet@carabos.com>
Message-ID: <264855a00708280438q3555dc4aq30e43de028554d8@mail.gmail.com>

On 8/28/07, Francesc Altet <faltet at carabos.com> wrote:
>
> A Monday 27 August 2007, Sean Davis escrigu?:
> > On 8/27/07, Sean Davis <seandavi at gmail.com> wrote:
> > > I have a numpy recarray that I want to load into a database using
> > > insert statements.  To do so, I need to convert each record to a
> > > tuple.  Here is what I get (using psycopg2)
> > >
> > > In [1]: a[1]
> > > Out[1]: ('5151_0023_0001', 'FORWARD', 'interval rank', 'target_tm:
> > > 76.00 ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> > > 'chr3:1-199501827',
> > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L,
> > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104',
> > > 6149104L, 5151L, 23L, 1L)
> > >
> > > In [2]: type(a[1])
> > > Out[2]: <class 'numpy.core.records.record'>
> > >
> > > In [3]: sqlcommand
> > > Out[3]: 'insert into nbl_tmp values
> > > (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s);'
> > >
> > > In [4]: cur.execute(sqlcommand,tuple(a[1]))
> > >
> > > -------------------------------------------------------------------
> > >-------- <class 'psycopg2.ProgrammingError'>       Traceback (most
> > > recent call last)
> > >
> > > /sherlock/sdavis/Documents/workspace/svn/watson/Sean/PythonCode/<ip
> > >ython console> in <module>()
> > >
> > > <class 'psycopg2.ProgrammingError'>: can't adapt
> > >
> > > In [5]: b=('5151_0023_0001', 'FORWARD', 'interval rank',
> > > 'target_tm: 76.00
> > > ;probe_tm:70.90;freq:27.93;count:01;rules:0000;score:0658',
> > > 'chr3:1-199501827',
> > > 'AAAGGAATTCCATTCATCTCTGGATATTTTGAAATCATTAGGGCAAACAATAAATAA', 0L,
> > > 171449529L, 171449529L, 1L, 23L, 'experimental', 'CHR03P006149104',
> > > 6149104L, 5151L, 23L, 1L)
> > >
> > > In [6]: cur.execute(sqlcommand,b)
> > >
> > > In [7]: a[1].dtype
> > > Out[7]: dtype([('PROBE_DESIGN_ID', '|S40'), ('CONTAINER', '|S40'),
> > > ('DESIGN_NOTE', '|S80'), ('SELECTION_CRITERIA', '|S80'), ('SEQ_ID',
> > > '|S40'), ('PROBE_SEQUENCE', '|S100'), ('MISMATCH', '<u8'),
> > > ('MATCH_INDEX', '<u8'), ('FEATURE_ID', '<u8'), ('ROW_NUM', '<u8'),
> > > ('COL_NUM', '<u8'), ('PROBE_CLASS', '|S40'), ('PROBE_ID', '|S40'),
> > > ('POSITION', '<u8'), ('DESIGN_ID', '<u8'), ('X', '<u8'), ('Y',
> > > '<u8')])
> > >
> > > Why does the casting using tuple() not work while cut-and-paste of
> > > the a[1] record into a new variable works just fine?
> >
> > I answered part of the question myself.  In the coercion back to
> > tuple from a record, the datatypes remain numpy datatypes.  Is there
> > a way to convert back from numpy datatypes to standard python types
> > (string, int, float, etc.) without needing to check every numpy type
> > and determine the appropriate python type?  In other words, is there
> > a single function that I can feed a numpy type to (or a variable that
> > has a numpy type) and have the standard python type (or an
> > appropriately-coerced variable)?
>
> Use .tolist() method.  Here is an example:
>
> In [92]: r=numpy.empty(5, 'f8,i4,f8')
>
> In [93]: type(tuple(r[0])[0])
> Out[93]: <type 'numpy.float64'>
>
> In [94]: type(r[0].tolist()[0])
> Out[94]: <type 'float'>
>
> HTH,


That will do it.

Thanks,
Sean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070828/cee7e2cf/attachment.html>

From martin.wiechert at gmx.de  Tue Aug 28 08:03:52 2007
From: martin.wiechert at gmx.de (Martin Wiechert)
Date: Tue, 28 Aug 2007 14:03:52 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <20070828111138.GB14395@mentat.za.net>
References: <200708271457.28679.martin.wiechert@gmx.de>
	<20070828111138.GB14395@mentat.za.net>
Message-ID: <200708281403.52444.martin.wiechert@gmx.de>

Wow, thanks a lot for putting so much efffort!

Here's the test script. I'm using it via execfile from an interactive session, 
so I can inspect (and crash with readline) afterwards.

Here's how I compiled:
gcc 
solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas -lgfortran -o 
librectify.so

Thanks, Martin


On Tuesday 28 August 2007 13:11, Stefan van der Walt wrote:
> Hi Martin
>
> On Mon, Aug 27, 2007 at 02:57:28PM +0200, Martin Wiechert wrote:
> > I'm suffering from a strange segfault and would appreciate your help.
> >
> > I'm calling a small C function using ctypes / numpy.ctypeslib. The
> > function works in the sense that it returns correct results. After
> > calling the function however I can reliably evoke a segfault by using
> > readline tab completion.
> >
> > I'm not very experienced, but this smells like a memory management bug to
> > me, which is strange, because I'm not doing any mallocing/freeing at all
> > in the C code.
> >
> > I could not reproduce the bug in a debug build of python
> > (--without-pymalloc) or on another machine. The crashing machine is an
> > eight-way opteron.
>
> I had to #include <unistd.h> in solver, and modify cMonteCarlo not to
> depend on GV.  Then, I used
>
> gcc -o solver.os -c -O2 -ggdb -Wall -ansi -pedantic -fPIC solver.c
> gcc -o librectify.so -shared solver.os -llapack
>
> to compile.
>
> Please send me the script that excercises the solver, then I will test
> on my machines here.
>
> > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10
> > --16266-- DWARF2 CFI reader: unhandled CFI instruction 0:10
>
> This could be a valgrind issue.
>
> Cheers
> St?fan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: recttest.py
Type: application/x-python
Size: 389 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070828/fda4a9f3/attachment.bin>

From pgmdevlist at gmail.com  Tue Aug 28 10:25:46 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 28 Aug 2007 10:25:46 -0400
Subject: [Numpy-discussion] Maskedarray implementations
In-Reply-To: <46D3135D.6000109@noaa.gov>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<loom.20070825T224942-553@post.gmane.org>
	<46D3135D.6000109@noaa.gov>
Message-ID: <200708281025.46269.pgmdevlist@gmail.com>

On Monday 27 August 2007 14:09:33 Christopher Barker wrote:

> This is the best bet, or we could call the new one ma, and the old one
> ma_old. In any case, the old one needs to stick around until the new one
> has been fully tested for compatibility (and otherwise).

That shouldn't be a pb, the tests I've performed so far w/ the two 
implementations seem to run, but sure, that's the wisest.
However, maskedarray is spread on several files (core, extras, mrecords, 
mstats). What would be the best structure for numpy, then ?


From pgmdevlist at gmail.com  Tue Aug 28 10:32:07 2007
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 28 Aug 2007 10:32:07 -0400
Subject: [Numpy-discussion] maskedarray : new developer zone wiki page
In-Reply-To: <46D08770.7050007@hawaii.edu>
References: <200708242027.41896.pgmdevlist@gmail.com>
	<200708251506.08568.pgmdevlist@gmail.com>
	<46D08770.7050007@hawaii.edu>
Message-ID: <200708281032.07372.pgmdevlist@gmail.com>

On Saturday 25 August 2007 15:48:00 Eric Firing wrote:
> I've made a couple of small "emergency" edits, but a separate page would
> make things much more visible and less confusing.

So here it is:
http://projects.scipy.org/scipy/numpy/wiki/MaskedArrayAlternative

Please note the section : Optimizing maskedarray. You'll find the quick 
description of a test case (three implementations of divide) that emerged 
from on off-list discussion with Eric Firing.
The problem can be formulated as "do we need to fill masked arrays before 
processing or not ?". Eric is in favor of the second solution (prefilling 
according to the domain mask),  while the more it goes, the more I'm leaning 
towards the third one "bah, let numpy take care of that."

I would be very grateful if you could post your comments/ideas/suggestions 
about the three implementations on that list. This is an issue I'd like to 
solve ASAP. 

Thanks a lot in advance
Pierre

PS: Sorry if I bumped this thread, I'm not sure on what list I sent it. 
Cross-posting is bad...


From stefan at sun.ac.za  Tue Aug 28 12:45:27 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue, 28 Aug 2007 18:45:27 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <200708281403.52444.martin.wiechert@gmx.de>
References: <200708271457.28679.martin.wiechert@gmx.de>
	<20070828111138.GB14395@mentat.za.net>
	<200708281403.52444.martin.wiechert@gmx.de>
Message-ID: <20070828164527.GA19381@mentat.za.net>

On Tue, Aug 28, 2007 at 02:03:52PM +0200, Martin Wiechert wrote:
> Here's the test script. I'm using it via execfile from an interactive session, 
> so I can inspect (and crash with readline) afterwards.
> 
> Here's how I compiled:
> gcc 
> solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas -lgfortran -o 
> librectify.so

It works perfectly on the two Linux machines I tried (32-bit and
64-bit).  Maybe your lapack isn't healthy?

Cheers
St?fan


From martin.wiechert at gmx.de  Wed Aug 29 05:03:26 2007
From: martin.wiechert at gmx.de (Martin Wiechert)
Date: Wed, 29 Aug 2007 11:03:26 +0200
Subject: [Numpy-discussion] possibly ctypes related segfault
In-Reply-To: <20070828164527.GA19381@mentat.za.net>
References: <200708271457.28679.martin.wiechert@gmx.de>
	<200708281403.52444.martin.wiechert@gmx.de>
	<20070828164527.GA19381@mentat.za.net>
Message-ID: <200708291103.26636.martin.wiechert@gmx.de>

Hmpf. Anyway, thanks again, Stefan!

Cheers, Martin


On Tuesday 28 August 2007 18:45, Stefan van der Walt wrote:
> On Tue, Aug 28, 2007 at 02:03:52PM +0200, Martin Wiechert wrote:
> > Here's the test script. I'm using it via execfile from an interactive
> > session, so I can inspect (and crash with readline) afterwards.
> >
> > Here's how I compiled:
> > gcc
> > solver.c -fPIC -ggdb -shared -llapack -lf77blas -lcblas -latlas
> > -lgfortran -o librectify.so
>
> It works perfectly on the two Linux machines I tried (32-bit and
> 64-bit).  Maybe your lapack isn't healthy?
>
> Cheers
> St?fan


From numpy-discussion at maubp.freeserve.co.uk  Wed Aug 29 07:44:01 2007
From: numpy-discussion at maubp.freeserve.co.uk (Peter)
Date: Wed, 29 Aug 2007 12:44:01 +0100
Subject: [Numpy-discussion] Citing Numeric and numpy
Message-ID: <46D55C01.6050604@maubp.freeserve.co.uk>

Dear Travis and the Numerical Python community,

I would like to know if there is a preferred form for citing the old 
"Numeric" library and more recent "numpy" libraries in a publication.  I 
have checked the mailing list archives, but didn't find an answer.

I am not aware of any publication for the original Numeric library, 
leaving just the project webpage.  Is something like this acceptable?:

David Ascher et al. (2001) Numerical Python, http://www.numpy.org

For NumPy, is it best to cite http://www.numpy.org or the book? Would 
this suffice for the NumPy book citation:

Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA.

It would be nice to have the full citation reference details (e.g. 
publisher's address and explicit year of publication) on the book's 
webpage: http://www.tramy.us

Thanks,

Peter


From ryanlists at gmail.com  Wed Aug 29 09:06:42 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Wed, 29 Aug 2007 08:06:42 -0500
Subject: [Numpy-discussion] Citing Numeric and numpy
In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk>
References: <46D55C01.6050604@maubp.freeserve.co.uk>
Message-ID: <c5b438120708290606i102eb161x796b4cd9a445837@mail.gmail.com>

Obviously this is mainly Travis' question to answer and it depends on
the nature of the reference, but I would like to see Travis's article
in the recent special issue on Python for scientific use in CiSE cited
as well because I think it does a great job of presenting why Python
should be taken seriously as a language for scientific computing.

FWIW,

Ryan

On 8/29/07, Peter <numpy-discussion at maubp.freeserve.co.uk> wrote:
> Dear Travis and the Numerical Python community,
>
> I would like to know if there is a preferred form for citing the old
> "Numeric" library and more recent "numpy" libraries in a publication.  I
> have checked the mailing list archives, but didn't find an answer.
>
> I am not aware of any publication for the original Numeric library,
> leaving just the project webpage.  Is something like this acceptable?:
>
> David Ascher et al. (2001) Numerical Python, http://www.numpy.org
>
> For NumPy, is it best to cite http://www.numpy.org or the book? Would
> this suffice for the NumPy book citation:
>
> Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA.
>
> It would be nice to have the full citation reference details (e.g.
> publisher's address and explicit year of publication) on the book's
> webpage: http://www.tramy.us
>
> Thanks,
>
> Peter
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From ryanlists at gmail.com  Wed Aug 29 09:06:42 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Wed, 29 Aug 2007 08:06:42 -0500
Subject: [Numpy-discussion] Citing Numeric and numpy
In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk>
References: <46D55C01.6050604@maubp.freeserve.co.uk>
Message-ID: <c5b438120708290606i102eb161x796b4cd9a445837@mail.gmail.com>

Obviously this is mainly Travis' question to answer and it depends on
the nature of the reference, but I would like to see Travis's article
in the recent special issue on Python for scientific use in CiSE cited
as well because I think it does a great job of presenting why Python
should be taken seriously as a language for scientific computing.

FWIW,

Ryan

On 8/29/07, Peter <numpy-discussion at maubp.freeserve.co.uk> wrote:
> Dear Travis and the Numerical Python community,
>
> I would like to know if there is a preferred form for citing the old
> "Numeric" library and more recent "numpy" libraries in a publication.  I
> have checked the mailing list archives, but didn't find an answer.
>
> I am not aware of any publication for the original Numeric library,
> leaving just the project webpage.  Is something like this acceptable?:
>
> David Ascher et al. (2001) Numerical Python, http://www.numpy.org
>
> For NumPy, is it best to cite http://www.numpy.org or the book? Would
> this suffice for the NumPy book citation:
>
> Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA.
>
> It would be nice to have the full citation reference details (e.g.
> publisher's address and explicit year of publication) on the book's
> webpage: http://www.tramy.us
>
> Thanks,
>
> Peter
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From ryanlists at gmail.com  Wed Aug 29 09:06:42 2007
From: ryanlists at gmail.com (Ryan Krauss)
Date: Wed, 29 Aug 2007 08:06:42 -0500
Subject: [Numpy-discussion] Citing Numeric and numpy
In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk>
References: <46D55C01.6050604@maubp.freeserve.co.uk>
Message-ID: <c5b438120708290606i102eb161x796b4cd9a445837@mail.gmail.com>

Obviously this is mainly Travis' question to answer and it depends on
the nature of the reference, but I would like to see Travis's article
in the recent special issue on Python for scientific use in CiSE cited
as well because I think it does a great job of presenting why Python
should be taken seriously as a language for scientific computing.

FWIW,

Ryan

On 8/29/07, Peter <numpy-discussion at maubp.freeserve.co.uk> wrote:
> Dear Travis and the Numerical Python community,
>
> I would like to know if there is a preferred form for citing the old
> "Numeric" library and more recent "numpy" libraries in a publication.  I
> have checked the mailing list archives, but didn't find an answer.
>
> I am not aware of any publication for the original Numeric library,
> leaving just the project webpage.  Is something like this acceptable?:
>
> David Ascher et al. (2001) Numerical Python, http://www.numpy.org
>
> For NumPy, is it best to cite http://www.numpy.org or the book? Would
> this suffice for the NumPy book citation:
>
> Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA.
>
> It would be nice to have the full citation reference details (e.g.
> publisher's address and explicit year of publication) on the book's
> webpage: http://www.tramy.us
>
> Thanks,
>
> Peter
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>


From aisaac at american.edu  Wed Aug 29 09:17:21 2007
From: aisaac at american.edu (Alan G Isaac)
Date: Wed, 29 Aug 2007 09:17:21 -0400
Subject: [Numpy-discussion] Citing Numeric and numpy
In-Reply-To: <46D55C01.6050604@maubp.freeserve.co.uk>
References: <46D55C01.6050604@maubp.freeserve.co.uk>
Message-ID: <Mahogany-0.67.0-860-20070829-091721.00@american.edu>

On Wed, 29 Aug 2007, Peter apparently wrote:
> I would like to know if there is a preferred form for 
> citing the old
> "Numeric" library

I'll attach text from the first two pages of *Numerical 
Python* below.

Cheers,
Alan Isaac

-------------------------------------------------------------

An Open Source Project

    Numerical Python

David Ascher
Paul F. Dubois
Konrad Hinsen
Jim Hugunin
Travis Oliphant
with contributions from the Numerical Python community.
September 7, 2001
Lawrence Livermore National Laboratory, Livermore, CA 94566
UCRL?MA?128569

ii
Legal Notice
Please see file Legal.html in the source distribution.

This open source project has been contributed to by many people, including personnel of the Lawrence Liver?
more National Laboratory. The following notice covers those contributions including this manual.
Copyright (c) 1999, 2000, 2001. The Regents of the University of California. All rights reserved.
Permission to use, copy, modify, and distribute this software for any purpose without fee is hereby granted,
provided that this entire notice is included in all copies of any software which is or includes a copy or modifi?
cation of this software and in all copies of the supporting documentation for such software.
This work was produced at the University of California, Lawrence Livermore National Laboratory under con?
tract no. W?7405?ENG?48 between the U.S. Department of Energy and The Regents of the University of Cali?
fornia for the operation of UC LLNL.


From Joris.DeRidder at ster.kuleuven.be  Wed Aug 29 10:56:02 2007
From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder)
Date: Wed, 29 Aug 2007 16:56:02 +0200
Subject: [Numpy-discussion] Trac ticket
Message-ID: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be>

Hi,

Perhaps a stupid question, but I don't seem to find any info about it  
on the web.
I would like to take up a (simple) Numpy Trac ticket, and fix it in  
the Numpy trunk. How can I assign the ticket to myself? After logging  
in, I don't see any obvious way of doing this. Secondly, committing a  
fix back to the SVN repository seems to require a specific login/pw,  
how to get one (assuming my fix is welcome)?

Cheers,
Joris


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm


From lists.steve at arachnedesign.net  Wed Aug 29 11:05:14 2007
From: lists.steve at arachnedesign.net (Steve Lianoglou)
Date: Wed, 29 Aug 2007 11:05:14 -0400
Subject: [Numpy-discussion] Trac ticket
In-Reply-To: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be>
References: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be>
Message-ID: <F3EABE04-98C8-48D4-9C3E-8D04A35D46AB@arachnedesign.net>

> Perhaps a stupid question, but I don't seem to find any info about it
> on the web.
> I would like to take up a (simple) Numpy Trac ticket, and fix it in
> the Numpy trunk. How can I assign the ticket to myself?

I'm not sure how the trac system is setup @ numpy, but you may not  
have the perms to do that yourself.

Perhaps you can add a comment to the ticket saying that you are  
working on it (an expected completion date may be helpful)

> After logging
> in, I don't see any obvious way of doing this. Secondly, committing a
> fix back to the SVN repository seems to require a specific login/pw,
> how to get one (assuming my fix is welcome)?

You should most likely just attach a patch against the latest trunk  
to the ticket itself for review.

-steve


From charlesr.harris at gmail.com  Wed Aug 29 11:42:50 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 09:42:50 -0600
Subject: [Numpy-discussion] Bug in resize method?
Message-ID: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>

Hi all,

This looks like a bug to me.

>>> a = arange(6).reshape(2,3)
>>> a.resize((3,3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot resize this array:  it does not own its data

Is there any reason resize should fail in this case? Resize should be
returning an new array, no? There are several other things that
look like bugs in this method, for instance:

>>> a = arange(6).resize((2,3))
>>> a

`a` has no value and no error is raised.

The resize function works as expected

>>> resize(a,(3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [0, 1, 2]])

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/f858ebec/attachment.html>

From stefan at sun.ac.za  Wed Aug 29 11:58:57 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed, 29 Aug 2007 17:58:57 +0200
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
Message-ID: <20070829155856.GV14395@mentat.za.net>

Hi Charles

On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote:
> Hi all,
> 
> This looks like a bug to me.
> 
> >>> a = arange(6).reshape(2,3)
> >>> a.resize((3,3))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: cannot resize this array:  it does not own its data

>From the docstring of a.resize:

    Change size and shape of self inplace.  Array must own its own memory and
    not be referenced by other arrays.    Returns None.

The reshaped array is a view on the original data, hence it doesn't
own it:

In [15]: a = N.arange(6).reshape(2,3)

In [16]: a.flags
Out[16]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

> >>> a = arange(6).resize((2,3))
> >>> a
> 
> `a` has no value and no error is raised.

It is because `a` is now None.

Cheers
St?fan


From charlesr.harris at gmail.com  Wed Aug 29 12:28:21 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 10:28:21 -0600
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <20070829155856.GV14395@mentat.za.net>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
Message-ID: <e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>

On 8/29/07, Stefan van der Walt <stefan at sun.ac.za> wrote:
>
> Hi Charles
>
> On Wed, Aug 29, 2007 at 09:42:50AM -0600, Charles R Harris wrote:
> > Hi all,
> >
> > This looks like a bug to me.
> >
> > >>> a = arange(6).reshape(2,3)
> > >>> a.resize((3,3))
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > ValueError: cannot resize this array:  it does not own its data
>
> >From the docstring of a.resize:
>
>     Change size and shape of self inplace.  Array must own its own memory
> and
>     not be referenced by other arrays.    Returns None.


The documentation is bogus:

>>> a = arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.resize((3,2))
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])


The reshaped array is a view on the original data, hence it doesn't
> own it:
>
> In [15]: a = N.arange(6).reshape(2,3)
>
> In [16]: a.flags
> Out[16]:
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : False
>   OWNDATA : False
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
>
> > >>> a = arange(6).resize((2,3))
> > >>> a
> >
> > `a` has no value and no error is raised.
>
> It is because `a` is now None.


This behaviour doesn't match documentation elsewhere, which is why I am
raising the question. What *should* the resize method do? It looks like it
is equivalent to assigning a shape tuple to a.shape, so why do we need it?
Apart from that, the reshape method looks like it would serve for most
cases.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/7a9adf1e/attachment.html>

From frankb.mail at gmail.com  Wed Aug 29 12:41:29 2007
From: frankb.mail at gmail.com (F Bitonti)
Date: Wed, 29 Aug 2007 12:41:29 -0400
Subject: [Numpy-discussion] linear algebra error?
Message-ID: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com>

I am trying to install the linear algebra package from the NumPy package and
i keep getting this error. I have the most recent version of numpy
1.0.3.1.<http://python-forum.org/py/viewtopic.php?t=5234>

this is the error

Traceback (most recent call last):
File "C:\Python25\lib\site-packages\numpy\linalg\setup.py", line 31, in
<module>
setup(configuration=configuration)
File "C:\Python25\Lib\site-packages\numpy\distutils\core.py", line 113, in
setup
return setup(**attr)
File "C:\Python25\Lib\site-packages\numpy\distutils\core.py", line 173, in
setup
return old_setup(**new_attr)
File "C:\Python25\lib\distutils\core.py", line 168, in setup
raise SystemExit, "error: " + str(msg)
SystemExit: error: Python was built with Visual Studio 2003;
extensions must be built with a compiler than can generate compatible
binaries.
Visual Studio 2003 was not found on this system. If you have Cygwin
installed,
you can try compiling with MingW32, by passing "-c mingw32" to setup.py.


However, I was told that the reason I am getting this error is because the
linear algebra module is already installed yet it dosn't seem to be becaues
when I exectue the following commands i get these error messages. Am I doing
someting wrong I have only been using python two days.


>>> from numpy import *
>>> from linalg import *

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    from linalg import *
ImportError: No module named linalg
>>> a = reshape(arange(25.0), (5,5)) + identity(5)
>>> print a
[[  1.   1.   2.   3.   4.]
 [  5.   7.   7.   8.   9.]
 [ 10.  11.  13.  13.  14.]
 [ 15.  16.  17.  19.  19.]
 [ 20.  21.  22.  23.  25.]]
>>> inv_a = inverse(a)

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    inv_a = inverse(a)
NameError: name 'inverse' is not defined
>>>


Thank you for any help you can provide.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/d32eb1ce/attachment.html>

From matthieu.brucher at gmail.com  Wed Aug 29 12:43:42 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 29 Aug 2007 18:43:42 +0200
Subject: [Numpy-discussion] linear algebra error?
In-Reply-To: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com>
References: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com>
Message-ID: <e76aa17f0708290943t1cde10c2v34a4ef6f93fba354@mail.gmail.com>

>
>
> >>> from numpy import *
> >>> from numpy.linalg import *
>

linalg is in the numpy module
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/c9d6098d/attachment.html>

From Chris.Barker at noaa.gov  Wed Aug 29 12:52:15 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 29 Aug 2007 09:52:15 -0700
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
Message-ID: <46D5A43F.5080702@noaa.gov>

Charles R Harris wrote:
> What *should* the resize method do? It looks like 
> it is equivalent to assigning a shape tuple to a.shape,

No, that's what reshape does.

 > so why do we need it?

resize() will change the SIZE of the array (number of elements), where 
reshape() will only change the shape, but not the number of elements. 
The fact that the size is changing is why it won't work if if doesn't 
own the data.

 >>> a = N.array((1,2,3))
 >>> a.reshape((6,))
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged

can't reshape to a shape that is a different size.

 >>> b = a.resize((6,))
 >>> repr(b)
'None'

resize changes the array in place, so it returns None, but a has been 
changed:
 >>> a
array([1, 2, 3, 0, 0, 0])

Perhaps you want the function, rather than the method:

 >>> b = N.resize(a, (12,))
 >>> b
array([1, 2, 3, 0, 0, 0, 1, 2, 3, 0, 0, 0])

 >>> a
array([1, 2, 3, 0, 0, 0])

a hasn't been changed, b is a brand new array.

-CHB


  Apart from that, the reshape method looks like it would serve
> for most cases.
> 
> Chuck
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From charlesr.harris at gmail.com  Wed Aug 29 12:59:22 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 10:59:22 -0600
Subject: [Numpy-discussion] Trace returns int32 type for int8 array.
Message-ID: <e06186140708290959i1e9dac64yb70ec717db664d18@mail.gmail.com>

Hi all,

The documentation of trace says it returns the same type as the array. Yet:

>>> trace(eye(2, dtype=int8)).dtype
dtype('int32')

For float types this promotion does not occur

>>> trace(eye(2, dtype=float32)).dtype
dtype('float32')


Trace operates the same way as sum. What should be the case here? And if
type promotion is the default, shouldn't float32 be promoted to double?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/1126a868/attachment.html>

From robert.kern at gmail.com  Wed Aug 29 13:03:51 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 12:03:51 -0500
Subject: [Numpy-discussion] linear algebra error?
In-Reply-To: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com>
References: <22887860708290941h1acf2085g85193fd09ed3d5e8@mail.gmail.com>
Message-ID: <46D5A6F7.3060507@gmail.com>

F Bitonti wrote:

> However, I was told that the reason I am getting this error is because
> the linear algebra module is already installed

 The more proximate reasonThat's not the reason why you are getting the error,
it's just that you don't need to and shouldn't try to execute that setup.py
since it's already installed. that you are getting that particular traceback is
because you don't have a compiler installed. However, that's not relevant here
since numpy.linalg is already installed.

> yet it dosn't seem to be
> becaues when I exectue the following commands i get these error
> messages. Am I doing someting wrong I have only been using python two days.
> 
>>>> from numpy import *
>>>> from linalg import *

  from numpy.linalg import *
  inv(...)

Or preferably:

  from numpy import linalg
  linalg.inv(...)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From charlesr.harris at gmail.com  Wed Aug 29 13:14:38 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 11:14:38 -0600
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <46D5A43F.5080702@noaa.gov>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
Message-ID: <e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>

On 8/29/07, Christopher Barker <Chris.Barker at noaa.gov> wrote:
>
> Charles R Harris wrote:
> > What *should* the resize method do? It looks like
> > it is equivalent to assigning a shape tuple to a.shape,
>
> No, that's what reshape does.


No, reshape returns a view and the view doesn't own its data. Totally
different behavior in this context.

> so why do we need it?
>
> resize() will change the SIZE of the array (number of elements), where
> reshape() will only change the shape, but not the number of elements.
> The fact that the size is changing is why it won't work if if doesn't
> own the data.


According to the documentation, the resize method changes the array inplace.
How can it be inplace if the number of elements changes? Admittedly, it
*will* change the size, but that is not consistent with the documentation. I
suspect it reallocates memory and (hopefully) frees the old, but then that
is what the documentation should say because it explains why the data must
be owned -- a condition violated in some cases as demonstrated above. I am
working on documentation and that is why I am raising these questions. There
seem to be some inconsistencies that need clarification and/or fixing.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/8468a0d9/attachment.html>

From charlesr.harris at gmail.com  Wed Aug 29 13:25:25 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 11:25:25 -0600
Subject: [Numpy-discussion] svn down
Message-ID: <e06186140708291025u3bdd3fe5offb8b16921cd7410@mail.gmail.com>

Hi all,

The svn server seems to be down, I am getting error messages from the
buildbots:

svn: PROPFIND request failed on '/svn/numpy/trunk'
svn: PROPFIND of '/svn/numpy/trunk': could not connect to server
(http://scipy.org)
program finished with exit code 1

It might be reasonable to check this case before sending posts.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/b22e0c17/attachment.html>

From tim.hochberg at ieee.org  Wed Aug 29 13:30:33 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Wed, 29 Aug 2007 10:30:33 -0700
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
Message-ID: <e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>

On 8/29/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 8/29/07, Christopher Barker <Chris.Barker at noaa.gov> wrote:
> >
> > Charles R Harris wrote:
> > > What *should* the resize method do? It looks like
> > > it is equivalent to assigning a shape tuple to a.shape,
> >
> > No, that's what reshape does.
>
>
> No, reshape returns a view and the view doesn't own its data. Totally
> different behavior in this context.
>
> > so why do we need it?
> >
> > resize() will change the SIZE of the array (number of elements), where
> > reshape() will only change the shape, but not the number of elements.
> > The fact that the size is changing is why it won't work if if doesn't
> > own the data.
>
>
> According to the documentation, the resize method changes the array
> inplace. How can it be inplace if the number of elements changes?
>

It sounds like you and Chris are talking past each other on a matter of
terminology. At a C-level, it's obviously not (necessarily) in place, since
the array may get realloced as you surmise below. However, at the Python
level, the change is in fact in place, in the same sense that appending to a
Python list operates in-place, even though under the covers memory may get
realloced there as well.


> Admittedly, it *will* change the size, but that is not consistent with the
> documentation. I suspect it reallocates memory and (hopefully) frees the
> old, but then that is what the documentation should say because it explains
> why the data must be owned -- a condition violated in some cases as
> demonstrated above. I am working on documentation and that is why I am
> raising these questions. There seem to be some inconsistencies that need
> clarification and/or fixing.
>

The main inconsistency I see above is that resize appears to only require
ownership of the data if in fact the number of items changes. I don't think
that's actually a bug, but I don't like it much; I would prefer that resize
be strict and always require ownership. However, I'm fairly certain that
there are people that prefer "friendliness" over consistency, so I wouldn't
be surprised to get some pushback on changing that.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/23ec00c5/attachment.html>

From chanley at stsci.edu  Wed Aug 29 13:37:10 2007
From: chanley at stsci.edu (Christopher Hanley)
Date: Wed, 29 Aug 2007 13:37:10 -0400
Subject: [Numpy-discussion] svn down
In-Reply-To: <e06186140708291025u3bdd3fe5offb8b16921cd7410@mail.gmail.com>
References: <e06186140708291025u3bdd3fe5offb8b16921cd7410@mail.gmail.com>
Message-ID: <46D5AEC6.5010505@stsci.edu>

This could be a problem with the buildbots.  I was just able to update 
from svn.

Chris


Charles R Harris wrote:
> Hi all,
> 
> The svn server seems to be down, I am getting error messages from the 
> buildbots:
> 
> svn: PROPFIND request failed on '/svn/numpy/trunk'
> svn: PROPFIND of '/svn/numpy/trunk': could not connect to server (
> http://scipy.org)
> program finished with exit code 1
> 
> It might be reasonable to check this case before sending posts.
> 
> Chuck
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

-- 
Christopher Hanley
Systems Software Engineer
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21218
(410) 338-4338


From mpmusu at cc.usu.edu  Wed Aug 29 14:05:09 2007
From: mpmusu at cc.usu.edu (Mark.Miller)
Date: Wed, 29 Aug 2007 12:05:09 -0600
Subject: [Numpy-discussion] Finding unique rows in an array [Was:
	Finding a	row match within a numpy array]
In-Reply-To: <200708221111.17141.faltet@carabos.com>
References: <46C2CD01.5030307@bristol.ac.uk>
	<1187190071.384881.240470@w3g2000hsg.googlegroups.com>
	<46CAF562.9060009@cc.usu.edu> <200708221111.17141.faltet@carabos.com>
Message-ID: <46D5B555.2070904@cc.usu.edu>

A belated thanks...but yes.  That does the trick.  I've not worked with 
views explicitly, so I appreciate the input.  I definitely foresee 
additional applications of these types of things in the future.

Thanks again,

-Mark

Francesc Altet wrote:
> 
> You can always do a view of the rows as strings and then use unique().
> Here is an example:
> 
> In [1]: import numpy
> In [2]: a=numpy.arange(12).reshape(4,3)
> In [3]: a[2]=(3,4,5)
> In [4]: a
> Out[4]: 
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> 
> now, create the view and select the unique rows:
> 
> In [5]: b=numpy.unique(a.view('S%d'%a.itemsize*a.shape[0])).view('i4')
> 
> and finally restore the shape:
> 
> In [6]: b.reshape((len(b)/a.shape[1], a.shape[1]))
> Out[6]: 
> array([[ 0,  1,  2],
>        [ 3,  4,  5],
>        [ 9, 10, 11]])
> 
> If you want to find unique columns instead of rows, do a tranpose first 
> on the initial array.
> 
> Cheers,
> 


From peridot.faceted at gmail.com  Wed Aug 29 14:19:22 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 29 Aug 2007 14:19:22 -0400
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
Message-ID: <ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>

On 29/08/2007, Timothy Hochberg <tim.hochberg at ieee.org> wrote:

> The main inconsistency I see above is that resize appears to only require
> ownership of the data if in fact the number of items changes. I don't think
> that's actually a bug, but I don't like it much; I would prefer that resize
> be strict and always require ownership. However, I'm fairly certain that
> there are people that prefer "friendliness" over consistency, so I wouldn't
> be surprised to get some pushback on changing that.

It seems to me like inplace resize is a problem, no matter how you
implement it --- is there any way to verify that no view exists of a
given array? (refcounts won't do it since there are other, non-view
ways to increase the refcount of an array.) If there's a view of an
array, you resize() it in place, and realloc() moves the data, the
views now point to bogus memory: you can cause the python interpreter
to segfault by addressing their contents. I really can't see any way
around this; why not remove inplace resize() (or make it raise
exceptions if the size has to change) and allow only the function
resize()?

Anne


From charlesr.harris at gmail.com  Wed Aug 29 14:29:22 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 12:29:22 -0600
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
Message-ID: <e06186140708291129n336ac67aoa7e640c6ac34d0f4@mail.gmail.com>

On 8/29/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/29/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> >
> > On 8/29/07, Christopher Barker < Chris.Barker at noaa.gov> wrote:
> > >
> > > Charles R Harris wrote:
> > > > What *should* the resize method do? It looks like
> > > > it is equivalent to assigning a shape tuple to a.shape,
> > >
> > > No, that's what reshape does.
> >
> >
> > No, reshape returns a view and the view doesn't own its data. Totally
> > different behavior in this context.
> >
> > > so why do we need it?
> > >
> > > resize() will change the SIZE of the array (number of elements), where
> > >
> > > reshape() will only change the shape, but not the number of elements.
> > > The fact that the size is changing is why it won't work if if doesn't
> > > own the data.
> >
> >
> > According to the documentation, the resize method changes the array
> > inplace. How can it be inplace if the number of elements changes?
> >
>
> It sounds like you and Chris are talking past each other on a matter of
> terminology. At a C-level, it's obviously not (necessarily) in place, since
> the array may get realloced as you surmise below. However, at the Python
> level, the change is in fact in place, in the same sense that appending to a
> Python list operates in-place, even though under the covers memory may get
> realloced there as well.
>
>
> > Admittedly, it *will* change the size, but that is not consistent with
> > the documentation. I suspect it reallocates memory and (hopefully) frees the
> > old, but then that is what the documentation should say because it explains
> > why the data must be owned -- a condition violated in some cases as
> > demonstrated above. I am working on documentation and that is why I am
> > raising these questions. There seem to be some inconsistencies that need
> > clarification and/or fixing.
> >
>
> The main inconsistency I see above is that resize appears to only require
> ownership of the data if in fact the number of items changes. I don't think
> that's actually a bug, but I don't like it much; I would prefer that resize
> be strict and always require ownership. However, I'm fairly certain that
> there are people that prefer "friendliness" over consistency, so I wouldn't
> be surprised to get some pushback on changing that.
>

I still don't see why the method is needed at all. Given the conditions on
the array, the only thing it buys you over the resize function or a reshape
is the automatic deletion of the old memory if new memory is allocated. And
the latter is easily done as a = reshape(a, new_shape). I know there was a
push to make most things methods, but it is possible to overdo it. Is this a
Numarray compatibility issue?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/68933649/attachment.html>

From tim.hochberg at ieee.org  Wed Aug 29 14:31:12 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Wed, 29 Aug 2007 11:31:12 -0700
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>
Message-ID: <e4412d6b0708291131m443dbe05ia888a1e533d41634@mail.gmail.com>

On 8/29/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 29/08/2007, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
> > The main inconsistency I see above is that resize appears to only
> require
> > ownership of the data if in fact the number of items changes. I don't
> think
> > that's actually a bug, but I don't like it much; I would prefer that
> resize
> > be strict and always require ownership. However, I'm fairly certain that
> > there are people that prefer "friendliness" over consistency, so I
> wouldn't
> > be surprised to get some pushback on changing that.
>
> It seems to me like inplace resize is a problem, no matter how you
> implement it --- is there any way to verify that no view exists of a
> given array? (refcounts won't do it since there are other, non-view
> ways to increase the refcount of an array.)


I think that may be overstating the problem a bit; refcounts should work in
the sense that they would prevent segfaults. They'll just be too
conservative in many cases, preventing resizes in cases where they would
otherwise work.


> If there's a view of an
> array, you resize() it in place, and realloc() moves the data, the
> views now point to bogus memory: you can cause the python interpreter
> to segfault by addressing their contents. I really can't see any way
> around this; why not remove inplace resize() (or make it raise
> exceptions if the size has to change) and allow only the function
> resize()?


Probably because in a few cases, it's vastly more efficient to realloc the
data than to copy it.

FWIW, I don't use either the resize function or the resize method, but if I
was going to get rid of one, personally I'd axe the function. Resizing is a
confusing operation and the function doesn't have the possibility of better
efficiency to justify it's existence.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/c8533be4/attachment.html>

From robert.kern at gmail.com  Wed Aug 29 14:34:28 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 13:34:28 -0500
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>	<20070829155856.GV14395@mentat.za.net>	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>	<46D5A43F.5080702@noaa.gov>	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>
Message-ID: <46D5BC34.7080400@gmail.com>

Anne Archibald wrote:
> On 29/08/2007, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
> 
>> The main inconsistency I see above is that resize appears to only require
>> ownership of the data if in fact the number of items changes. I don't think
>> that's actually a bug, but I don't like it much; I would prefer that resize
>> be strict and always require ownership. However, I'm fairly certain that
>> there are people that prefer "friendliness" over consistency, so I wouldn't
>> be surprised to get some pushback on changing that.
> 
> It seems to me like inplace resize is a problem, no matter how you
> implement it --- is there any way to verify that no view exists of a
> given array? (refcounts won't do it since there are other, non-view
> ways to increase the refcount of an array.)

Yes, as long as every view is created using the C API correctly. That's why
Chuck saw the exception he did, because he tried to resize() an array that had a
view stuck of it (or rather, he was trying to resize() the view, which didn't
have ownership of the data).


In [8]: from numpy import *

In [9]: a = zeros(10)

In [10]: a.resize(15)

In [11]: b = a[:]

In [12]: a.resize(20)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/Users/rkern/src/VTK-5.0.2/<ipython console> in <module>()

ValueError: cannot resize an array that has been referenced or is referencing
another array in this way.  Use the resize function


Of course, if you muck around with the raw data pointer using ctypes, you might
have problems, but that's ctypes for you.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From gael.varoquaux at normalesup.org  Wed Aug 29 14:37:57 2007
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 29 Aug 2007 20:37:57 +0200
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e4412d6b0708291131m443dbe05ia888a1e533d41634@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<ce557a360708291119u3fec2e06h1a0b1f0b3cfdb8a2@mail.gmail.com>
	<e4412d6b0708291131m443dbe05ia888a1e533d41634@mail.gmail.com>
Message-ID: <20070829183757.GB10641@clipper.ens.fr>

On Wed, Aug 29, 2007 at 11:31:12AM -0700, Timothy Hochberg wrote:
>    FWIW, I don't use either the resize function or the resize method, but if
>    I was going to get rid of one, personally I'd axe the function. Resizing
>    is a confusing operation and the function doesn't have the possibility of
>    better efficiency to justify it's existence.

My understand of OOP is that I expect a method to modify an object in
place, and a function to return a new object (or a view).

Now this is not true with Python, as some objects are imutable and this
is not possible, but at least there seems to be some logic that a method
returns a new object only if the object is imutable.

With numpy I often fail to see the logic, but I'd love to see one.

Ga?l


From tim.hochberg at ieee.org  Wed Aug 29 14:38:54 2007
From: tim.hochberg at ieee.org (Timothy Hochberg)
Date: Wed, 29 Aug 2007 11:38:54 -0700
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e06186140708291129n336ac67aoa7e640c6ac34d0f4@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<e06186140708291129n336ac67aoa7e640c6ac34d0f4@mail.gmail.com>
Message-ID: <e4412d6b0708291138v7608e219o9aa1e73e94db769b@mail.gmail.com>

On 8/29/07, Charles R Harris <charlesr.harris at gmail.com> wrote:

>
> I still don't see why the method is needed at all. Given the conditions on
> the array, the only thing it buys you over the resize function or a reshape
> is the automatic deletion of the old memory if new memory is allocated.
>

Can you explain this more? Both you and Anne seem to share the opinion that
the resize method is useless, while the resize function is useful. So, now
I'm worried I'm missing something since as far as I can tell the function is
useless and the method is only mostly useless.


> And the latter is easily done as a = reshape(a, new_shape). I know there
> was a push to make most things methods,
>

In general I think methods are easy to overdo, but I'm not on board for this
particular case.

but it is possible to overdo it. Is this a Numarray compatibility issue?
>

Dunno about that.


-- 
.  __
.   |-\
.
.  tim.hochberg at ieee.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/27b4202e/attachment.html>

From charlesr.harris at gmail.com  Wed Aug 29 15:14:54 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 13:14:54 -0600
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e4412d6b0708291138v7608e219o9aa1e73e94db769b@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<e06186140708291129n336ac67aoa7e640c6ac34d0f4@mail.gmail.com>
	<e4412d6b0708291138v7608e219o9aa1e73e94db769b@mail.gmail.com>
Message-ID: <e06186140708291214x45361bcbg6748b1265e6ee7e@mail.gmail.com>

On 8/29/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
>
>
>
> On 8/29/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
> >
> > I still don't see why the method is needed at all. Given the conditions
> > on the array, the only thing it buys you over the resize function or a
> > reshape is the automatic deletion of the old memory if new memory is
> > allocated.
> >
>
> Can you explain this more? Both you and Anne seem to share the opinion
> that the resize method is useless, while the resize function is useful. So,
> now I'm worried I'm missing something since as far as I can tell the
> function is useless and the method is only mostly useless.
>

Heh. I might dump both. The resize function is a concatenation followed by
reshape. It differs from the resize method in that it always returns a new
array and repeats the data instead of filling with zeros. The inconsistency
in the way the array is filled  bothers me a bit, I would have just named
the method realloc. I really don't see the need for either except for
backward compatibility. Maybe someone can make a case.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/c37a9a2c/attachment.html>

From efiring at hawaii.edu  Wed Aug 29 15:25:59 2007
From: efiring at hawaii.edu (Eric Firing)
Date: Wed, 29 Aug 2007 09:25:59 -1000
Subject: [Numpy-discussion] Bug in resize method?
In-Reply-To: <e4412d6b0708291138v7608e219o9aa1e73e94db769b@mail.gmail.com>
References: <e06186140708290842t7f25f0dbo8b53216d2b056686@mail.gmail.com>
	<20070829155856.GV14395@mentat.za.net>
	<e06186140708290928l318e9e2ew26beddd6cb541f7b@mail.gmail.com>
	<46D5A43F.5080702@noaa.gov>
	<e06186140708291014l2df350cbva84352d4c409625d@mail.gmail.com>
	<e4412d6b0708291030s16d7c537tad7346ccd63ca22@mail.gmail.com>
	<e06186140708291129n336ac67aoa7e640c6ac34d0f4@mail.gmail.com>
	<e4412d6b0708291138v7608e219o9aa1e73e94db769b@mail.gmail.com>
Message-ID: <46D5C847.8080208@hawaii.edu>

Timothy Hochberg wrote:
> 
> 
> On 8/29/07, *Charles R Harris* <charlesr.harris at gmail.com 
> <mailto:charlesr.harris at gmail.com>> wrote:
> 
> 
>     I still don't see why the method is needed at all. Given the
>     conditions on the array, the only thing it buys you over the resize
>     function or a reshape is the automatic deletion of the old memory if
>     new memory is allocated. 
> 
> 
> Can you explain this more? Both you and Anne seem to share the opinion 
> that the resize method is useless, while the resize function is useful. 
> So, now I'm worried I'm missing something since as far as I can tell the 
> function is useless and the method is only mostly useless.

The resize function docstring makes the following distinction:

Definition:     numpy.resize(a, new_shape)
Docstring:
     Return a new array with the specified shape.

     The original array's total size can be any size.  The new array is
     filled with repeated copies of a.

     Note that a.resize(new_shape) will fill the array with 0's beyond
     current definition of a.

So the method and the function are subtly different.  As far as I can 
see, the method is causing more trouble than it is worth.  Under what 
circumstances, in real code, can it provide enough benefit to override 
the penalty it is now exacting in confusion?

Eric


From ipan at freeshell.org  Wed Aug 29 15:34:24 2007
From: ipan at freeshell.org (Ivan Pan)
Date: Wed, 29 Aug 2007 14:34:24 -0500
Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x
In-Reply-To: <85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com>
References: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>
	<db6b5ecc0708191751n65bd47fat74c244bb7dd222eb@mail.gmail.com>
	<85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com>
Message-ID: <b2e047420708291234o7bb77c63qb4424aecc0a98e3b@mail.gmail.com>

On 8/19/07, Ondrej Certik <ondrej at certik.cz> wrote:
>  I don't know what the native way of installing packages on Mac OS
> X is, but I know of the fink project, that basically allows to use
> debian packages:
>
> http://finkproject.org/

Besides fink, there is also MacPort <http://www.macports.org/>. It is
similar to BSD Portage. They have fairly recent SciPy (0.5.2), NumPy
(1.0.3), IPython (0.8.1) and many more ...

Chris Fonnesbeck provides a Mac OS X installer for SciPy (0.5.3),
NumPy (1.0.4), Matplotlib (0.90.1), IPython (0.8.2) with readline, and
PyMC (1.3). He provdies binaries for both Intel and PPC version. It is
fairly up-to-date. He releases weekly or bi-monthly.

<http://trichech.us/?page_id=5>

ip


From myeates at jpl.nasa.gov  Wed Aug 29 15:59:18 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 12:59:18 -0700
Subject: [Numpy-discussion] help! not using lapack
Message-ID: <46D5D016.2070000@jpl.nasa.gov>

Hi
When I try
import numpy
id(numpy.dot) == id(numpy.core.multiarray.dot)

I get True. But I have liblapck.a installed in ~/lib and I put the lines
[DEFAULT]
library_dirs = /home/myeates/lib
include_dirs = /home/myeates/include

in site.cfg
In fact, when I build and run a sytem trace I see that liblapack.a is 
being accessed.

Any ideas?
Mathew


From robert.kern at gmail.com  Wed Aug 29 16:12:30 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:12:30 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D016.2070000@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>
Message-ID: <46D5D32E.4060507@gmail.com>

Mathew Yeates wrote:
> Hi
> When I try
> import numpy
> id(numpy.dot) == id(numpy.core.multiarray.dot)
> 
> I get True. But I have liblapck.a installed in ~/lib and I put the lines
> [DEFAULT]
> library_dirs = /home/myeates/lib
> include_dirs = /home/myeates/include
> 
> in site.cfg
> In fact, when I build and run a sytem trace I see that liblapack.a is 
> being accessed.
> 
> Any ideas?

It is possible that you have a linking problem with _dotblas.so. On some
systems, such a problem will only manifest itself at run-time, not build-time.
At runtime, you will get an ImportError, which we catch because that's also the
error one gets if the _dotblas is legitimately absent.

Try importing _dotblas by itself to see the error message.


In [8]: from numpy.core import _dotblas


Most likely you are missing the appropriate libblas, too, since you don't
mention it.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From myeates at jpl.nasa.gov  Wed Aug 29 16:15:39 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:15:39 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D32E.4060507@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com>
Message-ID: <46D5D3EB.4070608@jpl.nasa.gov>

yes, I get
from numpy.core import _dotblas
ImportError: No module named multiarray

Now what?
uname -a
Linux 2.6.9-55.0.2.EL #1 Tue Jun 12 17:47:10 EDT 2007 i686 athlon i386 
GNU/Linux


Robert Kern wrote:
> Mathew Yeates wrote:
>   
>> Hi
>> When I try
>> import numpy
>> id(numpy.dot) == id(numpy.core.multiarray.dot)
>>
>> I get True. But I have liblapck.a installed in ~/lib and I put the lines
>> [DEFAULT]
>> library_dirs = /home/myeates/lib
>> include_dirs = /home/myeates/include
>>
>> in site.cfg
>> In fact, when I build and run a sytem trace I see that liblapack.a is 
>> being accessed.
>>
>> Any ideas?
>>     
>
> It is possible that you have a linking problem with _dotblas.so. On some
> systems, such a problem will only manifest itself at run-time, not build-time.
> At runtime, you will get an ImportError, which we catch because that's also the
> error one gets if the _dotblas is legitimately absent.
>
> Try importing _dotblas by itself to see the error message.
>
>
> In [8]: from numpy.core import _dotblas
>
>
> Most likely you are missing the appropriate libblas, too, since you don't
> mention it.
>
>   


From robert.kern at gmail.com  Wed Aug 29 16:18:56 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:18:56 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D3EB.4070608@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov> <46D5D32E.4060507@gmail.com>
	<46D5D3EB.4070608@jpl.nasa.gov>
Message-ID: <46D5D4B0.6040705@gmail.com>

Mathew Yeates wrote:
> yes, I get
> from numpy.core import _dotblas
> ImportError: No module named multiarray

That's just weird. Can you import numpy.core.multiarray by itself?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From myeates at jpl.nasa.gov  Wed Aug 29 16:20:13 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:20:13 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D4B0.6040705@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov>
	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>
	<46D5D4B0.6040705@gmail.com>
Message-ID: <46D5D4FD.3060706@jpl.nasa.gov>

yes

Robert Kern wrote:
> Mathew Yeates wrote:
>   
>> yes, I get
>> from numpy.core import _dotblas
>> ImportError: No module named multiarray
>>     
>
> That's just weird. Can you import numpy.core.multiarray by itself?
>
>   


From myeates at jpl.nasa.gov  Wed Aug 29 16:22:23 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:22:23 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D4B0.6040705@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov>
	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>
	<46D5D4B0.6040705@gmail.com>
Message-ID: <46D5D57F.5080701@jpl.nasa.gov>

oops. sorry
from numpy.core import _dotblas
ImportError: 
/home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: 
undefined symbol: cblas_zaxpy


Robert Kern wrote:
> Mathew Yeates wrote:
>   
>> yes, I get
>> from numpy.core import _dotblas
>> ImportError: No module named multiarray
>>     
>
> That's just weird. Can you import numpy.core.multiarray by itself?
>
>   


From robert.kern at gmail.com  Wed Aug 29 16:26:31 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:26:31 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D57F.5080701@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>
	<46D5D57F.5080701@jpl.nasa.gov>
Message-ID: <46D5D677.1070408@gmail.com>

Mathew Yeates wrote:
> oops. sorry
> from numpy.core import _dotblas
> ImportError: 
> /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: 
> undefined symbol: cblas_zaxpy

Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you
specify one to use. Follow the directions in site.cfg.example. If you need more
help, please tell us what libraries you are using, your full site.cfg and the
output of

  $ python setup.py config

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From myeates at jpl.nasa.gov  Wed Aug 29 16:29:46 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:29:46 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D677.1070408@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>
	<46D5D677.1070408@gmail.com>
Message-ID: <46D5D73A.4080500@jpl.nasa.gov>

my site,cfg just is
[DEFAULT]
library_dirs = /home/myeates/lib
include_dirs = /home/myeates/include

python setup.py config gives
F2PY Version 2_3979
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in /home/myeates/lib
  NOT AVAILABLE

atlas_blas_threads_info:
Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib
  NOT AVAILABLE

atlas_blas_info:
  libraries f77blas,cblas,atlas not found in /home/myeates/lib
  NOT AVAILABLE

blas_info:
  FOUND:
    libraries = ['blas']
    library_dirs = ['/home/myeates/lib']
    language = f77

  FOUND:
    libraries = ['blas']
    library_dirs = ['/home/myeates/lib']
    define_macros = [('NO_ATLAS_INFO', 1)]
    language = f77

lapack_opt_info:
lapack_mkl_info:
mkl_info:
  libraries mkl,vml,guide not found in /home/myeates/lib
  NOT AVAILABLE

  NOT AVAILABLE

atlas_threads_info:
Setting PTATLAS=ATLAS
  libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib
  libraries lapack_atlas not found in /home/myeates/lib
numpy.distutils.system_info.atlas_threads_info
  NOT AVAILABLE

atlas_info:
  libraries f77blas,cblas,atlas not found in /home/myeates/lib
  libraries lapack_atlas not found in /home/myeates/lib
numpy.distutils.system_info.atlas_info
  NOT AVAILABLE

lapack_info:
  FOUND:
    libraries = ['lapack']
    library_dirs = ['/home/myeates/lib']
    language = f77

  FOUND:
    libraries = ['lapack', 'blas']
    library_dirs = ['/home/myeates/lib']
    define_macros = [('NO_ATLAS_INFO', 1)]
    language = f77

running config


Robert Kern wrote:
> Mathew Yeates wrote:
>   
>> oops. sorry
>> from numpy.core import _dotblas
>> ImportError: 
>> /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: 
>> undefined symbol: cblas_zaxpy
>>     
>
> Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you
> specify one to use. Follow the directions in site.cfg.example. If you need more
> help, please tell us what libraries you are using, your full site.cfg and the
> output of
>
>   $ python setup.py config
>
>   


From myeates at jpl.nasa.gov  Wed Aug 29 16:35:36 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:35:36 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D73A.4080500@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>
	<46D5D73A.4080500@jpl.nasa.gov>
Message-ID: <46D5D898.3060809@jpl.nasa.gov>

more info. My blas library has zaxpy defined but not  cblas_zaxpy

Mathew Yeates wrote:
> my site,cfg just is
> [DEFAULT]
> library_dirs = /home/myeates/lib
> include_dirs = /home/myeates/include
>
> python setup.py config gives
> F2PY Version 2_3979
> blas_opt_info:
> blas_mkl_info:
>   libraries mkl,vml,guide not found in /home/myeates/lib
>   NOT AVAILABLE
>
> atlas_blas_threads_info:
> Setting PTATLAS=ATLAS
>   libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib
>   NOT AVAILABLE
>
> atlas_blas_info:
>   libraries f77blas,cblas,atlas not found in /home/myeates/lib
>   NOT AVAILABLE
>
> blas_info:
>   FOUND:
>     libraries = ['blas']
>     library_dirs = ['/home/myeates/lib']
>     language = f77
>
>   FOUND:
>     libraries = ['blas']
>     library_dirs = ['/home/myeates/lib']
>     define_macros = [('NO_ATLAS_INFO', 1)]
>     language = f77
>
> lapack_opt_info:
> lapack_mkl_info:
> mkl_info:
>   libraries mkl,vml,guide not found in /home/myeates/lib
>   NOT AVAILABLE
>
>   NOT AVAILABLE
>
> atlas_threads_info:
> Setting PTATLAS=ATLAS
>   libraries ptf77blas,ptcblas,atlas not found in /home/myeates/lib
>   libraries lapack_atlas not found in /home/myeates/lib
> numpy.distutils.system_info.atlas_threads_info
>   NOT AVAILABLE
>
> atlas_info:
>   libraries f77blas,cblas,atlas not found in /home/myeates/lib
>   libraries lapack_atlas not found in /home/myeates/lib
> numpy.distutils.system_info.atlas_info
>   NOT AVAILABLE
>
> lapack_info:
>   FOUND:
>     libraries = ['lapack']
>     library_dirs = ['/home/myeates/lib']
>     language = f77
>
>   FOUND:
>     libraries = ['lapack', 'blas']
>     library_dirs = ['/home/myeates/lib']
>     define_macros = [('NO_ATLAS_INFO', 1)]
>     language = f77
>
> running config
>
>
> Robert Kern wrote:
>   
>> Mathew Yeates wrote:
>>   
>>     
>>> oops. sorry
>>> from numpy.core import _dotblas
>>> ImportError: 
>>> /home/myeates/lib/python2.5/site-packages/numpy/core/_dotblas.so: 
>>> undefined symbol: cblas_zaxpy
>>>     
>>>       
>> Okay, yes, that's the problem. liblapack depends on libblas. Make sure that you
>> specify one to use. Follow the directions in site.cfg.example. If you need more
>> help, please tell us what libraries you are using, your full site.cfg and the
>> output of
>>
>>   $ python setup.py config
>>
>>   
>>     
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>
>   


From robert.kern at gmail.com  Wed Aug 29 16:35:50 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:35:50 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D73A.4080500@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>
	<46D5D73A.4080500@jpl.nasa.gov>
Message-ID: <46D5D8A6.7010502@gmail.com>

If your BLAS just the reference BLAS, don't bother with _dotblas. It won't be
any faster than the default implementation in numpy. You only get a win if you
are using an accelerated BLAS with the CBLAS interface for C-style row-major
matrices. Your libblas does not seem to be such an accelerated BLAS.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From myeates at jpl.nasa.gov  Wed Aug 29 16:39:26 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:39:26 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D8A6.7010502@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>	<46D5D73A.4080500@jpl.nasa.gov>
	<46D5D8A6.7010502@gmail.com>
Message-ID: <46D5D97E.9040401@jpl.nasa.gov>

I'm the one who created libblas.a so I must have done something wrong. 
This is lapack-3.1.1.


Robert Kern wrote:
> If your BLAS just the reference BLAS, don't bother with _dotblas. It won't be
> any faster than the default implementation in numpy. You only get a win if you
> are using an accelerated BLAS with the CBLAS interface for C-style row-major
> matrices. Your libblas does not seem to be such an accelerated BLAS.
>
>   


From robert.kern at gmail.com  Wed Aug 29 16:46:17 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:46:17 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5D97E.9040401@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>	<46D5D73A.4080500@jpl.nasa.gov>	<46D5D8A6.7010502@gmail.com>
	<46D5D97E.9040401@jpl.nasa.gov>
Message-ID: <46D5DB19.6090808@gmail.com>

Mathew Yeates wrote:
> I'm the one who created libblas.a so I must have done something wrong. 
> This is lapack-3.1.1.

No, you didn't do anything wrong, per se, you just built the reference F77 BLAS.
It's not an accelerated BLAS, so there's no point in using it with numpy.
There's not way you *can* build it to be an accelerated BLAS.

If you want an accelerated BLAS, try to use ATLAS:

  http://math-atlas.sourceforge.net/

It is possible that your Linux distribution, whatever it is, already has a build
of it for you.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From myeates at jpl.nasa.gov  Wed Aug 29 16:52:59 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 13:52:59 -0700
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5DB19.6090808@gmail.com>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>	<46D5D73A.4080500@jpl.nasa.gov>	<46D5D8A6.7010502@gmail.com>	<46D5D97E.9040401@jpl.nasa.gov>
	<46D5DB19.6090808@gmail.com>
Message-ID: <46D5DCAB.6000304@jpl.nasa.gov>

Thanks Robert
I have a deadline and don't have time to install ATLAS. Instead I'm 
installing clapack. Is this the corrrect thing to do?

Mathew

Robert Kern wrote:
> Mathew Yeates wrote:
>   
>> I'm the one who created libblas.a so I must have done something wrong. 
>> This is lapack-3.1.1.
>>     
>
> No, you didn't do anything wrong, per se, you just built the reference F77 BLAS.
> It's not an accelerated BLAS, so there's no point in using it with numpy.
> There's not way you *can* build it to be an accelerated BLAS.
>
> If you want an accelerated BLAS, try to use ATLAS:
>
>   http://math-atlas.sourceforge.net/
>
> It is possible that your Linux distribution, whatever it is, already has a build
> of it for you.
>
>   


From Joris.DeRidder at ster.kuleuven.be  Wed Aug 29 16:53:25 2007
From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder)
Date: Wed, 29 Aug 2007 22:53:25 +0200
Subject: [Numpy-discussion] Trac ticket
In-Reply-To: <F3EABE04-98C8-48D4-9C3E-8D04A35D46AB@arachnedesign.net>
References: <1F561FA8-53D7-483D-9772-4F1E60E37510@ster.kuleuven.be>
	<F3EABE04-98C8-48D4-9C3E-8D04A35D46AB@arachnedesign.net>
Message-ID: <42F6611A-C621-4D31-AA66-784E1F597639@ster.kuleuven.be>


> You should most likely just attach a patch against the latest trunk
> to the ticket itself for review.

Done. The patch adds an 'axis' keyword to median().

J.


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm


From robert.kern at gmail.com  Wed Aug 29 16:55:05 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 15:55:05 -0500
Subject: [Numpy-discussion] help! not using lapack
In-Reply-To: <46D5DCAB.6000304@jpl.nasa.gov>
References: <46D5D016.2070000@jpl.nasa.gov>	<46D5D32E.4060507@gmail.com>	<46D5D3EB.4070608@jpl.nasa.gov>	<46D5D4B0.6040705@gmail.com>	<46D5D57F.5080701@jpl.nasa.gov>	<46D5D677.1070408@gmail.com>	<46D5D73A.4080500@jpl.nasa.gov>	<46D5D8A6.7010502@gmail.com>	<46D5D97E.9040401@jpl.nasa.gov>	<46D5DB19.6090808@gmail.com>
	<46D5DCAB.6000304@jpl.nasa.gov>
Message-ID: <46D5DD29.2040405@gmail.com>

Mathew Yeates wrote:
> Thanks Robert
> I have a deadline and don't have time to install ATLAS. Instead I'm 
> installing clapack. Is this the corrrect thing to do?

No. Just leave things alone if you don't have an accelerated BLAS at hand.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From Chris.Barker at noaa.gov  Wed Aug 29 17:53:12 2007
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Wed, 29 Aug 2007 14:53:12 -0700
Subject: [Numpy-discussion] [SciPy-dev] NumPy 1.0.3.x and SciPy 0.5.2.x
In-Reply-To: <b2e047420708291234o7bb77c63qb4424aecc0a98e3b@mail.gmail.com>
References: <c7009a550708150922t5d855bbbx6e4fe6944f4618cd@mail.gmail.com>
	<db6b5ecc0708191751n65bd47fat74c244bb7dd222eb@mail.gmail.com>
	<85b5c3130708192030t6947d623oa00a710a229cbd5d@mail.gmail.com>
	<b2e047420708291234o7bb77c63qb4424aecc0a98e3b@mail.gmail.com>
Message-ID: <46D5EAC8.8000109@noaa.gov>

Ivan Pan wrote:
> On 8/19/07, Ondrej Certik <ondrej at certik.cz> wrote:
>>  I don't know what the native way of installing packages on Mac OS
>> X is

Boy, I wish this weren't such a mess. Quite some time ago, a bunch of us 
on the pythonmac list tried to establish the idea of a "one 'standard' 
python for OS-X", and a set of pre-built packages for it. It is the one 
you find here:

http://www.pythonmac.org/packages/py25-fat/

That Python is the same as the one you find at python.org too. It is the 
closest one comes to a "native" set of packages for OS-X.

It would be really nice if the scipy/numpy projects would provide 
binaries (or at least have setup.py ready to go) for that repository. I 
really like being able to tell folks ONE place to go to get python 
packages. There are a number of Mac folks that help build the packages 
there.

For a while, SciPy had a key problem -- no one knew how to build 
Universal(Intel and PPC) packages from Fortran code, and that repository 
really should have Universal binaries, so that folks don't have to think 
about what hardware they are running, and can bundle up apps with Py2App 
that will work on any Mac (with a new enough OS).

I understand that the Universal problem has been solved now. I hope that 
if the SciPy project "officially" releases binaries for OS-X, they will 
be Universal binaries compatible with that Python.

About fink/macport. They are fine systems that have some real use. 
However, they really should be thought of as different platforms (or at 
least different distributions), much like CygWin, or Ubuntu vs. Fedora. 
If you make a fink package, you are making a fink package, NOT an OS-X one.

Anyway, sorry to be so pedantic when I don't think I will get the time 
to do the building myself, but I wanted to lay out a goal anyway. I do 
offer to do some testing, etc if needed.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


From myeates at jpl.nasa.gov  Wed Aug 29 17:53:42 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 14:53:42 -0700
Subject: [Numpy-discussion] gesdd hangs
Message-ID: <46D5EAE6.6070809@jpl.nasa.gov>

I guess I can't blame lapack. My system has atlas so I recompiled numpy 
pointing to atlas. Now

id(numpy.dot) == id(numpy.core.multiarray.dot) is False

However when I run decomp.svd on a 25 by 25 identity matrix, it hangs when gesdd is called (line 501 of linalag/decomp.py)

Anybody else seeing this?

Mathew


From peridot.faceted at gmail.com  Wed Aug 29 18:31:55 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 29 Aug 2007 18:31:55 -0400
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in irfft
Message-ID: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>

Hi,

numpy's Fourier transforms have the handy feature of being able to
upsample and downsample signals; for example the documentation cites
irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A.
However, there is a peculiarity with the way numpy handles the
highest-frequency coefficient.

First of all, the normalization:

In [65]: rfft(cos(2*pi*arange(8)/8.))
Out[65]:
array([ -3.44505240e-16 +0.00000000e+00j,
         4.00000000e+00 -1.34392280e-15j,
         1.22460635e-16 -0.00000000e+00j,
        -1.16443313e-16 -8.54080261e-16j,   9.95839695e-17 +0.00000000e+00j])

In [66]: rfft(cos(2*4*pi*arange(8)/8.))
Out[66]: array([ 0.+0.j,  0.+0.j,  0.-0.j,  0.+0.j,  8.+0.j])

So a cosine signal gives 0.5*N if its frequency F is 0<F<=N/2, but N
if its frequency is N/2+1 (or zero). This is fine; it's the way
Fourier transforms work.

Now, suppose we take a signal whose Fourier transform is [0,0,0,0,1]:

In [67]: n=8; irfft([0,0,0,0,1],n)[0]*n
Out[67]: 1.0

In [68]: n=16; irfft([0,0,0,0,1],n)[0]*n
Out[68]: 2.0

In [69]: n=32; irfft([0,0,0,0,1],n)[0]*n
Out[69]: 2.0

In [70]: n=64; irfft([0,0,0,0,1],n)[0]*n
Out[70]: 2.0

The value at zero - a non-interpolated point - changes when you interpolate!

Similarly, if we are reducing the number of harmonics:

In [71]: n=8; irfft([0,0,1,0,0],n)[0]*n
Out[71]: 2.0

In [72]: n=4; irfft([0,0,1,0,0],n)[0]*n
Out[72]: 1.0


The upshot is, if I correctly understand what is going on, that the
last coefficient needs to be treated somewhat differently from the
others; when one pads with zeros in order to upsample the signal, one
should multiply the last coefficient by  0.5. Should this be done in
numpy's upsampling code? Should it at least be documented?

Thanks,
Anne M. Archibald


From charlesr.harris at gmail.com  Wed Aug 29 18:34:30 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 16:34:30 -0600
Subject: [Numpy-discussion] gesdd hangs
In-Reply-To: <46D5EAE6.6070809@jpl.nasa.gov>
References: <46D5EAE6.6070809@jpl.nasa.gov>
Message-ID: <e06186140708291534l7a7d6f19i2034d3b3663e886@mail.gmail.com>

On 8/29/07, Mathew Yeates <myeates at jpl.nasa.gov> wrote:
>
> I guess I can't blame lapack. My system has atlas so I recompiled numpy
> pointing to atlas. Now
>
> id(numpy.dot) == id(numpy.core.multiarray.dot) is False
>
> However when I run decomp.svd on a 25 by 25 identity matrix, it hangs when
> gesdd is called (line 501 of linalag/decomp.py)
>
> Anybody else seeing this?


What do you mean by hang?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/894bbc85/attachment.html>

From myeates at jpl.nasa.gov  Wed Aug 29 18:35:55 2007
From: myeates at jpl.nasa.gov (Mathew Yeates)
Date: Wed, 29 Aug 2007 15:35:55 -0700
Subject: [Numpy-discussion] gesdd hangs
In-Reply-To: <e06186140708291534l7a7d6f19i2034d3b3663e886@mail.gmail.com>
References: <46D5EAE6.6070809@jpl.nasa.gov>
	<e06186140708291534l7a7d6f19i2034d3b3663e886@mail.gmail.com>
Message-ID: <46D5F4CB.4020702@jpl.nasa.gov>

never returns


Charles R Harris wrote:
>
>
> On 8/29/07, *Mathew Yeates* <myeates at jpl.nasa.gov 
> <mailto:myeates at jpl.nasa.gov>> wrote:
>
>     I guess I can't blame lapack. My system has atlas so I recompiled
>     numpy
>     pointing to atlas. Now
>
>     id(numpy.dot) == id(numpy.core.multiarray.dot) is False
>
>     However when I run decomp.svd on a 25 by 25 identity matrix, it
>     hangs when gesdd is called (line 501 of linalag/decomp.py)
>
>     Anybody else seeing this?
>
>
> What do you mean by hang?
>
> Chuck
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>   


From charlesr.harris at gmail.com  Wed Aug 29 19:08:08 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 17:08:08 -0600
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
Message-ID: <e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>

Anne,

On 8/29/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> Hi,
>
> numpy's Fourier transforms have the handy feature of being able to
> upsample and downsample signals; for example the documentation cites
> irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A.
> However, there is a peculiarity with the way numpy handles the
> highest-frequency coefficient.


<snip>

The upshot is, if I correctly understand what is going on, that the
> last coefficient needs to be treated somewhat differently from the
> others; when one pads with zeros in order to upsample the signal, one
> should multiply the last coefficient by  0.5. Should this be done in
> numpy's upsampling code? Should it at least be documented?


What is going on is that the coefficient at the Nyquist  frequency appears
once in the unextended array, but twice when the array is extended with
zeros because of the Hermitean symmetry. That should probably be fixed in
the upsampling code.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/c54c4318/attachment.html>

From charlesr.harris at gmail.com  Wed Aug 29 19:44:09 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 17:44:09 -0600
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
Message-ID: <e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>

On 8/29/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
> Anne,
>
> On 8/29/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> >
> > Hi,
> >
> > numpy's Fourier transforms have the handy feature of being able to
> > upsample and downsample signals; for example the documentation cites
> > irfft(rfft(A),16*len(A)) as a way to get a Fourier interpolation of A.
> > However, there is a peculiarity with the way numpy handles the
> > highest-frequency coefficient.
>
>
> <snip>
>
> The upshot is, if I correctly understand what is going on, that the
> > last coefficient needs to be treated somewhat differently from the
> > others; when one pads with zeros in order to upsample the signal, one
> > should multiply the last coefficient by   0.5. Should this be done in
> > numpy's upsampling code? Should it at least be documented?
>
>
> What is going on is that the coefficient at the Nyquist  frequency appears
> once in the unextended array, but twice when the array is extended with
> zeros because of the Hermitean symmetry. That should probably be fixed in
> the upsampling code.
>

The inverse irfft also scales by dividing by the new transform size instead
of the original size, so the result needs to be scaled up for the
interpolation to work. It is easy to go wrong with fft's when the correct
sampling/frequency scales aren't carried with the data. I always do that
myself so that the results are independent of transform size/interpolation
and expressed in some standard units.


In [9]: a = array([1, 0, 0, 0], dtype=double)

In [10]: b = rfft(a)

In [11]: b[2] *= .5

In [12]: irfft(b,8)
Out[12]:
array([ 0.5      ,  0.3017767,  0.       , -0.0517767,  0.       ,
       -0.0517767,  0.       ,  0.3017767])

In [13]: 2*irfft(b,8)
Out[13]:
array([ 1.        ,  0.60355339,  0.        , -0.10355339,  0.        ,
       -0.10355339,  0.        ,  0.60355339])

I don't know where that should be fixed.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/bd90ea9f/attachment.html>

From charlesr.harris at gmail.com  Wed Aug 29 20:14:30 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 18:14:30 -0600
Subject: [Numpy-discussion] gesdd hangs
In-Reply-To: <46D5F4CB.4020702@jpl.nasa.gov>
References: <46D5EAE6.6070809@jpl.nasa.gov>
	<e06186140708291534l7a7d6f19i2034d3b3663e886@mail.gmail.com>
	<46D5F4CB.4020702@jpl.nasa.gov>
Message-ID: <e06186140708291714u29121f10g993e8fce23b6cc11@mail.gmail.com>

On 8/29/07, Mathew Yeates <myeates at jpl.nasa.gov> wrote:
>
> never returns


Where is decomp coming from? linalg.svd(eye(25)) works fine here.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/246cce9b/attachment.html>

From robert.kern at gmail.com  Wed Aug 29 20:18:14 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 29 Aug 2007 19:18:14 -0500
Subject: [Numpy-discussion] gesdd hangs
In-Reply-To: <e06186140708291714u29121f10g993e8fce23b6cc11@mail.gmail.com>
References: <46D5EAE6.6070809@jpl.nasa.gov>	<e06186140708291534l7a7d6f19i2034d3b3663e886@mail.gmail.com>	<46D5F4CB.4020702@jpl.nasa.gov>
	<e06186140708291714u29121f10g993e8fce23b6cc11@mail.gmail.com>
Message-ID: <46D60CC6.4040007@gmail.com>

Charles R Harris wrote:
> 
> On 8/29/07, *Mathew Yeates* <myeates at jpl.nasa.gov
> <mailto:myeates at jpl.nasa.gov>> wrote:
> 
>     never returns
> 
> Where is decomp coming from? linalg.svd(eye(25)) works fine here.

scipy, most likely.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From peridot.faceted at gmail.com  Wed Aug 29 20:49:08 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 29 Aug 2007 20:49:08 -0400
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
	<e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>
Message-ID: <ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>

On 29/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
> > What is going on is that the coefficient at the Nyquist  frequency appears
> once in the unextended array, but twice when the array is extended with
> zeros because of the Hermitean symmetry. That should probably be fixed in
> the upsampling code.

Is this also appropriate for the other FFTs? (inverse real, complex,
hermitian, what have you) I have written a quick hack (attached) that
should do just that rescaling, but I don't know that it's a good idea,
as implemented. Really, for a complex IFFT it's extremely peculiar to
add the padding where we do (between frequency -1 and frequency zero);
it would make more sense to pad at the high frequencies (which are in
the middle of the array). Forward FFTs, though, can reasonably be
padded at the end, and it doesn't make much sense to rescale the last
data point.

> The inverse irfft also scales by dividing by the new transform size instead
> of the original size, so the result needs to be scaled up for the
> interpolation to work. It is easy to go wrong with fft's when the correct
> sampling/frequency scales aren't carried with the data. I always do that
> myself so that the results are independent of transform size/interpolation
> and expressed in some standard units.

The scaling of the FFT is a pain everywhere. I always just try it a
few times until I get the coefficients right. I sort of like FFTW's
convention of never normalizing anything - it means the transforms
have nice simple formulas, though unfortunately it also means that
ifft(fft(A))!=A. In any case the normalization of numpy's FFTs is not
something that can reasonably be changed, even in the special case of
the zero-padding inverse (and forward) FFTs.

Anne
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fftfix
Type: application/octet-stream
Size: 1242 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/dafce1e6/attachment.obj>

From charlesr.harris at gmail.com  Wed Aug 29 21:46:32 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 19:46:32 -0600
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
	<e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>
	<ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>
Message-ID: <e06186140708291846j4ac60ff3h582c7ef46496cc11@mail.gmail.com>

Hi Anne,

On 8/29/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 29/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> > > What is going on is that the coefficient at the Nyquist  frequency
> appears
> > once in the unextended array, but twice when the array is extended with
> > zeros because of the Hermitean symmetry. That should probably be fixed
> in
> > the upsampling code.
>
> Is this also appropriate for the other FFTs? (inverse real, complex,
> hermitian, what have you) I have written a quick hack (attached) that
> should do just that rescaling, but I don't know that it's a good idea,
> as implemented. Really, for a complex IFFT it's extremely peculiar to
> add the padding where we do (between frequency -1 and frequency zero);
> it would make more sense to pad at the high frequencies (which are in
> the middle of the array). Forward FFTs, though, can reasonably be
> padded at the end, and it doesn't make much sense to rescale the last
> data point.


It all depends on the data and what you intend. Much of my experience is
with Michaelson interferometers and in that case the interferogram is
essentially an autocorrelation, so it is desirable to keep its center at
sample zero and let the left side wrap around, so ideally you fill in the
middle as you suggest. You can also pad at the end if you don't put the
center at zero, but then you need to phase shift the spectrum in a way that
corresponds to rotating the center to index zero and padding in the middle.
I expect you would want to do the same thing for complex transforms if they
are of real data and do the nyquist divided by two thingy. If the high
frequencies in a complex transform are actually high frequencies and not
aliases of negative frequencies, then you will want to just append zeros.
That case also occurs,  I have designed decimating complex filters that
produce output like that, they were like single sideband in the radio world.

> The inverse irfft also scales by dividing by the new transform size
> instead
> > of the original size, so the result needs to be scaled up for the
> > interpolation to work. It is easy to go wrong with fft's when the
> correct
> > sampling/frequency scales aren't carried with the data. I always do that
> > myself so that the results are independent of transform
> size/interpolation
> > and expressed in some standard units.
>
> The scaling of the FFT is a pain everywhere. I always just try it a
> few times until I get the coefficients right. I sort of like FFTW's
> convention of never normalizing anything - it means the transforms
> have nice simple formulas, though unfortunately it also means that
> ifft(fft(A))!=A. In any case the normalization of numpy's FFTs is not
> something that can reasonably be changed, even in the special case of
> the zero-padding inverse (and forward) FFTs.


I usually multiply the forward transform by the sample interval, in secs or
cm, and the unscaled inverse transform by the frequency sample interval, in
Hz or cm^-1. That treats both the forward and inverse fft like
approximations to the integral transforms and makes the units those of
spectral density. If you think trapezoidal rule, then you will also see
factors of .5 at the ends, but that is a sort of apodization that is
consistent with how Fourier series converge at discontinuities. In the
normal case where no interpolation is done the product of the sample
intervals is 1/N, so it reduces to the usual convention. Note that in your
example the sampling interval decreases when you do the interpolation, so if
you did another forward transform it would be scaled down to account for the
extra points in the data.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/bac46c14/attachment.html>

From peridot.faceted at gmail.com  Wed Aug 29 22:24:50 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Wed, 29 Aug 2007 22:24:50 -0400
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <e06186140708291846j4ac60ff3h582c7ef46496cc11@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
	<e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>
	<ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>
	<e06186140708291846j4ac60ff3h582c7ef46496cc11@mail.gmail.com>
Message-ID: <ce557a360708291924s1e14c4aeo3ee11c8393fbd7e4@mail.gmail.com>

On 29/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:

> > Is this also appropriate for the other FFTs? (inverse real, complex,
> > hermitian, what have you) I have written a quick hack (attached) that
> > should do just that rescaling, but I don't know that it's a good idea,
> > as implemented. Really, for a complex IFFT it's extremely peculiar to
> > add the padding where we do (between frequency -1 and frequency zero);
> > it would make more sense to pad at the high frequencies (which are in
> > the middle of the array). Forward FFTs, though, can reasonably be
> > padded at the end, and it doesn't make much sense to rescale the last
> > data point.
>
> It all depends on the data and what you intend. Much of my experience is
> with Michaelson interferometers and in that case the interferogram is
> essentially an autocorrelation, so it is desirable to keep its center at
> sample zero and let the left side wrap around, so ideally you fill in the
> middle as you suggest. You can also pad at the end if you don't put the
> center at zero, but then you need to phase shift the spectrum in a way that
> corresponds to rotating the center to index zero and padding in the middle.
> I expect you would want to do the same thing for complex transforms if they
> are of real data and do the nyquist divided by two thingy. If the high
> frequencies in a complex transform are actually high frequencies and not
> aliases of negative frequencies, then you will want to just append zeros.
> That case also occurs,  I have designed decimating complex filters that
> produce output like that, they were like single sideband in the radi o
> world.

So is it a fair summary to say that for irfft, it is fairly clear that
one should adjust the Nyquist coefficient, but for the other varieties
of FFT, the padding done by numpy is just one of many possible
choices?

Should numpy be modified so that irfft adjusts the Nyquist
coefficient? Should this happen only for irfft?

> I usually multiply the forward transform by the sample interval, in secs or
> cm, and the unscaled inverse transform by the frequency sample interval, in
> Hz or cm^-1. That treats both the forward and inverse fft like
> approximations to the integral transforms and makes the units those of
> spectral density. If you think trapezoidal rule, then you will also see
> factors of .5 at the ends, but that is a sort of apodization that is
> consistent with how Fourier series converge at discontinuities. In the
> normal case where no interpolation is done the product of the sample
> intervals is 1/N, so it reduces to the usual convention. Note that in your
> example the sampling interval decreases when you do the interpolation, so if
> you did another forward transform it would be scaled down to account for the
> extra points in the data.

That's a convenient normalization.

Do you know if there's a current package to associate units with numpy
arrays? For my purposes it would usually be sufficient to have arrays
of quantities with uniform units. Conversions need only be
multiplicative (I don't care about Celsius-to-Fahrenheit style
conversions) and need not even be automatic, though of course that
would be convenient. Right now I use Frink for that sort of thing, but
it would have saved me from making a number of minor mistakes in
several pieces of python code I've written.

Thanks,
Anne


From charlesr.harris at gmail.com  Wed Aug 29 23:25:55 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 29 Aug 2007 21:25:55 -0600
Subject: [Numpy-discussion] Bug or surprising undocumented behaviour in
	irfft
In-Reply-To: <ce557a360708291924s1e14c4aeo3ee11c8393fbd7e4@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>
	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>
	<e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>
	<ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>
	<e06186140708291846j4ac60ff3h582c7ef46496cc11@mail.gmail.com>
	<ce557a360708291924s1e14c4aeo3ee11c8393fbd7e4@mail.gmail.com>
Message-ID: <e06186140708292025l684b741dt7b989736afc7aaf0@mail.gmail.com>

On 8/29/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 29/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
> > > Is this also appropriate for the other FFTs? (inverse real, complex,
> > > hermitian, what have you) I have written a quick hack (attached) that
> > > should do just that rescaling, but I don't know that it's a good idea,
> > > as implemented. Really, for a complex IFFT it's extremely peculiar to
> > > add the padding where we do (between frequency -1 and frequency zero);
> > > it would make more sense to pad at the high frequencies (which are in
> > > the middle of the array). Forward FFTs, though, can reasonably be
> > > padded at the end, and it doesn't make much sense to rescale the last
> > > data point.
> >
> > It all depends on the data and what you intend. Much of my experience is
> > with Michaelson interferometers and in that case the interferogram is
> > essentially an autocorrelation, so it is desirable to keep its center at
> > sample zero and let the left side wrap around, so ideally you fill in
> the
> > middle as you suggest. You can also pad at the end if you don't put the
> > center at zero, but then you need to phase shift the spectrum in a way
> that
> > corresponds to rotating the center to index zero and padding in the
> middle.
> > I expect you would want to do the same thing for complex transforms if
> they
> > are of real data and do the nyquist divided by two thingy. If the high
> > frequencies in a complex transform are actually high frequencies and not
> > aliases of negative frequencies, then you will want to just append
> zeros.
> > That case also occurs,  I have designed decimating complex filters that
> > produce output like that, they were like single sideband in the radi o
> > world.
>
> So is it a fair summary to say that for irfft, it is fairly clear that
> one should adjust the Nyquist coefficient, but for the other varieties
> of FFT, the padding done by numpy is just one of many possible
> choices?
>
> Should numpy be modified so that irfft adjusts the Nyquist
> coefficient? Should this happen only for irfft?


Yes, I think that should be the case. If the complex transforms pad in the
middle, then they are treating the high frequencies as aliases, but unless
they explicitly duplicate the Nyquist coefficient scaling isn't needed. Hmm,
actually, I think that is wrong. The original data points will be
reproduced, but what happens in between points? In between there is a
difference between positive and negative frequences. So in a complex
transform of real data one would want to split the Nyquist coefficient
between high and low frequencies. I don't think it is possible to make a
general statement about the complex case. Just hope the middle frequency is
zero so you can ignore the problem ;)

What happens in the real case is that the irfft algorithm uses the Hermitean
symmetry of the spectrum, so the coefficient is implicitly duplicated.

> I usually multiply the forward transform by the sample interval, in secs
> or
> > cm, and the unscaled inverse transform by the frequency sample interval,
> in
> > Hz or cm^-1. That treats both the forward and inverse fft like
> > approximations to the integral transforms and makes the units those of
> > spectral density. If you think trapezoidal rule, then you will also see
> > factors of .5 at the ends, but that is a sort of apodization that is
> > consistent with how Fourier series converge at discontinuities. In the
> > normal case where no interpolation is done the product of the sample
> > intervals is 1/N, so it reduces to the usual convention. Note that in
> your
> > example the sampling interval decreases when you do the interpolation,
> so if
> > you did another forward transform it would be scaled down to account for
> the
> > extra points in the data.
>
> That's a convenient normalization.
>
> Do you know if there's a current package to associate units with numpy
> arrays? For my purposes it would usually be sufficient to have arrays
> of quantities with uniform units. Conversions need only be
> multiplicative (I don't care about Celsius-to-Fahrenheit style
> conversions) and need not even be automatic, though of course that
> would be convenient. Right now I use Frink for that sort of thing, but
> it would have saved me from making a number of minor mistakes in
> several pieces of python code I've written.


There was a presentation by some fellow from CalTech at SciPy 2005 (4?)
about such a system, but ISTR it looked pretty complex. C++ template
programming does it with traits and maybe the Enthought folks have something
useful along those lines. Otherwise, I don't know of any such system for
general use. Maybe ndarray could be subclassed? It can be convenient to
multiply and divide units, so maybe some sort of string with something to
gather the same units together with a power could be a useful way to track
them and wouldn't tie one down to any particular choice.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070829/de51a0b1/attachment.html>

From numpy-discussion at maubp.freeserve.co.uk  Thu Aug 30 05:28:09 2007
From: numpy-discussion at maubp.freeserve.co.uk (Peter)
Date: Thu, 30 Aug 2007 10:28:09 +0100
Subject: [Numpy-discussion] Citing Numeric and numpy
In-Reply-To: <c5b438120708290606i102eb161x796b4cd9a445837@mail.gmail.com>
References: <46D55C01.6050604@maubp.freeserve.co.uk>
	<c5b438120708290606i102eb161x796b4cd9a445837@mail.gmail.com>
Message-ID: <46D68DA9.8090606@maubp.freeserve.co.uk>

Thank you Ryan & Alan for the feedback - the three references are 
summarized here for anyone searching for the citations in future.

The recent overview was:

Travis E. Oliphant, "Python for Scientific Computing," Computing in
Science & Engineering, vol. 9, no. 3, May/June 2007, pp. 10-20.

Numerical Python citation, available online at: 
http://numpy.scipy.org/numpydoc/numpy.html

D. Ascher et al., Numerical Python, tech. report UCRL-MA-128569,
Lawrence Livermore National Laboratory, 2001; http://numpy.scipy.org.

NumPy book citation, see also http://www.tramy.us for details:

Travis E. Oliphant (2006) Guide to NumPy, Trelgol Publishing, USA;
http://numpy.scipy.org.

Cheers,

Peter


From pearu at cens.ioc.ee  Thu Aug 30 05:48:44 2007
From: pearu at cens.ioc.ee (Pearu Peterson)
Date: Thu, 30 Aug 2007 12:48:44 +0300 (EEST)
Subject: [Numpy-discussion] Error code of NumpyTest()
In-Reply-To: <e76aa17f0708240141k973e168gd47ef9f48b95113a@mail.gmail.com>
References: <e76aa17f0708240141k973e168gd47ef9f48b95113a@mail.gmail.com>
Message-ID: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee>

On Fri, August 24, 2007 11:41 am, Matthieu Brucher wrote:
> Hi,
>
> I wondered if there was a way of returning another error code than 0 when
> executing the test suite so that a parent process can immediately know if
> all the tests passed or not.
> The numpy buildbot seems to have the same behaviour BTW.
> I don't know if it is possible, but it would be great.

The svn version of test() function now returns TestResult object.

So, test() calls in buildbot should read:

  import numpy,sys; sys.exit(not
numpy.test(verbosity=9999,level=9999).wasSuccessful())

Hopefully buildbot admins can update the test commands accordingly.

Pearu


From matthieu.brucher at gmail.com  Thu Aug 30 05:59:13 2007
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 30 Aug 2007 11:59:13 +0200
Subject: [Numpy-discussion] Error code of NumpyTest()
In-Reply-To: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee>
References: <e76aa17f0708240141k973e168gd47ef9f48b95113a@mail.gmail.com>
	<59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee>
Message-ID: <e76aa17f0708300259l5a84e39eg13db5149c2098231@mail.gmail.com>

Thank you for the answer


The svn version of test() function now returns TestResult object.


Numpy 1.3.x does not provide this ? I can't upgrade the numpy packages on
the Linux boxes (on the Windows box, I suppose that I could use an Enthought
egg).


So, test() calls in buildbot should read:
>
>   import numpy,sys; sys.exit(not
> numpy.test(verbosity=9999,level=9999).wasSuccessful())
>
> Hopefully buildbot admins can update the test commands accordingly.


I'll be able to do this as the tests are located on the repository.

Matthieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070830/ced57e10/attachment.html>

From bryanv at enthought.com  Thu Aug 30 11:17:00 2007
From: bryanv at enthought.com (Bryan Van de Ven)
Date: Thu, 30 Aug 2007 10:17:00 -0500
Subject: [Numpy-discussion] Units;
 was  Bug or surprising undocumented behaviour in	irfft
In-Reply-To: <ce557a360708291924s1e14c4aeo3ee11c8393fbd7e4@mail.gmail.com>
References: <ce557a360708291531m1ce3f8dbqc0cf2174cb92f95c@mail.gmail.com>	<e06186140708291608k58f9a5e5p3c75b49499cbbb2a@mail.gmail.com>	<e06186140708291644l4a4e201cy352d83a376c87c61@mail.gmail.com>	<ce557a360708291749o41d4341eg4f8e9834123ed850@mail.gmail.com>	<e06186140708291846j4ac60ff3h582c7ef46496cc11@mail.gmail.com>
	<ce557a360708291924s1e14c4aeo3ee11c8393fbd7e4@mail.gmail.com>
Message-ID: <46D6DF6C.7010805@enthought.com>


> Do you know if there's a current package to associate units with numpy
> arrays? For my purposes it would usually be sufficient to have arrays
> of quantities with uniform units. Conversions need only be
> multiplicative (I don't care about Celsius-to-Fahrenheit style
> conversions) and need not even be automatic, though of course that
> would be convenient. Right now I use Frink for that sort of thing, but
> it would have saved me from making a number of minor mistakes in
> several pieces of python code I've written.

Anne,

We have an enthought.units package in ETS, and for unit-ed numpy arrays we have 
(fairly new) UnitArray and UnitScalar in enthought.numerical_modeling.units.api 
Automatic conversions on arithmetic expressions are not performed; however, we 
do have a "@has_units" function decorator that will perform unit conversions on 
function inputs automatically (and will label--but not convert--the outputs of a 
function) If you are interested in checking it out I can get you more 
information/examples.

Bryan


From donovan at mirsl.ecs.umass.edu  Thu Aug 30 11:25:44 2007
From: donovan at mirsl.ecs.umass.edu (Brian Donovan)
Date: Thu, 30 Aug 2007 11:25:44 -0400
Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion
Message-ID: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com>

Hello all,

  I'm wondering if there is a way to use a numpy array that uses disk as a
memory store rather than ram. I'm looking for something like mmap but which
can be used like a numpy array. The general idea is this. I'm simulating a
system which produces a large dataset over a few hours of processing time.
Rather than store the numpy array in memory during processing I'd like to
write the data directly to disk but still be able to treat the array as a
numpy array. Is this possible? Any ideas?

Thanks,

Brian
--
Brian Donovan
Research Assistant
Microwave Remote Sensing Lab
UMass Amherst
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070830/5a4a3c15/attachment.html>

From broman at spawar.navy.mil  Thu Aug 30 11:24:27 2007
From: broman at spawar.navy.mil (Vincent Broman)
Date: Thu, 30 Aug 2007 08:24:27 -0700
Subject: [Numpy-discussion] numpy build fails on powerpc ydl
In-Reply-To: <mailman.69791.1188438394.30507.numpy-discussion@scipy.org>
References: <mailman.69791.1188438394.30507.numpy-discussion@scipy.org>
Message-ID: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil>

My build of numpy fails under Yellow Dog Linux 2.1,
running on a powerpc multiprocessor board from Curtiss-Wright.

Its kernel is 2.4.19-Asmp tailored by the vendor.
The gcc compiler is configured as ppc-yellowdog-linux with
version number 2.95.3 20010111.
The python I'm using is Python 2.5.1 (r251:54863) installed as python2.
Plain /usr/bin/python is 1.5.x .
The numpy version I'm trying to build is r4003 for v1.0.4 .

The setup fails compiling build/src.linux-ppc-2.5/numpy/core/src/umathmodule.c
with a long list of error messages of the following two kinds.

warning: conflicting types for built-in function `sinl'
repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442.

inconsistent operand constraints in an ?`asm',
triggered by lines 1100, 1124, 1150, 1755, 1785, and 1834.

I cannot see on those source lines what causes such a
message; I suspect there is some long complicated
cpp macro or asm statement in some include file which
I don't find.

Has anyone tried building numpy on Yellow Dog Linux or on a PowerPC with gcc?

Vincent Broman
broman at spawar.navy.mil


From rmay at ou.edu  Thu Aug 30 11:33:29 2007
From: rmay at ou.edu (Ryan May)
Date: Thu, 30 Aug 2007 10:33:29 -0500
Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion
In-Reply-To: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com>
References: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com>
Message-ID: <46D6E349.4010109@ou.edu>

Brian Donovan wrote:
> Hello all,
> 
>   I'm wondering if there is a way to use a numpy array that uses disk as
> a memory store rather than ram. I'm looking for something like mmap but
> which can be used like a numpy array. The general idea is this. I'm
> simulating a system which produces a large dataset over a few hours of
> processing time. Rather than store the numpy array in memory during
> processing I'd like to write the data directly to disk but still be able
> to treat the array as a numpy array. Is this possible? Any ideas?

What you're looking for is numpy.memmap, though the documentation is
eluding me at the moment.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma


From peridot.faceted at gmail.com  Thu Aug 30 11:34:11 2007
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Thu, 30 Aug 2007 11:34:11 -0400
Subject: [Numpy-discussion] Accessing a numpy array in a mmap fashion
In-Reply-To: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com>
References: <29e003270708300825w3fc10f2diaef3f4b09690882b@mail.gmail.com>
Message-ID: <ce557a360708300834u2f103db2gf337204265669104@mail.gmail.com>

On 30/08/2007, Brian Donovan <donovan at mirsl.ecs.umass.edu> wrote:
> Hello all,
>
>   I'm wondering if there is a way to use a numpy array that uses disk as a
> memory store rather than ram. I'm looking for something like mmap but which
> can be used like a numpy array. The general idea is this. I'm simulating a
> system which produces a large dataset over a few hours of processing time.
> Rather than store the numpy array in memory during processing I'd like to
> write the data directly to disk but still be able to treat the array as a
> numpy array. Is this possible? Any ideas?

You want numpy.memmap:
http://mail.python.org/pipermail/python-list/2007-May/443036.html

This will do exactly what you want (though you may have problems with
arrays bigger than a few gigabytes, particularly on 32-bit systems)
and there may be a few rough edges. You will probably need to create
the file first.

Keep in mind that if the array is actually temporary, the virtual
memory system will push unused parts out to disk as memory fills up,
so there's no need to use memmap explicitly. If you want the array
permanently on disk, though, memmap is probably the most convenient
way to do it - though if your access patterns are not local it may
involve a lot of thrashing. Sequential disk writes have the advantage
(?) of forcing you to write code that accesses disks in a local
fashion.

Anne


From stefan at sun.ac.za  Thu Aug 30 18:04:51 2007
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Fri, 31 Aug 2007 00:04:51 +0200
Subject: [Numpy-discussion] Error code of NumpyTest()
In-Reply-To: <59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee>
References: <e76aa17f0708240141k973e168gd47ef9f48b95113a@mail.gmail.com>
	<59808.129.240.228.53.1188467324.squirrel@cens.ioc.ee>
Message-ID: <20070830220451.GG14395@mentat.za.net>

On Thu, Aug 30, 2007 at 12:48:44PM +0300, Pearu Peterson wrote:
> The svn version of test() function now returns TestResult object.
> 
> So, test() calls in buildbot should read:
> 
>   import numpy,sys; sys.exit(not
> numpy.test(verbosity=9999,level=9999).wasSuccessful())
> 
> Hopefully buildbot admins can update the test commands accordingly.

Thanks, Pearu.  I forwarded your instructions to the relevant parties.

Cheers
St?fan


From oliphant at enthought.com  Fri Aug 31 15:22:06 2007
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Fri, 31 Aug 2007 14:22:06 -0500
Subject: [Numpy-discussion] numpy build fails on powerpc ydl
In-Reply-To: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil>
References: <mailman.69791.1188438394.30507.numpy-discussion@scipy.org>
	<200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil>
Message-ID: <46D86A5E.7010809@enthought.com>

Vincent Broman wrote:
> My build of numpy fails under Yellow Dog Linux 2.1,
> running on a powerpc multiprocessor board from Curtiss-Wright.
>
> Its kernel is 2.4.19-Asmp tailored by the vendor.
> The gcc compiler is configured as ppc-yellowdog-linux with
> version number 2.95.3 20010111.
> The python I'm using is Python 2.5.1 (r251:54863) installed as python2.
> Plain /usr/bin/python is 1.5.x .
> The numpy version I'm trying to build is r4003 for v1.0.4 .
>
> The setup fails compiling build/src.linux-ppc-2.5/numpy/core/src/umathmodule.c
> with a long list of error messages of the following two kinds.
>
> warning: conflicting types for built-in function `sinl'
> repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442.
>   

You may be the first one to build on this platform.  What needs to 
happen is that the correct config.h file needs to be set up for that 
platform.   The long-float versions of certain functions are being 
incorrectly identified.

Would you be willing to help get the config.h file set up correctly?

-Travis


From charlesr.harris at gmail.com  Fri Aug 31 16:35:57 2007
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 31 Aug 2007 14:35:57 -0600
Subject: [Numpy-discussion] numpy build fails on powerpc ydl
In-Reply-To: <200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil>
References: <mailman.69791.1188438394.30507.numpy-discussion@scipy.org>
	<200708300824.27221@b00d61a8cecf8b2266f81358fd170621.navy.mil>
Message-ID: <e06186140708311335p3d24f321h1f08248f6e43ddf4@mail.gmail.com>

On 8/30/07, Vincent Broman <broman at spawar.navy.mil> wrote:
>
> My build of numpy fails under Yellow Dog Linux 2.1,
> running on a powerpc multiprocessor board from Curtiss-Wright.
>
> Its kernel is 2.4.19-Asmp tailored by the vendor.


Which vendor?

The gcc compiler is configured as ppc-yellowdog-linux with
> version number 2.95.3 20010111.


That compiler is really, I mean really, ancient. And the API changed in
newer gcc (> 3.x.x), so code compiled with later versions isn't binary
compatible. Hmmm. Curtiss-Wright now supports Linux and kernel 2.6.16 on
some of their newer hardware, you might want to check with them or install a
more current distro from Fedora or someone else who supports the PPC.

The python I'm using is Python 2.5.1 (r251:54863) installed as python2.
> Plain /usr/bin/python is 1.5.x .
> The numpy version I'm trying to build is r4003 for v1.0.4 .
>
> The setup fails compiling build/src.linux-ppc-2.5
> /numpy/core/src/umathmodule.c
> with a long list of error messages of the following two kinds.
>
> warning: conflicting types for built-in function `sinl'
> repeated for `cosl', `fabsl', and `sqrtl', triggered by line 442.


Any more detail on these? What causes the conflict. I've got to wonder about
the the libc/libm versions also. Does the include file math.h say anything
about the prototypes for these functions? I expect cosl et.al. to be
potential problems on the PPC anyway due to the way long doubles were
implemented.

inconsistent operand constraints in an `asm',
> triggered by lines 1100, 1124, 1150, 1755, 1785, and 1834.
>
> I cannot see on those source lines what causes such a
> message; I suspect there is some long complicated
> cpp macro or asm statement in some include file which
> I don't find.


What is the PPC model number?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070831/69407a03/attachment.html>