From schlesin at cshl.edu Wed Dec 1 00:46:39 2010 From: schlesin at cshl.edu (Felix Schlesinger) Date: Wed, 1 Dec 2010 05:46:39 +0000 (UTC) Subject: [Numpy-discussion] A faster median (Wirth's method) References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> Message-ID: > > import numpy as np > > cimport numpy as cnp > > > cdef cnp.float64_t namean(cnp.ndarray[cnp.float64_t, ndim=1] a): > > return np.nanmean(a) # just a placeholder > > > is not allowed? It works for me. Is it a cython version thing? > > (I've got 0.13), > > Oh, that's nice! I'm using 0.11.2. OK, time to upgrade. Oh wow, does that mean that http://trac.cython.org/cython_trac/ticket/177 is fixed? I couldn't find anything in the release notes about that, but it would be great news. Does the cdef function acquire and hold the buffer? Felix From jeanluc.menut at free.fr Wed Dec 1 05:23:22 2010 From: jeanluc.menut at free.fr (Jean-Luc Menut) Date: Wed, 01 Dec 2010 11:23:22 +0100 Subject: [Numpy-discussion] numpy speed question In-Reply-To: References: <4CEE36DD.8000105@free.fr> Message-ID: <4CF6221A.6020804@free.fr> Le 26/11/2010 17:48, Bruce Sherwood a ?crit : > Although this was mentioned earlier, it's worth emphasizing that if > you need to use functions such as cosine with scalar arguments, you > should use math.cos(), not numpy.cos(). The numpy versions of these > functions are optimized for handling array arguments and are much > slower than the math versions for scalar arguments. Yes I understand that. I just want to stress that it was not a benchmark (nor a critic) but a test to know if it was interesting to translate directly an IDL code into python/numpy before trying to optimize it (I know more python than IDL). I expected to have approximatively the same speed for both, was surprised by the result, and wanted to know if there was an obvious reason besides the unoptimization for scalars. From John.Hornstein at nrl.navy.mil Wed Dec 1 09:26:02 2010 From: John.Hornstein at nrl.navy.mil (John Hornstein) Date: Wed, 1 Dec 2010 09:26:02 -0500 Subject: [Numpy-discussion] Python versions for NumPy 1.5 Message-ID: <004d01cb9163$ae317de0$0a9479a0$@Hornstein@nrl.navy.mil> Does NumPy 1.5 work with Python 2.7 or Python 3.x? From charlesr.harris at gmail.com Wed Dec 1 09:32:51 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 1 Dec 2010 07:32:51 -0700 Subject: [Numpy-discussion] Python versions for NumPy 1.5 In-Reply-To: <493532460441506022@unknownmsgid> References: <493532460441506022@unknownmsgid> Message-ID: On Wed, Dec 1, 2010 at 7:26 AM, John Hornstein wrote: > Does NumPy 1.5 work with Python 2.7 or Python 3.x? > > > Yes, both. NumPy 1.5.1 fixes some small bugs and that is what you should use. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Wed Dec 1 11:07:18 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 1 Dec 2010 08:07:18 -0800 Subject: [Numpy-discussion] A faster median (Wirth's method) In-Reply-To: References: <4A9C9DDA.9060503@molden.no> <4A9D7B5E.6040009@student.matnat.uio.no> <4A9D9432.20907@molden.no> Message-ID: @Keith Goodman I think I figured it out. I believe something like the following will do what you want, iterating across one axis specially, so it can apply a median function along an axis. This code in particular is for calculating a moving average and seems to work (though I haven't checked my math). Let me know if you find any problems. 
def ewma(a, d, int axis = -1):
    out = np.empty(a.shape, dtype)

    cdef np.flatiter ita, ito
    ita = np.PyArray_IterAllButAxis(a, &axis)
    ito = np.PyArray_IterAllButAxis(out, &axis)

    cdef int i
    cdef int axis_length = a.shape[axis]
    cdef int a_axis_stride = a.strides[axis]/a.itemsize
    cdef int o_axis_stride = out.strides[axis]/out.itemsize

    cdef double avg = 0.0
    cdef double weight = 1.0 - np.exp(-d)

    while np.PyArray_ITER_NOTDONE(ita):
        avg = 0.0
        for i in range(axis_length):
            avg += (<double*> np.PyArray_ITER_DATA(ita))[i * a_axis_stride] * weight + avg * (1 - weight)
            (<double*> np.PyArray_ITER_DATA(ito))[i * o_axis_stride] = avg
        np.PyArray_ITER_NEXT(ita)
        np.PyArray_ITER_NEXT(ito)

    return out

On Tue, Nov 30, 2010 at 9:46 PM, Felix Schlesinger wrote:

> > > import numpy as np
> > > cimport numpy as cnp
> > >
> > > cdef cnp.float64_t namean(cnp.ndarray[cnp.float64_t, ndim=1] a):
> > >     return np.nanmean(a)  # just a placeholder
> > >
> > > is not allowed? It works for me. Is it a cython version thing?
> > > (I've got 0.13),
> >
> > Oh, that's nice! I'm using 0.11.2. OK, time to upgrade.
>
> Oh wow, does that mean that http://trac.cython.org/cython_trac/ticket/177
> is fixed? I couldn't find anything in the release notes about that,
> but it would be great news. Does the cdef function acquire and hold
> the buffer?
>
> Felix
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gregwh at gmail.com Wed Dec 1 14:16:13 2010
From: gregwh at gmail.com (greg whittier)
Date: Wed, 1 Dec 2010 14:16:13 -0500
Subject: [Numpy-discussion] broadcasting with numpy.interp
In-Reply-To:
References:
Message-ID:

On Wed, Nov 24, 2010 at 3:16 PM, Friedrich Romstedt <
friedrichromstedt at gmail.com> wrote:

> 2010/11/16 greg whittier :
> > I'd like to be able to speed up the following code.
> >
> > def replace_dead(cube, dead):
> >     # cube.shape == (320, 640, 1200)
> >     # dead.shape == (320, 640)
> >     # cube[i,j,:] are bad points to be replaced via interpolation if
> >     # dead[i,j] == True
> >
> >     bands = np.arange(0, cube.shape[0])
> >     for line in range(cube.shape[1]):
> >         dead_bands = bands[dead[:, line] == True]
> >         good_bands = bands[dead[:, line] == False]
> >         for sample in range(cube.shape[2]):
> >             # interp returns fp[0] for x < xp[0] and fp[-1] for x > xp[-1]
> >             cube[dead_bands, line, sample] = \
> >                 np.interp(dead_bands,
> >                           good_bands,
> >                           cube[good_bands, line, sample])
>
> I assume you just need *some* interpolation, not that specific one?
> In that case, I'd suggest the following:
>
> 1) Use a 2d interpolation, taking into account all nearest neighbours.
> 2) For this, use a looped interpolation in this nearest-neighbour sense:
>    a) Generate sums of all unmasked nearest-neighbour values
>    b) Generate counts for the nearest neighbours present
>    c) Replace the bad values by the sums divided by the count.
>    d) Continue at (a) if there are bad values left
>
> Bad values which are neighbouring each other (>= 3) need multiple
> passes through the loop. It should be pretty fast.
>
> If this is what you have in mind, maybe we (or I) can make up some code.
>
> Friedrich
>

Thanks so much for the response! Sorry I didn't respond earlier. I put it
aside until I found time to try and understand part 2 of your response and
forgot about it. I'm not really looking for 2d interpolation at the moment,
but I can see needing it in the future.
Right now, I just want to interpolate along one of the three axes. I think what you're suggesting might work for 1d or 2d depending on how you find the nearest neighbors. What routine would you use? Also, when you say "unmasked" do you mean literally using masked arrays? Thanks, Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From kbasye1 at jhu.edu Wed Dec 1 14:18:36 2010 From: kbasye1 at jhu.edu (Ken Basye) Date: Wed, 01 Dec 2010 14:18:36 -0500 Subject: [Numpy-discussion] printoption to allow hexified floats? Message-ID: <4CF69F8C.2090603@jhu.edu> Hi Numpy folks, When working with floats, I prefer to have exact string representations in doctests and other reference-based testing; I find it helps a lot to avoid chasing cross-platform differences that are really about the string conversion rather than about numerical differences. Since Python 2.6, the, the hex() method on floats has been available and it gives an exact representation. Is there any way to have Numpy arrays of floats printed using this representation? If not, would there be interest in adding that? On a somewhat related note, is there a table someplace which shows which versions of Python are supported in each release of Numpy? I found an FAQ that mentioned 2.4 and 2.5, but since it didn't mention 2.6 or 2.7 (much less 3.1), I assume it's out of date. This relates to the above since it would be harder to support a new hex printoption for Pythons before 2.6. Thanks, Ken B. From kwgoodman at gmail.com Wed Dec 1 14:47:36 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 1 Dec 2010 11:47:36 -0800 Subject: [Numpy-discussion] A Cython apply_along_axis function Message-ID: It's hard to write Cython code that can handle all dtypes and arbitrary number of dimensions. The former is typically dealt with using templates, but what do people do about the latter? I'm trying to take baby steps towards writing an apply_along_axis function that takes as input a cython function, numpy array, and axis. I'm using the following numpy ticket as a guide but I'm really just copying and pasting without understanding: http://projects.scipy.org/numpy/attachment/ticket/1213/_selectmodule.pyx Can anyone spot why I get a segfault on the call to nanmean_1d in apply_along_axis? import numpy as np cimport numpy as np import cython cdef double NAN = np.nan ctypedef np.float64_t (*func_t)(void *buf, np.npy_intp size, np.npy_intp s) def apply_along_axis(np.ndarray[np.float64_t, ndim=1] a, int axis): cdef func_t nanmean_1d cdef np.npy_intp stride, itemsize cdef int ndim = a.ndim cdef np.float64_t out itemsize = a.itemsize if ndim == 1: stride = a.strides[0] // itemsize # convert stride bytes --> items out = nanmean_1d(a.data, a.shape[0], stride) else: raise ValueError("Not yet coded") return out cdef np.float64_t nanmean_1d(void *buf, np.npy_intp n, np.npy_intp s): "nanmean of buffer." 
cdef np.float64_t *a = buf # cdef np.npy_intp i, count = 0 cdef np.float64_t asum, ai if s == 1: for i in range(n): ai = a[i] if ai == ai: asum += ai count += 1 else: for i in range(n): ai = a[i*s] if ai == ai: asum += ai count += 1 if count > 0: return asum / count else: return NAN From dagss at student.matnat.uio.no Wed Dec 1 15:00:38 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 01 Dec 2010 21:00:38 +0100 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: Message-ID: <4CF6A966.2070002@student.matnat.uio.no> On 12/01/2010 08:47 PM, Keith Goodman wrote: > It's hard to write Cython code that can handle all dtypes and > arbitrary number of dimensions. The former is typically dealt with > using templates, but what do people do about the latter? > What you typically do is to use the C-level iterator API. In fact there's a recent thread on cython-users that does exactly this ("How can I use PyArray_IterAllButAxis..."). Of course, make sure you take the comments of that thread into account (!). I feel that is easier to work with than what you do below. Not saying it couldn't be easier, but it's not too bad once you get used to it. Dag Sverre > I'm trying to take baby steps towards writing an apply_along_axis > function that takes as input a cython function, numpy array, and axis. > I'm using the following numpy ticket as a guide but I'm really just > copying and pasting without understanding: > > http://projects.scipy.org/numpy/attachment/ticket/1213/_selectmodule.pyx > > Can anyone spot why I get a segfault on the call to nanmean_1d in > apply_along_axis? > > import numpy as np > cimport numpy as np > import cython > > cdef double NAN = np.nan > ctypedef np.float64_t (*func_t)(void *buf, np.npy_intp size, np.npy_intp s) > > def apply_along_axis(np.ndarray[np.float64_t, ndim=1] a, int axis): > > cdef func_t nanmean_1d > cdef np.npy_intp stride, itemsize > cdef int ndim = a.ndim > cdef np.float64_t out > > itemsize = a.itemsize > > if ndim == 1: > stride = a.strides[0] // itemsize # convert stride bytes --> items > out = nanmean_1d(a.data, a.shape[0], stride) > else: > raise ValueError("Not yet coded") > > return out > > cdef np.float64_t nanmean_1d(void *buf, np.npy_intp n, np.npy_intp s): > "nanmean of buffer." > cdef np.float64_t *a = buf # > cdef np.npy_intp i, count = 0 > cdef np.float64_t asum, ai > if s == 1: > for i in range(n): > ai = a[i] > if ai == ai: > asum += ai > count += 1 > else: > for i in range(n): > ai = a[i*s] > if ai == ai: > asum += ai > count += 1 > if count> 0: > return asum / count > else: > return NAN > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From david at silveregg.co.jp Wed Dec 1 20:53:43 2010 From: david at silveregg.co.jp (David) Date: Thu, 02 Dec 2010 10:53:43 +0900 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: Message-ID: <4CF6FC27.5040005@silveregg.co.jp> Hi Keith, On 12/02/2010 04:47 AM, Keith Goodman wrote: > It's hard to write Cython code that can handle all dtypes and > arbitrary number of dimensions. The former is typically dealt with > using templates, but what do people do about the latter? The only way that I know to do that systematically is iterator. There is a relatively simple example in scipy/signal (lfilter.c.src). I wonder if it would be possible to add better support for numpy iterators in cython... 
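For comparison, a rough pure-NumPy fallback for the reduction being discussed (nanmean over one axis of an array with any number of dimensions) is sketched below; the function name and the float64 cast are only assumptions for illustration, and it trades the speed of the Cython approaches for simplicity by looping over 1-D slices:

import numpy as np

def nanmean_along_axis(a, axis=-1):
    # Move the reduction axis to the end, flatten the rest, and loop over
    # the 1-D slices.  Slow, but dtype- and ndim-generic without templates.
    a = np.asarray(a, dtype=np.float64)
    axis = axis % a.ndim
    b = np.rollaxis(a, axis, a.ndim)
    flat = b.reshape(-1, b.shape[-1])
    out = np.empty(flat.shape[0])
    for i, row in enumerate(flat):
        good = row[~np.isnan(row)]
        out[i] = good.mean() if good.size else np.nan
    return out.reshape(b.shape[:-1])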
cheers, David From kwgoodman at gmail.com Wed Dec 1 21:07:04 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 1 Dec 2010 18:07:04 -0800 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: <4CF6FC27.5040005@silveregg.co.jp> References: <4CF6FC27.5040005@silveregg.co.jp> Message-ID: On Wed, Dec 1, 2010 at 5:53 PM, David wrote: > On 12/02/2010 04:47 AM, Keith Goodman wrote: >> It's hard to write Cython code that can handle all dtypes and >> arbitrary number of dimensions. The former is typically dealt with >> using templates, but what do people do about the latter? > > The only way that I know to do that systematically is iterator. There is > a relatively simple example in scipy/signal (lfilter.c.src). > > I wonder if it would be possible to add better support for numpy > iterators in cython... Thanks for the tip. I'm starting to think that for now I should just template both dtype and ndim. From jsalvati at u.washington.edu Wed Dec 1 21:09:27 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 1 Dec 2010 18:09:27 -0800 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: <4CF6FC27.5040005@silveregg.co.jp> Message-ID: On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman wrote: > On Wed, Dec 1, 2010 at 5:53 PM, David wrote: > > > On 12/02/2010 04:47 AM, Keith Goodman wrote: > >> It's hard to write Cython code that can handle all dtypes and > >> arbitrary number of dimensions. The former is typically dealt with > >> using templates, but what do people do about the latter? > > > > The only way that I know to do that systematically is iterator. There is > > a relatively simple example in scipy/signal (lfilter.c.src). > > > > I wonder if it would be possible to add better support for numpy > > iterators in cython... > > Thanks for the tip. I'm starting to think that for now I should just > template both dtype and ndim. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I enthusiastically support better iterator support for cython -------------- next part -------------- An HTML attachment was scrubbed... URL: From wardefar at iro.umontreal.ca Wed Dec 1 21:53:21 2010 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Wed, 1 Dec 2010 21:53:21 -0500 Subject: [Numpy-discussion] printoption to allow hexified floats? In-Reply-To: <4CF69F8C.2090603@jhu.edu> References: <4CF69F8C.2090603@jhu.edu> Message-ID: On 2010-12-01, at 2:18 PM, Ken Basye wrote: > On a somewhat related note, is there a table someplace which shows > which versions of Python are supported in each release of Numpy? I > found an FAQ that mentioned 2.4 and 2.5, but since it didn't mention 2.6 > or 2.7 (much less 3.1), I assume it's out of date. This relates to the > above since it would be harder to support a new hex printoption for > Pythons before 2.6. NumPy 1.5.x still aims to support Python >= 2.4. I don't know what the plans are for dropping 2.4 support, but I don't think 2.5 would be dropped until some time after official support for 2.4 is phased out. I'm confused how having an exact hex representation of a float would help with doctests, though. It seems like it would exacerbate platform issues. One thought is to include a 'print' and explicit format specifier, which (I think?) is fairly consistent across platforms... or is this what you mean to say you're already doing? 
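For what it's worth, a minimal sketch of the exact, platform-independent formatting being discussed, using the float.hex() method available since Python 2.6 (the helper name is made up for illustration):

import numpy as np

def hex_strings(a):
    # One exact hex string per element; two platforms that compute
    # bit-identical doubles will also print identical strings.
    return [float(x).hex() for x in np.asarray(a, dtype=np.float64).ravel()]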
David From robert.kern at gmail.com Wed Dec 1 22:02:02 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 1 Dec 2010 21:02:02 -0600 Subject: [Numpy-discussion] printoption to allow hexified floats? In-Reply-To: <4CF69F8C.2090603@jhu.edu> References: <4CF69F8C.2090603@jhu.edu> Message-ID: On Wed, Dec 1, 2010 at 13:18, Ken Basye wrote: > Hi Numpy folks, > ? ? When working with floats, I prefer to have exact string > representations in doctests and other reference-based testing; I find it > helps a lot to avoid chasing cross-platform differences that are really > about the string conversion rather than about numerical differences. Unfortunately, there are still cross-platform numerical differences that are real (but are irrelevant to the validity of the code under test). Hex-printing for floats only helps a little to make doctests useful for numerical code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jsalvati at u.washington.edu Wed Dec 1 22:35:42 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 1 Dec 2010 19:35:42 -0800 Subject: [Numpy-discussion] MultiIter version of PyArray_IterAllButAxis ? Message-ID: Hello, I am writing a UFunc creation utility, and I would like to know: is there a way to mimic the behavior ofPyArray_IterAllButAxis for multiple arrays at a time? I would like to be able to write UFuncs that take an axis argument and also take multiple array arguments, for example I want to be able to create a moving average with a weighting that changes according to another array. Best Regards, John -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Wed Dec 1 22:56:36 2010 From: david at silveregg.co.jp (David) Date: Thu, 02 Dec 2010 12:56:36 +0900 Subject: [Numpy-discussion] MultiIter version of PyArray_IterAllButAxis ? In-Reply-To: References: Message-ID: <4CF718F4.3010508@silveregg.co.jp> On 12/02/2010 12:35 PM, John Salvatier wrote: > Hello, > > I am writing a UFunc creation utility, and I would like to know: is > there a way to mimic the behavior ofPyArray_IterAllButAxis for multiple > arrays at a time? Is there a reason why creating a separate iterator for each array is not possible ? cheers, David From jsalvati at u.washington.edu Wed Dec 1 23:00:25 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 1 Dec 2010 20:00:25 -0800 Subject: [Numpy-discussion] MultiIter version of PyArray_IterAllButAxis ? In-Reply-To: <4CF718F4.3010508@silveregg.co.jp> References: <4CF718F4.3010508@silveregg.co.jp> Message-ID: On Wed, Dec 1, 2010 at 7:56 PM, David wrote: > On 12/02/2010 12:35 PM, John Salvatier wrote: > > Hello, > > > > I am writing a UFunc creation utility, and I would like to know: is > > there a way to mimic the behavior ofPyArray_IterAllButAxis for multiple > > arrays at a time? > > Is there a reason why creating a separate iterator for each array is not > possible ? > If the arrays are not the same shape, separate iterators won't be aligned, so if the results are going into a broadcasted result array, the computation won't be correct. -------------- next part -------------- An HTML attachment was scrubbed... 
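For reference, one pure-NumPy way to get the alignment described above is to broadcast the operands to a common shape before iterating over the non-reduction axes; a rough sketch, with the function name and the cumsum kernel standing in for a real weighted-average computation:

import numpy as np

def running_kernel(x, w, axis=-1):
    # Broadcast first so x, w and the output stay aligned, then work on
    # 1-D slices along `axis`.
    x, w = np.broadcast_arrays(np.asarray(x, float), np.asarray(w, float))
    out = np.empty(x.shape)
    axis = axis % x.ndim
    other = [n for i, n in enumerate(x.shape) if i != axis]
    for pos in np.ndindex(*other):
        index = list(pos)
        index.insert(axis, slice(None))
        index = tuple(index)
        out[index] = np.cumsum(x[index] * w[index])  # placeholder kernel
    return out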
URL: From robertwb at math.washington.edu Thu Dec 2 02:17:50 2010 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 1 Dec 2010 23:17:50 -0800 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: <4CF6FC27.5040005@silveregg.co.jp> Message-ID: On Wed, Dec 1, 2010 at 6:09 PM, John Salvatier wrote: > On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman wrote: >> >> On Wed, Dec 1, 2010 at 5:53 PM, David wrote: >> >> > On 12/02/2010 04:47 AM, Keith Goodman wrote: >> >> It's hard to write Cython code that can handle all dtypes and >> >> arbitrary number of dimensions. The former is typically dealt with >> >> using templates, but what do people do about the latter? >> > >> > The only way that I know to do that systematically is iterator. There is >> > a relatively simple example in scipy/signal (lfilter.c.src). >> > >> > I wonder if it would be possible to add better support for numpy >> > iterators in cython... >> >> Thanks for the tip. I'm starting to think that for now I should just >> template both dtype and ndim. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I enthusiastically support better iterator support for cython I enthusiastically welcome contributions along this line. - Robert From dagss at student.matnat.uio.no Thu Dec 2 04:08:12 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 02 Dec 2010 10:08:12 +0100 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: <4CF6FC27.5040005@silveregg.co.jp> Message-ID: <4CF761FC.60804@student.matnat.uio.no> On 12/02/2010 08:17 AM, Robert Bradshaw wrote: > On Wed, Dec 1, 2010 at 6:09 PM, John Salvatier > wrote: > >> On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman wrote: >> >>> On Wed, Dec 1, 2010 at 5:53 PM, David wrote: >>> >>> >>>> On 12/02/2010 04:47 AM, Keith Goodman wrote: >>>> >>>>> It's hard to write Cython code that can handle all dtypes and >>>>> arbitrary number of dimensions. The former is typically dealt with >>>>> using templates, but what do people do about the latter? >>>>> >>>> The only way that I know to do that systematically is iterator. There is >>>> a relatively simple example in scipy/signal (lfilter.c.src). >>>> >>>> I wonder if it would be possible to add better support for numpy >>>> iterators in cython... >>>> >>> Thanks for the tip. I'm starting to think that for now I should just >>> template both dtype and ndim. >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> I enthusiastically support better iterator support for cython >> > I enthusiastically welcome contributions along this line. > Me too :-) I guess we're moving into more Cython-list territory, so let's move any follow-ups there (posting this one both places). Just in case anybody is wondering what something like this could look like, here's a rough scetch complete with bugs. The idea would be to a) add some rudimentary support for using the yield keyword in Cython to make a generator function, b) inline the generator function if the generator is used directly in a for-loop. This should result in very efficient code, and would also be much easier to implement than a general purpose generator. 
@cython.inline cdef array_iter_double(np.ndarray a, int axis=-1): cdef np.flatiter it ita = np.PyArray_IterAllButAxis(a, &axis) cdef Py_ssize_t stride = a.strides[axis], length = a.shape[axis], i while np.PyArray_ITER_NOTDONE(ita): for i in range(length): yield (np.PyArray_ITER_DATA(it) + )[i * stride])[0] # TODO: Probably yield indices as well np.PyArray_ITER_NEXT(it) # TODO: add faster special-cases for stride == sizeof(double) # Use NumPy iterator API to sum all values of array with # arbitrary number of dimensions: cdef double s = 0, value for value in array_iter_double(myarray): s += value # at this point, the contents of the array_iter_double function is copied, # and "s += value" simply inserted everywhere "yield" occurs in the function Dag Sverre From friedrichromstedt at gmail.com Thu Dec 2 07:04:46 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 2 Dec 2010 13:04:46 +0100 Subject: [Numpy-discussion] broadcasting with numpy.interp In-Reply-To: References: Message-ID: 2010/12/1 greg whittier : > On Wed, Nov 24, 2010 at 3:16 PM, Friedrich Romstedt > wrote: >> I assume you just need *some* interpolation, not that specific one? >> In that case, I'd suggest the following: >> >> 1) ?Use a 2d interpolation, taking into account all nearest neighbours. >> 2) ?For this, use a looped interpolation in this nearest-neighbour sense: >> ? ?a) ?Generate sums of all unmasked nearest-neighbour values >> ? ?b) ?Generate counts for the nearest neighbours present >> ? ?c) ?Replace the bad values by the sums divided by the count. >> ? ?d) ?Continue at (a) if there are bad values left >> >> Bad values which are neighbouring each other (>= 3) need multiple >> passes through the loop. ?It should be pretty fast. >> >> If this is what you have in mind, maybe we (or I) can make up some code. >> >> Friedrich > > Thanks so much for the response!? Sorry I didn't respond earlier.? I put it > aside until I found time to try and understand part 2 of your response and > forgot about it.? I'm not really looking for 2d interpolation at the moment, > but I can see needing it in the future.? Right now, I just want to > interpolate along one of the three axes.? I think what you're suggesting > might work for 1d or 2d depending on how you find the nearest neighbors. > What routine would you use?? Also, when you say "unmasked" do you mean > literally using masked arrays? Hi Greg, if you can estimate that you'll need a more sophisticated algorithm in future I'd recommend to write it in full glory, in a general way, in the end it'll save you time (this is what I would do). Yes, you're right, by choosing just neighbours along one axis you could do simple one-axis interpolation, but in some corner cases it'll not work properly since it will work the following (some ascii graphics): "x" are present values, "-" are missing values. The chain might look like the following: xxxx-xxxx In this case, interpolation will work. It'll pick the two neighbours, and interpolate them. But consider this: xxxx--xxxx This will just propagate the end points to the neighbours. The missing points will have just one neighbour, hence this behaviour. After the propagation, all values are filled, and you end up with a step in the middle. If such neighbouring missing data points are rare, it might still be considerable over Python loops with numpy.interp(). I don't see a way to vectorize interp(), since the run lengthes are different in each case. 
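As a point of comparison, a single numpy.interp() call per line already handles missing runs of any length, since every bad position is filled from its two bracketing good points (end gaps get the nearest good value); a small sketch, assuming a boolean dead mask and at least one good value per line:

import numpy as np

def fill_dead_1d(y, dead):
    # y: 1-D data, dead: boolean mask of bad samples.
    y = np.asarray(y, dtype=float).copy()
    dead = np.asarray(dead, dtype=bool)
    good = ~dead
    y[dead] = np.interp(np.flatnonzero(dead), np.flatnonzero(good), y[good])
    return y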
You might consider writing a C or Cython function, but I cannot give any advise with this. I'm thinking about a way to propagate the values over more than one step. You might know that interpolation (in images) uses also kernels extending beyond the next neighbours. But I don't know precisely how to design them. First, I'd like to know if you have or have not such neighbouring missing data points. And why do you prefer interpolation in only one axis? I can help with the code, but I'd prefer to do it the following way: You write the code, and when you're stuck, seriously, you write back to the list. I'm sure I could do the code, but 1) it might (might?) save me time, 2) You might profit from doing it yourself :-) Would you mind putting the code online in a github repo? Might well be that I sometimes run across a similar problem. Considering your masking question, I would keep the mask array separate, but this is rather because I'm not familiar with masked arrays. Another thing which comes into my mind would be to rewrite or write a new interp() which takes care of masked entries, but it would be quite an amount of work for me (I'm not familiar with the C interior of numpy either). And it would be restricted to one dimension only. If you can please give more detail on you data, where it comes from etc. Friedrich From totonixsame at gmail.com Thu Dec 2 07:35:38 2010 From: totonixsame at gmail.com (totonixsame at gmail.com) Date: Thu, 2 Dec 2010 10:35:38 -0200 Subject: [Numpy-discussion] Threshold Message-ID: Hi all, I' m developing a medical software named InVesalius [1], it is a free software. It uses numpy arrays to store the medical images (CT and MRI) and the mask, the mask is used to mark the region of interest and to create 3D surfaces. Those array generally have 512x512 elements. The mask is created based in threshold, with lower and upper bound, this way: mask = numpy.zeros(medical_image.shape, dtype="uint16") mask[ numpy.logical_and( medical_image >= lower, medical_image <= upper)] = 255 Where lower and upper are the threshold bounds. Here I' m marking the array positions where medical_image is between the threshold bounds with 255, where isn' t with 0. The question is: Is there a better way to do that? Thank! [1] - svn.softwarepublico.gov.br/trac/invesalius From zachary.pincus at yale.edu Thu Dec 2 08:14:03 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 2 Dec 2010 08:14:03 -0500 Subject: [Numpy-discussion] Threshold In-Reply-To: References: Message-ID: <5CFD7508-E92B-47B2-96A1-38FBBC235DFE@yale.edu> > mask = numpy.zeros(medical_image.shape, dtype="uint16") > mask[ numpy.logical_and( medical_image >= lower, medical_image <= > upper)] = 255 > > Where lower and upper are the threshold bounds. Here I' m marking the > array positions where medical_image is between the threshold bounds > with 255, where isn' t with 0. The question is: Is there a better > way to do that? This will give you a True/False boolean mask: mask = numpy.logical_and( medical_image >= lower, medical_image <= upper) And this a 0/255 mask: mask = 255*numpy.logical_and( medical_image >= lower, medical_image <= upper) You can make the code a bit more terse/idiomatic by using the bitwise operators, which do logical operations on boolean arrays: mask = 255*((medical_image >= lower) & (medical_image <= upper)) Though this is a bit annoying as the bitwise ops (& | ^ ~) have higher precedence than the comparison ops (< <= > >=), so you need to parenthesize carefully, as above. 
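A small illustration of why the parentheses matter (the array and the bounds here are arbitrary):

import numpy as np

img = np.array([50, 120, 200], dtype=np.int16)
lower, upper = 100, 180

mask = 255 * ((img >= lower) & (img <= upper))  # gives [0, 255, 0]
# 255 * (img >= lower & img <= upper) raises ValueError: & binds tighter
# than >=, leaving an ambiguous chained comparison on arrays.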
Zach On Dec 2, 2010, at 7:35 AM, totonixsame at gmail.com wrote: > Hi all, > > I' m developing a medical software named InVesalius [1], it is a free > software. It uses numpy arrays to store the medical images (CT and > MRI) and the mask, the mask is used to mark the region of interest and > to create 3D surfaces. Those array generally have 512x512 elements. > The mask is created based in threshold, with lower and upper bound, > this way: > > mask = numpy.zeros(medical_image.shape, dtype="uint16") > mask[ numpy.logical_and( medical_image >= lower, medical_image <= > upper)] = 255 > > Where lower and upper are the threshold bounds. Here I' m marking the > array positions where medical_image is between the threshold bounds > with 255, where isn' t with 0. The question is: Is there a better way > to do that? > > Thank! > > [1] - svn.softwarepublico.gov.br/trac/invesalius > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From totonixsame at gmail.com Thu Dec 2 09:06:02 2010 From: totonixsame at gmail.com (totonixsame at gmail.com) Date: Thu, 2 Dec 2010 12:06:02 -0200 Subject: [Numpy-discussion] Threshold In-Reply-To: <5CFD7508-E92B-47B2-96A1-38FBBC235DFE@yale.edu> References: <5CFD7508-E92B-47B2-96A1-38FBBC235DFE@yale.edu> Message-ID: On Thu, Dec 2, 2010 at 11:14 AM, Zachary Pincus wrote: >> mask = numpy.zeros(medical_image.shape, dtype="uint16") >> mask[ numpy.logical_and( medical_image >= lower, medical_image <= >> upper)] = 255 >> >> Where lower and upper are the threshold bounds. Here I' m marking the >> array positions where medical_image is between the threshold bounds >> with 255, where isn' t with 0. The question is: Is there a better >> way to do that? > > This will give you a True/False boolean mask: > mask = numpy.logical_and( medical_image >= lower, medical_image <= > upper) > > And this a 0/255 mask: > mask = 255*numpy.logical_and( medical_image >= lower, medical_image <= > upper) > > You can make the code a bit more terse/idiomatic by using the bitwise > operators, which do logical operations on boolean arrays: > mask = 255*((medical_image >= lower) & (medical_image <= upper)) > > Though this is a bit annoying as the bitwise ops (& | ^ ~) have higher > precedence than the comparison ops (< <= > >=), so you need to > parenthesize carefully, as above. > > Zach Thanks, Zach! I stayed with the last one. From kbasye1 at jhu.edu Thu Dec 2 12:17:07 2010 From: kbasye1 at jhu.edu (Ken Basye) Date: Thu, 02 Dec 2010 12:17:07 -0500 Subject: [Numpy-discussion] printoption to allow hexified floats? In-Reply-To: References: Message-ID: <4CF7D493.3060802@jhu.edu> Thanks for the replies. Robert is right; many numerical operations, particularly complex ones, generate different values across platforms, and we deal with these by storing the values from some platform as a reference and using allclose(), which requires extra work. But many basic operations generate the same underlying values on IEEE 754-compliant platforms but don't always format floats consistently (see http://bugs.python.org/issue1580 for a lengthy discussion on this). My impression is that Python 2.7 does a better job here, but at this point a lot of differences also crop up between 2.6 (or less) and 2.7 due to the changed formatting built into 2.7, and these are the result of formatting differences; the numbers themselves are identical (in our experience so far, at any rate). 
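A concrete example of that formatting change (the stored double is bit-identical in both cases, and the hex form does not depend on the Python version):

x = 0.1
# Python 2.6:  repr(x) -> '0.10000000000000001'
# Python 2.7:  repr(x) -> '0.1'   (new shortest round-tripping repr)
# both:        x.hex() -> '0x1.999999999999ap-4'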
This is a current pain-point which an exact representation would alleviate. In response to David, we haven't implemented a separate print; we rely on the Numpy repr/str for ndarrays and the printoptions that allow some control over float formatting. I'm basically proposing to add a bit more control there. And thanks for the info on supported versions of Python. Ken On 12/2/10 8:14 AM, Robert Kern wrote: > On Wed, Dec 1, 2010 at 13:18, Ken Basye wrote: >> Hi Numpy folks, >> ? ? When working with floats, I prefer to have exact string >> representations in doctests and other reference-based testing; I find it >> helps a lot to avoid chasing cross-platform differences that are really >> about the string conversion rather than about numerical differences. > Unfortunately, there are still cross-platform numerical differences > that are real (but are irrelevant to the validity of the code under > test). Hex-printing for floats only helps a little to make doctests > useful for numerical code. From charlesr.harris at gmail.com Thu Dec 2 12:20:27 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 2 Dec 2010 10:20:27 -0700 Subject: [Numpy-discussion] Float16 and PEP 3118 Message-ID: Hi Folks, Now that the float16 type is in I was wondering if we should do anything to support it in the PEP 3118 buffer interface. This would probably affect the Cython folks as well as the people working on fixing up the structure module for Python 3.x. There is a fairly long thread about the latter and it also looks like what the Python folks are doing with structure alignment isn't going to be compatible with Numpy structured arrays. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu Dec 2 12:41:25 2010 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 2 Dec 2010 11:41:25 -0600 Subject: [Numpy-discussion] printoption to allow hexified floats? In-Reply-To: <4CF7D493.3060802@jhu.edu> References: <4CF7D493.3060802@jhu.edu> Message-ID: On Thu, Dec 2, 2010 at 11:17 AM, Ken Basye wrote: > Thanks for the replies. > > Robert is right; many numerical operations, particularly complex ones, > generate different values across platforms, and we deal with these by > storing the values from some platform as a reference and using > allclose(), which requires extra work. But many basic operations > generate the same underlying values on IEEE 754-compliant platforms but > don't always format floats consistently (see > http://bugs.python.org/issue1580 for a lengthy discussion on this). My > impression is that Python 2.7 does a better job here, but at this point > a lot of differences also crop up between 2.6 (or less) and 2.7 due to > the changed formatting built into 2.7, and these are the result of > formatting differences; the numbers themselves are identical (in our > experience so far, at any rate). This is a current pain-point which an > exact representation would alleviate. > > In response to David, we haven't implemented a separate print; we rely > on the Numpy repr/str for ndarrays and the printoptions that allow some > control over float formatting. I'm basically proposing to add a bit > more control there. And thanks for the info on supported versions of > Python. > > Ken > > Another approach to consider is to save the numerical data in a platform-independent standard file format (maybe like netcdf?). 
While this isn't a fool-proof approach because the calculations themselves may introduce differences that are platform dependent, this at least puts strong controls on one aspect of the overall problem. One caveat that does come across my mind is if the save/load process for the file might have some platform-dependent differences based on the compression/decompression schemes. For example, the GRIB file format does a compression where the mean value and the differences from those means are stored. Calculations like these might introduce some slight differences on various platforms. Just food for thought, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Dec 2 13:16:47 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 2 Dec 2010 18:16:47 +0000 (UTC) Subject: [Numpy-discussion] Float16 and PEP 3118 References: Message-ID: Thu, 02 Dec 2010 10:20:27 -0700, Charles R Harris wrote: > Now that the float16 type is in I was wondering if we should do anything > to support it in the PEP 3118 buffer interface. This would probably > affect the Cython folks as well as the people working on fixing up the > structure module for Python 3.x. Before introducing a PEP 3118 type code for half floats in the PEP, one would need to argue the Python people to add it to the struct module. Before that, the choices probably are: - refuse to export buffers containing half floats - export half floats as two bytes > There is a fairly long thread about the latter and it also looks like > what the Python folks are doing with structure alignment isn't going to > be compatible with Numpy structured arrays. Thoughts? I think it would be useful for the Python people to have feedback from us here. AFAIK, the native-aligned mode that was discussed there is compatible with what dtype(..., align=True) produces: Numpy aligns structs as given by the maximum alignment of its fields. -- Pauli Virtanen From pearu.peterson at gmail.com Thu Dec 2 15:52:49 2010 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Thu, 2 Dec 2010 22:52:49 +0200 Subject: [Numpy-discussion] Pushing changes to numpy git repo problem Message-ID: Hi, I have followed Development workflow instructions in http://docs.scipy.org/doc/numpy/dev/gitwash/ but I am having a problem with the last step: $ git push upstream ticket1679:master fatal: remote error: You can't push to git://github.com/numpy/numpy.git Use git at github.com:numpy/numpy.git What I am doing wrong? 
Here's some additional info: $ git remote -v show origin git at github.com:pearu/numpy.git (fetch) origin git at github.com:pearu/numpy.git (push) upstream git://github.com/numpy/numpy.git (fetch) upstream git://github.com/numpy/numpy.git (push) $ git branch -a master * ticket1679 remotes/origin/HEAD -> origin/master remotes/origin/maintenance/1.0.3.x remotes/origin/maintenance/1.1.x remotes/origin/maintenance/1.2.x remotes/origin/maintenance/1.3.x remotes/origin/maintenance/1.4.x remotes/origin/maintenance/1.5.x remotes/origin/master remotes/origin/ticket1679 remotes/upstream/maintenance/1.0.3.x remotes/upstream/maintenance/1.1.x remotes/upstream/maintenance/1.2.x remotes/upstream/maintenance/1.3.x remotes/upstream/maintenance/1.4.x remotes/upstream/maintenance/1.5.x remotes/upstream/master Thanks, Pearu From fperez.net at gmail.com Thu Dec 2 16:07:14 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 2 Dec 2010 13:07:14 -0800 Subject: [Numpy-discussion] Pushing changes to numpy git repo problem In-Reply-To: References: Message-ID: On Thu, Dec 2, 2010 at 12:52 PM, Pearu Peterson wrote: > > What I am doing wrong? > > Here's some additional info: > $ git remote -v show > origin ?git at github.com:pearu/numpy.git (fetch) > origin ?git at github.com:pearu/numpy.git (push) > upstream ? ? ? ?git://github.com/numpy/numpy.git (fetch) > upstream ? ? ? ?git://github.com/numpy/numpy.git (push) The git:// protocol is read-only, for write access you need ssh access. Just edit your /path-to-repo/.git/config file and change the git://github.com/numpy/numpy.git lines for git at github.com:numpy/numpy.git in the upstream description. That should be sufficient. Regards, f From charlesr.harris at gmail.com Thu Dec 2 16:08:10 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 2 Dec 2010 14:08:10 -0700 Subject: [Numpy-discussion] Pushing changes to numpy git repo problem In-Reply-To: References: Message-ID: On Thu, Dec 2, 2010 at 1:52 PM, Pearu Peterson wrote: > Hi, > > I have followed Development workflow instructions in > > http://docs.scipy.org/doc/numpy/dev/gitwash/ > > but I am having a problem with the last step: > > $ git push upstream ticket1679:master > fatal: remote error: > You can't push to git://github.com/numpy/numpy.git > Use git at github.com:numpy/numpy.git > > Do what the message says, the first address is readonly. You can change the settings in .git/config, mine looks like [core] repositoryformatversion = 0 filemode = true bare = false logallrefupdates = true [remote "origin"] fetch = +refs/heads/*:refs/remotes/origin/* url = git at github.com:charris/numpy [branch "master"] remote = origin merge = refs/heads/master [remote "upstream"] url = git at github.com:numpy/numpy fetch = +refs/heads/*:refs/remotes/upstream/* [alias] mb = merge --no-ff Where upstream is the numpy repository. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pearu.peterson at gmail.com Thu Dec 2 16:14:05 2010 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Thu, 2 Dec 2010 23:14:05 +0200 Subject: [Numpy-discussion] Pushing changes to numpy git repo problem In-Reply-To: References: Message-ID: Thanks! 
Pearu On Thu, Dec 2, 2010 at 11:08 PM, Charles R Harris wrote: > > > On Thu, Dec 2, 2010 at 1:52 PM, Pearu Peterson > wrote: >> >> Hi, >> >> I have followed Development workflow instructions in >> >> ?http://docs.scipy.org/doc/numpy/dev/gitwash/ >> >> but I am having a problem with the last step: >> >> $ git push upstream ticket1679:master >> fatal: remote error: >> ?You can't push to git://github.com/numpy/numpy.git >> ?Use git at github.com:numpy/numpy.git >> > > Do what the message says, the first address is readonly. You can change the > settings in .git/config, mine looks like > > [core] > ??????? repositoryformatversion = 0 > ??????? filemode = true > ??????? bare = false > ??????? logallrefupdates = true > [remote "origin"] > ??????? fetch = +refs/heads/*:refs/remotes/origin/* > ??????? url = git at github.com:charris/numpy > [branch "master"] > ??????? remote = origin > ??????? merge = refs/heads/master > [remote "upstream"] > ??????? url = git at github.com:numpy/numpy > ??????? fetch = +refs/heads/*:refs/remotes/upstream/* > [alias] > ??????? mb = merge --no-ff > > Where upstream is the numpy repository. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From nwagner at iam.uni-stuttgart.de Fri Dec 3 02:29:34 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 03 Dec 2010 08:29:34 +0100 Subject: [Numpy-discussion] numpy.test() Program received signal SIGABRT, Aborted. Message-ID: Hi all, I have installed the latest version of numpy. >>> numpy.__version__ '2.0.0.dev-6aacc2d' numpy.test(verbose=2) received signal SIGABRT. test_cdouble_2 (test_linalg.TestEig) ... ok test_csingle (test_linalg.TestEig) ... FAIL *** glibc detected *** /data/home/nwagner/local/bin/python: free(): invalid next size (fast): 0x000000001c2887b0 *** ======= Backtrace: ========= /lib64/libc.so.6[0x383cc71684] /lib64/libc.so.6(cfree+0x8c)[0x383cc74ccc] /data/home/nwagner/local/lib/python2.5/site-packages/numpy/core/multiarray.so[0x2b33e06f710e] (gdb) bt #0 0x000000383cc30155 in raise () from /lib64/libc.so.6 #1 0x000000383cc31bf0 in abort () from /lib64/libc.so.6 #2 0x000000383cc6a3db in __libc_message () from /lib64/libc.so.6 #3 0x000000383cc71684 in _int_free () from /lib64/libc.so.6 #4 0x000000383cc74ccc in free () from /lib64/libc.so.6 #5 0x00002b33e06f710e in array_dealloc (self=0x1c65fa00) at numpy/core/src/multiarray/arrayobject.c:209 #6 0x00000000004d6dbb in frame_dealloc (f=0x1c65eec0) at Objects/frameobject.c:416 Nils From charlesr.harris at gmail.com Fri Dec 3 02:42:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 3 Dec 2010 00:42:16 -0700 Subject: [Numpy-discussion] numpy.test() Program received signal SIGABRT, Aborted. In-Reply-To: References: Message-ID: On Fri, Dec 3, 2010 at 12:29 AM, Nils Wagner wrote: > Hi all, > > I have installed the latest version of numpy. > > >>> numpy.__version__ > '2.0.0.dev-6aacc2d' > > I don't see that here or on the buildbots. There was a problem with segfaults that was fixed in commit c0e1c0000f27b55dfd5aCan you check that your installation is clean, etc. Also, what platform are you running on? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
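A quick way to confirm which NumPy actually gets imported after such a rebuild (a routine sanity check, not something from this thread):

import numpy
print numpy.__version__   # should be the freshly built dev version
print numpy.__file__      # should point at the new install, not a stale copy
numpy.test(verbose=2)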
URL: From nwagner at iam.uni-stuttgart.de Fri Dec 3 02:47:32 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 03 Dec 2010 08:47:32 +0100 Subject: [Numpy-discussion] numpy.test() Program received signal SIGABRT, Aborted. In-Reply-To: References: Message-ID: On Fri, 3 Dec 2010 00:42:16 -0700 Charles R Harris wrote: > On Fri, Dec 3, 2010 at 12:29 AM, Nils Wagner > wrote: > >> Hi all, >> >> I have installed the latest version of numpy. >> >> >>> numpy.__version__ >> '2.0.0.dev-6aacc2d' >> >> > > I don't see that here or on the buildbots. There was a >problem with > segfaults that was fixed in commit > c0e1c0000f27b55dfd5aCan > you check that your installation is clean, etc. Also, >what platform > are > you running on? I have removed the build directory. Is it also neccessary to remove numpy in thr installation directory ? /data/home/nwagner/local/lib/python2.5/site-packages/ Platform 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux Nils From nwagner at iam.uni-stuttgart.de Fri Dec 3 02:56:02 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 03 Dec 2010 08:56:02 +0100 Subject: [Numpy-discussion] numpy.test() Program received signal SIGABRT, Aborted. In-Reply-To: References: Message-ID: On Fri, 03 Dec 2010 08:47:32 +0100 "Nils Wagner" wrote: > On Fri, 3 Dec 2010 00:42:16 -0700 > Charles R Harris wrote: >> On Fri, Dec 3, 2010 at 12:29 AM, Nils Wagner >> wrote: >> >>> Hi all, >>> >>> I have installed the latest version of numpy. >>> >>> >>> numpy.__version__ >>> '2.0.0.dev-6aacc2d' >>> >>> >> >> I don't see that here or on the buildbots. There was a >>problem with >> segfaults that was fixed in commit >> c0e1c0000f27b55dfd5aCan >> you check that your installation is clean, etc. Also, >>what platform >> are >> you running on? > > I have removed the build directory. > Is it also neccessary to remove numpy in thr >installation > directory ? > > /data/home/nwagner/local/lib/python2.5/site-packages/ > > Platform > > 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 > x86_64 x86_64 GNU/Linux > > > Nils > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I have also removed the numpy directory within /data/home/nwagner/local/lib/python2.5/site-packages/. Now all tests pass. Ran 3080 tests in 12.288s OK (KNOWNFAIL=4, SKIP=1) How is the build process implemented on the build bots ? Nils From oc-spam66 at laposte.net Fri Dec 3 06:29:49 2010 From: oc-spam66 at laposte.net (oc-spam66) Date: Fri, 03 Dec 2010 12:29:49 +0100 Subject: [Numpy-discussion] numpy.r_[True, False] is not a boolean array Message-ID: <4CF8D4AD.801@laposte.net> Hello, I observe the following behavior: numpy.r_[True, False] -> array([1, 0], dtype=int8) numpy.r_[True] -> array([ True], dtype=bool) I would expect the first line to give a boolean array: array([ True, False], dtype=bool) Is it normal? Is it a bug? -- O.C. numpy.__version__ = '1.4.1' From josef.pktd at gmail.com Fri Dec 3 06:48:10 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Dec 2010 06:48:10 -0500 Subject: [Numpy-discussion] numpy.r_[True, False] is not a boolean array In-Reply-To: <4CF8D4AD.801@laposte.net> References: <4CF8D4AD.801@laposte.net> Message-ID: On Fri, Dec 3, 2010 at 6:29 AM, oc-spam66 wrote: > Hello, > > I observe the following behavior: > > numpy.r_[True, False] ? -> array([1, 0], dtype=int8) > numpy.r_[True] ? ? ? ? 
?-> array([ True], dtype=bool) and >>> np.r_[[True], [False]] array([ True, False], dtype=bool) >>> np.r_[[True, False]] array([ True, False], dtype=bool) > > I would expect the first line to give a boolean array: > array([ True, False], dtype=bool) > > Is it normal? Is it a bug? Looks like a bug to me. Josef > > -- > O.C. > numpy.__version__ = '1.4.1' > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From moura.mario at gmail.com Fri Dec 3 07:31:20 2010 From: moura.mario at gmail.com (Mario Moura) Date: Fri, 3 Dec 2010 10:31:20 -0200 Subject: [Numpy-discussion] itertools.combinations to numpy Message-ID: Hi Folks I have this situation >>> from timeit import Timer >>> reps = 5 >>> >>> t = Timer('itertools.combinations(range(1,10),3)', 'import itertools') >>> print sum(t.repeat(repeat=reps, number=1)) / reps 1.59740447998e-05 >>> t = Timer('itertools.combinations(range(1,100),3)', 'import itertools') >>> print sum(t.repeat(repeat=reps, number=1)) / reps 1.74999237061e-05 >>> >>> t = Timer('list(itertools.combinations(range(1,10),3))', 'import itertools') >>> print sum(t.repeat(repeat=reps, number=1)) / reps 5.31673431396e-05 >>> t = Timer('list(itertools.combinations(range(1,100),3))', 'import itertools') >>> print sum(t.repeat(repeat=reps, number=1)) / reps 0.0556231498718 >>> You can see list(itertools.combinations(range(1,100),3)) is terrible!! If you change to range(1,100000) your computer will lock. So I would like to know a good way to convert to ndarray? fast! without use list Is it possible? >>> x = itertools.combinations(range(1,10),3) >>> x >>> I tried this from http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations >>> numpy.fromiter(itertools.combinations(range(1,10),3), int, count=-1) Traceback (most recent call last): File "", line 1, in ValueError: setting an array element with a sequence. >>> and this from http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations import numpy from itertools import * from numpy import * def combinations(iterable, r): pool = tuple(iterable) n = len(pool) for indices in permutations(range(n), r): if sorted(indices) == list(indices): yield tuple(pool[i] for i in indices) numpy.fromiter(combinations(range(1,10),3), int, count=-1) >>> numpy.fromiter(combinations(range(1,10),3), int, count=-1) Traceback (most recent call last): File "", line 1, in ValueError: setting an array element with a sequence. >>> I like itertools.combinations performance but I need convert it to numpy. Best Regards mario From sturla at molden.no Fri Dec 3 07:32:46 2010 From: sturla at molden.no (Sturla Molden) Date: Fri, 3 Dec 2010 13:32:46 +0100 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: Message-ID: <1e1a8219245e238092b40c2656799f27.squirrel@webmail.uio.no> > if ndim == 1: > stride = a.strides[0] // itemsize # convert stride bytes --> items Oh, did I really do this in selectmodule.pyx? :( That is clearly an error. I don't have time to fix it now. Sturla From sturla at molden.no Fri Dec 3 07:42:50 2010 From: sturla at molden.no (Sturla Molden) Date: Fri, 3 Dec 2010 13:42:50 +0100 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: Message-ID: <24f90a063c03155ad1952f7ee25191d9.squirrel@webmail.uio.no> > It's hard to write Cython code that can handle all dtypes and > arbitrary number of dimensions. 
The former is typically dealt with > using templates, but what do people do about the latter? There are number of ways to do it. NumPy's C API has an iterator that returns an axis on demand. Mine just collects an array with pointers to the first element in each axis. The latter is more friendly to parallelization (OpenMP or Python threads with released GIL), which is why I wrote it, otherwise it has no advantage over NumPy's. Sturla From warren.weckesser at enthought.com Fri Dec 3 09:50:57 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 3 Dec 2010 08:50:57 -0600 Subject: [Numpy-discussion] itertools.combinations to numpy In-Reply-To: References: Message-ID: On Fri, Dec 3, 2010 at 6:31 AM, Mario Moura wrote: > Hi Folks > > I have this situation > > >>> from timeit import Timer > >>> reps = 5 > >>> > >>> t = Timer('itertools.combinations(range(1,10),3)', 'import itertools') > >>> print sum(t.repeat(repeat=reps, number=1)) / reps > 1.59740447998e-05 > >>> t = Timer('itertools.combinations(range(1,100),3)', 'import itertools') > >>> print sum(t.repeat(repeat=reps, number=1)) / reps > 1.74999237061e-05 > >>> > >>> t = Timer('list(itertools.combinations(range(1,10),3))', 'import > itertools') > >>> print sum(t.repeat(repeat=reps, number=1)) / reps > 5.31673431396e-05 > >>> t = Timer('list(itertools.combinations(range(1,100),3))', 'import > itertools') > >>> print sum(t.repeat(repeat=reps, number=1)) / reps > 0.0556231498718 > >>> > > You can see list(itertools.combinations(range(1,100),3)) is terrible!! > > If you change to range(1,100000) your computer will lock. > > So I would like to know a good way to convert object> to ndarray? fast! without use list > Is it possible? > > >>> x = itertools.combinations(range(1,10),3) > >>> x > > >>> > > I tried this from > > http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations > > >>> numpy.fromiter(itertools.combinations(range(1,10),3), int, count=-1) > Traceback (most recent call last): > File "", line 1, in > ValueError: setting an array element with a sequence. > >>> > > and this from > > http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations > > import numpy > from itertools import * > from numpy import * > > def combinations(iterable, r): > pool = tuple(iterable) > n = len(pool) > for indices in permutations(range(n), r): > if sorted(indices) == list(indices): > yield tuple(pool[i] for i in indices) > > > numpy.fromiter(combinations(range(1,10),3), int, count=-1) > > >>> numpy.fromiter(combinations(range(1,10),3), int, count=-1) > Traceback (most recent call last): > File "", line 1, in > ValueError: setting an array element with a sequence. > >>> > > > I like itertools.combinations performance but I need convert it to numpy. > > The docstring for numpy.fromiter() says it creates a 1D array. You can use it with itertools.combinations if you specify a dtype for a 1D structured array. 
Here's an example (I'm using ipython with the -pylab option, so the numpy functions have all been imported): In [1]: from itertools import combinations In [2]: dt = dtype('i,i,i') In [3]: a = fromiter(combinations(range(100),3), dtype=dt, count=-1) In [4]: b = array(list(combinations(range(100),3))) In [5]: all(a.view(int).reshape(-1,3) == b) Out[5]: True In [6]: timeit a = fromiter(combinations(range(100),3), dtype=dt, count=-1) 10 loops, best of 3: 92.7 ms per loop In [7]: timeit b = array(list(combinations(range(100),3))) 1 loops, best of 3: 627 ms per loop In [8]: a[:3] Out[8]: array([(0, 1, 2), (0, 1, 3), (0, 1, 4)], dtype=[('f0', ' From fabian.pedregosa at inria.fr Fri Dec 3 10:24:49 2010 From: fabian.pedregosa at inria.fr (Fabian Pedregosa) Date: Fri, 3 Dec 2010 16:24:49 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports Message-ID: Hi all. Macports installs gfortran as part of the gcc package, but names it gfortran-mp-$version, without providing a symbolic link to a default gcfortran executable, and thus numpy.distutils is unable to find the right executable. The attached patch very simple, it just extends possible_executables with those names, but makes the build of scipy work without having to restore to obscure fc_config flags. Fabian. -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-FIX-recognize-macports-gfortran-compiler.patch Type: application/octet-stream Size: 1165 bytes Desc: not available URL: From charlesr.harris at gmail.com Fri Dec 3 11:00:45 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 3 Dec 2010 09:00:45 -0700 Subject: [Numpy-discussion] numpy.test() Program received signal SIGABRT, Aborted. In-Reply-To: References: Message-ID: On Fri, Dec 3, 2010 at 12:56 AM, Nils Wagner wrote: > On Fri, 03 Dec 2010 08:47:32 +0100 > "Nils Wagner" wrote: > > On Fri, 3 Dec 2010 00:42:16 -0700 > > Charles R Harris wrote: > >> On Fri, Dec 3, 2010 at 12:29 AM, Nils Wagner > >> wrote: > >> > >>> Hi all, > >>> > >>> I have installed the latest version of numpy. > >>> > >>> >>> numpy.__version__ > >>> '2.0.0.dev-6aacc2d' > >>> > >>> > >> > >> I don't see that here or on the buildbots. There was a > >>problem with > >> segfaults that was fixed in commit > >> c0e1c0000f27b55dfd5a< > https://github.com/numpy/numpy/commit/c0e1c0000f27b55dfd5aa4b1674a8c1b6ac38c36 > >Can > >> you check that your installation is clean, etc. Also, > >>what platform > >> are > >> you running on? > > > > I have removed the build directory. > > Is it also neccessary to remove numpy in thr > >installation > > directory ? > > > > /data/home/nwagner/local/lib/python2.5/site-packages/ > > > > Platform > > > > 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 > > x86_64 x86_64 GNU/Linux > > > > > > Nils > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > I have also removed the numpy directory within > /data/home/nwagner/local/lib/python2.5/site-packages/. > Now all tests pass. > Ran 3080 tests in 12.288s > > OK (KNOWNFAIL=4, SKIP=1) > > > Great. > How is the build process implemented on the build bots ? > > I don't know the details, but it looks to me like they do a clean checkout and fresh install. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Fri Dec 3 11:02:21 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 3 Dec 2010 09:02:21 -0700 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: Message-ID: Hi Fabian, On Fri, Dec 3, 2010 at 8:24 AM, Fabian Pedregosa wrote: > Hi all. > > Macports installs gfortran as part of the gcc package, but names it > gfortran-mp-$version, without providing a symbolic link to a default > gcfortran executable, and thus numpy.distutils is unable to find the > right executable. > > The attached patch very simple, it just extends possible_executables > with those names, but makes the build of scipy work without having to > restore to obscure fc_config flags. > > Can you open a ticket for this so it doesn't get lost? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From moura.mario at gmail.com Fri Dec 3 11:14:04 2010 From: moura.mario at gmail.com (Mario Moura) Date: Fri, 3 Dec 2010 14:14:04 -0200 Subject: [Numpy-discussion] itertools.combinations to numpy In-Reply-To: References: Message-ID: Hi Mr. Weckesser Thanks a lot! Works fine! Regards Mario 2010/12/3 Warren Weckesser : > > > On Fri, Dec 3, 2010 at 6:31 AM, Mario Moura wrote: >> >> Hi Folks >> >> I have this situation >> >> >>> from timeit import Timer >> >>> reps = 5 >> >>> >> >>> t = Timer('itertools.combinations(range(1,10),3)', 'import itertools') >> >>> print sum(t.repeat(repeat=reps, number=1)) / reps >> 1.59740447998e-05 >> >>> t = Timer('itertools.combinations(range(1,100),3)', 'import >> >>> itertools') >> >>> print sum(t.repeat(repeat=reps, number=1)) / reps >> 1.74999237061e-05 >> >>> >> >>> t = Timer('list(itertools.combinations(range(1,10),3))', 'import >> >>> itertools') >> >>> print sum(t.repeat(repeat=reps, number=1)) / reps >> 5.31673431396e-05 >> >>> t = Timer('list(itertools.combinations(range(1,100),3))', 'import >> >>> itertools') >> >>> print sum(t.repeat(repeat=reps, number=1)) / reps >> 0.0556231498718 >> >>> >> >> You can see list(itertools.combinations(range(1,100),3)) is terrible!! >> >> If you change to range(1,100000) your computer will lock. >> >> So I would like to know a good way to convert > object> to ndarray? fast! without use list >> Is it possible? >> >> >>> x = itertools.combinations(range(1,10),3) >> >>> x >> >> >>> >> >> I tried this from >> >> http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations >> >> >>> numpy.fromiter(itertools.combinations(range(1,10),3), int, count=-1) >> Traceback (most recent call last): >> ?File "", line 1, in >> ValueError: setting an array element with a sequence. >> >>> >> >> and this from >> >> http://docs.python.org/library/itertools.html?highlight=itertools#itertools.combinations >> >> import numpy >> from itertools import * >> from numpy import * >> >> def combinations(iterable, r): >> ? ?pool = tuple(iterable) >> ? ?n = len(pool) >> ? ?for indices in permutations(range(n), r): >> ? ? ? ?if sorted(indices) == list(indices): >> ? ? ? ? ? ?yield tuple(pool[i] for i in indices) >> >> >> numpy.fromiter(combinations(range(1,10),3), int, count=-1) >> >> >>> numpy.fromiter(combinations(range(1,10),3), int, count=-1) >> Traceback (most recent call last): >> ?File "", line 1, in >> ValueError: setting an array element with a sequence. >> >>> >> >> >> I like itertools.combinations performance but I need convert it to numpy. >> > > > The docstring for numpy.fromiter() says it creates a 1D array.? 
You can use > it with itertools.combinations if you specify a dtype for a 1D? structured > array.? Here's an example (I'm using ipython with the -pylab option, so the > numpy functions have all been imported): > > > In [1]: from itertools import combinations > > In [2]: dt = dtype('i,i,i') > > In [3]: a = fromiter(combinations(range(100),3), dtype=dt, count=-1) > > In [4]: b = array(list(combinations(range(100),3))) > > In [5]: all(a.view(int).reshape(-1,3) == b) > Out[5]: True > > In [6]: timeit a = fromiter(combinations(range(100),3), dtype=dt, count=-1) > 10 loops, best of 3: 92.7 ms per loop > > In [7]: timeit b = array(list(combinations(range(100),3))) > 1 loops, best of 3: 627 ms per loop > > In [8]: a[:3] > Out[8]: > array([(0, 1, 2), (0, 1, 3), (0, 1, 4)], > ????? dtype=[('f0', ' > In [9]: b[:3] > Out[9]: > array([[0, 1, 2], > ?????? [0, 1, 3], > ?????? [0, 1, 4]]) > > > In the above example, 'a' is a 1D structured array; each element of 'a' > holds one of the combinations.? If you need it, you can create a 2D view > with a.view(int).reshape(-1,3). > > Warren > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From mwwiebe at gmail.com Fri Dec 3 12:23:15 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 3 Dec 2010 09:23:15 -0800 Subject: [Numpy-discussion] Float16 and PEP 3118 In-Reply-To: References: Message-ID: On Thu, Dec 2, 2010 at 10:16 AM, Pauli Virtanen wrote: > Before introducing a PEP 3118 type code for half floats in the PEP, one > would need to argue the Python people to add it to the struct module. > > Before that, the choices probably are: > > - refuse to export buffers containing half floats > I think this is the better option, code that needs to do this can create an int16 view for the time being. > - export half floats as two bytes > This would throw away the byte-order, a problem much harder to track down for the user than the other option. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Dec 3 12:50:35 2010 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 3 Dec 2010 17:50:35 +0000 (UTC) Subject: [Numpy-discussion] Float16 and PEP 3118 References: Message-ID: Fri, 03 Dec 2010 09:23:15 -0800, Mark Wiebe wrote: [clip] >> - refuse to export buffers containing half floats > > I think this is the better option, code that needs to do this can create > an int16 view for the time being. That's also easier to implement -- no changes are needed :) Pauli From paul.anton.letnes at gmail.com Sat Dec 4 04:00:42 2010 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sat, 4 Dec 2010 10:00:42 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: Message-ID: On 3. des. 2010, at 16.24, Fabian Pedregosa wrote: > Hi all. > > Macports installs gfortran as part of the gcc package, but names it > gfortran-mp-$version, without providing a symbolic link to a default > gcfortran executable, and thus numpy.distutils is unable to find the > right executable. > > The attached patch very simple, it just extends possible_executables > with those names, but makes the build of scipy work without having to > restore to obscure fc_config flags. > > Fabian. 
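(A rough illustration for readers who do not have the attachment: numpy.distutils compiler classes keep a possible_executables list of candidate program names, and the patch being described simply adds the MacPorts-style names to that list. The file and the exact version names below are illustrative guesses, not the contents of the actual patch.)

    # Illustrative sketch only -- not the attached patch.
    # E.g. in numpy/distutils/fcompiler/gnu.py, extend the candidate
    # names so MacPorts' gfortran-mp-<version> binaries are also probed:
    possible_executables = ['gfortran', 'f95']
    possible_executables += ['gfortran-mp-4.3', 'gfortran-mp-4.4',
                             'gfortran-mp-4.5']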
> <0001-FIX-recognize-macports-gfortran-compiler.patch>_______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Correct me if I am wrong here: If you run "(sudo) gcc_select gfortran-mp-XY", where XY are the version numbers (e.g. 45 for gfortran 4.5), you should get symbolic links for the selected gcc/gfortran version. I believe that macports should probably make this clearer, and perhaps automatically when you do a "port install gccXY", but I am not sure if this needs any patching? Again, I might be wrong on this. Cheers Paul. From fabian.pedregosa at inria.fr Sat Dec 4 04:25:52 2010 From: fabian.pedregosa at inria.fr (Fabian Pedregosa) Date: Sat, 4 Dec 2010 10:25:52 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> Message-ID: > > Correct me if I am wrong here: If you run "(sudo) gcc_select gfortran-mp-XY", where XY are the version numbers (e.g. 45 for gfortran 4.5), you should get symbolic links for the selected gcc/gfortran version. I believe that macports should probably make this clearer, and perhaps automatically when you do a "port install gccXY", but I am not sure if this needs any patching? Again, I might be wrong on this. Thanks! I didn't know about gcc_select. The correct command is "sudo gcc_select mp-gcc45" which effectively does all the symbolic links for you and works like a charm, so please ignore my previous patch. Cheers, fabian From gael.varoquaux at normalesup.org Sat Dec 4 04:29:11 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 4 Dec 2010 10:29:11 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> Message-ID: <20101204092911.GB30391@phare.normalesup.org> On Sat, Dec 04, 2010 at 10:25:52AM +0100, Fabian Pedregosa wrote: > The correct command is "sudo gcc_select mp-gcc45" which effectively > does all the symbolic links for you and works like a charm, so please > ignore my previous patch. I am not a mac user, so I guess that my opinion is not very educated, but isn't your patch still useful: test if 'gcc' exists, and if not fallback to your patch, so that it still works for the clueless user? My 2 cents, Ga?l From fabian.pedregosa at inria.fr Sat Dec 4 08:47:48 2010 From: fabian.pedregosa at inria.fr (Fabian Pedregosa) Date: Sat, 4 Dec 2010 14:47:48 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> Message-ID: On Sat, Dec 4, 2010 at 10:29 AM, Gael Varoquaux wrote: > On Sat, Dec 04, 2010 at 10:25:52AM +0100, Fabian Pedregosa wrote: >> The correct command is "sudo gcc_select mp-gcc45" which effectively >> does all the symbolic links for you and works like a charm, so please >> ignore my previous patch. > > I am not a mac user, so I guess that my opinion is not very educated, but > isn't your patch still useful: test if 'gcc' exists, and if not fallback > to your patch, so that it still works for the clueless user? 
Indeed, having scipy build out of the box would be nice, but it's not for me to decide if numpy.distutils should overcome these limitations in macports ... On the other hand, as installing on macports is not that trivial, I strongly feel that a subsection 'Macports' should be added to scipy's INSTALL.txt file, where it details needed packages, the gcc_select trick and options needed in site.cfg for umfpack. I'll gladly provide a patch for that if people are OK. Fabian. > > My 2 cents, > > Ga?l > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at googlemail.com Sat Dec 4 09:04:16 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 4 Dec 2010 22:04:16 +0800 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> Message-ID: On Sat, Dec 4, 2010 at 9:47 PM, Fabian Pedregosa wrote: > On Sat, Dec 4, 2010 at 10:29 AM, Gael Varoquaux > wrote: > > On Sat, Dec 04, 2010 at 10:25:52AM +0100, Fabian Pedregosa wrote: > >> The correct command is "sudo gcc_select mp-gcc45" which effectively > >> does all the symbolic links for you and works like a charm, so please > >> ignore my previous patch. > > > > I am not a mac user, so I guess that my opinion is not very educated, but > > isn't your patch still useful: test if 'gcc' exists, and if not fallback > > to your patch, so that it still works for the clueless user? > > Indeed, having scipy build out of the box would be nice, but it's not > for me to decide if numpy.distutils should overcome these limitations > in macports ... > I would prefer to just document the gcc_select solution, since it solves the problem at hand. > > On the other hand, as installing on macports is not that trivial, I > strongly feel that a subsection 'Macports' should be added to scipy's > INSTALL.txt file, where it details needed packages, the gcc_select > trick and options needed in site.cfg for umfpack. I'll gladly provide > a patch for that if people are OK. > The most up-to-date instructions are at http://www.scipy.org/Installing_SciPy/Mac_OS_X, so those should be updated as well. That said, if a "Macports" section is added there should be a strong disclaimer that it is *not* the recommended way to install numpy/scipy. A good portion of the build problems reported on these lists are related to Fortran on OS X, and a specific gfortran build at http://r.research.att.com/tools/ is recommended for a reason. If a user really wants to use Macports some notes in the docs may help, but let's not give the impression that it's a good/default option for a new user. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 4 11:20:37 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 4 Dec 2010 09:20:37 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. Message-ID: Hi Jason, Just wondering if this is temporary or the intention is to change the build process? I also note that the *.h files in libndarray are not complete and a *lot* of trailing whitespace has crept into the files. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From garyfallidis at gmail.com Sat Dec 4 14:00:43 2010
From: garyfallidis at gmail.com (Eleftherios Garyfallidis)
Date: Sat, 4 Dec 2010 19:00:43 +0000
Subject: [Numpy-discussion] Faster than ndindex?
Message-ID: 

Hi guys,

I would like to know if there is any way to make the following operation faster.

def test():
    shape=(200,200,200,3)
    refinds = np.ndindex(shape[:3])
    reftmp=np.zeros(shape)
    for ijk_t in refinds:
        i,j,k = ijk_t
        reftmp[i,j,k,0]=i
        reftmp[i,j,k,1]=j
        reftmp[i,j,k,2]=k

%timeit test()
1 loops, best of 3: 19.5 s per loop

I am using ndindex and then a for loop. Is there a better/faster way?

Thank you,
Eleftherios
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ischnell at enthought.com Sat Dec 4 14:07:54 2010
From: ischnell at enthought.com (Ilan Schnell)
Date: Sat, 4 Dec 2010 13:07:54 -0600
Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process.
In-Reply-To: 
References: 
Message-ID: 

Hello Charles,

it was indeed the intention to change the build process of the core
libndarray to use autoconf. I've tested it on Linux, Mac, Solaris, and
it works very well. libndarray is really a separate project, which only
resides for current development inside the numpy project.
The point > is that you can build libndarray without having a particular Python > installed. The hope is that libndarray becomes used by other projects > which are not Python based, for example: > * a pure C program > * a Perl C extension > * a Ruby C extension > > I thought that autoconf was the obvious choice for doing this, and also > that is cleaner than numpy.distutils. > > So does numpy currently build on top of libndarray or is that something for the future also? It would also be useful for David C. to offer his thoughts on building/packaging, did you consult with him by any chance? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Dec 4 14:52:50 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 4 Dec 2010 19:52:50 +0000 (UTC) Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. References: Message-ID: On Sat, 04 Dec 2010 12:21:15 -0700, Charles R Harris wrote: [clip] > So does numpy currently build on top of libndarray or is that something > for the future also? [clip] It does. If you look how it works, most of the heavy lifting has been moved there, leaving the multiarray module mostly as Python-specific wrappers. -- Pauli Virtanen From ischnell at enthought.com Sat Dec 4 14:59:12 2010 From: ischnell at enthought.com (Ilan Schnell) Date: Sat, 4 Dec 2010 13:59:12 -0600 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: Yes, numpy-refactor builds of top of libndarray. The whole point was that the libndarray is independent of the interface, i.e. the CPython or the IronPython interface, and possibly other (Jython) in the future. Looking at different building/packaging solutions for libndarray, autoconf make things very easy, it's a well established pattern, I'm sure David C. will agree. - Ilan On Sat, Dec 4, 2010 at 1:21 PM, Charles R Harris wrote: > > So does numpy currently build on top of libndarray or is that something for > the future also? It would also be useful for David C. to offer his thoughts > on building/packaging, did you consult with him by any chance? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From garyfallidis at gmail.com Sat Dec 4 15:05:02 2010 From: garyfallidis at gmail.com (Eleftherios Garyfallidis) Date: Sat, 4 Dec 2010 20:05:02 +0000 Subject: [Numpy-discussion] Faster than ndindex? In-Reply-To: References: Message-ID: This is beautiful! Thank you Pauli. On Sat, Dec 4, 2010 at 7:16 PM, Pauli Virtanen wrote: > On Sat, 04 Dec 2010 19:00:43 +0000, Eleftherios Garyfallidis wrote: > [clip] > > I am using ndindex and then a for loop. Is there a better/faster way? > > Yes: > > import numpy as np > from numpy import newaxis > > x = np.zeros((200, 200, 200, 3)) > x[...,0] = np.arange(200)[:,newaxis,newaxis] > x[...,1] = np.arange(200)[newaxis,:,newaxis] > x[...,2] = np.arange(200)[newaxis,newaxis,:] > x[1,3,2] > # -> array([ 1., 3., 2.]) > > Depending on what you use this array for, it's possible that you can > avoid constructing it (and use broadcasting etc. instead). > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
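(A small aside, not from the original exchange: the same index array can also be built in one step with np.indices, which may be convenient when all three coordinate planes are needed at once.)

    import numpy as np

    # np.indices((200, 200, 200)) has shape (3, 200, 200, 200) with
    # idx[0][i,j,k] == i, idx[1][i,j,k] == j, idx[2][i,j,k] == k;
    # moving the first axis to the end reproduces the array above.
    idx = np.indices((200, 200, 200))
    x = np.rollaxis(idx, 0, 4).astype(float)   # shape (200, 200, 200, 3)
    # x[1, 3, 2] -> array([ 1.,  3.,  2.])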
URL: From charlesr.harris at gmail.com Sat Dec 4 15:11:48 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 4 Dec 2010 13:11:48 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Sat, Dec 4, 2010 at 12:59 PM, Ilan Schnell wrote: > Yes, numpy-refactor builds of top of libndarray. The whole point > was that the libndarray is independent of the interface, i.e. the > CPython or the IronPython interface, and possibly other (Jython) > in the future. > Looking at different building/packaging solutions for libndarray, > autoconf make things very easy, it's a well established pattern, > I'm sure David C. will agree. > > I know he has expressed reservations about it on non-posix platforms and some large projects have moved away from it. I'm not saying it isn't the best short term solution so you folks can get on with the job, but it may be that long term we will want to look elsewhere. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 4 15:19:11 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 4 Dec 2010 13:19:11 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Sat, Dec 4, 2010 at 12:52 PM, Pauli Virtanen wrote: > On Sat, 04 Dec 2010 12:21:15 -0700, Charles R Harris wrote: > [clip] > > So does numpy currently build on top of libndarray or is that something > > for the future also? > [clip] > > It does. If you look how it works, most of the heavy lifting has been > moved there, leaving the multiarray module mostly as Python-specific > wrappers. > > Would it unreasonable to move the libndarray stuff to the current master branch of numpy while leaving the rest of things intact? The needed changes to the current core/src could be brought in later. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ischnell at enthought.com Sat Dec 4 15:24:49 2010 From: ischnell at enthought.com (Ilan Schnell) Date: Sat, 4 Dec 2010 14:24:49 -0600 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: I'm not sure how reasonable it would be to move only libndarray into the master, because I've been working on EPD for the last couple of week. But Jason will know how complete libndarray is. - Ilan On Sat, Dec 4, 2010 at 2:19 PM, Charles R Harris wrote: > Would it unreasonable to move the libndarray stuff to the current master > branch of numpy while leaving the rest of things intact? The needed changes > to the current core/src could be brought in later. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Sat Dec 4 15:45:59 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 4 Dec 2010 20:45:59 +0000 (UTC) Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. References: Message-ID: On Sat, 04 Dec 2010 14:24:49 -0600, Ilan Schnell wrote: > I'm not sure how reasonable it would be to move only libndarray into the > master, because I've been working on EPD for the last couple of week. > But Jason will know how complete libndarray is. The main question is whether moving it will make things easier or more difficult, I think. 
It's one tree more to keep track of. In any case, it would be a first part in the merge, and it would split the hunk of changes into two parts. *** Technically, the move could be done like this, so that merge tracking still works: --------refactor--------------- new-refactor / / /--------libndarray----------x / \ start---------------------- master----- new-master -- Pauli Virtanen From dagss at student.matnat.uio.no Sat Dec 4 15:57:31 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sat, 04 Dec 2010 21:57:31 +0100 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: <4CFAAB3B.3060509@student.matnat.uio.no> On 12/04/2010 09:11 PM, Charles R Harris wrote: > > > On Sat, Dec 4, 2010 at 12:59 PM, Ilan Schnell > wrote: > > Yes, numpy-refactor builds of top of libndarray. The whole point > was that the libndarray is independent of the interface, i.e. the > CPython or the IronPython interface, and possibly other (Jython) > in the future. > Looking at different building/packaging solutions for libndarray, > autoconf make things very easy, it's a well established pattern, > I'm sure David C. will agree. > > > > I know he has expressed reservations about it on non-posix platforms > and some large projects have moved away from it. I'm not saying it > isn't the best short term solution so you folks can get on with the > job, but it may be that long term we will want to look elsewhere. Such as perhaps waf for building libndarray, which seems like it will be much easier to make work nicely with Bento etc. than autoconf (again, speaking long-term). Also, it'd be good to avoid a seperate build system for Windows (problem of keeping changes sync-ed with Visual Studio projects etc. etc.). Dag Sverre -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 4 16:01:03 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 4 Dec 2010 14:01:03 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Sat, Dec 4, 2010 at 1:45 PM, Pauli Virtanen wrote: > On Sat, 04 Dec 2010 14:24:49 -0600, Ilan Schnell wrote: > > I'm not sure how reasonable it would be to move only libndarray into the > > master, because I've been working on EPD for the last couple of week. > > But Jason will know how complete libndarray is. > > The main question is whether moving it will make things easier or more > difficult, I think. It's one tree more to keep track of. > > In any case, it would be a first part in the merge, and it would split > the hunk of changes into two parts. > > That would be a good thing IMHO. It would also bring a bit more numpy reality to the refactor and since we are implicitly relying on it for the next release sometime next spring the closer to reality it gets the better. > *** > > Technically, the move could be done like this, so that merge tracking > still works: > > --------refactor--------------- new-refactor > / / > /--------libndarray----------x > / \ > start---------------------- master----- new-master > > Looks good to me. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Sat Dec 4 19:41:17 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 4 Dec 2010 16:41:17 -0800 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. 
In-Reply-To: References: Message-ID: On Sat, Dec 4, 2010 at 12:45 PM, Pauli Virtanen wrote: > > Technically, the move could be done like this, so that merge tracking > still works: > > --------refactor--------------- new-refactor > / / > /--------libndarray----------x > / \ > start---------------------- master----- new-master > Switching to use libndarray is a big ABI+API change, right? If there's an idea to release an ABI-compatible 1.6, wouldn't this end up being more difficult? Maybe I'm misunderstanding this idea. I looked a little bit at the 1.4.0 ABI issue, and if the only blocking problem was the cast[] array in ArrFuncs, I think that can be worked around without too much difficulty. Would people want an ABI-compatible 1.6 release adding date-time and float16? -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.anton.letnes at gmail.com Sun Dec 5 02:58:57 2010 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 5 Dec 2010 08:58:57 +0100 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> Message-ID: Mabe I am wrong somehow, but in my experience the easiest install of scipy is 'port install py26-scipy'. For new users, I do not see why one would recommend to build manually from source? Macports can do it for you, automagically... Paul 4. des.. 2010 15.04 "Ralf Gommers" : On Sat, Dec 4, 2010 at 9:47 PM, Fabian Pedregosa wrote: > > On Sat, Dec ... I would prefer to just document the gcc_select solution, since it solves the problem at hand. > > > On the other hand, as installing on macports is not that trivial, I > strongly feel that a sub... The most up-to-date instructions are at http://www.scipy.org/Installing_SciPy/Mac_OS_X, so those should be updated as well. That said, if a "Macports" section is added there should be a strong disclaimer that it is *not* the recommended way to install numpy/scipy. A good portion of the build problems reported on these lists are related to Fortran on OS X, and a specific gfortran build at http://r.research.att.com/tools/ is recommended for a reason. If a user really wants to use Macports some notes in the docs may help, but let's not give the impression that it's a good/default option for a new user. Cheers, Ralf _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Dec 5 06:28:44 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 5 Dec 2010 19:28:44 +0800 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> Message-ID: On Sun, Dec 5, 2010 at 3:58 PM, Paul Anton Letnes < paul.anton.letnes at gmail.com> wrote: > Mabe I am wrong somehow, but in my experience the easiest install of scipy > is 'port install py26-scipy'. For new users, I do not see why one would > recommend to build manually from source? Macports can do it for you, > automagically... > > Well, by far the easiest method is to just grab a binary installer. The other choices you have are build from source, or try to use Macports/Fink/Homebrew/easy_install/pip/buildout-recipe/. 
Those all rely on source builds as well, they're just hiding the details. Which makes things way more confusing when something goes wrong. About Macports specifically, I haven't tried in a few years but certainly don't remember things always working out of the box. And AFAIK Homebrew is a replacement for Macports for many people because the latter was issues. Cheers, Ralf Paul > > 4. des.. 2010 15.04 "Ralf Gommers" : > > > > On Sat, Dec 4, 2010 at 9:47 PM, Fabian Pedregosa < > fabian.pedregosa at inria.fr> wrote: > > > > On Sat, Dec ... > > I would prefer to just document the gcc_select solution, since it solves > the problem at hand. > > > > > > > On the other hand, as installing on macports is not that trivial, I > > strongly feel that a sub... > > The most up-to-date instructions are at > http://www.scipy.org/Installing_SciPy/Mac_OS_X, so those should be updated > as well. That said, if a "Macports" section is added there should be a > strong disclaimer that it is *not* the recommended way to install > numpy/scipy. A good portion of the build problems reported on these lists > are related to Fortran on OS X, and a specific gfortran build at > http://r.research.att.com/tools/ is recommended for a reason. If a user > really wants to use Macports some notes in the docs may help, but let's not > give the impression that it's a good/default option for a new user. > > Cheers, > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Dec 5 07:10:56 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 5 Dec 2010 20:10:56 +0800 Subject: [Numpy-discussion] ANN: NumPy 1.5.1 In-Reply-To: <4CEAC5EF.1030205@noaa.gov> References: <4CEAC5EF.1030205@noaa.gov> Message-ID: On Tue, Nov 23, 2010 at 3:35 AM, Christopher Barker wrote: > On 11/20/10 11:04 PM, Ralf Gommers wrote: > >> I am pleased to announce the availability of NumPy 1.5.1. >> > > Binaries, sources and release notes can be found at >> https://sourceforge.net/projects/numpy/files/. >> >> Thank you to everyone who contributed to this release. >> > > Yes, thanks so much -- in particular thanks to the team that build the OS-X > binaries -- looks like a complete set! > > It does look like a complete set. And it was named correctly and in sync with python.org for a single week. From pythonmac list: "With Python 2.7, there are two Mac OS X installer variants available for download: the "traditional" 32-bit-only (Intel and PPC) version that installs and runs on all versions of OS X from 10.3.9 through current 10.6.x; and a new 64-bit/32-bit (Intel only) variant. As discussed in http://bugs.python.org/issue9227, there were problems using Tkinter and IDLE with the original 2.7 64/32 installer. The problem is that the only supported non-X11 64-bit Tcl/Tk at the moment is the one supplied by Apple in 10.6 and the installer tried unsuccessfully to support both 10.5 and 10.6. For 2.7.1, the 64/32 installer now only supports 10.6.x and will only use the Apple-supplied Tcl/Tk 8.5. 
The 32-bit-only installer is still built to link with either an Active/State Tcl/Tk 8.4, if installed in /Library/Frameworks, or fallback to the Apple-supplied Tcl/Tk 8.4 in OS X 10.4 through 10.6." So for the next release we'll build the 10.6 binary on 10.6 again. Sigh. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedrichromstedt at gmail.com Sun Dec 5 07:33:22 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Sun, 5 Dec 2010 13:33:22 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.5.1 In-Reply-To: References: <4CEAC5EF.1030205@noaa.gov> Message-ID: Hi Ralf, 2010/12/5 Ralf Gommers : > It does look like a complete set. And it was named correctly and in sync > with python.org for a single week. From pythonmac list: > > "With Python 2.7, there are two Mac OS X installer variants available for > download: the "traditional" 32-bit-only (Intel and PPC) version that > installs and runs on all versions of OS X from 10.3.9 through current > 10.6.x; and a new 64-bit/32-bit (Intel only) variant. ?As discussed in > http://bugs.python.org/issue9227, there were problems using Tkinter and > IDLE with the original 2.7 64/32 installer. ?The problem is that the > only supported non-X11 64-bit Tcl/Tk at the moment is the one supplied > by Apple in 10.6 and the installer tried unsuccessfully to support both > 10.5 and 10.6. ?For 2.7.1, the 64/32 installer now only supports 10.6.x > and will only use the Apple-supplied Tcl/Tk 8.5. ?The 32-bit-only > installer is still built to link with either an Active/State Tcl/Tk 8.4, > if installed in /Library/Frameworks, or fallback to the Apple-supplied > Tcl/Tk 8.4 in OS X 10.4 through 10.6." > > So for the next release we'll build the 10.6 binary on 10.6 again. Sigh. But the i386/ppc version should still be built on 10.5? Shall I give you commit rights on my repo for the build logs (concerning the i386/x86_64 10.6 build)? Who is going to do the 10.6 builds? Friedrich From ralf.gommers at googlemail.com Sun Dec 5 09:34:27 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 5 Dec 2010 22:34:27 +0800 Subject: [Numpy-discussion] ANN: NumPy 1.5.1 In-Reply-To: References: <4CEAC5EF.1030205@noaa.gov> Message-ID: On Sun, Dec 5, 2010 at 8:33 PM, Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > Hi Ralf, > > 2010/12/5 Ralf Gommers : > > It does look like a complete set. And it was named correctly and in sync > > with python.org for a single week. From pythonmac list: > > > > "With Python 2.7, there are two Mac OS X installer variants available for > > download: the "traditional" 32-bit-only (Intel and PPC) version that > > installs and runs on all versions of OS X from 10.3.9 through current > > 10.6.x; and a new 64-bit/32-bit (Intel only) variant. As discussed in > > http://bugs.python.org/issue9227, there were problems using Tkinter and > > IDLE with the original 2.7 64/32 installer. The problem is that the > > only supported non-X11 64-bit Tcl/Tk at the moment is the one supplied > > by Apple in 10.6 and the installer tried unsuccessfully to support both > > 10.5 and 10.6. For 2.7.1, the 64/32 installer now only supports 10.6.x > > and will only use the Apple-supplied Tcl/Tk 8.5. The 32-bit-only > > installer is still built to link with either an Active/State Tcl/Tk 8.4, > > if installed in /Library/Frameworks, or fallback to the Apple-supplied > > Tcl/Tk 8.4 in OS X 10.4 through 10.6." > > > > So for the next release we'll build the 10.6 binary on 10.6 again. Sigh. 
> > But the i386/ppc version should still be built on 10.5? > > Yes. > Shall I give you commit rights on my repo for the build logs > (concerning the i386/x86_64 10.6 build)? > Sure. > > Who is going to do the 10.6 builds? > I'll do that one. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sun Dec 5 10:40:56 2010 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 5 Dec 2010 09:40:56 -0600 Subject: [Numpy-discussion] [PATCH] gfortran under macports In-Reply-To: References: <1135702146.1178800.1291453261307.JavaMail.root@zmbs3.inria.fr> <374052771.1189935.1291454962784.JavaMail.root@zmbs3.inria.fr> Message-ID: On Sun, Dec 5, 2010 at 5:28 AM, Ralf Gommers wrote: > > > On Sun, Dec 5, 2010 at 3:58 PM, Paul Anton Letnes < > paul.anton.letnes at gmail.com> wrote: > >> Mabe I am wrong somehow, but in my experience the easiest install of scipy >> is 'port install py26-scipy'. For new users, I do not see why one would >> recommend to build manually from source? Macports can do it for you, >> automagically... >> >> Well, by far the easiest method is to just grab a binary installer. The > other choices you have are build from source, or try to use > Macports/Fink/Homebrew/easy_install/pip/buildout-recipe/. > Those all rely on source builds as well, they're just hiding the details. > Which makes things way more confusing when something goes wrong. > > About Macports specifically, I haven't tried in a few years but certainly > don't remember things always working out of the box. And AFAIK Homebrew is a > replacement for Macports for many people because the latter was issues. > > Cheers, > Ralf > > I did a Macports install of numpy/scipy/matplotlib on my wife's macbook a few months ago just because I was curious. Besides the fact that it took forever (it had trouble obtaining the various compilers from the servers, and it did a full-blown ATLAS tuning and compiling...) it did eventually install and work. YMMV, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Sun Dec 5 20:03:05 2010 From: david at silveregg.co.jp (David) Date: Mon, 06 Dec 2010 10:03:05 +0900 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: <4CFAAB3B.3060509@student.matnat.uio.no> References: <4CFAAB3B.3060509@student.matnat.uio.no> Message-ID: <4CFC3649.2000203@silveregg.co.jp> On 12/05/2010 05:57 AM, Dag Sverre Seljebotn wrote: > On 12/04/2010 09:11 PM, Charles R Harris wrote: >> >> >> On Sat, Dec 4, 2010 at 12:59 PM, Ilan Schnell > > wrote: >> >> Yes, numpy-refactor builds of top of libndarray. The whole point >> was that the libndarray is independent of the interface, i.e. the >> CPython or the IronPython interface, and possibly other (Jython) >> in the future. >> Looking at different building/packaging solutions for libndarray, >> autoconf make things very easy, it's a well established pattern, >> I'm sure David C. will agree. >> >> >> >> I know he has expressed reservations about it on non-posix platforms >> and some large projects have moved away from it. I'm not saying it >> isn't the best short term solution so you folks can get on with the >> job, but it may be that long term we will want to look elsewhere. > > Such as perhaps waf for building libndarray, which seems like it will be > much easier to make work nicely with Bento etc. than autoconf (again, > speaking long-term). 
> > Also, it'd be good to avoid a seperate build system for Windows (problem > of keeping changes sync-ed with Visual Studio projects etc. etc.). Is support for visual studio projects a requirement for the refactoring ? If so, the only alternative to keeping changes in sync is to be able to generate the project files from a description, which is not so easy (and quite time consuming). I know of at least two tools doing that: cmake and gpy (the build system used for chrome). cheers, David From tungwaiyip at yahoo.com Sun Dec 5 22:44:25 2010 From: tungwaiyip at yahoo.com (Wai Yip Tung) Date: Sun, 05 Dec 2010 19:44:25 -0800 Subject: [Numpy-discussion] Structured array? recarray? issue access by attribute name Message-ID: I'm trying to use numpy to manipulate CSV file. I'm looking for feature similar to relational database. So I come across a class recarray that seems to meet my need. And then I see other references of structured array. Are these just different name of the same feature? Also I encounter a problem trying to access an field by attribute name. I have In [303]: arr = np.array([ .....: (1, 2.2, 0.0), .....: (3, 4.5, 0.0) .....: ], .....: dtype=[ .....: ('unit',int), .....: ('price',float), .....: ('amount',float), .....: ] .....: ) In [304]: data0 = arr.view(recarray) In [305]: data0.price[0] Out[305]: 2.2000000000000002 It works fine when I get a price vector and pick the first element of it. But if instead I select the first row and try to access its price attribute, it wouldn't work In [306]: data0[0].price --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) c:\Python26\Lib\site-packages\numpy\ in () AttributeError: 'numpy.void' object has no attribute 'price' Then I come across an alternative way to build a recarray. In that case both usage work fine. In [307]: data1 = np.rec.fromarrays( .....: [[1,3],[2.2,4.5],[0.0,0.0]], .....: names='unit,price,amount') In [309]: data1.price[0] Out[309]: 2.2000000000000002 In [310]: data1[0].price Out[310]: 2.2000000000000002 What's going on here? Wai Yip From tungwaiyip at yahoo.com Sun Dec 5 22:56:51 2010 From: tungwaiyip at yahoo.com (Wai Yip Tung) Date: Sun, 05 Dec 2010 19:56:51 -0800 Subject: [Numpy-discussion] Can I add rows and columns to recarray? Message-ID: I'm fairly new to numpy and I'm trying to figure out the right way to do things. Continuing on my question about using recarray as a relation. I have a recarray like this In [339]: arr = np.array([ .....: (1, 2.2, 0.0), .....: (3, 4.5, 0.0) .....: ], .....: dtype=[ .....: ('unit',int), .....: ('price',float), .....: ('amount',float), .....: ] .....: ) In [340]: data = arr.view(recarray) One of the most common thing I want to do is to append rows to data. I think concatenate() might be the method. But I get a problem: In [342]: np.concatenate((data0,[1,9.0,9.0])) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) c:\Python26\Lib\site-packages\numpy\ in () TypeError: expected a readable buffer object The other thing I want to do is to calculate the column value. Right now it can do great thing like In [343]: data.amount = data.unit * data.price But sometimes it may require me to add a new column not already exist, e.g.: In [344]: data.discount_price = data.price * 0.9 How can I add a new column? I tried column_stack. But it give a similar TypeError. I figure I need to first specify the type of the column. But I don't know how. 
Thanks, Wai Yip From jsseabold at gmail.com Sun Dec 5 23:22:04 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 5 Dec 2010 23:22:04 -0500 Subject: [Numpy-discussion] Structured array? recarray? issue access by attribute name In-Reply-To: References: Message-ID: On Sun, Dec 5, 2010 at 10:44 PM, Wai Yip Tung wrote: > I'm trying to use numpy to manipulate CSV file. I'm looking for feature > similar to relational database. So I come across a class recarray that > seems to meet my need. And then I see other references of structured > array. Are these just different name of the same feature? > > Also I encounter a problem trying to access an field by attribute name. I > have > > > In [303]: arr = np.array([ > ? ?.....: ? ? (1, 2.2, 0.0), > ? ?.....: ? ? (3, 4.5, 0.0) > ? ?.....: ? ? ], > ? ?.....: ? ? dtype=[ > ? ?.....: ? ? ? ? ('unit',int), > ? ?.....: ? ? ? ? ('price',float), > ? ?.....: ? ? ? ? ('amount',float), > ? ?.....: ? ? ] > ? ?.....: ) > > In [304]: data0 = arr.view(recarray) > > In [305]: data0.price[0] > Out[305]: 2.2000000000000002 > You don't have to take a view as a recarray if you don't want to. You lose attribute lookup but gain some speed. In [14]: arr['price'] Out[14]: array([ 2.2, 4.5]) > > > It works fine when I get a price vector and pick the first element of it. > But if instead I select the first row and try to access its price > attribute, it wouldn't work > I'm not sure why this doesn't work. It looks like taking a view of the structured array as a recarray does not cast the structs to records Is this a bug? Note that you can do In [19]: arr[0]['price'] Out[19]: 2.2000000000000002 In [20]: data0[0]['price'] Out[20]: 2.2000000000000002 also slicing seems to work In [27]: data0[0:1].price Out[27]: array([ 2.2]) Skipper > > > In [306]: data0[0].price > --------------------------------------------------------------------------- > AttributeError ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call last) > > c:\Python26\Lib\site-packages\numpy\ in () > > AttributeError: 'numpy.void' object has no attribute 'price' > > > > Then I come across an alternative way to build a recarray. In that case > both usage work fine. > > > > In [307]: data1 = np.rec.fromarrays( > ? ?.....: ? ? [[1,3],[2.2,4.5],[0.0,0.0]], > ? ?.....: ? ? names='unit,price,amount') > > In [309]: data1.price[0] > Out[309]: 2.2000000000000002 > > In [310]: data1[0].price > Out[310]: 2.2000000000000002 > > > What's going on here? > > > Wai Yip > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jsseabold at gmail.com Sun Dec 5 23:23:51 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 5 Dec 2010 23:23:51 -0500 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: References: Message-ID: On Sun, Dec 5, 2010 at 10:56 PM, Wai Yip Tung wrote: > I'm fairly new to numpy and I'm trying to figure out the right way to do > things. Continuing on my question about using recarray as a relation. I > have a recarray like this > > > In [339]: arr = np.array([ > ? ?.....: ? ? (1, 2.2, 0.0), > ? ?.....: ? ? (3, 4.5, 0.0) > ? ?.....: ? ? ], > ? ?.....: ? ? dtype=[ > ? ?.....: ? ? ? ? ('unit',int), > ? ?.....: ? ? ? ? ('price',float), > ? ?.....: ? ? ? ? ('amount',float), > ? ?.....: ? ? ] > ? ?.....: ) > > In [340]: data = arr.view(recarray) > > > One of the most common thing I want to do is to append rows to data. 
?I > think concatenate() might be the method. But I get a problem: > > > In [342]: np.concatenate((data0,[1,9.0,9.0])) > --------------------------------------------------------------------------- > TypeError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call last) > > c:\Python26\Lib\site-packages\numpy\ in () > > TypeError: expected a readable buffer object > > > > The other thing I want to do is to calculate the column value. Right now > it can do great thing like > > > > In [343]: data.amount = data.unit * data.price > > > > But sometimes it may require me to add a new column not already exist, > e.g.: > > > In [344]: data.discount_price = data.price * 0.9 > > > How can I add a new column? I tried column_stack. But it give a similar > TypeError. I figure I need to first specify the type of the column. But I > don't know how. > Check out numpy.lib.recfunctions I often have import numpy.lib.recfunctions as nprf Skipper From washakie at gmail.com Mon Dec 6 09:21:18 2010 From: washakie at gmail.com (John) Date: Mon, 6 Dec 2010 15:21:18 +0100 Subject: [Numpy-discussion] numpy rec array and sorting Message-ID: Hello, I have been trying two methods for creating a rec array from my data (or a structured array -- I'm still not completely clear on the distinction). In terms of data, you can see what types they are, basically simple (n,1) np.ndarrays. I had to reshape them to (n,1) to get them to work with hstack. The 'NON WORKING' method returns no errors, but when I go to 'sort' the data array that is returned, no sorting takes place, whereas with the 'WORKING' method, I can do: data.sort(order='sza') and my data.indices match the sorted 'sza' data. You can see I also tried to include a 2-d array, but I haven't managed to get this to work... Could someone please explain what is going on here? Thanks, john ## Create recarray so we can easily sort dtype=np.dtype([('indices','int32'),('time','f8'),('zen','f4'),\ ('az','f4'),('sza','f4'),('saz','f4'),('muslope','f4'),\ ('roll','f4'),('pitch','f4'),('yaw','f4')\ #,('spectra',np.ma.core.MaskedArray,c.shape) ]) ### WORKING METHOD values = np.hstack((indices,time,zen,az,sza,saz,musl,roll,pitch,yaw)) data = [[] for dummy in xrange(len(dtype))] for i in xrange(len(dtype)): data[i] = cast[dtype[i]](values[:,i]) data = np.rec.array(data,dtype=dtype) ### NON WORKING METHOD ### values = (indices,time,zen,az,sza,saz,musl,roll,pitch,yaw) data = np.rec.fromarrays(values,dtype=dtype) From Chris.Barker at noaa.gov Mon Dec 6 13:26:59 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 06 Dec 2010 10:26:59 -0800 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: References: Message-ID: <4CFD2AF3.3070205@noaa.gov> On 12/5/10 7:56 PM, Wai Yip Tung wrote: > I'm fairly new to numpy and I'm trying to figure out the right way to do > things. Continuing on my question about using recarray as a relation. note that recarrays (or structured arrays, AFAIK, the difference is atturube access only -- I don't use recarrays) are far more static than a database table. So you may really want to use a database, or maybe pytables. Or maybe even just stick with lists. But if you are keeping things in memory, should be able to do what you want. 
> In [339]: arr = np.array([ > .....: (1, 2.2, 0.0), > .....: (3, 4.5, 0.0) > .....: ], > .....: dtype=[ > .....: ('unit',int), > .....: ('price',float), > .....: ('amount',float), > .....: ] > .....: ) > > In [340]: data = arr.view(recarray) > > > One of the most common thing I want to do is to append rows to data. numpy arrays do not naturally support appending, as you have discovered. > I > think concatenate() might be the method. yes. > But I get a problem: > In [342]: np.concatenate((data0,[1,9.0,9.0])) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > > c:\Python26\Lib\site-packages\numpy\ in() > > TypeError: expected a readable buffer object concatenate expects two arrays to be joined. If you pass in something that can easily be turned into an array, it will work, but a tuple can be converted to multiple types of arrays, so it doesn't know what to do. So you need to re-construct the second array: a2 = np.array( [(3,5.5, 3)], dtype=dt) arr = np.concatenate( (arr, a2) ) > In [343]: data.amount = data.unit * data.price yup > But sometimes it may require me to add a new column not already exist, > e.g.: > > In [344]: data.discount_price = data.price * 0.9 > > > How can I add a new column? you can't. what you need to do is create a new array with a new dtype that includes the new field. The trick is that numpy only supports homogenous arrays -- evey item is the same data type. So when you could a strut array like above, numpy does not define it as a 2-d table, but rather, a 1-d array, each element of which is a structure. so you need to do something like: # create a new array data2 = np.zeros(len(data), dtype=dt2) # fill the array: for field_name in dt.fields.keys(): data2[field_name] = data[field_name] # now some calculations: data2['discount_price'] = data2['price'] * 0.9 I don't know of a way to avoid that loop when filling the array. Better yet -- anticipate your needs and create the array with all the fields you need in the first place. You can see that ndarrays are pretty static -- struct arrays can be useful data storage, but are not very suitable when things are changing much. You could write a class that wraps an andarray, and supports what you need better -- it could be a pretty usefull general purpose class, too. I've got one that handle the appending part, but nothing with adding new fields. Here's appending with my class: data3 = accumulator.accumulator(dtype = dt2) data3.append((1, 2.2, 0.0, 0.0)) data3.append((3, 4.5, 0.0, 0.0)) data3.append((2, 1.2, 0.0, 0.0)) data3.append((5, 4.2, 0.0, 0.0)) print repr(data3) # convert to regular array for calculations: data3 = np.array(data3) # now some calculations: data3['discount_price'] = data3['price'] * 0.9 You wouldn't have to convert to a regular array, except that I haven't written the code to support field access yet -- I don't think it would be too hard, though. I've enclosed some test code, and my accumulator class, in case you find it useful. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- A non-text attachment was scrubbed... Name: struct_test.py Type: application/x-python Size: 1589 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: accumulator.py Type: application/x-python Size: 4651 bytes Desc: not available URL: From ben.root at ou.edu Mon Dec 6 14:00:30 2010 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 6 Dec 2010 13:00:30 -0600 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: <4CFD2AF3.3070205@noaa.gov> References: <4CFD2AF3.3070205@noaa.gov> Message-ID: On Mon, Dec 6, 2010 at 12:26 PM, Christopher Barker wrote: > On 12/5/10 7:56 PM, Wai Yip Tung wrote: > >> I'm fairly new to numpy and I'm trying to figure out the right way to do >> things. Continuing on my question about using recarray as a relation. >> > > note that recarrays (or structured arrays, AFAIK, the difference is > atturube access only -- I don't use recarrays) are far more static than a > database table. So you may really want to use a database, or maybe pytables. > Or maybe even just stick with lists. > > But if you are keeping things in memory, should be able to do what you > want. > > > In [339]: arr = np.array([ >> .....: (1, 2.2, 0.0), >> .....: (3, 4.5, 0.0) >> .....: ], >> .....: dtype=[ >> .....: ('unit',int), >> .....: ('price',float), >> .....: ('amount',float), >> .....: ] >> .....: ) >> >> In [340]: data = arr.view(recarray) >> >> >> One of the most common thing I want to do is to append rows to data. >> > > numpy arrays do not naturally support appending, as you have discovered. > > > I >> think concatenate() might be the method. >> > > yes. > > > But I get a problem: >> > > In [342]: np.concatenate((data0,[1,9.0,9.0])) >> >> --------------------------------------------------------------------------- >> TypeError Traceback (most recent call >> last) >> >> c:\Python26\Lib\site-packages\numpy\ in() >> >> TypeError: expected a readable buffer object >> > > concatenate expects two arrays to be joined. If you pass in something that > can easily be turned into an array, it will work, but a tuple can be > converted to multiple types of arrays, so it doesn't know what to do. So you > need to re-construct the second array: > > a2 = np.array( [(3,5.5, 3)], dtype=dt) > arr = np.concatenate( (arr, a2) ) > > > In [343]: data.amount = data.unit * data.price >> > > yup > > > But sometimes it may require me to add a new column not already exist, >> e.g.: >> >> In [344]: data.discount_price = data.price * 0.9 >> >> >> How can I add a new column? >> > > you can't. what you need to do is create a new array with a new dtype that > includes the new field. > > The trick is that numpy only supports homogenous arrays -- evey item is the > same data type. So when you could a strut array like above, numpy does not > define it as a 2-d table, but rather, a 1-d array, each element of which is > a structure. > > so you need to do something like: > > # create a new array > data2 = np.zeros(len(data), dtype=dt2) > > # fill the array: > for field_name in dt.fields.keys(): > data2[field_name] = data[field_name] > > # now some calculations: > data2['discount_price'] = data2['price'] * 0.9 > > I don't know of a way to avoid that loop when filling the array. > > Better yet -- anticipate your needs and create the array with all the > fields you need in the first place. > > You can see that ndarrays are pretty static -- struct arrays can be useful > data storage, but are not very suitable when things are changing much. > > You could write a class that wraps an andarray, and supports what you need > better -- it could be a pretty usefull general purpose class, too. 
I've got > one that handle the appending part, but nothing with adding new fields. > > Here's appending with my class: > > data3 = accumulator.accumulator(dtype = dt2) > data3.append((1, 2.2, 0.0, 0.0)) > data3.append((3, 4.5, 0.0, 0.0)) > data3.append((2, 1.2, 0.0, 0.0)) > data3.append((5, 4.2, 0.0, 0.0)) > print repr(data3) > > # convert to regular array for calculations: > data3 = np.array(data3) > > # now some calculations: > data3['discount_price'] = data3['price'] * 0.9 > > You wouldn't have to convert to a regular array, except that I haven't > written the code to support field access yet -- I don't think it would be > too hard, though. > > I've enclosed some test code, and my accumulator class, in case you find it > useful. > > > > -Chris > > numpy.lib.recfunctions has a method for easily adding new columns. Of course, it really returns a new recarray rather than adding it to an existing recarray. Appending records to such an array, however is a different story, and you have to do something like you demonstrated above. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Dec 6 14:28:58 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 06 Dec 2010 11:28:58 -0800 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: References: <4CFD2AF3.3070205@noaa.gov> Message-ID: <4CFD397A.7010204@noaa.gov> On 12/6/10 11:00 AM, Benjamin Root wrote: > numpy.lib.recfunctions has a method for easily adding new columns. cool! There is a lot of other nifty- looking stuff in there too. The OP should really take a look. And maybe an appending function is in order, too. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tungwaiyip at yahoo.com Mon Dec 6 16:00:29 2010 From: tungwaiyip at yahoo.com (Wai Yip Tung) Date: Mon, 06 Dec 2010 13:00:29 -0800 Subject: [Numpy-discussion] Can I add rows and columns to recarray? References: Message-ID: Thank you for the quick response and Christopher's explanation on the design background. All my tables fit in-memory. I want to explore the data interactively and relational database is does not provide me a lot of value. I was rolling my own library before I come to numpy. Then I find numpy's universal function awesome and really fit what I want to do. Now I just need to find out what to add row which is easy in Python. It is OK if it rebuild an array when I add a column, which should happen infrequently. But if adding row build a new array, this will lead to O(n^2) complexity. In anycase, I will explore the recfunctions. Thank you Wai Yip > On Sun, Dec 5, 2010 at 10:56 PM, Wai Yip Tung > wrote: >> I'm fairly new to numpy and I'm trying to figure out the right way to do >> things. Continuing on my question about using recarray as a relation. I >> have a recarray like this >> >> >> In [339]: arr = np.array([ >> .....: (1, 2.2, 0.0), >> .....: (3, 4.5, 0.0) >> .....: ], >> .....: dtype=[ >> .....: ('unit',int), >> .....: ('price',float), >> .....: ('amount',float), >> .....: ] >> .....: ) >> >> In [340]: data = arr.view(recarray) >> >> >> One of the most common thing I want to do is to append rows to data. I >> think concatenate() might be the method. 
But I get a problem: >> >> >> In [342]: np.concatenate((data0,[1,9.0,9.0])) >> --------------------------------------------------------------------------- >> TypeError Traceback (most recent call >> last) >> >> c:\Python26\Lib\site-packages\numpy\ in () >> >> TypeError: expected a readable buffer object >> >> >> >> The other thing I want to do is to calculate the column value. Right now >> it can do great thing like >> >> >> >> In [343]: data.amount = data.unit * data.price >> >> >> >> But sometimes it may require me to add a new column not already exist, >> e.g.: >> >> >> In [344]: data.discount_price = data.price * 0.9 >> >> >> How can I add a new column? I tried column_stack. But it give a similar >> TypeError. I figure I need to first specify the type of the column. But >> I >> don't know how. >> > > Check out numpy.lib.recfunctions > > I often have > > import numpy.lib.recfunctions as nprf > > Skipper From Chris.Barker at noaa.gov Mon Dec 6 17:44:54 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 06 Dec 2010 14:44:54 -0800 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: References: Message-ID: <4CFD6766.7080104@noaa.gov> On 12/6/10 1:00 PM, Wai Yip Tung wrote: > Thank you for the quick response and Christopher's explanation on the > design background. you're welcome. > But if adding row build a new array, this will lead to O(n^2) complexity. if you are adding a lot of rows one at a time, yes, you can have performance issues -- though re-allocating data is pretty fast, too -- maybe it won't matter. If it does, consider the accumulator code I sent, or use it as inspiration to write your own. If you do improve it, please send your improvements back to me. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From faltet at pytables.org Mon Dec 6 18:06:54 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 7 Dec 2010 00:06:54 +0100 Subject: [Numpy-discussion] Can I add rows and columns to recarray? In-Reply-To: References: Message-ID: <201012070006.54624.faltet@pytables.org> A Monday 06 December 2010 22:00:29 Wai Yip Tung escrigu?: > Thank you for the quick response and Christopher's explanation on the > design background. > > All my tables fit in-memory. I want to explore the data interactively > and relational database is does not provide me a lot of value. > > I was rolling my own library before I come to numpy. Then I find > numpy's universal function awesome and really fit what I want to do. > Now I just need to find out what to add row which is easy in Python. > It is OK if it rebuild an array when I add a column, which should > happen infrequently. But if adding row build a new array, this will > lead to O(n^2) complexity. In anycase, I will explore the > recfunctions. If you want a container with a better complexity for adding columns than O(n^2), you may want to have a look at the ctable object in carray package: https://github.com/FrancescAlted/carray carray is about providing compressed, in-memory data containers for both homogeneous (arrays) and heterogeneous data (structured arrays). 
Here it is an example of use: >>> import numpy as np >>> import carray as ca >>> NR = 1000*1000 >>> r = np.fromiter(((i,i*i) for i in xrange(NR)), dtype="i4,i8") >>> new_field = np.arange(NR, dtype='f8')**3 >>> rc = ca.ctable(r) >>> rc ctable((1000000,), [('f0', '>> time rc.addcol(new_field, "f2") CPU times: user 0.03 s, sys: 0.00 s, total: 0.03 s Wall time: 0.03 s that is, only 30 ms for appending a column. This is basically the time to copy (and compress) the data (i.e. O(n)). If you append an already compressed column, the cost of adding it is O(1): >>> r = np.fromiter(((i,i*i) for i in xrange(NR)), dtype="i4,i8") >>> rc = ca.ctable(r) >>> cnew_field = ca.carray(np.arange(NR, dtype='f8')**3) >>> time rc.addcol(cnew_field, "f2") CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s Wall time: 0.00 s On his hand, using plain structured arrays is pretty more costly: >>> import numpy.lib.recfunctions as nprf >>> time r2 = nprf.rec_append_fields(r, 'f2', new_field, 'f8') CPU times: user 0.34 s, sys: 0.02 s, total: 0.36 s Wall time: 0.36 s Appending data at the end of ctable objects is also very fast: >>> timeit rc.append(row) 100000 loops, best of 3: 13.1 ?s per loop Compare this with an append with an structured array: >>> timeit np.concatenate((r2, row)) 100 loops, best of 3: 6.84 ms per loop Unfortunately you cannot do the full range of operations supported by structured arrays with ctables, and a ctable object is rather meant to be used as an efficient, compressed container for structures in memory: >>> r2[2] (2, 4, 8.0) >>> rc[2] (2, 4, 8.0) >>> r2['f1'] array([0, 1, 4, ..., 1, 1, 1]) >>> rc['f1'] carray((1452223,), int64) nbytes: 11.08 MB; cbytes: 1.62 MB; ratio: 6.85 cparams := cparams(clevel=5, shuffle=True) [0, 1, 4, ..., 1, 1, 1] But still, you can do funny things like complex queries: >>> [r for r in rc.getif("(f0<10)&(f2>4)", ["__nrow__", "f1"])] [(2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49), (8, 64), (9, 81), (1041112, 1)] The queries are also very fast (both Numexpr and Blosc are used under the hood): >>> timeit [r for r in rc.getif("(f0<10)&(f2>4)")] 10 loops, best of 3: 58.6 ms per loop >>> timeit r2[(r2['f0']<10)&(r2['f2']>4)] 10 loops, best of 3: 28 ms per loop So, queries on ctables are only 2x slower than using plain structured arrays --of course, the secret goal is to make these sort of queries actually faster than using structured arrays :) I still need to finish the docs, but I plan to release carray 0.3 later this week. 
Cheers, -- Francesc Alted From moura.mario at gmail.com Mon Dec 6 21:18:41 2010 From: moura.mario at gmail.com (Mario Moura) Date: Tue, 7 Dec 2010 00:18:41 -0200 Subject: [Numpy-discussion] The power of strides - Combinations Message-ID: Hi Folks Is it possible some example how deal with strides with combinations, let see: >>> from numpy import * >>> import itertools >>> dt = dtype('i,i,i') >>> a = fromiter(itertools.combinations(range(10),3), dtype=dt, count=-1) >>> a array([(0, 1, 2), (0, 1, 3), (0, 1, 4), (0, 1, 5), (0, 1, 6), (0, 1, 7), (0, 1, 8), (0, 1, 9), (0, 2, 3), (0, 2, 4), (0, 2, 5), (0, 2, 6), (0, 2, 7), (0, 2, 8), (0, 2, 9), (0, 3, 4), (0, 3, 5), (0, 3, 6), (0, 3, 7), (0, 3, 8), (0, 3, 9), (0, 4, 5), (0, 4, 6), (0, 4, 7), (0, 4, 8), (0, 4, 9), (0, 5, 6), (0, 5, 7), (0, 5, 8), (0, 5, 9), (0, 6, 7), (0, 6, 8), (0, 6, 9), (0, 7, 8), (0, 7, 9), (0, 8, 9), (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 2, 6), (1, 2, 7), (1, 2, 8), (1, 2, 9), (1, 3, 4), (1, 3, 5), (1, 3, 6), (1, 3, 7), (1, 3, 8), (1, 3, 9), (1, 4, 5), (1, 4, 6), (1, 4, 7), (1, 4, 8), (1, 4, 9), (1, 5, 6), (1, 5, 7), (1, 5, 8), (1, 5, 9), (1, 6, 7), (1, 6, 8), (1, 6, 9), (1, 7, 8), (1, 7, 9), (1, 8, 9), (2, 3, 4), (2, 3, 5), (2, 3, 6), (2, 3, 7), (2, 3, 8), (2, 3, 9), (2, 4, 5), (2, 4, 6), (2, 4, 7), (2, 4, 8), (2, 4, 9), (2, 5, 6), (2, 5, 7), (2, 5, 8), (2, 5, 9), (2, 6, 7), (2, 6, 8), (2, 6, 9), (2, 7, 8), (2, 7, 9), (2, 8, 9), (3, 4, 5), (3, 4, 6), (3, 4, 7), (3, 4, 8), (3, 4, 9), (3, 5, 6), (3, 5, 7), (3, 5, 8), (3, 5, 9), (3, 6, 7), (3, 6, 8), (3, 6, 9), (3, 7, 8), (3, 7, 9), (3, 8, 9), (4, 5, 6), (4, 5, 7), (4, 5, 8), (4, 5, 9), (4, 6, 7), (4, 6, 8), (4, 6, 9), (4, 7, 8), (4, 7, 9), (4, 8, 9), (5, 6, 7), (5, 6, 8), (5, 6, 9), (5, 7, 8), (5, 7, 9), (5, 8, 9), (6, 7, 8), (6, 7, 9), (6, 8, 9), (7, 8, 9)], dtype=[('f0', '>> Many thanks Mr. Warren about this ((itertools.combinations(range(10),3), dtype=dt, count=-1)) But as you can see itertools.combinations are emitted in lexicographic sort order but NOT with "power of" strides. So what I see is every element in this array into one memory spot but I would like to know if is possible, use "the power of strides"! >>> x = a.reshape(120,1) x = stride_tricks.as_strided(a,shape=(120,),strides=(4,4)) Should I use some sub-class like record array, scalar array? So what I want is repetitive elements on same memory spot. I want save memory in big arrays (main reason) and want go fast. How can I deal with this in random arrays but with repetitive elements? Is it possible have custom strides in subclass(that change in dimension) ? How do this? Best Regards Mario From robert.kern at gmail.com Mon Dec 6 21:47:34 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 6 Dec 2010 20:47:34 -0600 Subject: [Numpy-discussion] The power of strides - Combinations In-Reply-To: References: Message-ID: On Mon, Dec 6, 2010 at 20:18, Mario Moura wrote: > Hi Folks > > Is it possible some example how deal with strides with combinations, let see: No, sorry. It is not possible to generate combinations just using strides. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From rbanerj at fas.harvard.edu Mon Dec 6 22:20:35 2010 From: rbanerj at fas.harvard.edu (Rajat Banerjee) Date: Mon, 6 Dec 2010 22:20:35 -0500 Subject: [Numpy-discussion] fromrecords yields "ValueError: invalid itemsize in generic type tuple" Message-ID: Hi All, I have been using Numpy for a while with great success. I left my little project for a little while (http://web.mit.edu/stardev/cluster/) and now some of my code is broken. I have some Numpy code to create graphs of activity on a cluster with matplotlib. It ran just fine in July / August 2010, but has since stopped working. I have updated numpy on my machine, I think. In [2]: np.version.version Out[2]: '1.5.1' My call to np.rec.fromrecords() is throwing this exception: File "/home/rajat/Envs/StarCluster/lib/python2.6/site-packages/numpy/core/records.py", line 607, in fromrecords descr = sb.dtype((record, dtype)) ValueError: invalid itemsize in generic type tuple Here is the code with some irrelevant stuff stripped: for line in file: a = [datetime.strptime(parts[0], '%Y-%m-%d %H:%M:%S.%f'), int(parts[1]), int(parts[2]), int(parts[3]), int(parts[4]), int(parts[5]), int(parts[6]), float(parts[7])] list.append(a) file.close() names = ['dt', 'hosts', 'running_jobs', 'queued_jobs',\ 'slots', 'avg_duration', 'avg_wait', 'avg_load'] descriptor = {'names': ('dt,hosts,running_jobs,queued_jobs,slots,avg_duration,avg_wait,avg_load'),\ 'formats' : ('S20','u','u','u','u','u','u','f')} self.records = np.rec.fromrecords(list,','.join(names)) #used to work #self.records = np.rec.fromrecords(list, dtype=descriptor) #new attempt Here is one "line" from the array "list": >>> parts (8) = ['2010-12-07 03:09:46.855712', '2', '2', '177', '2', '86', '370', '1.05']. Neither of those np.rec.fromrecords() calls works. I've tried both separately. They both throw the exact same exception, ValueError: invalid itemsize in generic type tuple Can anybody help me? Am I doing something dumb? Thank you. Rajat Banerjee Masters Candidate, Computer Science, Harvard University From ben.root at ou.edu Mon Dec 6 22:51:06 2010 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 6 Dec 2010 21:51:06 -0600 Subject: [Numpy-discussion] The power of strides - Combinations In-Reply-To: References: Message-ID: On Monday, December 6, 2010, Robert Kern wrote: > On Mon, Dec 6, 2010 at 20:18, Mario Moura wrote: >> Hi Folks >> >> Is it possible some example how deal with strides with combinations, let see: > > No, sorry. It is not possible to generate combinations just using strides. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Just wondering, would using ogrid[] in numpy help the OP? Ben From robert.kern at gmail.com Mon Dec 6 22:58:12 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 6 Dec 2010 21:58:12 -0600 Subject: [Numpy-discussion] The power of strides - Combinations In-Reply-To: References: Message-ID: On Mon, Dec 6, 2010 at 21:51, Benjamin Root wrote: > On Monday, December 6, 2010, Robert Kern wrote: >> On Mon, Dec 6, 2010 at 20:18, Mario Moura wrote: >>> Hi Folks >>> >>> Is it possible some example how deal with strides with combinations, let see: >> >> No, sorry. 
It is not possible to generate combinations just using strides. > > Just wondering, would using ogrid[] in numpy help the OP? No. The limitations of the stride mechanisms apply with ogrid just the same. Generating combinations is not regular enough. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From sebastian.walter at gmail.com Tue Dec 7 09:15:24 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Tue, 7 Dec 2010 15:15:24 +0100 Subject: [Numpy-discussion] ctypes and numpy Message-ID: Hello all, I'd like to call a Python function from a C++ code. The Python function has numpy.ndarrays as input. I figured that the easiest way would be to use ctypes. However, I can't get numpy and ctypes to work together. ----------- run.c ------------ #include #include void run(PyArrayObject *y, PyObject *f) { npy_intp Ny = PyArray_SIZE(y); } ------- end run.c ---------- -------- run.py ---------- import os, ctypes, numpy _r = numpy.ctypeslib.load_library('librun.so', os.path.dirname(__file__)) _r.run.argtypes = [ctypes.py_object] x = numpy.array([1,2,3.]) _r.run(x) -------- end run.py ---------- Compiling gives me the warning: gcc -o librun.so -I/usr/include/python2.6 -I/usr/lib/python2.6/dist-packages/numpy/core/include -O0 -fpic -shared -Wall run.c run.c: In function ?run?: run.c:5: warning: unused variable ?Ny? run.c: At top level: /usr/include/python2.6/numpy/__multiarray_api.h:968: warning: ?_import_array? defined but not used and when I run it I get a segmentation fault. I guess I'm not the first one who has this problem, but I couldn't find something useful on the web. Any pointers are suggestions are welcome. cheers, Sebastian From jmccampbell at enthought.com Tue Dec 7 13:34:25 2010 From: jmccampbell at enthought.com (Jason McCampbell) Date: Tue, 7 Dec 2010 12:34:25 -0600 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: Sorry for the late reply... I missed this thread. Thanks to Ilan for pointing it out. A variety of comments below... On Sat, Dec 4, 2010 at 10:20 AM, Charles R Harris wrote: > Just wondering if this is temporary or the intention is to change the > build process? I also note that the *.h files in libndarray are not complete > and a *lot* of trailing whitespace has crept into the files. For the purposes of our immediate project the intent is to use autoconf since it's widely available and makes building this part Python-independent and easier than working it into both distutils and numscons. Going forward it's certainly open to discussion. Currently all of the .h and .c files are generated as a part of the build rather than being checked in just because it saves a build step. Checking in the intermediate files isn't a problem either. Does the trailing whitespace cause problems? We saw it in the coding guidelines and planned to run a filter over it once the code stabilizes, but none of us had seen a guideline like that before and weren't sure why it was there. On Sat, Dec 4, 2010 at 3:01 PM, Charles R Harris wrote: > > > On Sat, Dec 4, 2010 at 1:45 PM, Pauli Virtanen wrote: > >> On Sat, 04 Dec 2010 14:24:49 -0600, Ilan Schnell wrote: >> > I'm not sure how reasonable it would be to move only libndarray into the >> > master, because I've been working on EPD for the last couple of week. 
>> > But Jason will know how complete libndarray is. >> >> The main question is whether moving it will make things easier or more >> difficult, I think. It's one tree more to keep track of. >> >> In any case, it would be a first part in the merge, and it would split >> the hunk of changes into two parts. >> >> > That would be a good thing IMHO. It would also bring a bit more numpy > reality to the refactor and since we are implicitly relying on it for the > next release sometime next spring the closer to reality it gets the better. > > >> *** >> >> Technically, the move could be done like this, so that merge tracking >> still works: >> >> --------refactor--------------- new-refactor >> / / >> /--------libndarray----------x >> / \ >> start---------------------- master----- new-master >> >> > Looks good to me. > Doing this isn't a problem, though I'm not sure if it buys us much. 90% of the changes are the refactoring, moving substantial amounts of code from numpy/core/src/multiarray and /umath into libndarray and then all of the assorted fix-ups. The rest is the .NET interface layer which is isolated in numpy/NumpyDotNet for now. We can leave this directory out, but everything else is the same between libndarray and refactor. Or am I misunderstanding the reason? The current state of the refactor branch is that it passes the bulk of regressions on Python 2.6 and 3.? (Ilan, what version did you use?) and is up-to-date with the master branch. There are a few failing regression test that we need to look at vs. the master branch but less than dozen. Switching to use libndarray is a big ABI+API change, right? If there's an > idea to release an ABI-compatible 1.6, wouldn't this end up being more > difficult? Maybe I'm misunderstanding this idea. Definitely a big ABI change and effectively a big API change. The API itself should be close to 100% compatible, except that the data structures all change to introduce a new layer of indirection. Code that strictly uses the macro accessors will build fine, but that is turning out to be quite rare. The changes are quite mechanical but still non-trivial for code that directly accesses the structure fields. Changes to Cython as a part of the project take care of some of the work. A new numpy.pdx file is needed and will mask the changes as long as the Python (as opposed to the CPython) interface is used. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 7 13:57:13 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 11:57:13 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 11:34 AM, Jason McCampbell wrote: > Sorry for the late reply... I missed this thread. Thanks to Ilan for > pointing it out. A variety of comments below... > > On Sat, Dec 4, 2010 at 10:20 AM, Charles R Harris< > charlesr.harris at gmail.com> wrote: > >> Just wondering if this is temporary or the intention is to change the >> build process? I also note that the *.h files in libndarray are not complete >> and a *lot* of trailing whitespace has crept into the files. > > > For the purposes of our immediate project the intent is to use autoconf > since it's widely available and makes building this part Python-independent > and easier than working it into both distutils and numscons. Going forward > it's certainly open to discussion. > > Yes, maintaining multiple build systems is a hassle. 
I'm wondering if we shouldn't remove the scons stuff and stick with distutils until we definitely decide there is a better way. As to autotools, I think it is a fine short term solution for development purposes, but probably needs to be replaced down the road. > Currently all of the .h and .c files are generated as a part of the build > rather than being checked in just because it saves a build step. Checking > in the intermediate files isn't a problem either. > > The idea of having separate .h files is that you can test compile without a complete build. They might also be helpful in the separate compilation case (I haven't checked). But in any case, the *.h.src files are there just to make maintaining the .h file easier, they shouldn't be used as part of the build. > Does the trailing whitespace cause problems? We saw it in the coding > guidelines and planned to run a filter over it once the code stabilizes, but > none of us had seen a guideline like that before and weren't sure why it was > there. > > It should be cleaned up before anything becomes official. Git can be set up to warn about trailing whitespace. The general guideline is no trailing whitespace. For one thing you end up with repository changes that unintentionally involve whitespace. Most editors can be set up to flag trailing whitespace, which will increase the desire to keep the file clean. On Sat, Dec 4, 2010 at 3:01 PM, Charles R Harris wrote: > >> >> >> On Sat, Dec 4, 2010 at 1:45 PM, Pauli Virtanen wrote: >> >>> On Sat, 04 Dec 2010 14:24:49 -0600, Ilan Schnell wrote: >>> > I'm not sure how reasonable it would be to move only libndarray into >>> the >>> > master, because I've been working on EPD for the last couple of week. >>> > But Jason will know how complete libndarray is. >>> >>> The main question is whether moving it will make things easier or more >>> difficult, I think. It's one tree more to keep track of. >>> >>> In any case, it would be a first part in the merge, and it would split >>> the hunk of changes into two parts. >>> >>> >> That would be a good thing IMHO. It would also bring a bit more numpy >> reality to the refactor and since we are implicitly relying on it for the >> next release sometime next spring the closer to reality it gets the better. >> >> >>> *** >>> >>> Technically, the move could be done like this, so that merge tracking >>> still works: >>> >>> --------refactor--------------- new-refactor >>> / / >>> /--------libndarray----------x >>> / \ >>> start---------------------- master----- new-master >>> >>> >> Looks good to me. >> > > Doing this isn't a problem, though I'm not sure if it buys us much. 90% of > the changes are the refactoring, moving substantial amounts of code from > numpy/core/src/multiarray and /umath into libndarray and then all of the > assorted fix-ups. The rest is the .NET interface layer which is isolated in > numpy/NumpyDotNet for now. We can leave this directory out, but everything > else is the same between libndarray and refactor. Or am I misunderstanding > the reason? > > The idea is to keep things moving along and maybe encourage others to take a bigger role in the merge. We wouldn't touch the current master branch of numpy yet. > The current state of the refactor branch is that it passes the bulk of > regressions on Python 2.6 and 3.? (Ilan, what version did you use?) and is > up-to-date with the master branch. There are a few failing regression test > that we need to look at vs. the master branch but less than dozen. 
> > Switching to use libndarray is a big ABI+API change, right? If there's an >> idea to release an ABI-compatible 1.6, wouldn't this end up being more >> difficult? Maybe I'm misunderstanding this idea. > > > Definitely a big ABI change and effectively a big API change. The API > itself should be close to 100% compatible, except that the data structures > all change to introduce a new layer of indirection. Code that strictly uses > the macro accessors will build fine, but that is turning out to be quite > rare. The changes are quite mechanical but still non-trivial for code that > directly accesses the structure fields. > > Changes to Cython as a part of the project take care of some of the work. A > new numpy.pdx file is needed and will mask the changes as long as the Python > (as opposed to the CPython) interface is used. > > There probably needs to be some discussion of a release schedule so we can plan ahead. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Dec 7 19:36:56 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 8 Dec 2010 08:36:56 +0800 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Wed, Dec 8, 2010 at 2:57 AM, Charles R Harris wrote: > > > On Tue, Dec 7, 2010 at 11:34 AM, Jason McCampbell < > jmccampbell at enthought.com> wrote: > >> Sorry for the late reply... I missed this thread. Thanks to Ilan for >> pointing it out. A variety of comments below... >> >> On Sat, Dec 4, 2010 at 10:20 AM, Charles R Harris< >> charlesr.harris at gmail.com> wrote: >> >>> Just wondering if this is temporary or the intention is to change the >>> build process? I also note that the *.h files in libndarray are not complete >>> and a *lot* of trailing whitespace has crept into the files. >> >> >> For the purposes of our immediate project the intent is to use autoconf >> since it's widely available and makes building this part Python-independent >> and easier than working it into both distutils and numscons. Going forward >> it's certainly open to discussion. >> >> > Yes, maintaining multiple build systems is a hassle. I'm wondering if we > shouldn't remove the scons stuff and stick with distutils until we > definitely decide there is a better way. > Why would you want to remove scons before we settle on a final new way of doing things? It's not that much effort to maintain as far as I'm aware, and more useful (at least to me) than distutils. I don't see a reason not to keep it until we have something that's actually better (hopefully bento). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 7 19:45:38 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 7 Dec 2010 17:45:38 -0700 Subject: [Numpy-discussion] Refactor fork uses the ./configure, make, make install process. In-Reply-To: References: Message-ID: On Tue, Dec 7, 2010 at 5:36 PM, Ralf Gommers wrote: > > > On Wed, Dec 8, 2010 at 2:57 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Tue, Dec 7, 2010 at 11:34 AM, Jason McCampbell < >> jmccampbell at enthought.com> wrote: >> >>> Sorry for the late reply... I missed this thread. Thanks to Ilan for >>> pointing it out. A variety of comments below... 
>>> >>> On Sat, Dec 4, 2010 at 10:20 AM, Charles R Harris< >>> charlesr.harris at gmail.com> wrote: >>> >>>> Just wondering if this is temporary or the intention is to change the >>>> build process? I also note that the *.h files in libndarray are not complete >>>> and a *lot* of trailing whitespace has crept into the files. >>> >>> >>> For the purposes of our immediate project the intent is to use autoconf >>> since it's widely available and makes building this part Python-independent >>> and easier than working it into both distutils and numscons. Going forward >>> it's certainly open to discussion. >>> >>> >> Yes, maintaining multiple build systems is a hassle. I'm wondering if we >> shouldn't remove the scons stuff and stick with distutils until we >> definitely decide there is a better way. >> > > Why would you want to remove scons before we settle on a final new way of > doing things? It's not that much effort to maintain as far as I'm aware, and > more useful (at least to me) than distutils. I don't see a reason not to > keep it until we have something that's actually better (hopefully bento). > > Actually, I was waiting to see if you liked scons ;) I don't use it myself but I was wondering if you used it for the releases. I agree it will be interesting to see what David comes up with, I think at the moment he likes waf as a build system to use with bento. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.gregory at oregonstate.edu Wed Dec 8 12:12:44 2010 From: matt.gregory at oregonstate.edu (Gregory, Matthew) Date: Wed, 8 Dec 2010 09:12:44 -0800 Subject: [Numpy-discussion] creating zonal statistics from two arrays Message-ID: <1D673F86DDA00841A1216F04D1CE70D6426800037D@EXCH2.nws.oregonstate.edu> Hi all, Likely a very newbie type of question. I'm using numpy with GDAL to calculate zonal statistics on images. The basic approach is that I have a zone raster and a value raster which are aligned spatially and I am storing each zone's corresponding values in a dictionary, then calculating the statistics on that population. (I'm well aware that this approach may have memory issues with large rasters ...) GDAL ReadAsArray gives you a chunk of raster data as a numpy array. Currently I'm iterating over rows and columns of that chunk, but I'm guessing there's a better (and more numpy-like) way. zone_stats = {} zone_block = zone_band.ReadAsArray(x_off, y_off, x_size, y_size) value_block = value_band.ReadAsArray(x_off, y_off, x_size, y_size) for row in xrange(y_size): for col in xrange(x_size): zone = zone_block[row][col] value = value_block[row][col] try: zone_stats[zone].append(value) except KeyError: zone_stats[zone] = [value] # Then calculate stats per zone ... Thanks for all suggestions on how to make this better, especially if the initial approach I'm taking is flawed. matt From josef.pktd at gmail.com Wed Dec 8 12:48:51 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Dec 2010 12:48:51 -0500 Subject: [Numpy-discussion] creating zonal statistics from two arrays In-Reply-To: <1D673F86DDA00841A1216F04D1CE70D6426800037D@EXCH2.nws.oregonstate.edu> References: <1D673F86DDA00841A1216F04D1CE70D6426800037D@EXCH2.nws.oregonstate.edu> Message-ID: On Wed, Dec 8, 2010 at 12:12 PM, Gregory, Matthew wrote: > Hi all, > > Likely a very newbie type of question. ?I'm using numpy with GDAL to calculate zonal statistics on images. 
?The basic approach is that I have a zone raster and a value raster which are aligned spatially and I am storing each zone's corresponding values in a dictionary, then calculating the statistics on that population. ?(I'm well aware that this approach may have memory issues with large rasters ...) > > GDAL ReadAsArray gives you a chunk of raster data as a numpy array. ?Currently I'm iterating over rows and columns of that chunk, but I'm guessing there's a better (and more numpy-like) way. > > zone_stats = {} > zone_block = zone_band.ReadAsArray(x_off, y_off, x_size, y_size) > value_block = value_band.ReadAsArray(x_off, y_off, x_size, y_size) > for row in xrange(y_size): > ? ?for col in xrange(x_size): > ? ? ? ?zone = zone_block[row][col] > ? ? ? ?value = value_block[row][col] > ? ? ? ?try: > ? ? ? ? ? ?zone_stats[zone].append(value) > ? ? ? ?except KeyError: > ? ? ? ? ? ?zone_stats[zone] = [value] > > # Then calculate stats per zone > ... Just a thought since I'm not doing spatial statistics. If you can create (integer) labels that assigns each point to a zone, then you can treat it essentially as a 1d grouped data, and you could use np.bincount to calculate some statistics, or alternatively scipy.ndimage.measurements for some additional statistics. This would avoid any python loop, but require a full label array. Josef > > Thanks for all suggestions on how to make this better, especially if the initial approach I'm taking is flawed. > > matt > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From n.becker at amolf.nl Fri Dec 10 04:13:11 2010 From: n.becker at amolf.nl (Nils Becker) Date: Fri, 10 Dec 2010 10:13:11 +0100 Subject: [Numpy-discussion] truth value of dtypes Message-ID: <4D01EF27.1080102@amolf.nl> Hi, why is >>> bool(np.dtype(np.float)) False ? I came across this when using this python idiom: def f(dtype=None): ....if not dtype: ........print 'using default dtype' If there is no good reason to have a False truth value, I would vote for making it True since that is what one would expect (no?) N. From markbak at gmail.com Fri Dec 10 04:38:01 2010 From: markbak at gmail.com (Mark Bakker) Date: Fri, 10 Dec 2010 10:38:01 +0100 Subject: [Numpy-discussion] status of date-times Message-ID: Hello List, Can someone update us on the status of the date-times datatype? Is it working yet? If not, what are the plans? I really appreciate all the work and am looking forward to using the new date-times, Best regards, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ingwer.Wurzel at gmx.net Fri Dec 10 05:33:18 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Fri, 10 Dec 2010 11:33:18 +0100 Subject: [Numpy-discussion] Numpy and Python3 Message-ID: <1291977198.4723.137.camel@Speranza> Hello everyone, first, I'm really apologise for my English-skills. But I have only one simple questions. Does NumPy work on Python3 now. I read so many articles on the Internet, but you can only read some speculation and not a clear state about this topic. At the moment I try numpy1.5.1 on Python3.0, but I get only Errors. If Numpy works on Python3, are the support libraries the same as by Python2.6? 
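Coming back to the zonal-statistics exchange a few messages up: a minimal sketch of the label-plus-bincount approach suggested there, assuming the zone raster holds small non-negative integer labels. The toy blocks below just stand in for GDAL's ReadAsArray output:

import numpy as np

# stand-ins for zone_band.ReadAsArray(...) and value_band.ReadAsArray(...)
zone_block = np.array([[0, 0, 1],
                       [1, 2, 2]])
value_block = np.array([[1.0, 2.0, 3.0],
                        [4.0, 5.0, 6.0]])

zones = zone_block.ravel()
values = value_block.ravel()

# per-zone count, sum and mean with no Python loop over pixels
counts = np.bincount(zones)
sums = np.bincount(zones, weights=values)
means = sums / counts

for zone in range(len(counts)):
    print zone, counts[zone], means[zone]

The measurement functions in scipy.ndimage (mean, sum, variance and so on) take the same kind of label array and cover the statistics that bincount alone does not.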
(By the way, I use Linux (Ubuntu 9.04)) /With kind regards Ingwer From ralf.gommers at googlemail.com Fri Dec 10 07:05:29 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 10 Dec 2010 20:05:29 +0800 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1291977198.4723.137.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> Message-ID: On Fri, Dec 10, 2010 at 6:33 PM, Katharina wrote: > Hello everyone, > > first, I'm really apologise for my English-skills. But I have only one > simple questions. Does NumPy work on Python3 now. > I read so many articles on the Internet, but you can only read some > speculation and not a clear state about this topic. > It works fine with Python 3.1. > > At the moment I try numpy1.5.1 on Python3.0, but I get only Errors. > If Numpy works on Python3, are the support libraries the same as by > Python2.6? > Which support libraries? Just Lapack/Blas or Atlas should be all you need, and just "$ python3.1 setup.py install --prefix=/home/XXX/pick-a-folder" should work fine. If you encounter a problem, please send us the exact build command you used, the build log and compiler versions. Ralf > (By the way, I use Linux (Ubuntu 9.04)) > /With kind regards > Ingwer > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Dec 10 09:33:40 2010 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 Dec 2010 08:33:40 -0600 Subject: [Numpy-discussion] truth value of dtypes In-Reply-To: <4D01EF27.1080102@amolf.nl> References: <4D01EF27.1080102@amolf.nl> Message-ID: On Fri, Dec 10, 2010 at 03:13, Nils Becker wrote: > Hi, > > why is > >>>> bool(np.dtype(np.float)) > False > > ? > > I came across this when using this python idiom: > > def f(dtype=None): > ....if not dtype: > ........print 'using default dtype' The default truth value probably should be True, but not for this reason. The correct idiom to use is this: def f(dtype=None): if dtype is None: print 'using default dtype' -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From alan.isaac at gmail.com Fri Dec 10 09:47:59 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 10 Dec 2010 09:47:59 -0500 Subject: [Numpy-discussion] truth value of dtypes In-Reply-To: <4D01EF27.1080102@amolf.nl> References: <4D01EF27.1080102@amolf.nl> Message-ID: <4D023D9F.8050707@gmail.com> On 12/10/2010 4:13 AM, Nils Becker wrote: > def f(dtype=None): > ....if not dtype: I think you want: if dtype is None: fwiw, Alan From charlesr.harris at gmail.com Fri Dec 10 10:45:47 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 10 Dec 2010 08:45:47 -0700 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1291977198.4723.137.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> Message-ID: On Fri, Dec 10, 2010 at 3:33 AM, Katharina wrote: > Hello everyone, > > first, I'm really apologise for my English-skills. But I have only one > simple questions. Does NumPy work on Python3 now. > I read so many articles on the Internet, but you can only read some > speculation and not a clear state about this topic. > > At the moment I try numpy1.5.1 on Python3.0, but I get only Errors. > If Numpy works on Python3, are the support libraries the same as by > Python2.6? > (By the way, I use Linux (Ubuntu 9.04)) > > We don't support 3.0, only 3.1 and above. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Dec 10 12:25:25 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 10 Dec 2010 09:25:25 -0800 Subject: [Numpy-discussion] A Cython apply_along_axis function In-Reply-To: References: <4CF6FC27.5040005@silveregg.co.jp> Message-ID: On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman wrote: > On Wed, Dec 1, 2010 at 5:53 PM, David wrote: > >> On 12/02/2010 04:47 AM, Keith Goodman wrote: >>> It's hard to write Cython code that can handle all dtypes and >>> arbitrary number of dimensions. The former is typically dealt with >>> using templates, but what do people do about the latter? >> >> The only way that I know to do that systematically is iterator. There is >> a relatively simple example in scipy/signal (lfilter.c.src). >> >> I wonder if it would be possible to add better support for numpy >> iterators in cython... > > Thanks for the tip. I'm starting to think that for now I should just > template both dtype and ndim. I ended up templating both dtype and axis. For the axis templating I used two functions: looper and loop_cdef. LOOPER Make a 3d loop template: >>> loop = ''' .... for iINDEX0 in range(nINDEX0): .... for iINDEX1 in range(nINDEX1): .... amin = MAXDTYPE .... for iINDEX2 in range(nINDEX2): .... ai = a[INDEXALL] .... if ai <= amin: .... amin = ai .... y[INDEXPOP] = amin .... ''' Import the looper function: >>> from bottleneck.src.template.template import looper Make a loop over axis=0: >>> print looper(loop, ndim=3, axis=0) for i1 in range(n1): for i2 in range(n2): amin = MAXDTYPE for i0 in range(n0): ai = a[i0, i1, i2] if ai <= amin: amin = ai y[i1, i2] = amin Make a loop over axis=1: >>> print looper(loop, ndim=3, axis=1) for i0 in range(n0): for i2 in range(n2): amin = MAXDTYPE for i1 in range(n1): ai = a[i0, i1, i2] if ai <= amin: amin = ai y[i0, i2] = amin LOOP_CDEF Define parameters: >>> ndim = 3 >>> dtype = 'float64' >>> axis = 1 >>> is_reducing_function = True Import loop_cdef: >>> from bottleneck.src.template.template import loop_cdef Make loop initialization code: >>> print loop_cdef(ndim, dtype, axis, is_reducing_function) cdef Py_ssize_t i0, i1, i2 cdef int n0 = a.shape[0] cdef int n1 = a.shape[1] cdef int n2 = a.shape[2] cdef np.npy_intp *dims = [n0, n2] cdef np.ndarray[np.float64_t, ndim=2] y = PyArray_EMPTY(2, dims, NPY_float64, 0) From kwgoodman at gmail.com Fri Dec 10 16:42:49 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 10 Dec 2010 13:42:49 -0800 Subject: [Numpy-discussion] np.var() and ddof Message-ID: Why does ddof=2 and ddof=3 give the same result? >> np.var([1, 2, 3], ddof=0) 0.66666666666666663 >> np.var([1, 2, 3], ddof=1) 1.0 >> np.var([1, 2, 3], ddof=2) 2.0 >> np.var([1, 2, 3], ddof=3) 2.0 >> np.var([1, 2, 3], ddof=4) -2.0 I expected NaN for ddof=3. From josef.pktd at gmail.com Fri Dec 10 17:26:54 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Dec 2010 17:26:54 -0500 Subject: [Numpy-discussion] np.var() and ddof In-Reply-To: References: Message-ID: On Fri, Dec 10, 2010 at 4:42 PM, Keith Goodman wrote: > Why does ddof=2 and ddof=3 give the same result? > >>> np.var([1, 2, 3], ddof=0) > ? 0.66666666666666663 >>> np.var([1, 2, 3], ddof=1) > ? 1.0 >>> np.var([1, 2, 3], ddof=2) > ? 2.0 >>> np.var([1, 2, 3], ddof=3) > ? 2.0 >>> np.var([1, 2, 3], ddof=4) > ? -2.0 > > I expected NaN for ddof=3. 
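For reference, the other results above follow the textbook formula var = ss / (n - ddof), with ss the sum of squared deviations. A quick worked sketch for [1, 2, 3], where ss = 2 and n = 3, also shows why ddof=3 (the case being asked about) is special:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
ss = ((a - a.mean()) ** 2).sum()   # 2.0
n = a.size                         # 3

for ddof in (0, 1, 2, 4):
    print ddof, ss / (n - ddof)    # 0.666..., 1.0, 2.0, -2.0

# ddof=3 makes the divisor zero, which is where the nan-or-inf
# expectation below comes from.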
It's a floating point calculation, so I would expect np.inf Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kwgoodman at gmail.com Fri Dec 10 17:32:24 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 10 Dec 2010 14:32:24 -0800 Subject: [Numpy-discussion] np.var() and ddof In-Reply-To: References: Message-ID: On Fri, Dec 10, 2010 at 2:26 PM, wrote: > On Fri, Dec 10, 2010 at 4:42 PM, Keith Goodman wrote: >> Why does ddof=2 and ddof=3 give the same result? >> >>>> np.var([1, 2, 3], ddof=0) >> ? 0.66666666666666663 >>>> np.var([1, 2, 3], ddof=1) >> ? 1.0 >>>> np.var([1, 2, 3], ddof=2) >> ? 2.0 >>>> np.var([1, 2, 3], ddof=3) >> ? 2.0 >>>> np.var([1, 2, 3], ddof=4) >> ? -2.0 >> >> I expected NaN for ddof=3. > > It's a floating point calculation, so I would expect np.inf Right, NAFN (F=Finite). Unless, of course, the numerator is zero too. From Ingwer.Wurzel at gmx.net Sat Dec 11 07:41:59 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sat, 11 Dec 2010 13:41:59 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: References: <1291977198.4723.137.camel@Speranza> Message-ID: <1292071320.4874.26.camel@Speranza> Hi, I install Python3.1, but I get the same Error: -------------------------------------------------------------------------------------------------------- sudo python3 setup.py build --fcompiler=gnu95 Converting to Python3 via 2to3... RefactoringTool: Skipping implicit fixer: buffer RefactoringTool: Skipping implicit fixer: idioms RefactoringTool: Skipping implicit fixer: set_literal RefactoringTool: Skipping implicit fixer: ws_comma RefactoringTool: No files need to be modified. Running from numpy source directory.Traceback (most recent call last): File "setup.py", line 211, in setup_package() File "setup.py", line 188, in setup_package from numpy.distutils.core import setup File "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/__init__.py", line 22, in import numpy.distutils.ccompiler File "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/ccompiler.py", line 15, in from numpy.distutils.exec_command import exec_command File "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/exec_command.py", line 58, in from numpy.compat import open_latin1 File "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/compat/__init__.py", line 14, in from .py3k import * AttributeError: 'module' object has no attribute 'unicode' -------------------------------------------------------------------------------------------------------- Can somebody see, what's the problem? I'm really pleased for any help. /With kind regards Ingwer Am Freitag, den 10.12.2010, 20:05 +0800 schrieb Ralf Gommers: > > > On Fri, Dec 10, 2010 at 6:33 PM, Katharina > wrote: > Hello everyone, > > first, I'm really apologise for my English-skills. But I have > only one > simple questions. Does NumPy work on Python3 now. > I read so many articles on the Internet, but you can only read > some > speculation and not a clear state about this topic. > > It works fine with Python 3.1. > > > At the moment I try numpy1.5.1 on Python3.0, but I get only > Errors. > If Numpy works on Python3, are the support libraries the same > as by > Python2.6? > > Which support libraries? 
Just Lapack/Blas or Atlas should be all you > need, and just "$ python3.1 setup.py install > --prefix=/home/XXX/pick-a-folder" should work fine. > > If you encounter a problem, please send us the exact build command you > used, the build log and compiler versions. > > Ralf > > > > (By the way, I use Linux (Ubuntu 9.04)) > > /With kind regards > Ingwer > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Dec 11 10:05:17 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 11 Dec 2010 08:05:17 -0700 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1292071320.4874.26.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> Message-ID: On Sat, Dec 11, 2010 at 5:41 AM, Katharina wrote: > Hi, > I install Python3.1, but I get the same Error: > > -------------------------------------------------------------------------------------------------------- > sudo python3 setup.py build --fcompiler=gnu95 > Converting to Python3 via 2to3... > RefactoringTool: Skipping implicit fixer: buffer > RefactoringTool: Skipping implicit fixer: idioms > RefactoringTool: Skipping implicit fixer: set_literal > RefactoringTool: Skipping implicit fixer: ws_comma > RefactoringTool: No files need to be modified. > Running from numpy source directory.Traceback (most recent call last): > File "setup.py", line 211, in > setup_package() > File "setup.py", line 188, in setup_package > from numpy.distutils.core import setup > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/__init__.py", > line 22, in > import numpy.distutils.ccompiler > Are you doing the build in /usr/local/lib/python3.1/site-packages/ ? Usually the build is done in a working directory and installed by "python setup.py install". I don't know that that is the problem, but it is unusual. File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/ccompiler.py", > line 15, in > from numpy.distutils.exec_command import exec_command > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/exec_command.py", > line 58, in > from numpy.compat import open_latin1 > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/compat/__init__.py", > line 14, in > from .py3k import * > AttributeError: 'module' object has no attribute 'unicode' > > > -------------------------------------------------------------------------------------------------------- > > Can somebody see, what's the problem? > I'm really pleased for any help. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ingwer.Wurzel at gmx.net Sat Dec 11 13:53:30 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sat, 11 Dec 2010 19:53:30 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> Message-ID: <1292093610.4874.31.camel@Speranza> Hi, yes my build is in /usr/local/lib/python3.1/site-packages/numpy-1.5.1. Is't wrong? 
/ Ingwer Am Samstag, den 11.12.2010, 08:05 -0700 schrieb Charles R Harris: > > > On Sat, Dec 11, 2010 at 5:41 AM, Katharina > wrote: > Hi, > I install Python3.1, but I get the same Error: > -------------------------------------------------------------------------------------------------------- > sudo python3 setup.py build --fcompiler=gnu95 > Converting to Python3 via 2to3... > RefactoringTool: Skipping implicit fixer: buffer > RefactoringTool: Skipping implicit fixer: idioms > RefactoringTool: Skipping implicit fixer: set_literal > RefactoringTool: Skipping implicit fixer: ws_comma > RefactoringTool: No files need to be modified. > Running from numpy source directory.Traceback (most recent > call last): > File "setup.py", line 211, in > setup_package() > File "setup.py", line 188, in setup_package > from numpy.distutils.core import setup > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/__init__.py", line 22, in > import numpy.distutils.ccompiler > > Are you doing the build in /usr/local/lib/python3.1/site-packages/ ? > Usually the build is done in a working directory and installed by > "python setup.py install". I don't know that that is the problem, but > it is unusual. > > > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/ccompiler.py", line 15, in > from numpy.distutils.exec_command import exec_command > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/distutils/exec_command.py", line 58, in > from numpy.compat import open_latin1 > File > "/usr/local/lib/python3.1/site-packages/numpy-1.5.1/build/py3k/numpy/compat/__init__.py", line 14, in > from .py3k import * > AttributeError: 'module' object has no attribute 'unicode' > > -------------------------------------------------------------------------------------------------------- > > Can somebody see, what's the problem? > I'm really pleased for any help. > > > > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Dec 11 14:06:23 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 11 Dec 2010 12:06:23 -0700 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1292093610.4874.31.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> Message-ID: On Sat, Dec 11, 2010 at 11:53 AM, Katharina wrote: > Hi, > yes my build is in /usr/local/lib/python3.1/site-packages/numpy-1.5.1. > Is't wrong? > > Well, let's find out ;) Move your numpy download somewhere like ~/numpy-1.5.1, then do cd numpy-1.5.1 python3.1 setup.py build sudo python3.1 setup.py install You should probably also do sudo rm -rf /usr/local/lib/python3.1/site-packages/numpy-1.5.1 before the build as well as remove your local build directory. You might also need to change ownership of the files from root to yourself. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Ingwer.Wurzel at gmx.net Sat Dec 11 14:49:04 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sat, 11 Dec 2010 20:49:04 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> Message-ID: <1292096944.4874.43.camel@Speranza> I'm really sorry, but the Error is the same: ---------------------------------------------------------------------------------------- ~/Desktop/numpy-1.5.1$ python3.1 setup.py build Converting to Python3 via 2to3... RefactoringTool: Skipping implicit fixer: buffer RefactoringTool: Skipping implicit fixer: idioms RefactoringTool: Skipping implicit fixer: set_literal RefactoringTool: Skipping implicit fixer: ws_comma RefactoringTool: No files need to be modified. Running from numpy source directory.Traceback (most recent call last): File "setup.py", line 211, in setup_package() File "setup.py", line 188, in setup_package from numpy.distutils.core import setup File "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/__init__.py", line 22, in import numpy.distutils.ccompiler File "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/ccompiler.py", line 15, in from numpy.distutils.exec_command import exec_command File "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/exec_command.py", line 58, in from numpy.compat import open_latin1 File "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/compat/__init__.py", line 14, in from .py3k import * AttributeError: 'module' object has no attribute 'unicode' ---------------------------------------------------------------------------------------- I don't do if it helps. But I try to install numpy1.5.1 on python2.6 and get this Erros: ---------------------------------------------------------------------------------------- /usr/local/lib/python2.6/site-packages/numpy-1.5.1$ python setup.py build fcompiler=gnu95 Traceback (most recent call last): File "setup.py", line 25, in import builtins as builtins ImportError: No module named builtins ---------------------------------------------------------------------------------------- / Ingwer Am Samstag, den 11.12.2010, 12:06 -0700 schrieb Charles R Harris: > > > On Sat, Dec 11, 2010 at 11:53 AM, Katharina > wrote: > Hi, > yes my build is > in /usr/local/lib/python3.1/site-packages/numpy-1.5.1. > Is't wrong? > > > Well, let's find out ;) Move your numpy download somewhere like > ~/numpy-1.5.1, then do > > cd numpy-1.5.1 > python3.1 setup.py build > sudo python3.1 setup.py install > > You should probably also do > > sudo rm -rf /usr/local/lib/python3.1/site-packages/numpy-1.5.1 before > the build as well as remove your local build directory. You might also > need to change ownership of the files from root to yourself. > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Ingwer.Wurzel at gmx.net Sat Dec 11 14:58:35 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sat, 11 Dec 2010 20:58:35 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1292096944.4874.43.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> <1292096944.4874.43.camel@Speranza> Message-ID: <1292097515.4874.47.camel@Speranza> Oh... the Problem with python2.6 is solved. I take the numpy Version, which was transformed with 2to3. 
sorry /Ingwer Am Samstag, den 11.12.2010, 20:49 +0100 schrieb Katharina: > I'm really sorry, but the Error is the same: > > ---------------------------------------------------------------------------------------- > ~/Desktop/numpy-1.5.1$ python3.1 setup.py build > Converting to Python3 via 2to3... > RefactoringTool: Skipping implicit fixer: buffer > RefactoringTool: Skipping implicit fixer: idioms > RefactoringTool: Skipping implicit fixer: set_literal > RefactoringTool: Skipping implicit fixer: ws_comma > RefactoringTool: No files need to be modified. > Running from numpy source directory.Traceback (most recent call last): > File "setup.py", line 211, in > setup_package() > File "setup.py", line 188, in setup_package > from numpy.distutils.core import setup > File > "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/__init__.py", line 22, in > import numpy.distutils.ccompiler > File > "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/ccompiler.py", line 15, in > from numpy.distutils.exec_command import exec_command > File > "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/distutils/exec_command.py", line 58, in > from numpy.compat import open_latin1 > File > "/home/natta/Desktop/numpy-1.5.1/build/py3k/numpy/compat/__init__.py", > line 14, in > from .py3k import * > AttributeError: 'module' object has no attribute 'unicode' > ---------------------------------------------------------------------------------------- > > > I don't do if it helps. But I try to install numpy1.5.1 on python2.6 > and get this Erros: > ---------------------------------------------------------------------------------------- > /usr/local/lib/python2.6/site-packages/numpy-1.5.1$ python setup.py > build fcompiler=gnu95 > Traceback (most recent call last): > File "setup.py", line 25, in > import builtins as builtins > ImportError: No module named builtins > ---------------------------------------------------------------------------------------- > > / Ingwer > > > > > > Am Samstag, den 11.12.2010, 12:06 -0700 schrieb Charles R Harris: > > > > > > On Sat, Dec 11, 2010 at 11:53 AM, Katharina > > wrote: > > Hi, > > yes my build is > > in /usr/local/lib/python3.1/site-packages/numpy-1.5.1. > > Is't wrong? > > > > > > Well, let's find out ;) Move your numpy download somewhere like > > ~/numpy-1.5.1, then do > > > > cd numpy-1.5.1 > > python3.1 setup.py build > > sudo python3.1 setup.py install > > > > You should probably also do > > > > sudo rm -rf /usr/local/lib/python3.1/site-packages/numpy-1.5.1 before > > the build as well as remove your local build directory. You might also > > need to change ownership of the files from root to yourself. 
> > > > Chuck > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Dec 11 15:19:05 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 11 Dec 2010 13:19:05 -0700 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1292097515.4874.47.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> <1292096944.4874.43.camel@Speranza> <1292097515.4874.47.camel@Speranza> Message-ID: On Sat, Dec 11, 2010 at 12:58 PM, Katharina wrote: > Oh... the Problem with python2.6 is solved. > I take the numpy Version, which was transformed with 2to3. > > Wait, how did you do that? Setup should automatically select the right version. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From seb.haase at gmail.com Sat Dec 11 17:40:35 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Sat, 11 Dec 2010 23:40:35 +0100 Subject: [Numpy-discussion] np.lookfor -- still supported / working ? Message-ID: Hi all, I recently discovered numpy's lookfor function, which is supposed to look through "all kinds" of doc strings and list relevant functions related to given keywords. However it does not seem to work for me: >>> N.__version__ '1.3.0' >>> N.lookfor('fft', module=None, import_modules=True, regenerate=False) Traceback (most recent call last): File "", line 1, in File "C:\cygwin\home\haase\Priithon_25_win\numpy\lib\utils.py", line 622, in lookfor found.sort(relevance_sort) TypeError: comparison function must return int >>> 1/2 # I use from future import division .... 0.5 >>> I especially like, that it is supposed to work for any other (not just numpy or scipy) module. But that also doesn't work: >>> import wx >>> N.lookfor('background', module='wx', import_modules=True, regenerate=False) Traceback (most recent call last): File "", line 1, in File "C:\cygwin\home\haase\Priithon_25_win\numpy\lib\utils.py", line 574, in lookfor cache = _lookfor_generate_cache(module, import_modules, regenerate) File "C:\cygwin\home\haase\Priithon_25_win\numpy\lib\utils.py", line 729, in _lookfor_generate_cache doc = inspect.getdoc(item) File "C:\cygwin\home\haase\Priithon_25_win\Python25\lib\inspect.py", line 313, in getdoc doc = object.__doc__ NameError: Unknown C global variable >>> I tried it on more recent numpy (1.5.1 I think) and got same problems. Any comments ? Thanks, Sebastian Haase From pav at iki.fi Sat Dec 11 18:53:58 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 11 Dec 2010 23:53:58 +0000 (UTC) Subject: [Numpy-discussion] np.lookfor -- still supported / working ? References: Message-ID: On Sat, 11 Dec 2010 23:40:35 +0100, Sebastian Haase wrote: > Hi all, > > I recently discovered numpy's lookfor function, which is supposed to > look through "all kinds" of doc strings and list relevant functions > related to given keywords. 
However it does not seem to work for me: >>>> N.__version__ > '1.3.0' >>>> N.lookfor('fft', module=None, import_modules=True, regenerate=False) [clip] Worksforme >>> import numpy as np >>> np.__version__ '1.5.1' >>> np.lookfor('fft', module=None, import_modules=True, regenerate=False) Search results for 'fft' ------------------------ numpy.fft.hfft Compute the FFT of a signal whose spectrum has Hermitian symmetry. ... [clip] > I especially like, that it is supposed to work for any other (not just > numpy or scipy) module. > But that also doesn't work: >>>> import wx >>>> N.lookfor('background', module='wx', import_modules=True, >>>> regenerate=False) > Traceback (most recent call last): > File "", line 1, in > File "C:\cygwin\home\haase\Priithon_25_win\numpy\lib\utils.py", line > 574, in lookfor > cache = _lookfor_generate_cache(module, import_modules, regenerate) > File "C:\cygwin\home\haase\Priithon_25_win\numpy\lib\utils.py", line > 729, in _lookfor_generate_cache > doc = inspect.getdoc(item) > File "C:\cygwin\home\haase\Priithon_25_win\Python25\lib\inspect.py", > line 313, in getdoc > doc = object.__doc__ > NameError: Unknown C global variable [clip] That's more of an issue in the `wx` module in that it behaves in a non- standard way under introspection. But yes, it would be possible to catch that exception and ignore it. -- Pauli Virtanen From olivier.grisel at ensta.org Sun Dec 12 08:41:25 2010 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Sun, 12 Dec 2010 14:41:25 +0100 Subject: [Numpy-discussion] [ANN] FOSDEM datadevroom - Feb. 5 2011 - Brussels - Call for Presentations Message-ID: Hello numpy users, We (Isabel Drost, Nicolas Maillot and I) are organizing a Data Analytics Devroom that will take place during the next edition of the FOSDEM in Brussels on Feb. 5. Here is the CFP: http://datadevroom.couch.it/CFP You might be interested in attending the event and take the opportunity to speak about your projects. Important Dates (all dates in GMT +2): Submission deadline: 2010-12-17 Notification of accepted speakers: 2010-12-20 Publication of final schedule: 2011-01-10 Meetup: 2011-02-05 The event will comprise presentations on scalable data processing. We invite you to submit talks on the topics: Information retrieval / Search Large Scale data processing, Machine Learning, Text Mining, Computer vision, [Linked] Open Data. High quality, technical submissions are called for, ranging from principles to practice. We are looking for presentations on the implementation of the systems themselves, real world applications and case studies. Submissions should be based on free software solutions. Please re-distribute this CFP to people who might be interested. Looking forward to meeting you face to face in Brussels, -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From Ingwer.Wurzel at gmx.net Sun Dec 12 13:06:10 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sun, 12 Dec 2010 19:06:10 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> <1292096944.4874.43.camel@Speranza> <1292097515.4874.47.camel@Speranza> Message-ID: <1292177170.8349.25.camel@Speranza> Hi Chuck, You are right, it works. I had so many versions of Numpy, that in the end I lost track. *Sorry* But now it works perfectly. Thank you for the help. /Ingwer ps: I know SciPY has its own Mail list, but it could be, that somebody can answer my question. 
Does SciPy works on Python3.1? Am Samstag, den 11.12.2010, 13:19 -0700 schrieb Charles R Harris: > > > On Sat, Dec 11, 2010 at 12:58 PM, Katharina > wrote: > Oh... the Problem with python2.6 is solved. > I take the numpy Version, which was transformed with 2to3. > > > Wait, how did you do that? Setup should automatically select the right > version. > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sun Dec 12 14:23:21 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 12 Dec 2010 12:23:21 -0700 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: <1292177170.8349.25.camel@Speranza> References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> <1292096944.4874.43.camel@Speranza> <1292097515.4874.47.camel@Speranza> <1292177170.8349.25.camel@Speranza> Message-ID: On Sun, Dec 12, 2010 at 11:06 AM, Katharina wrote: > Hi Chuck, > You are right, it works. > I had so many versions of Numpy, that in the end I lost track. > *Sorry* > > But now it works perfectly. > Thank you for the help. > > /Ingwer > > > ps: I know SciPY has its own Mail list, but it could be, that somebody > can answer my question. > Does SciPy works on Python3.1? > > Support for 3.1 will be in the next scipy release. It should be available in 4-6 weeks. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ingwer.Wurzel at gmx.net Sun Dec 12 14:48:39 2010 From: Ingwer.Wurzel at gmx.net (Katharina) Date: Sun, 12 Dec 2010 20:48:39 +0100 Subject: [Numpy-discussion] Numpy and Python3 In-Reply-To: References: <1291977198.4723.137.camel@Speranza> <1292071320.4874.26.camel@Speranza> <1292093610.4874.31.camel@Speranza> <1292096944.4874.43.camel@Speranza> <1292097515.4874.47.camel@Speranza> <1292177170.8349.25.camel@Speranza> Message-ID: <1292183319.8349.34.camel@Speranza> ok, thanks /Ingwer Am Sonntag, den 12.12.2010, 12:23 -0700 schrieb Charles R Harris: > > > On Sun, Dec 12, 2010 at 11:06 AM, Katharina > wrote: > Hi Chuck, > You are right, it works. > I had so many versions of Numpy, that in the end I lost track. > *Sorry* > > But now it works perfectly. > Thank you for the help. > > /Ingwer > > > ps: I know SciPY has its own Mail list, but it could be, that > somebody > can answer my question. > Does SciPy works on Python3.1? > > > Support for 3.1 will be in the next scipy release. It should be > available in 4-6 weeks. > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From kwgoodman at gmail.com Mon Dec 13 12:59:48 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 13 Dec 2010 09:59:48 -0800 Subject: [Numpy-discussion] Output dtype Message-ID: >From the np.median doc string: "If the input contains integers, or floats of smaller precision than 64, then the output data-type is float64." >> arr = np.array([[0,1,2,3,4,5]], dtype='float32') >> np.median(arr, axis=0).dtype dtype('float32') >> np.median(arr, axis=1).dtype dtype('float32') >> np.median(arr, axis=None).dtype dtype('float64') So the output doesn't agree with the doc string. What is the desired dtype of the accumulator and the output for when the input dtype is less than float64? Should it depend on axis? 
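For reference, np.mean does expose an explicit dtype argument for the accumulator (np.median takes no such argument), so the float64 upcast in the flattened case can at least be overridden by hand; a quick interactive check on the same array, against the 1.5.x behaviour shown above:

>>> import numpy as np
>>> arr = np.array([[0, 1, 2, 3, 4, 5]], dtype='float32')
>>> np.mean(arr).dtype                    # flattened case upcasts
dtype('float64')
>>> np.mean(arr, dtype=np.float32).dtype  # accumulator forced to float32
dtype('float32')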
I'm trying to duplicate the behavior of np.median (and other numpy/scipy functions) in the Bottleneck package and am running into a few corner cases while unit testing. Here's another one: >> np.sum([np.nan]).dtype dtype('float64') >> np.nansum([1,np.nan]).dtype dtype('float64') >> np.nansum([np.nan]).dtype AttributeError: 'float' object has no attribute 'dtype' I just duplicated the numpy behavior for that one since it was easy to do. From morph at debian.org Mon Dec 13 14:51:56 2010 From: morph at debian.org (Sandro Tosi) Date: Mon, 13 Dec 2010 20:51:56 +0100 Subject: [Numpy-discussion] where to ship libnpymath.a ? Message-ID: Hi, in Debian we had a bug report[1] requesting to ship libnpymath.a . [1] http://bugs.debian.org/596987 Our python packaging tools doesn't handle .a files, so I'd like to ask you where exactly should I ship that file. In the build directory I have: $ find . -name "*.a" | xargs md5sum 4c2371b98c138756b0471a4fb364e0ae ./debian/tmp/usr/lib/python2.5/site-packages/numpy/core/lib/libnpymath.a fc6040f2bd4354cca8ef130abc5c8b17 ./debian/tmp/usr/lib/python2.6/dist-packages/numpy/core/lib/libnpymath.a 0c9870a2e5cf61669c92677d9b12c116 ./build/temp_d.linux-x86_64-2.5/libnpymath.a 47fdd29b85570ce80b1c616c6c02f41a ./build/temp.linux-x86_64-2.6-pydebug/libnpymath.a 4c2371b98c138756b0471a4fb364e0ae ./build/temp.linux-x86_64-2.5/libnpymath.a fc6040f2bd4354cca8ef130abc5c8b17 ./build/temp.linux-x86_64-2.6/libnpymath.a (the md5sum is to show that they are actually different between python version and/or debug build): the first 2 are in the "temporary" debian package preparation dir, while the other 4 are for 2.5/2.6 + normal/debug build. So, back to the original question: where should I put libnpymath.a to be useful for our users (main request: new scipy)? maybe in ..../numpy/core/lib/ ? Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From bsouthey at gmail.com Mon Dec 13 15:20:01 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 13 Dec 2010 14:20:01 -0600 Subject: [Numpy-discussion] Output dtype In-Reply-To: References: Message-ID: <4D067FF1.9090001@gmail.com> On 12/13/2010 11:59 AM, Keith Goodman wrote: > > From the np.median doc string: "If the input contains integers, or > floats of smaller precision than 64, then the output data-type is > float64." > >>> arr = np.array([[0,1,2,3,4,5]], dtype='float32') >>> np.median(arr, axis=0).dtype > dtype('float32') >>> np.median(arr, axis=1).dtype > dtype('float32') >>> np.median(arr, axis=None).dtype > dtype('float64') > > So the output doesn't agree with the doc string. > > What is the desired dtype of the accumulator and the output for when > the input dtype is less than float64? Should it depend on axis? > > I'm trying to duplicate the behavior of np.median (and other > numpy/scipy functions) in the Bottleneck package and am running into a > few corner cases while unit testing. > > Here's another one: > >>> np.sum([np.nan]).dtype > dtype('float64') >>> np.nansum([1,np.nan]).dtype > dtype('float64') >>> np.nansum([np.nan]).dtype > > AttributeError: 'float' object has no attribute 'dtype' > > I just duplicated the numpy behavior for that one since it was easy to do. 
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Unless something has changed since the docstring was written, this is probably an inherited 'bug' from np.mean() as the author expected that the docstring of mean was correct. For my 'old' 2.0 dev version: >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype dtype('float32') >>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype dtype('float64') Bruce From kwgoodman at gmail.com Mon Dec 13 15:32:25 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 13 Dec 2010 12:32:25 -0800 Subject: [Numpy-discussion] Output dtype In-Reply-To: <4D067FF1.9090001@gmail.com> References: <4D067FF1.9090001@gmail.com> Message-ID: On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote: > On 12/13/2010 11:59 AM, Keith Goodman wrote: >> > From the np.median doc string: "If the input contains integers, or >> floats of smaller precision than 64, then the output data-type is >> float64." >> >>>> arr = np.array([[0,1,2,3,4,5]], dtype='float32') >>>> np.median(arr, axis=0).dtype >> ? ? dtype('float32') >>>> np.median(arr, axis=1).dtype >> ? ? dtype('float32') >>>> np.median(arr, axis=None).dtype >> ? ? dtype('float64') >> >> So the output doesn't agree with the doc string. >> >> What is the desired dtype of the accumulator and the output for when >> the input dtype is less than float64? Should it depend on axis? >> >> I'm trying to duplicate the behavior of np.median (and other >> numpy/scipy functions) in the Bottleneck package and am running into a >> few corner cases while unit testing. >> >> Here's another one: >> >>>> np.sum([np.nan]).dtype >> ? ? dtype('float64') >>>> np.nansum([1,np.nan]).dtype >> ? ? dtype('float64') >>>> np.nansum([np.nan]).dtype >> >> AttributeError: 'float' object has no attribute 'dtype' >> >> I just duplicated the numpy behavior for that one since it was easy to do. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > Unless something has changed since the docstring was written, this is > probably an inherited 'bug' from np.mean() as the author expected that > the docstring of mean was correct. For my 'old' 2.0 dev version: > > ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype > dtype('float32') > ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype > dtype('float64') Same issue with np.std and np.var. From pav at iki.fi Mon Dec 13 16:05:10 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 13 Dec 2010 21:05:10 +0000 (UTC) Subject: [Numpy-discussion] where to ship libnpymath.a ? References: Message-ID: On Mon, 13 Dec 2010 20:51:56 +0100, Sandro Tosi wrote: [clip] > So, back to the original question: where should I put libnpymath.a to be > useful for our users (main request: new scipy)? maybe in > ..../numpy/core/lib/ ? In the place pointed to by npymath.ini, which is where "python setup.py install" puts it. The point is that numpy.distutils should be able to locate this library file for building extension modules that depend on it. 
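A minimal sketch of how a dependent package's setup.py can pick the library up through that mechanism (get_info('npymath') reads npymath.ini; the package and extension names below are only placeholders):

from numpy.distutils.misc_util import Configuration, get_info

def configuration(parent_package='', top_path=None):
    config = Configuration('mypkg', parent_package, top_path)
    # get_info('npymath') returns the include dirs, library dirs and the
    # 'npymath' library to link against, as recorded in npymath.ini
    config.add_extension('_foo',
                         sources=['_foo.c'],
                         extra_info=get_info('npymath'))
    return config

if __name__ == '__main__':
    from numpy.distutils.core import setup
    setup(configuration=configuration)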
-- Pauli Virtanen From Kathleen.M.Tacina at nasa.gov Mon Dec 13 16:39:20 2010 From: Kathleen.M.Tacina at nasa.gov (Kathleen M Tacina) Date: Mon, 13 Dec 2010 16:39:20 -0500 Subject: [Numpy-discussion] same name and title in structured arrays Message-ID: <1292276360.7055.31.camel@moses.grc.nasa.gov> Hi, I've been finding numpy/scipy/matplotlib a very useful tool for data analysis. However, a recent change has caused me some problems. Numpy used to allow the name and title of a column of a structured array or recarray to be the same (at least in the svn version as of early last winter). Now, it seems that this is not allowed; see below. Python 2.6.5 (r265:79063, Apr 27 2010, 12:20:23) [GCC 4.2.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.__version__ '2.0.0.dev-799179d' >>> data = np.ndarray((5,1),dtype=[(('T','T'),float)]) Traceback (most recent call last): File "", line 1, in ValueError: title already used as a name or title. >>> data = np.ndarray((5,1),dtype=[(('T at 0.25-in in F','T'),float)]) >>> Would it be possible to change the tests to allow the name and title to be the same for the same component? I can work around this new limitation for new data files, but I'm having trouble reading data files I created last winter. (But even for new stuff, it would be nice if it name=title was allowed. I like using both names and titles so that I can record interesting information like point location and units in the titles. Sometimes, though, there isn't anything else interesting to say about a column (e.g., 'point','date', 'time'), and I'd like to (1) have both a title and a name to be consistent with more interesting columns and (2) have the title be equal to the name.) Thanks for any help you can give me with this! Kathy Tacina -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Mon Dec 13 17:53:45 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 13 Dec 2010 14:53:45 -0800 Subject: [Numpy-discussion] Output dtype In-Reply-To: <4D067FF1.9090001@gmail.com> References: <4D067FF1.9090001@gmail.com> Message-ID: On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote: > Unless something has changed since the docstring was written, this is > probably an inherited 'bug' from np.mean() as the author expected that > the docstring of mean was correct. For my 'old' 2.0 dev version: > > ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype > dtype('float32') > ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype > dtype('float64') Are you saying the bug is in the doc string, the output, or both? I think it is both; I expect the second result above to be float32. From bsouthey at gmail.com Mon Dec 13 21:50:01 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 13 Dec 2010 20:50:01 -0600 Subject: [Numpy-discussion] Output dtype In-Reply-To: References: <4D067FF1.9090001@gmail.com> Message-ID: On Mon, Dec 13, 2010 at 4:53 PM, Keith Goodman wrote: > On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote: > >> Unless something has changed since the docstring was written, this is >> probably an inherited 'bug' from np.mean() as the author expected that >> the docstring of mean was correct. 
For my 'old' 2.0 dev version: >> >> ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32'), axis=1).dtype >> dtype('float32') >> ?>>> np.mean( np.array([[0,1,2,3,4,5]], dtype='float32')).dtype >> dtype('float64') > > Are you saying the bug is in the doc string, the output, or both? I > think it is both; I expect the second result above to be float32. This was a surprise to me as this 'misunderstanding' goes back to at least numpy 1.1. Both! The documentation is wrong when using axis argument. There is a bug because the output should be the same dtype for all possible axis values - which should be a ticket regardless. The recent half-float dtype or if users want the lower precision suggests that it might be a good time to ensure the 'correct' option is used (whatever that is). Bruce From morph at debian.org Tue Dec 14 02:31:46 2010 From: morph at debian.org (Sandro Tosi) Date: Tue, 14 Dec 2010 08:31:46 +0100 Subject: [Numpy-discussion] where to ship libnpymath.a ? In-Reply-To: References: Message-ID: Hi, On Mon, Dec 13, 2010 at 22:05, Pauli Virtanen wrote: > On Mon, 13 Dec 2010 20:51:56 +0100, Sandro Tosi wrote: > [clip] >> So, back to the original question: where should I put libnpymath.a to be >> useful for our users (main request: new scipy)? maybe in >> ..../numpy/core/lib/ ? > > In the place pointed to by npymath.ini, which is where > "python setup.py install" puts it. The point is that numpy.distutils > should be able to locate this library file for building extension modules > that depend on it. Yep, now I see: I think I've prepared the package shipping libnpymath.a in the right place, let's see :) Thanks a lot for your help! Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From mjanikas at esri.com Tue Dec 14 13:20:00 2010 From: mjanikas at esri.com (Mark Janikas) Date: Tue, 14 Dec 2010 10:20:00 -0800 Subject: [Numpy-discussion] Most efficient trim of arrays Message-ID: Hello All, I was wondering what the best way to trim an array based on some values I do not want.... I could use NUM.where or NUM.take... but let me give you an example: import numpy as NUM n = 100 (Length of my dataset) data = NUM.empty((n,), float) badRecords = [] for ind, record in enumerate(records): if record == someValueIDOntWant: badRecords.append(ind) else: data[ind] = record Now, I want to "trim" my array using badRecords. I guess I want to avoid copying. Any thoughts on the best way to do it? I do not want to use lists and then subsequently array the result as it is nice to pre-allocate the space. Thanks much, MJ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Dec 14 13:32:45 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 14 Dec 2010 12:32:45 -0600 Subject: [Numpy-discussion] Most efficient trim of arrays In-Reply-To: References: Message-ID: On Tue, Dec 14, 2010 at 12:20, Mark Janikas wrote: > Hello All, > > I was wondering what the best way to trim an array based on some values I do > not want?.? I could use NUM.where or NUM.take? but let me give you an > example: > > import numpy as NUM > > n = 100 (Length of my dataset) > data = NUM.empty((n,), float) > badRecords = [] > for ind, record in enumerate(records): > ??????????????? if record == someValueIDOntWant: > ??????????????????????????????? badRecords.append(ind) > ??????????????? else: > ??????????????????????????????? data[ind] = record > > Now, I want to ?trim? 
my array using badRecords. ?I guess I want to avoid > copying.? Any thoughts on the best way to do it?? I do not want to use lists > and then subsequently array the result as it is nice to pre-allocate the > space. Don't fear the copy. Use boolean indexing. http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#boolean -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From tmp50 at ukr.net Wed Dec 15 10:24:20 2010 From: tmp50 at ukr.net (Dmitrey) Date: Wed, 15 Dec 2010 17:24:20 +0200 Subject: [Numpy-discussion] new quarterly OpenOpt/FuncDesigner release 0.32 In-Reply-To: <4D085CD7.5080104@gmail.com> References: <4D085CD7.5080104@gmail.com> Message-ID: Hi all, I'm glad to inform you about new quarterly OpenOpt/FuncDesigner release (0.32): OpenOpt: * New class: LCP (and related solver) * New QP solver: qlcp * New NLP solver: sqlcp * New large-scale NSP (nonsmooth) solver gsubg. Currently it still requires lots of improvements (especially for constraints - their handling is very premature yet and often fails), but since the solver sometimes already works better than ipopt, algencan and other competitors it was tried with, I decided to include the one into the release. * Now SOCP can handle Ax <= b constraints (and bugfix for handling lb <= x <= ub has been committed) * Some other fixes and improvements > FuncDesigner: * Add new function removeAttachedConstraints * Add new oofuns min and max (their capabilities are quite restricted yet) * Systems of nonlinear equations: possibility to assign personal tolerance for an equation * Some fixes and improvements > > For more details see our forum entry > http://forum.openopt.org/viewtopic.php?id=325 > > Regards, D. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Dec 20 07:15:17 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 20 Dec 2010 13:15:17 +0100 Subject: [Numpy-discussion] same name and title in structured arrays In-Reply-To: <1292276360.7055.31.camel@moses.grc.nasa.gov> References: <1292276360.7055.31.camel@moses.grc.nasa.gov> Message-ID: <1292847317.2876.0.camel@talisman> On Mon, 13 Dec 2010 16:39:20 -0500, Kathleen M Tacina wrote: > I've been finding numpy/scipy/matplotlib a very useful tool for data > analysis. However, a recent change has caused me some problems. > > Numpy used to allow the name and title of a column of a structured array > or recarray to be the same (at least in the svn version as of early last > winter). Now, it seems that this is not allowed; see below. > > Python 2.6.5 (r265:79063, Apr 27 2010, 12:20:23) [GCC 4.2.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> np.__version__ > '2.0.0.dev-799179d' >>>> data = np.ndarray((5,1),dtype=[(('T','T'),float)]) > Traceback (most recent call last): > File "", line 1, in > ValueError: title already used as a name or title. This behavior was changed when fixing #1254: http://projects.scipy.org/numpy/ticket/1254 It seems that it will not be possible to just revert to the old behavior, since apparently allowing that was a design mistake. The data loading routines however could in principle be changed to handle duplicate title/field combinations. How did you save your data, with numpy.save/numpy.savez or via pickling? 
-- Pauli Virtanen From alan.isaac at gmail.com Mon Dec 20 11:28:47 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 20 Dec 2010 11:28:47 -0500 Subject: [Numpy-discussion] sample without replacement Message-ID: <4D0F843F.3070705@gmail.com> I want to sample *without* replacement from a vector (as with Python's random.sample). I don't see a direct replacement for this, and I don't want to carry two PRNG's around. Is the best way something like this? permutation(myvector)[:samplesize] Thanks, Alan Isaac From qubax at gmx.at Sun Dec 19 10:40:13 2010 From: qubax at gmx.at (qubax at gmx.at) Date: Sun, 19 Dec 2010 16:40:13 +0100 Subject: [Numpy-discussion] Efficient Matrix-matrix product of hermitian matrices, zhemm (blas) and numpy Message-ID: <20101219154013.GA7960@tux.hotze.com> I need to calculate several products of matrices where at least one of them is always hermitian. The function zhemm (in blas, level 3) seems to directly do that in an efficient manner. However ... how can i access that function and dirctly apply it on numpy arrays? If you know alternatives that are equivalent or even faster, please let me know. Any help is highly appreciated. Q -- The king who needs to remind his people of his rank, is no king. A beggar's mistake harms no one but the beggar. A king's mistake, however, harms everyone but the king. Too often, the measure of power lies not in the number who obey your will, but in the number who suffer your stupidity. From fperez.net at gmail.com Fri Dec 17 09:16:13 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 17 Dec 2010 19:46:13 +0530 Subject: [Numpy-discussion] Links to doc guidelines broken Message-ID: Howdy, In the ipython doc guide (and many other places) we point to the numpy coding guidelines (especially for documentation), but today while conducting a sprint at the Scipy India conference, I noticed this link is now dead: http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines It seems the docs got moved over to github, which is fine: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt but it would be really nice if whoever went through the Trac wiki deleting stuff could have left a link pointing to the new proper location of this document. For those who knew what they were looking for it's easy enough to find it againg with a bit of googling around, but newcomers who may be trying to read these guidelines and simply get Trac's version of a 404 are likely to be left confused. I realize that in moving to a new infrastructure broken links are hard to avoid, but for documents as widely used as the numpy coding guidelines, perhaps leaving a link to the new location in the old location would be a good idea... I fixed the page with a github link, but there may be other important pages needing a similar treatment, and I think as a policy it's generally a good idea to leave proper redirects when important pages are deleted. Thanks f From jsalvati at u.washington.edu Mon Dec 20 12:13:21 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Mon, 20 Dec 2010 09:13:21 -0800 Subject: [Numpy-discussion] sample without replacement In-Reply-To: <4D0F843F.3070705@gmail.com> References: <4D0F843F.3070705@gmail.com> Message-ID: I think this is not possible to do efficiently with just numpy. If you want to do this efficiently, I wrote a no-replacement sampler in Cython some time ago (below). I hearby release it to the public domain. 
''' Created on Oct 24, 2009 http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement @author: johnsalvatier ''' from __future__ import division import numpy def random_no_replace(sampleSize, populationSize, numSamples): samples = numpy.zeros((numSamples, sampleSize),dtype=int) # Use Knuth's variable names cdef int n = sampleSize cdef int N = populationSize cdef i = 0 cdef int t = 0 # total input records dealt with cdef int m = 0 # number of items selected so far cdef double u while i < numSamples: t = 0 m = 0 while m < n : u = numpy.random.uniform() # call a uniform(0,1) random number generator if (N - t)*u >= n - m : t += 1 else: samples[i,m] = t t += 1 m += 1 i += 1 return samples On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac wrote: > I want to sample *without* replacement from a vector > (as with Python's random.sample). I don't see a direct > replacement for this, and I don't want to carry two > PRNG's around. Is the best way something like this? > > permutation(myvector)[:samplesize] > > Thanks, > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jpscipy at gmail.com Mon Dec 20 15:25:05 2010 From: jpscipy at gmail.com (Justin Peel) Date: Mon, 20 Dec 2010 13:25:05 -0700 Subject: [Numpy-discussion] Reversing an array in-place Message-ID: I noticed that there is currently no way to reverse a numpy array in-place. The current way to reverse a numpy array is using slicing, ala arr[::-1]. This is okay for small matrices, but for really large ones, this can be prohibitive. Not only that, but an in-place reverse is much faster than slicing. It seems like a reverse method could be added to arrays that would reverse the array along a given axis fairly easily. Is there any opposition to this? Also, there is consideration for simply marking a given axis as being reverse similar to how transposes are taken. However, I see this as a problem for a method like reshape to deal with and therefore think that it is better to just add a reverse method. What are your opinions? I'm quite willing to make such a method if it will be accepted. Justin Peel From jpscipy at gmail.com Mon Dec 20 15:25:29 2010 From: jpscipy at gmail.com (Justin Peel) Date: Mon, 20 Dec 2010 13:25:29 -0700 Subject: [Numpy-discussion] Short circuiting the all() and any() methods/functions Message-ID: It has come to my attention that the all() and any() methods/functions do not short circuit. It takes nearly as much time to call any() on an array which has 1 as the first entry as it does to call it on an array of the same size full of zeros. The cause of the problem is that all() and any() just call reduce() with the appropriate operator. Is anyone opposed to changing the implementations of these functions so that they short-circuit? By the way, Python already short circuits all() and any() correctly so it certainly makes sense to enact this change. I'm willing to head this up if there isn't any opposition to it. 
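In the meantime, a user-level workaround is straightforward to sketch: scan the array in fixed-size blocks and stop as soon as a block decides the answer, which recovers most of the short-circuit benefit without touching the C reduce loop (the block size below is an arbitrary choice):

import numpy as np

def any_blockwise(a, blocksize=4096):
    """any() that can stop early by testing fixed-size blocks."""
    a = np.asarray(a).ravel()  # note: ravel() copies if a is not contiguous
    for start in range(0, a.size, blocksize):
        if a[start:start + blocksize].any():
            return True
    return False

def all_blockwise(a, blocksize=4096):
    """all() that can stop early by testing fixed-size blocks."""
    a = np.asarray(a).ravel()
    for start in range(0, a.size, blocksize):
        if not a[start:start + blocksize].all():
            return False
    return True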
Justin Peel From matt.gregory at oregonstate.edu Mon Dec 20 15:44:17 2010 From: matt.gregory at oregonstate.edu (Matt Gregory) Date: Mon, 20 Dec 2010 12:44:17 -0800 Subject: [Numpy-discussion] creating zonal statistics from two arrays In-Reply-To: References: <1D673F86DDA00841A1216F04D1CE70D6426800037D@EXCH2.nws.oregonstate.edu> Message-ID: On 12/8/2010 9:48 AM, josef.pktd at gmail.com wrote: > Just a thought since I'm not doing spatial statistics. > > If you can create (integer) labels that assigns each point to a zone, > then you can treat it essentially as a 1d grouped data, and you could > use np.bincount to calculate some statistics, or alternatively > scipy.ndimage.measurements for some additional statistics. > > This would avoid any python loop, but require a full label array. Josef, The measurements module did the trick; thanks for the pointer. I just stumbled across a very similar thread on the scipy listserv that you answered basically the same question with some nice code (sorry for the redundancy): http://mail.scipy.org/pipermail/scipy-user/2009-February/019850.html BTW, the OP on that thread (Jose Gomez-Dans) has a script out there for doing just this type of operation that I was after: http://sites.google.com/site/spatialpython/zonal-statistics thanks, matt From charlesr.harris at gmail.com Mon Dec 20 16:12:22 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 20 Dec 2010 14:12:22 -0700 Subject: [Numpy-discussion] Short circuiting the all() and any() methods/functions In-Reply-To: References: Message-ID: On Mon, Dec 20, 2010 at 1:25 PM, Justin Peel wrote: > It has come to my attention that the all() and any() methods/functions > do not short circuit. It takes nearly as much time to call any() on an > array which has 1 as the first entry as it does to call it on an array > of the same size full of zeros. > > The cause of the problem is that all() and any() just call reduce() > with the appropriate operator. Is anyone opposed to changing the > implementations of these functions so that they short-circuit? > > Recent version of reduce do short circuit. What version of numpy are you using? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 20 16:15:25 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 20 Dec 2010 14:15:25 -0700 Subject: [Numpy-discussion] Reversing an array in-place In-Reply-To: References: Message-ID: On Mon, Dec 20, 2010 at 1:25 PM, Justin Peel wrote: > I noticed that there is currently no way to reverse a numpy array > in-place. The current way to reverse a numpy array is using slicing, > ala arr[::-1]. This is okay for small matrices, but for really large > ones, this can be prohibitive. Not only that, but an in-place reverse > is much faster than slicing. It seems like a reverse method could be > added to arrays that would reverse the array along a given axis fairly > easily. Is there any opposition to this? > > The reversed matrix is a view, no copyihg is done. It is even faster than an inplace reversal. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jsalvati at u.washington.edu Mon Dec 20 16:42:26 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Mon, 20 Dec 2010 13:42:26 -0800 Subject: [Numpy-discussion] Giving numpy the ability to multi-iterate excluding an axis Message-ID: A while ago, I asked a whether it was possible to multi-iterate over several ndarrays but exclude a certain axis( http://www.mail-archive.com/numpy-discussion at scipy.org/msg29204.html), sort of a combination of PyArray_IterAllButAxis and PyArray_MultiIterNew. My goal was to allow creation of relatively complex ufuncs that can allow reduction or directionally dependent computation and still use broadcasting (for example a moving averaging ufunc that can have changing averaging parameters). I didn't get any solutions, which I take to mean that no one knew how to do this. I am thinking about trying to make a numpy patch with this functionality, and I have some questions: 1) How difficult would this kind of task be for someone with non-expert C knowledge and good numpy knowledge? 2) Does anyone have advice on how to do this kind of thing? Best Regards, John -------------- next part -------------- An HTML attachment was scrubbed... URL: From jpscipy at gmail.com Mon Dec 20 17:32:28 2010 From: jpscipy at gmail.com (Justin Peel) Date: Mon, 20 Dec 2010 15:32:28 -0700 Subject: [Numpy-discussion] Short circuiting the all() and any() methods/functions In-Reply-To: References: Message-ID: I'm using version 2.0.0.dev8716, which should be new enough I would think. Let me show you what makes me think that there isn't short-circuiting going on. I'll do two timeit's from the command line: $ python -m timeit -s 'import numpy as np; x = np.ones(200000)' 'x.all()' 100 loops, best of 3: 3.87 msec per loop $ python -m timeit -s 'import numpy as np; x = np.ones(200000); x[0] = 0' 'x.all()' 100 loops, best of 3: 2.76 msec per loop You can try different sizes for the arrays if you like, but the ratio of the times seems to hold pretty well. I would think that the second statement would be much, much faster than the first. Instead, it is only about 29% faster. I'm guessing that this speed isn't so much from short-circuiting as that the logical AND operator is faster when the first argument is 0 (the second argument doesn't need to be checked). What do you think? On Mon, Dec 20, 2010 at 2:12 PM, Charles R Harris wrote: > > > On Mon, Dec 20, 2010 at 1:25 PM, Justin Peel wrote: >> >> It has come to my attention that the all() and any() methods/functions >> do not short circuit. It takes nearly as much time to call any() on an >> array which has 1 as the first entry as it does to call it on an array >> of the same size full of zeros. >> >> The cause of the problem is that all() and any() just call reduce() >> with the appropriate operator. Is anyone opposed to changing the >> implementations of these functions so that they short-circuit? >> > > Recent version of reduce do short circuit. What version of numpy are you > using? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From jpscipy at gmail.com Mon Dec 20 17:36:17 2010 From: jpscipy at gmail.com (Justin Peel) Date: Mon, 20 Dec 2010 15:36:17 -0700 Subject: [Numpy-discussion] Reversing an array in-place In-Reply-To: References: Message-ID: Oh, you're quite right. I should have looked more closely into this. Thanks for the reply. 
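A quick interactive check makes the point concrete: the reversed array is a negative-stride view that shares memory with the original, so nothing is copied:

>>> import numpy as np
>>> a = np.arange(5)
>>> b = a[::-1]        # view, not a copy
>>> b.base is a
True
>>> b[0] = 99          # writes through to a[-1]
>>> a
array([ 0,  1,  2,  3, 99])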
On Mon, Dec 20, 2010 at 2:15 PM, Charles R Harris wrote: > > > On Mon, Dec 20, 2010 at 1:25 PM, Justin Peel wrote: >> >> I noticed that there is currently no way to reverse a numpy array >> in-place. The current way to reverse a numpy array is using slicing, >> ala arr[::-1]. This is okay for small matrices, but for really large >> ones, this can be prohibitive. Not only that, but an in-place reverse >> is much faster than slicing. It seems like a reverse method could be >> added to arrays that would reverse the array along a given axis fairly >> easily. Is there any opposition to this? >> > > The reversed matrix is a view,? no copyihg is done. It is even faster than > an inplace reversal. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From pav at iki.fi Mon Dec 20 19:15:08 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 21 Dec 2010 01:15:08 +0100 Subject: [Numpy-discussion] Short circuiting the all() and any() methods/functions In-Reply-To: References: Message-ID: <1292890508.3169.3.camel@Obelisk> ma, 2010-12-20 kello 15:32 -0700, Justin Peel kirjoitti: > I'm using version 2.0.0.dev8716, which should be new enough I would > think. Let me show you what makes me think that there isn't > short-circuiting going on. > > I'll do two timeit's from the command line: > > $ python -m timeit -s 'import numpy as np; x = np.ones(200000)' 'x.all()' > 100 loops, best of 3: 3.87 msec per loop > $ python -m timeit -s 'import numpy as np; x = np.ones(200000); x[0] = > 0' 'x.all()' > 100 loops, best of 3: 2.76 msec per loop The short-circuit is made only for bool arrays. $ python -m timeit -s 'import numpy as np; x = np.ones(200000, dtype=bool)' 'x.all()' 1000 loops, best of 3: 779 usec per loop $ python -m timeit -s 'import numpy as np; x = np.ones(200000, dtype=bool); x[0] = 0' 'x.all()' 100000 loops, best of 3: 3.12 usec per loop Could be easily generalized to all types, though, apart from maybe handling the thruth value of NaN correctly. -- Pauli Virtanen From ralf.gommers at googlemail.com Mon Dec 20 19:39:44 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 21 Dec 2010 08:39:44 +0800 Subject: [Numpy-discussion] Links to doc guidelines broken In-Reply-To: References: Message-ID: On Fri, Dec 17, 2010 at 10:16 PM, Fernando Perez wrote: > Howdy, > > In the ipython doc guide (and many other places) we point to the numpy > coding guidelines (especially for documentation), but today while > conducting a sprint at the Scipy India conference, I noticed this > link is now dead: > > http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines > > It seems the docs got moved over to github, which is fine: > > https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > > but it would be really nice if whoever went through the Trac wiki > deleting stuff could have left a link pointing to the new proper > location of this document. For those who knew what they were looking > for it's easy enough to find it againg with a bit of googling around, > but newcomers who may be trying to read these guidelines and simply > get Trac's version of a 404 are likely to be left confused. > I realize that in moving to a new infrastructure broken links are hard > to avoid, but for documents as widely used as the numpy coding > guidelines, perhaps leaving a link to the new location in the old > location would be a good idea... > > That's my mistake, sorry. 
I changed the front page links and cleaned up the rest. I'll check for other pages and put them back if necessary. Just noticed that for links to the svn repo we have the opposite problem BTW, the pages still exist with no warning that the content is outdated. Ralf > I fixed the page with a github link, but there may be other important > pages needing a similar treatment, and I think as a policy it's > generally a good idea to leave proper redirects when important pages > are deleted. > > > Thanks > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Dec 20 21:41:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 20 Dec 2010 21:41:16 -0500 Subject: [Numpy-discussion] sample without replacement In-Reply-To: <4D0F843F.3070705@gmail.com> References: <4D0F843F.3070705@gmail.com> Message-ID: On Mon, Dec 20, 2010 at 11:28 AM, Alan G Isaac wrote: > I want to sample *without* replacement from a vector > (as with Python's random.sample). ?I don't see a direct > replacement for this, and I don't want to carry two > PRNG's around. ?Is the best way something like ?this? > > ? ? ? ?permutation(myvector)[:samplesize] python has it in random sample( population, k) Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement. New in version 2.3 Josef > > Thanks, > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Mon Dec 20 22:19:21 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 20 Dec 2010 22:19:21 -0500 Subject: [Numpy-discussion] sample without replacement In-Reply-To: References: <4D0F843F.3070705@gmail.com> Message-ID: <4D101CB9.2090907@gmail.com> On 12/20/2010 9:41 PM, josef.pktd at gmail.com wrote: > python has it in random > > sample( population, k) Yes, I mentioned this in my original post: http://www.mail-archive.com/numpy-discussion at scipy.org/msg29324.html But good simulation practice is perhaps to seed a simulation specific random number generator (not just rely on a global), and I don't want to pass around two different instances. So I want to get this functionality from numpy.random. Which reminds me of another question. numpy.random.RandomState accepts an int array as a seed: what is the *intended* use? Thanks, Alan From josef.pktd at gmail.com Mon Dec 20 22:49:25 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 20 Dec 2010 22:49:25 -0500 Subject: [Numpy-discussion] sample without replacement In-Reply-To: <4D101CB9.2090907@gmail.com> References: <4D0F843F.3070705@gmail.com> <4D101CB9.2090907@gmail.com> Message-ID: On Mon, Dec 20, 2010 at 10:19 PM, Alan G Isaac wrote: > On 12/20/2010 9:41 PM, josef.pktd at gmail.com wrote: >> python has it in random >> >> sample( population, k) > > > Yes, I mentioned this in my original post: > http://www.mail-archive.com/numpy-discussion at scipy.org/msg29324.html > > But good simulation practice is perhaps to seed > a simulation specific random number generator > (not just rely on a global), and I don't want > to pass around two different instances. > So I want to get this functionality from numpy.random. 
Sorry, I was reading to fast, and I might be tired. What's the difference between a numpy Random and a python random.Random instance of separate states of the random number generators? Josef > > Which reminds me of another question. > numpy.random.RandomState accepts an int array as a seed: > what is the *intended* use? > > Thanks, > Alan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Tue Dec 21 08:25:33 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 21 Dec 2010 08:25:33 -0500 Subject: [Numpy-discussion] sample without replacement In-Reply-To: References: <4D0F843F.3070705@gmail.com> <4D101CB9.2090907@gmail.com> Message-ID: <4D10AACD.5020907@gmail.com> On 12/20/2010 10:49 PM, josef.pktd at gmail.com wrote: > What's the difference between a numpy Random and a python > random.Random instance of separate states of the random number > generators? Sorry, I don't understand the question. The difference for my use is that a np.RandomState instance provides access to a different set of methods, which unfortunately does not include an equivalent to random.Random's sample method but which does include others I need. Would it be appropriate to request that an analog to random.sample be added to numpy.random? (It might sample only a range, since producing indexes would provide the base functionality.) Or is this functionality absent intentionally? Alan From aarchiba at physics.mcgill.ca Tue Dec 21 10:39:01 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Tue, 21 Dec 2010 10:39:01 -0500 Subject: [Numpy-discussion] sample without replacement In-Reply-To: References: <4D0F843F.3070705@gmail.com> Message-ID: I know this question came up on the mailing list some time ago (19/09/2008), and the conclusion was that yes, you can do it more or less efficiently in pure python; the trick is to use two different methods. If your sample is more than, say, a quarter the size of the set you're drawing from, you permute the set and take the first few. If your sample is smaller, you draw with replacement, then redraw the duplicates, and repeat until there aren't any more duplicates. Since you only do this when your sample is much smaller than the population you don't need to repeat many times. Here's the code I posted to the previous discussion (not tested this time around) with comments: ''' def choose_without_replacement(m,n,repeats=None): """Choose n nonnegative integers less than m without replacement Returns an array of shape n, or (n,repeats). """ if repeats is None: r = 1 else: r = repeats if n>m: raise ValueError, "Cannot find %d nonnegative integers less than %d" % (n,m) if n>m/2: res = np.sort(np.random.rand(m,r).argsort(axis=0)[:n,:],axis=0) else: res = np.random.random_integers(m,size=(n,r)) while True: res = np.sort(res,axis=0) w = np.nonzero(np.diff(res,axis=0)==0) nr = len(w[0]) if nr==0: break res[w] = np.random.random_integers(m,size=nr) if repeats is None: return res[:,0] else: return res For really large values of repeats it does too much sorting; I didn't have the energy to make it pull all the ones with repeats to the beginning so that only they need to be re-sorted the next time through. Still, the expected number of trips through the while loop grows only logarithmically with repeats, so it shouldn't be too bad. 
''' Anne On 20 December 2010 12:13, John Salvatier wrote: > I think this is not possible to do efficiently with just numpy. If you want > to do this efficiently, I wrote a no-replacement sampler in Cython some time > ago (below). I hearby release it to the public domain. > > ''' > > Created on Oct 24, 2009 > http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement > @author: johnsalvatier > > ''' > > from __future__ import division > > import numpy > > def random_no_replace(sampleSize, populationSize, numSamples): > > > > ?? ?samples? = numpy.zeros((numSamples, sampleSize),dtype=int) > > > > ?? ?# Use Knuth's variable names > > ?? ?cdef int n = sampleSize > > ?? ?cdef int N = populationSize > > ?? ?cdef i = 0 > > ?? ?cdef int t = 0 # total input records dealt with > > ?? ?cdef int m = 0 # number of items selected so far > > ?? ?cdef double u > > ?? ?while i < numSamples: > > ?? ? ? ?t = 0 > > ?? ? ? ?m = 0 > > ?? ? ? ?while m < n : > > > > ?? ? ? ? ? ?u = numpy.random.uniform() # call a uniform(0,1) random number > generator > > ?? ? ? ? ? ?if? (N - t)*u >= n - m : > > > > ?? ? ? ? ? ? ? ?t += 1 > > > > ?? ? ? ? ? ?else: > > > > ?? ? ? ? ? ? ? ?samples[i,m] = t > > ?? ? ? ? ? ? ? ?t += 1 > > ?? ? ? ? ? ? ? ?m += 1 > > > > ?? ? ? ?i += 1 > > > > ?? ?return samples > > > > On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac wrote: >> >> I want to sample *without* replacement from a vector >> (as with Python's random.sample). ?I don't see a direct >> replacement for this, and I don't want to carry two >> PRNG's around. ?Is the best way something like ?this? >> >> ? ? ? ?permutation(myvector)[:samplesize] >> >> Thanks, >> Alan Isaac >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From alan.isaac at gmail.com Tue Dec 21 13:53:47 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 21 Dec 2010 13:53:47 -0500 Subject: [Numpy-discussion] bincount question Message-ID: <4D10F7BB.3000905@gmail.com> :: >>> np.bincount([]) Traceback (most recent call last): File "", line 1, in ValueError: The first argument cannot be empty. Why not? (I.e., why isn't an empty array the right answer?) Thanks, Alan Isaac From alan.isaac at gmail.com Tue Dec 21 14:00:26 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 21 Dec 2010 14:00:26 -0500 Subject: [Numpy-discussion] bincount and generators In-Reply-To: References: <4D0F843F.3070705@gmail.com> Message-ID: <4D10F94A.60209@gmail.com> bincount does not currently allow a generator as an argument. I'm wondering if it is considered too costly to extend it to allow this. (Motivation: I'm counting based on an attribute of a large number of objects, and I don't need a list of the data.) Thanks, Alan Isaac From sturla at molden.no Tue Dec 21 14:33:49 2010 From: sturla at molden.no (Sturla Molden) Date: Tue, 21 Dec 2010 20:33:49 +0100 Subject: [Numpy-discussion] sample without replacement In-Reply-To: References: <4D0F843F.3070705@gmail.com> Message-ID: <7732552513b8620ba14bfbf82404a5e3.squirrel@webmail.uio.no> We often need to generate more than one such sample from an array, e.g. for permutation tests. 
If we shuffle an array x of size N and use x[:M] as a random sample "without replacement", we just need to put them back randomly to get the next sample (cf. Fisher-Yates shuffle). That way we get O(M) amortized complexity for each sample of size M. Only the first sample will have complexity O(N). Sturla > I know this question came up on the mailing list some time ago > (19/09/2008), and the conclusion was that yes, you can do it more or > less efficiently in pure python; the trick is to use two different > methods. If your sample is more than, say, a quarter the size of the > set you're drawing from, you permute the set and take the first few. > If your sample is smaller, you draw with replacement, then redraw the > duplicates, and repeat until there aren't any more duplicates. Since > you only do this when your sample is much smaller than the population > you don't need to repeat many times. > > Here's the code I posted to the previous discussion (not tested this > time around) with comments: > > ''' > def choose_without_replacement(m,n,repeats=None): > """Choose n nonnegative integers less than m without replacement > > Returns an array of shape n, or (n,repeats). > """ > if repeats is None: > r = 1 > else: > r = repeats > if n>m: > raise ValueError, "Cannot find %d nonnegative integers less > than %d" % (n,m) > if n>m/2: > res = np.sort(np.random.rand(m,r).argsort(axis=0)[:n,:],axis=0) > else: > res = np.random.random_integers(m,size=(n,r)) > while True: > res = np.sort(res,axis=0) > w = np.nonzero(np.diff(res,axis=0)==0) > nr = len(w[0]) > if nr==0: > break > res[w] = np.random.random_integers(m,size=nr) > > if repeats is None: > return res[:,0] > else: > return res > > For really large values of repeats it does too much sorting; I didn't > have the energy to make it pull all the ones with repeats to the > beginning so that only they need to be re-sorted the next time > through. Still, the expected number of trips through the while loop > grows only logarithmically with repeats, so it shouldn't be too bad. > ''' > > Anne > > On 20 December 2010 12:13, John Salvatier > wrote: >> I think this is not possible to do efficiently with just numpy. If you >> want >> to do this efficiently, I wrote a no-replacement sampler in Cython some >> time >> ago (below). I hearby release it to the public domain. >> >> ''' >> >> Created on Oct 24, 2009 >> http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement >> @author: johnsalvatier >> >> ''' >> >> from __future__ import division >> >> import numpy >> >> def random_no_replace(sampleSize, populationSize, numSamples): >> >> >> >> ?? ?samples? = numpy.zeros((numSamples, sampleSize),dtype=int) >> >> >> >> ?? ?# Use Knuth's variable names >> >> ?? ?cdef int n = sampleSize >> >> ?? ?cdef int N = populationSize >> >> ?? ?cdef i = 0 >> >> ?? ?cdef int t = 0 # total input records dealt with >> >> ?? ?cdef int m = 0 # number of items selected so far >> >> ?? ?cdef double u >> >> ?? ?while i < numSamples: >> >> ?? ? ? ?t = 0 >> >> ?? ? ? ?m = 0 >> >> ?? ? ? ?while m < n : >> >> >> >> ?? ? ? ? ? ?u = numpy.random.uniform() # call a uniform(0,1) random >> number >> generator >> >> ?? ? ? ? ? ?if? (N - t)*u >= n - m : >> >> >> >> ?? ? ? ? ? ? ? ?t += 1 >> >> >> >> ?? ? ? ? ? ?else: >> >> >> >> ?? ? ? ? ? ? ? ?samples[i,m] = t >> >> ?? ? ? ? ? ? ? ?t += 1 >> >> ?? ? ? ? ? ? ? ?m += 1 >> >> >> >> ?? ? ? ?i += 1 >> >> >> >> ?? 
?return samples >> >> >> >> On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac >> wrote: >>> >>> I want to sample *without* replacement from a vector >>> (as with Python's random.sample). ?I don't see a direct >>> replacement for this, and I don't want to carry two >>> PRNG's around. ?Is the best way something like ?this? >>> >>> ? ? ? ?permutation(myvector)[:samplesize] >>> >>> Thanks, >>> Alan Isaac >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla at molden.no Tue Dec 21 14:51:18 2010 From: sturla at molden.no (Sturla Molden) Date: Tue, 21 Dec 2010 20:51:18 +0100 Subject: [Numpy-discussion] Reversing an array in-place In-Reply-To: References: Message-ID: <79490512fd8661f01cf9675aa235d9bb.squirrel@webmail.uio.no> Chuck wrote: > The reversed matrix is a view, no copyihg is done. It is even faster than > an inplace reversal. This is why I love NumPy. In C, Fortran or Matlab most programmers would probably form the reversed array. In NumPy we just change some metainformation (data pointer and strides) behind the scenes. It cannot be done more efficiently than that. Sturla From robert.kern at gmail.com Tue Dec 21 14:53:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 21 Dec 2010 13:53:11 -0600 Subject: [Numpy-discussion] sample without replacement In-Reply-To: <4D0F843F.3070705@gmail.com> References: <4D0F843F.3070705@gmail.com> Message-ID: On Mon, Dec 20, 2010 at 10:28, Alan G Isaac wrote: > I want to sample *without* replacement from a vector > (as with Python's random.sample). ?I don't see a direct > replacement for this, and I don't want to carry two > PRNG's around. ?Is the best way something like ?this? > > ? ? ? ?permutation(myvector)[:samplesize] For one of my personal projects, I copied over the mtrand package and added a method to RandomState for doing this kind of thing using reservoir sampling. http://en.wikipedia.org/wiki/Reservoir_sampling def subset_reservoir(self, long nselected, long ntotal, object size=None): """ Sample a given number integers from the set [0, ntotal) without replacement using a reservoir algorithm. Parameters ---------- nselected : int The number of integers to sample. ntotal : int The size of the set to sample from. size : int, sequence of ints, or None The number of subsets to sample or a shape tuple. An axis of the length nselected will be appended to a shape. Returns ------- out : ndarray The sampled subsets. The order of the items is not necessarily random. Use a slice from the result of permutation() if you need the order of the items to be randomized. 
""" cdef long total_size, length, i, j, u cdef cnp.ndarray[cnp.int_t, ndim=2] out if size is None: shape = (nselected,) total_size = nselected length = 1 elif isinstance(size, int): shape = (size, nselected) total_size = size * nselected length = size else: shape = size + (nselected,) length = 1 for i from 0 <= i < len(size): length *= size[i] total_size = length * nselected out = np.empty((length, nselected), dtype=int) for i from 0 <= i < length: for j from 0 <= j < nselected: out[i,j] = j for j from nselected <= j < ntotal: u = rk_interval(j+1, self.internal_state) if u < nselected: out[i,u] = j return out.reshape(shape) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From mwwiebe at gmail.com Tue Dec 21 19:53:55 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 21 Dec 2010 16:53:55 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs Message-ID: Hello NumPy-ers, After some performance analysis, I've designed and implemented a new iterator designed to speed up ufuncs and allow for easier multi-dimensional iteration. The new code is fairly large, but works quite well already. If some people could read the NEP and give some feedback, that would be great! Here's a link: https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst I would also love it if someone could try building the code and play around with it a bit. The github branch is here: https://github.com/m-paradox/numpy/tree/new_iterator To give a taste of the iterator's functionality, below is an example from the NEP for how to implement a "Lambda UFunc." With just a few lines of code, it's possible to replicate something similar to the numexpr library (numexpr still gets a bigger speedup, though). In the example expression I chose, execution time went from 138ms to 61ms. Hopefully this is a good Christmas present for NumPy. :) Cheers, Mark Here is the definition of the ``luf`` function.:: def luf(lamdaexpr, *args, **kwargs): """Lambda UFunc e.g. c = luf(lambda i,j:i+j, a, b, order='K', casting='safe', buffersize=8192) c = np.empty(...) luf(lambda i,j:i+j, a, b, out=c, order='K', casting='safe', buffersize=8192) """ nargs = len(args) op = args + (kwargs.get('out',None),) it = np.newiter(op, ['buffered','no_inner_iteration'], [['readonly','nbo_aligned']]*nargs + [['writeonly','allocate','no_broadcast']], order=kwargs.get('order','K'), casting=kwargs.get('casting','safe'), buffersize=kwargs.get('buffersize',0)) while not it.finished: it[-1] = lamdaexpr(*it[:-1]) it.iternext() return it.operands[-1] Then, by using ``luf`` instead of straight Python expressions, we can gain some performance from better cache behavior.:: In [2]: a = np.random.random((50,50,50,10)) In [3]: b = np.random.random((50,50,1,10)) In [4]: c = np.random.random((50,50,50,1)) In [5]: timeit 3*a+b-(a/c) 1 loops, best of 3: 138 ms per loop In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) 10 loops, best of 3: 60.9 ms per loop In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c)) Out[7]: True -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Tue Dec 21 19:59:15 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 21 Dec 2010 16:59:15 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: That is an amazing christmas present. 
On Tue, Dec 21, 2010 at 4:53 PM, Mark Wiebe wrote: > Hello NumPy-ers, > > After some performance analysis, I've designed and implemented a new > iterator designed to speed up ufuncs and allow for easier multi-dimensional > iteration. The new code is fairly large, but works quite well already. If > some people could read the NEP and give some feedback, that would be great! > Here's a link: > > > https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst > > I would also love it if someone could try building the code and play around > with it a bit. The github branch is here: > > https://github.com/m-paradox/numpy/tree/new_iterator > > To give a taste of the iterator's functionality, below is an example from > the NEP for how to implement a "Lambda UFunc." With just a few lines of > code, it's possible to replicate something similar to the numexpr library > (numexpr still gets a bigger speedup, though). In the example expression I > chose, execution time went from 138ms to 61ms. > > Hopefully this is a good Christmas present for NumPy. :) > > Cheers, > Mark > > Here is the definition of the ``luf`` function.:: > > def luf(lamdaexpr, *args, **kwargs): > """Lambda UFunc > > e.g. > c = luf(lambda i,j:i+j, a, b, order='K', > casting='safe', buffersize=8192) > > c = np.empty(...) > luf(lambda i,j:i+j, a, b, out=c, order='K', > casting='safe', buffersize=8192) > """ > > nargs = len(args) > op = args + (kwargs.get('out',None),) > it = np.newiter(op, ['buffered','no_inner_iteration'], > [['readonly','nbo_aligned']]*nargs + > [['writeonly','allocate','no_broadcast']], > order=kwargs.get('order','K'), > casting=kwargs.get('casting','safe'), > buffersize=kwargs.get('buffersize',0)) > while not it.finished: > it[-1] = lamdaexpr(*it[:-1]) > it.iternext() > > return it.operands[-1] > > Then, by using ``luf`` instead of straight Python expressions, we > can gain some performance from better cache behavior.:: > > In [2]: a = np.random.random((50,50,50,10)) > In [3]: b = np.random.random((50,50,1,10)) > In [4]: c = np.random.random((50,50,50,1)) > > In [5]: timeit 3*a+b-(a/c) > 1 loops, best of 3: 138 ms per loop > > In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 60.9 ms per loop > > In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c)) > Out[7]: True > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Tue Dec 21 20:12:15 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 21 Dec 2010 17:12:15 -0800 Subject: [Numpy-discussion] Giving numpy the ability to multi-iterate excluding an axis In-Reply-To: References: Message-ID: On Mon, Dec 20, 2010 at 1:42 PM, John Salvatier wrote: > A while ago, I asked a whether it was possible to multi-iterate over > several ndarrays but exclude a certain axis( > http://www.mail-archive.com/numpy-discussion at scipy.org/msg29204.html), > sort of a combination of PyArray_IterAllButAxis and PyArray_MultiIterNew. My > goal was to allow creation of relatively complex ufuncs that can allow > reduction or directionally dependent computation and still use broadcasting > (for example a moving averaging ufunc that can have changing averaging > parameters). I didn't get any solutions, which I take to mean that no one > knew how to do this. 
> > I am thinking about trying to make a numpy patch with this functionality, > and I have some questions: 1) How difficult would this kind of task be for > someone with non-expert C knowledge and good numpy knowledge? 2) Does anyone > have advice on how to do this kind of thing? > You may be able to do what you would like with the new iterator I've written. In particular, it supports nesting multiple iterators by providing either pointers or offsets, and allowing you to specify any subset of the axes to iterate. Here's how the code to do this in a simple 3D case might look, for making axis 1 the inner loop: PyArrayObject *op[2] = {a,b}; npy_intp axes_outer[2] = {0,2}}; npy_intp *op_axes[2]; npy_intp axis_inner = 1; npy_int32 flags[2] = {NPY_ITER_READONLY, NPY_ITER_READONLY}; NpyIter *outer, *inner; NpyIter_IterNext_Fn oiternext, iiternext; npy_intp *ooffsets; char **idataptrs; op_axes[0] = op_axes[1] = axes_outer; outer = NpyIter_MultiNew(2, op, NPY_ITER_OFFSETS, NPY_KEEPORDER, NPY_NO_CASTING, flags, NULL, 2, op_axes, 0); op_axes[0] = op_axes[1] = &axis_inner; inner = NpyIter_MultiNew(2, op, 0, NPY_KEEPORDER, NPY_NO_CASTING, flags, NULL, 1, op_axes, 0); oiternext = NpyIter_GetIterNext(outer); iiternext = NpyIter_GetIterNext(inner); ooffsets = (npy_intp *)NpyIter_GetDataPtrArray(outer); idataptrs = NpyIter_GetDataPtrArray(inner); do { do { char *a_data = idataptrs[0] + ooffsets[0], *b_data = idataptrs[0] + ooffsets[0]; /* Do stuff with the data */ } while(iiternext()); NpyIter_Reset(inner); } while(oiternext()); NpyIter_Deallocate(outer); NpyIter_Deallocate(inner); Extending to more dimensions, or making both the inner and outer loops have multiple dimensions, isn't too crazy. Is this along the lines of what you need? If you check out my code, note that it currently isn't exposed as NumPy API yet, but you can try a lot of things with the Python exposure. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsalvati at u.washington.edu Tue Dec 21 21:00:49 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 21 Dec 2010 18:00:49 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: I applaud you on your vision. I only have one small suggestion: I suggest you put a table of contents at the beginning of your NEP so people may skip to the part that most interests them. On Tue, Dec 21, 2010 at 4:59 PM, John Salvatier wrote: > That is an amazing christmas present. > > On Tue, Dec 21, 2010 at 4:53 PM, Mark Wiebe wrote: > >> Hello NumPy-ers, >> >> After some performance analysis, I've designed and implemented a new >> iterator designed to speed up ufuncs and allow for easier multi-dimensional >> iteration. The new code is fairly large, but works quite well already. If >> some people could read the NEP and give some feedback, that would be great! >> Here's a link: >> >> >> https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst >> >> I would also love it if someone could try building the code and play >> around with it a bit. The github branch is here: >> >> https://github.com/m-paradox/numpy/tree/new_iterator >> >> To give a taste of the iterator's functionality, below is an example from >> the NEP for how to implement a "Lambda UFunc." With just a few lines of >> code, it's possible to replicate something similar to the numexpr library >> (numexpr still gets a bigger speedup, though). In the example expression I >> chose, execution time went from 138ms to 61ms. 
>> >> Hopefully this is a good Christmas present for NumPy. :) >> >> Cheers, >> Mark >> >> Here is the definition of the ``luf`` function.:: >> >> def luf(lamdaexpr, *args, **kwargs): >> """Lambda UFunc >> >> e.g. >> c = luf(lambda i,j:i+j, a, b, order='K', >> casting='safe', buffersize=8192) >> >> c = np.empty(...) >> luf(lambda i,j:i+j, a, b, out=c, order='K', >> casting='safe', buffersize=8192) >> """ >> >> nargs = len(args) >> op = args + (kwargs.get('out',None),) >> it = np.newiter(op, ['buffered','no_inner_iteration'], >> [['readonly','nbo_aligned']]*nargs + >> [['writeonly','allocate','no_broadcast']], >> order=kwargs.get('order','K'), >> casting=kwargs.get('casting','safe'), >> buffersize=kwargs.get('buffersize',0)) >> while not it.finished: >> it[-1] = lamdaexpr(*it[:-1]) >> it.iternext() >> >> return it.operands[-1] >> >> Then, by using ``luf`` instead of straight Python expressions, we >> can gain some performance from better cache behavior.:: >> >> In [2]: a = np.random.random((50,50,50,10)) >> In [3]: b = np.random.random((50,50,1,10)) >> In [4]: c = np.random.random((50,50,50,1)) >> >> In [5]: timeit 3*a+b-(a/c) >> 1 loops, best of 3: 138 ms per loop >> >> In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) >> 10 loops, best of 3: 60.9 ms per loop >> >> In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c)) >> Out[7]: True >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at silveregg.co.jp Tue Dec 21 21:06:35 2010 From: david at silveregg.co.jp (David) Date: Wed, 22 Dec 2010 11:06:35 +0900 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: <4D115D2B.7070904@silveregg.co.jp> Hi Mark, On 12/22/2010 09:53 AM, Mark Wiebe wrote: > Hello NumPy-ers, > > After some performance analysis, I've designed and implemented a new > iterator designed to speed up ufuncs and allow for easier > multi-dimensional iteration. The new code is fairly large, but works > quite well already. If some people could read the NEP and give some > feedback, that would be great! Here's a link: > > https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst This looks pretty cool. I hope to be able to take a look at it during the christmas holidays. I cannot comment in details yet, but it seems to address several issues I encountered myself while implementing the neighborhood iterator (which I will try to update to use the new one). One question: which CPU/platform did you test it on ? cheers, David From mwwiebe at gmail.com Tue Dec 21 21:15:06 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 21 Dec 2010 18:15:06 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: That's a good suggestion - added. Unfortunately, it looks like the github rst converter doesn't make a table of contents with working links. Cheers, Mark On Tue, Dec 21, 2010 at 6:00 PM, John Salvatier wrote: > I applaud you on your vision. I only have one small suggestion: I suggest > you put a table of contents at the beginning of your NEP so people may skip > to the part that most interests them. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mwwiebe at gmail.com Tue Dec 21 21:20:32 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Tue, 21 Dec 2010 18:20:32 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <4D115D2B.7070904@silveregg.co.jp> References: <4D115D2B.7070904@silveregg.co.jp> Message-ID: On Tue, Dec 21, 2010 at 6:06 PM, David wrote: > > This looks pretty cool. I hope to be able to take a look at it during > the christmas holidays. > Thanks! > > I cannot comment in details yet, but it seems to address several issues > I encountered myself while implementing the neighborhood iterator (which > I will try to update to use the new one). > > One question: which CPU/platform did you test it on ? > The system I'm testing on is a bit old: a dual core Athlon 64 X2 4200+ on 64-bit Fedora 13, gcc 4.4.5. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Dec 22 01:05:36 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 21 Dec 2010 23:05:36 -0700 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: On Tue, Dec 21, 2010 at 5:53 PM, Mark Wiebe wrote: > Hello NumPy-ers, > > After some performance analysis, I've designed and implemented a new > iterator designed to speed up ufuncs and allow for easier multi-dimensional > iteration. The new code is fairly large, but works quite well already. If > some people could read the NEP and give some feedback, that would be great! > Here's a link: > > > https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst > > I would also love it if someone could try building the code and play around > with it a bit. The github branch is here: > > https://github.com/m-paradox/numpy/tree/new_iterator > > To give a taste of the iterator's functionality, below is an example from > the NEP for how to implement a "Lambda UFunc." With just a few lines of > code, it's possible to replicate something similar to the numexpr library > (numexpr still gets a bigger speedup, though). In the example expression I > chose, execution time went from 138ms to 61ms. > > Hopefully this is a good Christmas present for NumPy. :) > > Cheers, > Mark > > Here is the definition of the ``luf`` function.:: > > def luf(lamdaexpr, *args, **kwargs): > """Lambda UFunc > > e.g. > c = luf(lambda i,j:i+j, a, b, order='K', > casting='safe', buffersize=8192) > > c = np.empty(...) > luf(lambda i,j:i+j, a, b, out=c, order='K', > casting='safe', buffersize=8192) > """ > > nargs = len(args) > op = args + (kwargs.get('out',None),) > it = np.newiter(op, ['buffered','no_inner_iteration'], > [['readonly','nbo_aligned']]*nargs + > [['writeonly','allocate','no_broadcast']], > order=kwargs.get('order','K'), > casting=kwargs.get('casting','safe'), > buffersize=kwargs.get('buffersize',0)) > while not it.finished: > it[-1] = lamdaexpr(*it[:-1]) > it.iternext() > > return it.operands[-1] > > Then, by using ``luf`` instead of straight Python expressions, we > can gain some performance from better cache behavior.:: > > In [2]: a = np.random.random((50,50,50,10)) > In [3]: b = np.random.random((50,50,1,10)) > In [4]: c = np.random.random((50,50,50,1)) > > In [5]: timeit 3*a+b-(a/c) > 1 loops, best of 3: 138 ms per loop > > In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 60.9 ms per loop > > In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c)) > Out[7]: True > > > Wow, that's a really nice design and write up. 
Small typo: /* Only allow exactly equivalent types */ NPY_NO_CASTING=0, /* Allow casts between equivalent types of different byte orders */ NPY_EQUIV_CASTING=0, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmutel at gmail.com Wed Dec 22 02:51:46 2010 From: cmutel at gmail.com (Christopher Mutel) Date: Wed, 22 Dec 2010 08:51:46 +0100 Subject: [Numpy-discussion] take from structured array is faster than boolean indexing, but reshapes columns to 2D Message-ID: Dear all- Structured arrays are great, but I am having problems filtering them efficiently. Reading through the mailing list, it seems like boolean arrays are the recommended approach to filtering arrays for arbitrary conditions, but my testing shows that a combination of take and where can be much faster when dealing with structured arrays: import timeit setup = "from numpy import random, where, zeros; r = random.random_integers(1e3, size=1e6); q = zeros((1e6), dtype=[('foo', 'u4'), ('bar', 'u4'), ('baz', 'u4')]); q['foo'] = r" statement1 = "s = q.take(where(q['foo'] < 500))" statement2 = "s = q[q['foo'] < 500]" t = timeit.Timer(statement1, setup) t.timeit(10) t = timeit.Timer(statement2, setup) t.timeit(10) Using the boolean array is about 4 times slower when dealing with large arrays. In my case, these operations are supposed to happen on a web server with a large number of requests, so the efficiency gain is important. However, the combination of take and where reshapes the columns of structured arrays to be 2-dimensional: q['foo'].shape >> (1000000,) s = q[q['foo'] < 500] s['foo'].shape >> (499102,) s = q.take(where(q['foo'] < 500)) s['foo'].shape >> (1, 499102) Is there a way to use this seemingly more efficient approach (take & where) and not have to manually reshape the columns? This seems ungainly for larger structured arrays. Or should I file this as a bug? Perhaps there are even more efficient approaches that I haven't thought of, but are obvious to others? Thanks in advance, Yours, -Chris -- ############################ Chris Mutel ?kologisches Systemdesign - Ecological Systems Design Institut f.Umweltingenieurwissenschaften - Institute for Environmental Engineering ETH Z?rich - HIF C 42 - Schafmattstr. 6 8093 Z?rich Telefon: +41 44 633 71 45 - Fax: +41 44 633 10 61 ############################ From faltet at pytables.org Wed Dec 22 03:21:39 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 09:21:39 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: <201012220921.39669.faltet@pytables.org> A Wednesday 22 December 2010 01:53:55 Mark Wiebe escrigu?: > Hello NumPy-ers, > > After some performance analysis, I've designed and implemented a new > iterator designed to speed up ufuncs and allow for easier > multi-dimensional iteration. The new code is fairly large, but > works quite well already. If some people could read the NEP and > give some feedback, that would be great! Here's a link: > > https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator > -ufunc.rst > > I would also love it if someone could try building the code and play > around with it a bit. The github branch is here: > > https://github.com/m-paradox/numpy/tree/new_iterator > > To give a taste of the iterator's functionality, below is an example > from the NEP for how to implement a "Lambda UFunc." 
With just a few > lines of code, it's possible to replicate something similar to the > numexpr library (numexpr still gets a bigger speedup, though). In > the example expression I chose, execution time went from 138ms to > 61ms. > > Hopefully this is a good Christmas present for NumPy. :) > > Cheers, > Mark > > Here is the definition of the ``luf`` function.:: > > def luf(lamdaexpr, *args, **kwargs): > """Lambda UFunc > > e.g. > c = luf(lambda i,j:i+j, a, b, order='K', > casting='safe', buffersize=8192) > > c = np.empty(...) > luf(lambda i,j:i+j, a, b, out=c, order='K', > casting='safe', buffersize=8192) > """ > > nargs = len(args) > op = args + (kwargs.get('out',None),) > it = np.newiter(op, ['buffered','no_inner_iteration'], > [['readonly','nbo_aligned']]*nargs + > > [['writeonly','allocate','no_broadcast']], > order=kwargs.get('order','K'), > casting=kwargs.get('casting','safe'), > buffersize=kwargs.get('buffersize',0)) > while not it.finished: > it[-1] = lamdaexpr(*it[:-1]) > it.iternext() > > return it.operands[-1] > > Then, by using ``luf`` instead of straight Python expressions, we > can gain some performance from better cache behavior.:: > > In [2]: a = np.random.random((50,50,50,10)) > In [3]: b = np.random.random((50,50,1,10)) > In [4]: c = np.random.random((50,50,50,1)) > > In [5]: timeit 3*a+b-(a/c) > 1 loops, best of 3: 138 ms per loop > > In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 60.9 ms per loop > > In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, > c)) Out[7]: True Wow, really nice work! It would be great if that could make into NumPy :-) Regarding your comment on numexpr being faster, I'm not sure (your new_iterator branch does not work for me; it gives me an error like: AttributeError: 'module' object has no attribute 'newiter'), but my guess is that your approach seems actually faster: >>> a = np.random.random((50,50,50,10)) >>> b = np.random.random((50,50,1,10)) >>> c = np.random.random((50,50,50,1)) >>> timeit 3*a+b-(a/c) 10 loops, best of 3: 67.5 ms per loop >>> import numexpr as ne >>> ne.evaluate("3*a+b-(a/c) >>> timeit ne.evaluate("3*a+b-(a/c)") 10 loops, best of 3: 42.8 ms per loop i.e. numexpr is not able to achieve the 2x speedup mark that you are getting with ``luf`` (using a Core2 @ 3 GHz here). -- Francesc Alted From jsalvati at u.washington.edu Wed Dec 22 03:44:28 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Wed, 22 Dec 2010 00:44:28 -0800 Subject: [Numpy-discussion] Giving numpy the ability to multi-iterate excluding an axis In-Reply-To: References: Message-ID: This now makes sense to me, and I think it should work :D. This is all very cool. This is going to do big things for cython and numpy. Some hopefully constructive criticism: When first reading through the API description, the way oa_ndim and oa_axes work is not clear. I think your description would be clearer if you explain what oa_ndim means (I gather something like "the number of axes over which you wish to iterate"), currently it just says "These parameters let you control in detail how the axes of the operand arrays get matched together and iterated." It's also not totally clear to me how offsetting works. What are the offsets measured from? It seems like they are measured from another iterator, but I'm not sure and I don't see how it gets that information. 
John On Tue, Dec 21, 2010 at 5:12 PM, Mark Wiebe wrote: > On Mon, Dec 20, 2010 at 1:42 PM, John Salvatier > wrote: > >> A while ago, I asked a whether it was possible to multi-iterate over >> several ndarrays but exclude a certain axis( >> http://www.mail-archive.com/numpy-discussion at scipy.org/msg29204.html), >> sort of a combination of PyArray_IterAllButAxis and PyArray_MultiIterNew. My >> goal was to allow creation of relatively complex ufuncs that can allow >> reduction or directionally dependent computation and still use broadcasting >> (for example a moving averaging ufunc that can have changing averaging >> parameters). I didn't get any solutions, which I take to mean that no one >> knew how to do this. >> >> I am thinking about trying to make a numpy patch with this functionality, >> and I have some questions: 1) How difficult would this kind of task be for >> someone with non-expert C knowledge and good numpy knowledge? 2) Does anyone >> have advice on how to do this kind of thing? >> > > You may be able to do what you would like with the new iterator I've > written. In particular, it supports nesting multiple iterators by providing > either pointers or offsets, and allowing you to specify any subset of the > axes to iterate. Here's how the code to do this in a simple 3D case might > look, for making axis 1 the inner loop: > > PyArrayObject *op[2] = {a,b}; > npy_intp axes_outer[2] = {0,2}}; > npy_intp *op_axes[2]; > npy_intp axis_inner = 1; > npy_int32 flags[2] = {NPY_ITER_READONLY, NPY_ITER_READONLY}; > NpyIter *outer, *inner; > NpyIter_IterNext_Fn oiternext, iiternext; > npy_intp *ooffsets; > char **idataptrs; > > op_axes[0] = op_axes[1] = axes_outer; > outer = NpyIter_MultiNew(2, op, NPY_ITER_OFFSETS, > NPY_KEEPORDER, NPY_NO_CASTING, flags, NULL, 2, > op_axes, 0); > op_axes[0] = op_axes[1] = &axis_inner; > inner = NpyIter_MultiNew(2, op, 0, NPY_KEEPORDER, NPY_NO_CASTING, flags, > NULL, 1, op_axes, 0); > > oiternext = NpyIter_GetIterNext(outer); > iiternext = NpyIter_GetIterNext(inner); > > ooffsets = (npy_intp *)NpyIter_GetDataPtrArray(outer); > idataptrs = NpyIter_GetDataPtrArray(inner); > > do { > do { > char *a_data = idataptrs[0] + ooffsets[0], *b_data = idataptrs[0] + > ooffsets[0]; > /* Do stuff with the data */ > } while(iiternext()); > NpyIter_Reset(inner); > } while(oiternext()); > > NpyIter_Deallocate(outer); > NpyIter_Deallocate(inner); > > Extending to more dimensions, or making both the inner and outer loops have > multiple dimensions, isn't too crazy. Is this along the lines of what you > need? > > If you check out my code, note that it currently isn't exposed as NumPy API > yet, but you can try a lot of things with the Python exposure. > > Cheers, > Mark > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Dec 22 03:48:03 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 00:48:03 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012220921.39669.faltet@pytables.org> References: <201012220921.39669.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 12:21 AM, Francesc Alted wrote: > > Wow, really nice work! 
It would be great if that could make into NumPy > :-) Regarding your comment on numexpr being faster, I'm not sure (your > new_iterator branch does not work for me; it gives me an error like: > AttributeError: 'module' object has no attribute 'newiter'), What are you using to build it? So far I've just modified the setup.py scripts, I still need to add it to numscons. > but my > guess is that your approach seems actually faster: > > >>> a = np.random.random((50,50,50,10)) > >>> b = np.random.random((50,50,1,10)) > >>> c = np.random.random((50,50,50,1)) > >>> timeit 3*a+b-(a/c) > 10 loops, best of 3: 67.5 ms per loop > >>> import numexpr as ne > >>> ne.evaluate("3*a+b-(a/c) > >>> timeit ne.evaluate("3*a+b-(a/c)") > 10 loops, best of 3: 42.8 ms per loop > > i.e. numexpr is not able to achieve the 2x speedup mark that you are > getting with ``luf`` (using a Core2 @ 3 GHz here). > That's promising! I based my assertion on getting a slower speedup than numexpr does on their front page example. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Dec 22 03:54:16 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 09:54:16 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012220921.39669.faltet@pytables.org> Message-ID: <201012220954.16336.faltet@pytables.org> A Wednesday 22 December 2010 09:48:03 Mark Wiebe escrigu?: > On Wed, Dec 22, 2010 at 12:21 AM, Francesc Alted wrote: > > > > Wow, really nice work! It would be great if that could make into > > NumPy > > > > :-) Regarding your comment on numexpr being faster, I'm not sure > > :(your > > > > new_iterator branch does not work for me; it gives me an error > > like: AttributeError: 'module' object has no attribute 'newiter'), > > What are you using to build it? So far I've just modified the > setup.py scripts, I still need to add it to numscons. Well, just the typical "git clone ...; python setup.py install" dance. > > i.e. numexpr is not able to achieve the 2x speedup mark that you > > are getting with ``luf`` (using a Core2 @ 3 GHz here). > > That's promising! I based my assertion on getting a slower speedup > than numexpr does on their front page example. I see :-) Well, I'd think that numexpr is not specially efficient when handling broadcasting, so this might be the reason your approach is faster. I suppose that with operands with the same shape, things might look different. -- Francesc Alted From michel.dupront at hotmail.fr Wed Dec 22 04:55:30 2010 From: michel.dupront at hotmail.fr (Michel Dupront) Date: Wed, 22 Dec 2010 10:55:30 +0100 Subject: [Numpy-discussion] use of the PyArray_SETITEM numpy method Message-ID: Hello, Please somebody help me ! I am really confused with all these void that I see everywhere. The documentation says: PyObject * PyArray_SETITEM(PyObject * arr, void* itemptr, PyObject* obj) I created a 1D array of doubles with PyArray_SimpleNew. Now I am in a loop of index i to feed my array with values. What should I give for itemptr ? Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Dec 22 05:11:17 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 11:11:17 +0100 Subject: [Numpy-discussion] use of the PyArray_SETITEM numpy method In-Reply-To: References: Message-ID: <201012221111.17356.faltet@pytables.org> A Wednesday 22 December 2010 10:55:30 Michel Dupront escrigu?: > Hello, > > Please somebody help me ! 
> > I am really confused with all these void that I see everywhere. > > The documentation says: > PyObject * PyArray_SETITEM(PyObject * arr, void* itemptr, PyObject* > obj) > > I created a 1D array of doubles with PyArray_SimpleNew. > Now I am in a loop of index i to feed my array with values. > What should I give for itemptr ? The pointer to your data indeed. For example, if you declare your item as: double myitem then, pass it as &myitem. -- Francesc Alted From ndbecker2 at gmail.com Wed Dec 22 08:08:33 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 22 Dec 2010 08:08:33 -0500 Subject: [Numpy-discussion] savetxt to a string? Message-ID: Is the formatting of savetxt available with a string as the destination? If not, shouldn't this functionality be factored out of savetxt? From numpy-discussion at maubp.freeserve.co.uk Wed Dec 22 08:26:03 2010 From: numpy-discussion at maubp.freeserve.co.uk (Peter) Date: Wed, 22 Dec 2010 13:26:03 +0000 Subject: [Numpy-discussion] savetxt to a string? In-Reply-To: References: Message-ID: On Wed, Dec 22, 2010 at 1:08 PM, Neal Becker wrote: > > Is the formatting of savetxt available with a string as the destination? > > If not, shouldn't this functionality be factored out of savetxt? > Have you tried using a StringIO handle? Peter From ndbecker2 at gmail.com Wed Dec 22 08:53:22 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 22 Dec 2010 08:53:22 -0500 Subject: [Numpy-discussion] savetxt to a string? References: Message-ID: Peter wrote: > On Wed, Dec 22, 2010 at 1:08 PM, Neal Becker wrote: >> >> Is the formatting of savetxt available with a string as the destination? >> >> If not, shouldn't this functionality be factored out of savetxt? >> > > Have you tried using a StringIO handle? > > Peter Yup. But wouldn't it be cleaner to factor out this functionality and make saving to a file use this? From ijstokes at hkl.hms.harvard.edu Wed Dec 22 09:16:17 2010 From: ijstokes at hkl.hms.harvard.edu (Ian Stokes-Rees) Date: Wed, 22 Dec 2010 09:16:17 -0500 Subject: [Numpy-discussion] counting non-zero entries in an ndarray Message-ID: <4D120831.80805@hkl.hms.harvard.edu> What is the most efficient way to do the Matlab equivalent of nnz(M) (nnz = number-of-non-zeros function)? I've tried Google, but no luck. My assumption is that something like a != 0 will be used, but I'm not sure then how to "count" the number of "True" entries. TIA. Ian -------------- next part -------------- A non-text attachment was scrubbed... Name: ijstokes.vcf Type: text/x-vcard Size: 380 bytes Desc: not available URL: From alan.isaac at gmail.com Wed Dec 22 09:20:02 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 22 Dec 2010 09:20:02 -0500 Subject: [Numpy-discussion] counting non-zero entries in an ndarray In-Reply-To: <4D120831.80805@hkl.hms.harvard.edu> References: <4D120831.80805@hkl.hms.harvard.edu> Message-ID: <4D120912.20701@gmail.com> On 12/22/2010 9:16 AM, Ian Stokes-Rees wrote: > a != 0 > > will be used, but I'm not sure then how to "count" the number of "True" > entries. (a != 0).sum() hth, Alan Isaac From numpy-discussion at maubp.freeserve.co.uk Wed Dec 22 09:33:05 2010 From: numpy-discussion at maubp.freeserve.co.uk (Peter) Date: Wed, 22 Dec 2010 14:33:05 +0000 Subject: [Numpy-discussion] savetxt to a string? In-Reply-To: References: Message-ID: On Wed, Dec 22, 2010 at 1:53 PM, Neal Becker wrote: > > Peter wrote: > >> On Wed, Dec 22, 2010 at 1:08 PM, Neal Becker wrote: >>> >>> Is the formatting of savetxt available with a string as the destination? 
>>> >>> If not, shouldn't this functionality be factored out of savetxt? >>> >> >> Have you tried using a StringIO handle? >> >> Peter > > Yup. ?But wouldn't it be cleaner to factor out this functionality and make > saving to a file use this? I doubt it. I would think from a code point of view no - taking filenames or handles means you can write everything using handle.write(...) statements, and send the data to disk (or a StringIO handle) gradually. This scales well with large data. If you wanted a single code base to support filenames, handles, or output as a string I think you'd be forced to build a large string in memory, and then (if applicable) write it to disk (in one go). This won't scale well with large data. Peter From ndbecker2 at gmail.com Wed Dec 22 10:24:02 2010 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 22 Dec 2010 10:24:02 -0500 Subject: [Numpy-discussion] savetxt to a string? References: Message-ID: Peter wrote: > On Wed, Dec 22, 2010 at 1:53 PM, Neal Becker wrote: >> >> Peter wrote: >> >>> On Wed, Dec 22, 2010 at 1:08 PM, Neal Becker >>> wrote: >>>> >>>> Is the formatting of savetxt available with a string as the >>>> destination? >>>> >>>> If not, shouldn't this functionality be factored out of savetxt? >>>> >>> >>> Have you tried using a StringIO handle? >>> >>> Peter >> >> Yup. But wouldn't it be cleaner to factor out this functionality and >> make saving to a file use this? > > I doubt it. > > I would think from a code point of view no - taking filenames or handles > means you can write everything using handle.write(...) statements, and > send the data to disk (or a StringIO handle) gradually. This scales well > with large data. > > If you wanted a single code base to support filenames, handles, or > output as a string I think you'd be forced to build a large string in > memory, and then (if applicable) write it to disk (in one go). This > won't scale well with large data. > > Peter Good point. From mwwiebe at gmail.com Wed Dec 22 11:21:24 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 08:21:24 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: On Tue, Dec 21, 2010 at 10:05 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > Wow, that's a really nice design and write up. Small typo: > > /* Only allow exactly equivalent types */ > NPY_NO_CASTING=0, > /* Allow casts between equivalent types of different byte orders */ > > NPY_EQUIV_CASTING=0, > > Good catch, turns out the test that should have caught it was broken too. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Dec 22 11:25:13 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 08:25:13 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012220954.16336.faltet@pytables.org> References: <201012220921.39669.faltet@pytables.org> <201012220954.16336.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 12:54 AM, Francesc Alted wrote: > A Wednesday 22 December 2010 09:48:03 Mark Wiebe escrigu?: > > On Wed, Dec 22, 2010 at 12:21 AM, Francesc Alted > wrote: > > > > > > new_iterator branch does not work for me; it gives me an error > > > like: AttributeError: 'module' object has no attribute 'newiter'), > > > > What are you using to build it? So far I've just modified the > > setup.py scripts, I still need to add it to numscons. > > Well, just the typical "git clone ...; python setup.py install" dance. 
> Can you print out your np.__version__, and try running the tests? If newiter didn't build for some reason, its tests should be throwing a bunch of exceptions. > > > i.e. numexpr is not able to achieve the 2x speedup mark that you > > > are getting with ``luf`` (using a Core2 @ 3 GHz here). > > > > That's promising! I based my assertion on getting a slower speedup > > than numexpr does on their front page example. > > I see :-) Well, I'd think that numexpr is not specially efficient when > handling broadcasting, so this might be the reason your approach is > faster. I suppose that with operands with the same shape, things might > look different. > I haven't looked at the numexpr code, but I think the ufuncs will need SSE versions to make up part of the remaining difference. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Dec 22 11:54:11 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 08:54:11 -0800 Subject: [Numpy-discussion] Giving numpy the ability to multi-iterate excluding an axis In-Reply-To: References: Message-ID: On Wed, Dec 22, 2010 at 12:44 AM, John Salvatier wrote: > This now makes sense to me, and I think it should work :D. This is all very > cool. This is going to do big things for cython and numpy. > > Some hopefully constructive criticism: > > When first reading through the API description, the way oa_ndim and oa_axes > work is not clear. I think your description would be clearer if you explain > what oa_ndim means (I gather something like "the number of axes over which > you wish to iterate"), currently it just says "These parameters let you > control in detail how the axes of the operand arrays get matched together > and iterated." > Thanks, I've tried to clean up the description a bit. > It's also not totally clear to me how offsetting works. What are the > offsets measured from? It seems like they are measured from another > iterator, but I'm not sure and I don't see how it gets that information. > I added an example to the NEP to try to make it more clear, here's what I wrote: To help understand how the offsets work, here is a simple nested iteration example. Let's say our array a has shape (2, 3, 4), and strides (48, 16, 4). The data pointer for element (i, j, k) is at address PyArray_BYTES(a) + 48*i + 16*j + 4*k. Now consider two iterators with custom op_axes (0,1) and (2,). The first one will produce addresses like PyArray_BYTES(a) + 48*i + 16*j, and the second one will produce addresses likePyArray_BYTES(a) + 4*k. Simply adding together these values would produce invalid pointers. Instead, we can make the outer iterator produce offsets, in which case it will produce the values 48*i + 16*j, and its sum with the other iterator's pointer gives the correct data address. It's important to note that this will not work if any of the iterators share an axis. The iterator cannot check this, so your code must handle it. Additionally, taking a look at the ndarray strides documentation might help: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.strides.html Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From faltet at pytables.org Wed Dec 22 12:07:13 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 18:07:13 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012220954.16336.faltet@pytables.org> Message-ID: <201012221807.13516.faltet@pytables.org> A Wednesday 22 December 2010 17:25:13 Mark Wiebe escrigu?: > Can you print out your np.__version__, and try running the tests? If > newiter didn't build for some reason, its tests should be throwing a > bunch of exceptions. I'm a bit swamped now. Let's see if I can do that later on. > > I see :-) Well, I'd think that numexpr is not specially efficient > > when handling broadcasting, so this might be the reason your > > approach is faster. I suppose that with operands with the same > > shape, things might look different. > > I haven't looked at the numexpr code, but I think the ufuncs will > need SSE versions to make up part of the remaining difference. Uh, I doubt that SSE can do a lot for accelerating operations like 3*a+b-(a/c), as this computation is mainly bounded by memory (although threading does certainly help). Numexpr can use SSE only via Intel's VML, which is very good for accelerating the computation of transcendental functions (sin, cos, sqrt, exp, log...). -- Francesc Alted From mwwiebe at gmail.com Wed Dec 22 12:21:28 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 09:21:28 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012221807.13516.faltet@pytables.org> References: <201012220954.16336.faltet@pytables.org> <201012221807.13516.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 9:07 AM, Francesc Alted wrote: > A Wednesday 22 December 2010 17:25:13 Mark Wiebe escrigu?: > > Can you print out your np.__version__, and try running the tests? If > > newiter didn't build for some reason, its tests should be throwing a > > bunch of exceptions. > > I'm a bit swamped now. Let's see if I can do that later on. > Ok. > > I see :-) Well, I'd think that numexpr is not specially efficient > > > when handling broadcasting, so this might be the reason your > > > approach is faster. I suppose that with operands with the same > > > shape, things might look different. > > > > I haven't looked at the numexpr code, but I think the ufuncs will > > need SSE versions to make up part of the remaining difference. > > Uh, I doubt that SSE can do a lot for accelerating operations like > 3*a+b-(a/c), as this computation is mainly bounded by memory (although > threading does certainly help). Numexpr can use SSE only via Intel's > VML, which is very good for accelerating the computation of > transcendental functions (sin, cos, sqrt, exp, log...). > The reason I think it might help is that with 'luf' is that it's calculating the expression on smaller sized arrays, which possibly just got buffered. If the memory allocator for the temporaries keeps giving back the same addresses, all this will be in one of the caches very close to the CPU. Unless this cache is still too slow to feed the SSE instructions, there should be a speed benefit. The ufunc inner loops could also use the SSE prefetch instructions based on the stride to give some strong hints about where the next memory bytes to use will be. -Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tkg at lanl.gov Wed Dec 22 12:43:55 2010 From: tkg at lanl.gov (Thomas K Gamble) Date: Wed, 22 Dec 2010 10:43:55 -0700 Subject: [Numpy-discussion] counting non-zero entries in an ndarray In-Reply-To: <4D120831.80805@hkl.hms.harvard.edu> References: <4D120831.80805@hkl.hms.harvard.edu> Message-ID: <201012221043.55668.tkg@lanl.gov> On Wednesday, December 22, 2010 07:16:17 am Ian Stokes-Rees wrote: > What is the most efficient way to do the Matlab equivalent of nnz(M) > (nnz = number-of-non-zeros function)? > > I've tried Google, but no luck. > > My assumption is that something like > > a != 0 > > will be used, but I'm not sure then how to "count" the number of "True" > entries. > > TIA. > > Ian one possibility: len(where(a != 0)[0]) -- Thomas K. Gamble Research Technologist, System/Network Administrator Chemical Diagnostics and Engineering (C-CDE) Los Alamos National Laboratory MS-E543,p:505-665-4323 f:505-665-4267 There cannot be a crisis next week. My schedule is already full. Henry Kissinger From faltet at pytables.org Wed Dec 22 13:41:10 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 19:41:10 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012221807.13516.faltet@pytables.org> Message-ID: <201012221941.11032.faltet@pytables.org> A Wednesday 22 December 2010 18:21:28 Mark Wiebe escrigu?: > On Wed, Dec 22, 2010 at 9:07 AM, Francesc Alted wrote: > > A Wednesday 22 December 2010 17:25:13 Mark Wiebe escrigu?: > > > Can you print out your np.__version__, and try running the tests? > > > If newiter didn't build for some reason, its tests should be > > > throwing a bunch of exceptions. $ PYTHONPATH=numpy python -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 2.0.0.dev-147f817 NumPy is installed in /tmp/numpy/numpy Python version 2.6.1 (r261:67515, Feb 3 2009, 17:34:37) [GCC 4.3.2 [gcc-4_3-branch revision 141291]] nose version 0.11.0 [clip] Warning: divide by zero encountered in log Warning: divide by zero encountered in log [clip] Ran 3094 tests in 16.771s OK (KNOWNFAIL=4, SKIP=1) IPython seems to work well too: >>> np.__version__ '2.0.0.dev-147f817' >>> timeit 3*a+b-(a/c) 10 loops, best of 3: 67.5 ms per loop However, when trying you luf function: >>> cpaste [the luf code here] -- >>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) [clip] AttributeError: 'module' object has no attribute 'newiter' > The reason I think it might help is that with 'luf' is that it's > calculating the expression on smaller sized arrays, which possibly > just got buffered. If the memory allocator for the temporaries keeps > giving back the same addresses, all this will be in one of the > caches very close to the CPU. Unless this cache is still too slow to > feed the SSE instructions, there should be a speed benefit. The > ufunc inner loops could also use the SSE prefetch instructions based > on the stride to give some strong hints about where the next memory > bytes to use will be. Ah, okay. However, Numexpr is not meant to accelerate calculations with small operands. I suppose that this is where your new iterator makes more sense: accelerating operations where some of the operands are small (i.e. fit in cache) and have to be broadcasted to match the dimensionality of the others. 
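Just for reference, the no-broadcasting comparison I have in mind (all
operands with the same shape, so only the chunking and the temporaries
matter) would simply be:

>>> a = np.random.random((50,50,50,10))
>>> b = np.random.random((50,50,50,10))
>>> c = np.random.random((50,50,50,10))
>>> timeit 3*a+b-(a/c)
>>> timeit ne.evaluate("3*a+b-(a/c)")
>>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c)

No figures from me yet, though; I'll run this once I manage to build the
new_iterator branch.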
-- Francesc Alted From mwwiebe at gmail.com Wed Dec 22 13:52:45 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 10:52:45 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012221941.11032.faltet@pytables.org> References: <201012221807.13516.faltet@pytables.org> <201012221941.11032.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted wrote: > NumPy version 2.0.0.dev-147f817 > There's your problem, it looks like the PYTHONPATH isn't seeing your new build for some reason. That build is off of this commit in the NumPy master branch: https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953a51d533cc27ec > The reason I think it might help is that with 'luf' is that it's > > calculating the expression on smaller sized arrays, which possibly > > just got buffered. If the memory allocator for the temporaries keeps > > giving back the same addresses, all this will be in one of the > > caches very close to the CPU. Unless this cache is still too slow to > > feed the SSE instructions, there should be a speed benefit. The > > ufunc inner loops could also use the SSE prefetch instructions based > > on the stride to give some strong hints about where the next memory > > bytes to use will be. > > Ah, okay. However, Numexpr is not meant to accelerate calculations with > small operands. I suppose that this is where your new iterator makes > more sense: accelerating operations where some of the operands are small > (i.e. fit in cache) and have to be broadcasted to match the > dimensionality of the others. > It's not about small operands, but small chunks of the operands at a time, with temporary arrays for intermediate calculations. It's the small chunks + temporaries which must fit in cache to get the benefit, not the whole array. The numexpr front page explains this fairly well in the section "Why It Works": http://code.google.com/p/numexpr/#Why_It_Works -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Dec 22 13:58:41 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 19:58:41 +0100 Subject: [Numpy-discussion] ANN: carray 0.3 released Message-ID: <201012221958.41105.faltet@pytables.org> ===================== Announcing carray 0.3 ===================== What's new ========== A lot of stuff. The most outstanding feature in this version is the introduction of a `ctable` object. A `ctable` is similar to a structured array in NumPy, but instead of storing the data row-wise, it uses a column-wise arrangement. This allows for much better performance for very wide tables, which is one of the scenarios where a `ctable` makes more sense. Of course, as `ctable` is based on `carray` objects, it inherits all its niceties (like on-the-flight compression and fast iterators). Also, the `carray` object itself has received many improvements, like new constructors (arange(), fromiter(), zeros(), ones(), fill()), iterators (where(), wheretrue()) or resize mehtods (resize(), trim()). Most of these also work with the new `ctable`. Besides, Numexpr is supported now (but it is optional) in order to carry out stunningly fast queries on `ctable` objects. For example, doing a query on a table with one million rows and one thousand columns can be up to 2x faster than using a plain structured array, and up to 20x faster than using SQLite (using the ":memory:" backend and indexing). See 'bench/ctable-query.py' for details. 
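If you want a quick feel for the new constructors before reading the
manual, a minimal session would look something like this (just a sketch;
please check the manual for the exact signatures and defaults):

  >>> import carray as ca
  >>> a = ca.arange(1e7)    # one of the new constructors
  >>> a.sum()               # reductions work directly on the compressed container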
Finally, binaries for Windows (both 32-bit and 64-bit) are provided. For more detailed info, see the release notes in: https://github.com/FrancescAlted/carray/wiki/Release-0.3 What it is ========== carray is a container for numerical data that can be compressed in-memory. The compression process is carried out internally by Blosc, a high-performance compressor that is optimized for binary data. Having data compressed in-memory can reduce the stress of the memory subsystem. The net result is that carray operations may be faster than using a traditional ndarray object from NumPy. carray also supports fully 64-bit addressing (both in UNIX and Windows). Below, a carray with 1 trillion of rows has been created (7.3 TB total), filled with zeros, modified some positions, and finally, summed-up:: >>> %time b = ca.zeros(1e12) CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s Wall time: 55.23 s >>> %time b[[1, 1e9, 1e10, 1e11, 1e12-1]] = (1,2,3,4,5) CPU times: user 2.08 s, sys: 0.00 s, total: 2.08 s Wall time: 2.09 s >>> b carray((1000000000000,), float64) nbytes: 7450.58 GB; cbytes: 2.27 GB; ratio: 3275.35 cparams := cparams(clevel=5, shuffle=True) [0.0, 1.0, 0.0, ..., 0.0, 0.0, 5.0] >>> %time b.sum() CPU times: user 10.08 s, sys: 0.00 s, total: 10.08 s Wall time: 10.15 s 15.0 ['%time' is a magic function provided by the IPyhton shell] Please note that the example above is provided for demonstration purposes only. Do not try to run this at home unless you have more than 3 GB of RAM available, or you will get into trouble. Resources ========= Visit the main carray site repository at: http://github.com/FrancescAlted/carray You can download a source package from: http://carray.pytables.org/download Manual: http://carray.pytables.org/manual Home of Blosc compressor: http://blosc.pytables.org User's mail list: carray at googlegroups.com http://groups.google.com/group/carray Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- Enjoy! -- Francesc Alted From faltet at pytables.org Wed Dec 22 14:16:52 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 20:16:52 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012221941.11032.faltet@pytables.org> Message-ID: <201012222016.52648.faltet@pytables.org> A Wednesday 22 December 2010 19:52:45 Mark Wiebe escrigu?: > On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted wrote: > > NumPy version 2.0.0.dev-147f817 > > There's your problem, it looks like the PYTHONPATH isn't seeing your > new build for some reason. That build is off of this commit in the > NumPy master branch: > > https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953a51d > 533cc27ec Uh, I think I'm a bit lost here. I've cloned this repo: $ git clone git://github.com/m-paradox/numpy.git Is that wrong? > > Ah, okay. However, Numexpr is not meant to accelerate calculations > > with small operands. I suppose that this is where your new > > iterator makes more sense: accelerating operations where some of > > the operands are small (i.e. fit in cache) and have to be > > broadcasted to match the dimensionality of the others. > > It's not about small operands, but small chunks of the operands at a > time, with temporary arrays for intermediate calculations. It's the > small chunks + temporaries which must fit in cache to get the > benefit, not the whole array. 
But you need to transport those small chunks from main memory to cache before you can start doing the computation for this piece, right? This is what I'm saying that the bottleneck for evaluating arbitrary expressions (like "3*a+b-(a/c)", i.e. not including transcendental functions, nor broadcasting) is memory bandwidth (and more in particular RAM bandwidth). > The numexpr front page explains this > fairly well in the section "Why It Works": > > http://code.google.com/p/numexpr/#Why_It_Works I know. I wrote that part (based on the notes by David Cooke, the original author ;-) -- Francesc Alted From mwwiebe at gmail.com Wed Dec 22 14:42:54 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 11:42:54 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012222016.52648.faltet@pytables.org> References: <201012221941.11032.faltet@pytables.org> <201012222016.52648.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 11:16 AM, Francesc Alted wrote: > A Wednesday 22 December 2010 19:52:45 Mark Wiebe escrigu?: > > On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted > wrote: > > > NumPy version 2.0.0.dev-147f817 > > > > There's your problem, it looks like the PYTHONPATH isn't seeing your > > new build for some reason. That build is off of this commit in the > > NumPy master branch: > > > > https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953a51d > > 533cc27ec > > Uh, I think I'm a bit lost here. I've cloned this repo: > > $ git clone git://github.com/m-paradox/numpy.git > > Is that wrong? > That's right, it was my mistake to assume that the page for a branch on github would give you that branch. You need the 'new_iterator' branch, so after that clone, you should do this: $ git checkout origin/new_iterator > > Ah, okay. However, Numexpr is not meant to accelerate calculations > > > with small operands. I suppose that this is where your new > > > iterator makes more sense: accelerating operations where some of > > > the operands are small (i.e. fit in cache) and have to be > > > broadcasted to match the dimensionality of the others. > > > > It's not about small operands, but small chunks of the operands at a > > time, with temporary arrays for intermediate calculations. It's the > > small chunks + temporaries which must fit in cache to get the > > benefit, not the whole array. > > But you need to transport those small chunks from main memory to cache > before you can start doing the computation for this piece, right? This > is what I'm saying that the bottleneck for evaluating arbitrary > expressions (like "3*a+b-(a/c)", i.e. not including transcendental > functions, nor broadcasting) is memory bandwidth (and more in particular > RAM bandwidth). > In the example expression, I believe the evaluation would go something like this. Assuming the memory allocator keeps giving back the same locations to 'luf', all temporary variables will already be in cache after the first chunk. temp1 = 3 * a # a is read from main memory temp2 = temp1 + b # b is read from main memory temp3 = a / c # a is already in cache, c is read from main memory result = temp2 + temp3 # result is written to data from main memory So there are 4 reads and writes to chunks from outside of the cache, but 12 total reads and writes to chunks, so speeding up the parts already in cache would appear to be beneficial. The benefit will get better with more complicated expressions. 
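In plain NumPy code, the chunked scheme would look something like the sketch below (the block size and the same-shape assumption are mine; this is not numexpr's actual implementation):

import numpy as np

def blocked_expr(a, b, c, blocksize=8192):
    # evaluate 3*a+b-(a/c) one block at a time, reusing small temporaries
    # (assumes a, b and c all have the same shape, i.e. no broadcasting)
    out = np.empty(a.shape)
    af, bf, cf, of = a.reshape(-1), b.reshape(-1), c.reshape(-1), out.reshape(-1)
    t1, t2, t3 = np.empty(blocksize), np.empty(blocksize), np.empty(blocksize)
    for i in range(0, af.size, blocksize):
        s = slice(i, min(i + blocksize, af.size))
        n = s.stop - s.start
        np.multiply(af[s], 3, t1[:n])       # a: read from main memory
        np.add(t1[:n], bf[s], t2[:n])       # b: read from main memory
        np.divide(af[s], cf[s], t3[:n])     # a: already in cache; c: read
        np.subtract(t2[:n], t3[:n], of[s])  # result: written back to main memory
    return out

Since t1/t2/t3 are reused for every block, after the first pass they live in one of the caches close to the CPU.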
I think as long as the operation is slower than a memcpy, the RAM bandwidth isn't the main bottleneck to be concerned with, but instead produces an upper bound on performance. I'm not sure how to precisely measure that overhead, though. > > > The numexpr front page explains this > > fairly well in the section "Why It Works": > > > > http://code.google.com/p/numexpr/#Why_It_Works > > I know. I wrote that part (based on the notes by David Cooke, the > original author ;-) > Cool :) -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Wed Dec 22 15:05:09 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 21:05:09 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012222016.52648.faltet@pytables.org> Message-ID: <201012222105.09278.faltet@pytables.org> A Wednesday 22 December 2010 20:42:54 Mark Wiebe escrigu?: > On Wed, Dec 22, 2010 at 11:16 AM, Francesc Alted wrote: > > A Wednesday 22 December 2010 19:52:45 Mark Wiebe escrigu?: > > > On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted > > > > wrote: > > > > NumPy version 2.0.0.dev-147f817 > > > > > > There's your problem, it looks like the PYTHONPATH isn't seeing > > > your new build for some reason. That build is off of this > > > commit in the NumPy master branch: > > > > > > https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953 > > > a51d 533cc27ec > > > > Uh, I think I'm a bit lost here. I've cloned this repo: > > > > $ git clone git://github.com/m-paradox/numpy.git > > > > Is that wrong? > > That's right, it was my mistake to assume that the page for a branch > on github would give you that branch. You need the 'new_iterator' > branch, so after that clone, you should do this: > > $ git checkout origin/new_iterator Ah, things go well now: >>> timeit 3*a+b-(a/c) 10 loops, best of 3: 67.7 ms per loop >>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) 10 loops, best of 3: 27.8 ms per loop >>> timeit ne.evaluate("3*a+b-(a/c)") 10 loops, best of 3: 42.8 ms per loop So, yup, I'm seeing the good speedup here too :-) > > But you need to transport those small chunks from main memory to > > cache before you can start doing the computation for this piece, > > right? This is what I'm saying that the bottleneck for evaluating > > arbitrary expressions (like "3*a+b-(a/c)", i.e. not including > > transcendental functions, nor broadcasting) is memory bandwidth > > (and more in particular RAM bandwidth). > > In the example expression, I believe the evaluation would go > something like this. Assuming the memory allocator keeps giving > back the same locations to 'luf', all temporary variables will > already be in cache after the first chunk. > > temp1 = 3 * a # a is read from main memory > temp2 = temp1 + b # b is read from main memory > temp3 = a / c # a is already in cache, c is read from > main memory > result = temp2 + temp3 # result is written to data from main memory > > So there are 4 reads and writes to chunks from outside of the cache, > but 12 total reads and writes to chunks, so speeding up the parts > already in cache would appear to be beneficial. The benefit will > get better with more complicated expressions. I think as long as > the operation is slower than a memcpy, the RAM bandwidth isn't the > main bottleneck to be concerned with, but instead produces an upper > bound on performance. I'm not sure how to precisely measure that > overhead, though. 
Well, see the timings for the non-broadcasting case: >>> a = np.random.random((50,50,50,10)) >>> b = np.random.random((50,50,50,10)) >>> c = np.random.random((50,50,50,10)) >>> timeit 3*a+b-(a/c) 10 loops, best of 3: 31.1 ms per loop >>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) 10 loops, best of 3: 24.5 ms per loop >>> timeit ne.evaluate("3*a+b-(a/c)") 100 loops, best of 3: 10.4 ms per loop However, the above comparison is not fair, as numexpr uses all your cores by default (2 for the case above). If we force using only one core: >>> ne.set_num_threads(1) >>> timeit ne.evaluate("3*a+b-(a/c)") 100 loops, best of 3: 16 ms per loop which is still faster than luf. In this case numexpr was not using SSE, but in case luf does so, this does not imply better speed. -- Francesc Alted From jrocher at enthought.com Wed Dec 22 15:29:54 2010 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 22 Dec 2010 14:29:54 -0600 Subject: [Numpy-discussion] counting non-zero entries in an ndarray In-Reply-To: <201012221043.55668.tkg@lanl.gov> References: <4D120831.80805@hkl.hms.harvard.edu> <201012221043.55668.tkg@lanl.gov> Message-ID: To answer the part about the most efficient way to do that, In [1]: a = array([0,1,4,76,3,0,4,67,9,5,3,9,0,5,23,3,0,5,3,3,0,5,0]) In [8]: %timeit len(where(a!=0)[0]) 100000 loops, best of 3: 6.54 us per loop In [9]: %timeit (a!=0).sum() 100000 loops, best of 3: 9.81 us per loop Seems like the where option is faster. Now I create a large array In [13]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a]) In [14]: %timeit len(where(a!=0)[0]) 100000 loops, best of 3: 12.3 us per loop In [15]: %timeit (a!=0).sum() 100000 loops, best of 3: 11 us per loop Now the fastest way is using the sum. The where function is not vectorized because it doesn't know in advance the size of the final array. In the case of a big array, there will be a lot of copy in the memory, as it grows. And the difference increases fast... In [20]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a]) In [21]: %timeit len(where(a!=0)[0]) 10000 loops, best of 3: 79.1 us per loop In [22]: %timeit (a!=0).sum() 10000 loops, best of 3: 24.5 us per loop Regards, Jonathan On Wed, Dec 22, 2010 at 11:43 AM, Thomas K Gamble wrote: > On Wednesday, December 22, 2010 07:16:17 am Ian Stokes-Rees wrote: > > What is the most efficient way to do the Matlab equivalent of nnz(M) > > (nnz = number-of-non-zeros function)? > > > > I've tried Google, but no luck. > > > > My assumption is that something like > > > > a != 0 > > > > will be used, but I'm not sure then how to "count" the number of "True" > > entries. > > > > TIA. > > > > Ian > > one possibility: > > len(where(a != 0)[0]) > > -- > Thomas K. Gamble > Research Technologist, System/Network Administrator > Chemical Diagnostics and Engineering (C-CDE) > Los Alamos National Laboratory > MS-E543,p:505-665-4323 f:505-665-4267 > > There cannot be a crisis next week. My schedule is already full. > Henry Kissinger > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Jonathan Rocher, Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... 
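For really big arrays where even the boolean temporary is a concern, counting in chunks keeps the extra memory bounded (a quick sketch; the chunk size is arbitrary and a contiguous array is assumed):

import numpy as np

def nnz_chunked(a, chunk=1 << 20):
    # count non-zeros one block at a time, so the boolean temporary
    # never grows beyond 'chunk' elements
    flat = a.reshape(-1)
    total = 0
    for i in range(0, flat.size, chunk):
        total += int((flat[i:i + chunk] != 0).sum())
    return total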
URL: From mwwiebe at gmail.com Wed Dec 22 15:42:43 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 12:42:43 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: <201012222105.09278.faltet@pytables.org> References: <201012222016.52648.faltet@pytables.org> <201012222105.09278.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 12:05 PM, Francesc Alted wrote: > > > Ah, things go well now: > > >>> timeit 3*a+b-(a/c) > 10 loops, best of 3: 67.7 ms per loop > >>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 27.8 ms per loop > >>> timeit ne.evaluate("3*a+b-(a/c)") > 10 loops, best of 3: 42.8 ms per loop > > So, yup, I'm seeing the good speedup here too :-) > Great! > > Well, see the timings for the non-broadcasting case: > > >>> a = np.random.random((50,50,50,10)) > >>> b = np.random.random((50,50,50,10)) > >>> c = np.random.random((50,50,50,10)) > > >>> timeit 3*a+b-(a/c) > 10 loops, best of 3: 31.1 ms per loop > >>> timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 24.5 ms per loop > >>> timeit ne.evaluate("3*a+b-(a/c)") > 100 loops, best of 3: 10.4 ms per loop > > However, the above comparison is not fair, as numexpr uses all your > cores by default (2 for the case above). If we force using only one > core: > > >>> ne.set_num_threads(1) > >>> timeit ne.evaluate("3*a+b-(a/c)") > 100 loops, best of 3: 16 ms per loop > > which is still faster than luf. In this case numexpr was not using SSE, > but in case luf does so, this does not imply better speed. Ok, I get pretty close to the same ratios (and my machine feels a bit slow...): In [6]: timeit 3*a+b-(a/c) 10 loops, best of 3: 101 ms per loop In [7]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) 10 loops, best of 3: 53.4 ms per loop In [8]: timeit ne.evaluate("3*a+b-(a/c)") 10 loops, best of 3: 27.8 ms per loop In [9]: ne.set_num_threads(1) In [10]: timeit ne.evaluate("3*a+b-(a/c)") 10 loops, best of 3: 33.6 ms per loop I think the closest to a "memcpy" we can do here would be just adding, which shows the expression evaluation can be estimated to have 20% overhead. While that's small compared the speedup over straight NumPy, I think it's still worth considering. In [11]: timeit ne.evaluate("a+b+c") 10 loops, best of 3: 27.9 ms per loop Even just switching from add to divide gives more than 10% overhead. With SSE2 these divides could be done two at a time for doubles or four at a time for floats to cut that down. In [12]: timeit ne.evaluate("a/b/c") 10 loops, best of 3: 31.7 ms per loop This all shows that the 'luf' Python interpreter overhead is still pretty big, the new iterator can't defeat numexpr by itself. I think numexpr could get a nice boost from using the new iterator internally though - if I go back to the original motivation, different memory orderings, 'luf' is 10x faster than single-threaded numexpr. In [15]: a = np.random.random((50,50,50,10)).T In [16]: b = np.random.random((50,50,50,10)).T In [17]: c = np.random.random((50,50,50,10)).T In [18]: timeit ne.evaluate("3*a+b-(a/c)") 1 loops, best of 3: 556 ms per loop In [19]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) 10 loops, best of 3: 52.5 ms per loop Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... 
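To see why the memory ordering matters so much, it is enough to look at the strides (the numbers below are for float64 and these particular shapes):

import numpy as np

a = np.random.random((50, 50, 50, 10))
at = a.T                  # same data, strides reversed

# C order: the last axis is contiguous (8 bytes per step).
# After .T, stepping along the last axis jumps 200000 bytes, so a naive
# elementwise loop in C index order thrashes the cache.
print a.strides           # (200000, 4000, 80, 8)
print at.strides          # (8, 80, 4000, 200000)

An iterator that is free to pick the iteration order (order='K', as in 'luf') can walk the transposed operands in their natural memory order, which is where most of that 10x comes from.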
URL: From mwwiebe at gmail.com Wed Dec 22 16:02:03 2010 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 22 Dec 2010 13:02:03 -0800 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: <201012222016.52648.faltet@pytables.org> <201012222105.09278.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 12:42 PM, Mark Wiebe wrote: > I think numexpr could get a nice boost from using the new iterator > internally though > There's actually a trivial way to do this with very minimal changes to numexpr - the 'itview' mechanism. Create the new iterator, call NpyIter_GetIterView(it,i) (or it.itviews in Python) to get compatibly reordered views of the inputs, then continue with the existing code. -Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From ijstokes at hkl.hms.harvard.edu Wed Dec 22 16:10:05 2010 From: ijstokes at hkl.hms.harvard.edu (Ian Stokes-Rees) Date: Wed, 22 Dec 2010 16:10:05 -0500 Subject: [Numpy-discussion] counting non-zero entries in an ndarray In-Reply-To: <4D120831.80805@hkl.hms.harvard.edu> References: <4D120831.80805@hkl.hms.harvard.edu> Message-ID: <4D12692D.8060508@hkl.hms.harvard.edu> On 12/22/10 9:16 AM, Ian Stokes-Rees wrote: > What is the most efficient way to do the Matlab equivalent of nnz(M) > (nnz = number-of-non-zeros function)? Thanks to all the various responses. I should have mentioned that I'm using scipy.sparse, and lil_matrix objects have a method "getnnz()" which gives me the number I want. Ian From faltet at pytables.org Wed Dec 22 16:20:40 2010 From: faltet at pytables.org (Francesc Alted) Date: Wed, 22 Dec 2010 22:20:40 +0100 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: <201012222220.40355.faltet@pytables.org> A Wednesday 22 December 2010 22:02:03 Mark Wiebe escrigu?: > On Wed, Dec 22, 2010 at 12:42 PM, Mark Wiebe > wrote: > > > I think numexpr could get a nice boost from using the new iterator > > internally though > > There's actually a trivial way to do this with very minimal changes > to numexpr - the 'itview' mechanism. Create the new iterator, call > NpyIter_GetIterView(it,i) (or it.itviews in Python) to get > compatibly reordered views of the inputs, then continue with the > existing code. That's interesting. I'll think about this (patches are very welcome too!). Thanks! -- Francesc Alted From ijstokes at hkl.hms.harvard.edu Wed Dec 22 16:32:23 2010 From: ijstokes at hkl.hms.harvard.edu (Ian Stokes-Rees) Date: Wed, 22 Dec 2010 16:32:23 -0500 Subject: [Numpy-discussion] How to control column for pretty print line wrap of ndarrays Message-ID: <4D126E67.3020104@hkl.hms.harvard.edu> Like most people these days, I have multiple 24" monitors. I don't need "print" of ndarrays to wrap after 72 columns. Is there some way to change this? TIA Ian CURRENT: [ NaN NaN NaN NaN NaN 5.1882094 1.19646584]] DESIRED: [ NaN NaN NaN NaN NaN 5.1882094 1.19646584]] (Although your mail client mail mangle it...) From robert.kern at gmail.com Wed Dec 22 16:52:08 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 22 Dec 2010 16:52:08 -0500 Subject: [Numpy-discussion] How to control column for pretty print line wrap of ndarrays In-Reply-To: <4D126E67.3020104@hkl.hms.harvard.edu> References: <4D126E67.3020104@hkl.hms.harvard.edu> Message-ID: On Wed, Dec 22, 2010 at 16:32, Ian Stokes-Rees wrote: > Like most people these days, I have multiple 24" monitors. ?I don't need > "print" of ndarrays to wrap after 72 columns. ?Is there some way to > change this? 
np.set_printoptions(linewidth=100) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From roger at quantumbioinc.com Thu Dec 23 08:13:08 2010 From: roger at quantumbioinc.com (Roger Martin) Date: Thu, 23 Dec 2010 08:13:08 -0500 Subject: [Numpy-discussion] Installing on CentOS 5 claims invalid Python installation Message-ID: <4D134AE4.6090001@quantumbioinc.com> Hi, NumPy looks like the way to get computation done in Python. Now I'm going through the learning curve of installing the module into different linux OS's and Python versions. An extra need is to install google code's h5py http://code.google.com/p/h5py/ which depends on numpy. In trying a number of Python versions the 2.x's are yielding the message " invalid Python installation" --------------- raise DistutilsPlatformError(my_msg) distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /home/roger/Python-2.6.6/dist/lib/python2.6/config/Makefile (No such file or directory) --------------- From reading on the web it appears a Python-2.x.x-devel version is needed. Yet no search combination comes back with where to get such a thing(note: I need user installs/builds for security reasons). Where are Python versions compatible with numpy? Building Python-2.6.6 Python-2.7.1(fails to build) Python3.2beta2 numpy1.5.1 invalid Python installation NA success h5py1.3.1 needs numpy NA fails To start I need just one successful combination but will need more cases depending on users of a new integration project. Interestingly your numpy 1.5.1's setup is in good shape to build with Python3.2 yet I need to allow older versions for people's systems not ready to upgrade that far. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Thu Dec 23 14:58:25 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 23 Dec 2010 13:58:25 -0600 Subject: [Numpy-discussion] Installing on CentOS 5 claims invalid Python installation In-Reply-To: <4D134AE4.6090001@quantumbioinc.com> References: <4D134AE4.6090001@quantumbioinc.com> Message-ID: <4D13A9E1.6030602@gmail.com> On 12/23/2010 07:13 AM, Roger Martin wrote: > Hi, > > NumPy looks like the way to get computation done in Python. Now I'm > going through the learning curve of installing the module into > different linux OS's and Python versions. An extra need is to install > google code's h5py http://code.google.com/p/h5py/ which depends on numpy. > > In trying a number of Python versions the 2.x's are yielding the > message " invalid Python installation" > --------------- > raise DistutilsPlatformError(my_msg) > distutils.errors.DistutilsPlatformError: invalid Python installation: > unable to open > /home/roger/Python-2.6.6/dist/lib/python2.6/config/Makefile (No such > file or directory) > --------------- > > From reading on the web it appears a Python-2.x.x-devel version is > needed. Yet no search combination comes back with where to get such a > thing(note: I need user installs/builds for security reasons). Where > are Python versions compatible with numpy? > > Building > Python-2.6.6 > Python-2.7.1(fails to build) > Python3.2beta2 > numpy1.5.1 > invalid Python installation NA > success > h5py1.3.1 > needs numpy > NA > fails > > > To start I need just one successful combination but will need more > cases depending on users of a new integration project. 
> > Interestingly your numpy 1.5.1's setup is in good shape to build with > Python3.2 yet I need to allow older versions for people's systems not > ready to upgrade that far. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I thought that Centos 5 ships Python 2.4 so how did you get Python 2.6, 2.7 and 3.2? If these are from some repository then the developmental libraries should also be there - if these are not there then either find another repository or build Python yourself. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Dec 23 15:03:40 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 23 Dec 2010 12:03:40 -0800 Subject: [Numpy-discussion] Installing on CentOS 5 claims invalid Python installation In-Reply-To: <4D134AE4.6090001@quantumbioinc.com> References: <4D134AE4.6090001@quantumbioinc.com> Message-ID: <4D13AB1C.8070006@noaa.gov> On 12/23/10 5:13 AM, Roger Martin wrote: > NumPy looks like the way to get computation done in Python. yup -- welcome! > Now I'm > going through the learning curve of installing the module into different > linux OS's and Python versions. hmm -- usually it's pretty straightforward on Linux (except maybe getting an optimized LAPACK, which you may or may not need). > An extra need is to install google > code's h5py http://code.google.com/p/h5py/ which depends on numpy. I'll leave that for the next step. > In trying a number of Python versions the 2.x's are yielding the message > " invalid Python installation" > --------------- > raise DistutilsPlatformError(my_msg) > distutils.errors.DistutilsPlatformError: invalid Python installation: > unable to open > /home/roger/Python-2.6.6/dist/lib/python2.6/config/Makefile (No such > file or directory) > --------------- > > From reading on the web it appears a Python-2.x.x-devel version is > needed. yup -- many of the Linux package systems split the stuff you need to run Python code from what you need to compile stuff against it -- common with other libs, packages as well. > Yet no search combination comes back with where to get such a > thing each distro has it's own naming convention -- look for anything like "python-devel", "python-dev", etc. > (note: I need user installs/builds for security reasons). Ahh -- a different story -- AFAIK (and I'm NOT an expert) the distro's packages will install python into system directories -- if you really need each user to have their own install, you may need to install from source. That should be pretty straight forward, too. Get the source tarball from python.org, and follow the build instructions. You'll need to specify a user install in that process somehow. The latest numpy should work with any recent python -- If you are free to choose, use 2.7.1 -- it's the latest production version of the 2.* series. 3.* is still a bit bleeding edge. YOU can grab the tarball here: http://www.python.org/download/releases/2.7.1/ Once you've got a python working, a simple "python setup.py install" should do for numpy. HTH, -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From paul.z.thunemann at boeing.com Thu Dec 23 17:24:02 2010 From: paul.z.thunemann at boeing.com (Thunemann, Paul Z) Date: Thu, 23 Dec 2010 14:24:02 -0800 Subject: [Numpy-discussion] numpy for jython Message-ID: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> I'd be very interested in hearing more about a numpy port to Java and Jython. If anyone has more info about how to get involved please let me know. -Zack From jsalvati at u.washington.edu Thu Dec 23 17:27:59 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Thu, 23 Dec 2010 14:27:59 -0800 Subject: [Numpy-discussion] numpy for jython In-Reply-To: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> References: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> Message-ID: I'm curious whether this kind of thing is expected to be relatively easy after the numpy refactor. On Thu, Dec 23, 2010 at 2:24 PM, Thunemann, Paul Z < paul.z.thunemann at boeing.com> wrote: > I'd be very interested in hearing more about a numpy port to Java and > Jython. If anyone has more info about how to get involved please let me > know. > > -Zack > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From numpy-discussion at maubp.freeserve.co.uk Thu Dec 23 17:50:58 2010 From: numpy-discussion at maubp.freeserve.co.uk (Peter) Date: Thu, 23 Dec 2010 22:50:58 +0000 Subject: [Numpy-discussion] numpy for jython In-Reply-To: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> References: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> Message-ID: On Thu, Dec 23, 2010 at 10:24 PM, Thunemann, Paul Z wrote: > > I'd be very interested in hearing more about a numpy port to Java and > Jython. ?If anyone has more info about how to get involved please let > me know. > > -Zack I'd find even a minimal version useful for Jython (just using pure Python) as long as it provided the basic data structures and some core functionality. I'm thinking here of other Python libraries that use NumPy, but don't necessarily need it for speed reasons alone. Peter From paul.z.thunemann at boeing.com Thu Dec 23 18:38:23 2010 From: paul.z.thunemann at boeing.com (Thunemann, Paul Z) Date: Thu, 23 Dec 2010 15:38:23 -0800 Subject: [Numpy-discussion] numpy for jython Message-ID: <11965FB92DB9D44490740AF7806ED4EB379941591F@XCH-NW-12V.nw.nos.boeing.com> If the refactor separates numpy from the Cpython objects and results in a clean C or C++ api, then porting to Java is still a chore but it's doable. I've used JNI and SWIG extensively to port math libraries and could get involved but I don't know who else might be working on this (if anyone). 
-Zack From david at silveregg.co.jp Thu Dec 23 21:39:42 2010 From: david at silveregg.co.jp (David) Date: Fri, 24 Dec 2010 11:39:42 +0900 Subject: [Numpy-discussion] numpy for jython In-Reply-To: References: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> Message-ID: <4D1407EE.20502@silveregg.co.jp> On 12/24/2010 07:27 AM, John Salvatier wrote: > I'm curious whether this kind of thing is expected to be relatively easy > after the numpy refactor. It would help, but it won't make it easy. I asked this exact question some time ago to Enthought developers, and java would be more complicated because there is no equivalent to C++/CLI in java world. Don't take my word for it, though, because I know very little about ways to wrap native code on the jvm (or CLR for that matter). I think more than one person is interested, though (I for one am more interested in the JVM than the CLR), cheers, David From bioinformed at gmail.com Thu Dec 23 23:34:07 2010 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Thu, 23 Dec 2010 23:34:07 -0500 Subject: [Numpy-discussion] ANN: carray 0.3 released In-Reply-To: <201012221958.41105.faltet@pytables.org> References: <201012221958.41105.faltet@pytables.org> Message-ID: On Wed, Dec 22, 2010 at 1:58 PM, Francesc Alted wrote: > >>> %time b = ca.zeros(1e12) > CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s > Wall time: 55.23 s > I know this is somewhat missing the point of your demonstration, but 55 seconds to create an empty 3 GB data structure to represent a multi-TB dense array doesn't seem all that fast to me. Compression can do a lot of things, but isn't this a case where a true sparse data structure would be the right tool for the job? I'm more interested in seeing what a carray can do with census data, web logs, or somethat vaguely real world where direct binary representations are used by default and assumed to be reasonable optimal (i.e., anything sensibly stored in sqlite tables). -Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Fri Dec 24 00:24:21 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 23 Dec 2010 23:24:21 -0600 Subject: [Numpy-discussion] NEP for faster ufuncs In-Reply-To: References: Message-ID: <4E081B0A-3E30-48D6-9463-6BFA434FF9ED@enthought.com> This is very cool! I would like to see this get into NumPy 2.0. Thanks for all the great work! -Travis On Dec 21, 2010, at 6:53 PM, Mark Wiebe wrote: > Hello NumPy-ers, > > After some performance analysis, I've designed and implemented a new iterator designed to speed up ufuncs and allow for easier multi-dimensional iteration. The new code is fairly large, but works quite well already. If some people could read the NEP and give some feedback, that would be great! Here's a link: > > https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst > > I would also love it if someone could try building the code and play around with it a bit. The github branch is here: > > https://github.com/m-paradox/numpy/tree/new_iterator > > To give a taste of the iterator's functionality, below is an example from the NEP for how to implement a "Lambda UFunc." With just a few lines of code, it's possible to replicate something similar to the numexpr library (numexpr still gets a bigger speedup, though). In the example expression I chose, execution time went from 138ms to 61ms. > > Hopefully this is a good Christmas present for NumPy. 
:) > > Cheers, > Mark > > Here is the definition of the ``luf`` function.:: > > def luf(lamdaexpr, *args, **kwargs): > """Lambda UFunc > > e.g. > c = luf(lambda i,j:i+j, a, b, order='K', > casting='safe', buffersize=8192) > > c = np.empty(...) > luf(lambda i,j:i+j, a, b, out=c, order='K', > casting='safe', buffersize=8192) > """ > > nargs = len(args) > op = args + (kwargs.get('out',None),) > it = np.newiter(op, ['buffered','no_inner_iteration'], > [['readonly','nbo_aligned']]*nargs + > [['writeonly','allocate','no_broadcast']], > order=kwargs.get('order','K'), > casting=kwargs.get('casting','safe'), > buffersize=kwargs.get('buffersize',0)) > while not it.finished: > it[-1] = lamdaexpr(*it[:-1]) > it.iternext() > > return it.operands[-1] > > Then, by using ``luf`` instead of straight Python expressions, we > can gain some performance from better cache behavior.:: > > In [2]: a = np.random.random((50,50,50,10)) > In [3]: b = np.random.random((50,50,1,10)) > In [4]: c = np.random.random((50,50,50,1)) > > In [5]: timeit 3*a+b-(a/c) > 1 loops, best of 3: 138 ms per loop > > In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c) > 10 loops, best of 3: 60.9 ms per loop > > In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c)) > Out[7]: True > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Fri Dec 24 00:29:48 2010 From: oliphant at enthought.com (Travis Oliphant) Date: Thu, 23 Dec 2010 23:29:48 -0600 Subject: [Numpy-discussion] numpy for jython In-Reply-To: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> References: <11965FB92DB9D44490740AF7806ED4EB37993A872D@XCH-NW-12V.nw.nos.boeing.com> Message-ID: <902DE6C3-621B-46EC-9482-33FCF9EF0E2F@enthought.com> On Dec 23, 2010, at 4:24 PM, Thunemann, Paul Z wrote: > I'd be very interested in hearing more about a numpy port to Java and Jython. If anyone has more info about how to get involved please let me know. The numpy-refactor should help with this. You basically need to write a Java extension which uses the new libndarray (presumably using JNI). I am not an expert on Java, but the design of NumPy / SciPy for .NET was to allow a NumPy for Java (and Jython) as well. There is probably about 1-3 man months of work however to build the interface. Currently, there are two interfaces: a CPython and a .NET interface (written in C#), a Java interface written in Java (using JNI for the native code interaction) would make NumPy available to Jython. The SciPy port to .NET is being accomplished by porting SciPy to use Cython and Fwrap. This will allow SciPy for Jython as well, once a Java / JNI backend to Cython is completed. 
-Travis From faltet at pytables.org Fri Dec 24 10:09:32 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 24 Dec 2010 16:09:32 +0100 Subject: [Numpy-discussion] ANN: carray 0.3 released In-Reply-To: References: <201012221958.41105.faltet@pytables.org> Message-ID: 2010/12/24, Kevin Jacobs : > On Wed, Dec 22, 2010 at 1:58 PM, Francesc Alted wrote: > >> >>> %time b = ca.zeros(1e12) >> CPU times: user 54.76 s, sys: 0.03 s, total: 54.79 s >> Wall time: 55.23 s >> > > I know this is somewhat missing the point of your demonstration, but 55 > seconds to create an empty 3 GB data structure to represent a multi-TB dense > array doesn't seem all that fast to me. Yes, this was not the point of the demo, but just showing 64-bit addressing (a feature that I implemented recently and was eager to show). But, agreed, I'm guilty to show times, so your observation is pertinent. But mind that I'm not creating an *empty* structure, but a *zeroed* structure; that's a bit different (that does not mean that the process cannot be speed-up, but we all surely agree that there is little sense in optimizing this scenario ;-). > Compression can do a lot of things, > but isn't this a case where a true sparse data structure would be the right > tool for the job? I'm more interested in seeing what a carray can do with > census data, web logs, or somethat vaguely real world where direct binary > representations are used by default and assumed to be reasonable optimal > (i.e., anything sensibly stored in sqlite tables). Well, I'm just creating the tool; it is up to the users to find real-world applications. I'm pretty sure that some of you will find some good ones. Cheers! -- Francesc Alted From enzomich at gmail.com Sun Dec 26 03:51:57 2010 From: enzomich at gmail.com (Enzo Michelangeli) Date: Sun, 26 Dec 2010 16:51:57 +0800 Subject: [Numpy-discussion] Optimization suggestion sought Message-ID: For a pivoted algorithm, I have to perform an operation that in fully vectorized form can be expressed as: pivot = tableau[locat,:]/tableau[locat,cand] tableau -= tableau[:,cand:cand+1]*pivot tableau[locat,:] = pivot tableau is a rather large bidimensional array, and I'd like to avoid the allocation of a temporary array of the same size holding the result of the right-hand side expression in the second line of code (the outer product of tableau[:,cand] and pivot). On the other hand, if I replace that line with: for i in xrange(tableau.shape[0]): tableau[i] -= tableau[i,cand]*pivot ...I incur some CPU overhead for the "for" loop -- and this part of code is the botteneck of the whole algorithm. Is there any smarter (i.e., more time-efficient) way of achieving my goal? TIA -- Enzo From josef.pktd at gmail.com Sun Dec 26 09:34:10 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 26 Dec 2010 09:34:10 -0500 Subject: [Numpy-discussion] Optimization suggestion sought In-Reply-To: References: Message-ID: On Sun, Dec 26, 2010 at 3:51 AM, Enzo Michelangeli wrote: > For a pivoted algorithm, I have to perform an operation that in fully > vectorized form can be expressed as: > > ? ?pivot = tableau[locat,:]/tableau[locat,cand] > ? ?tableau -= tableau[:,cand:cand+1]*pivot > ? ?tableau[locat,:] = pivot > > tableau is a rather large bidimensional array, and I'd like to avoid the > allocation of a temporary array of the same size holding the result of the > right-hand side expression in the second line of code (the outer product of > tableau[:,cand] and pivot). On the other hand, if I replace that line with: > > ? 
?for i in xrange(tableau.shape[0]): > ? ? ? ?tableau[i] -= tableau[i,cand]*pivot > > ...I incur some CPU overhead for the "for" loop -- and this part of code is > the botteneck of the whole algorithm. Is there any smarter (i.e., more > time-efficient) way of achieving my goal? just a generic answer: Working in batches can be a good compromise in some cases. I instead of working in a loop with one row at a time, loop and handle, for example, 1000 rows at a time. Josef > > TIA -- > > Enzo > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From Bruce_Sherwood at ncsu.edu Sun Dec 26 17:26:36 2010 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Sun, 26 Dec 2010 15:26:36 -0700 Subject: [Numpy-discussion] How to call import_array() properly? Message-ID: In my Python code I have import cvisual cvisual.init_numpy() and in my C++ code I have void init_numpy() { import_array(); } import_array() in numpy/core/include/numpy/_multiarray_api.h is a macro: #if PY_VERSION_HEX >= 0x03000000 #define NUMPY_IMPORT_ARRAY_RETVAL NULL #else #define NUMPY_IMPORT_ARRAY_RETVAL #endif #define import_array() {if (_import_array() < 0) {PyErr_Print(); PyErr_SetString(PyExc_ImportError, "numpy.core.multiarray failed to import"); return NUMPY_IMPORT_ARRAY_RETVAL; } } Note that for Python 3 there is a change so that the macro returns NULL, whereas for Python 2 it returned nothing. On Windows and Mac, this works fine for Python 2, and it works fine for a with Python 3, but with either Microsoft Visual Studio 2008 or 2010 this fails for Python 3 with the message "'void' function returning a value", presumably due to the return NULL for Python 3, something that doesn't bother the Mac. So my dumb question is, how should I call import_array() from my routine init_numpy() to get around this problem? I have found a workaround, consisting of defining init_numpy to be of type int on Windows with Python 3, but this seems like an odd kludge, since it isn't needed on the Mac, and I think it's also not needed on Linux. Bruce Sherwood From Bruce_Sherwood at ncsu.edu Sun Dec 26 17:42:33 2010 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Sun, 26 Dec 2010 15:42:33 -0700 Subject: [Numpy-discussion] How to call import_array() properly? In-Reply-To: References: Message-ID: I made a mistake: the Mac behaves the same way when I repeat the experiment. I guess I simply have to define init_numpy() to be of type int for Python 3 on both machines. Nevertheless, if you see a more elegant coding, I'd be interested. Thanks. Bruce Sherwood On Sun, Dec 26, 2010 at 3:26 PM, Bruce Sherwood wrote: > In my Python code I have > > import cvisual > cvisual.init_numpy() > > and in my C++ code I have > > void > init_numpy() > { > ? ?import_array(); > } > > import_array() in numpy/core/include/numpy/_multiarray_api.h is a macro: > > #if PY_VERSION_HEX >= 0x03000000 > #define NUMPY_IMPORT_ARRAY_RETVAL NULL > #else > #define NUMPY_IMPORT_ARRAY_RETVAL > #endif > > #define import_array() {if (_import_array() < 0) {PyErr_Print(); > PyErr_SetString(PyExc_ImportError, "numpy.core.multiarray failed to > import"); return NUMPY_IMPORT_ARRAY_RETVAL; } } > > Note that for Python 3 there is a change so that the macro returns > NULL, whereas for Python 2 it returned nothing. 
> > On Windows and Mac, this works fine for Python 2, and it works fine > for a with Python 3, but with either Microsoft Visual Studio 2008 or > 2010 this fails for Python 3 with the message "'void' function > returning a value", presumably due to the return NULL for Python 3, > something that doesn't bother the Mac. > > So my dumb question is, how should I call import_array() from my > routine init_numpy() to get around this problem? I have found a > workaround, consisting of defining init_numpy to be of type int on > Windows with Python 3, but this seems like an odd kludge, since it > isn't needed on the Mac, and I think it's also not needed on Linux. > > Bruce Sherwood > From jpscipy at gmail.com Mon Dec 27 01:51:23 2010 From: jpscipy at gmail.com (Justin Peel) Date: Sun, 26 Dec 2010 23:51:23 -0700 Subject: [Numpy-discussion] Optimization suggestion sought In-Reply-To: References: Message-ID: On Sun, Dec 26, 2010 at 7:34 AM, wrote: > On Sun, Dec 26, 2010 at 3:51 AM, Enzo Michelangeli wrote: >> For a pivoted algorithm, I have to perform an operation that in fully >> vectorized form can be expressed as: >> >> ? ?pivot = tableau[locat,:]/tableau[locat,cand] >> ? ?tableau -= tableau[:,cand:cand+1]*pivot >> ? ?tableau[locat,:] = pivot >> >> tableau is a rather large bidimensional array, and I'd like to avoid the >> allocation of a temporary array of the same size holding the result of the >> right-hand side expression in the second line of code (the outer product of >> tableau[:,cand] and pivot). On the other hand, if I replace that line with: >> >> ? ?for i in xrange(tableau.shape[0]): >> ? ? ? ?tableau[i] -= tableau[i,cand]*pivot >> >> ...I incur some CPU overhead for the "for" loop -- and this part of code is >> the botteneck of the whole algorithm. Is there any smarter (i.e., more >> time-efficient) way of achieving my goal? > > just a generic answer: > > Working in batches can be a good compromise in some cases. I instead > of working in a loop with one row at a time, loop and handle, for > example, 1000 rows at a time. > > Josef > >> >> TIA -- >> >> Enzo >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > If this is really such a big bottleneck, then I would look into using Cython for this part. With just a few cdef's, I bet that that you could speed up the for loop tremendously. Depending on the details of your algorithm, you might want to make a Cython function that takes tableau, cand and pivot as inputs and just does the for loop part. From enzomich at gmail.com Mon Dec 27 09:20:40 2010 From: enzomich at gmail.com (Enzo Michelangeli) Date: Mon, 27 Dec 2010 22:20:40 +0800 Subject: [Numpy-discussion] Optimization suggestion sought References: Message-ID: Many thanks to Josef and Justin for their replies. Josef's hint sounds like a good way of reducing peak memory allocation especially when the row size is large, which makes the "for" overhead for each iteration comparatively lower. However, time is still spent in back-and-forth conversions between numpy arrays and the native BLAS data structures, and copying data from the temporary array holding the intermediate results and tableau. 
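If I read the suggestion correctly, the batched version would look roughly like this (same names as in my snippet above; the batch size of 1000 is arbitrary):

import numpy as np

def pivot_update_batched(tableau, locat, cand, batch=1000):
    pivot = tableau[locat, :] / tableau[locat, cand]
    for start in xrange(0, tableau.shape[0], batch):
        block = tableau[start:start + batch]       # a view, so -= updates tableau
        # the temporary outer product is only (batch, ncols) instead of
        # the full size of tableau
        block -= block[:, cand:cand + 1] * pivot
    tableau[locat, :] = pivot
    return pivot

That bounds the size of the temporary while keeping the Python-level overhead down to one iteration per batch instead of one per row.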
Regarding Justin's suggestion, before trying Cython (which, according to http://wiki.cython.org/tutorials/numpy , seems to require a bit of work to handle numpy arrays properly) I was looking at weave.blitz . Unfortunately, this doesn't seems to like my code. Running code containing: expr = "tableau = tableau - tableau[:,cand:cand+1]*pivot" weave.blitz(expr) ...elicits: /---------------------------------------------------------- distutils.errors.CompileError: error: Command "g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.o" failed with exit status 1 \---------------------------------------------------------- >From the error message issued by g++, it would appear that blitz can't figure out the type of cand: /---------------------------------------------------------- C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py>g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.o In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:37, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h: In member function 'bool blitz::Range::isAscendingContiguous() const': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h:120: warning: suggest parentheses around '&&' within '||' c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp: In function 'PyObject* compiled_func(PyObject*, PyObject*)': c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: error: ambiguous overload for 'operator+' in 'cand + 1' c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: candidates are: operator+(PyObject*, int) c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(int, int) c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(float, int) c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(double, int) c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_6cc64ceb623b02ae7511c559ef81fb661.cpp:728: note: operator+(char*, int) \---------------------------------------------------------- Using a temporary variable to keep cand out of blitz: tmp = 
tableau[:,cand:cand+1] expr = "tableau = tableau - tmp*pivot" weave.blitz(expr) ...produces an even uglier error message, which makes me think that blitz doesn't understand that the product between a (n,1)-shaped array and an (n,)-shaped one is meant to be an outer product: /---------------------------------------------------------- C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py>LCPSolve.py Found executable C:\Program Files\pythonxy\mingw\bin\g++.exe In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:37, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h: In member function 'bool blitz::Range::isAscendingContiguous() const': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/range.h:120: warning: suggest parentheses around '&&' within '||' In file included from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:2504, from C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array.h:26, from c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:11: C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h: In member function 'typename P_op::T_numtype blitz::_bz_ArrayExprBinaryOp::operator()(const blitz::TinyVector&) [with int N_rank = 2, P_expr1 = blitz::FastArrayIterator, P_expr2 = blitz::FastArrayIterator, P_op = blitz::Multiply]': C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:144: instantiated from 'typename P_expr::T_numtype blitz::_bz_ArrayExpr::operator()(const blitz::TinyVector&) [with int N_rank = 2, P_expr = blitz::_bz_ArrayExprBinaryOp, blitz::FastArrayIterator, blitz::Multiply >]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:486: instantiated from 'typename P_op::T_numtype blitz::_bz_ArrayExprBinaryOp::operator()(const blitz::TinyVector&) [with int N_rank = 2, P_expr1 = blitz::FastArrayIterator, P_expr2 = blitz::_bz_ArrayExpr, blitz::FastArrayIterator, blitz::Multiply > >, P_op = blitz::Subtract]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:144: instantiated from 'typename P_expr::T_numtype blitz::_bz_ArrayExpr::operator()(const blitz::TinyVector&) [with int N_rank = 2, P_expr = blitz::_bz_ArrayExprBinaryOp, blitz::_bz_ArrayExpr, blitz::FastArrayIterator, blitz::Multiply > >, blitz::Subtract >]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/eval.cc:670: instantiated from 'blitz::Array& blitz::Array::evaluateWithIndexTraversal1(T_expr, T_update) [with T_expr = blitz::_bz_ArrayExpr, blitz::_bz_ArrayExpr, blitz::FastArrayIterator, blitz::Multiply > >, blitz::Subtract > >, T_update = blitz::_bz_update, P_numtype = double, int N_rank = 2]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/eval.cc:171: instantiated from 'blitz::Array& blitz::Array::evaluate(T_expr, T_update) [with T_expr = blitz::_bz_ArrayExpr, blitz::_bz_ArrayExpr, blitz::FastArrayIterator, blitz::Multiply > >, blitz::Subtract > >, T_update = blitz::_bz_update, P_numtype = double, int N_rank = 2]' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/ops.cc:45: instantiated from 'blitz::Array& blitz::Array::operator=(const blitz::ETBase&) [with T_expr = blitz::_bz_ArrayExpr, blitz::_bz_ArrayExpr, blitz::FastArrayIterator, blitz::Multiply > >, blitz::Subtract > >, P_numtype = double, int N_rank = 2]' 
c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp:732: instantiated from here C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/expr.h:486: error: no match for call to '(blitz::FastArrayIterator) (const blitz::TinyVector&)' C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:74: note: candidates are: P_numtype blitz::FastArrayIterator::operator()(const blitz::TinyVector&) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:202: note: P_numtype& blitz::FastArrayIterator::operator()(int) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:208: note: P_numtype& blitz::FastArrayIterator::operator()(int, int) [with P_numtype = double, int N_rank = 1] C:\Python26\lib\site-packages\scipy\weave\blitz/blitz/array/fastiter.h:214: note: P_numtype& blitz::FastArrayIterator::operator()(int, int, int) [with P_numtype = double, int N_rank = 1] Traceback (most recent call last): File "C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py\LCPSolve.py", line 132, in w, z, retcode = LCPSolve(M,q) File "C:\Documents and Settings\ADMIN\My Documents\Projects\Valerio\py\LCPSolve.py", line 94, in LCPSolve weave.blitz(expr) File "C:\Python26\lib\site-packages\scipy\weave\blitz_tools.py", line 65, in blitz **kw) File "C:\Python26\lib\site-packages\scipy\weave\inline_tools.py", line 482, in compile_function verbose=verbose, **kw) File "C:\Python26\lib\site-packages\scipy\weave\ext_tools.py", line 367, in compile verbose = verbose, **kw) File "C:\Python26\lib\site-packages\scipy\weave\build_tools.py", line 273, in build_extension setup(name = module_name, ext_modules = [ext],verbose=verb) File "C:\Python26\lib\site-packages\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "C:\Python26\lib\distutils\core.py", line 169, in setup raise SystemExit, "error: " + str(msg) distutils.errors.CompileError: error: Command "g++ -mno-cygwin -O2 -Wall -IC:\Python26\lib\site-packages\scipy\weave -IC:\Python26\lib\site-packages\scipy\weave\scxx -IC:\Python26\lib\site-packages\scipy\weave\blitz -IC:\Python26\lib\site-packages\numpy\core\include -IC:\Python26\include -IC:\Python26\PC -c c:\docume~1\admin\locals~1\temp\ADMIN\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.cpp -o c:\docume~1\admin\locals~1\temp\ADMIN\python26_intermediate\compiler_4b433fbf94fa137eaa5ee69a06987eda\Release\docume~1\admin\locals~1\temp\admin\python26_compiled\sc_c9159c98b571d3181d8848337bf1e50a1.o" failed with exit status 1 \---------------------------------------------------------- So, for the time being, no speed breakthrough... 
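One more thing I may still try is to preallocate the temporary once, outside the pivoting loop, and reuse it through the ufuncs' output arguments, so that at least the repeated allocation disappears (just a sketch, I have not measured whether it helps):

import numpy as np

def pivot_update_prealloc(tableau, locat, cand, scratch):
    # scratch is allocated once by the caller: scratch = np.empty_like(tableau)
    pivot = tableau[locat, :] / tableau[locat, cand]
    np.multiply(tableau[:, cand:cand + 1], pivot, scratch)  # outer product into scratch
    np.subtract(tableau, scratch, tableau)                   # in-place update of tableau
    tableau[locat, :] = pivot
    return pivot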
Enzo ----- Original Message ----- From: "Justin Peel" To: "Discussion of Numerical Python" Sent: Monday, December 27, 2010 2:51 PM Subject: Re: [Numpy-discussion] Optimization suggestion sought On Sun, Dec 26, 2010 at 7:34 AM, wrote: > On Sun, Dec 26, 2010 at 3:51 AM, Enzo Michelangeli > wrote: >> For a pivoted algorithm, I have to perform an operation that in fully >> vectorized form can be expressed as: >> >> pivot = tableau[locat,:]/tableau[locat,cand] >> tableau -= tableau[:,cand:cand+1]*pivot >> tableau[locat,:] = pivot >> >> tableau is a rather large bidimensional array, and I'd like to avoid the >> allocation of a temporary array of the same size holding the result of >> the >> right-hand side expression in the second line of code (the outer product >> of >> tableau[:,cand] and pivot). On the other hand, if I replace that line >> with: >> >> for i in xrange(tableau.shape[0]): >> tableau[i] -= tableau[i,cand]*pivot >> >> ...I incur some CPU overhead for the "for" loop -- and this part of code >> is >> the botteneck of the whole algorithm. Is there any smarter (i.e., more >> time-efficient) way of achieving my goal? > > just a generic answer: > > Working in batches can be a good compromise in some cases. I instead > of working in a loop with one row at a time, loop and handle, for > example, 1000 rows at a time. > > Josef > >> >> TIA -- >> >> Enzo >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > If this is really such a big bottleneck, then I would look into using Cython for this part. With just a few cdef's, I bet that that you could speed up the for loop tremendously. Depending on the details of your algorithm, you might want to make a Cython function that takes tableau, cand and pivot as inputs and just does the for loop part. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Mon Dec 27 10:20:14 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 27 Dec 2010 10:20:14 -0500 Subject: [Numpy-discussion] How to call import_array() properly? In-Reply-To: References: Message-ID: On Sun, Dec 26, 2010 at 17:26, Bruce Sherwood wrote: > In my Python code I have > > import cvisual > cvisual.init_numpy() > > and in my C++ code I have > > void > init_numpy() > { > ? ?import_array(); > } The import_array() call goes into the initialization function for your module, e.g. initcvisual(). Do not put it into a separate function for the user of your module to call. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From Bruce_Sherwood at ncsu.edu Mon Dec 27 13:09:57 2010 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Mon, 27 Dec 2010 11:09:57 -0700 Subject: [Numpy-discussion] How to call import_array() properly? In-Reply-To: References: Message-ID: Thanks for the good suggestion. 
I now see that it was purely historical that import_array was driven (indirectly through init_numpy) from the pure Python component of the module rather than in the import of the C++ component, and I've changed that. However, I'm still curious as to whether there's a more intelligent or elegant way to drive import_array than the following code: #if PY_MAJOR_VERSION >= 3 int init_numpy() { import_array(); } #else void init_numpy() { import_array(); } #endif Bruce Sherwood On Mon, Dec 27, 2010 at 8:20 AM, Robert Kern wrote: > On Sun, Dec 26, 2010 at 17:26, Bruce Sherwood wrote: >> In my Python code I have >> >> import cvisual >> cvisual.init_numpy() >> >> and in my C++ code I have >> >> void >> init_numpy() >> { >> ? ?import_array(); >> } > > The import_array() call goes into the initialization function for your > module, e.g. initcvisual(). Do not put it into a separate function for > the user of your module to call. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From moura.mario at gmail.com Mon Dec 27 13:36:38 2010 From: moura.mario at gmail.com (Mario Moura) Date: Mon, 27 Dec 2010 16:36:38 -0200 Subject: [Numpy-discussion] How construct custom slice Message-ID: Hi Folks a = np.zeros((4,3,5,55,5),dtype='|S8') myLen = 4 # here I use myLen = len(something) li = [3,2,4] # li from a list.append(something) sl = slice(0,myLen) tmpIndex = tuple(li) + sl + 4 # <== Here my problem a[tmpIndex] # So What I want is: fillMe = np.array(['foo','bar','hello','world']) # But I cant contruct by hand like this a[3,2,4,:4,4] = fillMe a Again. I need construct custom slice from here tmpIndex = tuple(li) + sl + 4 a[tmpIndex] Who can help me? Best Regards Mario Moura From kwgoodman at gmail.com Mon Dec 27 13:48:48 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 27 Dec 2010 10:48:48 -0800 Subject: [Numpy-discussion] How construct custom slice In-Reply-To: References: Message-ID: On Mon, Dec 27, 2010 at 10:36 AM, Mario Moura wrote: > Hi Folks > > a = np.zeros((4,3,5,55,5),dtype='|S8') > myLen = 4 # here I use myLen = len(something) > li = [3,2,4] # li from a list.append(something) > sl = slice(0,myLen) > tmpIndex = tuple(li) + sl + 4 ?# <== Here my problem > a[tmpIndex] > > # So What I want is: > fillMe = np.array(['foo','bar','hello','world']) > # But I cant contruct by hand like this > a[3,2,4,:4,4] = fillMe > a > > Again. I need construct custom slice from here > tmpIndex = tuple(li) + sl + 4 > a[tmpIndex] First let's do it by hand: >> a = np.zeros((4,3,5,55,5),dtype='|S8') >> fillMe = np.array(['foo','bar','hello','world']) >> a[3,2,4,:4,4] = fillMe Now let's try using an index: >> b = np.zeros((4,3,5,55,5),dtype='|S8') >> myLen = 4 >> li = [3,2,4] >> sl = slice(0,myLen) Make index: >> idx = range(a.ndim) >> idx[:3] = li >> idx[3] = sl >> idx[4] = 4 >> idx = tuple(idx) Compare results: >> b[idx] = fillMe >> (a == b).all() True From moura.mario at gmail.com Mon Dec 27 13:58:38 2010 From: moura.mario at gmail.com (Mario Moura) Date: Mon, 27 Dec 2010 16:58:38 -0200 Subject: [Numpy-discussion] How construct custom slice In-Reply-To: References: Message-ID: Hi Mr. Goodman Thanks a lot. 
Works Fine Reagards Mario Moura 2010/12/27 Keith Goodman : > On Mon, Dec 27, 2010 at 10:36 AM, Mario Moura wrote: >> Hi Folks >> >> a = np.zeros((4,3,5,55,5),dtype='|S8') >> myLen = 4 # here I use myLen = len(something) >> li = [3,2,4] # li from a list.append(something) >> sl = slice(0,myLen) >> tmpIndex = tuple(li) + sl + 4 ?# <== Here my problem >> a[tmpIndex] >> >> # So What I want is: >> fillMe = np.array(['foo','bar','hello','world']) >> # But I cant contruct by hand like this >> a[3,2,4,:4,4] = fillMe >> a >> >> Again. I need construct custom slice from here >> tmpIndex = tuple(li) + sl + 4 >> a[tmpIndex] > > First let's do it by hand: > >>> a = np.zeros((4,3,5,55,5),dtype='|S8') >>> fillMe = np.array(['foo','bar','hello','world']) >>> a[3,2,4,:4,4] = fillMe > > Now let's try using an index: > >>> b = np.zeros((4,3,5,55,5),dtype='|S8') >>> myLen = 4 >>> li = [3,2,4] >>> sl = slice(0,myLen) > > Make index: > >>> idx = range(a.ndim) >>> idx[:3] = li >>> idx[3] = sl >>> idx[4] = 4 >>> idx = tuple(idx) > > Compare results: > >>> b[idx] = fillMe >>> (a == b).all() > ? True > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From korn at freisingnet.de Mon Dec 27 12:58:24 2010 From: korn at freisingnet.de (Johannes Korn) Date: Mon, 27 Dec 2010 18:58:24 +0100 Subject: [Numpy-discussion] Strange problem with h5py and numpy Message-ID: Hi, I have a strange problem with h5py or with numpy. I try to read a bunch of hdf files in a loop. The problem is that I get an error at the second file because the file handle is of type It seems the file is opened and instantaneous closed again. Meanwhile I found the root of the evil: First I read the contents of a dataset to the numpy array tmp. In the next step I paste this array into a bigger one "alb_c1_tmp". The shape of tmp is (651, 1701). Everything works well if I omit the pasting. alb_c1_tmp = zeros([3712,3712]) c1_eu = File(filename_ch1_eu,mode='r') print c1_eu tmp = c1_eu['AL-SP-BH'][:] #Source of the evil alb_c1_tmp[ 3012:3663, 462:2163 ] = tmp c1_eu.close() The code as above crashes always when the second file is read. However originally I had the boundaries of alb_c1_tmp set dynamically from attributes of the hdf files (they are identical for all files), this code crashed after a random number of files. Here?s the line that caused the randomly delayed crashes: alb_c1_tmp[ 1855 + c1_eu.attrs['LOFF'] - c1_eu.attrs['NL'] : 1855 + c1_eu.attrs['LOFF'] , 1855 + c1_eu.attrs['COFF'] - c1_eu.attrs['NC'] : 1855 + c1_eu.attrs['COFF'] ] = tmp Is it a bug or did I something wrong? Python 2.6.5 (r265:79063, Oct 28 2010, 20:56:56) Numpy-Version is probably 1.5.0 h5py 1.3.0 Happens under 32bit as well as under 64bit SuSE-release 11.3 Kind regards! Johannes From robert.kern at gmail.com Mon Dec 27 14:15:05 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 27 Dec 2010 14:15:05 -0500 Subject: [Numpy-discussion] How to call import_array() properly? In-Reply-To: References: Message-ID: On Mon, Dec 27, 2010 at 13:09, Bruce Sherwood wrote: > Thanks for the good suggestion. I now see that it was purely > historical that import_array was driven (indirectly through > init_numpy) from the pure Python component of the module rather than > in the import of the C++ component, and I've changed that. 
However, > I'm still curious as to whether there's a more intelligent or elegant > way to drive import_array than the following code: > > #if PY_MAJOR_VERSION >= 3 > int > init_numpy() > { > ? ? ? ?import_array(); > } > #else > void > init_numpy() > { > ? ? ? ?import_array(); > } > #endif Just put "import_array();" into initcvisual(). You should not put it in any other function. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From roger at quantumbioinc.com Mon Dec 27 14:40:11 2010 From: roger at quantumbioinc.com (Roger Martin) Date: Mon, 27 Dec 2010 14:40:11 -0500 Subject: [Numpy-discussion] Installing on CentOS 5 claims invalid Python installation In-Reply-To: <4D13A9E1.6030602@gmail.com> References: <4D134AE4.6090001@quantumbioinc.com> <4D13A9E1.6030602@gmail.com> Message-ID: <4D18EB9B.10308@quantumbioinc.com> Hi Bruce and Chris, This was a user build and install of Python (particularly 2.6.6 since 2.7.1 has build troubles on CentOS 5). The original python 2.4 in the system is ignored for this effort because I can't get to it. Since I was unfamiliar with building Python from source I didn't know it should produce python development where the --prefix points the install. It is supposed to under the altinstall target. By looking at the make I found(with the autoconf 2.63 version) the make altinstall target simply wasn't running all its subtargets even though no install error occurred. Ran the inclinstall, libainstall, sharedinstall targets and the distribution was populated with the devel components needed! In fact the top of the configure.in of python 2.6.6 source build it says dnl NOTE: autoconf 2.64 doesn't seem to work (use 2.61). and I was at 2.63; not sure if same problem they were noting. ......... ./configure --prefix=/home/roger/Python-2.6.6/dist make #make test make altinstall make inclinstall make libainstall make sharedinstall #make oldsharedinstall ......... This is all python install issues and has nothing to do with numpy install. The numpy install followed: ......... export PYTHONPATH=/home/roger/Python-2.6.6/dist/lib/python2.6 export PYTHONHOME=/home/roger/Python-2.6.6/dist export MKLROOT=/share/apps/intel/mkl/10.2.5.035 export PATH=/home/roger/Python-2.6.6/dist/bin:$PATH export LD_LIBRARY_PATH=$MKLROOT/lib/em64t:/home/roger/Python-2.6.6/build/lib.linux-x86_64-2.6:/lib64:/usr/lib64:/lib:$LD_LIBRARY_PATH export PYTHONPATH=/home/roger/Python-2.6.6/dist/lib/python2.6:/home/roger/Python-2.6.6/build/lib.linux-x86_64-2.6:/home/roger/Python-2.6.6/Lib:/home/roger/Python-2.6.6/Modules export PYTHONHOME=/home/roger/Python-2.6.6/dist export PATH=/home/roger/Python-2.6.6/dist/bin:$PATH #export PYTHONVERBOSE=1 #python2.6 setup.py clean python2.6 setup.py build --fcompiler=gnu95 python2.6 setup.py install ......... Success! Then a quick test: ........ Python 2.6.6 (r266:84292, Dec 22 2010, 13:28:53) [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> a = numpy.arange(10).reshape(2,5) >>> a array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) >>> ........ Tested successfully! Python 3.2 didn't have any of the python install issues and numpy installs and functions on it too. Now on to h5py utilizing numpy. 
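As a first smoke test of h5py together with numpy I'll try something along these lines (the file name is just an example):

import numpy as np
import h5py

f = h5py.File('check.h5', 'w')
f['x'] = np.arange(10)       # write a small dataset
f.close()
f = h5py.File('check.h5', 'r')
print(f['x'][:])             # read it back as a numpy array
f.close()
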
Thanks for the discussion; you lead me to understand when building python from source the devel portion should be there, Roger On 12/23/2010 02:58 PM, Bruce Southey wrote: > On 12/23/2010 07:13 AM, Roger Martin wrote: >> Hi, >> >> NumPy looks like the way to get computation done in Python. Now I'm >> going through the learning curve of installing the module into >> different linux OS's and Python versions. An extra need is to >> install google code's h5py http://code.google.com/p/h5py/ which >> depends on numpy. >> >> In trying a number of Python versions the 2.x's are yielding the >> message " invalid Python installation" >> --------------- >> raise DistutilsPlatformError(my_msg) >> distutils.errors.DistutilsPlatformError: invalid Python installation: >> unable to open >> /home/roger/Python-2.6.6/dist/lib/python2.6/config/Makefile (No such >> file or directory) >> --------------- >> >> From reading on the web it appears a Python-2.x.x-devel version is >> needed. Yet no search combination comes back with where to get such >> a thing(note: I need user installs/builds for security reasons). >> Where are Python versions compatible with numpy? >> >> Building >> Python-2.6.6 >> Python-2.7.1(fails to build) >> Python3.2beta2 >> numpy1.5.1 >> invalid Python installation NA >> success >> h5py1.3.1 >> needs numpy >> NA >> fails >> >> >> To start I need just one successful combination but will need more >> cases depending on users of a new integration project. >> >> Interestingly your numpy 1.5.1's setup is in good shape to build with >> Python3.2 yet I need to allow older versions for people's systems not >> ready to upgrade that far. >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I thought that Centos 5 ships Python 2.4 so how did you get Python > 2.6, 2.7 and 3.2? > If these are from some repository then the developmental libraries > should also be there - if these are not there then either find another > repository or build Python yourself. > > Bruce > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bruce_Sherwood at ncsu.edu Mon Dec 27 14:51:35 2010 From: Bruce_Sherwood at ncsu.edu (Bruce Sherwood) Date: Mon, 27 Dec 2010 12:51:35 -0700 Subject: [Numpy-discussion] How to call import_array() properly? In-Reply-To: References: Message-ID: The module I'm working with, which uses Boost, doesn't have a function "initcvisual". Rather there's a section headed with BOOST_PYTHON_MODULE( cvisual). Placing the import_array macro directly in this section causes an unwanted return. I guess it doesn't matter, since what I've done works okay. And I realized that I could collapse init_numpy a bit: #if PY_MAJOR_VERSION >= 3 int #else void #endif init_numpy() { import_array(); } Bruce Sherwood On Mon, Dec 27, 2010 at 12:15 PM, Robert Kern wrote: > On Mon, Dec 27, 2010 at 13:09, Bruce Sherwood wrote: >> Thanks for the good suggestion. I now see that it was purely >> historical that import_array was driven (indirectly through >> init_numpy) from the pure Python component of the module rather than >> in the import of the C++ component, and I've changed that. 
However, >> I'm still curious as to whether there's a more intelligent or elegant >> way to drive import_array than the following code: >> >> #if PY_MAJOR_VERSION >= 3 >> int >> init_numpy() >> { >> ? ? ? ?import_array(); >> } >> #else >> void >> init_numpy() >> { >> ? ? ? ?import_array(); >> } >> #endif > > Just put "import_array();" into initcvisual(). You should not put it > in any other function. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dsdale24 at gmail.com Tue Dec 28 08:46:05 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Tue, 28 Dec 2010 08:46:05 -0500 Subject: [Numpy-discussion] Strange problem with h5py and numpy In-Reply-To: References: Message-ID: On Mon, Dec 27, 2010 at 12:58 PM, Johannes Korn wrote: > Hi, > > I have a strange problem with h5py or with numpy. I think this question belongs on the h5py mailing list. > I try to read a bunch of hdf files in a loop. The problem is that I get > an error at the second file because the file handle is of type HDF5 file> The code you posted only involves one file. From korn at freisingnet.de Tue Dec 28 09:13:42 2010 From: korn at freisingnet.de (Johannes Korn) Date: Tue, 28 Dec 2010 15:13:42 +0100 Subject: [Numpy-discussion] Strange problem with h5py and numpy In-Reply-To: References: Message-ID: On 28.12.2010 14:46, Darren Dale wrote:: > On Mon, Dec 27, 2010 at 12:58 PM, Johannes Korn wrote: >> I try to read a bunch of hdf files in a loop. The problem is that I get >> an error at the second file because the file handle is of type> HDF5 file> > > The code you posted only involves one file. The code I posted is part of the inside of a loop over the files. The filename changes of course and the files are there. If I try to open a non existing file the error message is different. From korn at freisingnet.de Tue Dec 28 10:14:20 2010 From: korn at freisingnet.de (Johannes Korn) Date: Tue, 28 Dec 2010 16:14:20 +0100 Subject: [Numpy-discussion] Strange problem with h5py and numpy In-Reply-To: References: Message-ID: On 28.12.2010 15:13, Johannes Korn wrote:: > On 28.12.2010 14:46, Darren Dale wrote:: >> On Mon, Dec 27, 2010 at 12:58 PM, Johannes Korn wrote: > >>> I try to read a bunch of hdf files in a loop. The problem is that I get >>> an error at the second file because the file handle is of type>> HDF5 file> >> >> The code you posted only involves one file. > > The code I posted is part of the inside of a loop over the files. The > filename changes of course and the files are there. If I try to open a > non existing file the error message is different. Found the solution: incompatibility between HDF 1.8.5 and h5py 1.3.0. Seems that upgrade to h5py 1.3.1 beta fixed the problem From kwgoodman at gmail.com Tue Dec 28 18:32:55 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 28 Dec 2010 15:32:55 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() Message-ID: I'm looking for the C-API equivalent of the np.float64 function, something that I could use inline in a Cython function. I don't know how to write the function. Anyone have one sitting around? I'd like to use it, if it is faster than np.float64 (np.int32, np.float32, ...) 
in the Bottleneck package when the output is a scalar, for example bn.median(arr, axis=None). From jsalvati at u.washington.edu Tue Dec 28 23:10:53 2010 From: jsalvati at u.washington.edu (John Salvatier) Date: Tue, 28 Dec 2010 20:10:53 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: Wouldn't that be a cast? You do casts in Cython with (expression) and that should be the equivalent of float64 I think. On Tue, Dec 28, 2010 at 3:32 PM, Keith Goodman wrote: > I'm looking for the C-API equivalent of the np.float64 function, > something that I could use inline in a Cython function. > > I don't know how to write the function. Anyone have one sitting > around? I'd like to use it, if it is faster than np.float64 (np.int32, > np.float32, ...) in the Bottleneck package when the output is a > scalar, for example bn.median(arr, axis=None). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertwb at math.washington.edu Wed Dec 29 02:22:55 2010 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 28 Dec 2010 23:22:55 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier wrote: > Wouldn't that be a cast? You do casts in Cython with (expression) > and that should be the equivalent of float64 I think. Or even (expression) if you've cimported numpy (though as mentioned this is the same as double on every platform I know of). Even easier is just to use the expression in a the right context and it will convert it for you. - Robert > On Tue, Dec 28, 2010 at 3:32 PM, Keith Goodman wrote: >> >> I'm looking for the C-API equivalent of the np.float64 function, >> something that I could use inline in a Cython function. >> >> I don't know how to write the function. Anyone have one sitting >> around? I'd like to use it, if it is faster than np.float64 (np.int32, >> np.float32, ...) in the Bottleneck package when the output is a >> scalar, for example bn.median(arr, axis=None). >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From robertwb at math.washington.edu Wed Dec 29 03:47:21 2010 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 29 Dec 2010 00:47:21 -0800 Subject: [Numpy-discussion] Optimization suggestion sought In-Reply-To: References: Message-ID: On Mon, Dec 27, 2010 at 6:20 AM, Enzo Michelangeli wrote: > Many thanks to Josef and Justin for their replies. > > Josef's hint sounds like a good way of reducing peak memory allocation > especially when the row size is large, which makes the "for" overhead for > each iteration comparatively lower. However, time is still spent in > back-and-forth conversions between numpy arrays and the native BLAS data > structures, and copying data from the temporary array holding the > intermediate results and tableau. 
> > Regarding Justin's suggestion, before trying Cython (which, according to > http://wiki.cython.org/tutorials/numpy , seems to require a bit of work to > handle numpy arrays properly) Cython doesn't have to be that complicated. For your example, you just have to unroll the vectorization (and account for the fact that the result is mutated in place, which was your original goal). cimport numpy def do_it(numpy.ndarray[double, ndim=2] tableau, int locat, int cand, bint vectorize=True): cdef numpy.ndarray[double, ndim=1] pivot pivot = tableau[locat,:]/tableau[locat,cand] if vectorize: tableau -= tableau[:,cand:cand+1]*pivot else: for i in range(tableau.shape[0]): for j in range(tableau.shape[1]): if j != cand: tableau[i,j] -= tableau[i,cand] * pivot[j] tableau[:,cand] = 0 tableau[locat,:] = pivot return tableau - Robert From friedrichromstedt at gmail.com Wed Dec 29 06:28:56 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Wed, 29 Dec 2010 12:28:56 +0100 Subject: [Numpy-discussion] fromrecords yields "ValueError: invalid itemsize in generic type tuple" In-Reply-To: References: Message-ID: 2010/12/7 Rajat Banerjee : > Hi All, > I have been using Numpy for a while with great success. I left my > little project for a little while > (http://web.mit.edu/stardev/cluster/) and now some of my code is > broken. > > I have some Numpy code to create graphs of activity on a cluster with > matplotlib. It ran just fine in July / August 2010, but has since > stopped working. I have updated numpy on my machine, I think. > > In [2]: np.version.version > Out[2]: '1.5.1' > > My call to np.rec.fromrecords() is throwing this exception: > > File "/home/rajat/Envs/StarCluster/lib/python2.6/site-packages/numpy/core/records.py", > line 607, in fromrecords > descr = sb.dtype((record, dtype)) > ValueError: invalid itemsize in generic type tuple > > Here is the code with some irrelevant stuff stripped: > > for line in file: > a = [datetime.strptime(parts[0], '%Y-%m-%d %H:%M:%S.%f'), > int(parts[1]), int(parts[2]), int(parts[3]), int(parts[4]), > int(parts[5]), int(parts[6]), float(parts[7])] > list.append(a) > file.close() > names = ['dt', 'hosts', 'running_jobs', 'queued_jobs',\ > 'slots', 'avg_duration', 'avg_wait', 'avg_load'] > descriptor = {'names': > ('dt,hosts,running_jobs,queued_jobs,slots,avg_duration,avg_wait,avg_load'),\ > 'formats' : ('S20','u','u','u','u','u','u','f')} > self.records = np.rec.fromrecords(list,','.join(names)) #used to work > #self.records = np.rec.fromrecords(list, dtype=descriptor) #new attempt > > Here is one "line" from the array "list": >>>> parts (8) = ['2010-12-07 03:09:46.855712', '2', '2', '177', '2', '86', '370', '1.05']. > > Neither of those np.rec.fromrecords() calls works. I've tried both > separately. They both throw the exact same exception, ValueError: > invalid itemsize in generic type tuple Hi Rajat, seems to be good that I read all email on the list, seems to be bad that it's such a long queue. Consider the script attached. Remarks: * Use tuples as rows in the numpy.rec array "raw" argument. It works for the first conversion with [] too, but I think more by incident than by design. For the second case, which you will need, it does not work with lists. * Always use keyword args to fromrecords(). I believe this is a) more error-prone b) there is no specification for positional arguments, so their order might change (as it seems to have happened). With positional "names", it ceases working. 
I don't know what it thinks you are requesting, but for sure not "names". :-) * Don't use the *dtype* in the way you did. I'm not authoritative with the *dtype* arg, but at least it doesn't work this way. Use the names= and formats= kwargs instead. I just tinkered a bit around with your code without deep knowledge of the numpy.rec package. I just used fromrecords() some time ago in the way I did use it here. Friedrich P.S.: Please reply, if you don't I'll resend the email to you OL in the assumtion that you desperately disappointedly unsubscribed. -------------- next part -------------- A non-text attachment was scrubbed... Name: rec.py Type: application/octet-stream Size: 405 bytes Desc: not available URL: From kwgoodman at gmail.com Wed Dec 29 12:05:52 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 09:05:52 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw wrote: > On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier > wrote: >> Wouldn't that be a cast? You do casts in Cython with (expression) >> and that should be the equivalent of float64 I think. > > Or even (expression) if you've cimported numpy > (though as mentioned this is the same as double on every platform I > know of). Even easier is just to use the expression in a the right > context and it will convert it for you. That will give me a float object but it will not have dtype, shape, ndim, etc methods. >> m = np.mean([1,2,3]) >> m 2.0 >> m.dtype dtype('float64') >> m.ndim 0 using gives: AttributeError: 'float' object has no attribute 'dtype' From robertwb at math.washington.edu Wed Dec 29 12:37:15 2010 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Wed, 29 Dec 2010 09:37:15 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: > On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw > wrote: >> On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier >> wrote: >>> Wouldn't that be a cast? You do casts in Cython with (expression) >>> and that should be the equivalent of float64 I think. >> >> Or even (expression) if you've cimported numpy >> (though as mentioned this is the same as double on every platform I >> know of). Even easier is just to use the expression in a the right >> context and it will convert it for you. > > That will give me a float object but it will not have dtype, shape, > ndim, etc methods. > >>> m = np.mean([1,2,3]) >>> m > ? 2.0 >>> m.dtype > ? dtype('float64') >>> m.ndim > ? 0 > > using gives: > > AttributeError: 'float' object has no attribute 'dtype' Well, in this case I doubt your'e going to be able to do much better than np.float64(expr), as the bulk or the time is probably spent in object allocation (and you're really asking for an object here). If you knew the right C calls, you might be able to get a 2x speedup. - Robert From kwgoodman at gmail.com Wed Dec 29 12:44:58 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 09:44:58 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Wed, Dec 29, 2010 at 9:37 AM, Robert Bradshaw wrote: > On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: >> On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw >> wrote: >>> On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier >>> wrote: >>>> Wouldn't that be a cast? 
You do casts in Cython with (expression) >>>> and that should be the equivalent of float64 I think. >>> >>> Or even (expression) if you've cimported numpy >>> (though as mentioned this is the same as double on every platform I >>> know of). Even easier is just to use the expression in a the right >>> context and it will convert it for you. >> >> That will give me a float object but it will not have dtype, shape, >> ndim, etc methods. >> >>>> m = np.mean([1,2,3]) >>>> m >> ? 2.0 >>>> m.dtype >> ? dtype('float64') >>>> m.ndim >> ? 0 >> >> using gives: >> >> AttributeError: 'float' object has no attribute 'dtype' > > Well, in this case I doubt your'e going to be able to do much better > than np.float64(expr), as the bulk or the time is probably spent in > object allocation (and you're really asking for an object here). If > you knew the right C calls, you might be able to get a 2x speedup. A factor of 2 would be great! A tenth of a micro second is a lot of overhead for small input arrays. I'm guessing it is one of these functions but I don't understand the signatures (nor ref counting): PyObject* PyArray_Scalar(void* data, PyArray_Descr* dtype, PyObject* itemsize) Return an array scalar object of the given enumerated typenum and itemsize by copying from memory pointed to by data . If swap is nonzero then this function will byteswap the data if appropriate to the data-type because array scalars are always in correct machine-byte order. PyObject* PyArray_ToScalar(void* data, PyArrayObject* arr) Return an array scalar object of the type and itemsize indicated by the array object arr copied from the memory pointed to by data and swapping if the data in arr is not in machine byte-order. From matthew.brett at gmail.com Wed Dec 29 12:48:05 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 29 Dec 2010 17:48:05 +0000 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: Hi, On Wed, Dec 29, 2010 at 5:37 PM, Robert Bradshaw wrote: > On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: >> On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw >> wrote: >>> On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier >>> wrote: >>>> Wouldn't that be a cast? You do casts in Cython with (expression) >>>> and that should be the equivalent of float64 I think. >>> >>> Or even (expression) if you've cimported numpy >>> (though as mentioned this is the same as double on every platform I >>> know of). Even easier is just to use the expression in a the right >>> context and it will convert it for you. >> >> That will give me a float object but it will not have dtype, shape, >> ndim, etc methods. >> >>>> m = np.mean([1,2,3]) >>>> m >> ? 2.0 >>>> m.dtype >> ? dtype('float64') >>>> m.ndim >> ? 0 >> >> using gives: >> >> AttributeError: 'float' object has no attribute 'dtype' Forgive me if I haven't understood your question, but can you use PyArray_DescrFromType with e.g NPY_FLOAT64 ? Best, Matthew From kwgoodman at gmail.com Wed Dec 29 12:55:49 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 09:55:49 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Wed, Dec 29, 2010 at 9:48 AM, Matthew Brett wrote: > Hi, > > On Wed, Dec 29, 2010 at 5:37 PM, Robert Bradshaw > wrote: >> On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: >>> On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw >>> wrote: >>>> On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier >>>> wrote: >>>>> Wouldn't that be a cast? 
You do casts in Cython with (expression) >>>>> and that should be the equivalent of float64 I think. >>>> >>>> Or even (expression) if you've cimported numpy >>>> (though as mentioned this is the same as double on every platform I >>>> know of). Even easier is just to use the expression in a the right >>>> context and it will convert it for you. >>> >>> That will give me a float object but it will not have dtype, shape, >>> ndim, etc methods. >>> >>>>> m = np.mean([1,2,3]) >>>>> m >>> ? 2.0 >>>>> m.dtype >>> ? dtype('float64') >>>>> m.ndim >>> ? 0 >>> >>> using gives: >>> >>> AttributeError: 'float' object has no attribute 'dtype' > > Forgive me if I haven't understood your question, but can you use > PyArray_DescrFromType with e.g ?NPY_FLOAT64 ? I'm pretty hopeless here. I don't know how to put all that together in a function. From matthew.brett at gmail.com Wed Dec 29 13:13:09 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 29 Dec 2010 18:13:09 +0000 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: >> Forgive me if I haven't understood your question, but can you use >> PyArray_DescrFromType with e.g ?NPY_FLOAT64 ? > > I'm pretty hopeless here. I don't know how to put all that together in > a function. That might be because I'm not understanding you very well, but I was thinking that: cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64) would give you the float64 dtype that I thought you wanted? I'm shooting from the hip here, in between nieces competing for the computer and my attention. See you, Matthew From kwgoodman at gmail.com Wed Dec 29 13:27:34 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 10:27:34 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Wed, Dec 29, 2010 at 10:13 AM, Matthew Brett wrote: >>> Forgive me if I haven't understood your question, but can you use >>> PyArray_DescrFromType with e.g ?NPY_FLOAT64 ? >> >> I'm pretty hopeless here. I don't know how to put all that together in >> a function. > > That might be because I'm not understanding you very well, but I was > thinking that: > > cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64) > > would give you the float64 dtype that I thought you wanted? ?I'm > shooting from the hip here, in between nieces competing for the > computer and my attention. I think I need a function. One that does this: >> n = 10.0 >> hasattr(n, 'ndim') False >> m = np.float64(n) >> hasattr(m, 'ndim') True np.float64 is fast, just hoping someone had a C-API inline version of np.float64() that is faster. From matthew.brett at gmail.com Wed Dec 29 14:43:22 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 29 Dec 2010 19:43:22 +0000 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: Hi, >> That might be because I'm not understanding you very well, but I was >> thinking that: >> >> cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64) >> >> would give you the float64 dtype that I thought you wanted? ?I'm >> shooting from the hip here, in between nieces competing for the >> computer and my attention. > > I think I need a function. One that does this: > >>> n = 10.0 >>> hasattr(n, 'ndim') > ? False >>> m = np.float64(n) >>> hasattr(m, 'ndim') > ? True Now the nieces have gone, I see that I did completely misunderstand. I think you want the C-API calls to be able to create a 0-dim ndarray object from a python float. 
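For the scalar case, something along these lines may already be enough; this is an untested sketch from memory, so please check the signature and the reference counting against the numpy headers before relying on it:

cimport numpy as cnp
cnp.import_array()

cdef inline object make_float64(double value):
    # roughly what np.float64(value) does: build a float64 array scalar
    return cnp.PyArray_Scalar(&value, cnp.PyArray_DescrFromType(cnp.NPY_FLOAT64), None)
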
There was a thread on C-API array creation on the cython list a little while ago: http://www.mail-archive.com/cython-dev at codespeak.net/msg07703.html Code in scipy here: https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/mio5_utils.pyx See around line 36 there, and 432, and the header file I copied from Dag Sverre: https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/numpy_rephrasing.h As you can see, it's a little horrible, in that you have to take care to get the references right to the dtype and to the data. I actually did not investigate in detail whether this lower-level array creation was speeding my code up much. I hope that's more useful... Matthew From kwgoodman at gmail.com Wed Dec 29 14:53:35 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 11:53:35 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: On Wed, Dec 29, 2010 at 11:43 AM, Matthew Brett wrote: > Hi, > >>> That might be because I'm not understanding you very well, but I was >>> thinking that: >>> >>> cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64) >>> >>> would give you the float64 dtype that I thought you wanted? ?I'm >>> shooting from the hip here, in between nieces competing for the >>> computer and my attention. >> >> I think I need a function. One that does this: >> >>>> n = 10.0 >>>> hasattr(n, 'ndim') >> ? False >>>> m = np.float64(n) >>>> hasattr(m, 'ndim') >> ? True > > Now the nieces have gone, I see that I did completely misunderstand. > I think you want the C-API calls to be able to create a 0-dim ndarray > object from a python float. > > There was a thread on C-API array creation on the cython list a little > while ago: > > http://www.mail-archive.com/cython-dev at codespeak.net/msg07703.html > > Code in scipy here: > > https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/mio5_utils.pyx > > See around line 36 there, and 432, and the header file I copied from > Dag Sverre: > > https://github.com/scipy/scipy-svn/blob/master/scipy/io/matlab/numpy_rephrasing.h > > As you can see, it's a little horrible, in that you have to take care > to get the references right to the dtype and to the data. ?I actually > did not investigate in detail whether this lower-level array creation > was speeding my code up much. > > I hope that's more useful... Wow! That's a mouthful of code. Yes, very handy to have an example to work from. Thank you. From pav at iki.fi Wed Dec 29 14:54:03 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 29 Dec 2010 21:54:03 +0200 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: References: Message-ID: <1293652443.27212.2.camel@Nokia-N900-42-11> Keith Goodman wrote: > np.float64 is fast, just hoping someone had a C-API inline version of > np.float64() that is faster. You're looking for PyArrayScalar_New and _ASSIGN. See https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/arrayscalars.h Undocumented (bad), but AFAIK public. From kwgoodman at gmail.com Wed Dec 29 15:13:03 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 29 Dec 2010 12:13:03 -0800 Subject: [Numpy-discussion] NumPy C-API equivalent of np.float64() In-Reply-To: <1293652443.27212.2.camel@Nokia-N900-42-11> References: <1293652443.27212.2.camel@Nokia-N900-42-11> Message-ID: On Wed, Dec 29, 2010 at 11:54 AM, Pauli Virtanen wrote: > Keith Goodman wrote: >> np.float64 is fast, just hoping someone had a C-API inline version of >> np.float64() that is faster. 
> > You're looking for PyArrayScalar_New and _ASSIGN. > See https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/arrayscalars.h > > Undocumented (bad), but AFAIK public. Those look nice. I'm stuck since I can't cimport them. I'll have to read up on how to tell cython about those functions. From kmichael.aye at gmail.com Thu Dec 30 08:27:46 2010 From: kmichael.aye at gmail.com (K.-Michael Aye) Date: Thu, 30 Dec 2010 15:27:46 +0200 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? Message-ID: Dear all, I'm a bit puzzled that there seems just no way to cleanly code an interval with evenly spaced numbers that includes the stop point given? linspace offers to include the stop point, but arange does not? Am I missing something? (I am aware, that I could do arange(9,15.0001,0.1) but that's what I want to avoid!) Best regards and Happy New Year! Michael From friedrichromstedt at gmail.com Thu Dec 30 09:02:46 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 30 Dec 2010 15:02:46 +0100 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? In-Reply-To: References: Message-ID: 2010/12/30 K.-Michael Aye : > I'm a bit puzzled that there seems just no way to cleanly code an > interval with evenly spaced numbers that includes the stop point given? > linspace offers to include the stop point, but arange does not? > Am I missing something? (I am aware, that I could do > arange(9,15.0001,0.1) but that's what I want to avoid!) Use numpy.linspace(9, 15, 7 * 10 + 1). FYI, there is also numpy.logspace(). Friedrich From friedrichromstedt at gmail.com Thu Dec 30 09:08:09 2010 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 30 Dec 2010 15:08:09 +0100 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? In-Reply-To: References: Message-ID: 2010/12/30 Friedrich Romstedt : > 2010/12/30 K.-Michael Aye : >> I'm a bit puzzled that there seems just no way to cleanly code an >> interval with evenly spaced numbers that includes the stop point given? >> linspace offers to include the stop point, but arange does not? >> Am I missing something? (I am aware, that I could do >> arange(9,15.0001,0.1) but that's what I want to avoid!) > > Use numpy.linspace(9, 15, 7 * 10 + 1). ?FYI, there is also numpy.logspace(). Oh sorry, I overlooked that you're aware of the linspace functionality. Sorry. I think opting in or opting out the end point in arange() is at even rate, because it's in both cases the same unreliable (about including or not including the end point). Because it might pick a) if opting in a point just 1e-14 above so not opting in as desired and b) vice verse if opting out, it might pick a point just 1e-14 below. But I believe someone more educated about fp issues will give a more authoritative reply. Friedrich From josef.pktd at gmail.com Thu Dec 30 09:43:12 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 30 Dec 2010 09:43:12 -0500 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? In-Reply-To: References: Message-ID: On Thu, Dec 30, 2010 at 9:08 AM, Friedrich Romstedt wrote: > 2010/12/30 Friedrich Romstedt : >> 2010/12/30 K.-Michael Aye : >>> I'm a bit puzzled that there seems just no way to cleanly code an >>> interval with evenly spaced numbers that includes the stop point given? >>> linspace offers to include the stop point, but arange does not? >>> Am I missing something? (I am aware, that I could do >>> arange(9,15.0001,0.1) but that's what I want to avoid!) 
>> >> Use numpy.linspace(9, 15, 7 * 10 + 1). ?FYI, there is also numpy.logspace(). > > Oh sorry, I overlooked that you're aware of the linspace functionality. ?Sorry. > > I think opting in or opting out the end point in arange() is at even > rate, because it's in both cases the same unreliable (about including > or not including the end point). ?Because it might pick a) if opting > in a point just 1e-14 above so not opting in as desired and b) vice > verse if opting out, it might pick a point just 1e-14 below. ?But I > believe someone more educated about fp issues will give a more > authoritative reply. Since linspace exists, I don't see much point in adding the stop point in arange. I use arange mainly for integers as numpy equivalent of python's range. And I often need arange(n+1) which is less writing than arange(n, include_end_point=True) Josef > > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From kmichael.aye at gmail.com Thu Dec 30 09:57:50 2010 From: kmichael.aye at gmail.com (K.-Michael Aye) Date: Thu, 30 Dec 2010 16:57:50 +0200 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? References: Message-ID: On 2010-12-30 16:43:12 +0200, josef.pktd at gmail.com said: > > Since linspace exists, I don't see much point in adding the stop point > in arange. I use arange mainly for integers as numpy equivalent of > python's range. And I often need arange(n+1) which is less writing > than arange(n, include_end_point=True) I agree with the point of writing gets more in some cases. But arange(a, n+1, 0.1) would of course fail in this case. And the big difference is, that I need to calculate first how many steps it is for linspace to achieve what I believe is a frequent user case. As we already have the 'convenience' of both linspace and arange, which in principle could be done by one function alone if we'd precalculate all required information ourselves, why not go the full way, and take all overhead away from the user? Michael > > Josef > >> >> Friedrich >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthieu.brucher at gmail.com Thu Dec 30 10:12:03 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 30 Dec 2010 16:12:03 +0100 Subject: [Numpy-discussion] Why arange has no stop-point opt-in? In-Reply-To: References: Message-ID: 2010/12/30 K.-Michael Aye : > On 2010-12-30 16:43:12 +0200, josef.pktd at gmail.com said: > >> >> Since linspace exists, I don't see much point in adding the stop point >> in arange. I use arange mainly for integers as numpy equivalent of >> python's range. And I often need arange(n+1) which is less writing >> than arange(n, include_end_point=True) > > I agree with the point of writing gets more in some cases. > But arange(a, n+1, 0.1) would of course fail in this case. > And the big difference is, that I need to calculate first how many > steps it is for linspace to achieve what I believe is a frequent user > case. > As we already have the 'convenience' of both linspace and arange, which > in principle could be done by one function alone if we'd precalculate > all required information ourselves, why not go the full way, and take > all overhead away from the user? I think arange() should really be seen as just the numpy version of range(). 
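If the inclusive end point really matters, a tiny wrapper over linspace() can do the counting once. Purely illustrative (the name is made up, and it assumes the span is close to an integer number of steps):

import numpy as np

def crange(start, stop, step):
    n = int(round((stop - start) / step)) + 1
    return np.linspace(start, stop, n)

crange(9.0, 15.0, 0.1)   # 61 points, the last one exactly 15.0
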
The issue with including the stop point is that it well may be the case when you do arange(0, 1, 0.1). It's just a matter of loat precision. In this case, I think the safest course of action is to let the user decide how it can handle this. If the step can be expressed as a rational fraction, then using arange with floats and a step of one, it may be the simplest way to achieve what you want. i.e. : np.arange(90., 150.+1) / 10 Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From qubax at gmx.at Thu Dec 30 14:07:14 2010 From: qubax at gmx.at (qubax at gmx.at) Date: Thu, 30 Dec 2010 20:07:14 +0100 Subject: [Numpy-discussion] How to efficiently multiply 2**10 x 2**10 hermitian matrices Message-ID: <20101230190714.GA7993@tux.hotze.com> I'll have to work with large hermitian matrices and calculate traces, eigenvalues and perform several matric products. In order to speed those up, i noticed that blas includes a function called 'zhemm' for efficient matrix products with at least one hermitian matrix. is there a way to call that one directly for numpy arrays? are there other, more efficient methods for multiplying that large matrices that one of you might be aware of? especially with the knowledge that they are symmetric/hermitian. i'd appreciate any help in that regard. thanks, q ps: i tried to port the functionality of zhemm into cython, but this is still about a factor of 10 slower than directly using numpy.dot -- There are two things children should get from their parents: roots and wings. The king who needs to remind his people of his rank, is no king. A beggar's mistake harms no one but the beggar. A king's mistake, however, harms everyone but the king. Too often, the measure of power lies not in the number who obey your will, but in the number who suffer your stupidity. From erik at rigtorp.com Thu Dec 30 21:30:21 2010 From: erik at rigtorp.com (Erik Rigtorp) Date: Thu, 30 Dec 2010 21:30:21 -0500 Subject: [Numpy-discussion] Simple shared arrays Message-ID: Hi, I was trying to parallelize some algorithms and needed a writable array shared between processes. It turned out to be quite simple and gave a nice speed up almost linear in number of cores. Of course you need to know what you are doing to avoid segfaults and such. But I still think something like this should be included with NumPy for power users. This works by inheriting anonymous mmaped memory. Not sure if this works on windows. import numpy as np import multiprocessing as mp class shared(np.ndarray): """Shared writable array""" def __new__(subtype, shape, interface=None): size = np.prod(shape) if interface == None: buffer = mp.RawArray('d', size) self = np.ndarray.__new__(subtype, shape, float, buffer) else: class Dummy(object): pass buffer = Dummy() buffer.__array_interface__ = interface a = np.asarray(buffer) self = np.ndarray.__new__(subtype, shape=a.shape, buffer=a) return self def __reduce_ex__(self, protocol): return shared, (self.shape, self.__array_interface__) def __reduce__(self): return __reduce_ex__(self, 0) Also see attached file for example usage. Erik -------------- next part -------------- A non-text attachment was scrubbed... 
Name: shared.py Type: text/x-python Size: 1364 bytes Desc: not available URL: From pivanov314 at gmail.com Fri Dec 31 02:13:21 2010 From: pivanov314 at gmail.com (Paul Ivanov) Date: Thu, 30 Dec 2010 23:13:21 -0800 Subject: [Numpy-discussion] Simple shared arrays In-Reply-To: References: Message-ID: <20101231071321.GE19675@ykcyc> Erik Rigtorp, on 2010-12-30 21:30, wrote: > Hi, > > I was trying to parallelize some algorithms and needed a writable > array shared between processes. It turned out to be quite simple and > gave a nice speed up almost linear in number of cores. Of course you > need to know what you are doing to avoid segfaults and such. But I > still think something like this should be included with NumPy for > power users. > > This works by inheriting anonymous mmaped memory. Not sure if this > works on windows. --snip-- I've successfully used (what I think is) Sturla Molden's shmem_as_ndarray as outline here [1] and here [2] for these purposes. 1. http://groups.google.com/group/comp.lang.python/browse_thread/thread/79fcf022b01b7fc3 2. http://folk.uio.no/sturlamo/python/multiprocessing-tutorial.pdf -- Paul Ivanov 314 address only used for lists, off-list direct email at: http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: Digital signature URL: From erik at rigtorp.com Fri Dec 31 08:52:53 2010 From: erik at rigtorp.com (Erik Rigtorp) Date: Fri, 31 Dec 2010 08:52:53 -0500 Subject: [Numpy-discussion] Faster NaN functions Message-ID: Hi, I just send a pull request for some faster NaN functions, https://github.com/rigtorp/numpy. I implemented the following generalized ufuncs: nansum(), nancumsum(), nanmean(), nanstd() and for fun mean() and std(). It turns out that the generalized ufunc mean() and std() is faster than the current numpy functions. I'm also going to add nanprod(), nancumprod(), nanmax(), nanmin(), nanargmax(), nanargmin(). The current implementation is not optimized in any way and there are probably some speedups possible. I hope we can get this into numpy 2.0, me and people around me seems to have a need for these functions. Erik From erik at rigtorp.com Fri Dec 31 09:02:14 2010 From: erik at rigtorp.com (Erik Rigtorp) Date: Fri, 31 Dec 2010 09:02:14 -0500 Subject: [Numpy-discussion] Simple shared arrays In-Reply-To: <20101231071321.GE19675@ykcyc> References: <20101231071321.GE19675@ykcyc> Message-ID: On Fri, Dec 31, 2010 at 02:13, Paul Ivanov wrote: > Erik Rigtorp, on 2010-12-30 21:30, ?wrote: >> Hi, >> >> I was trying to parallelize some algorithms and needed a writable >> array shared between processes. It turned out to be quite simple and >> gave a nice speed up almost linear in number of cores. Of course you >> need to know what you are doing to avoid segfaults and such. But I >> still think something like this should be included with NumPy for >> power users. >> >> This works by inheriting anonymous mmaped memory. Not sure if this >> works on windows. > --snip-- > > I've successfully used (what I think is) Sturla Molden's > shmem_as_ndarray as outline here [1] and here [2] for these > purposes. > Yeah, i saw that code too. My implementation is even more lax, but easier to use. It sends arrays by memory reference to subprocesses. Dangerous: yes, effective: very. It would be nice if we could stamp out some good effective patterns using multiprocessing and include them with numpy. 
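The kind of pattern I have in mind looks roughly like this; illustrative only, it relies on fork() just like the class above, and the worker function is made up:

import numpy as np
import multiprocessing as mp

def fill_row(args):
    a, i = args                      # 'a' arrives as a view on the same memory
    a[i] = i * np.arange(a.shape[1])

if __name__ == '__main__':
    a = shared((4, 10))              # the shared class from my previous mail
    mp.Pool(4).map(fill_row, [(a, i) for i in range(a.shape[0])])
    print(a)                         # rows were filled in place by the workers
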
The best solution is probably a parallel_for function: def parallel_for(func, inherit_args, iterable): ... Where func should be def func(inherit_args, item): ... And parallel_for makes sure inherit_args are viewable as a class shared() with writable shared memory. Erik From lev at columbia.edu Fri Dec 31 11:21:14 2010 From: lev at columbia.edu (Lev Givon) Date: Fri, 31 Dec 2010 11:21:14 -0500 Subject: [Numpy-discussion] Faster NaN functions In-Reply-To: References: Message-ID: <20101231162114.GA17179@avicenna.ee.columbia.edu> Received from Erik Rigtorp on Fri, Dec 31, 2010 at 08:52:53AM EST: > Hi, > > I just send a pull request for some faster NaN functions, > https://github.com/rigtorp/numpy. > > I implemented the following generalized ufuncs: nansum(), nancumsum(), > nanmean(), nanstd() and for fun mean() and std(). It turns out that > the generalized ufunc mean() and std() is faster than the current > numpy functions. I'm also going to add nanprod(), nancumprod(), > nanmax(), nanmin(), nanargmax(), nanargmin(). > > The current implementation is not optimized in any way and there are > probably some speedups possible. > > I hope we can get this into numpy 2.0, me and people around me seems > to have a need for these functions. > > Erik > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > How does this compare to Bottleneck? http://pypi.python.org/pypi/Bottleneck/ L.G. From kwgoodman at gmail.com Fri Dec 31 12:20:45 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 31 Dec 2010 09:20:45 -0800 Subject: [Numpy-discussion] Faster NaN functions In-Reply-To: <20101231162114.GA17179@avicenna.ee.columbia.edu> References: <20101231162114.GA17179@avicenna.ee.columbia.edu> Message-ID: On Fri, Dec 31, 2010 at 8:21 AM, Lev Givon wrote: > Received from Erik Rigtorp on Fri, Dec 31, 2010 at 08:52:53AM EST: >> Hi, >> >> I just send a pull request for some faster NaN functions, >> https://github.com/rigtorp/numpy. >> >> I implemented the following generalized ufuncs: nansum(), nancumsum(), >> nanmean(), nanstd() and for fun mean() and std(). It turns out that >> the generalized ufunc mean() and std() is faster than the current >> numpy functions. I'm also going to add nanprod(), nancumprod(), >> nanmax(), nanmin(), nanargmax(), nanargmin(). >> >> The current implementation is not optimized in any way and there are >> probably some speedups possible. >> >> I hope we can get this into numpy 2.0, me and people around me seems >> to have a need for these functions. >> >> Erik >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > How does this compare to Bottleneck? > > http://pypi.python.org/pypi/Bottleneck/ I had all sorts of problems with ABI differences (this is the first time I've tried numpy 2.0). So I couldn't get ipython, etc to work with Erik's new nan functions. That's why my speed comparison below might be hard to follow and only tests one example. 
From gideon.simpson at gmail.com Fri Dec 31 16:44:15 2010 From: gideon.simpson at gmail.com (Gideon) Date: Fri, 31 Dec 2010 13:44:15 -0800 (PST) Subject: [Numpy-discussion] OS X binaries. Message-ID: I noticed that 1.5.1 was released, and sourceforge is suggesting I use the package numpy-1.5.1-py2.6-python.org-macosx10.3.dmg. However, I have an OS X 10.6 machine. Can/should I use this binary? Should I just compile from source? From totonixsame at gmail.com Fri Dec 31 16:47:27 2010 From: totonixsame at gmail.com (totonixsame at gmail.com) Date: Fri, 31 Dec 2010 19:47:27 -0200 Subject: [Numpy-discussion] OS X binaries. In-Reply-To: References: Message-ID: On Fri, Dec 31, 2010 at 7:44 PM, Gideon wrote: > I noticed that 1.5.1 was released, and sourceforge is suggesting I use > the package numpy-1.5.1-py2.6-python.org-macosx10.3.dmg. However, I > have an OS X 10.6 machine. > > Can/should I use this binary? > > Should I just compile from source? I suggest you install pip [1] and then use it to install numpy with this command:

pip install numpy

It compiles fast. [1] - http://pypi.python.org/pypi/pip From erik at rigtorp.com Fri Dec 31 23:29:14 2010 From: erik at rigtorp.com (Erik Rigtorp) Date: Fri, 31 Dec 2010 23:29:14 -0500 Subject: [Numpy-discussion] Rolling window (moving average, moving std, and more) Message-ID: Hi, Implementing moving average, moving std and other functions working over rolling windows using Python for loops is slow. This is an effective stride trick I learned from Keith Goodman's Bottleneck code, but generalized to arrays of any dimension. This trick allows the loop to be performed in C code and, in the future, hopefully using multiple cores.

import numpy as np

def rolling_window(a, window):
    """
    Make an ndarray with a rolling window of the last dimension.

    Parameters
    ----------
    a : array_like
        Array to add rolling window to
    window : int
        Size of rolling window

    Returns
    -------
    Array that is a view of the original array with an added
    dimension of size window.
    Examples
    --------
    >>> x=np.arange(10).reshape((2,5))
    >>> rolling_window(x, 3)
    array([[[0, 1, 2],
            [1, 2, 3],
            [2, 3, 4]],

           [[5, 6, 7],
            [6, 7, 8],
            [7, 8, 9]]])

    Calculate rolling mean of last dimension:

    >>> np.mean(rolling_window(x, 3), -1)
    array([[ 1.,  2.,  3.],
           [ 6.,  7.,  8.]])
    """
    if window < 1:
        raise ValueError, "`window` must be at least 1."
    if window > a.shape[-1]:
        raise ValueError, "`window` is too long."
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

Using np.swapaxes(-1, axis), rolling aggregations over any axis can be computed. I submitted a pull request to add this to the stride_tricks module.

Erik
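A small usage sketch of the swapaxes remark above, using the rolling_window function just defined; the array x and the window size here are only examples.

import numpy as np

x = np.arange(20.).reshape(4, 5)

# Rolling median over the last axis, window of 3:
med = np.median(rolling_window(x, 3), -1)   # shape (4, 3)

# Rolling std along axis 0: swap the target axis to the end, apply the
# window, reduce over the window, and swap back.
std0 = np.std(rolling_window(x.swapaxes(-1, 0), 3), -1).swapaxes(-1, 0)
# std0 has shape (2, 5): 4 - 3 + 1 window positions along the original axis 0.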