From pierre.haessig at crans.org  Thu Mar  1 03:37:25 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Thu, 01 Mar 2012 09:37:25 +0100
Subject: [Numpy-discussion] IPython 0.12 just entered Debian Testing
Message-ID: <4F4F3545.6020209@crans.org>

Hi,

Just to start the new month on a light & happy topic :
   IPython 0.12 has entered Debian Testing !

-- 
Pierre

(I'm not at all involved in the process that enabled IPython make its
way to Testing. I've been watching this quite closely however. I
"suspect" there was a decent amount of work on packaging all the
dependencies to make this possible and I'm thankful to the guys who did it)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/23592d8f/attachment.sig>

From jaakko.luttinen at aalto.fi  Thu Mar  1 05:23:13 2012
From: jaakko.luttinen at aalto.fi (Jaakko Luttinen)
Date: Thu, 1 Mar 2012 12:23:13 +0200
Subject: [Numpy-discussion] Special matrices with structure?
In-Reply-To: <4F46442C.7090100@aalto.fi>
References: <4F46442C.7090100@aalto.fi>
Message-ID: <4F4F4E11.3020102@aalto.fi>

On 02/23/2012 03:50 PM, Jaakko Luttinen wrote:
> Hi!
> 
> I was wondering whether it would be easy/possible/reasonable to have
> classes for arrays that have special structure in order to use less
> memory and speed up some computations?
> 
> For instance:
> - symmetric matrix could be stored in almost half the memory required by
> a non-symmetric matrix
> - diagonal matrix only needs to store the diagonal vector
> - Toeplitz matrix only needs to store one or two vectors
> - sparse matrix only needs to store non-zero elements (some
> implementations in scipy.sparse)
> - and so on

Note to self: BLAS has lots of functions for matrices having special
structure (symmetric, triangular, banded, ...), so I suppose it would
"only" require some Python class wrappers which are compatible with
ndarray/matrix. But I don't know how to make these classes compatible
with generic numpy functions such as numpy.multiply/numpy.dot/etc..
-Jaakko

> 
> If such classes were implemented, it would be nice if they worked with
> numpy functions (dot, diag, ...) and operations (+, *, +=, ...) easily.
> 
> I believe this has been discussed before but google didn't help a lot..
> 
> Regards,
> Jaakko
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From takowl at gmail.com  Thu Mar  1 05:26:35 2012
From: takowl at gmail.com (Thomas Kluyver)
Date: Thu, 1 Mar 2012 10:26:35 +0000
Subject: [Numpy-discussion] IPython 0.12 just entered Debian Testing
In-Reply-To: <4F4F3545.6020209@crans.org>
References: <4F4F3545.6020209@crans.org>
Message-ID: <CAOvn4qgtOyOy5s=E3Z0dPV0xMrz+4PwWdtnA0QrNR7-2NS8yCA@mail.gmail.com>

On 1 March 2012 08:37, Pierre Haessig <pierre.haessig at crans.org> wrote:

> Just to start the new month on a light & happy topic :
>   IPython 0.12 has entered Debian Testing !
>

Thanks to Julian Taylor for handling Debian packaging.

IPython 0.12 is also in the upcoming Ubuntu 12.04.

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/4fa1b3b1/attachment.html>

From rhattersley at gmail.com  Thu Mar  1 05:30:18 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Thu, 1 Mar 2012 10:30:18 +0000
Subject: [Numpy-discussion] Proposed Roadmap Overview
Message-ID: <CAP=RS9nQnoWL1jRVmRPafEqMz_QErs2ka9D2UsQpoM64BzBfcw@mail.gmail.com>

+1 on the NEP guideline

As part of a team building a scientific analysis library, I'm
attempting to understand the current state of NumPy development and
its likely future (with a view to contributing if appropriate). The
proposed NEP process would make that a whole lot easier. And if
nothing else, it would reduce the chance of me posting questions about
topics that had already been discussed/decided!

Without the process the NEPs become another potential source of
confusion and mixed messages.


On 1 March 2012 03:02, Travis Oliphant wrote:
> I Would like to hear the opinions of others on that point,

> but yes,  I think that is an appropriate procedure.

>

> Travis

>

> --

> Travis Oliphant

> (on a mobile)

> 512-826-7480

>

>

> On Feb 29, 2012, at 10:54 AM, Matthew Brett

> <matthew.brett at gmail.com> wrote:

>

> > Hi,

> >

> > On Wed, Feb 29, 2012 at 1:46 AM, Travis Oliphant

> <travis at continuum.io> wrote:

> >> We already use the NEP process for such decisions.   This

> discussion came from simply from the *idea* of writing such a NEP.

> >>

> >> Nothing has been decided.  Only opinions have been shared

> that might influence the NEP.  This is all pretty premature,

> though ---  migration to C++ features on a trial branch is

> some months away were it to happen.

> >

> > Fernando can correct me if I'm wrong, but I think he was asking a

> > governance question.   That is: would you (as BDF$N) consider the

> > following guideline:

> >

> > "As a condition for accepting significant changes to Numpy, for each

> > significant change, there will be a NEP.  The NEP shall follow the

> > same model as the Python PEPs - that is - there will be a summary of

> > the changes, the issues arising, the for / against opinions and

> > alternatives offered.  There will usually be a draft implementation.

> > The NEP will contain the resolution of the discussion as it

> relates to

> > the code"

> >

> > For example, the masked array NEP, although very

> substantial, contains

> > little discussion of the controversy arising, or the intended

> > resolution of the controversy:

> >

> >

> https://github.com/numpy/numpy/blob/3f685a1a990f7b6e5149c80b52

> 436fb4207e49f5/doc/neps/missing-data.rst

> >

> > I mean, although it is useful, it is not in the form of a PEP, as

> > Fernando has described it.

> >

> > Would you accept extending the guidelines to the NEP format?

> >

> > Best,

> >

> > Matthew

> > _______________________________________________

> > NumPy-Discussion mailing list

> > NumPy-Discussion at scipy.org

> > http://mail.scipy.org/mailman/listinfo/numpy-discussion

> _______________________________________________

> NumPy-Discussion mailing list

> NumPy-Discussion at scipy.org

> http://mail.scipy.org/mailman/listinfo/numpy-discussion

>

>

>


From barthpi at gmail.com  Thu Mar  1 06:35:43 2012
From: barthpi at gmail.com (Pierre Barthelemy)
Date: Thu, 1 Mar 2012 12:35:43 +0100
Subject: [Numpy-discussion] Numpy interpolate: cut through 2D matrix
Message-ID: <CAB2iWq3=WUSjf0cVt+u8HjW-5wMQHZDgidbwonGhAxKcs8Tvvg@mail.gmail.com>

Hello,

for a data analysis tool i am programming, i need to plot a cut through a
2D graph. I then have a 2D array, and the indices start=(start_x,start_y)
and stop=(stop_x,stop_y) that are the position of the starting point and
stop point of the cut. The code i programmed is placed on the bottom.

This code returns only value existing in the original array: if the cut
should pass between the column index i and column index i+1, it returns
anyway the value at column index i.
Is it possible to use the numpy.interpolate library to make such that when
passing between i and i+1, the function returns an interpolation of the
graph between the points [row,column] and [row,column+1] ?


def cut_matrix(array,start,stop,shift=0):
    '''
    Draws a cut through a 2D array, between the positions
start=(row,column) and stop=(row,column)
    '''
    n_row=array.shape[0]
    n_col=array.shape[1]

    if abs(start[1]-stop[1])>abs(start[0]-stop[0]):
        if stop[1]<start[1]:
            start,stop= stop,start

        col_index=arange(start[1],stop[1]+1).astype(int)

row_index=round_(linspace(start[0],stop[0],len(col_index))).astype(int)+int(shift)
        row=(linspace(start[0],stop[0],len(col_index)))
    else:
        if stop[0]<start[0]:
            start,stop= stop,start
        row_index=arange(start[0],stop[0]+1).astype(int)

col_index=round_(linspace(start[1],stop[1],len(row_index))).astype(int)+int(shift)


    if max(col_index)>n_col or min(col_index)<0:
        print 'Error: column index not in range'
        raise IndexError
        return
    if max(row_index)>n_row or min(row_index)<0:
        print 'Error: row index not in range'
        raise IndexError
        return
    return array[row_index,col_index]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/0318a555/attachment.html>

From schut at sarvision.nl  Thu Mar  1 06:43:05 2012
From: schut at sarvision.nl (Vincent Schut)
Date: Thu, 01 Mar 2012 12:43:05 +0100
Subject: [Numpy-discussion] Numpy interpolate: cut through 2D matrix
In-Reply-To: <CAB2iWq3=WUSjf0cVt+u8HjW-5wMQHZDgidbwonGhAxKcs8Tvvg@mail.gmail.com>
References: <CAB2iWq3=WUSjf0cVt+u8HjW-5wMQHZDgidbwonGhAxKcs8Tvvg@mail.gmail.com>
Message-ID: <jinnc9$2b0$1@dough.gmane.org>

On 03/01/2012 12:35 PM, Pierre Barthelemy wrote:
> Hello,
>
> for a data analysis tool i am programming, i need to plot a cut through
> a 2D graph. I then have a 2D array, and the indices
> start=(start_x,start_y) and stop=(stop_x,stop_y) that are the position
> of the starting point and stop point of the cut. The code i programmed
> is placed on the bottom.
>
> This code returns only value existing in the original array: if the cut
> should pass between the column index i and column index i+1, it returns
> anyway the value at column index i.
> Is it possible to use the numpy.interpolate library to make such that
> when passing between i and i+1, the function returns an interpolation of
> the graph between the points [row,column] and [row,column+1] ?

Pierre,

if you have scipy next to numpy, have a look at 
scipy.ndimage.map_coordinates. In short, you define the coordinates you 
want your array to be interpolated at, and it will give you the 
interpolated values, using spline interpolation.

Best,
Vincent.


From zayd.yakoubi at gmail.com  Thu Mar  1 09:43:57 2012
From: zayd.yakoubi at gmail.com (Zayd YAKOUBI)
Date: Thu, 1 Mar 2012 15:43:57 +0100
Subject: [Numpy-discussion] Jaccard & Hamming Problem
Message-ID: <CAAbYDpntwERkasR_UhKP76Pk3ayp7xqHvUQCMhyv5PsvTAMvuw@mail.gmail.com>

Hello,

I use the similarity measure "Jaccard" and "Hamming" of pckage
Scipy.spacial.cdist (Python) in a clustering context, I applied to given
typs of real and integer (0.6 0.2 1.7 May 8 ). They gave good results. But I
just know that they normally only applies to binary data. The function of
these two similarity measures are not specified in the documentation:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
. Does anyone of you can help me find these functions?
Thank you in advance

Regards,
Zayd
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/3e9f7da3/attachment.html>

From warren.weckesser at enthought.com  Thu Mar  1 09:59:00 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 1 Mar 2012 08:59:00 -0600
Subject: [Numpy-discussion] Jaccard & Hamming Problem
In-Reply-To: <CAAbYDpntwERkasR_UhKP76Pk3ayp7xqHvUQCMhyv5PsvTAMvuw@mail.gmail.com>
References: <CAAbYDpntwERkasR_UhKP76Pk3ayp7xqHvUQCMhyv5PsvTAMvuw@mail.gmail.com>
Message-ID: <CAM-+wY9gK-U36f8weGfBQRzN2aab4J_T1xhf=gGZ=5FE4r_RAg@mail.gmail.com>

On Thu, Mar 1, 2012 at 8:43 AM, Zayd YAKOUBI <zayd.yakoubi at gmail.com> wrote:

> Hello,
>
> I use the similarity measure "Jaccard" and "Hamming" of pckage
> Scipy.spacial.cdist (Python) in a clustering context, I applied to given
> typs of real and integer (0.6 0.2 1.7 May 8 ). They gave good results. But I
> just know that they normally only applies to binary data. The function of
> these two similarity measures are not specified in the documentation:
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
> . Does anyone of you can help me find these functions?
> Thank you in advance
>
>

http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.hamming.html#scipy.spatial.distance.hamming

http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.jaccard.html#scipy.spatial.distance.jaccard


Those are the nicely formatted versions of the docstrings of functions.
You can also access these in an interactive shell, e.g.

>>> from scipy.spatial.distance import hamming
>>> help(hamming)

or in ipython

In [1] from scipy.spatial.distance import hamming
In [2] hamming?


Warren
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/b6e01da7/attachment.html>

From zayd.yakoubi at gmail.com  Thu Mar  1 10:10:15 2012
From: zayd.yakoubi at gmail.com (Zayd YAKOUBI)
Date: Thu, 1 Mar 2012 16:10:15 +0100
Subject: [Numpy-discussion] Jaccard & Hamming Problem
In-Reply-To: <CAM-+wY9gK-U36f8weGfBQRzN2aab4J_T1xhf=gGZ=5FE4r_RAg@mail.gmail.com>
References: <CAAbYDpntwERkasR_UhKP76Pk3ayp7xqHvUQCMhyv5PsvTAMvuw@mail.gmail.com>
	<CAM-+wY9gK-U36f8weGfBQRzN2aab4J_T1xhf=gGZ=5FE4r_RAg@mail.gmail.com>
Message-ID: <CAAbYDp=wfVN6O-hCFkn3bKSbYzTB4G-bqBxXZBsqLMrwPORXUw@mail.gmail.com>

thank you very much,
In fact, the functions of these two measures are  for binary vectors,
and I have
not found their extension to real data such as: 0.7, 0.9, 1.7 .... Knowing
that I applied to this data and it worked well..

Have an idea about the version of these functions for this type of data ?

thank you for your help
 Saisissez du texte, l'adresse d'un site Web ou importez un document ?
traduire. <http://translate.google.fr/?tr=f&hl=fr>
Annuler <http://translate.google.fr/?tr=t&hl=fr>
 Alpha

Regards,
Zayd
 <https://sites.google.com/site/yakoubizayd/>


2012/3/1 Warren Weckesser <warren.weckesser at enthought.com>

>
>
> On Thu, Mar 1, 2012 at 8:43 AM, Zayd YAKOUBI <zayd.yakoubi at gmail.com>wrote:
>
>> Hello,
>>
>> I use the similarity measure "Jaccard" and "Hamming" of pckage
>> Scipy.spacial.cdist (Python) in a clustering context, I applied to given
>> typs of real and integer (0.6 0.2 1.7 May 8 ). They gave good results.
>> But I just know that they normally only applies to binary data. The
>> function of these two similarity measures are not specified in the
>> documentation:
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
>> . Does anyone of you can help me find these functions?
>> Thank you in advance
>>
>>
>
>
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.hamming.html#scipy.spatial.distance.hamming
>
>
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.jaccard.html#scipy.spatial.distance.jaccard
>
>
> Those are the nicely formatted versions of the docstrings of functions.
> You can also access these in an interactive shell, e.g.
>
> >>> from scipy.spatial.distance import hamming
> >>> help(hamming)
>
> or in ipython
>
> In [1] from scipy.spatial.distance import hamming
> In [2] hamming?
>
>
> Warren
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/3147b08c/attachment.html>

From barthpi at gmail.com  Thu Mar  1 10:20:14 2012
From: barthpi at gmail.com (Pierre Barthelemy)
Date: Thu, 1 Mar 2012 16:20:14 +0100
Subject: [Numpy-discussion] Numpy fitting
Message-ID: <CAB2iWq0jktGYLGvyswg61fETqLndH3QYyXp0wfpnePOeo72+5Q@mail.gmail.com>

Dear all,

i am writing a program for data analysis. One of the functions of this
program gives the possibility to fit the functions. I therefore use the
recipe described in :
http://www.scipy.org/Cookbook/FittingData<http://www.scipy.org/Cookbook/FittingData>
under
the section "Simplifying the syntax". The code looks like this:

class Parameter:
    def __init__(self, value):
            self.value = value
            self.fixed=False
    def set(self, value):
            if not self.fixed:
                self.value = value
    def __call__(self):
            return self.value

def fit(function, parameters, y, x = None):
    def f(params):
        i = 0
        for p in parameters:
            p.set(params[i])
            i += 1
        return y - function(x)

    if x is None: x = arange(y.shape[0])
    p = [param() for param in parameters]
    out=optimize.leastsq(f, p, full_output=1)

One thing that i would like to know is how can i get the error on the
parameters ? From what i understood from the "Cookbook" page, and from the
scipy manual (
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq),
the second argument returned by the leastsq function gives access to these
errors.
std_error=std(y-function(x))
param_error=sqrt(diagonal(out[1])*std_error)

The param_errors that i get in this case are extremely small. Much smaller
than what i expected, and much smaller than what i can get fitting the
function with matlab. So i guess i made an error here.

Can someone tell me how i should do to retrieve the parameter errors ?

Bests,

Pierre
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/56fd68e0/attachment.html>

From barthpi at gmail.com  Thu Mar  1 10:24:47 2012
From: barthpi at gmail.com (Pierre Barthelemy)
Date: Thu, 1 Mar 2012 16:24:47 +0100
Subject: [Numpy-discussion] Numpy fitting
In-Reply-To: <CAB2iWq0jktGYLGvyswg61fETqLndH3QYyXp0wfpnePOeo72+5Q@mail.gmail.com>
References: <CAB2iWq0jktGYLGvyswg61fETqLndH3QYyXp0wfpnePOeo72+5Q@mail.gmail.com>
Message-ID: <CAB2iWq0ZwLipQrV0cg4MndmJ0ES0TdpQMDw1fEcgkXSPuB-kcQ@mail.gmail.com>

Dear all,

i am writing a program for data analysis. One of the functions of this
program gives the possibility to fit the functions. I followed the recipe
described in : http://www.scipy.org/Cookbook/FittingData<http://www.scipy.org/Cookbook/FittingData>
under
the section "Simplifying the syntax".

To fit, i use the function:

out=optimize.leastsq(f, p, full_output=1)

where f is my function and p a list of parameters.

One thing that i would like to know is how can i get the error on the
parameters ? From what i understood from the "Cookbook" page, and from the
scipy manual (
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq),
the second argument returned by the leastsq function gives access to these
errors.
std_error=std(y-function(x))
param_error=sqrt(diagonal(out[1])*std_error)

The param_errors that i get in this case are extremely small. Much smaller
than what i expected, and much smaller than what i can get fitting the
function with matlab. So i guess i made an error here.

Can someone tell me how i should do to retrieve the parameter errors ?

Bests,

Pierre

PS: i got the impression something went wrong with my previous message,
sorry for that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/9a7dcfa5/attachment.html>

From shish at keba.be  Thu Mar  1 10:39:13 2012
From: shish at keba.be (Olivier Delalleau)
Date: Thu, 1 Mar 2012 10:39:13 -0500
Subject: [Numpy-discussion] Numpy fitting
In-Reply-To: <CAB2iWq0ZwLipQrV0cg4MndmJ0ES0TdpQMDw1fEcgkXSPuB-kcQ@mail.gmail.com>
References: <CAB2iWq0jktGYLGvyswg61fETqLndH3QYyXp0wfpnePOeo72+5Q@mail.gmail.com>
	<CAB2iWq0ZwLipQrV0cg4MndmJ0ES0TdpQMDw1fEcgkXSPuB-kcQ@mail.gmail.com>
Message-ID: <CAFXk4bq-ug2tE=d-ncqkiBGdXzdk4-H75iLJP3VwKjYRFK+LrQ@mail.gmail.com>

Sorry I can't help, but I'd just suggest to post this on the scipy mailing
list as you may get more replies there.

-=- Olivier

Le 1 mars 2012 10:24, Pierre Barthelemy <barthpi at gmail.com> a ?crit :

> Dear all,
>
> i am writing a program for data analysis. One of the functions of this
> program gives the possibility to fit the functions. I followed the recipe
> described in : h*MailScanner soup?onne le lien suivant d'?tre une
> tentative de fraude de la part de "www.scipy.org" *ttp://www.scipy.org/Cookbook/FittingData<http://www.scipy.org/Cookbook/FittingData> under
> the section "Simplifying the syntax".
>
> To fit, i use the function:
>
> out=optimize.leastsq(f, p, full_output=1)
>
> where f is my function and p a list of parameters.
>
> One thing that i would like to know is how can i get the error on the
> parameters ? From what i understood from the "Cookbook" page, and from the
> scipy manual (
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq),
> the second argument returned by the leastsq function gives access to these
> errors.
> std_error=std(y-function(x))
> param_error=sqrt(diagonal(out[1])*std_error)
>
> The param_errors that i get in this case are extremely small. Much smaller
> than what i expected, and much smaller than what i can get fitting the
> function with matlab. So i guess i made an error here.
>
> Can someone tell me how i should do to retrieve the parameter errors ?
>
> Bests,
>
> Pierre
>
> PS: i got the impression something went wrong with my previous message,
> sorry for that.
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/a5d5c98a/attachment.html>

From josef.pktd at gmail.com  Thu Mar  1 12:30:18 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 1 Mar 2012 12:30:18 -0500
Subject: [Numpy-discussion] Numpy fitting
In-Reply-To: <CAFXk4bq-ug2tE=d-ncqkiBGdXzdk4-H75iLJP3VwKjYRFK+LrQ@mail.gmail.com>
References: <CAB2iWq0jktGYLGvyswg61fETqLndH3QYyXp0wfpnePOeo72+5Q@mail.gmail.com>
	<CAB2iWq0ZwLipQrV0cg4MndmJ0ES0TdpQMDw1fEcgkXSPuB-kcQ@mail.gmail.com>
	<CAFXk4bq-ug2tE=d-ncqkiBGdXzdk4-H75iLJP3VwKjYRFK+LrQ@mail.gmail.com>
Message-ID: <CAMMTP+ANuobZ3j8y8T0fAE4PD3RrN_8fWQJ2G9zejS3Cf4q4tQ@mail.gmail.com>

On Thu, Mar 1, 2012 at 10:39 AM, Olivier Delalleau <shish at keba.be> wrote:
> Sorry I can't help, but I'd just suggest to post this on the scipy mailing
> list as you may get more replies there.
>
> -=- Olivier
>
> Le 1 mars 2012 10:24, Pierre Barthelemy <barthpi at gmail.com> a ?crit :
>>
>> Dear all,
>>
>> i am writing a program for data analysis. One of the functions of this
>> program gives the possibility to fit the functions. I followed the recipe
>> described in : hMailScanner soup?onne le lien suivant d'?tre une tentative
>> de fraude de la part de "www.scipy.org"
>> ttp://www.scipy.org/Cookbook/FittingData?under the section "Simplifying the
>> syntax".

http://translate.google.com/#auto|en|MailScanner%20soup%C3%A7onne%20le%20lien%20suivant%20d'%C3%AAtre%20une%20tentative%20de%20fraude%20de%20la%20part%20de%20%22www.scipy.org%22

:)
Josef

>>
>> To fit, i use the function:
>>
>> out=optimize.leastsq(f, p, full_output=1)
>>
>> where f is my function and p a list of parameters.
>>
>> One thing that i would like to know is how can i get the error on the
>> parameters ? From what i understood from the "Cookbook" page, and from the
>> scipy manual
>> (http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html#scipy.optimize.leastsq),
>> the second argument returned by the leastsq function gives access to these
>> errors.
>> std_error=std(y-function(x))
>> param_error=sqrt(diagonal(out[1])*std_error)
>>
>> The param_errors that i get in this case are extremely small. Much smaller
>> than what i expected, and much smaller than what i can get fitting the
>> function with matlab. So i guess i made an error here.
>>
>> Can someone tell me how i should do to retrieve the parameter errors ?
>>
>> Bests,
>>
>> Pierre
>>
>> PS: i got the impression something went wrong with my previous message,
>> sorry for that.
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From jkington at wisc.edu  Thu Mar  1 17:44:23 2012
From: jkington at wisc.edu (Joe Kington)
Date: Thu, 01 Mar 2012 16:44:23 -0600
Subject: [Numpy-discussion] Floating point "close" function?
Message-ID: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>

Is there a numpy function for testing floating point equality that returns
a boolean array?

I'm aware of np.allclose, but I need a boolean array.  Properly handling
NaN's and Inf's (as allclose does) would be a nice bonus.

I wrote the function below to do this, but I suspect there's a method in
numpy that I missed.

import numpy as np

def close(a, b, rtol=1.e-5, atol=1.e-8, check_invalid=True):
    """Similar to numpy.allclose, but returns a boolean array.
    See numpy.allclose for an explanation of *rtol* and *atol*."""
    def within_tol(x, y, atol, rtol):
        return np.less_equal(np.abs(x-y), atol + rtol * np.abs(y))
    x = np.array(a, copy=False)
    y = np.array(b, copy=False)
    if not check_invalid:
        return within_tol(x, y, atol, rtol)
    xfin = np.isfinite(x)
    yfin = np.isfinite(y)
    if np.all(xfin) and np.all(yfin):
        return within_tol(x, y, atol, rtol)
    else:
        # Avoid subtraction with infinite/nan values...
        cond = np.zeros(np.broadcast(x, y).shape, dtype=np.bool)
        mask = xfin & yfin
        cond[mask] = within_tol(x[mask], y[mask], atol, rtol)
        # Inf and -Inf equality...
        cond[~mask] = (x[~mask] == y[~mask])
        # NaN equality...
        cond[np.isnan(x) & np.isnan(y)] = True
        return cond

# A few quick tests...
assert np.any(close(0.300001, np.array([0.1, 0.2, 0.3, 0.4])))

x = np.array([0.1, np.nan, np.inf, -np.inf])
y = np.array([0.1000001, np.nan, np.inf, -np.inf])
assert np.all(close(x, y))

x = np.array([0.1, 0.2, np.inf])
y = np.array([0.101, np.nan, 0.2])
assert not np.all(close(x, y))


Thanks,
-Joe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120301/5a2e2561/attachment.html>

From josef.pktd at gmail.com  Thu Mar  1 20:13:03 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 1 Mar 2012 20:13:03 -0500
Subject: [Numpy-discussion] Jaccard & Hamming Problem
In-Reply-To: <CAAbYDp=wfVN6O-hCFkn3bKSbYzTB4G-bqBxXZBsqLMrwPORXUw@mail.gmail.com>
References: <CAAbYDpntwERkasR_UhKP76Pk3ayp7xqHvUQCMhyv5PsvTAMvuw@mail.gmail.com>
	<CAM-+wY9gK-U36f8weGfBQRzN2aab4J_T1xhf=gGZ=5FE4r_RAg@mail.gmail.com>
	<CAAbYDp=wfVN6O-hCFkn3bKSbYzTB4G-bqBxXZBsqLMrwPORXUw@mail.gmail.com>
Message-ID: <CAMMTP+D8wCn-5paJ5h-H7H_JSpui0Kvw6qYf8SXRx7G88zVdnA@mail.gmail.com>

On Thu, Mar 1, 2012 at 10:10 AM, Zayd YAKOUBI <zayd.yakoubi at gmail.com> wrote:
> thank you very much,
> In fact, the functions of these two measures are? for binary vectors, and I
> have not found their extension to real data such as: 0.7, 0.9, 1.7 ....
> Knowing that I applied to this data and it worked well..
>
> Have an idea about the version of these functions for this type of data ?

for hamming

just guessing : 1 - np.mean(x==y)  which might depend on the implementation

>>> spatial.distance.hamming([0,0.5,1,2], np.ones(4))
0.75
>>> 1 - np.mean([0,0.5,1,2] == np.ones(4))
0.75
>>> spatial.distance.hamming([0,0.5,1,1], np.ones(4))
0.5
>>> 1 - np.mean([0,0.5,1,1] == np.ones(4))
0.5

However I wouldn't trust it for floating point numbers, unless you are
sure about the floating point representation

>>> [0,0.5,3,2], [0,0.5,np.sqrt(3)**2,2]
([0, 0.5, 3, 2], [0, 0.5, 2.9999999999999996, 2])
>>> spatial.distance.hamming([0,0.5,3,2], [0,0.5,np.sqrt(3)**2,2])
0.25
>>> spatial.distance.hamming([0,0.5,3,2], [0,0.5,3,2])
0.0

Josef

>
> thank you for your help
> Saisissez du texte, l'adresse d'un site Web ou importez un document ?
> traduire.
> Annuler
> Alpha
>
> Regards,
> Zayd
>
>
>
>
> 2012/3/1 Warren Weckesser <warren.weckesser at enthought.com>
>>
>>
>>
>> On Thu, Mar 1, 2012 at 8:43 AM, Zayd YAKOUBI <zayd.yakoubi at gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I use the similarity measure "Jaccard" and "Hamming" of pckage
>>> Scipy.spacial.cdist (Python) in a clustering context, I applied to given
>>> typs of real and integer (0.6 0.2 1.7 May 8 ). They gave good results. But I
>>> just know that they normally only applies to binary data. The function of
>>> these two similarity measures are not specified in the documentation:
>>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html.
>>> Does anyone of you can help me find these functions?
>>> Thank you in advance
>>>
>>
>>
>>
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.hamming.html#scipy.spatial.distance.hamming
>>
>>
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.jaccard.html#scipy.spatial.distance.jaccard
>>
>>
>> Those are the nicely formatted versions of the docstrings of functions.
>> You can also access these in an interactive shell, e.g.
>>
>> >>> from scipy.spatial.distance import hamming
>> >>> help(hamming)
>>
>> or in ipython
>>
>> In [1] from scipy.spatial.distance import hamming
>> In [2] hamming?
>>
>>
>> Warren
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From jayvius at gmail.com  Fri Mar  2 01:58:26 2012
From: jayvius at gmail.com (Jay Bourque)
Date: Fri, 2 Mar 2012 00:58:26 -0600
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<1330092347-sup-3918@rohan>
	<CAJPUwMCTCy29N2iv2HGy4r1uPcCz6d0_mFi9hh7JkgwdbEkVFQ@mail.gmail.com>
	<1330207186-sup-1957@rohan>
	<loom.20120227T062258-15@post.gmane.org>
	<1330351883-sup-9943@rohan>
	<CAPJVwB=H2P68CU_oKAf7n_fvj2q1Ne8TSw153_twAY0aq69H6A@mail.gmail.com>
	<1330365437-sup-7898@rohan>
	<CAPJVwBnaEd=EF9xD_-=VJeCMTYpb=7gOwxRYoQ9xce7xqiwWEg@mail.gmail.com>
	<1330387831-sup-839@rohan>
	<CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
Message-ID: <CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>

*In an effort to build a consensus of what numpy's New and Improved text
file readers should look like, I've put together a short list of the main
points discussed in this thread so far:*
*
*
1. Loading text files using loadtxt/genfromtxt need a significant
performance boost (I think at least an order of magnitude increase in
performance is very doable based on what I've seen with Erin's recfile code)
2. Improved memory usage. Memory used for reading in a text file shouldn?t
be more than the file itself, and less if only reading a subset of file.
3. Keep existing interfaces for reading text files (loadtxt, genfromtxt,
etc). No new ones.
4. Underlying code should keep IO iteration and transformation of data
separate (awaiting more thoughts from Travis on this).
5. Be able to plug in different transformations of data at low level (also
awaiting more thoughts from Travis).
6. memory mapping of text files?
7. Eventually reduce memory usage even more by using same object for
duplicate values in array (depends on implementing enum dtype?)

Anything else?

-Jay Bourque
continuum.io
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120302/23a2d0cf/attachment.html>

From pwl_b at wp.pl  Fri Mar  2 05:47:22 2012
From: pwl_b at wp.pl (=?utf-8?b?UGF3ZcWC?= Biernat)
Date: Fri, 2 Mar 2012 10:47:22 +0000 (UTC)
Subject: [Numpy-discussion] [Numpy] quadruple precision
References: <loom.20120229T151516-646@post.gmane.org>
	<4F4E81E7.3040107@crans.org>
	<2078A981-04C4-4A4A-9BCA-B363AEC1D23F@continuum.io>
	<CAB6mnxKQzmt4mUZWYPVUYOZXUKe8AaXuLaKQ_=XsnCC+HJSr6g@mail.gmail.com>
Message-ID: <loom.20120302T100118-132@post.gmane.org>

Charles R Harris <charlesr.harris <at> gmail.com> writes:

>
>
> The quad precision library has been there for a while, and quad
  precision is also supported by the Intel compiler. I don't know
  about MSVC. Intel has been working on adding quad precision to their
  hardware for several years and there is an IEEE spec for it, so some
  day it will be here, but it isn't here yet. It's a bit sad, I could
  use quad precision in FORTRAN on a VAX 25 years ago. Mind, I only
  needed it once ;) I suppose lack of pressing need accounts for the
  delay.Chuck
>
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion <at> scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

Waiting for hardware support can last forever, and __float128 is
already here. Despite being software supported, it is still reasonably
fast for people who need it. The slow-down depends on a case and
optimization and can be roughly from x2 (using sse) to x10 (without
optimization), but you gain x2 significant digits when compared to
double, see for example
http://locklessinc.com/articles/classifying_floats/. This is still
faster than mpfr for example. And gcc-4.6 already supports __float128
on a number of machines: i386, x86_64, ia64 and HP-UX. Also fftw now
supports binary128: http://www.fftw.org/release-notes.html (although
this might not be the most representative numerical software, it
confirms that it is unlikely that __float128 will be ignored by the
others unless hardware supported).

The portability is broken for numpy.float128 anyway (as I understand,
it behaves in different ways on different architectures), so adding a
new type (call it, say, quad128) that properly supports binary128
shouldn't be a drawback. Later on, when the hardware support for
binary128 shows up, the quad128 will be already there.

Pawe?.


From njs at pobox.com  Fri Mar  2 08:39:47 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 2 Mar 2012 13:39:47 +0000
Subject: [Numpy-discussion] [Numpy] quadruple precision
In-Reply-To: <loom.20120302T100118-132@post.gmane.org>
References: <loom.20120229T151516-646@post.gmane.org>
	<4F4E81E7.3040107@crans.org>
	<2078A981-04C4-4A4A-9BCA-B363AEC1D23F@continuum.io>
	<CAB6mnxKQzmt4mUZWYPVUYOZXUKe8AaXuLaKQ_=XsnCC+HJSr6g@mail.gmail.com>
	<loom.20120302T100118-132@post.gmane.org>
Message-ID: <CAPJVwBnAP_Tz3y8ULB8V_jiWwurmCncrKbzor=e5BD4_CRF+yA@mail.gmail.com>

On Mar 2, 2012 10:48 AM, "Pawe? Biernat" <pwl_b at wp.pl> wrote:
> The portability is broken for numpy.float128 anyway (as I understand,
> it behaves in different ways on different architectures), so adding a
> new type (call it, say, quad128) that properly supports binary128
> shouldn't be a drawback. Later on, when the hardware support for
> binary128 shows up, the quad128 will be already there.

There's already been movement to deprecate using float128 as the name for
machine-specific long doubles. This just gives even more reason. If/when
someone adds __float128 support to numpy we should really just call it
float128, not quad128. (This would even be backwards compatible, since
float128 currently gives no guarantees on precision or representation.)

- n
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120302/efd57c99/attachment.html>

From pierre.haessig at crans.org  Fri Mar  2 09:39:31 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Fri, 02 Mar 2012 15:39:31 +0100
Subject: [Numpy-discussion] [Numpy] quadruple precision
In-Reply-To: <CAPJVwBnAP_Tz3y8ULB8V_jiWwurmCncrKbzor=e5BD4_CRF+yA@mail.gmail.com>
References: <loom.20120229T151516-646@post.gmane.org>	<4F4E81E7.3040107@crans.org>	<2078A981-04C4-4A4A-9BCA-B363AEC1D23F@continuum.io>	<CAB6mnxKQzmt4mUZWYPVUYOZXUKe8AaXuLaKQ_=XsnCC+HJSr6g@mail.gmail.com>	<loom.20120302T100118-132@post.gmane.org>
	<CAPJVwBnAP_Tz3y8ULB8V_jiWwurmCncrKbzor=e5BD4_CRF+yA@mail.gmail.com>
Message-ID: <4F50DBA3.6080201@crans.org>

Le 02/03/2012 14:39, Nathaniel Smith a ?crit :
> If/when someone adds __float128 support to numpy we should really just
> call it float128
I agree!

Other types could become "float80_128" and "float80_96", as mentioned
about a week ago by Matthew.

-- 
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120302/7f9d0eba/attachment.sig>

From nouiz at nouiz.org  Fri Mar  2 10:26:48 2012
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Fri, 2 Mar 2012 10:26:48 -0500
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CAM-+wY9Vv_fFd-7Aava9bWchiO1ibvP=nnaWYShQKJUE5tA6gQ@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<CAM-+wY9SYktVXo1TMW1gU1gH9RWdOnKaxOPC580412i4x3EC8A@mail.gmail.com>
	<CAM-+wY9uMra44wVrY_7ZbjVuhN6zRadf7VTdVSLTk05ZaxJwYQ@mail.gmail.com>
	<CAPJVwB=pyDdE0xoZrj_9N7Z-G+t+vZC04vSMwKL=U_nDir2aLQ@mail.gmail.com>
	<CAM-+wY8UwAv4BY2NwD0eTOyHpNM2c9jkEz-gRKCoqsf7T4tKZg@mail.gmail.com>
	<CAPJVwB=kMAXC9G24AR8vmdwcjQN+s1iu6xN0S3e2_bVMTQU9Tg@mail.gmail.com>
	<CAM-+wY-tdYEQHSA+frMqu1GK-ZAt+ObQMBkrL1zRcS+Hc8K=fQ@mail.gmail.com>
	<CAPJVwBmQF=z-7dPaQC3Ncr7Ww0H8F5qg7=FBOwBZy14TPw5o8A@mail.gmail.com>
	<CAM-+wY9Vv_fFd-7Aava9bWchiO1ibvP=nnaWYShQKJUE5tA6gQ@mail.gmail.com>
Message-ID: <CADKKbtgLqE4=TTthOLBN4aiKdLtSTJJXFF=GPByGyFCP=ZLF8Q@mail.gmail.com>

Hi,

mmap can give a speed up in some case, but slow down in other. So care
must be taken when using it. For example, the speed difference between
read and mmap are not the same when the file is local and when it is
on NFS. On NFS, you need to read bigger chunk to make it worthwhile.

Another example is on an SMP computer. If for example you have a 8
cores computer but have only enought ram for 1 or 2 copy of your
dataset, using mmap is a bad idea. If you read the file by chunk
normally the OS will keep the file in its cache in ram. So if you
launch 8 jobs, they will all use the system cache to shared the data.
If you use mmap, I think this bypass the OS cache. So you will always
read the file. On NFS with a cluster of computer, this can bring a
high load on the file server. So having a way to specify to use or not
to use mmap would be great as you can't always guess the right thing
to do. (Except if I'm wrong and this don't by pass the OS cache)

Anyway, it is great to see people work in this problem, this was just
a few comments I had in mind when I read this thread.

Fr?d?ric

On Sun, Feb 26, 2012 at 4:22 PM, Warren Weckesser
<warren.weckesser at enthought.com> wrote:
>
>
> On Sun, Feb 26, 2012 at 3:00 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Sun, Feb 26, 2012 at 7:58 PM, Warren Weckesser
>> <warren.weckesser at enthought.com> wrote:
>> > Right, I got that.? Sorry if the placement of the notes about how to
>> > clear
>> > the cache seemed to imply otherwise.
>>
>> OK, cool, np.
>>
>> >> Clearing the disk cache is very important for getting meaningful,
>> >> repeatable benchmarks in code where you know that the cache will
>> >> usually be cold and where hitting the disk will have unpredictable
>> >> effects (i.e., pretty much anything doing random access, like
>> >> databases, which have complicated locality patterns, you may or may
>> >> not trigger readahead, etc.). But here we're talking about pure
>> >> sequential reads, where the disk just goes however fast it goes, and
>> >> your code can either keep up or not.
>> >>
>> >> One minor point where the OS interface could matter: it's good to set
>> >> up your code so it can use mmap() instead of read(), since this can
>> >> reduce overhead. read() has to copy the data from the disk into OS
>> >> memory, and then from OS memory into your process's memory; mmap()
>> >> skips the second step.
>> >
>> > Thanks for the tip.? Do you happen to have any sample code that
>> > demonstrates
>> > this?? I'd like to explore this more.
>>
>> No, I've never actually run into a situation where I needed it myself,
>> but I learned the trick from Tridge so I tend to believe it :-).
>> mmap() is actually a pretty simple interface -- the only thing I'd
>> watch out for is that you want to mmap() the file in pieces (so as to
>> avoid VM exhaustion on 32-bit systems), but you want to use pretty big
>> pieces (because each call to mmap()/munmap() has overhead). So you
>> might want to use chunks in the 32-128 MiB range. Or since I guess
>> you're probably developing on a 64-bit system you can just be lazy and
>> mmap the whole file for initial testing. git uses mmap, but I'm not
>> sure it's very useful example code.
>>
>> Also it's not going to do magic. Your code has to be fairly quick
>> before avoiding a single memcpy() will be noticeable.
>>
>> HTH,
>
>
>
> Yes, thanks! ? I'm working on a mmap version now.? I'm very curious to see
> just how much of an improvement it can give.
>
> Warren
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From xscript at gmx.net  Fri Mar  2 11:37:11 2012
From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=)
Date: Fri, 02 Mar 2012 17:37:11 +0100
Subject: [Numpy-discussion] Possible roadmap addendum: building better
	text file readers
In-Reply-To: <CADKKbtgLqE4=TTthOLBN4aiKdLtSTJJXFF=GPByGyFCP=ZLF8Q@mail.gmail.com>
	(=?utf-8?B?IkZyw6lkw6lyaWM=?= Bastien"'s message of "Fri, 2 Mar 2012
	10:26:48 -0500")
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<CAM-+wY9SYktVXo1TMW1gU1gH9RWdOnKaxOPC580412i4x3EC8A@mail.gmail.com>
	<CAM-+wY9uMra44wVrY_7ZbjVuhN6zRadf7VTdVSLTk05ZaxJwYQ@mail.gmail.com>
	<CAPJVwB=pyDdE0xoZrj_9N7Z-G+t+vZC04vSMwKL=U_nDir2aLQ@mail.gmail.com>
	<CAM-+wY8UwAv4BY2NwD0eTOyHpNM2c9jkEz-gRKCoqsf7T4tKZg@mail.gmail.com>
	<CAPJVwB=kMAXC9G24AR8vmdwcjQN+s1iu6xN0S3e2_bVMTQU9Tg@mail.gmail.com>
	<CAM-+wY-tdYEQHSA+frMqu1GK-ZAt+ObQMBkrL1zRcS+Hc8K=fQ@mail.gmail.com>
	<CAPJVwBmQF=z-7dPaQC3Ncr7Ww0H8F5qg7=FBOwBZy14TPw5o8A@mail.gmail.com>
	<CAM-+wY9Vv_fFd-7Aava9bWchiO1ibvP=nnaWYShQKJUE5tA6gQ@mail.gmail.com>
	<CADKKbtgLqE4=TTthOLBN4aiKdLtSTJJXFF=GPByGyFCP=ZLF8Q@mail.gmail.com>
Message-ID: <87r4xbvu14.fsf@ginnungagap.bsc.es>

Fr?d?ric Bastien writes:

> Hi,
> mmap can give a speed up in some case, but slow down in other. So care
> must be taken when using it. For example, the speed difference between
> read and mmap are not the same when the file is local and when it is
> on NFS. On NFS, you need to read bigger chunk to make it worthwhile.

> Another example is on an SMP computer. If for example you have a 8
> cores computer but have only enought ram for 1 or 2 copy of your
> dataset, using mmap is a bad idea. If you read the file by chunk
> normally the OS will keep the file in its cache in ram. So if you
> launch 8 jobs, they will all use the system cache to shared the data.
> If you use mmap, I think this bypass the OS cache. So you will always
> read the file.

Not according to mmap(2):

       MAP_SHARED Share this mapping.  Updates to the mapping are visible to
                  other processes that map this file, and are carried through to
                  the underlying file.  The file may not actually be updated
                  until msync(2) or munmap() is called.

My understanding is that all processes will use exactly the same physical
memory, and swapping that memory will use the file itself.


> On NFS with a cluster of computer, this can bring a
> high load on the file server. So having a way to specify to use or not
> to use mmap would be great as you can't always guess the right thing
> to do. (Except if I'm wrong and this don't by pass the OS cache)

> Anyway, it is great to see people work in this problem, this was just
> a few comments I had in mind when I read this thread.


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


From matthew.brett at gmail.com  Fri Mar  2 18:36:26 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 2 Mar 2012 18:36:26 -0500
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Message-ID: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>

Hi,

Sorry that this report is not complete, I don't have full access to
this box but, on a Debian squeeze machine running linux
2.6.32-5-sparc64-smp:

nosetests ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values

test_user_missing_values (test_io.TestFromTxt) ... Bus error

This on current master : 1.7.0.dev-b9872b4

Cheers,

Matthew


From charlesr.harris at gmail.com  Fri Mar  2 20:59:24 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 2 Mar 2012 18:59:24 -0700
Subject: [Numpy-discussion] Where did Numpy-svn come from?
Message-ID: <CAB6mnx+E-o215v_NTyoAQV=kK2WRgsPOqg+aRJwN822tSrdzUg@mail.gmail.com>

Github commits now send out mail:

reply-toGitHub <noreply at github.com>
tonumpy-svn at scipy.org
This is a recent change, did we change something? I liked getting mail
directly from github better.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120302/9bd04e53/attachment.html>

From charlesr.harris at gmail.com  Fri Mar  2 21:05:57 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 2 Mar 2012 19:05:57 -0700
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
Message-ID: <CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>

On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> Sorry that this report is not complete, I don't have full access to
> this box but, on a Debian squeeze machine running linux
> 2.6.32-5-sparc64-smp:
>
> nosetests
> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>
> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>
> This on current master : 1.7.0.dev-b9872b4
>
>
Hmm, some tests might have been recently enabled. Any chance of doing a
bisection?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120302/c301f8e2/attachment.html>

From matthew.brett at gmail.com  Sat Mar  3 00:07:55 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 3 Mar 2012 00:07:55 -0500
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
Message-ID: <CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>

Hi,

On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> Sorry that this report is not complete, I don't have full access to
>> this box but, on a Debian squeeze machine running linux
>> 2.6.32-5-sparc64-smp:
>>
>> nosetests
>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>>
>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>>
>> This on current master : 1.7.0.dev-b9872b4
>>
>
> Hmm, some tests might have been recently enabled. Any chance of doing a
> bisection?

I'm on it - will get back to you tomorrow.

See you,

Matthew


From ralf.gommers at googlemail.com  Sat Mar  3 08:59:33 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 3 Mar 2012 14:59:33 +0100
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
Message-ID: <CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>

On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu> wrote:

> Is there a numpy function for testing floating point equality that returns
> a boolean array?
>
> I'm aware of np.allclose, but I need a boolean array.  Properly handling
> NaN's and Inf's (as allclose does) would be a nice bonus.
>
> I wrote the function below to do this, but I suspect there's a method in
> numpy that I missed.
>

I don't think such a function exists, would be nice to have. How about just
adding a keyword "return_array" to allclose to do so?

Ralf


> import numpy as np
>
> def close(a, b, rtol=1.e-5, atol=1.e-8, check_invalid=True):
>     """Similar to numpy.allclose, but returns a boolean array.
>     See numpy.allclose for an explanation of *rtol* and *atol*."""
>     def within_tol(x, y, atol, rtol):
>         return np.less_equal(np.abs(x-y), atol + rtol * np.abs(y))
>     x = np.array(a, copy=False)
>     y = np.array(b, copy=False)
>     if not check_invalid:
>         return within_tol(x, y, atol, rtol)
>     xfin = np.isfinite(x)
>     yfin = np.isfinite(y)
>     if np.all(xfin) and np.all(yfin):
>         return within_tol(x, y, atol, rtol)
>     else:
>         # Avoid subtraction with infinite/nan values...
>         cond = np.zeros(np.broadcast(x, y).shape, dtype=np.bool)
>         mask = xfin & yfin
>         cond[mask] = within_tol(x[mask], y[mask], atol, rtol)
>         # Inf and -Inf equality...
>         cond[~mask] = (x[~mask] == y[~mask])
>         # NaN equality...
>         cond[np.isnan(x) & np.isnan(y)] = True
>         return cond
>
> # A few quick tests...
> assert np.any(close(0.300001, np.array([0.1, 0.2, 0.3, 0.4])))
>
> x = np.array([0.1, np.nan, np.inf, -np.inf])
> y = np.array([0.1000001, np.nan, np.inf, -np.inf])
> assert np.all(close(x, y))
>
> x = np.array([0.1, 0.2, np.inf])
> y = np.array([0.101, np.nan, 0.2])
> assert not np.all(close(x, y))
>
>
> Thanks,
> -Joe
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/1fcd9fae/attachment.html>

From robert.kern at gmail.com  Sat Mar  3 09:05:30 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 14:05:30 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
Message-ID: <CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>

On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
>
>
> On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu> wrote:
>>
>> Is there a numpy function for testing floating point equality that returns
>> a boolean array?
>>
>> I'm aware of np.allclose, but I need a boolean array.? Properly handling
>> NaN's and Inf's (as allclose does) would be a nice bonus.
>>
>> I wrote the function below to do this, but I suspect there's a method in
>> numpy that I missed.
>
>
> I don't think such a function exists, would be nice to have. How about just
> adding a keyword "return_array" to allclose to do so?

As a general design principle, adding a boolean flag that changes the
return type is worse than making a new function.

-- 
Robert Kern


From ralf.gommers at googlemail.com  Sat Mar  3 09:31:01 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 3 Mar 2012 15:31:01 +0100
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
Message-ID: <CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>

On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <ralf.gommers at googlemail.com>
> wrote:
> >
> >
> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu> wrote:
> >>
> >> Is there a numpy function for testing floating point equality that
> returns
> >> a boolean array?
> >>
> >> I'm aware of np.allclose, but I need a boolean array.  Properly handling
> >> NaN's and Inf's (as allclose does) would be a nice bonus.
> >>
> >> I wrote the function below to do this, but I suspect there's a method in
> >> numpy that I missed.
> >
> >
> > I don't think such a function exists, would be nice to have. How about
> just
> > adding a keyword "return_array" to allclose to do so?
>
> As a general design principle, adding a boolean flag that changes the
> return type is worse than making a new function.
>

That's certainly true as a general principle. Do you have a concrete
suggestion in this case though? Because this is also bad:
>>> np.<TAB>
Display all 561 possibilities? (y or n)

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/1d3624d6/attachment.html>

From robert.kern at gmail.com  Sat Mar  3 09:34:31 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 14:34:31 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
Message-ID: <CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>

On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
>
>
> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com> wrote:
>>
>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <ralf.gommers at googlemail.com>
>> wrote:
>> >
>> >
>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu> wrote:
>> >>
>> >> Is there a numpy function for testing floating point equality that
>> >> returns
>> >> a boolean array?
>> >>
>> >> I'm aware of np.allclose, but I need a boolean array.? Properly
>> >> handling
>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
>> >>
>> >> I wrote the function below to do this, but I suspect there's a method
>> >> in
>> >> numpy that I missed.
>> >
>> >
>> > I don't think such a function exists, would be nice to have. How about
>> > just
>> > adding a keyword "return_array" to allclose to do so?
>>
>> As a general design principle, adding a boolean flag that changes the
>> return type is worse than making a new function.
>
>
> That's certainly true as a general principle. Do you have a concrete
> suggestion in this case though?

np.close()

> Because this is also bad:
>>>> np.<TAB>
> Display all 561 possibilities? (y or n)

Not as bad as overloading np.allclose(x,y,return_array=True). Or
deprecating np.allclose() in favor of np.close().all().

-- 
Robert Kern


From ben.root at ou.edu  Sat Mar  3 10:22:07 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Sat, 3 Mar 2012 09:22:07 -0600
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
Message-ID: <CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>

On Saturday, March 3, 2012, Robert Kern <robert.kern at gmail.com> wrote:
> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com>
wrote:
>>
>>
>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com>
wrote:
>>>
>>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <ralf.gommers at googlemail.com>
>>> wrote:
>>> >
>>> >
>>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu>
wrote:
>>> >>
>>> >> Is there a numpy function for testing floating point equality that
>>> >> returns
>>> >> a boolean array?
>>> >>
>>> >> I'm aware of np.allclose, but I need a boolean array.  Properly
>>> >> handling
>>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
>>> >>
>>> >> I wrote the function below to do this, but I suspect there's a method
>>> >> in
>>> >> numpy that I missed.
>>> >
>>> >
>>> > I don't think such a function exists, would be nice to have. How about
>>> > just
>>> > adding a keyword "return_array" to allclose to do so?
>>>
>>> As a general design principle, adding a boolean flag that changes the
>>> return type is worse than making a new function.
>>
>>
>> That's certainly true as a general principle. Do you have a concrete
>> suggestion in this case though?
>
> np.close()
>

When I read that, I mentally think of "close" as in closing a file.  I
think we need a synonym.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/3761e081/attachment.html>

From robert.kern at gmail.com  Sat Mar  3 10:26:27 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 15:26:27 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>
Message-ID: <CAF6FJitkQT5THf59R=A9pjqdk2OVOFE1d4_R7oPjZ6tc+AQg0g@mail.gmail.com>

On Sat, Mar 3, 2012 at 15:22, Benjamin Root <ben.root at ou.edu> wrote:
>
>
> On Saturday, March 3, 2012, Robert Kern <robert.kern at gmail.com> wrote:
>> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com>
>> wrote:
>>>
>>>
>>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com>
>>> wrote:
>>>>
>>>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <ralf.gommers at googlemail.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu>
>>>> > wrote:
>>>> >>
>>>> >> Is there a numpy function for testing floating point equality that
>>>> >> returns
>>>> >> a boolean array?
>>>> >>
>>>> >> I'm aware of np.allclose, but I need a boolean array.? Properly
>>>> >> handling
>>>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
>>>> >>
>>>> >> I wrote the function below to do this, but I suspect there's a method
>>>> >> in
>>>> >> numpy that I missed.
>>>> >
>>>> >
>>>> > I don't think such a function exists, would be nice to have. How about
>>>> > just
>>>> > adding a keyword "return_array" to allclose to do so?
>>>>
>>>> As a general design principle, adding a boolean flag that changes the
>>>> return type is worse than making a new function.
>>>
>>>
>>> That's certainly true as a general principle. Do you have a concrete
>>> suggestion in this case though?
>>
>> np.close()
>>
>
> When I read that, I mentally think of "close" as in closing a file. ?I think
> we need a synonym.

np.isclose()

-- 
Robert Kern


From robert.kern at gmail.com  Sat Mar  3 10:27:14 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 15:27:14 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
Message-ID: <CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>

On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com> wrote:
> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com> wrote:

>> Because this is also bad:
>>>>> np.<TAB>
>> Display all 561 possibilities? (y or n)
>
> Not as bad as overloading np.allclose(x,y,return_array=True). Or
> deprecating np.allclose() in favor of np.close().all().

I screwed up this paragraph. I meant that as "Another alternative
would be to deprecate ...".

-- 
Robert Kern


From shish at keba.be  Sat Mar  3 10:51:13 2012
From: shish at keba.be (Olivier Delalleau)
Date: Sat, 3 Mar 2012 10:51:13 -0500
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
Message-ID: <CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>

Le 3 mars 2012 10:27, Robert Kern <robert.kern at gmail.com> a ?crit :

> On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com> wrote:
> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com>
> wrote:
>
> >> Because this is also bad:
> >>>>> np.<TAB>
> >> Display all 561 possibilities? (y or n)
> >
> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
> > deprecating np.allclose() in favor of np.close().all().
>
> I screwed up this paragraph. I meant that as "Another alternative
> would be to deprecate ...".
>

np.close().all() would probably be a lot less efficient in terms of CPU /
memory though, wouldn't it?

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/978f25a2/attachment.html>

From robert.kern at gmail.com  Sat Mar  3 11:03:23 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 16:03:23 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
	<CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>
Message-ID: <CAF6FJivi-06QkHih+L8Qn2iGt7AO780vz3APJ0+kaUxYX-Q=Fw@mail.gmail.com>

On Sat, Mar 3, 2012 at 15:51, Olivier Delalleau <shish at keba.be> wrote:
> Le 3 mars 2012 10:27, Robert Kern <robert.kern at gmail.com> a ?crit :
>>
>> On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com> wrote:
>> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com>
>> > wrote:
>>
>> >> Because this is also bad:
>> >>>>> np.<TAB>
>> >> Display all 561 possibilities? (y or n)
>> >
>> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
>> > deprecating np.allclose() in favor of np.close().all().
>>
>> I screwed up this paragraph. I meant that as "Another alternative
>> would be to deprecate ...".
>
>
> np.close().all() would probably be a lot less efficient in terms of CPU /
> memory though, wouldn't it?

No. np.allclose() is essentially doing exactly this already.

-- 
Robert Kern


From shish at keba.be  Sat Mar  3 11:06:51 2012
From: shish at keba.be (Olivier Delalleau)
Date: Sat, 3 Mar 2012 11:06:51 -0500
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJivi-06QkHih+L8Qn2iGt7AO780vz3APJ0+kaUxYX-Q=Fw@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
	<CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>
	<CAF6FJivi-06QkHih+L8Qn2iGt7AO780vz3APJ0+kaUxYX-Q=Fw@mail.gmail.com>
Message-ID: <CAFXk4bos6Nyp6PxobPEY4SjcBvNo=aQhPhyJmC=jMxeFtU7TTg@mail.gmail.com>

Le 3 mars 2012 11:03, Robert Kern <robert.kern at gmail.com> a ?crit :

> On Sat, Mar 3, 2012 at 15:51, Olivier Delalleau <shish at keba.be> wrote:
> > Le 3 mars 2012 10:27, Robert Kern <robert.kern at gmail.com> a ?crit :
> >>
> >> On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com>
> wrote:
> >> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <
> ralf.gommers at googlemail.com>
> >> > wrote:
> >>
> >> >> Because this is also bad:
> >> >>>>> np.<TAB>
> >> >> Display all 561 possibilities? (y or n)
> >> >
> >> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
> >> > deprecating np.allclose() in favor of np.close().all().
> >>
> >> I screwed up this paragraph. I meant that as "Another alternative
> >> would be to deprecate ...".
> >
> >
> > np.close().all() would probably be a lot less efficient in terms of CPU /
> > memory though, wouldn't it?
>
> No. np.allclose() is essentially doing exactly this already.
>

Ok. What about then, np.allclose() could theoretically be a lot more
efficient in terms of CPU / memory? ;)

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/199e9ac6/attachment.html>

From lpc at cmu.edu  Sat Mar  3 11:07:16 2012
From: lpc at cmu.edu (Luis Pedro Coelho)
Date: Sat, 03 Mar 2012 16:07:16 +0000
Subject: [Numpy-discussion] C++ Example
Message-ID: <30965644.bxiDOaWOx7@rabbit>

Hi,

I sort of missed the big C++ discussion, but I'd like to give some examples of 
how writing code can become much simpler if you are based on C++. This is from 
my mahotas package, which has a thin C++ wrapper around numpy's C API

https://github.com/luispedro/mahotas/blob/master/mahotas/_morph.cpp

and it implements multi-type greyscale erosion.


// numpy::aligned_array wraps PyArrayObject*
template<typename T>
void erode(numpy::aligned_array<T> res, 
				numpy::aligned_array<T> array,
				numpy::aligned_array<T> Bc) {


	// Release the GIL using RAII
    gil_release nogil;
    const int N = res.size();
    typename numpy::aligned_array<T>::iterator iter = array.begin();
    // this is adapted from scipy.ndimage.
    // it implements the convolution-like filtering.
    filter_iterator<T> filter(res.raw_array(),
										Bc.raw_array(),
										EXTEND_NEAREST,
										is_bool(T()));
    const int N2 = filter.size();
    T* rpos = res.data();

    for (int i = 0; i != N; ++i, ++rpos, filter.iterate_both(iter)) {
        T value = std::numeric_limits<T>::max();
        for (int j = 0; j != N2; ++j) {
            T arr_val = T();
            filter.retrieve(iter, j, arr_val);
            value = std::min<T>(value, erode_sub(arr_val, filter[j]));
        }
        *rpos = value;
    }
}

If you compare this with the equivalent scipy.ndimage function, which is very 
good C code (but mostly write-only?in fact, ndimage has not been maintainable 
because it is so hard [at least for me, I've tried]):

int NI_BinaryErosion(PyArrayObject* input, PyArrayObject* strct,
                PyArrayObject* mask, PyArrayObject* output, int bdr_value,
                     npy_intp *origins, int invert, int center_is_true,
                     int* changed, NI_CoordinateList **coordinate_list)
{
    npy_intp struct_size = 0, *offsets = NULL, size, *oo, jj;
    npy_intp ssize, block_size = 0, *current = NULL, border_flag_value;
    int kk, true, false, msk_value;
    NI_Iterator ii, io, mi;
    NI_FilterIterator fi;
    Bool *ps, out = 0;
    char *pi, *po, *pm = NULL;
    NI_CoordinateBlock *block = NULL;

    ps = (Bool*)PyArray_DATA(strct);
    ssize = 1;
    for(kk = 0; kk < strct->nd; kk++)
        ssize *= strct->dimensions[kk];
    for(jj = 0; jj < ssize; jj++)
        if (ps[jj]) ++struct_size;
    if (mask) {
        if (!NI_InitPointIterator(mask, &mi))
            return 0;
        pm = (void *)PyArray_DATA(mask);
    }
    /* calculate the filter offsets: */
    if (!NI_InitFilterOffsets(input, ps, strct->dimensions, origins,
                                    NI_EXTEND_CONSTANT, &offsets, 
&border_flag_value, NULL))
        goto exit;
    /* initialize input element iterator: */
    if (!NI_InitPointIterator(input, &ii))
        goto exit;
    /* initialize output element iterator: */
    if (!NI_InitPointIterator(output, &io))
        goto exit;
    /* initialize filter iterator: */
    if (!NI_InitFilterIterator(input->nd, strct->dimensions, struct_size,
                                                         input->dimensions, 
origins, &fi))
        goto exit;

    /* get data pointers an size: */
    pi = (void *)PyArray_DATA(input);
    po = (void *)PyArray_DATA(output);
    size = 1;
    for(kk = 0; kk < input->nd; kk++)
        size *= input->dimensions[kk];
    if (invert) {
        bdr_value = bdr_value ? 0 : 1;
        true = 0;
        false = 1;
    } else {
        bdr_value = bdr_value ? 1 : 0;
        true = 1;
        false = 0;
    }
    if (coordinate_list) {
        block_size = LIST_SIZE / input->nd / sizeof(int);
        if (block_size < 1)
            block_size = 1;
        if (block_size > size)
            block_size = size;
        *coordinate_list = NI_InitCoordinateList(block_size, input->nd);
        if (!*coordinate_list)
            goto exit;
    }
    /* iterator over the elements: */
    oo = offsets;
    *changed = 0;
    msk_value = 1;
    for(jj = 0; jj < size; jj++) {
        int pchange = 0;
        if (mask) {
            switch(mask->descr->type_num) {
            CASE_GET_MASK(msk_value, pm, Bool);
            CASE_GET_MASK(msk_value, pm, UInt8);
            CASE_GET_MASK(msk_value, pm, UInt16);
            CASE_GET_MASK(msk_value, pm, UInt32);
#if HAS_UINT64
            CASE_GET_MASK(msk_value, pm, UInt64);
#endif
            CASE_GET_MASK(msk_value, pm, Int8);
            CASE_GET_MASK(msk_value, pm, Int16);
            CASE_GET_MASK(msk_value, pm, Int32);
            CASE_GET_MASK(msk_value, pm, Int64);
            CASE_GET_MASK(msk_value, pm, Float32);
            CASE_GET_MASK(msk_value, pm, Float64);
            default:
                PyErr_SetString(PyExc_RuntimeError, "data type not 
supported");
                return 0;
            }
        }
        switch (input->descr->type_num) {
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Bool, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, UInt8, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, UInt16, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, UInt32, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
#if HAS_UINT64
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, UInt64, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
#endif
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Int8, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Int16, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Int32, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Int64, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Float32, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        CASE_NI_ERODE_POINT(pi, out, oo, struct_size, Float64, msk_value,
                                                bdr_value, border_flag_value, 
center_is_true,
                                                true, false, pchange);
        default:
            PyErr_SetString(PyExc_RuntimeError, "data type not supported");
            goto exit;
        }
        switch (output->descr->type_num) {
        CASE_OUTPUT(po, out, Bool);
        CASE_OUTPUT(po, out, UInt8);
        CASE_OUTPUT(po, out, UInt16);
        CASE_OUTPUT(po, out, UInt32);
#if HAS_UINT64
        CASE_OUTPUT(po, out, UInt64);
#endif
        CASE_OUTPUT(po, out, Int8);
        CASE_OUTPUT(po, out, Int16);
        CASE_OUTPUT(po, out, Int32);
        CASE_OUTPUT(po, out, Int64);
        CASE_OUTPUT(po, out, Float32);
        CASE_OUTPUT(po, out, Float64);
        default:
            PyErr_SetString(PyExc_RuntimeError, "data type not supported");
            goto exit;
        }
        if (pchange) {
            *changed = 1;
            if (coordinate_list) {
                if (block == NULL ||  block->size == block_size) {
                    block = NI_CoordinateListAddBlock(*coordinate_list);
                    current = block->coordinates;
                }
                for(kk = 0; kk < input->nd; kk++)
                    *current++ = ii.coordinates[kk];
                block->size++;
            }
        }
        if (mask) {
            NI_FILTER_NEXT3(fi, ii, io, mi, oo, pi, po, pm);
        } else {
            NI_FILTER_NEXT2(fi, ii, io, oo, pi, po);
        }
    }

 exit:
    if (offsets)
        free(offsets);
    if (PyErr_Occurred()) {
        if (coordinate_list) {
            NI_FreeCoordinateList(*coordinate_list);
            *coordinate_list = NULL;
        }
        return 0;
    } else {
        return 1;
    }
    return PyErr_Occurred() ? 0 : 1;
}


HTH
-- 
Luis Pedro Coelho | Institute for Molecular Medicine | http://luispedro.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/6aa473e5/attachment.sig>

From robert.kern at gmail.com  Sat Mar  3 11:10:12 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 3 Mar 2012 16:10:12 +0000
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAFXk4bos6Nyp6PxobPEY4SjcBvNo=aQhPhyJmC=jMxeFtU7TTg@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
	<CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>
	<CAF6FJivi-06QkHih+L8Qn2iGt7AO780vz3APJ0+kaUxYX-Q=Fw@mail.gmail.com>
	<CAFXk4bos6Nyp6PxobPEY4SjcBvNo=aQhPhyJmC=jMxeFtU7TTg@mail.gmail.com>
Message-ID: <CAF6FJiuaOC11F1XrPrPMo1-Oj2fxLJ3VDo+uor-zg1p+sTwv0w@mail.gmail.com>

On Sat, Mar 3, 2012 at 16:06, Olivier Delalleau <shish at keba.be> wrote:
> Le 3 mars 2012 11:03, Robert Kern <robert.kern at gmail.com> a ?crit :
>>
>> On Sat, Mar 3, 2012 at 15:51, Olivier Delalleau <shish at keba.be> wrote:
>> > Le 3 mars 2012 10:27, Robert Kern <robert.kern at gmail.com> a ?crit :
>> >>
>> >> On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com>
>> >> wrote:
>> >> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers
>> >> > <ralf.gommers at googlemail.com>
>> >> > wrote:
>> >>
>> >> >> Because this is also bad:
>> >> >>>>> np.<TAB>
>> >> >> Display all 561 possibilities? (y or n)
>> >> >
>> >> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
>> >> > deprecating np.allclose() in favor of np.close().all().
>> >>
>> >> I screwed up this paragraph. I meant that as "Another alternative
>> >> would be to deprecate ...".
>> >
>> >
>> > np.close().all() would probably be a lot less efficient in terms of CPU
>> > /
>> > memory though, wouldn't it?
>>
>> No. np.allclose() is essentially doing exactly this already.
>
>
> Ok. What about then, np.allclose() could theoretically be a lot more
> efficient in terms of CPU / memory? ;)

True, but so could a hypothetical np.allequal(), np.allless_equal(), etc.

-- 
Robert Kern


From jkington at wisc.edu  Sat Mar  3 13:06:54 2012
From: jkington at wisc.edu (Joe Kington)
Date: Sat, 03 Mar 2012 12:06:54 -0600
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAFXk4bos6Nyp6PxobPEY4SjcBvNo=aQhPhyJmC=jMxeFtU7TTg@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CAF6FJitkyXsL76P0mvL0ujGMhPw2Ds0UH-+SdGsTWyTXsK=hgQ@mail.gmail.com>
	<CAFXk4bohtWeDaA5GDeVP_uqH5r4t1e05W-xxv50fASAXG1BbhA@mail.gmail.com>
	<CAF6FJivi-06QkHih+L8Qn2iGt7AO780vz3APJ0+kaUxYX-Q=Fw@mail.gmail.com>
	<CAFXk4bos6Nyp6PxobPEY4SjcBvNo=aQhPhyJmC=jMxeFtU7TTg@mail.gmail.com>
Message-ID: <CACe1pJc6NdNkBrQ9_aAYVUf3t5Ri0xN9FRnBVn061krXnBEuYg@mail.gmail.com>

On Sat, Mar 3, 2012 at 10:06 AM, Olivier Delalleau <shish at keba.be> wrote:

> Le 3 mars 2012 11:03, Robert Kern <robert.kern at gmail.com> a ?crit :
>
>> On Sat, Mar 3, 2012 at 15:51, Olivier Delalleau <shish at keba.be> wrote:
>> > Le 3 mars 2012 10:27, Robert Kern <robert.kern at gmail.com> a ?crit :
>> >>
>> >> On Sat, Mar 3, 2012 at 14:34, Robert Kern <robert.kern at gmail.com>
>> wrote:
>> >> > On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <
>> ralf.gommers at googlemail.com>
>> >> > wrote:
>> >>
>> >> >> Because this is also bad:
>> >> >>>>> np.<TAB>
>> >> >> Display all 561 possibilities? (y or n)
>> >> >
>> >> > Not as bad as overloading np.allclose(x,y,return_array=True). Or
>> >> > deprecating np.allclose() in favor of np.close().all().
>> >>
>> >> I screwed up this paragraph. I meant that as "Another alternative
>> >> would be to deprecate ...".
>> >
>> >
>> > np.close().all() would probably be a lot less efficient in terms of CPU
>> /
>> > memory though, wouldn't it?
>>
>> No. np.allclose() is essentially doing exactly this already.
>>
>
> Ok. What about then, np.allclose() could theoretically be a lot more
> efficient in terms of CPU / memory? ;)
>

allclose() does short-circuit in a few cases where the pattern of Inf's
doesn't match.  E.g.

    if not all(xinf == isinf(y)):
        return False
    if not all(x[xinf] == y[xinf]):
        return False

At least for the function I wrote, allclose() would be a bit faster than
isclose().all() in those specific cases.  It's not likely to be terribly
significant, though.
-Joe


>
> -=- Olivier
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/acd831e9/attachment.html>

From jkington at wisc.edu  Sat Mar  3 13:07:29 2012
From: jkington at wisc.edu (Joe Kington)
Date: Sat, 03 Mar 2012 12:07:29 -0600
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAF6FJitkQT5THf59R=A9pjqdk2OVOFE1d4_R7oPjZ6tc+AQg0g@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>
	<CAF6FJitkQT5THf59R=A9pjqdk2OVOFE1d4_R7oPjZ6tc+AQg0g@mail.gmail.com>
Message-ID: <CACe1pJeLafrL5Xx5PiTO+rTLch8w-fGu_pbhddWxS4uXGWh10g@mail.gmail.com>

On Sat, Mar 3, 2012 at 9:26 AM, Robert Kern <robert.kern at gmail.com> wrote:

> On Sat, Mar 3, 2012 at 15:22, Benjamin Root <ben.root at ou.edu> wrote:
> >
> >
> > On Saturday, March 3, 2012, Robert Kern <robert.kern at gmail.com> wrote:
> >> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <ralf.gommers at googlemail.com
> >
> >> wrote:
> >>>
> >>>
> >>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com>
> >>> wrote:
> >>>>
> >>>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <
> ralf.gommers at googlemail.com>
> >>>> wrote:
> >>>> >
> >>>> >
> >>>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu>
> >>>> > wrote:
> >>>> >>
> >>>> >> Is there a numpy function for testing floating point equality that
> >>>> >> returns
> >>>> >> a boolean array?
> >>>> >>
> >>>> >> I'm aware of np.allclose, but I need a boolean array.  Properly
> >>>> >> handling
> >>>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
> >>>> >>
> >>>> >> I wrote the function below to do this, but I suspect there's a
> method
> >>>> >> in
> >>>> >> numpy that I missed.
> >>>> >
> >>>> >
> >>>> > I don't think such a function exists, would be nice to have. How
> about
> >>>> > just
> >>>> > adding a keyword "return_array" to allclose to do so?
> >>>>
> >>>> As a general design principle, adding a boolean flag that changes the
> >>>> return type is worse than making a new function.
> >>>
> >>>
> >>> That's certainly true as a general principle. Do you have a concrete
> >>> suggestion in this case though?
> >>
> >> np.close()
> >>
> >
> > When I read that, I mentally think of "close" as in closing a file.  I
> think
> > we need a synonym.
>
> np.isclose()
>

Would it be helpful if I went ahead and submitted a pull request with the
function in my original question called "isclose" (along with a complete
docstring and a few tests)?

One note:
At the moment, it deliberately compares NaN's as equal. E.g.

    isclose([np.nan, np.nan], [np.nan, np.nan])

will return:

    [True, True]

This obviously runs counter to the standard way NaN's are handled (and
indeed the definition of NaN).

However, in the context of a floating point "close to" function, I think it
makes the most sense.

I've had this sitting around in a small project for awhile now, and it's
been more useful to have it compare NaN's as "approximately equal" than not
for my purposes at least.

Nonetheless, it's something that needs additional consideration.

Thanks,
-Joe


>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/0e483321/attachment.html>

From shish at keba.be  Sat Mar  3 13:50:50 2012
From: shish at keba.be (Olivier Delalleau)
Date: Sat, 3 Mar 2012 13:50:50 -0500
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CACe1pJeLafrL5Xx5PiTO+rTLch8w-fGu_pbhddWxS4uXGWh10g@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>
	<CAF6FJitkQT5THf59R=A9pjqdk2OVOFE1d4_R7oPjZ6tc+AQg0g@mail.gmail.com>
	<CACe1pJeLafrL5Xx5PiTO+rTLch8w-fGu_pbhddWxS4uXGWh10g@mail.gmail.com>
Message-ID: <CAFXk4bqhc3FaBrwELyfzLOCO0TrwYsyrXJv9ee_uiHhN4xD7gg@mail.gmail.com>

Le 3 mars 2012 13:07, Joe Kington <jkington at wisc.edu> a ?crit :

>
>
> On Sat, Mar 3, 2012 at 9:26 AM, Robert Kern <robert.kern at gmail.com> wrote:
>
>> On Sat, Mar 3, 2012 at 15:22, Benjamin Root <ben.root at ou.edu> wrote:
>> >
>> >
>> > On Saturday, March 3, 2012, Robert Kern <robert.kern at gmail.com> wrote:
>> >> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers <
>> ralf.gommers at googlemail.com>
>> >> wrote:
>> >>>
>> >>>
>> >>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern <robert.kern at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers <
>> ralf.gommers at googlemail.com>
>> >>>> wrote:
>> >>>> >
>> >>>> >
>> >>>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington <jkington at wisc.edu>
>> >>>> > wrote:
>> >>>> >>
>> >>>> >> Is there a numpy function for testing floating point equality that
>> >>>> >> returns
>> >>>> >> a boolean array?
>> >>>> >>
>> >>>> >> I'm aware of np.allclose, but I need a boolean array.  Properly
>> >>>> >> handling
>> >>>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
>> >>>> >>
>> >>>> >> I wrote the function below to do this, but I suspect there's a
>> method
>> >>>> >> in
>> >>>> >> numpy that I missed.
>> >>>> >
>> >>>> >
>> >>>> > I don't think such a function exists, would be nice to have. How
>> about
>> >>>> > just
>> >>>> > adding a keyword "return_array" to allclose to do so?
>> >>>>
>> >>>> As a general design principle, adding a boolean flag that changes the
>> >>>> return type is worse than making a new function.
>> >>>
>> >>>
>> >>> That's certainly true as a general principle. Do you have a concrete
>> >>> suggestion in this case though?
>> >>
>> >> np.close()
>> >>
>> >
>> > When I read that, I mentally think of "close" as in closing a file.  I
>> think
>> > we need a synonym.
>>
>> np.isclose()
>>
>
> Would it be helpful if I went ahead and submitted a pull request with the
> function in my original question called "isclose" (along with a complete
> docstring and a few tests)?
>
> One note:
> At the moment, it deliberately compares NaN's as equal. E.g.
>
>     isclose([np.nan, np.nan], [np.nan, np.nan])
>
> will return:
>
>     [True, True]
>
> This obviously runs counter to the standard way NaN's are handled (and
> indeed the definition of NaN).
>
> However, in the context of a floating point "close to" function, I think
> it makes the most sense.
>
> I've had this sitting around in a small project for awhile now, and it's
> been more useful to have it compare NaN's as "approximately equal" than not
> for my purposes at least.
>
> Nonetheless, it's something that needs additional consideration.
>
> Thanks,
> -Joe
>

It would be confusing if numpy.isclose().all() was different from
numpy.allclose(). That being said, I agree it's useful to have NaNs compare
equal in some cases, maybe it could be a new argument to the function?

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/2be33307/attachment.html>

From travis at continuum.io  Sat Mar  3 15:30:47 2012
From: travis at continuum.io (Travis Oliphant)
Date: Sat, 3 Mar 2012 14:30:47 -0600
Subject: [Numpy-discussion] Missing data again
Message-ID: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>

Hi all, 

I've been thinking a lot about the masked array implementation lately.     I finally had the time to look hard at what has been done and now am of the opinion that I do not think that 1.7 can be released with the current state of the masked array implementation *unless* it is clearly marked as experimental and may be changed in 1.8  

I wish I had been able to be a bigger part of this conversation last year.   But, that is why I took the steps I took to try and figure out another way to feed my family *and* stay involved in the NumPy community.   I would love to stay involved in what is happening in the SciPy community, but I am more satisfied with what Ralf, Warren, Robert, Pauli, Josef, Charles, Stefan, and others are doing there right now, and don't have time to keep up with everything.    Even though SciPy was the heart and soul of why I even got involved with Python for open source in the first place and took many years of my volunteer labor, I won't be able to spend significant time on SciPy code over the coming months.   At some point, I really hope to be able to make contributions again to that code-base.   Time will tell whether or not my aspirations will be realized.  It depends quite a bit on whether or not my kids have what they need from me (which right now is money and time). 
 
NumPy, on the other hand, is not in a position where I can feel comfortable leaving my "baby" to others.  I recognize and value the contributions from many people to make NumPy what it is today (e.g. code contributions, code rearrangement and standardization, build and install improvement, and most recently some architectural changes).    But, I feel a personal responsibility for the code base as I spent a great many months writing NumPy in the first place, and I've spent a great deal of time interacting with NumPy users and feel like I have at least some sense of their stories.    Of course, I built on the shoulders of giants, and much of what is there is *because of* where the code was adapted from (it was not created de-novo).   Currently,  there remains much that needs to be communicated, improved, and worked on, and I have specific opinions about what some changes and improvements should be, how they should be written, and how the resulting users need to be benefited.   It will take time to discuss all of this, and that's where I will spend my open-source time in the coming months. 

In that vein: 

Because it is slated to go into release 1.7, we need to re-visit the masked array discussion again.    The NEP process is the appropriate one and I'm glad we are taking that route for these discussions.   My goal is to get consensus in order for code to get into NumPy (regardless of who writes the code).    It may be that we don't come to a consensus (reasonable and intelligent people can disagree on things --- look at the coming election...).   We can represent different parts of what is fortunately a very large user-base of NumPy users.    

First of all, I want to be clear that I think there is much great work that has been done in the current missing data code.  There are some nice features in the where clause of the ufunc and the machinery for the iterator that allows re-using ufunc loops that are not re-written to check for missing data.   I'm sure there are other things as well that I'm not quite aware of yet.    However, I don't think the API presented to the numpy user presently is the correct one for NumPy 1.X.   

A few particulars: 

	* the reduction operations need to default to "skipna" --- this is the most common use case which has been re-inforced again to me today by a new user to Python who is using masked arrays presently 
	
	* the mask needs to be visible to the user if they use that approach to missing data (people should be able to get a hold of the mask and work with it in Python)

	* bit-pattern approaches to missing data (at least for float64 and int32) need to be implemented. 

	* there should be some way when using "masks" (even if it's hidden from most users) for missing data to separate the low-level ufunc operation from the operation
	   on the masks...

I have heard from several users that they will *not use the missing data* in NumPy as currently implemented, and I can now see why.    For better or for worse, my approach to software is generally very user-driven and very pragmatic.  On the other hand, I'm also a mathematician and appreciate the cognitive compression that can come out of well-formed structure.    None-the-less, I'm an *applied* mathematician and am ultimately motivated by applications.

I will get a hold of the NEP and spend some time with it to discuss some of this in that document.   This will take several weeks (as PyCon is next week and I have a tutorial I'm giving there).    For now, I do not think 1.7 can be released unless the masked array is labeled *experimental*.  

Thanks,

-Travis


From mwwiebe at gmail.com  Sat Mar  3 16:46:29 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Sat, 3 Mar 2012 13:46:29 -0800
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
Message-ID: <CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>

On Sat, Mar 3, 2012 at 12:30 PM, Travis Oliphant <travis at continuum.io>wrote:

> <snip>
>


> First of all, I want to be clear that I think there is much great work
> that has been done in the current missing data code.  There are some nice
> features in the where clause of the ufunc and the machinery for the
> iterator that allows re-using ufunc loops that are not re-written to check
> for missing data.   I'm sure there are other things as well that I'm not
> quite aware of yet.    However, I don't think the API presented to the
> numpy user presently is the correct one for NumPy 1.X.
>

I thought I might chime in with some implementation-detail notes, as while
Travis has dug into the code, I'm still the person who knows it best.

A few particulars:
>
>        * the reduction operations need to default to "skipna" --- this is
> the most common use case which has been re-inforced again to me today by a
> new user to Python who is using masked arrays presently
>

This is a completely trivial change. I went with the default as I did
because it's what R, the primary inspiration for the NA design, does. We'll
have to be sure this is well-marked in the documentation about "NumPy NA
for R users".


>        * the mask needs to be visible to the user if they use that
> approach to missing data (people should be able to get a hold of the mask
> and work with it in Python)
>

This is relatively easy. Probably the way to do it is with an
ndarray.maskna property. It could be in 1.7 if we really push. For the
multi-NA future, I think the NPY_MASK dtype, currently an alias for
NPY_UBYTE, would need to become its own dtype with separate .exposed and
.payload attributes.


>        * bit-pattern approaches to missing data (at least for float64 and
> int32) need to be implemented.
>

I strongly wanted to do masks first, because of the greater generality and
because the bit-patterns would best be implemented sharing mask
implementation details. I still believe this was the correct choice, and it
set the stage for bit-patterns. It will be possible to make inner loops
that specialize for the default hard-coded bit-pattern dtypes. I paid very
careful attention in the design making sure high performance is possible
without significant rework. The immense scale of the required code changes
meant I couldn't actually implement high performance in the time frame.

The place I think this affects 1.7 the most is in the default choice for
what np.array([1.0, np.NA, 3.0]) and np.array([1, np.NA, 3]) mean. In 1.7,
both mean an NA-masked array. In 1.8, I can see a strong case that the
first should mean an NA-dtype, and the second an NA-masked array.

Also, here's a thought for the usability of NA-float64. As much as global
state is a bad idea, something which determines whether implicit float
dtypes are NA-float64 or float64 could help. In IPython, "pylab" mode would
default to float64, and "statlab" or "pystat" would default to NA-float64.
One way to write this might be:

>>> np.set_default_float(np.nafloat64)
>>> np.array([1.0, 2.0, 3.0])
array([ 1.,  2.,  3.], dtype=nafloat64)
>>> np.set_default_float(np.float64)
>>> np.array([1.0, 2.0, 3.0])
array([ 1.,  2.,  3.], dtype=float64)


>        * there should be some way when using "masks" (even if it's hidden
> from most users) for missing data to separate the low-level ufunc operation
> from the operation
>           on the masks...
>

This is completely trivial to implement. Maybe
ndarray.view(maskna='ignore') is a reasonable way to spell direct access
without a mask.

Cheers,
Mark


> I have heard from several users that they will *not use the missing data*
> in NumPy as currently implemented, and I can now see why.    For better or
> for worse, my approach to software is generally very user-driven and very
> pragmatic.  On the other hand, I'm also a mathematician and appreciate the
> cognitive compression that can come out of well-formed structure.
>  None-the-less, I'm an *applied* mathematician and am ultimately motivated
> by applications.
>
> I will get a hold of the NEP and spend some time with it to discuss some
> of this in that document.   This will take several weeks (as PyCon is next
> week and I have a tutorial I'm giving there).    For now, I do not think
> 1.7 can be released unless the masked array is labeled *experimental*.
>
> Thanks,
>
> -Travis
>
>
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/dd349ace/attachment.html>

From ralf.gommers at googlemail.com  Sat Mar  3 16:53:09 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 3 Mar 2012 22:53:09 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
Message-ID: <CABL7CQhccyPHfXT3K2aRrgU31fNuVgCKmGgLZua1aSacXUN-6A@mail.gmail.com>

On Sat, Mar 3, 2012 at 9:30 PM, Travis Oliphant <travis at continuum.io> wrote:

> Hi all,
>
> I've been thinking a lot about the masked array implementation lately.
> I finally had the time to look hard at what has been done and now am of the
> opinion that I do not think that 1.7 can be released with the current state
> of the masked array implementation *unless* it is clearly marked as
> experimental and may be changed in 1.8
>

We had already decided to put an "experimental" label on the
implementation. Also on datetime. I will open a ticket for this now to make
sure we won't forget.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/fb8505f6/attachment.html>

From charlesr.harris at gmail.com  Sat Mar  3 16:55:04 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 3 Mar 2012 14:55:04 -0700
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
Message-ID: <CAB6mnxKQ3aW8csw7+P0DMAiJCeXxeDVqFVG95hpfepU2RC=p=g@mail.gmail.com>

On Sat, Mar 3, 2012 at 1:30 PM, Travis Oliphant <travis at continuum.io> wrote:

> Hi all,
>
> I've been thinking a lot about the masked array implementation lately.
> I finally had the time to look hard at what has been done and now am of the
> opinion that I do not think that 1.7 can be released with the current state
> of the masked array implementation *unless* it is clearly marked as
> experimental and may be changed in 1.8
>
>
That was the intention.


> I wish I had been able to be a bigger part of this conversation last year.
>   But, that is why I took the steps I took to try and figure out another
> way to feed my family *and* stay involved in the NumPy community.   I would
> love to stay involved in what is happening in the SciPy community, but I am
> more satisfied with what Ralf, Warren, Robert, Pauli, Josef, Charles,
> Stefan, and others are doing there right now, and don't have time to keep
> up with everything.    Even though SciPy was the heart and soul of why I
> even got involved with Python for open source in the first place and took
> many years of my volunteer labor, I won't be able to spend significant time
> on SciPy code over the coming months.   At some point, I really hope to be
> able to make contributions again to that code-base.   Time will tell
> whether or not my aspirations will be realized.  It depends quite a bit on
> whether or not my kids have what they need from me (which right now is
> money and time).
>
> NumPy, on the other hand, is not in a position where I can feel
> comfortable leaving my "baby" to others.  I recognize and value the
> contributions from many people to make NumPy what it is today (e.g. code
> contributions, code rearrangement and standardization, build and install
> improvement, and most recently some architectural changes).    But, I feel
> a personal responsibility for the code base as I spent a great many months
> writing NumPy in the first place, and I've spent a great deal of time
> interacting with NumPy users and feel like I have at least some sense of
> their stories.    Of course, I built on the shoulders of giants, and much
> of what is there is *because of* where the code was adapted from (it was
> not created de-novo).   Currently,  there remains much that needs to be
> communicated, improved, and worked on, and I have specific opinions about
> what some changes and improvements should be, how they should be written,
> and how the resulting users need to be benefited.
>  It will take time to discuss all of this, and that's where I will spend
> my open-source time in the coming months.
>
> In that vein:
>
> Because it is slated to go into release 1.7, we need to re-visit the
> masked array discussion again.    The NEP process is the appropriate one
> and I'm glad we are taking that route for these discussions.   My goal is
> to get consensus in order for code to get into NumPy (regardless of who
> writes the code).    It may be that we don't come to a consensus
> (reasonable and intelligent people can disagree on things --- look at the
> coming election...).   We can represent different parts of what is
> fortunately a very large user-base of NumPy users.
>
> First of all, I want to be clear that I think there is much great work
> that has been done in the current missing data code.  There are some nice
> features in the where clause of the ufunc and the machinery for the
> iterator that allows re-using ufunc loops that are not re-written to check
> for missing data.   I'm sure there are other things as well that I'm not
> quite aware of yet.    However, I don't think the API presented to the
> numpy user presently is the correct one for NumPy 1.X.
>

A few particulars:
>
>        * the reduction operations need to default to "skipna" --- this is
> the most common use case which has been re-inforced again to me today by a
> new user to Python who is using masked arrays presently
>
>        * the mask needs to be visible to the user if they use that
> approach to missing data (people should be able to get a hold of the mask
> and work with it in Python)
>
>        * bit-pattern approaches to missing data (at least for float64 and
> int32) need to be implemented.
>
>        * there should be some way when using "masks" (even if it's hidden
> from most users) for missing data to separate the low-level ufunc operation
> from the operation
>           on the masks...
>
>
Mind, Mark only had a few weeks to write code. I think the unfinished state
is a direct function of that.


> I have heard from several users that they will *not use the missing data*
> in NumPy as currently implemented, and I can now see why.    For better or
> for worse, my approach to software is generally very user-driven and very
> pragmatic.  On the other hand, I'm also a mathematician and appreciate the
> cognitive compression that can come out of well-formed structure.
>  None-the-less, I'm an *applied* mathematician and am ultimately motivated
> by applications.
>
>
I think that would be Wes. I thought the current state wasn't that far away
from what he wanted in the only post where he was somewhat explicit. I
think it would be useful for him to sit down with Mark at some time and
thrash things out since I think there is some misunderstanding involved.


> I will get a hold of the NEP and spend some time with it to discuss some
> of this in that document.   This will take several weeks (as PyCon is next
> week and I have a tutorial I'm giving there).    For now, I do not think
> 1.7 can be released unless the masked array is labeled *experimental*.
>
>
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/345f46aa/attachment.html>

From travis at continuum.io  Sat Mar  3 17:01:40 2012
From: travis at continuum.io (Travis Oliphant)
Date: Sat, 3 Mar 2012 16:01:40 -0600
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAB6mnxKQ3aW8csw7+P0DMAiJCeXxeDVqFVG95hpfepU2RC=p=g@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAB6mnxKQ3aW8csw7+P0DMAiJCeXxeDVqFVG95hpfepU2RC=p=g@mail.gmail.com>
Message-ID: <36EC0418-E7D2-4C64-A30E-B6FBCDEE3A80@continuum.io>

> 
> Mind, Mark only had a few weeks to write code. I think the unfinished state is a direct function of that.
>  
> I have heard from several users that they will *not use the missing data* in NumPy as currently implemented, and I can now see why.    For better or for worse, my approach to software is generally very user-driven and very pragmatic.  On the other hand, I'm also a mathematician and appreciate the cognitive compression that can come out of well-formed structure.    None-the-less, I'm an *applied* mathematician and am ultimately motivated by applications.
> 
> 
> I think that would be Wes. I thought the current state wasn't that far away from what he wanted in the only post where he was somewhat explicit. I think it would be useful for him to sit down with Mark at some time and thrash things out since I think there is some misunderstanding involved.
>  

Actually it wasn't Wes.  It was 3 other people.   I'm already well aware of Wes's perspective and actually think his concerns have been handled already.    Also, the person who showed me their use-case was a new user.

But, your point about getting people together is well-taken.  I also recognize the fact that there have been (and likely continue to be) misunderstandings on multiple fronts.   Fortunately, many of us will be at PyCon later this week.   We tried really hard to get Mark Wiebe here this weekend as well --- but he could only sacrifice a week away from his degree work to join us for PyCon. 

It would be great if you could come to PyCon as well.   Perhaps we can apply to NumFOCUS for a travel grant to bring NumPy developers together with other interested people to finish the masked array design and implementation.

-Travis


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/09fabf14/attachment.html>

From jsseabold at gmail.com  Sat Mar  3 17:10:44 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Sat, 3 Mar 2012 17:10:44 -0500
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
Message-ID: <CAKF=Djut3imUZCzhsDWCNgh+ppFthF=aV=vVeOSK2wfzEQd4hQ@mail.gmail.com>

On Sat, Mar 3, 2012 at 4:46 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Sat, Mar 3, 2012 at 12:30 PM, Travis Oliphant <travis at continuum.io>
<snip>
>>
>> ? ? ? ?* the reduction operations need to default to "skipna" --- this is
>> the most common use case which has been re-inforced again to me today by a
>> new user to Python who is using masked arrays presently
>
>
> This is a completely trivial change. I went with the default as I did
> because it's what R, the primary inspiration for the NA design, does. We'll
> have to be sure this is well-marked in the documentation about "NumPy NA for
> R users".
>

It may be trivial to change the code, but this isn't a trivial change.
"Most common use case" is hard for me to swallow, since there are so
many. Of the different statistical softwares I've used, none that I
recall ignores missing data (silently) by default. This sounds
dangerous to me. It's one thing to be convenient to work with missing
data, but it's another to try to sweep the problem under the rug. I
imagine the choice of the R developers was a thoughtful one.

Perhaps something like np.seterr should also be implemented for
missing data since there's probably no resolution to what's most
sensible here.

Skipper


From jkington at wisc.edu  Sat Mar  3 18:33:49 2012
From: jkington at wisc.edu (Joe Kington)
Date: Sat, 03 Mar 2012 17:33:49 -0600
Subject: [Numpy-discussion] Floating point "close" function?
In-Reply-To: <CAFXk4bqhc3FaBrwELyfzLOCO0TrwYsyrXJv9ee_uiHhN4xD7gg@mail.gmail.com>
References: <CACe1pJdnj11a+-DAYssNQV9bSBxEjp_nCAjchk7_itH2-RYP4g@mail.gmail.com>
	<CABL7CQhQDeSi36TJbSq9pB3odyQTg91r=JEeUGTriWmkB7NdPw@mail.gmail.com>
	<CAF6FJivo8LrEhFc+r3biW3iH8hjdoJZDJ8Txf8Cj9aS37r0VnQ@mail.gmail.com>
	<CABL7CQhT4HuS_PhDYYTE2tV+0Z8e_mmuAwLq_YG68hR8LH4Xtw@mail.gmail.com>
	<CAF6FJiu5Gx2ZQ9e_P-1GmR=UkucK8MyT_C8M+k3i4CZv+hPG=A@mail.gmail.com>
	<CANNq6F=Uh9Hr-ti423DEWbEsKrGGVyWf5bYmE1UDMcA8aCzhhQ@mail.gmail.com>
	<CAF6FJitkQT5THf59R=A9pjqdk2OVOFE1d4_R7oPjZ6tc+AQg0g@mail.gmail.com>
	<CACe1pJeLafrL5Xx5PiTO+rTLch8w-fGu_pbhddWxS4uXGWh10g@mail.gmail.com>
	<CAFXk4bqhc3FaBrwELyfzLOCO0TrwYsyrXJv9ee_uiHhN4xD7gg@mail.gmail.com>
Message-ID: <CACe1pJckGaJ_xokGz3-4UQnTXuEBcv3mu1_j4En3cJNEU5QoDg@mail.gmail.com>

On Sat, Mar 3, 2012 at 12:50 PM, Olivier Delalleau <shish at keba.be> wrote:

>
>> Would it be helpful if I went ahead and submitted a pull request with the
>> function in my original question called "isclose" (along with a complete
>> docstring and a few tests)?
>>
>> One note:
>> At the moment, it deliberately compares NaN's as equal. E.g.
>>
>>     isclose([np.nan, np.nan], [np.nan, np.nan])
>>
>> will return:
>>
>>     [True, True]
>>
>> This obviously runs counter to the standard way NaN's are handled (and
>> indeed the definition of NaN).
>>
>> However, in the context of a floating point "close to" function, I think
>> it makes the most sense.
>>
>> I've had this sitting around in a small project for awhile now, and it's
>> been more useful to have it compare NaN's as "approximately equal" than not
>> for my purposes at least.
>>
>> Nonetheless, it's something that needs additional consideration.
>>
>> Thanks,
>> -Joe
>>
>
> It would be confusing if numpy.isclose().all() was different from
> numpy.allclose(). That being said, I agree it's useful to have NaNs compare
> equal in some cases, maybe it could be a new argument to the function?
>

Good point. I went ahead and added an "equal_nan" kwarg and removed the
"check_invalid" kwarg I had in before.  I also made it mimic what
np.equal() does in the case of two scalars (return a scalar instead of an
array).

I went ahead an make a pull request:
https://github.com/numpy/numpy/pull/224  Hope that's alright.

Cheers,
-Joe


>
> -=- Olivier
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120303/7210b99e/attachment.html>

From cournape at gmail.com  Sat Mar  3 19:38:53 2012
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 3 Mar 2012 16:38:53 -0800
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <30965644.bxiDOaWOx7@rabbit>
References: <30965644.bxiDOaWOx7@rabbit>
Message-ID: <CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>

On Sat, Mar 3, 2012 at 8:07 AM, Luis Pedro Coelho <lpc at cmu.edu> wrote:
> Hi,
>
> I sort of missed the big C++ discussion, but I'd like to give some examples of
> how writing code can become much simpler if you are based on C++. This is from
> my mahotas package, which has a thin C++ wrapper around numpy's C API
>
> https://github.com/luispedro/mahotas/blob/master/mahotas/_morph.cpp
>
> and it implements multi-type greyscale erosion.
>
>
> // numpy::aligned_array wraps PyArrayObject*
> template<typename T>
> void erode(numpy::aligned_array<T> res,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?numpy::aligned_array<T> array,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?numpy::aligned_array<T> Bc) {
>
>
> ? ? ? ?// Release the GIL using RAII
> ? ?gil_release nogil;
> ? ?const int N = res.size();
> ? ?typename numpy::aligned_array<T>::iterator iter = array.begin();
> ? ?// this is adapted from scipy.ndimage.
> ? ?// it implements the convolution-like filtering.
> ? ?filter_iterator<T> filter(res.raw_array(),
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Bc.raw_array(),
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?EXTEND_NEAREST,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?is_bool(T()));
> ? ?const int N2 = filter.size();
> ? ?T* rpos = res.data();
>
> ? ?for (int i = 0; i != N; ++i, ++rpos, filter.iterate_both(iter)) {
> ? ? ? ?T value = std::numeric_limits<T>::max();
> ? ? ? ?for (int j = 0; j != N2; ++j) {
> ? ? ? ? ? ?T arr_val = T();
> ? ? ? ? ? ?filter.retrieve(iter, j, arr_val);
> ? ? ? ? ? ?value = std::min<T>(value, erode_sub(arr_val, filter[j]));
> ? ? ? ?}
> ? ? ? ?*rpos = value;
> ? ?}
> }
>
> If you compare this with the equivalent scipy.ndimage function, which is very
> good C code (but mostly write-only?in fact, ndimage has not been maintainable
> because it is so hard [at least for me, I've tried]):

The fact that this is good C is matter of opinon :)

I don't think the code is comparable either - some of the stuff done
in the C code is done in the C++ code your are calling. The C code
could be significantly improved. Even more important here: almost none
of this code should be written anymore anyway, C++ or not. This is
really the kind of code that should be done in cython, as it is mostly
about wrapping C code into the python C API.

cheers,

David


From gael.varoquaux at normalesup.org  Sun Mar  4 09:07:53 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 4 Mar 2012 15:07:53 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
Message-ID: <20120304140753.GC705@phare.normalesup.org>

On Sat, Mar 03, 2012 at 04:38:53PM -0800, David Cournapeau wrote:
> This is really the kind of code that should be done in cython, as it is
> mostly about wrapping C code into the python C API.

+1

Gael


From chaoyuejoy at gmail.com  Sun Mar  4 13:01:55 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Sun, 4 Mar 2012 19:01:55 +0100
Subject: [Numpy-discussion] copy mask from existing masked array?
Message-ID: <CAAN-aRH1CTNhXrGmabEPLZDctO14ZtrotB_-Z=AfRvQT_8S1pA@mail.gmail.com>

Dear all,

I have a matrix with dimension of (360,720) but with all global data.
I have another land-sea mask matrix with only 2 unique values in it
(land=1, sea=-1).
So I can easily create transform the second array to a masked array.
the problem is, how can I quickly transform the first one to a masked array
using the same mask as the land-sea mask array?

I hope my question is clear. If not, here is an example:

In [93]: a=np.arange(10).reshape(2,5)
In [95]: a=np.ma.masked_equal(a,2
In [96]: a=np.ma.masked_equal(a,8)

In [97]: a
Out[97]:
masked_array(data =
 [[0 1 -- 3 4]
 [5 6 7 -- 9]],
             mask =
 [[False False  True False False]
 [False False False  True False]],
       fill_value = 8)

In [100]: b=np.random.normal(0,2,size=(2,5))

I want to convert b to a masked array using exactly the same mask as a.

thanks to all,
cheers,

Chao
-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/4fb01db6/attachment.html>

From shish at keba.be  Sun Mar  4 13:42:08 2012
From: shish at keba.be (Olivier Delalleau)
Date: Sun, 4 Mar 2012 13:42:08 -0500
Subject: [Numpy-discussion] copy mask from existing masked array?
In-Reply-To: <CAAN-aRH1CTNhXrGmabEPLZDctO14ZtrotB_-Z=AfRvQT_8S1pA@mail.gmail.com>
References: <CAAN-aRH1CTNhXrGmabEPLZDctO14ZtrotB_-Z=AfRvQT_8S1pA@mail.gmail.com>
Message-ID: <CAFXk4botYGumfk6X7pOXGPXdUwK1EakCYx4yYjjRPJyfH_TCWw@mail.gmail.com>

Should work with:
b = numpy.ma.masked_array(b, mask=a.mask)

-=- Olivier

Le 4 mars 2012 13:01, Chao YUE <chaoyuejoy at gmail.com> a ?crit :

> Dear all,
>
> I have a matrix with dimension of (360,720) but with all global data.
> I have another land-sea mask matrix with only 2 unique values in it
> (land=1, sea=-1).
> So I can easily create transform the second array to a masked array.
> the problem is, how can I quickly transform the first one to a masked
> array using the same mask as the land-sea mask array?
>
> I hope my question is clear. If not, here is an example:
>
> In [93]: a=np.arange(10).reshape(2,5)
> In [95]: a=np.ma.masked_equal(a,2
> In [96]: a=np.ma.masked_equal(a,8)
>
> In [97]: a
> Out[97]:
> masked_array(data =
>  [[0 1 -- 3 4]
>  [5 6 7 -- 9]],
>              mask =
>  [[False False  True False False]
>  [False False False  True False]],
>        fill_value = 8)
>
> In [100]: b=np.random.normal(0,2,size=(2,5))
>
> I want to convert b to a masked array using exactly the same mask as a.
>
> thanks to all,
> cheers,
>
> Chao
> --
>
> ***********************************************************************************
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> ************************************************************************************
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/06136e04/attachment.html>

From matthew.brett at gmail.com  Sun Mar  4 14:27:47 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 4 Mar 2012 14:27:47 -0500
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
Message-ID: <CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>

Hi,

On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> Sorry that this report is not complete, I don't have full access to
>>> this box but, on a Debian squeeze machine running linux
>>> 2.6.32-5-sparc64-smp:
>>>
>>> nosetests
>>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>>>
>>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>>>
>>> This on current master : 1.7.0.dev-b9872b4
>>>
>>
>> Hmm, some tests might have been recently enabled. Any chance of doing a
>> bisection?

Struggling because compilation is very slow and there are lots of
untestable commits.  df907e6 is the first known bad.  Here's the
output from a log:

* df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
mode because of a static function in two separate files (6 months ago)
[Mark Wiebe]
* 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
BUG: nditer: The nditer was reusing the reduce loop inappropriately
(#1938) (6 months ago) [Mark Wiebe]
* f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
* 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
months ago) [Mark Wiebe]
* c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
TST: missingdata: Finish up NA mask tests for np.std and np.var (6
months ago) [Mark Wiebe]
* e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
(6 months ago) [Mark Wiebe]
* ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
ENH: missingdata: Implement tests for np.std, add skipna= and
keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
* a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
ENH: missingdata: Implement skipna= support for np.std and np.var (6
months ago) [Mark Wiebe]
* 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
ago) [Mark Wiebe]
* bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
ENH: missingdata: Create count_reduce_items function (6 months ago)
[Mark Wiebe]
* d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
ENH: missingdata: Move the Reduce boilerplate into a function
PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
* 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
ENH: missingdata: Finish count_nonzero as a full-fledged reduction
operation (6 months ago) [Mark Wiebe]
* 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
ENH: missingdata: Towards making count_nonzero a full-featured
reduction operation (6 months ago) [Mark Wiebe]
* a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
ENH: missingdata: Move some of the refactored reduction code into the
API (6 months ago) [Mark Wiebe]
* f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
ENH: missingdata: Change default to create NA-mask when NAs are in
lists (6 months ago) [Mark Wiebe]
* 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
ENH: missingdata: trying some more functions to see how they treat NAs
(6 months ago) [Mark Wiebe]
* b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
months ago) [Mark Wiebe]
* ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
(6 months ago) [Mark Wiebe]
* a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
masks (6 months ago) [Mark Wiebe]
* 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
months ago) [Mark Wiebe]
* 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
always return a view (6 months ago) [Mark Wiebe]

So - the problem arises somewhere between 99a21ef (good) and  df907e6 (bad)

There seems to be a compilation error for the skipped commits - here's
the one I tested, 9194b3a:

gcc: numpy/core/src/multiarray/multiarraymodule_onefile.c
In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
                 from numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
numpy/core/src/multiarray/datetime.c:34: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
'intp_abs'
numpy/core/src/multiarray/shape.c:927: note: previous definition of
'intp_abs' was here
In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
                 from numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
numpy/core/src/multiarray/datetime.c:34: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
'intp_abs'
numpy/core/src/multiarray/shape.c:927: note: previous definition of
'intp_abs' was here
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
-O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
-Ibuild/src.linux-sparc64-2.6/numpy/core/include/numpy
-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray
-Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
-I/usr/include/python2.6
-Ibuild/src.linux-sparc64-2.6/numpy/core/src/multiarray
-Ibuild/src.linux-sparc64-2.6/numpy/core/src/umath -c
numpy/core/src/multiarray/multiarraymodule_onefile.c -o
build/temp.linux-sparc64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o"
failed with exit status 1

See you,

Matthew


From mwwiebe at gmail.com  Sun Mar  4 14:41:53 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Sun, 4 Mar 2012 11:41:53 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
Message-ID: <CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>

On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
> > Hi,
> >
> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >>
> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Sorry that this report is not complete, I don't have full access to
> >>> this box but, on a Debian squeeze machine running linux
> >>> 2.6.32-5-sparc64-smp:
> >>>
> >>> nosetests
> >>>
> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
> >>>
> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
> >>>
> >>> This on current master : 1.7.0.dev-b9872b4
> >>>
> >>
> >> Hmm, some tests might have been recently enabled. Any chance of doing a
> >> bisection?
>
> Struggling because compilation is very slow and there are lots of
> untestable commits.  df907e6 is the first known bad.  Here's the
> output from a log:
>
> * df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
> mode because of a static function in two separate files (6 months ago)
> [Mark Wiebe]
> * 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
> BUG: nditer: The nditer was reusing the reduce loop inappropriately
> (#1938) (6 months ago) [Mark Wiebe]
> * f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
> DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
> * 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
> BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
> months ago) [Mark Wiebe]
> * c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
> TST: missingdata: Finish up NA mask tests for np.std and np.var (6
> months ago) [Mark Wiebe]
> * e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
> BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
> (6 months ago) [Mark Wiebe]
> * ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
> ENH: missingdata: Implement tests for np.std, add skipna= and
> keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
> * a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
> ENH: missingdata: Implement skipna= support for np.std and np.var (6
> months ago) [Mark Wiebe]
> * 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
> ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
> ago) [Mark Wiebe]
> * bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
> ENH: missingdata: Create count_reduce_items function (6 months ago)
> [Mark Wiebe]
> * d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
> ENH: missingdata: Move the Reduce boilerplate into a function
> PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
> * 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
> ENH: missingdata: Finish count_nonzero as a full-fledged reduction
> operation (6 months ago) [Mark Wiebe]
> * 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
> ENH: missingdata: Towards making count_nonzero a full-featured
> reduction operation (6 months ago) [Mark Wiebe]
> * a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
> ENH: missingdata: Move some of the refactored reduction code into the
> API (6 months ago) [Mark Wiebe]
> * f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
> ENH: missingdata: Change default to create NA-mask when NAs are in
> lists (6 months ago) [Mark Wiebe]
> * 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
> ENH: missingdata: trying some more functions to see how they treat NAs
> (6 months ago) [Mark Wiebe]
> * b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
> ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
> months ago) [Mark Wiebe]
> * ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
> BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
> (6 months ago) [Mark Wiebe]
> * a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
> BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
> masks (6 months ago) [Mark Wiebe]
> * 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
> ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
> months ago) [Mark Wiebe]
> * 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
> ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
> always return a view (6 months ago) [Mark Wiebe]
>
> So - the problem arises somewhere between 99a21ef (good) and  df907e6 (bad)
>
> There seems to be a compilation error for the skipped commits - here's
> the one I tested, 9194b3a:
>

If you enable separate compilation by setting an environment variable,
these commits should build as well.

$ export ENABLE_SEPARATE_COMPILATION=1
<proceed with builds>

Cheers,
Mark


> gcc: numpy/core/src/multiarray/multiarraymodule_onefile.c
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
> -O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
> -Ibuild/src.linux-sparc64-2.6/numpy/core/include/numpy
> -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray
> -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
> -I/usr/include/python2.6
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/multiarray
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/umath -c
> numpy/core/src/multiarray/multiarraymodule_onefile.c -o
>
> build/temp.linux-sparc64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o"
> failed with exit status 1
>
> See you,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/cb04263e/attachment.html>

From charlesr.harris at gmail.com  Sun Mar  4 14:50:33 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 4 Mar 2012 12:50:33 -0700
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
Message-ID: <CAB6mnxLPrZ-W=0SVqxmY64MOvR3pqKnerb1XC9AyG5KcFkddRQ@mail.gmail.com>

On Sun, Mar 4, 2012 at 12:27 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
> > Hi,
> >
> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >>
> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Sorry that this report is not complete, I don't have full access to
> >>> this box but, on a Debian squeeze machine running linux
> >>> 2.6.32-5-sparc64-smp:
> >>>
> >>> nosetests
> >>>
> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
> >>>
> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
> >>>
> >>> This on current master : 1.7.0.dev-b9872b4
> >>>
> >>
> >> Hmm, some tests might have been recently enabled. Any chance of doing a
> >> bisection?
>
> Struggling because compilation is very slow and there are lots of
> untestable commits.  df907e6 is the first known bad.  Here's the
> output from a log:
>

The effort is much appreciated. At least we are down to a 3 day period.


>
> * df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
> mode because of a static function in two separate files (6 months ago)
> [Mark Wiebe]
> * 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
> BUG: nditer: The nditer was reusing the reduce loop inappropriately
> (#1938) (6 months ago) [Mark Wiebe]
> * f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
> DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
> * 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
> BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
> months ago) [Mark Wiebe]
> * c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
> TST: missingdata: Finish up NA mask tests for np.std and np.var (6
> months ago) [Mark Wiebe]
> * e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
> BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
> (6 months ago) [Mark Wiebe]
> * ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
> ENH: missingdata: Implement tests for np.std, add skipna= and
> keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
> * a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
> ENH: missingdata: Implement skipna= support for np.std and np.var (6
> months ago) [Mark Wiebe]
> * 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
> ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
> ago) [Mark Wiebe]
> * bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
> ENH: missingdata: Create count_reduce_items function (6 months ago)
> [Mark Wiebe]
> * d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
> ENH: missingdata: Move the Reduce boilerplate into a function
> PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
> * 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
> ENH: missingdata: Finish count_nonzero as a full-fledged reduction
> operation (6 months ago) [Mark Wiebe]
> * 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
> ENH: missingdata: Towards making count_nonzero a full-featured
> reduction operation (6 months ago) [Mark Wiebe]
> * a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
> ENH: missingdata: Move some of the refactored reduction code into the
> API (6 months ago) [Mark Wiebe]
> * f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
> ENH: missingdata: Change default to create NA-mask when NAs are in
> lists (6 months ago) [Mark Wiebe]
> * 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
> ENH: missingdata: trying some more functions to see how they treat NAs
> (6 months ago) [Mark Wiebe]
> * b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
> ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
> months ago) [Mark Wiebe]
> * ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
> BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
> (6 months ago) [Mark Wiebe]
> * a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
> BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
> masks (6 months ago) [Mark Wiebe]
> * 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
> ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
> months ago) [Mark Wiebe]
> * 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
> ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
> always return a view (6 months ago) [Mark Wiebe]
>
> So - the problem arises somewhere between 99a21ef (good) and  df907e6 (bad)
>
> There seems to be a compilation error for the skipped commits - here's
> the one I tested, 9194b3a:
>
> gcc: numpy/core/src/multiarray/multiarraymodule_onefile.c
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
> -O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
> -Ibuild/src.linux-sparc64-2.6/numpy/core/include/numpy
> -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray
> -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
> -I/usr/include/python2.6
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/multiarray
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/umath -c
> numpy/core/src/multiarray/multiarraymodule_onefile.c -o
>
> build/temp.linux-sparc64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o"
> failed with exit status 1
>
>
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/8beda360/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar  4 15:56:03 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 4 Mar 2012 21:56:03 +0100
Subject: [Numpy-discussion] test errors on deprecation/runtime warnings
In-Reply-To: <CABL7CQj8z5UB3hLyeYB9T=f6mysQnQR253rLLuB-Y0AO3zjZLQ@mail.gmail.com>
References: <CABL7CQiF20LObhvkArPBRpLKuMMhZO0jFFJDYzWP5YyCwOQEEA@mail.gmail.com>
	<CABDkGQmGHP4nj_2HWNMd4qHx-=mOg-9x7ckHTiM1kDSz+tYBdg@mail.gmail.com>
	<CABL7CQj8z5UB3hLyeYB9T=f6mysQnQR253rLLuB-Y0AO3zjZLQ@mail.gmail.com>
Message-ID: <CABL7CQjcnCco1Tc7MoWU+dZZgiyT=-MUVya_PGsJr1X5+TKQSg@mail.gmail.com>

On Sat, Feb 18, 2012 at 10:13 AM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

>
>
> 2012/2/17 St?fan van der Walt <stefan at sun.ac.za>
>
>> Hi Ralf
>>
>> On Thu, Feb 16, 2012 at 11:05 AM, Ralf Gommers
>> <ralf.gommers at googlemail.com> wrote:
>> > Last week we merged https://github.com/numpy/numpy/pull/201, which
>> causes
>> > DeprecationWarning's and RuntimeWarning's to be converted to errors if
>> they
>> > occur when running the test suite.
>>
>> It looks like this change affects other packages, too,
>
>
> It does, which is why I wanted to bring it up here.
>
>
>> which may legitimately raise RuntimeWarnings while running their test
>> suites
>> (unless I read the patch wrong).  Would it be an option to rather add
>> a flag (False by default) to enable this behaviour, and enable it
>> inside of numpy.test() ?
>>
>
> Well, the idea is that this behavior is the correct one for all packages.
> It calls attention to those RuntimeWarnings, which may only occur on
> certain platforms. If they're legitimate, you silence them in the test
> suite of that package. If not, you fix them. Would you agree with that? Or
> would you prefer to just ignore DeprecationWarnings and/or RuntimeWarnings
> in skimage for example?
>
> Note that the changed behavior would only be visible for people running
> numpy master.
>

This behavior has been made configurable, and I added instructions in
HOWTO_RELEASE to turn it off in maintenance branches, in
https://github.com/rgommers/numpy/compare/pull-219-warnings. There is some
discussion at https://github.com/numpy/numpy/pull/219.

The intention is to merge this soon, so now is the time to comment.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/06d2a4e1/attachment.html>

From lpc at cmu.edu  Sun Mar  4 17:18:40 2012
From: lpc at cmu.edu (Luis Pedro Coelho)
Date: Sun, 04 Mar 2012 22:18:40 +0000
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
Message-ID: <1375080.6SoljH8rGC@rabbit>

On Saturday, March 03, 2012 04:38:53 PM David Cournapeau wrote:
> I don't think the code is comparable either - some of the stuff done
> in the C code is done in the C++ code your are calling. The C code
> could be significantly improved.

Actually, that's not 100% accurate. The C code calls the same functions. Most 
of the extra cruft is that it needs to do all of this error checking and type-
dispatch, while in C++ you can have RAII and templates.

> Even more important here: almost none
> of this code should be written anymore anyway, C++ or not. This is
> really the kind of code that should be done in cython, as it is mostly
> about wrapping C code into the python C API.

At least last time I read up on it, cython was not able to do multi-type code, 
i.e., have code that works on arrays of multiple types. Does it support it 
now?

Best,
-- 
Luis Pedro Coelho | University of Lisbon | http://luispedro.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/90238296/attachment.sig>

From jswhit at fastmail.fm  Sun Mar  4 20:57:24 2012
From: jswhit at fastmail.fm (Jeff Whitaker)
Date: Sun, 04 Mar 2012 18:57:24 -0700
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <1375080.6SoljH8rGC@rabbit>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
Message-ID: <4F541D84.1050706@fastmail.fm>

On 3/4/12 3:18 PM, Luis Pedro Coelho wrote:
> On Saturday, March 03, 2012 04:38:53 PM David Cournapeau wrote:
>> I don't think the code is comparable either - some of the stuff done
>> in the C code is done in the C++ code your are calling. The C code
>> could be significantly improved.
> Actually, that's not 100% accurate. The C code calls the same functions. Most
> of the extra cruft is that it needs to do all of this error checking and type-
> dispatch, while in C++ you can have RAII and templates.
>
>> Even more important here: almost none
>> of this code should be written anymore anyway, C++ or not. This is
>> really the kind of code that should be done in cython, as it is mostly
>> about wrapping C code into the python C API.
> At least last time I read up on it, cython was not able to do multi-type code,
> i.e., have code that works on arrays of multiple types. Does it support it
> now?
>
> Best,
Coming soon in version 0.16:

https://sage.math.washington.edu:8091/hudson/job/cython-docs/doclinks/1/src/userguide/fusedtypes.html

-Jeff


From matthew.brett at gmail.com  Sun Mar  4 23:08:51 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 4 Mar 2012 20:08:51 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
Message-ID: <CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>

Hi,

On Sun, Mar 4, 2012 at 11:41 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
>> > <charlesr.harris at gmail.com> wrote:
>> >>
>> >>
>> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <matthew.brett at gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Sorry that this report is not complete, I don't have full access to
>> >>> this box but, on a Debian squeeze machine running linux
>> >>> 2.6.32-5-sparc64-smp:
>> >>>
>> >>> nosetests
>> >>>
>> >>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>> >>>
>> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>> >>>
>> >>> This on current master : 1.7.0.dev-b9872b4
>> >>>
>> >>
>> >> Hmm, some tests might have been recently enabled. Any chance of doing a
>> >> bisection?
>>
>> Struggling because compilation is very slow and there are lots of
>> untestable commits. ?df907e6 is the first known bad. ?Here's the
>> output from a log:
>>
>> * df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
>> mode because of a static function in two separate files (6 months ago)
>> [Mark Wiebe]
>> * 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
>> BUG: nditer: The nditer was reusing the reduce loop inappropriately
>> (#1938) (6 months ago) [Mark Wiebe]
>> * f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
>> DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
>> * 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
>> BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
>> months ago) [Mark Wiebe]
>> * c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
>> TST: missingdata: Finish up NA mask tests for np.std and np.var (6
>> months ago) [Mark Wiebe]
>> * e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
>> BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
>> (6 months ago) [Mark Wiebe]
>> * ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
>> ENH: missingdata: Implement tests for np.std, add skipna= and
>> keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
>> * a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
>> ENH: missingdata: Implement skipna= support for np.std and np.var (6
>> months ago) [Mark Wiebe]
>> * 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
>> ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
>> ago) [Mark Wiebe]
>> * bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
>> ENH: missingdata: Create count_reduce_items function (6 months ago)
>> [Mark Wiebe]
>> * d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
>> ENH: missingdata: Move the Reduce boilerplate into a function
>> PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
>> * 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
>> ENH: missingdata: Finish count_nonzero as a full-fledged reduction
>> operation (6 months ago) [Mark Wiebe]
>> * 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
>> ENH: missingdata: Towards making count_nonzero a full-featured
>> reduction operation (6 months ago) [Mark Wiebe]
>> * a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
>> ENH: missingdata: Move some of the refactored reduction code into the
>> API (6 months ago) [Mark Wiebe]
>> * f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
>> ENH: missingdata: Change default to create NA-mask when NAs are in
>> lists (6 months ago) [Mark Wiebe]
>> * 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
>> ENH: missingdata: trying some more functions to see how they treat NAs
>> (6 months ago) [Mark Wiebe]
>> * b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
>> ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
>> months ago) [Mark Wiebe]
>> * ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
>> BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
>> (6 months ago) [Mark Wiebe]
>> * a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
>> BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
>> masks (6 months ago) [Mark Wiebe]
>> * 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
>> ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
>> months ago) [Mark Wiebe]
>> * 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
>> ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
>> always return a view (6 months ago) [Mark Wiebe]
>>
>> So - the problem arises somewhere between 99a21ef (good) and ?df907e6
>> (bad)
>>
>> There seems to be a compilation error for the skipped commits - here's
>> the one I tested, 9194b3a:
>
>
> If you enable separate compilation by setting an environment variable, these
> commits should build as well.
>
> $ export?ENABLE_SEPARATE_COMPILATION=1
> <proceed with builds>

I might be doing something wrong but:

In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
                 from numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
numpy/core/src/multiarray/datetime.c:34: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
'intp_abs'
numpy/core/src/multiarray/shape.c:927: note: previous definition of
'intp_abs' was here
In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
                 from numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
numpy/core/src/multiarray/datetime.c:34: warning: function declaration
isn't a prototype
In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
'intp_abs'
numpy/core/src/multiarray/shape.c:927: note: previous definition of
'intp_abs' was here
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
-O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
-Ibuild/src.linux-sparc64-2.6/numpy/core/include/numpy
-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray
-Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
-I/usr/include/python2.6
-Ibuild/src.linux-sparc64-2.6/numpy/core/src/multiarray
-Ibuild/src.linux-sparc64-2.6/numpy/core/src/umath -c
numpy/core/src/multiarray/multiarraymodule_onefile.c -o
build/temp.linux-sparc64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o"
failed with exit status 1

(bare-env)[matthew at vagus ~/dev_trees/numpy ((9194b3a...)|BISECTING)]$
echo $ENABLE_SEPARATE_COMPILATION
1

Best,

Matthew


From mwwiebe at gmail.com  Sun Mar  4 23:32:35 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Sun, 4 Mar 2012 22:32:35 -0600
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
Message-ID: <CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>

On Sun, Mar 4, 2012 at 10:08 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Sun, Mar 4, 2012 at 11:41 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> > On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett <matthew.brett at gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett <matthew.brett at gmail.com
> >
> >> wrote:
> >> > Hi,
> >> >
> >> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
> >> > <charlesr.harris at gmail.com> wrote:
> >> >>
> >> >>
> >> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett <
> matthew.brett at gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> Sorry that this report is not complete, I don't have full access to
> >> >>> this box but, on a Debian squeeze machine running linux
> >> >>> 2.6.32-5-sparc64-smp:
> >> >>>
> >> >>> nosetests
> >> >>>
> >> >>>
> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
> >> >>>
> >> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
> >> >>>
> >> >>> This on current master : 1.7.0.dev-b9872b4
> >> >>>
> >> >>
> >> >> Hmm, some tests might have been recently enabled. Any chance of
> doing a
> >> >> bisection?
> >>
> >> Struggling because compilation is very slow and there are lots of
> >> untestable commits.  df907e6 is the first known bad.  Here's the
> >> output from a log:
> >>
> >> * df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
> >> mode because of a static function in two separate files (6 months ago)
> >> [Mark Wiebe]
> >> * 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
> >> BUG: nditer: The nditer was reusing the reduce loop inappropriately
> >> (#1938) (6 months ago) [Mark Wiebe]
> >> * f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
> >> DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
> >> * 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
> >> BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
> >> months ago) [Mark Wiebe]
> >> * c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
> >> TST: missingdata: Finish up NA mask tests for np.std and np.var (6
> >> months ago) [Mark Wiebe]
> >> * e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
> >> BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
> >> (6 months ago) [Mark Wiebe]
> >> * ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
> >> ENH: missingdata: Implement tests for np.std, add skipna= and
> >> keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
> >> * a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
> >> ENH: missingdata: Implement skipna= support for np.std and np.var (6
> >> months ago) [Mark Wiebe]
> >> * 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
> >> ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
> >> ago) [Mark Wiebe]
> >> * bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
> >> ENH: missingdata: Create count_reduce_items function (6 months ago)
> >> [Mark Wiebe]
> >> * d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
> >> ENH: missingdata: Move the Reduce boilerplate into a function
> >> PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
> >> * 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
> >> ENH: missingdata: Finish count_nonzero as a full-fledged reduction
> >> operation (6 months ago) [Mark Wiebe]
> >> * 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
> >> ENH: missingdata: Towards making count_nonzero a full-featured
> >> reduction operation (6 months ago) [Mark Wiebe]
> >> * a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
> >> ENH: missingdata: Move some of the refactored reduction code into the
> >> API (6 months ago) [Mark Wiebe]
> >> * f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
> >> ENH: missingdata: Change default to create NA-mask when NAs are in
> >> lists (6 months ago) [Mark Wiebe]
> >> * 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
> >> ENH: missingdata: trying some more functions to see how they treat NAs
> >> (6 months ago) [Mark Wiebe]
> >> * b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
> >> ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
> >> months ago) [Mark Wiebe]
> >> * ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
> >> BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
> >> (6 months ago) [Mark Wiebe]
> >> * a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
> >> BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
> >> masks (6 months ago) [Mark Wiebe]
> >> * 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
> >> ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
> >> months ago) [Mark Wiebe]
> >> * 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
> >> ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
> >> always return a view (6 months ago) [Mark Wiebe]
> >>
> >> So - the problem arises somewhere between 99a21ef (good) and  df907e6
> >> (bad)
> >>
> >> There seems to be a compilation error for the skipped commits - here's
> >> the one I tested, 9194b3a:
> >
> >
> > If you enable separate compilation by setting an environment variable,
> these
> > commits should build as well.
> >
> > $ export ENABLE_SEPARATE_COMPILATION=1
> > <proceed with builds>
>
> I might be doing something wrong but:
>

I made a mistake, sorry! I even copy/pasted the variable name to make sure
I didn't misspell it, but I didn't notice that setup.py uses a different
but nearly identical name internally as the environment variable. That
should have been

$ export NPY_SEPARATE_COMPILATION=1

The place this is used inside of setup.py is here:

https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L14

I really dislike this build feature, it repeatedly trips me up. In my
opinion, the build should be changed to always do separate compilation, and
the single file mode should be eradicated.

Thanks ,
Mark


>
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> In file included from numpy/core/src/multiarray/scalartypes.c.src:25,
>                 from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:10:
> numpy/core/src/multiarray/_datetime.h:9: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:
> numpy/core/src/multiarray/datetime.c:34: warning: function declaration
> isn't a prototype
> In file included from
> numpy/core/src/multiarray/multiarraymodule_onefile.c:41:
> numpy/core/src/multiarray/nditer_constr.c:2373: error: redefinition of
> 'intp_abs'
> numpy/core/src/multiarray/shape.c:927: note: previous definition of
> 'intp_abs' was here
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv
> -O2 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include
> -Ibuild/src.linux-sparc64-2.6/numpy/core/include/numpy
> -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray
> -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
> -I/usr/include/python2.6
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/multiarray
> -Ibuild/src.linux-sparc64-2.6/numpy/core/src/umath -c
> numpy/core/src/multiarray/multiarraymodule_onefile.c -o
>
> build/temp.linux-sparc64-2.6/numpy/core/src/multiarray/multiarraymodule_onefile.o"
> failed with exit status 1
>
> (bare-env)[matthew at vagus ~/dev_trees/numpy ((9194b3a...)|BISECTING)]$
> echo $ENABLE_SEPARATE_COMPILATION
> 1
>
> Best,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/6c31c22d/attachment.html>

From matthew.brett at gmail.com  Mon Mar  5 01:34:56 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 4 Mar 2012 22:34:56 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
Message-ID: <CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>

Hi,

On Sun, Mar 4, 2012 at 8:32 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Sun, Mar 4, 2012 at 10:08 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sun, Mar 4, 2012 at 11:41 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> > On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett <matthew.brett at gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett
>> >> <matthew.brett at gmail.com>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris
>> >> > <charlesr.harris at gmail.com> wrote:
>> >> >>
>> >> >>
>> >> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett
>> >> >> <matthew.brett at gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> Sorry that this report is not complete, I don't have full access to
>> >> >>> this box but, on a Debian squeeze machine running linux
>> >> >>> 2.6.32-5-sparc64-smp:
>> >> >>>
>> >> >>> nosetests
>> >> >>>
>> >> >>>
>> >> >>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>> >> >>>
>> >> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>> >> >>>
>> >> >>> This on current master : 1.7.0.dev-b9872b4
>> >> >>>
>> >> >>
>> >> >> Hmm, some tests might have been recently enabled. Any chance of
>> >> >> doing a
>> >> >> bisection?
>> >>
>> >> Struggling because compilation is very slow and there are lots of
>> >> untestable commits. ?df907e6 is the first known bad. ?Here's the
>> >> output from a log:
>> >>
>> >> * df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build
>> >> mode because of a static function in two separate files (6 months ago)
>> >> [Mark Wiebe]
>> >> * 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046)
>> >> BUG: nditer: The nditer was reusing the reduce loop inappropriately
>> >> (#1938) (6 months ago) [Mark Wiebe]
>> >> * f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887)
>> >> DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
>> >> * 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28)
>> >> BUG: Some bugs in squeeze and concatenate found by testing SciPy (6
>> >> months ago) [Mark Wiebe]
>> >> * c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049)
>> >> TST: missingdata: Finish up NA mask tests for np.std and np.var (6
>> >> months ago) [Mark Wiebe]
>> >> * e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331)
>> >> BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue
>> >> (6 months ago) [Mark Wiebe]
>> >> * ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f)
>> >> ENH: missingdata: Implement tests for np.std, add skipna= and
>> >> keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
>> >> * a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226)
>> >> ENH: missingdata: Implement skipna= support for np.std and np.var (6
>> >> months ago) [Mark Wiebe]
>> >> * 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd)
>> >> ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months
>> >> ago) [Mark Wiebe]
>> >> * bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf)
>> >> ENH: missingdata: Create count_reduce_items function (6 months ago)
>> >> [Mark Wiebe]
>> >> * d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9)
>> >> ENH: missingdata: Move the Reduce boilerplate into a function
>> >> PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
>> >> * 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35)
>> >> ENH: missingdata: Finish count_nonzero as a full-fledged reduction
>> >> operation (6 months ago) [Mark Wiebe]
>> >> * 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62)
>> >> ENH: missingdata: Towards making count_nonzero a full-featured
>> >> reduction operation (6 months ago) [Mark Wiebe]
>> >> * a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e)
>> >> ENH: missingdata: Move some of the refactored reduction code into the
>> >> API (6 months ago) [Mark Wiebe]
>> >> * f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319)
>> >> ENH: missingdata: Change default to create NA-mask when NAs are in
>> >> lists (6 months ago) [Mark Wiebe]
>> >> * 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd)
>> >> ENH: missingdata: trying some more functions to see how they treat NAs
>> >> (6 months ago) [Mark Wiebe]
>> >> * b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75)
>> >> ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6
>> >> months ago) [Mark Wiebe]
>> >> * ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6)
>> >> BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it
>> >> (6 months ago) [Mark Wiebe]
>> >> * a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9)
>> >> BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA
>> >> masks (6 months ago) [Mark Wiebe]
>> >> * 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7)
>> >> ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6
>> >> months ago) [Mark Wiebe]
>> >> * 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a)
>> >> ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to
>> >> always return a view (6 months ago) [Mark Wiebe]
>> >>
>> >> So - the problem arises somewhere between 99a21ef (good) and ?df907e6
>> >> (bad)
>> >>
>> >> There seems to be a compilation error for the skipped commits - here's
>> >> the one I tested, 9194b3a:
>> >
>> >
>> > If you enable separate compilation by setting an environment variable,
>> > these
>> > commits should build as well.
>> >
>> > $ export?ENABLE_SEPARATE_COMPILATION=1
>> > <proceed with builds>
>>
>> I might be doing something wrong but:
>
>
> I made a mistake, sorry! I even copy/pasted the variable name to make sure I
> didn't misspell it, but I didn't notice that setup.py uses a different but
> nearly identical name internally as the environment variable. That should
> have been
>
> $ export?NPY_SEPARATE_COMPILATION=1

Thanks, that did it:

9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit

Let me know if I can debug further,

See you,

Matthew


From mwwiebe at gmail.com  Mon Mar  5 02:53:09 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Sun, 4 Mar 2012 23:53:09 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
Message-ID: <CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>

On Sun, Mar 4, 2012 at 10:34 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> <snip>
> > $ export NPY_SEPARATE_COMPILATION=1
>
> Thanks, that did it:
>
> 9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit
>
> Let me know if I can debug further,
>

That commit was a rewrite of np.concatenate, and I've traced the test
function you got the crash in. The only call to concatenate is as follows:

>>> a = np.array([True], dtype=object)
>>> np.concatenate((a,)*3)
array([True, True, True], dtype=object)
>>>

Can you try this and see if it crashes?

Another thing you can do is compile with debug information enabled, then
run the crashing case in gdb. This will look something like this:

$ *export CFLAGS=-g*
$ *rm -rf build* # make sure it's a fresh build from scratch
$ *python setup.py install --prefix=<dir>* # or however you do it
[... build printout]
$ *gdb python <script/options>*
[... gdb info]
(gdb) *run*
 [... script runs]
Program received signal SIGSEGV, Segmentation fault,
[crash address]
(gdb) *backtrace*
[full backtrace printout]

Such a backtrace would be immensely helpful in tracking down why it's
crashing.

Thanks,
Mark


> See you,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120304/80911a52/attachment.html>

From travis at continuum.io  Mon Mar  5 03:20:31 2012
From: travis at continuum.io (Travis Oliphant)
Date: Mon, 5 Mar 2012 02:20:31 -0600
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
Message-ID: <CECB0BFD-03D9-4B51-818B-7EA753A052CC@continuum.io>

> 
> The place this is used inside of setup.py is here:
> 
> https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L14
> 
> I really dislike this build feature, it repeatedly trips me up. In my opinion, the build should be changed to always do separate compilation, and the single file mode should be eradicated.
> 

Nobody *likes* this build "feature".   If it can be eradicated let's do it, but when I wanted to early on I couldn't for various reasons.   I'm not convinced these reasons have been addressed.   

The big problem I had (if I remember correctly) was getting the NumPy C-API to be exportable on all platforms (yet having the variables be static and hidden from the shared symbol-space) if it was assembled from multiple files.   The NumPy C-API uses a clever trick that was invented for Python back in the late 90s which was important for getting extensions that used the NumPy C-API to work on several platforms.  

If we can be assured that platforms that might have needed that are no longer needing NumPy (or there are work-arounds), then we can certainly get rid of a single-file build mode. 

-Travis

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/0955b16c/attachment.html>

From sole at esrf.fr  Mon Mar  5 08:26:02 2012
From: sole at esrf.fr (=?ISO-8859-1?Q?=22V=2E_Armando_Sol=E9=22?=)
Date: Mon, 05 Mar 2012 14:26:02 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the numpy
	C API
Message-ID: <4F54BEEA.4070503@esrf.fr>

Hello,

In 2009 there was a thread in this mailing list concerning the access to 
BLAS from C extension modules.

If I have properly understood the thread:

http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046567.html

the answer by then was that those functions were not exposed (only f2py 
functions).

I just wanted to know if the situation has changed since 2009 because it 
is not uncommon that to optimize some operations one has to sooner or 
later access BLAS functions that are already wrapped in numpy (either 
from ATLAS, from the Intel MKL, ...)

Thanks for your time,

Armando


From charlesr.harris at gmail.com  Mon Mar  5 08:50:16 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 5 Mar 2012 06:50:16 -0700
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CECB0BFD-03D9-4B51-818B-7EA753A052CC@continuum.io>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CECB0BFD-03D9-4B51-818B-7EA753A052CC@continuum.io>
Message-ID: <CAB6mnxK3ZRU0qw5NT_BDs_G_OtRbRo8_9q__wuhrG1PLGZzzHw@mail.gmail.com>

On Mon, Mar 5, 2012 at 1:20 AM, Travis Oliphant <travis at continuum.io> wrote:

>
> The place this is used inside of setup.py is here:
>
> https://github.com/numpy/numpy/blob/master/numpy/core/setup.py#L14
>
> I really dislike this build feature, it repeatedly trips me up. In my
> opinion, the build should be changed to always do separate compilation, and
> the single file mode should be eradicated.
>
>
> Nobody *likes* this build "feature".   If it can be eradicated let's do
> it, but when I wanted to early on I couldn't for various reasons.   I'm not
> convinced these reasons have been addressed.
>
>
This was David's contribution and wasn't put in, IIRC,  until the files
were broken up in 1.3. The reason it isn't standard is that it was
experimental at the time.


> The big problem I had (if I remember correctly) was getting the NumPy
> C-API to be exportable on all platforms (yet having the variables be static
> and hidden from the shared symbol-space) if it was assembled from multiple
> files.   The NumPy C-API uses a clever trick that was invented for Python
> back in the late 90s which was important for getting extensions that used
> the NumPy C-API to work on several platforms.
>
> If we can be assured that platforms that might have needed that are no
> longer needing NumPy (or there are work-arounds), then we can certainly get
> rid of a single-file build mode.
>
>
We should probably make is standard for a while to see if it causes
problems.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/a1e53730/attachment.html>

From cournape at gmail.com  Mon Mar  5 08:58:33 2012
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 5 Mar 2012 05:58:33 -0800
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <1375080.6SoljH8rGC@rabbit>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
Message-ID: <CAGY4rcV6q+vhqJdQC3Nh8BD-pgV_6e-neqWZ+_Xn3GP1=a5vew@mail.gmail.com>

On Sun, Mar 4, 2012 at 2:18 PM, Luis Pedro Coelho <lpc at cmu.edu> wrote:
> On Saturday, March 03, 2012 04:38:53 PM David Cournapeau wrote:
>> I don't think the code is comparable either - some of the stuff done
>> in the C code is done in the C++ code your are calling. The C code
>> could be significantly improved.
>
> Actually, that's not 100% accurate. The C code calls the same functions. Most
> of the extra cruft is that it needs to do all of this error checking and type-
> dispatch, while in C++ you can have RAII and templates.

But most of the code in the C one is not related to either. All the
size computation, etc? still needs to be done somewhere.

While I agree that RAII is by itself a very useful feature, it is not
that used in your example. IMO, all you are doing is comparing decent
C++ to awful C. That's not very interesting to me.

cheers,

David


From chaoyuejoy at gmail.com  Mon Mar  5 10:27:29 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Mon, 5 Mar 2012 16:27:29 +0100
Subject: [Numpy-discussion] verbose output when running python script?
Message-ID: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>

Dear all,

Sorry this is not the good place to ask but I think there must be someone
who has done this before.
Is there any way to see the execution of python script line by line as the
verbose model of shell script?
this can be done either in ipython or by running the python script in a
shell script?

thanks to all,

cheers,

Chao

-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/d58d81f0/attachment.html>

From josef.pktd at gmail.com  Mon Mar  5 10:47:26 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 5 Mar 2012 10:47:26 -0500
Subject: [Numpy-discussion] verbose output when running python script?
In-Reply-To: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
References: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
Message-ID: <CAMMTP+Dc9+h_Dr=2+FyZgKYpFjOB1GHMVMKiwfjddgu=hZFGnQ@mail.gmail.com>

On Mon, Mar 5, 2012 at 10:27 AM, Chao YUE <chaoyuejoy at gmail.com> wrote:
> Dear all,
>
> Sorry this is not the good place to ask but I think there must be someone
> who has done this before.
> Is there any way to see the execution of python script line by line as the
> verbose model of shell script?
> this can be done either in ipython or by running the python script in a
> shell script?

I'm using Spyder for this F9: run selection or current block
after selecting lines of code in the editor.

(just like matlab)

Josef

>
> thanks to all,
>
> cheers,
>
> Chao
>
> --
> ***********************************************************************************
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
> ************************************************************************************
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From chaoyuejoy at gmail.com  Mon Mar  5 10:51:42 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Mon, 5 Mar 2012 16:51:42 +0100
Subject: [Numpy-discussion] verbose output when running python script?
In-Reply-To: <CAMMTP+Dc9+h_Dr=2+FyZgKYpFjOB1GHMVMKiwfjddgu=hZFGnQ@mail.gmail.com>
References: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
	<CAMMTP+Dc9+h_Dr=2+FyZgKYpFjOB1GHMVMKiwfjddgu=hZFGnQ@mail.gmail.com>
Message-ID: <CAAN-aRFMakTa7Go2Gnq0k_HWhqtaua8HcZ77WGcQcny+e+JcuA@mail.gmail.com>

thanks. I am using ubuntu (for my local computer) and on our server we have
only ipython or shell... but I think I can try your suggestion on my own
computer.

Chao

2012/3/5 <josef.pktd at gmail.com>

> On Mon, Mar 5, 2012 at 10:27 AM, Chao YUE <chaoyuejoy at gmail.com> wrote:
> > Dear all,
> >
> > Sorry this is not the good place to ask but I think there must be someone
> > who has done this before.
> > Is there any way to see the execution of python script line by line as
> the
> > verbose model of shell script?
> > this can be done either in ipython or by running the python script in a
> > shell script?
>
> I'm using Spyder for this F9: run selection or current block
> after selecting lines of code in the editor.
>
> (just like matlab)
>
> Josef
>
> >
> > thanks to all,
> >
> > cheers,
> >
> > Chao
> >
> > --
> >
> ***********************************************************************************
> > Chao YUE
> > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> > UMR 1572 CEA-CNRS-UVSQ
> > Batiment 712 - Pe 119
> > 91191 GIF Sur YVETTE Cedex
> > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
> >
> ************************************************************************************
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/fb7de8f2/attachment.html>

From takowl at gmail.com  Mon Mar  5 11:34:43 2012
From: takowl at gmail.com (Thomas Kluyver)
Date: Mon, 5 Mar 2012 16:34:43 +0000
Subject: [Numpy-discussion] verbose output when running python script?
In-Reply-To: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
References: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
Message-ID: <CAOvn4qj2fTGK8gU6iBKUFrR=KDjtfXOzMeLUry+cKXKhxYSfJw@mail.gmail.com>

On 5 March 2012 15:27, Chao YUE <chaoyuejoy at gmail.com> wrote:

> Sorry this is not the good place to ask but I think there must be someone
> who has done this before.
> Is there any way to see the execution of python script line by line as the
> verbose model of shell script?
> this can be done either in ipython or by running the python script in a
> shell script?


Depending on what you want this for, you might look into IPython's demo
mode:
http://ipython.org/ipython-doc/stable/interactive/reference.html#interactive-demos-with-ipython

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/358f725b/attachment.html>

From njs at pobox.com  Mon Mar  5 11:37:53 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 5 Mar 2012 16:37:53 +0000
Subject: [Numpy-discussion] verbose output when running python script?
In-Reply-To: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
References: <CAAN-aRHRgZzTxPAKzrt1X-=DiO6H3nxKP4LOz=7Xfchq1X0ZuA@mail.gmail.com>
Message-ID: <CAPJVwB=M1fmrA+=rNnJG0uoj8BzfeAumnifFzvvCYMi47XnHMQ@mail.gmail.com>

Try the trace module in the standard library:
http://docs.python.org/library/trace.html
http://www.doughellmann.com/PyMOTW/trace/

- Nathaniel
On Mar 5, 2012 3:27 PM, "Chao YUE" <chaoyuejoy at gmail.com> wrote:

> Dear all,
>
> Sorry this is not the good place to ask but I think there must be someone
> who has done this before.
> Is there any way to see the execution of python script line by line as the
> verbose model of shell script?
> this can be done either in ipython or by running the python script in a
> shell script?
>
> thanks to all,
>
> cheers,
>
> Chao
>
> --
>
> ***********************************************************************************
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> ************************************************************************************
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/57b0a01a/attachment.html>

From langton at gmail.com  Mon Mar  5 13:17:59 2012
From: langton at gmail.com (Asher Langton)
Date: Mon, 5 Mar 2012 10:17:59 -0800
Subject: [Numpy-discussion] Followup on Python+MPI import performance
Message-ID: <CAB079H+kfzYNDqzaKGUNZmgzThHn=irvmc3Rx58dPy-CUEjvpQ@mail.gmail.com>

This is a followup to my post from January
(http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html)
and the panel discussion at PyData this weekend. As a few people have
suggested, a better approach than the MPI-broadcasted lookups is to
cache the locations of all the modules found in sys.path.

I previously claimed the the PEP 302 finders/loaders wouldn't work
here because the finder is selected by a module's path and filename,
at which point the damage is already done. At the PyData panel, Guido
countered that PEP 302 does indeed provide the necessary machinery for
implementing the 'right' solution. The trick is to use sys.meta_path.
(Thanks to Travis for pointing me in the direction of sys.meta_path,
and Dag for helping me work through the details.)

Here's an example demonstrating the use of sys.meta_path:

import os
# Simple finder/loader that pretends to load module 'foo'
class foo(object):
    def find_module(self,fullname,path=None):
        if fullname == "bar":
            return self
        return None

    def load_module(self,fullname):
        if fullname == "bar":
            return os
        raise ImportError("This shouldn't happen!")

if __name__ == "__main__":
    import sys
    sys.meta_path.append(foo())
    import bar # actually the os module
    print bar.getcwd()

To eliminate the import bottleneck, the finder/loader just needs to
traverse sys.path, make a dict mapping modules to their location in
the filesystem, and 'claim responsibility' for those modules in
find_module(). Building (and maintaining, when sys.path changes) this
dict, even if each process does it independently, shouldn't be much
worse than the traversal required by a single import statement. We
could even subclass the finder/loader so that the dict construction is
done by only one process and the result broadcasted over MPI, though
that probably isn't necessary.

I'll put an initial implementation of this importer on github sometime
this week, and I'll follow up this post with some performance numbers
when I have them.

-Asher


From matthew.brett at gmail.com  Mon Mar  5 14:11:32 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 5 Mar 2012 11:11:32 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
Message-ID: <CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>

Hi,

On Sun, Mar 4, 2012 at 11:53 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Sun, Mar 4, 2012 at 10:34 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> <snip>
>> > $ export?NPY_SEPARATE_COMPILATION=1
>>
>> Thanks, that did it:
>>
>> 9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit
>>
>> Let me know if I can debug further,
>
>
> That commit was a rewrite of np.concatenate, and I've traced the test
> function you got the crash in. The only call to concatenate is as follows:
>
>>>> a = np.array([True], dtype=object)
>>>> np.concatenate((a,)*3)
> array([True, True, True], dtype=object)
>>>>
>
> Can you try this and see if it crashes?

No, that doesn't crash.

Further investigation revealed the crash to be:

(bare-env)[matthew at vagus ~]$ nosetests
~/dev_trees/numpy/numpy/lib/tests/test_io.py:TestFromTxt.test_with_masked_column_various
nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
Test masked column ... Bus error

Accordingly:

In [1]: import numpy as np

In [2]: from StringIO import StringIO

In [3]: data = StringIO('True 2 3\nFalse 5 6\n')

In [4]: test = np.genfromtxt(data, dtype=None, missing_values='2,5',
usemask=True)

In [6]: from numpy import ma

In [7]: control = ma.array([(1, 2, 3), (0, 5, 6)], mask=[(0, 1, 0),
(0, 1, 0)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])

In [8]: test == control
Bus error

> Another thing you can do is compile with debug information enabled, then run
> the crashing case in gdb. This will look something like this:
>
> $ export CFLAGS=-g
> $ rm -rf build?# make sure it's a fresh build from scratch
> $ python setup.py install --prefix=<dir> # or however you do it
> [... build printout]
> $ gdb python <script/options>
> [... gdb info]
> (gdb) run
> ?[... script runs]
> Program received signal SIGSEGV, Segmentation fault,
> [crash address]
> (gdb) backtrace
> [full backtrace printout]
>
> Such a backtrace would be immensely helpful in tracking down why it's
> crashing.

I'll get back to you with this.

See you,

Matthew


From ndbecker2 at gmail.com  Mon Mar  5 14:14:02 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 05 Mar 2012 14:14:02 -0500
Subject: [Numpy-discussion] all elements equal
Message-ID: <jj339q$aas$1@dough.gmane.org>

What is a simple, efficient way to determine if all elements in an array (in my 
case, 1D) are equal?  How about close?


From kwgoodman at gmail.com  Mon Mar  5 14:19:44 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 11:19:44 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <jj339q$aas$1@dough.gmane.org>
References: <jj339q$aas$1@dough.gmane.org>
Message-ID: <CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>

On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
> What is a simple, efficient way to determine if all elements in an array (in my
> case, 1D) are equal? ?How about close?

For the exactly equal case, how about:

I[1] a = np.array([1,1,1,1])
I[2] np.unique(a).size
O[2] 1    # All equal

I[3] a = np.array([1,1,1,2])
I[4] np.unique(a).size
O[4] 2   # All not equal


From ndbecker2 at gmail.com  Mon Mar  5 14:24:07 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 05 Mar 2012 14:24:07 -0500
Subject: [Numpy-discussion] all elements equal
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
Message-ID: <jj33sn$aas$2@dough.gmane.org>

Keith Goodman wrote:

> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> What is a simple, efficient way to determine if all elements in an array (in
>> my case, 1D) are equal?  How about close?
> 
> For the exactly equal case, how about:
> 
> I[1] a = np.array([1,1,1,1])
> I[2] np.unique(a).size
> O[2] 1    # All equal
> 
> I[3] a = np.array([1,1,1,2])
> I[4] np.unique(a).size
> O[4] 2   # All not equal

I considered this - just not sure if it's the most efficient


From zachary.pincus at yale.edu  Mon Mar  5 14:28:41 2012
From: zachary.pincus at yale.edu (Zachary Pincus)
Date: Mon, 5 Mar 2012 14:28:41 -0500
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
Message-ID: <37A7C00A-85CA-48DC-AC1A-3E471DA8E494@yale.edu>

How about the following?
exact: numpy.all(a == a[0])
inexact: numpy.allclose(a, a[0])

On Mar 5, 2012, at 2:19 PM, Keith Goodman wrote:

> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> What is a simple, efficient way to determine if all elements in an array (in my
>> case, 1D) are equal?  How about close?
> 
> For the exactly equal case, how about:
> 
> I[1] a = np.array([1,1,1,1])
> I[2] np.unique(a).size
> O[2] 1    # All equal
> 
> I[3] a = np.array([1,1,1,2])
> I[4] np.unique(a).size
> O[4] 2   # All not equal
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From kwgoodman at gmail.com  Mon Mar  5 14:29:13 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 11:29:13 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <jj33sn$aas$2@dough.gmane.org>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
Message-ID: <CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>

On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
> Keith Goodman wrote:
>
>> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
>>> What is a simple, efficient way to determine if all elements in an array (in
>>> my case, 1D) are equal? ?How about close?
>>
>> For the exactly equal case, how about:
>>
>> I[1] a = np.array([1,1,1,1])
>> I[2] np.unique(a).size
>> O[2] 1 ? ?# All equal
>>
>> I[3] a = np.array([1,1,1,2])
>> I[4] np.unique(a).size
>> O[4] 2 ? # All not equal
>
> I considered this - just not sure if it's the most efficient

Yeah, it is slow:

I[1] a = np.ones(100000)
I[2] timeit np.unique(a).size
1000 loops, best of 3: 1.56 ms per loop
I[3] timeit (a == a[0]).all()
1000 loops, best of 3: 203 us per loop

I think all() short-circuits for bool arrays:

I[4] a[1] = 9
I[5] timeit (a == a[0]).all()
10000 loops, best of 3: 89 us per loop

You could avoid making the bool array by writing a function in cython.
It could grab the first array element and then return False as soon as
it finds an element that is not equal to it. And you could check for
closeness.

Or:

I[8] np.allclose(a, a[0])
O[8] False
I[9] a = np.ones(100000)
I[10] np.allclose(a, a[0])
O[10] True


From jdh2358 at gmail.com  Mon Mar  5 14:32:40 2012
From: jdh2358 at gmail.com (John Hunter)
Date: Mon, 5 Mar 2012 13:32:40 -0600
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
Message-ID: <CAGD8yY8FOOQL2ehq6zcG+YjRGrkerbKqOHn73AJWidf7-5mDhQ@mail.gmail.com>

On Mon, Mar 5, 2012 at 1:29 PM, Keith Goodman <kwgoodman at gmail.com> wrote:

>
> I[8] np.allclose(a, a[0])
> O[8] False
> I[9] a = np.ones(100000)
> I[10] np.allclose(a, a[0])
> O[10] True
>
>
One disadvantage of using a[0] as a proxy is that the result depends on the
ordering of a

  (a.max() - a.min()) < epsilon

is an alternative that avoids this.  Another good use case for a minmax
func.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/8787ddf7/attachment.html>

From shish at keba.be  Mon Mar  5 14:33:23 2012
From: shish at keba.be (Olivier Delalleau)
Date: Mon, 5 Mar 2012 14:33:23 -0500
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
Message-ID: <CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>

Le 5 mars 2012 14:29, Keith Goodman <kwgoodman at gmail.com> a ?crit :

> On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
> > Keith Goodman wrote:
> >
> >> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com>
> wrote:
> >>> What is a simple, efficient way to determine if all elements in an
> array (in
> >>> my case, 1D) are equal?  How about close?
> >>
> >> For the exactly equal case, how about:
> >>
> >> I[1] a = np.array([1,1,1,1])
> >> I[2] np.unique(a).size
> >> O[2] 1    # All equal
> >>
> >> I[3] a = np.array([1,1,1,2])
> >> I[4] np.unique(a).size
> >> O[4] 2   # All not equal
> >
> > I considered this - just not sure if it's the most efficient
>
> Yeah, it is slow:
>
> I[1] a = np.ones(100000)
> I[2] timeit np.unique(a).size
> 1000 loops, best of 3: 1.56 ms per loop
> I[3] timeit (a == a[0]).all()
> 1000 loops, best of 3: 203 us per loop
>
> I think all() short-circuits for bool arrays:
>
> I[4] a[1] = 9
> I[5] timeit (a == a[0]).all()
> 10000 loops, best of 3: 89 us per loop
>
> You could avoid making the bool array by writing a function in cython.
> It could grab the first array element and then return False as soon as
> it finds an element that is not equal to it. And you could check for
> closeness.
>
> Or:
>
> I[8] np.allclose(a, a[0])
> O[8] False
> I[9] a = np.ones(100000)
> I[10] np.allclose(a, a[0])
> O[10] True
>

Looks like the following is even faster:
np.max(a) == np.min(a)

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/9ef1b217/attachment.html>

From josef.pktd at gmail.com  Mon Mar  5 14:36:44 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 5 Mar 2012 14:36:44 -0500
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
Message-ID: <CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>

On Mon, Mar 5, 2012 at 2:33 PM, Olivier Delalleau <shish at keba.be> wrote:
> Le 5 mars 2012 14:29, Keith Goodman <kwgoodman at gmail.com> a ?crit :
>
>> On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> > Keith Goodman wrote:
>> >
>> >> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2 at gmail.com>
>> >> wrote:
>> >>> What is a simple, efficient way to determine if all elements in an
>> >>> array (in
>> >>> my case, 1D) are equal? ?How about close?
>> >>
>> >> For the exactly equal case, how about:
>> >>
>> >> I[1] a = np.array([1,1,1,1])
>> >> I[2] np.unique(a).size
>> >> O[2] 1 ? ?# All equal
>> >>
>> >> I[3] a = np.array([1,1,1,2])
>> >> I[4] np.unique(a).size
>> >> O[4] 2 ? # All not equal
>> >
>> > I considered this - just not sure if it's the most efficient
>>
>> Yeah, it is slow:
>>
>> I[1] a = np.ones(100000)
>> I[2] timeit np.unique(a).size
>> 1000 loops, best of 3: 1.56 ms per loop
>> I[3] timeit (a == a[0]).all()
>> 1000 loops, best of 3: 203 us per loop
>>
>> I think all() short-circuits for bool arrays:
>>
>> I[4] a[1] = 9
>> I[5] timeit (a == a[0]).all()
>> 10000 loops, best of 3: 89 us per loop
>>
>> You could avoid making the bool array by writing a function in cython.
>> It could grab the first array element and then return False as soon as
>> it finds an element that is not equal to it. And you could check for
>> closeness.
>>
>> Or:
>>
>> I[8] np.allclose(a, a[0])
>> O[8] False
>> I[9] a = np.ones(100000)
>> I[10] np.allclose(a, a[0])
>> O[10] True
>
>
> Looks like the following is even faster:
> np.max(a) == np.min(a)

How about numpy.ptp, to follow this line? I would expect it's single
pass, but wouldn't short circuit compared to cython of Keith

Josef


>
> -=- Olivier
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From kwgoodman at gmail.com  Mon Mar  5 14:44:38 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 11:44:38 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
Message-ID: <CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>

On Mon, Mar 5, 2012 at 11:36 AM,  <josef.pktd at gmail.com> wrote:
> How about numpy.ptp, to follow this line? I would expect it's single
> pass, but wouldn't short circuit compared to cython of Keith

I[1] a = np.ones(100000)
I[2] timeit (a == a[0]).all()
1000 loops, best of 3: 203 us per loop
I[3] timeit a.min() == a.max()
10000 loops, best of 3: 106 us per loop
I[4] timeit np.ptp(a)
10000 loops, best of 3: 106 us per loop

I[5] a[1] = 9
I[6] timeit (a == a[0]).all()
10000 loops, best of 3: 89.7 us per loop
I[7] timeit a.min() == a.max()
10000 loops, best of 3: 102 us per loop
I[8] timeit np.ptp(a)
10000 loops, best of 3: 103 us per loop


From ben.root at ou.edu  Mon Mar  5 14:52:04 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Mon, 5 Mar 2012 13:52:04 -0600
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
Message-ID: <CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>

On Mon, Mar 5, 2012 at 1:44 PM, Keith Goodman <kwgoodman at gmail.com> wrote:

> On Mon, Mar 5, 2012 at 11:36 AM,  <josef.pktd at gmail.com> wrote:
> > How about numpy.ptp, to follow this line? I would expect it's single
> > pass, but wouldn't short circuit compared to cython of Keith
>
> I[1] a = np.ones(100000)
> I[2] timeit (a == a[0]).all()
> 1000 loops, best of 3: 203 us per loop
> I[3] timeit a.min() == a.max()
> 10000 loops, best of 3: 106 us per loop
> I[4] timeit np.ptp(a)
> 10000 loops, best of 3: 106 us per loop
>
> I[5] a[1] = 9
> I[6] timeit (a == a[0]).all()
> 10000 loops, best of 3: 89.7 us per loop
> I[7] timeit a.min() == a.max()
> 10000 loops, best of 3: 102 us per loop
> I[8] timeit np.ptp(a)
> 10000 loops, best of 3: 103 us per loop
>

Another issue to watch out for is if the array is empty.  Technically
speaking, that should be True, but some of the solutions offered so far
would fail in this case.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/a5ec606d/attachment.html>

From kwgoodman at gmail.com  Mon Mar  5 14:56:56 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 11:56:56 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
	<CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
Message-ID: <CAB6Y536N6+2Jkf+bwAieQXXpPtqGwcRdS_+5zV+i1ViHDbA9Zw@mail.gmail.com>

On Mon, Mar 5, 2012 at 11:52 AM, Benjamin Root <ben.root at ou.edu> wrote:
> Another issue to watch out for is if the array is empty.? Technically
> speaking, that should be True, but some of the solutions offered so far
> would fail in this case.

Good point.

For fun, here's the speed of a simple cython allclose:

I[2] a = np.ones(100000)
I[3] timeit a.min() == a.max()
10000 loops, best of 3: 106 us per loop
I[4] timeit allequal(a)
10000 loops, best of 3: 68.9 us per loop

I[5] a[1] = 9
I[6] timeit a.min() == a.max()
10000 loops, best of 3: 102 us per loop
I[7] timeit allequal(a)
1000000 loops, best of 3: 269 ns per loop

where

@cython.boundscheck(False)
@cython.wraparound(False)
def allequal(np.ndarray[np.float64_t, ndim=1] a):
    cdef:
        np.float64_t a0
        Py_ssize_t i, n=a.size
    a0 = a[0]
    for i in range(n):
        if a[i] != a0:
            return False
    return True


From brett.olsen at gmail.com  Mon Mar  5 15:01:52 2012
From: brett.olsen at gmail.com (Brett Olsen)
Date: Mon, 5 Mar 2012 14:01:52 -0600
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
	<CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
Message-ID: <CAFq1z2WZojA4iW4Zq_ho5ZQGV88j75csXiX2fpM=fGV6RNAN6w@mail.gmail.com>

> Another issue to watch out for is if the array is empty.? Technically
> speaking, that should be True, but some of the solutions offered so far
> would fail in this case.

Similarly, NaNs or Infs could cause problems:  they should signal as
False, but several of the solutions would return True.

~Brett


From ndbecker2 at gmail.com  Mon Mar  5 15:06:23 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 05 Mar 2012 15:06:23 -0500
Subject: [Numpy-discussion] all elements equal
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
	<CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
	<CAB6Y536N6+2Jkf+bwAieQXXpPtqGwcRdS_+5zV+i1ViHDbA9Zw@mail.gmail.com>
Message-ID: <jj36bv$3ic$1@dough.gmane.org>

Keith Goodman wrote:

> On Mon, Mar 5, 2012 at 11:52 AM, Benjamin Root <ben.root at ou.edu> wrote:
>> Another issue to watch out for is if the array is empty.  Technically
>> speaking, that should be True, but some of the solutions offered so far
>> would fail in this case.
> 
> Good point.
> 
> For fun, here's the speed of a simple cython allclose:
> 
> I[2] a = np.ones(100000)
> I[3] timeit a.min() == a.max()
> 10000 loops, best of 3: 106 us per loop
> I[4] timeit allequal(a)
> 10000 loops, best of 3: 68.9 us per loop
> 
> I[5] a[1] = 9
> I[6] timeit a.min() == a.max()
> 10000 loops, best of 3: 102 us per loop
> I[7] timeit allequal(a)
> 1000000 loops, best of 3: 269 ns per loop
> 
> where
> 
> @cython.boundscheck(False)
> @cython.wraparound(False)
> def allequal(np.ndarray[np.float64_t, ndim=1] a):
>     cdef:
>         np.float64_t a0
>         Py_ssize_t i, n=a.size
>     a0 = a[0]
>     for i in range(n):
>         if a[i] != a0:
>             return False
>     return True

But doesn't this one fail on empty array?


From kwgoodman at gmail.com  Mon Mar  5 15:12:41 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 12:12:41 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <jj36bv$3ic$1@dough.gmane.org>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
	<CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
	<CAB6Y536N6+2Jkf+bwAieQXXpPtqGwcRdS_+5zV+i1ViHDbA9Zw@mail.gmail.com>
	<jj36bv$3ic$1@dough.gmane.org>
Message-ID: <CAB6Y534DJb8eR0-FuW9uQA2+zmAeGzYnZcwAc6sE_VPwdVABkg@mail.gmail.com>

On Mon, Mar 5, 2012 at 12:06 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
> But doesn't this one fail on empty array?

Yes. I'm optimizing for fun, not for corner cases. This should work
for size zero and NaNs:

@cython.boundscheck(False)
@cython.wraparound(False)
def allequal(np.ndarray[np.float64_t, ndim=1] a):
    cdef:
        np.float64_t a0
        Py_ssize_t i, n=a.size
    if n == 0:
        return False # Or would you like True?
    a0 = a[0]
    for i in range(n):
        if a[i] != a0:
            return False
    return True


From kwgoodman at gmail.com  Mon Mar  5 15:30:05 2012
From: kwgoodman at gmail.com (Keith Goodman)
Date: Mon, 5 Mar 2012 12:30:05 -0800
Subject: [Numpy-discussion] all elements equal
In-Reply-To: <CAB6Y534DJb8eR0-FuW9uQA2+zmAeGzYnZcwAc6sE_VPwdVABkg@mail.gmail.com>
References: <jj339q$aas$1@dough.gmane.org>
	<CAB6Y534Fzc1C+iDo-kvYQayjg3aLMSsVHA-3QRv4xmT=7Yg-LQ@mail.gmail.com>
	<jj33sn$aas$2@dough.gmane.org>
	<CAB6Y535QRAPaJpxJ45f3RsUKJaCGEaRAFz_-53=4chryrt4yLw@mail.gmail.com>
	<CAFXk4bpyCq=2zbZQAiRBMOw7RnWTb26kmyyiKkd7f84oMrKwxQ@mail.gmail.com>
	<CAMMTP+C5pbW1LZjkDbXXn8bi-uYHsrkxbuD9BXZo7e=vvayfxw@mail.gmail.com>
	<CAB6Y534LzSRaPbhi_4gM4m5pqaZ3njPCXoEfKR7QC=McsiXk+w@mail.gmail.com>
	<CANNq6F=ggn0MyEU=8R+kcSfE=QEcsin2nsu-_+h-VpUBQTXP_w@mail.gmail.com>
	<CAB6Y536N6+2Jkf+bwAieQXXpPtqGwcRdS_+5zV+i1ViHDbA9Zw@mail.gmail.com>
	<jj36bv$3ic$1@dough.gmane.org>
	<CAB6Y534DJb8eR0-FuW9uQA2+zmAeGzYnZcwAc6sE_VPwdVABkg@mail.gmail.com>
Message-ID: <CAB6Y537YfbJu1jAGMpLYV0tp5fN9q+-PWQVmRfeW6wxHgfNLqA@mail.gmail.com>

On Mon, Mar 5, 2012 at 12:12 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Mon, Mar 5, 2012 at 12:06 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> But doesn't this one fail on empty array?
>
> Yes. I'm optimizing for fun, not for corner cases. This should work
> for size zero and NaNs:
>
> @cython.boundscheck(False)
> @cython.wraparound(False)
> def allequal(np.ndarray[np.float64_t, ndim=1] a):
> ? ?cdef:
> ? ? ? ?np.float64_t a0
> ? ? ? ?Py_ssize_t i, n=a.size
> ? ?if n == 0:
> ? ? ? ?return False # Or would you like True?
> ? ?a0 = a[0]
> ? ?for i in range(n):
> ? ? ? ?if a[i] != a0:
> ? ? ? ? ? ?return False
> ? ?return True

Sorry for all the posts. I'll go back to being quiet. Seems like
np.allclose returns True for empty arrays:

I[2] a = np.array([])
I[3] np.allclose(np.array([]), np.array([]))
O[3] True

The original allequal cython code did the same:

I[4] allequal(a)
O[4] True


From matthew.brett at gmail.com  Mon Mar  5 20:44:03 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 5 Mar 2012 17:44:03 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
Message-ID: <CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>

Hi,

On Mon, Mar 5, 2012 at 11:11 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Sun, Mar 4, 2012 at 11:53 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> On Sun, Mar 4, 2012 at 10:34 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>>>
>>> <snip>
>>> > $ export?NPY_SEPARATE_COMPILATION=1
>>>
>>> Thanks, that did it:
>>>
>>> 9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit
>>>
>>> Let me know if I can debug further,
>>
>>
>> That commit was a rewrite of np.concatenate, and I've traced the test
>> function you got the crash in. The only call to concatenate is as follows:
>>
>>>>> a = np.array([True], dtype=object)
>>>>> np.concatenate((a,)*3)
>> array([True, True, True], dtype=object)
>>>>>
>>
>> Can you try this and see if it crashes?
>
> No, that doesn't crash.
>
> Further investigation revealed the crash to be:
>
> (bare-env)[matthew at vagus ~]$ nosetests
> ~/dev_trees/numpy/numpy/lib/tests/test_io.py:TestFromTxt.test_with_masked_column_various
> nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
> Test masked column ... Bus error
>
> Accordingly:
>
> In [1]: import numpy as np
>
> In [2]: from StringIO import StringIO
>
> In [3]: data = StringIO('True 2 3\nFalse 5 6\n')
>
> In [4]: test = np.genfromtxt(data, dtype=None, missing_values='2,5',
> usemask=True)
>
> In [6]: from numpy import ma
>
> In [7]: control = ma.array([(1, 2, 3), (0, 5, 6)], mask=[(0, 1, 0),
> (0, 1, 0)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])
>
> In [8]: test == control
> Bus error
>
>> Another thing you can do is compile with debug information enabled, then run
>> the crashing case in gdb. This will look something like this:
>>
>> $ export CFLAGS=-g
>> $ rm -rf build?# make sure it's a fresh build from scratch
>> $ python setup.py install --prefix=<dir> # or however you do it
>> [... build printout]
>> $ gdb python <script/options>
>> [... gdb info]
>> (gdb) run
>> ?[... script runs]
>> Program received signal SIGSEGV, Segmentation fault,
>> [crash address]
>> (gdb) backtrace
>> [full backtrace printout]
>>
>> Such a backtrace would be immensely helpful in tracking down why it's
>> crashing.
>
> I'll get back to you with this.

<atest.py>
from StringIO import StringIO

import numpy as np
from numpy import ma

data = StringIO('True 2 3\nFalse 5 6\n')
test = np.genfromtxt(data, dtype=None, missing_values='2,5',
                     usemask=True)
control = ma.array([(1, 2, 3), (0, 5, 6)], mask=[(0, 1, 0), (0, 1, 0)],
                   dtype=[('f0', bool), ('f1', bool), ('f2', int)])
test == control
</atest.py

gdb python
run atest.py

backtrace full attached.

See you,

Matthew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sparc_bus_error.log
Type: text/x-log
Size: 13329 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/6a584123/attachment.bin>

From matthew.brett at gmail.com  Mon Mar  5 21:39:49 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 5 Mar 2012 18:39:49 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
	<CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>
Message-ID: <CAH6Pt5qc2VAtOg=ZweBQDkH6qK+h+rxGUjwNiC7E5XCJj=b-NQ@mail.gmail.com>

And simplifying:

In [1]: import numpy as np

In [2]: control = np.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0',
bool), ('f1', bool), ('f2', int)])

In [3]: control == control
Out[3]: array([ True,  True], dtype=bool)

In [4]: from numpy import ma

In [5]: control = ma.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0',
bool), ('f1', bool), ('f2', int)])

In [6]: control == control
Bus error

Cheers,

Matthew


From cournape at gmail.com  Mon Mar  5 22:07:30 2012
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 5 Mar 2012 22:07:30 -0500
Subject: [Numpy-discussion] dtype comparison, hash
In-Reply-To: <CAF6FJiu_Y=tKyNuggU6RXQK8tU6-GPJJkrOpz_DDs1Fxkw8U=g@mail.gmail.com>
References: <878vlyu7uq.fsf@ding.tiker.net>
	<CAF6FJit4zMEibGi6NACKUNCOZh5viucqB=Pf5+S1CvHDMLqdvA@mail.gmail.com>
	<87wr9drios.fsf@ding.tiker.net>
	<CAF6FJiuPGdg06LO-7B5v5gb741-t-OtN3H2Rn7h_HzcHUfkOMg@mail.gmail.com>
	<87ty3u7w29.fsf@ding.tiker.net>
	<CAF6FJiu_Y=tKyNuggU6RXQK8tU6-GPJJkrOpz_DDs1Fxkw8U=g@mail.gmail.com>
Message-ID: <CAGY4rcWi2Yy9Ka7A=-g6vu2KzEwa25vHfjvMY=XdBVkC=sj31w@mail.gmail.com>

On Tue, Jan 17, 2012 at 9:28 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Tue, Jan 17, 2012 at 05:11, Andreas Kloeckner
> <lists at informa.tiker.net> wrote:
>> Hi Robert,
>>
>> On Fri, 30 Dec 2011 20:05:14 +0000, Robert Kern <robert.kern at gmail.com> wrote:
>>> On Fri, Dec 30, 2011 at 18:57, Andreas Kloeckner
>>> <lists at informa.tiker.net> wrote:
>>> > Hi Robert,
>>> >
>>> > On Tue, 27 Dec 2011 10:17:41 +0000, Robert Kern <robert.kern at gmail.com> wrote:
>>> >> On Tue, Dec 27, 2011 at 01:22, Andreas Kloeckner
>>> >> <lists at informa.tiker.net> wrote:
>>> >> > Hi all,
>>> >> >
>>> >> > Two questions:
>>> >> >
>>> >> > - Are dtypes supposed to be comparable (i.e. implement '==', '!=')?
>>> >>
>>> >> Yes.
>>> >>
>>> >> > - Are dtypes supposed to be hashable?
>>> >>
>>> >> Yes, with caveats. Strictly speaking, we violate the condition that
>>> >> objects that equal each other should hash equal since we define == to
>>> >> be rather free. Namely,
>>> >>
>>> >> ? np.dtype(x) == x
>>> >>
>>> >> for all objects x that can be converted to a dtype.
>>> >>
>>> >> ? np.dtype(float) == np.dtype('float')
>>> >> ? np.dtype(float) == float
>>> >> ? np.dtype(float) == 'float'
>>> >>
>>> >> Since hash(float) != hash('float') we cannot implement
>>> >> np.dtype.__hash__() to follow the stricture that objects that compare
>>> >> equal should hash equal.
>>> >>
>>> >> However, if you restrict the domain of objects to just dtypes (i.e.
>>> >> only consider dicts that use only actual dtype objects as keys instead
>>> >> of arbitrary mixtures of objects), then the stricture is obeyed. This
>>> >> is a useful domain that is used internally in numpy.
>>> >>
>>> >> Is this the problem that you found?
>>> >
>>> > Thanks for the reply.
>>> >
>>> > It doesn't seem like this is our issue--instead, we're encountering two
>>> > different dtype objects that claim to be float64, compare as equal, but
>>> > don't hash to the same value.
>>> >
>>> > I've asked the user who encountered the user to investigate, and I'll
>>> > be back with more detail in a bit.
>>>
>>> I think we've run into this before and tried to fix it. Try to find
>>> the version of numpy the user has and a minimal example, if you can.
>>
>> This is what Thomas found:
>>
>> http://projects.scipy.org/numpy/ticket/2017
>
> It looks like the .flags attribute is different between np.uintp and
> np.uint32. The .flags attribute forms part of the hashed information
> about the dtype (or PyArray_Descr at the C-level).
>
> [~]
> |15> np.dtype(np.uintp).flags
> 1536
>
> [~]
> |16> np.dtype(np.uint32).flags
> 2048
>
> The same goes for np.intp and np.int32 in numpy 1.6.1 on OS X, so
> unlike the comment in the ticket, they do have different hashes for
> me.
>
> However, diving through the source a bit, I'm not entirely sure I
> trust the values being given at the Python level. It appears that the
> flag member of the PyArray_Descr struct is declared as a char.
> However, it is exposed as a T_INT member in the PyMemberDef table by
> direct addressing. Basically, a Python descriptor gets added to the
> np.dtype type that will look up sizeof(long) bytes from the starting
> position of the flags member in the struct. This includes 3 bytes of
> the following type_num member. Obviously, 2048 does not fit into a
> char. Nonetheless, the type_num is also part of the hash, so either
> the flags member or the type_num member is different between the two.
>
> Two bugs for the price of one!

Good catch !

So basically, the flag was changed from a char to an int back to a
char, and some of the code did not follow.

I could not really follow the exact history from the log alone, but basically:
  - there is indeed a char vs int discrepency (T_INT vs char)
  - in most dtype functions handling the flag variable, temporary
computation were made with an int (but every possible flag combination
can fit in a char)
  - quite a few usage of "i" instead of "c" in PyArg_ParseTuple and
PyBuild_Value.

Even after all those things, the original bug is there, because uintp
and uin32 have different typenum, even in 32 bits. I would actually
consider this a big in PyArray_EquivTypes, but changing this now may
be quite disrupting. Shall I remove type_num from the hash input (in
which case the bug would be fixed) ?

David


From travis at continuum.io  Mon Mar  5 22:24:59 2012
From: travis at continuum.io (Travis Oliphant)
Date: Mon, 5 Mar 2012 21:24:59 -0600
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
Message-ID: <5A619A64-67CB-40D9-85BA-F058576E21D3@continuum.io>

I'm not sure if it's been re-hashed again or not, but this is the sort of error that showed up all the time while debugging NumPy on Sparc hard-ware.   The problem is that memory has to be aligned before calculations take place.    With this kind of dtype, field 'f2' will be aligned on a 2-byte boundary but likely must be aligned on a 4-byte boundary before any kind of comparison operation is done. 

Multiple times, this bug was fixed by just looking hard at the code again and making sure that memory copies were performed prior to calculation if the data is not aligned.    

Thanks for the error reports and for all the hard work in tracking this down and squashing it.  

-Travis


On Mar 5, 2012, at 1:11 PM, Matthew Brett wrote:

> Hi,
> 
> On Sun, Mar 4, 2012 at 11:53 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> On Sun, Mar 4, 2012 at 10:34 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>>> 
>>> <snip>
>>>> $ export NPY_SEPARATE_COMPILATION=1
>>> 
>>> Thanks, that did it:
>>> 
>>> 9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit
>>> 
>>> Let me know if I can debug further,
>> 
>> 
>> That commit was a rewrite of np.concatenate, and I've traced the test
>> function you got the crash in. The only call to concatenate is as follows:
>> 
>>>>> a = np.array([True], dtype=object)
>>>>> np.concatenate((a,)*3)
>> array([True, True, True], dtype=object)
>>>>> 
>> 
>> Can you try this and see if it crashes?
> 
> No, that doesn't crash.
> 
> Further investigation revealed the crash to be:
> 
> (bare-env)[matthew at vagus ~]$ nosetests
> ~/dev_trees/numpy/numpy/lib/tests/test_io.py:TestFromTxt.test_with_masked_column_various
> nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
> Test masked column ... Bus error
> 
> Accordingly:
> 
> In [1]: import numpy as np
> 
> In [2]: from StringIO import StringIO
> 
> In [3]: data = StringIO('True 2 3\nFalse 5 6\n')
> 
> In [4]: test = np.genfromtxt(data, dtype=None, missing_values='2,5',
> usemask=True)
> 
> In [6]: from numpy import ma
> 
> In [7]: control = ma.array([(1, 2, 3), (0, 5, 6)], mask=[(0, 1, 0),
> (0, 1, 0)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])
> 
> In [8]: test == control
> Bus error
> 
>> Another thing you can do is compile with debug information enabled, then run
>> the crashing case in gdb. This will look something like this:
>> 
>> $ export CFLAGS=-g
>> $ rm -rf build # make sure it's a fresh build from scratch
>> $ python setup.py install --prefix=<dir> # or however you do it
>> [... build printout]
>> $ gdb python <script/options>
>> [... gdb info]
>> (gdb) run
>>  [... script runs]
>> Program received signal SIGSEGV, Segmentation fault,
>> [crash address]
>> (gdb) backtrace
>> [full backtrace printout]
>> 
>> Such a backtrace would be immensely helpful in tracking down why it's
>> crashing.
> 
> I'll get back to you with this.
> 
> See you,
> 
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From cournape at gmail.com  Mon Mar  5 22:53:05 2012
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 5 Mar 2012 22:53:05 -0500
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
	ABI vs pickling issue
Message-ID: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>

Hi,

This is following the discussion on bug
http://projects.scipy.org/numpy/ticket/2017

Essentially, there is a discrepency between the actual type of the
flags member in the dtype C structure (char) and how it is declared in
the descriptor table (T_INT). The problem is that we are damned if we
fix it, damned if we are not:
  - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks
pickled numpy arrays
  - not fixing it means that the value is broken at the python level.
Changing the flag to be an actual int would break the ABI.

I currently have this: https://github.com/numpy/numpy/pull/228

Which one is the more appropriate ? Maybe a deprecation warning for
1.7 and change it in 1.8 ?

Note that we can fix the hash issue with either solution,

David


From mwwiebe at gmail.com  Mon Mar  5 23:04:02 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Mon, 5 Mar 2012 20:04:02 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5qc2VAtOg=ZweBQDkH6qK+h+rxGUjwNiC7E5XCJj=b-NQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
	<CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>
	<CAH6Pt5qc2VAtOg=ZweBQDkH6qK+h+rxGUjwNiC7E5XCJj=b-NQ@mail.gmail.com>
Message-ID: <CAMRnEmpY6ibpEF16dzVBuEL+8641Z5rmWKNCprNoAtzpaii4RQ@mail.gmail.com>

I've pushed a bugfix to github, can you confirm that the crash goes away on
your test box? Thanks for tracking that down, the stack trace was very
helpful. Since x86 machines don't have as strict alignment requirements,
bugs like this one will generally remain undetected until someone tests on
an architecture like sparc.

Cheers,
Mark

On Mon, Mar 5, 2012 at 6:39 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> And simplifying:
>
> In [1]: import numpy as np
>
> In [2]: control = np.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0',
> bool), ('f1', bool), ('f2', int)])
>
> In [3]: control == control
> Out[3]: array([ True,  True], dtype=bool)
>
> In [4]: from numpy import ma
>
> In [5]: control = ma.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0',
> bool), ('f1', bool), ('f2', int)])
>
> In [6]: control == control
> Bus error
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120305/22acf97c/attachment.html>

From matthew.brett at gmail.com  Tue Mar  6 00:53:30 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 5 Mar 2012 21:53:30 -0800
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAMRnEmpY6ibpEF16dzVBuEL+8641Z5rmWKNCprNoAtzpaii4RQ@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
	<CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>
	<CAH6Pt5qc2VAtOg=ZweBQDkH6qK+h+rxGUjwNiC7E5XCJj=b-NQ@mail.gmail.com>
	<CAMRnEmpY6ibpEF16dzVBuEL+8641Z5rmWKNCprNoAtzpaii4RQ@mail.gmail.com>
Message-ID: <CAH6Pt5qJiMxw1i2NvTGCemAO0Xp8SGFB9dBrNDDwtetP7AMOew@mail.gmail.com>

Hi,

On Mon, Mar 5, 2012 at 8:04 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> I've pushed a bugfix to github, can you confirm that the crash goes away on
> your test box? Thanks for tracking that down, the stack trace was very
> helpful. Since x86 machines don't have as strict alignment requirements,
> bugs like this one will generally remain undetected until someone tests on
> an architecture like sparc.

Thanks - no Bus error.  For your enjoyment: there were some failures
and errors from numpy.test("full")

======================================================================
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose([1e-08, 1,
1000020.0000000099], [0, nan, 1000000.0])
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/nose-1.1.2-py2.6.egg/nose/case.py",
line 197, in runTest
    self.test(*self.arg)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_numeric.py",
line 1288, in tst_isclose_allclose
    assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/numeric.py",
line 2020, in allclose
    return all(less_equal(absolute(x-y), atol + rtol * absolute(y)))
RuntimeWarning: invalid value encountered in absolute

======================================================================
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose(nan, [nan, nan, nan])
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/nose-1.1.2-py2.6.egg/nose/case.py",
line 197, in runTest
    self.test(*self.arg)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_numeric.py",
line 1288, in tst_isclose_allclose
    assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/numeric.py",
line 2020, in allclose
    return all(less_equal(absolute(x-y), atol + rtol * absolute(y)))
RuntimeWarning: invalid value encountered in absolute

======================================================================
ERROR: Test a special case for var
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/tests/test_core.py",
line 2725, in test_varstd_specialcases
    _ = method(out=nout)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py",
line 4778, in std
    dvar = sqrt(dvar)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py",
line 849, in __call__
    m |= self.domain(d)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py",
line 801, in __call__
    return umath.less(x, self.critical_value)
RuntimeWarning: invalid value encountered in less

======================================================================
FAIL: test_complex_dtype_repr (test_dtype.TestString)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_dtype.py",
line 401, in test_complex_dtype_repr
    "dtype([('a', '<M8[D]'), ('b', '<m8[us]')])")
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/testing/utils.py",
line 313, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: "dtype([('a', '>M8[D]'), ('b', '>m8[us]')])"
 DESIRED: "dtype([('a', '<M8[D]'), ('b', '<m8[us]')])"

======================================================================
FAIL: test_complex_dtype_str (test_dtype.TestString)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_dtype.py",
line 353, in test_complex_dtype_str
    "[('a', '<m8[D]'), ('b', '<M8[us]')]")
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/testing/utils.py",
line 313, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: "[('a', '>m8[D]'), ('b', '>M8[us]')]"
 DESIRED: "[('a', '<m8[D]'), ('b', '<M8[us]')]"

======================================================================
FAIL: test_kind.TestKind.test_all
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/nose-1.1.2-py2.6.egg/nose/case.py",
line 197, in runTest
    self.test(*self.arg)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/f2py/tests/test_kind.py",
line 30, in test_all
    'selectedrealkind(%s): expected %r but got %r' %  (i,
selected_real_kind(i), selectedrealkind(i)))
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/testing/utils.py",
line 34, in assert_
    raise AssertionError(msg)
AssertionError: selectedrealkind(16): expected 10 but got 16

----------------------------------------------------------------------
Ran 3693 tests in 362.416s

FAILED (KNOWNFAIL=3, SKIP=6, errors=3, failures=3)
Running unit tests for numpy
NumPy version 1.7.0.dev-7c07089
NumPy is installed in
/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy
Python version 2.6.6 (r266:84292, Dec 26 2010, 23:29:26) [GCC 4.4.5
20100913 (prerelease)]
nose version 1.1.2

Thanks,

Matthew


From robert.kern at gmail.com  Tue Mar  6 06:20:28 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 6 Mar 2012 11:20:28 +0000
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
 ABI vs pickling issue
In-Reply-To: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
Message-ID: <CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>

On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape at gmail.com> wrote:
> Hi,
>
> This is following the discussion on bug
> http://projects.scipy.org/numpy/ticket/2017
>
> Essentially, there is a discrepency between the actual type of the
> flags member in the dtype C structure (char) and how it is declared in
> the descriptor table (T_INT). The problem is that we are damned if we
> fix it, damned if we are not:
> ?- fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks
> pickled numpy arrays

Is the problem that T_BYTE returns a single-item string? My
handwrapping skills are rusty (Cython has blissfully eradicated this
from my memory), but aren't these T_INT/T_BYTE things just convenient
shortcuts for exposing C struct members as Python attributes? Couldn't
we just write a full getter function for returning the correct value,
just returned as a Python int instead of a str.

-- 
Robert Kern


From ppmime at gmail.com  Tue Mar  6 07:00:38 2012
From: ppmime at gmail.com (=?ISO-8859-1?Q?Jose_Miguel_Ib=E1=F1ez?=)
Date: Tue, 6 Mar 2012 13:00:38 +0100
Subject: [Numpy-discussion] Data cube optimization for combination
Message-ID: <CAFUEvSMgTXjGrAvUr2AKKua-RS3e_knYe3Bs-UR=o3=Za-4EmA@mail.gmail.com>

Hello everyone,

does anyone know of an efficient implementation (maybe using
numpy.where statement) of the next code for data cube (3d array)
combining ?

import numpy as np

def combine( )

  cube = np.random.rand(32,2048,2048)
  result = np.zeros([2048,2048], np.float32)

   for ii in range(2048):
       for jj in range(2048):
            result[, ii, jj] = np.sqrt((cube[:,ii, jj])).sum()


It takes long time to run, however,


>> result = np.median(cube,0)


only around one second ! where is the point ? any suggestions ?


Thanks !


From sebastian at sipsolutions.net  Tue Mar  6 07:16:00 2012
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 06 Mar 2012 13:16:00 +0100
Subject: [Numpy-discussion] Data cube optimization for combination
In-Reply-To: <CAFUEvSMgTXjGrAvUr2AKKua-RS3e_knYe3Bs-UR=o3=Za-4EmA@mail.gmail.com>
References: <CAFUEvSMgTXjGrAvUr2AKKua-RS3e_knYe3Bs-UR=o3=Za-4EmA@mail.gmail.com>
Message-ID: <1331036160.16978.29.camel@sebastian-laptop>

Hello,

On Tue, 2012-03-06 at 13:00 +0100, Jose Miguel Ib??ez wrote:
> Hello everyone,
> 
> does anyone know of an efficient implementation (maybe using
> numpy.where statement) of the next code for data cube (3d array)
> combining ?
> 
You use the axis keyword/argument to sum, at which point you want to
cast (if you do) to float32 I don't know.
result = np.sqrt(cube).sum(axis=0)

> import numpy as np
> 
> def combine( )
> 
>   cube = np.random.rand(32,2048,2048)
>   result = np.zeros([2048,2048], np.float32)
> 
>    for ii in range(2048):
>        for jj in range(2048):
>             result[, ii, jj] = np.sqrt((cube[:,ii, jj])).sum()
> 
> 
> It takes long time to run, however,
> 
> 
> >> result = np.median(cube,0)
> 
> 
> only around one second ! where is the point ? any suggestions ?
> 
> 
> 
> Thanks !
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 


From klemm at phys.ethz.ch  Tue Mar  6 07:20:21 2012
From: klemm at phys.ethz.ch (Hanno Klemm)
Date: Tue, 6 Mar 2012 13:20:21 +0100
Subject: [Numpy-discussion] Data cube optimization for combination
In-Reply-To: <CAFUEvSMgTXjGrAvUr2AKKua-RS3e_knYe3Bs-UR=o3=Za-4EmA@mail.gmail.com>
References: <CAFUEvSMgTXjGrAvUr2AKKua-RS3e_knYe3Bs-UR=o3=Za-4EmA@mail.gmail.com>
Message-ID: <EC3CCD41-90E8-49EE-8A23-F0AE30B1A230@phys.ethz.ch>


Hi,

this should work:

import numpy as np

ndim = 20

cube = np.random.rand(32,ndim, ndim)
result = np.zeros([ndim, ndim], np.float32)

def combine(cube, result):

     for ii in range(ndim):
         for jj in range(ndim):
             result[ii, jj] = np.sqrt((cube[:,ii, jj])).sum()
     return result

def combine2(cube, result):

     r2 = np.sqrt(cube)
     r3 = r2.sum(axis=0)
     return r3

r3 = combine2(cube,result)
r1 = combine(cube, result)

print np.allclose(r3, r1)

When I time it in ipython with ndim=20, I get:

In [8]: timeit combine2(cube, result)
1000 loops, best of 3: 246 us per loop

In [9]: timeit combine(cube, result)
100 loops, best of 3: 3.86 ms per loop


Best regards,
Hanno


Am 06.03.2012 um 13:00 schrieb Jose Miguel Ib??ez:

> Hello everyone,
>
> does anyone know of an efficient implementation (maybe using
> numpy.where statement) of the next code for data cube (3d array)
> combining ?
>
> import numpy as np
>
> def combine( )
>
>  cube = np.random.rand(32,2048,2048)
>  result = np.zeros([2048,2048], np.float32)
>
>   for ii in range(2048):
>       for jj in range(2048):
>            result[, ii, jj] = np.sqrt((cube[:,ii, jj])).sum()
>
>
> It takes long time to run, however,
>
>
>>> result = np.median(cube,0)
>
>
> only around one second ! where is the point ? any suggestions ?
>
>
>
> Thanks !
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From pierre.haessig at crans.org  Tue Mar  6 08:48:00 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Tue, 06 Mar 2012 14:48:00 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
Message-ID: <4F561590.3000400@crans.org>

Hi Mark,

I went through the NA NEP a few days ago, but only too quickly so that
my question is probably a rather dumb one. It's about the usability of
bitpatter-based NAs, based on your recent post :

Le 03/03/2012 22:46, Mark Wiebe a ?crit :
> Also, here's a thought for the usability of NA-float64. As much as
> global state is a bad idea, something which determines whether
> implicit float dtypes are NA-float64 or float64 could help. In
> IPython, "pylab" mode would default to float64, and "statlab" or
> "pystat" would default to NA-float64. One way to write this might be:
>
> >>> np.set_default_float(np.nafloat64)
> >>> np.array([1.0, 2.0, 3.0])
> array([ 1.,  2.,  3.], dtype=nafloat64)
> >>> np.set_default_float(np.float64)
> >>> np.array([1.0, 2.0, 3.0])
> array([ 1.,  2.,  3.], dtype=float64)

Q: Is is an *absolute* necessity to have two separate dtypes "nafloatNN"
and "floatNN" to enable NA bitpattern storage ?

From a potential user perspective, I feel it would be nice to have NA
and non-NA cases look as similar as possible. Your code example is
particularly striking : two different dtypes to store (from a user
perspective) the exact same content ! If this *could* be avoided, it
would be great...

I don't know how the NA machinery is working R. Does it works with a
kind of "nafloat64" all the time or is there some type inference
mechanics involved in choosing the appropriate type ?

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/e5471ff4/attachment.sig>

From cournape at gmail.com  Tue Mar  6 11:07:31 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 6 Mar 2012 11:07:31 -0500
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
 ABI vs pickling issue
In-Reply-To: <CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
	<CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
Message-ID: <CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>

On Tue, Mar 6, 2012 at 6:20 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape at gmail.com> wrote:
>> Hi,
>>
>> This is following the discussion on bug
>> http://projects.scipy.org/numpy/ticket/2017
>>
>> Essentially, there is a discrepency between the actual type of the
>> flags member in the dtype C structure (char) and how it is declared in
>> the descriptor table (T_INT). The problem is that we are damned if we
>> fix it, damned if we are not:
>> ?- fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks
>> pickled numpy arrays
>
> Is the problem that T_BYTE returns a single-item string?

Yes (although it is actually what we want, instead of an int).

> My handwrapping skills are rusty (Cython has blissfully eradicated this
> from my memory), but aren't these T_INT/T_BYTE things just convenient
> shortcuts for exposing C struct members as Python attributes? Couldn't
> we just write a full getter function for returning the correct value,
> just returned as a Python int instead of a str.

You're right, I did not think about this solution. That's certainly
better than the two I suggested.

Thanks,

David


From mwwiebe at gmail.com  Tue Mar  6 11:38:03 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Tue, 6 Mar 2012 08:38:03 -0800
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <4F561590.3000400@crans.org>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
Message-ID: <CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>

Hi Pierre,

On Tue, Mar 6, 2012 at 5:48 AM, Pierre Haessig <pierre.haessig at crans.org>wrote:

> Hi Mark,
>
> I went through the NA NEP a few days ago, but only too quickly so that
> my question is probably a rather dumb one. It's about the usability of
> bitpatter-based NAs, based on your recent post :
>
> Le 03/03/2012 22:46, Mark Wiebe a ?crit :
> > Also, here's a thought for the usability of NA-float64. As much as
> > global state is a bad idea, something which determines whether
> > implicit float dtypes are NA-float64 or float64 could help. In
> > IPython, "pylab" mode would default to float64, and "statlab" or
> > "pystat" would default to NA-float64. One way to write this might be:
> >
> > >>> np.set_default_float(np.nafloat64)
> > >>> np.array([1.0, 2.0, 3.0])
> > array([ 1.,  2.,  3.], dtype=nafloat64)
> > >>> np.set_default_float(np.float64)
> > >>> np.array([1.0, 2.0, 3.0])
> > array([ 1.,  2.,  3.], dtype=float64)
>
> Q: Is is an *absolute* necessity to have two separate dtypes "nafloatNN"
> and "floatNN" to enable NA bitpattern storage ?
>
> From a potential user perspective, I feel it would be nice to have NA
> and non-NA cases look as similar as possible. Your code example is
> particularly striking : two different dtypes to store (from a user
> perspective) the exact same content ! If this *could* be avoided, it
> would be great...
>

The biggest reason to keep the two types separate is performance. The
straight float dtypes map directly to hardware floating-point operations,
which can be very fast. The NA-float dtypes have to use additional logic to
handle the NA values correctly. NA is treated as a particular NaN, and if
the hardware float operations were used directly, NA would turn into NaN.
This additional logic usually means more branches, so is slower.

One possibility we could consider is to automatically convert an array's
dtype from "float64" to "nafloat64" the first time an NA is assigned. This
would have good performance when there are no NAs, but would transparently
switch on NA support when it's needed.


> I don't know how the NA machinery is working R. Does it works with a
> kind of "nafloat64" all the time or is there some type inference
> mechanics involved in choosing the appropriate type ?
>

My understanding of R is that it works with the "nafloat64" for all its
operations, yes.

Cheers,
Mark


> Best,
> Pierre
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/8461a35f/attachment.html>

From travis at continuum.io  Tue Mar  6 13:25:40 2012
From: travis at continuum.io (Travis Oliphant)
Date: Tue, 6 Mar 2012 10:25:40 -0800
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
	ABI vs pickling issue
In-Reply-To: <CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
	<CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
	<CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>
Message-ID: <10469DD0-B652-4161-9625-522A391706B2@continuum.io>

Why do we want to return a single string char instead of an int?

There is a need for more flags on the dtype object.   Using an actual attribute call seems like the way to go.  This could even merge the contents of two struct members so that we can add more flags but preserve ABI compatibility.

Travis 

--
Travis Oliphant
(on a mobile)
512-826-7480


On Mar 6, 2012, at 8:07 AM, David Cournapeau <cournape at gmail.com> wrote:

> On Tue, Mar 6, 2012 at 6:20 AM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape at gmail.com> wrote:
>>> Hi,
>>> 
>>> This is following the discussion on bug
>>> http://projects.scipy.org/numpy/ticket/2017
>>> 
>>> Essentially, there is a discrepency between the actual type of the
>>> flags member in the dtype C structure (char) and how it is declared in
>>> the descriptor table (T_INT). The problem is that we are damned if we
>>> fix it, damned if we are not:
>>>  - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks
>>> pickled numpy arrays
>> 
>> Is the problem that T_BYTE returns a single-item string?
> 
> Yes (although it is actually what we want, instead of an int).
> 
>> My handwrapping skills are rusty (Cython has blissfully eradicated this
>> from my memory), but aren't these T_INT/T_BYTE things just convenient
>> shortcuts for exposing C struct members as Python attributes? Couldn't
>> we just write a full getter function for returning the correct value,
>> just returned as a Python int instead of a str.
> 
> You're right, I did not think about this solution. That's certainly
> better than the two I suggested.
> 
> Thanks,
> 
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From robert.kern at gmail.com  Tue Mar  6 13:44:40 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 6 Mar 2012 18:44:40 +0000
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
 ABI vs pickling issue
In-Reply-To: <10469DD0-B652-4161-9625-522A391706B2@continuum.io>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
	<CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
	<CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>
	<10469DD0-B652-4161-9625-522A391706B2@continuum.io>
Message-ID: <CAF6FJitRuR=ekKAP4KD9NvcBFiko240teGmr1aNjQaC+wh3a5w@mail.gmail.com>

On Tue, Mar 6, 2012 at 18:25, Travis Oliphant <travis at continuum.io> wrote:
> Why do we want to return a single string char instead of an int?

I suspect just to ensure that any provided value fits in the range
0..255. But that's easily done explicitly.

-- 
Robert Kern


From cournape at gmail.com  Tue Mar  6 13:59:41 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 6 Mar 2012 13:59:41 -0500
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
 ABI vs pickling issue
In-Reply-To: <10469DD0-B652-4161-9625-522A391706B2@continuum.io>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
	<CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
	<CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>
	<10469DD0-B652-4161-9625-522A391706B2@continuum.io>
Message-ID: <CAGY4rcV2dF-edfR4y_SDm_03u4rhxLfjHG47MmiW0EQg73iq1w@mail.gmail.com>

On Tue, Mar 6, 2012 at 1:25 PM, Travis Oliphant <travis at continuum.io> wrote:
> Why do we want to return a single string char instead of an int?
>
> There is a need for more flags on the dtype object. ? Using an actual attribute call seems like the way to go. ?This could even merge the contents of two struct members so that we can add more flags but preserve ABI compatibility.

Yes. The T_BYTE/T_INT is actually pretty  minor compared to the
underlying issue (where we cast back and forth between int and char).
I will make a new PR that fixes everything but this exact point, and
will put an actual accessor if needed. Given that dtype.flags is
nonsensical as of today (at the python level), I would expect nobody
uses it.

cheers,

David


From sturla at molden.no  Tue Mar  6 14:57:29 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 06 Mar 2012 20:57:29 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy	C API
In-Reply-To: <4F54BEEA.4070503@esrf.fr>
References: <4F54BEEA.4070503@esrf.fr>
Message-ID: <4F566C29.3050209@molden.no>

On 05.03.2012 14:26, "V. Armando Sol?" wrote:

> In 2009 there was a thread in this mailing list concerning the access to
> BLAS from C extension modules.
>
> If I have properly understood the thread:
>
> http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046567.html
>
> the answer by then was that those functions were not exposed (only f2py
> functions).
>
> I just wanted to know if the situation has changed since 2009 because it
> is not uncommon that to optimize some operations one has to sooner or
> later access BLAS functions that are already wrapped in numpy (either
> from ATLAS, from the Intel MKL, ...)

Why do you want to do this? It does not make your life easier to use 
NumPy or SciPy's Python wrappers from C. Just use BLAS directly from C 
instead.

BLAS is a Fortran 77 library (although it might be implemented in C or 
assembly). Fortran 77 compilers do not use a predefined binary 
interface. f2py has knowledge about all the major compilers, and will 
generate different call statements depending on compiler. NumPy and 
SciPy does not use the standard C BLAS interface, nor is all of BLAS and 
LAPACK exposed.

You can, however, get a C function pointer to the part of BLAS and 
LAPACK that SciPy does expose:

import scipy as sp
from scipy.linalg import get_blas_funcs
DGEMM = get_blas_funcs('gemm', dtype=np.float64)

Now DGEMM._cpointer is a PyCObject that wraps the C function pointer. 
You can extract it by calling PyCObject_AsVoidPtr in C and cast the 
return value to the correct function pointer type. But be aware that you 
must know the binary interface of the underlying Fortran version.
Generally you can assume that all arguments to a Fortran 77 subroutine 
are pointers (character strings can be problematic).

For MKL you can let f2py generate code for the Intel Fortran compiler 
and use the same call statement in C. But then comes the question of 
legality: If you don't have a developer's license for MKL you are 
probably not allowed to use it like that. And if you do, you can just 
use the header files and the C BLAS interface instead.

Generally, I will recommend that you build GotoBLAS2 (or OpenBLAS) if 
you don't have a license for MKL -- or download ACML from AMD. You will 
in any case get a set of C headers you can use from C. In either case it 
is just extra work to hack into SciPy's BLAS functions using the 
_cpointer attribute, nor do you gain anything from doing it.


Sturla


From njs at pobox.com  Tue Mar  6 15:25:09 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 6 Mar 2012 20:25:09 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
Message-ID: <CAPJVwBn5od6OR+w3vp39E5BGX-nO9CWaFKixhPEn_CFwwiDaHA@mail.gmail.com>

On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant <travis at continuum.io> wrote:
> Hi all,

Hi Travis,

Thanks for bringing this back up.

Have you looked at the summary from the last thread?
  https://github.com/njsmith/numpy/wiki/NA-discussion-status
The goal was to try and at least work out what points we all *could*
agree on, to have some common footing for further discussion. I won't
copy the whole thing here, but I'd summarize the state as:
  -- It's pretty clear that there are two fairly different conceptual
models/use cases in play here. For one of them (R-style "missing data"
cases) it's pretty clear what the desired semantics would be. For the
other (temporary "ignored values") there's still substantive
disagreement.
  -- We *haven't* yet established what we want numpy to actually support.

IMHO the critical next step is this latter one -- maybe we want to
fully support both use cases. Maybe it's really only one of them
that's worth trying to support in the numpy core right now. Maybe it's
just one of them, but it's worth doing so thoroughly that it should
have multiple implementations. Or whatever.

I fear that if we don't talk about these big picture questions and
just wade directly back into round-and-round arguments about API
details then we'll never get anywhere.

[...]
> Because it is slated to go into release 1.7, we need to re-visit the masked array discussion again. ? ?The NEP process is the appropriate one and I'm glad we are taking that route for these discussions. ? My goal is to get consensus in order for code to get into NumPy (regardless of who writes the code). ? ?It may be that we don't come to a consensus (reasonable and intelligent people can disagree on things --- look at the coming election...). ? We can represent different parts of what is fortunately a very large user-base of NumPy users.
>
> First of all, I want to be clear that I think there is much great work that has been done in the current missing data code. ?There are some nice features in the where clause of the ufunc and the machinery for the iterator that allows re-using ufunc loops that are not re-written to check for missing data. ? I'm sure there are other things as well that I'm not quite aware of yet. ? ?However, I don't think the API presented to the numpy user presently is the correct one for NumPy 1.X.
>
> A few particulars:
>
> ? ? ? ?* the reduction operations need to default to "skipna" --- this is the most common use case which has been re-inforced again to me today by a new user to Python who is using masked arrays presently

This is one of the points where the two conceptual models disagree
(see also Skipper's point down-thread). If you have "missing data",
then propagation has to be the default -- the sum of 1, 2, and
I-DON'T-KNOW-MAYBE-7-MAYBE-12 is not 3. If you have some data there
but you've asked numpy to temporarily ignore it, then, well, duh, of
course it should ignore it.

> ? ? ? ?* the mask needs to be visible to the user if they use that approach to missing data (people should be able to get a hold of the mask and work with it in Python)

This is also a point where the two conceptual models disagree.

Actually this is one of the original arguments we made against the NEP
design -- that if you want missing data, then having a mask at all is
counterproductive, and if you are ignoring data, then of course it
should be easy to manipulate the ignore mask. The rationale for the
current design is to compromise between these two approaches -- there
is a mask, but it's hidden behind a curtain. Mostly. (This may be a
compromise in the Solomonic sense.)

> ? ? ? ?* bit-pattern approaches to missing data (at least for float64 and int32) need to be implemented.
>
> ? ? ? ?* there should be some way when using "masks" (even if it's hidden from most users) for missing data to separate the low-level ufunc operation from the operation
> ? ? ? ? ? on the masks...

I don't understand what this means.

> I have heard from several users that they will *not use the missing data* in NumPy as currently implemented, and I can now see why. ? ?For better or for worse, my approach to software is generally very user-driven and very pragmatic. ?On the other hand, I'm also a mathematician and appreciate the cognitive compression that can come out of well-formed structure. ? ?None-the-less, I'm an *applied* mathematician and am ultimately motivated by applications.
>
> I will get a hold of the NEP and spend some time with it to discuss some of this in that document. ? This will take several weeks (as PyCon is next week and I have a tutorial I'm giving there). ? ?For now, I do not think 1.7 can be released unless the masked array is labeled *experimental*.

In project management terms, I see three options:
1) Put a big warning label on the functionality and leave it for now
("If this option is given, np.asarray returns a masked array. NOTE: IN
THE NEXT RELEASE, IT MAY INSTEAD RETURN A BAG OF RABID, HUNGRY
WEASELS. NO GUARANTEES.")
2) Move the code back out of mainline and into a branch until until
there's consensus.
3) Hold up the release until this is all sorted.

I come from the project-management school that says you should always
have a releasable mainline, keep unready code in branches, and never
hold up the release for features, so (2) seems obvious to me. But I
seem to be very much in the minority on that[1], so oh well :-). I
don't have any objection to (1), personally. (3) seems like a bad
idea. Just my 2 pence.

-- Nathaniel

[1] See replies here:
http://thread.gmane.org/gmane.comp.python.numeric.general/46460/focus=46546


From cournape at gmail.com  Tue Mar  6 15:33:10 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 6 Mar 2012 15:33:10 -0500
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy C API
In-Reply-To: <4F566C29.3050209@molden.no>
References: <4F54BEEA.4070503@esrf.fr>
	<4F566C29.3050209@molden.no>
Message-ID: <CAGY4rcXzC0qLKOWegUD+FwyEKMaq0NHCO=xX7oM1y3wbsztqUA@mail.gmail.com>

On Tue, Mar 6, 2012 at 2:57 PM, Sturla Molden <sturla at molden.no> wrote:
> On 05.03.2012 14:26, "V. Armando Sol?" wrote:
>
>> In 2009 there was a thread in this mailing list concerning the access to
>> BLAS from C extension modules.
>>
>> If I have properly understood the thread:
>>
>> http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046567.html
>>
>> the answer by then was that those functions were not exposed (only f2py
>> functions).
>>
>> I just wanted to know if the situation has changed since 2009 because it
>> is not uncommon that to optimize some operations one has to sooner or
>> later access BLAS functions that are already wrapped in numpy (either
>> from ATLAS, from the Intel MKL, ...)
>
> Why do you want to do this? It does not make your life easier to use
> NumPy or SciPy's Python wrappers from C. Just use BLAS directly from C
> instead.

Of course it does make his life easier. This way he does not have to
distribute his own BLAS/LAPACK/etc...

Please stop presenting as truth things which are at best highly
opiniated. You already made such statements many times, and it is not
helpful at all.

David


From sturla at molden.no  Tue Mar  6 15:39:46 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 06 Mar 2012 21:39:46 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <30965644.bxiDOaWOx7@rabbit>
References: <30965644.bxiDOaWOx7@rabbit>
Message-ID: <4F567612.4050305@molden.no>

On 03.03.2012 17:07, Luis Pedro Coelho wrote:

> I sort of missed the big C++ discussion, but I'd like to give some examples of
> how writing code can become much simpler if you are based on C++. This is from
> my mahotas package, which has a thin C++ wrapper around numpy's C API

Here you are using NumPy arrays, not implementing them. That makes a big 
difference. Your code would be even simpler if you had used Cython or 
Fortran (f2py) instead of C++.

The only thing you have proved is that using NumPy arrays in C can be 
messy, which I suspect everybody knows already.

C is a good systems programming language, but it sucks for any kind of 
numerical computing. C++ sucks a little bit less. Using either for 
numerical programming usually a mistake. But NumPy core is not a case of 
numerical programming, it is a case of systems programming, for which C 
is actually very nice.

And no, unmaintainable C is NOT an example of "very good C", rather the 
opposite. This is a big mistake:

    "If you compare this with the equivalent scipy.ndimage function, 
which is very good C code (but mostly write-only?in fact, ndimage has 
not been maintainable..."

Another thing to observe is that the ndimage C code you cited has a 
major DRY violation: We have the best text (pre)processor there is to 
our disposal: Python. So why are there macros and switch statements for 
so many dtypes in there? As with C++ templates, generics in C is easy if 
we just let Python generate the type-specialized code we need. (Actually 
that is what NumPy's .src files do.) Generics is therefore not an 
argument for using C++. We can augment standard C with syntax suger for 
generics, or even NumPy arrays and Python types, using a Python script 
as meta-compiler. We don't need C++ for that.


Sturla


From matthieu.brucher at gmail.com  Tue Mar  6 15:45:35 2012
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 6 Mar 2012 21:45:35 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <4F567612.4050305@molden.no>
References: <30965644.bxiDOaWOx7@rabbit>
	<4F567612.4050305@molden.no>
Message-ID: <CAHCaCkL387Dp4krgFKE_+ay6CmqCCGf3mDtjdNu0i4AY1cmQvQ@mail.gmail.com>

Using either for
> numerical programming usually a mistake.
>

This is your opinion, but there are a lot of numerical code now in C++ and
they are far more maintainable than in Fortran. And they are faster for
exactly this reason.

Matthieu
-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/6756219f/attachment.html>

From sturla at molden.no  Tue Mar  6 15:54:29 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 06 Mar 2012 21:54:29 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <CAHCaCkL387Dp4krgFKE_+ay6CmqCCGf3mDtjdNu0i4AY1cmQvQ@mail.gmail.com>
References: <30965644.bxiDOaWOx7@rabbit>	<4F567612.4050305@molden.no>
	<CAHCaCkL387Dp4krgFKE_+ay6CmqCCGf3mDtjdNu0i4AY1cmQvQ@mail.gmail.com>
Message-ID: <4F567985.3020606@molden.no>

On 06.03.2012 21:45, Matthieu Brucher wrote:

> This is your opinion, but there are a lot of numerical code now in C++
> and they are far more maintainable than in Fortran. And they are faster
> for exactly this reason.

That is mostly because C++ makes tasks that are non-numerical easier.

But that is why we have Python.


Sturla


From matthieu.brucher at gmail.com  Tue Mar  6 15:58:29 2012
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 6 Mar 2012 21:58:29 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <4F567985.3020606@molden.no>
References: <30965644.bxiDOaWOx7@rabbit> <4F567612.4050305@molden.no>
	<CAHCaCkL387Dp4krgFKE_+ay6CmqCCGf3mDtjdNu0i4AY1cmQvQ@mail.gmail.com>
	<4F567985.3020606@molden.no>
Message-ID: <CAHCaCkKwMP9zeqa0kVCx_Ykb6uA5EJaTLGspAnyiyenjx-E4+Q@mail.gmail.com>

2012/3/6 Sturla Molden <sturla at molden.no>

> On 06.03.2012 21:45, Matthieu Brucher wrote:
>
> > This is your opinion, but there are a lot of numerical code now in C++
> > and they are far more maintainable than in Fortran. And they are faster
> > for exactly this reason.
>
> That is mostly because C++ makes tasks that are non-numerical easier.
>

I talk about numerical code, and you talk about non-numerical code. I stand
by my words. It is efficient and more robust than Fortran for everything,
including numerical code.

Matthieu
-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/0176ae8c/attachment.html>

From sturla at molden.no  Tue Mar  6 15:58:43 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 06 Mar 2012 21:58:43 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy C API
In-Reply-To: <CAGY4rcXzC0qLKOWegUD+FwyEKMaq0NHCO=xX7oM1y3wbsztqUA@mail.gmail.com>
References: <4F54BEEA.4070503@esrf.fr>	<4F566C29.3050209@molden.no>
	<CAGY4rcXzC0qLKOWegUD+FwyEKMaq0NHCO=xX7oM1y3wbsztqUA@mail.gmail.com>
Message-ID: <4F567A83.6080401@molden.no>

On 06.03.2012 21:33, David Cournapeau wrote:

> Of course it does make his life easier. This way he does not have to
> distribute his own BLAS/LAPACK/etc...
>
> Please stop presenting as truth things which are at best highly
> opiniated. You already made such statements many times, and it is not
> helpful at all.

I showed him how to grab those function pointers if he wants to.

Sturla


From njs at pobox.com  Tue Mar  6 15:59:03 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 6 Mar 2012 20:59:03 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
Message-ID: <CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>

On Tue, Mar 6, 2012 at 4:38 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Tue, Mar 6, 2012 at 5:48 AM, Pierre Haessig <pierre.haessig at crans.org>
> wrote:
>> >From a potential user perspective, I feel it would be nice to have NA
>> and non-NA cases look as similar as possible. Your code example is
>> particularly striking : two different dtypes to store (from a user
>> perspective) the exact same content ! If this *could* be avoided, it
>> would be great...
>
> The biggest reason to keep the two types separate is performance. The
> straight float dtypes map directly to hardware floating-point operations,
> which can be very fast. The NA-float dtypes have to use additional logic to
> handle the NA values correctly. NA is treated as a particular NaN, and if
> the hardware float operations were used directly, NA would turn into NaN.
> This additional logic usually means more branches, so is slower.

Actually, no -- hardware float operations preserve NA-as-NaN. You
might well need to be careful around more exotic code like optimized
BLAS kernels, but all the basic ufuncs should Just Work at full speed.
Demo:

>>> def hexify(x): return hex(np.float64(x).view(np.int64))
>>> hexify(np.nan)
'0x7ff8000000000000L'
# IIRC this is R's NA bitpattern (presumably 1974 is someone's birthday)
>>> NA = np.int64(0x7ff8000000000000 + 1974).view(np.float64)
# It is an NaN...
>>> NA
nan
# But it has a distinct bitpattern:
>>> hexify(NA)
'0x7ff80000000007b6L'
# Like any NaN, it propagates through floating point operations:
>>> NA + 3
nan
# But, critically, so does the bitpattern; ordinary Python "+" is
returning NA on this operation:
>>> hexify(NA + 3)
'0x7ff80000000007b6L'

This is how R does it, which is more evidence that this actually works
on real hardware.

There is one place where it fails. In a binary operation with *two*
NaN values, there's an ambiguity about which payload should be
returned. IEEE754 recommends just returning the first one. This means
that NA + NaN = NA, NaN + NA = NaN. This is ugly, but it's an obscure
case that nobody cares about, so it's probably worth it for the speed
gain. (In fact, if you type those two expressions at the R prompt,
then that's what you get, and I can't find any reference to anyone
even noticing this.)

>> I don't know how the NA machinery is working R. Does it works with a
>> kind of "nafloat64" all the time or is there some type inference
>> mechanics involved in choosing the appropriate type ?
>
> My understanding of R is that it works with the "nafloat64" for all its
> operations, yes.

Right -- R has a very impoverished type system as compared to numpy.
There's basically four types: "numeric" (meaning double precision
float), "integer", "logical" (boolean), and "character" (string). And
in practice the integer type is essentially unused, because R parses
numbers like "1" as being floating point, not integer; the only way to
get an integer value is to explicitly cast to it. Each of these types
has a specific bit-pattern set aside for representing NA. And...
that's it. It's very simple when it works, but also very limited.

I'm still skeptical that we could make the floating point types
NA-aware by default -- until we have an implementation in hand, I'm
nervous there'd be some corner case that broke everything. (Maybe
ufuncs are fine but np.dot has an unavoidable overhead, or maybe it
would mess up casting from float types to non-NA-aware types, etc.)
But who knows. Probably not something we can really make a meaningful
decision about yet.

-- Nathaniel


From denis.laxalde at mcgill.ca  Tue Mar  6 16:03:15 2012
From: denis.laxalde at mcgill.ca (Denis Laxalde)
Date: Tue, 6 Mar 2012 21:03:15 +0000 (UTC)
Subject: [Numpy-discussion] addition,
	multiplication of a polynomial and np.{float, int}
Message-ID: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>

Hi,

    >>> np.__version__
    '1.7.0.dev-7c07089'
    >>> p = np.poly1d([1,1])
    >>> p + 1.0
    poly1d([ 1.,  2.])
    >>> p + np.float64(1)
    poly1d([ 1.,  2.])
    >>> np.float64(1.0) + p
    array([ 2.,  2.])
    >>> np.int64(1) + p
    array([2, 2])
    >>> np.int(1) + p
    poly1d([1, 2])
    >>> np.int64(1)*p
    array([1, 1])

What's happening here? I'd expect operations with polynomials would
return a polynomial.

-- 
Denis


From ralf.gommers at googlemail.com  Tue Mar  6 16:14:00 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 6 Mar 2012 22:14:00 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwBn5od6OR+w3vp39E5BGX-nO9CWaFKixhPEn_CFwwiDaHA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAPJVwBn5od6OR+w3vp39E5BGX-nO9CWaFKixhPEn_CFwwiDaHA@mail.gmail.com>
Message-ID: <CABL7CQi1Yg0gQz7XiK=Zg2CVRPEh=R0pvToEzL8mdo9uM9uzpg@mail.gmail.com>

On Tue, Mar 6, 2012 at 9:25 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant <travis at continuum.io>
> wrote:
> > Hi all,
>
> Hi Travis,
>
> Thanks for bringing this back up.
>
> Have you looked at the summary from the last thread?
>  https://github.com/njsmith/numpy/wiki/NA-discussion-status
>

Re-reading that summary and the main documents and threads linked from it,
I could find either examples of statistical software that treats missing
and ignored data explicitly separately, or links to relevant literature.
Those would probably help the discussion a lot.

The goal was to try and at least work out what points we all *could*
> agree on, to have some common footing for further discussion. I won't
> copy the whole thing here, but I'd summarize the state as:
>  -- It's pretty clear that there are two fairly different conceptual
> models/use cases in play here. For one of them (R-style "missing data"
> cases) it's pretty clear what the desired semantics would be. For the
> other (temporary "ignored values") there's still substantive
> disagreement.
>  -- We *haven't* yet established what we want numpy to actually support.
>
> IMHO the critical next step is this latter one -- maybe we want to
> fully support both use cases. Maybe it's really only one of them
> that's worth trying to support in the numpy core right now. Maybe it's
> just one of them, but it's worth doing so thoroughly that it should
> have multiple implementations. Or whatever.
>
> I fear that if we don't talk about these big picture questions and
> just wade directly back into round-and-round arguments about API
> details then we'll never get anywhere.
>
> [...]
> > Because it is slated to go into release 1.7, we need to re-visit the
> masked array discussion again.    The NEP process is the appropriate one
> and I'm glad we are taking that route for these discussions.   My goal is
> to get consensus in order for code to get into NumPy (regardless of who
> writes the code).    It may be that we don't come to a consensus
> (reasonable and intelligent people can disagree on things --- look at the
> coming election...).   We can represent different parts of what is
> fortunately a very large user-base of NumPy users.
> >
> > First of all, I want to be clear that I think there is much great work
> that has been done in the current missing data code.  There are some nice
> features in the where clause of the ufunc and the machinery for the
> iterator that allows re-using ufunc loops that are not re-written to check
> for missing data.   I'm sure there are other things as well that I'm not
> quite aware of yet.    However, I don't think the API presented to the
> numpy user presently is the correct one for NumPy 1.X.
> >
> > A few particulars:
> >
> >        * the reduction operations need to default to "skipna" --- this
> is the most common use case which has been re-inforced again to me today by
> a new user to Python who is using masked arrays presently
>
> This is one of the points where the two conceptual models disagree
> (see also Skipper's point down-thread). If you have "missing data",
> then propagation has to be the default -- the sum of 1, 2, and
> I-DON'T-KNOW-MAYBE-7-MAYBE-12 is not 3. If you have some data there
> but you've asked numpy to temporarily ignore it, then, well, duh, of
> course it should ignore it.
>
> >        * the mask needs to be visible to the user if they use that
> approach to missing data (people should be able to get a hold of the mask
> and work with it in Python)
>
> This is also a point where the two conceptual models disagree.
>
> Actually this is one of the original arguments we made against the NEP
> design -- that if you want missing data, then having a mask at all is
> counterproductive, and if you are ignoring data, then of course it
> should be easy to manipulate the ignore mask. The rationale for the
> current design is to compromise between these two approaches -- there
> is a mask, but it's hidden behind a curtain. Mostly. (This may be a
> compromise in the Solomonic sense.)
>
> >        * bit-pattern approaches to missing data (at least for float64
> and int32) need to be implemented.
> >
> >        * there should be some way when using "masks" (even if it's
> hidden from most users) for missing data to separate the low-level ufunc
> operation from the operation
> >           on the masks...
>
> I don't understand what this means.
>
> > I have heard from several users that they will *not use the missing
> data* in NumPy as currently implemented, and I can now see why.    For
> better or for worse, my approach to software is generally very user-driven
> and very pragmatic.  On the other hand, I'm also a mathematician and
> appreciate the cognitive compression that can come out of well-formed
> structure.    None-the-less, I'm an *applied* mathematician and am
> ultimately motivated by applications.
> >
> > I will get a hold of the NEP and spend some time with it to discuss some
> of this in that document.   This will take several weeks (as PyCon is next
> week and I have a tutorial I'm giving there).    For now, I do not think
> 1.7 can be released unless the masked array is labeled *experimental*.
>
> In project management terms, I see three options:
> 1) Put a big warning label on the functionality and leave it for now
> ("If this option is given, np.asarray returns a masked array. NOTE: IN
> THE NEXT RELEASE, IT MAY INSTEAD RETURN A BAG OF RABID, HUNGRY
> WEASELS. NO GUARANTEES.")
>

I've opened http://projects.scipy.org/numpy/ticket/2072 for that. Assuming
we stick with this option, I'd appreciate it if you could check in the
first beta that comes out whether or not the warnings are obvious enough
and in all the right places. There probably won't be weasels though:)


> 2) Move the code back out of mainline and into a branch until until
> there's consensus.
> 3) Hold up the release until this is all sorted.
>
> I come from the project-management school that says you should always
> have a releasable mainline, keep unready code in branches, and never
> hold up the release for features, so (2) seems obvious to me.


While it may sound obvious, I hope you've understood why in practice it's
not at all obvious and why you got such strong reactions to your proposal
of taking out all that code. If not, just look at what happened with the
numpy-refactor work.

But I seem to be very much in the minority on that[1], so oh well :-). I
> don't have any objection to (1), personally. (3) seems like a bad
> idea. Just my 2 pence.
>

Agreed that (3) is a bad idea. +1 for (1).

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/07b7a1a9/attachment.html>

From charlesr.harris at gmail.com  Tue Mar  6 16:19:29 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 6 Mar 2012 14:19:29 -0700
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
Message-ID: <CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>

On Tue, Mar 6, 2012 at 2:03 PM, Denis Laxalde <denis.laxalde at mcgill.ca>wrote:

> Hi,
>
>    >>> np.__version__
>    '1.7.0.dev-7c07089'
>    >>> p = np.poly1d([1,1])
>    >>> p + 1.0
>    poly1d([ 1.,  2.])
>    >>> p + np.float64(1)
>    poly1d([ 1.,  2.])
>    >>> np.float64(1.0) + p
>    array([ 2.,  2.])
>    >>> np.int64(1) + p
>    array([2, 2])
>    >>> np.int(1) + p
>    poly1d([1, 2])
>    >>> np.int64(1)*p
>    array([1, 1])
>
> What's happening here? I'd expect operations with polynomials would
> return a polynomial.
>

Use polynomial.Polynomial and you won't have this problem.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/847ac70f/attachment.html>

From njs at pobox.com  Tue Mar  6 16:51:45 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 6 Mar 2012 21:51:45 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CABL7CQi1Yg0gQz7XiK=Zg2CVRPEh=R0pvToEzL8mdo9uM9uzpg@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAPJVwBn5od6OR+w3vp39E5BGX-nO9CWaFKixhPEn_CFwwiDaHA@mail.gmail.com>
	<CABL7CQi1Yg0gQz7XiK=Zg2CVRPEh=R0pvToEzL8mdo9uM9uzpg@mail.gmail.com>
Message-ID: <CAPJVwBkky1An2mKe6LBEUQiNi9wDThvp0vBHFpOOE3HS2ZDakQ@mail.gmail.com>

On Tue, Mar 6, 2012 at 9:14 PM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
> On Tue, Mar 6, 2012 at 9:25 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Sat, Mar 3, 2012 at 8:30 PM, Travis Oliphant <travis at continuum.io>
>> wrote:
>> > Hi all,
>>
>> Hi Travis,
>>
>> Thanks for bringing this back up.
>>
>> Have you looked at the summary from the last thread?
>> ?https://github.com/njsmith/numpy/wiki/NA-discussion-status
>
> Re-reading that summary and the main documents and threads linked from it, I
> could find either examples of statistical software that treats missing and
> ignored data explicitly separately, or links to relevant literature. Those
> would probably help the discussion a lot.

(I think you mean "couldn't find"?)

I'm not aware of any software that supports the IGNORED concept at
all, whether in combination with missing data or not. np.ma is
probably the closest example. I think we'd be breaking new ground
there. This is also probably why it is less clear how it should work
:-).

IIUC, the basic reason that people want IGNORED in the core is that it
provides convenience and syntactic sugar for efficient "in place"
operation on subsets of large arrays. So there are actually two parts
there -- the efficient operation, and the convenience/syntactic sugar.
The key feature for efficient operation is the where= feature, which
is not controversial at all. So, there's an argument that for now we
should focus on where=, give people some time to work with it, and
then use that experience to decide what kind of convenience/sugar
would be useful, if any. But, that's just my own idea; I definitely
can't claim any consensus on it.

>> In project management terms, I see three options:
>> 1) Put a big warning label on the functionality and leave it for now
>> ("If this option is given, np.asarray returns a masked array. NOTE: IN
>> THE NEXT RELEASE, IT MAY INSTEAD RETURN A BAG OF RABID, HUNGRY
>> WEASELS. NO GUARANTEES.")
>
> I've opened http://projects.scipy.org/numpy/ticket/2072 for that.

Cool, thanks.

> Assuming
> we stick with this option, I'd appreciate it if you could check in the first
> beta that comes out whether or not the warnings are obvious enough and in
> all the right places. There probably won't be weasels though:)

Of course. I've added myself to the CC list. (Err, if the beta won't
be for a bit, though, then please remind me if you remember? I'm
juggling a lot of balls right now.)

>> 2) Move the code back out of mainline and into a branch until until
>> there's consensus.
>> 3) Hold up the release until this is all sorted.
>>
>> I come from the project-management school that says you should always
>> have a releasable mainline, keep unready code in branches, and never
>> hold up the release for features, so (2) seems obvious to me.
>
> While it may sound obvious, I hope you've understood why in practice it's
> not at all obvious and why you got such strong reactions to your proposal of
> taking out all that code. If not, just look at what happened with the
> numpy-refactor work.

Of course, and that's why I'm not pressing the point. These trade-offs
might be worth talking about at some point -- there are reasons that
basically all the major FOSS projects have moved towards time-based
releases :-) -- but that'd be a huge discussion at a time when we
already have more than enough of those on our plate...

>> But I seem to be very much in the minority on that[1], so oh well :-). I
>> don't have any objection to (1), personally. (3) seems like a bad
>> idea. Just my 2 pence.
>
>
> Agreed that (3) is a bad idea. +1 for (1).
>
> Ralf
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

Cheers,
-- Nathaniel


From ralf.gommers at googlemail.com  Tue Mar  6 17:17:47 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 6 Mar 2012 23:17:47 +0100
Subject: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
In-Reply-To: <CAH6Pt5qJiMxw1i2NvTGCemAO0Xp8SGFB9dBrNDDwtetP7AMOew@mail.gmail.com>
References: <CAH6Pt5qkeVoo3d2+q4+WMmKZFC2FnHx3=pun5S5dwv2P5ijxKw@mail.gmail.com>
	<CAB6mnx+U1vXz6=TCSHO=-a+OjffcAntFKge9bm8LF-Vkj4jH8w@mail.gmail.com>
	<CAH6Pt5o+XW+DVy2L4-GG1Pq2B9YWuebKUKuMGHm-hWabVUV2=w@mail.gmail.com>
	<CAH6Pt5p3zm=ugcQ1W-L5j9C6BQ3g_9yALzfFmxd0Lb6c2UUr6Q@mail.gmail.com>
	<CAMRnEmrV1OTEABbiJMz2HfUb-vcbyq-SHRXE91SS93gtASWgUQ@mail.gmail.com>
	<CAH6Pt5q3xZaioK-kQ1=_mCCTFkOo-uwS0i5G+yQqhqmNPrmqaA@mail.gmail.com>
	<CAMRnEmrjQ+Tv4EPPETboc-zcB+BFKHTeDrCdLQFvd32JzQhQZA@mail.gmail.com>
	<CAH6Pt5q6xnYvRtQUvHKSpvJtTyJTDiR5efCduqop4dp5sx5veQ@mail.gmail.com>
	<CAMRnEmpxr1xe36QRLuT0nCtGMXWJr4i-NNeZL5EVhyD8=aGZCQ@mail.gmail.com>
	<CAH6Pt5rN5q5xXU6=iqBJZcW9vzhdSmwoxQang90aSA1kas=SFQ@mail.gmail.com>
	<CAH6Pt5oN2FWydvsnP2MngA_7fs=cZFwgDwUBbg2bQHLkR_Qc7Q@mail.gmail.com>
	<CAH6Pt5qc2VAtOg=ZweBQDkH6qK+h+rxGUjwNiC7E5XCJj=b-NQ@mail.gmail.com>
	<CAMRnEmpY6ibpEF16dzVBuEL+8641Z5rmWKNCprNoAtzpaii4RQ@mail.gmail.com>
	<CAH6Pt5qJiMxw1i2NvTGCemAO0Xp8SGFB9dBrNDDwtetP7AMOew@mail.gmail.com>
Message-ID: <CABL7CQg4=E8-VOd1dNVn_aWuQJQ=vAvvsSjCkH0Lm+9n-06K6w@mail.gmail.com>

On Tue, Mar 6, 2012 at 6:53 AM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Mon, Mar 5, 2012 at 8:04 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> > I've pushed a bugfix to github, can you confirm that the crash goes away
> on
> > your test box? Thanks for tracking that down, the stack trace was very
> > helpful. Since x86 machines don't have as strict alignment requirements,
> > bugs like this one will generally remain undetected until someone tests
> on
> > an architecture like sparc.
>
> Thanks - no Bus error.  For your enjoyment: there were some failures
> and errors from numpy.test("full")
>

The two SPARC build slaves also stopped crashing. They show a few more
failures than on Matthew's box:
http://buildbot.scipy.org/builders/Linux_SPARC_64_Debian/builds/1199/steps/shell_1/logs/stdio

A big thanks for Stefan by the way for fixing up the buildbot so it
triggers on commits again!

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/db896a56/attachment.html>

From chris.barker at noaa.gov  Tue Mar  6 17:26:40 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 6 Mar 2012 14:26:40 -0800
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <1375080.6SoljH8rGC@rabbit>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
Message-ID: <CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>

On Sun, Mar 4, 2012 at 2:18 PM, Luis Pedro Coelho <lpc at cmu.edu> wrote:
> At least last time I read up on it, cython was not able to do multi-type code,
> i.e., have code that works on arrays of multiple types. Does it support it
> now?

The Bottleneck project used some sort of template system to generate
multiple type cyton code -- not cython itself, but none the less
useful.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From chris.barker at noaa.gov  Tue Mar  6 17:45:34 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 6 Mar 2012 14:45:34 -0800
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<1330092347-sup-3918@rohan>
	<CAJPUwMCTCy29N2iv2HGy4r1uPcCz6d0_mFi9hh7JkgwdbEkVFQ@mail.gmail.com>
	<1330207186-sup-1957@rohan> <loom.20120227T062258-15@post.gmane.org>
	<1330351883-sup-9943@rohan>
	<CAPJVwB=H2P68CU_oKAf7n_fvj2q1Ne8TSw153_twAY0aq69H6A@mail.gmail.com>
	<1330365437-sup-7898@rohan>
	<CAPJVwBnaEd=EF9xD_-=VJeCMTYpb=7gOwxRYoQ9xce7xqiwWEg@mail.gmail.com>
	<1330387831-sup-839@rohan>
	<CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
	<CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>
Message-ID: <CALGmxEJZjfu_X6M_hU6sqsPNTX=hX+gL_iZc92kpdpikyupapg@mail.gmail.com>

On Thu, Mar 1, 2012 at 10:58 PM, Jay Bourque <jayvius at gmail.com> wrote:

> 1. Loading text files using loadtxt/genfromtxt need a significant
> performance boost (I think at least an order of magnitude increase in
> performance is very doable based on what I've seen with Erin's recfile code)

> 2. Improved memory usage. Memory used for reading in a text file shouldn?t
> be more than the file itself, and less if only reading a subset of file.

> 3. Keep existing interfaces for reading text files (loadtxt, genfromtxt,
> etc). No new ones.

> 4. Underlying code should keep IO iteration and transformation of data
> separate (awaiting more thoughts from Travis on this).

> 5. Be able to plug in different transformations of data at low level (also
> awaiting more thoughts from Travis).

> 6. memory mapping of text files?

> 7. Eventually reduce memory usage even more by using same object for
> duplicate values in array (depends on implementing enum dtype?)

> Anything else?

Yes -- I'd like to see the solution be able to do high -performance
reads of a portion of a file -- not always the whole thing. I seem to
have a number of custom text files that I need to read that are laid
out in chunks: a bit of a header, then a block of number, another
header, another block. I'm happy to read and parse the header sections
with pure pyton, but would love a way to read the blocks of numbers
into a numpy array fast. This will probably come out of the box with
any of the proposed solutions, as long as they start at the current
position of a passes-in fiel object, and can be told how much to read,
then leave the file pointer in the correct position.

Great to see this moving forward.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From sturla at molden.no  Tue Mar  6 18:03:06 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 7 Mar 2012 00:03:06 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
	<CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>
Message-ID: <77A4BC71-4819-4C5C-8212-8FCE198684D7@molden.no>

Upcoming Cython releases will have a generics system called "fused types". 

Sturla

Sendt fra min iPad

Den 6. mars 2012 kl. 23:26 skrev Chris Barker <chris.barker at noaa.gov>:

> On Sun, Mar 4, 2012 at 2:18 PM, Luis Pedro Coelho <lpc at cmu.edu> wrote:
>> At least last time I read up on it, cython was not able to do multi-type code,
>> i.e., have code that works on arrays of multiple types. Does it support it
>> now?
> 
> The Bottleneck project used some sort of template system to generate
> multiple type cyton code -- not cython itself, but none the less
> useful.
> 
> -Chris
> 
> 
> 
> 
> -- 
> 
> Christopher Barker, Ph.D.
> Oceanographer
> 
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> 
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From charlesr.harris at gmail.com  Tue Mar  6 18:43:17 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 6 Mar 2012 16:43:17 -0700
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <77A4BC71-4819-4C5C-8212-8FCE198684D7@molden.no>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
	<CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>
	<77A4BC71-4819-4C5C-8212-8FCE198684D7@molden.no>
Message-ID: <CAB6mnxLPeimfHzebBYuKTtGF=_3VRf7DqsZQqjtqnZHeEUSCdw@mail.gmail.com>

On Tue, Mar 6, 2012 at 4:03 PM, Sturla Molden <sturla at molden.no> wrote:

> Upcoming Cython releases will have a generics system called "fused types".
>
> Sturla
>
> Sendt fra min iPad
>
> Den 6. mars 2012 kl. 23:26 skrev Chris Barker <chris.barker at noaa.gov>:
>
> > On Sun, Mar 4, 2012 at 2:18 PM, Luis Pedro Coelho <lpc at cmu.edu> wrote:
> >> At least last time I read up on it, cython was not able to do
> multi-type code,
> >> i.e., have code that works on arrays of multiple types. Does it support
> it
> >> now?
> >
> > The Bottleneck project used some sort of template system to generate
> > multiple type cyton code -- not cython itself, but none the less
> > useful.
> >
>

I don't see generics as the main selling point of C++ for Numpy. What I
expect to be really useful is exception handling, smart pointers, and RIAA.
And maybe some carefule uses of classes and inheritance. Having a standard
inline keyword will be nice too. But I'm not a modern C++ guru, so I may
have missed a lot of things.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/0d65428d/attachment.html>

From bryanv at continuum.io  Tue Mar  6 18:43:23 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Tue, 06 Mar 2012 17:43:23 -0600
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <4F567612.4050305@molden.no>
References: <30965644.bxiDOaWOx7@rabbit> <4F567612.4050305@molden.no>
Message-ID: <4F56A11B.1060309@continuum.io>

On 3/6/12 2:39 PM, Sturla Molden wrote:
> We can augment standard C with syntax suger for
> generics, or even NumPy arrays and Python types, using a Python script
> as meta-compiler. We don't need C++ for that.
I can only speak for myself. I do not want to be in the meta-compiler 
business, and I do want proper tool support for templating.

Moreover:

**begin repeat
  *
  * #from = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
  *         LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
  *         CFLOAT, CDOUBLE, CLONGDOUBLE, OBJECT, DATETIME, TIMEDELTA#
  * #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
  *         longlong, ulonglong, half, float, double, longdouble,
  *         cfloat, cdouble, clongdouble, object, datetime, timedelta#
  * #sort = 1*18, 0*3#
  * #num = 1*15, 2*3, 1*3#
  * #fromtyp = Bool, byte, ubyte, short, ushort, int, uint, long, ulong,
  *            longlong, ulonglong, npy_half, float, double, longdouble,
  *            float, double, longdouble, PyObject *, datetime, timedelta#
  * #NAME = Bool, Byte, UByte, Short, UShort, Int, UInt, Long, ULong,
  *         LongLong, ULongLong, Half, Float, Double, LongDouble,
  *         CFloat, CDouble, CLongDouble, Object, Datetime, Timedelta#
  * #kind = GENBOOL, SIGNED, UNSIGNED, SIGNED, UNSIGNED, SIGNED, 
UNSIGNED, SIGNED, UNSIGNED,
  *         SIGNED, UNSIGNED, FLOATING, FLOATING, FLOATING, FLOATING,
  *         COMPLEX, COMPLEX, COMPLEX, OBJECT, DATETIME, TIMEDELTA#
  * #endian = |*3, =*15, |, =*2#
  * #isobject= 0*18,NPY_OBJECT_DTYPE_FLAGS,0*2#
  */

is not exactly sweet-tasting sugar. It would make more sense for all the 
various values to be bundled together by type, so that a type is 
comprehensible as whole, quickly. Instead all the values are in 
separated property lists that cut across all the types. Worse, very 
similar or identical information to what is in this list is repeated, 
without any chance of consistency or error checking, all over the place.

Bryan


From cournape at gmail.com  Tue Mar  6 18:57:02 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 6 Mar 2012 18:57:02 -0500
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <4F56A11B.1060309@continuum.io>
References: <30965644.bxiDOaWOx7@rabbit> <4F567612.4050305@molden.no>
	<4F56A11B.1060309@continuum.io>
Message-ID: <CAGY4rcWUP3bgugPQ-+FJGWvKT=EB9wR5MUawQ1W1ZLkQ_0Je1A@mail.gmail.com>

On Tue, Mar 6, 2012 at 6:43 PM, Bryan Van de Ven <bryanv at continuum.io> wrote:
> On 3/6/12 2:39 PM, Sturla Molden wrote:
>> We can augment standard C with syntax suger for
>> generics, or even NumPy arrays and Python types, using a Python script
>> as meta-compiler. We don't need C++ for that.
> I can only speak for myself. I do not want to be in the meta-compiler
> business, and I do want proper tool support for templating.
>
> Moreover:
>
> **begin repeat
> ?*
> ?* #from = BOOL, BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
> ?* ? ? ? ? LONGLONG, ULONGLONG, HALF, FLOAT, DOUBLE, LONGDOUBLE,
> ?* ? ? ? ? CFLOAT, CDOUBLE, CLONGDOUBLE, OBJECT, DATETIME, TIMEDELTA#
> ?* #suff = bool, byte, ubyte, short, ushort, int, uint, long, ulong,
> ?* ? ? ? ? longlong, ulonglong, half, float, double, longdouble,
> ?* ? ? ? ? cfloat, cdouble, clongdouble, object, datetime, timedelta#
> ?* #sort = 1*18, 0*3#
> ?* #num = 1*15, 2*3, 1*3#
> ?* #fromtyp = Bool, byte, ubyte, short, ushort, int, uint, long, ulong,
> ?* ? ? ? ? ? ?longlong, ulonglong, npy_half, float, double, longdouble,
> ?* ? ? ? ? ? ?float, double, longdouble, PyObject *, datetime, timedelta#
> ?* #NAME = Bool, Byte, UByte, Short, UShort, Int, UInt, Long, ULong,
> ?* ? ? ? ? LongLong, ULongLong, Half, Float, Double, LongDouble,
> ?* ? ? ? ? CFloat, CDouble, CLongDouble, Object, Datetime, Timedelta#
> ?* #kind = GENBOOL, SIGNED, UNSIGNED, SIGNED, UNSIGNED, SIGNED,
> UNSIGNED, SIGNED, UNSIGNED,
> ?* ? ? ? ? SIGNED, UNSIGNED, FLOATING, FLOATING, FLOATING, FLOATING,
> ?* ? ? ? ? COMPLEX, COMPLEX, COMPLEX, OBJECT, DATETIME, TIMEDELTA#
> ?* #endian = |*3, =*15, |, =*2#
> ?* #isobject= 0*18,NPY_OBJECT_DTYPE_FLAGS,0*2#
> ?*/
>
> is not exactly sweet-tasting sugar. It would make more sense for all the
> various values to be bundled together by type, so that a type is
> comprehensible as whole, quickly. Instead all the values are in
> separated property lists that cut across all the types. Worse, very
> similar or identical information to what is in this list is repeated,
> without any chance of consistency or error checking, all over the place.

I don't think anyone likes it . All your points are certainly valid.
There are solutions to this that does not require C++ (Chuck worked on
that I believe, although I guess he knows would rather see C++ used).

David


From d.s.seljebotn at astro.uio.no  Tue Mar  6 19:13:17 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 06 Mar 2012 16:13:17 -0800
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <4F567985.3020606@molden.no>
References: <30965644.bxiDOaWOx7@rabbit>	<4F567612.4050305@molden.no>	<CAHCaCkL387Dp4krgFKE_+ay6CmqCCGf3mDtjdNu0i4AY1cmQvQ@mail.gmail.com>
	<4F567985.3020606@molden.no>
Message-ID: <4F56A81D.8070203@astro.uio.no>

On 03/06/2012 12:54 PM, Sturla Molden wrote:
> On 06.03.2012 21:45, Matthieu Brucher wrote:
>
>> This is your opinion, but there are a lot of numerical code now in C++
>> and they are far more maintainable than in Fortran. And they are faster
>> for exactly this reason.
>
> That is mostly because C++ makes tasks that are non-numerical easier.
>
> But that is why we have Python.

In Norwegian there's a saying (as you will know): "Badly reasoned 
argument, so talk louder".

Anyway, I agree with you a long way, but saying there's no place at all 
for C++ anywhere seems a bit over the top for me. E.g., I believe C++ 
Elemental made the right choice in C++ (distributed linear algebra, 
where they use templates just enough to make things readable, but not 
over-the-top like STL or Boost):

http://code.google.com/p/elemental/

Dag


From sturla at molden.no  Tue Mar  6 19:39:27 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 7 Mar 2012 01:39:27 +0100
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <CAB6mnxLPeimfHzebBYuKTtGF=_3VRf7DqsZQqjtqnZHeEUSCdw@mail.gmail.com>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
	<CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>
	<77A4BC71-4819-4C5C-8212-8FCE198684D7@molden.no>
	<CAB6mnxLPeimfHzebBYuKTtGF=_3VRf7DqsZQqjtqnZHeEUSCdw@mail.gmail.com>
Message-ID: <78D81CDC-4C29-4BB0-9026-0037D7F2F518@molden.no>


Den 7. mars 2012 kl. 00:43 skrev Charles R Harris <charlesr.harris at gmail.com>:

> 
> 
> I don't see generics as the main selling point of C++ for Numpy. What I expect to be really useful is exception handling, smart pointers, and RIAA. And maybe some carefule uses of classes and inheritance. Having a standard inline keyword will be nice too. But I'm not a modern C++ guru, so I may have missed a lot of things.
> 

"RIAA is evil, RAII is not" ;-)

Actually, Cython has all of those features too :-)

I am not suggesting NumPy should use Cython in the core. However if it did, the main machinery would already be in the compiler (typed memory views) :-)

Sturla


From charlesr.harris at gmail.com  Tue Mar  6 19:56:47 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 6 Mar 2012 17:56:47 -0700
Subject: [Numpy-discussion] C++ Example
In-Reply-To: <78D81CDC-4C29-4BB0-9026-0037D7F2F518@molden.no>
References: <30965644.bxiDOaWOx7@rabbit>
	<CAGY4rcWohTs=R1oPF84SsAS1JFb3okXDStFLfU5keg1FC9pXVw@mail.gmail.com>
	<1375080.6SoljH8rGC@rabbit>
	<CALGmxEKzL_x3uvLgupxJdnkwwHu7pN9xZs4Qqgsq2T0ZdrNHsg@mail.gmail.com>
	<77A4BC71-4819-4C5C-8212-8FCE198684D7@molden.no>
	<CAB6mnxLPeimfHzebBYuKTtGF=_3VRf7DqsZQqjtqnZHeEUSCdw@mail.gmail.com>
	<78D81CDC-4C29-4BB0-9026-0037D7F2F518@molden.no>
Message-ID: <CAB6mnxJ462VsYRFPF2Jtyf4SHETo0zGE17bW5-EXc5e8sVFfwQ@mail.gmail.com>

On Tue, Mar 6, 2012 at 5:39 PM, Sturla Molden <sturla at molden.no> wrote:

>
>
> Den 7. mars 2012 kl. 00:43 skrev Charles R Harris <
> charlesr.harris at gmail.com>:
>
> >
> >
> > I don't see generics as the main selling point of C++ for Numpy. What I
> expect to be really useful is exception handling, smart pointers, and RIAA.
> And maybe some carefule uses of classes and inheritance. Having a standard
> inline keyword will be nice too. But I'm not a modern C++ guru, so I may
> have missed a lot of things.
> >
>
> "RIAA is evil, RAII is not" ;-)
>
> Actually, Cython has all of those features too :-)
>
> I am not suggesting NumPy should use Cython in the core. However if it
> did, the main machinery would already be in the compiler (typed memory
> views) :-)
>

Oh, I'd *much* rather use C or C++ for writing C code ;) Cython is great
for hiding all the mess of interfacing to Python, but, no Python, no mess.
The idea is to layer Numpy so that the bottom layer is independent of
Python. So we will probably do our own memory management and such (ala the
refactor), and C++ will be helpful for such things.

I don't stress generics for such things as the loop code. The current
template code for that isn't beautiful, but it isn't hopelessly ugly
either. There may, note may, be a role for inheritance there. But in any
case, I don't see the C++ transition happening over night, so there will be
plenty of time for long, testy threads along the way to keep us all happily
entertained.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120306/6b7c9513/attachment.html>

From matthew.brett at gmail.com  Tue Mar  6 23:07:53 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 7 Mar 2012 04:07:53 +0000
Subject: [Numpy-discussion] More SPARC pain
Message-ID: <CAH6Pt5qzienQuOrFhSLB=omhNnooQXpJaXeF6bMm0S=iimGeJw@mail.gmail.com>

Hi,

I found this test caused a bus error on current trunk:

<pre>
import numpy as np

from StringIO import StringIO as BytesIO

from numpy.testing import assert_array_equal


def test_2d_buf():
    dtt = np.complex64
    arr = np.arange(10, dtype=dtt)
    # 2D array
    arr2 = np.reshape(arr, (2, 5))
    # Fortran write followed by (C or F) read caused bus error
    data_str = arr2.tostring('F')
    data_back = np.ndarray(arr2.shape,
                           arr2.dtype,
                           buffer=data_str,
                           order='F')
    assert_array_equal(arr2, data_back)
</pre>

gdb run gives ...

test_me3.test_2d_buf ...
Program received signal SIGBUS, Bus error.
0xf78f5458 in _aligned_strided_to_contig_size8 (
    dst=0xdc0e08
"\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\373\373\373\373",
dst_stride=8, src=0xcdfc44 "", src_stride=16, N=5,
__NPY_UNUSED_TAGGEDsrc_itemsize=8,
    __NPY_UNUSED_TAGGEDdata=0x0) at
numpy/core/src/multiarray/lowlevel_strided_loops.c.src:137
137             (*((@type@ *)dst)) = @swap@@elsize@(*((@type@ *)src));

Debug log attached.  Shall I make an issue?

Best,

Matthew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: buf_2d.log.gz
Type: application/x-gzip
Size: 7956 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/b1f08dd7/attachment.bin>

From warren.weckesser at enthought.com  Wed Mar  7 10:49:47 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Wed, 7 Mar 2012 09:49:47 -0600
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CALGmxEJZjfu_X6M_hU6sqsPNTX=hX+gL_iZc92kpdpikyupapg@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<1330092347-sup-3918@rohan>
	<CAJPUwMCTCy29N2iv2HGy4r1uPcCz6d0_mFi9hh7JkgwdbEkVFQ@mail.gmail.com>
	<1330207186-sup-1957@rohan> <loom.20120227T062258-15@post.gmane.org>
	<1330351883-sup-9943@rohan>
	<CAPJVwB=H2P68CU_oKAf7n_fvj2q1Ne8TSw153_twAY0aq69H6A@mail.gmail.com>
	<1330365437-sup-7898@rohan>
	<CAPJVwBnaEd=EF9xD_-=VJeCMTYpb=7gOwxRYoQ9xce7xqiwWEg@mail.gmail.com>
	<1330387831-sup-839@rohan>
	<CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
	<CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>
	<CALGmxEJZjfu_X6M_hU6sqsPNTX=hX+gL_iZc92kpdpikyupapg@mail.gmail.com>
Message-ID: <CAM-+wY8ieo0qgUGXNjTDWk3c_O8nr2c0Ha3sDVmxJJYyvdV8mg@mail.gmail.com>

On Tue, Mar 6, 2012 at 4:45 PM, Chris Barker <chris.barker at noaa.gov> wrote:

> On Thu, Mar 1, 2012 at 10:58 PM, Jay Bourque <jayvius at gmail.com> wrote:
>
> > 1. Loading text files using loadtxt/genfromtxt need a significant
> > performance boost (I think at least an order of magnitude increase in
> > performance is very doable based on what I've seen with Erin's recfile
> code)
>
> > 2. Improved memory usage. Memory used for reading in a text file
> shouldn?t
> > be more than the file itself, and less if only reading a subset of file.
>
> > 3. Keep existing interfaces for reading text files (loadtxt, genfromtxt,
> > etc). No new ones.
>
> > 4. Underlying code should keep IO iteration and transformation of data
> > separate (awaiting more thoughts from Travis on this).
>
> > 5. Be able to plug in different transformations of data at low level
> (also
> > awaiting more thoughts from Travis).
>
> > 6. memory mapping of text files?
>
> > 7. Eventually reduce memory usage even more by using same object for
> > duplicate values in array (depends on implementing enum dtype?)
>
> > Anything else?
>
> Yes -- I'd like to see the solution be able to do high -performance
> reads of a portion of a file -- not always the whole thing. I seem to
> have a number of custom text files that I need to read that are laid
> out in chunks: a bit of a header, then a block of number, another
> header, another block. I'm happy to read and parse the header sections
> with pure pyton, but would love a way to read the blocks of numbers
> into a numpy array fast. This will probably come out of the box with
> any of the proposed solutions, as long as they start at the current
> position of a passes-in fiel object, and can be told how much to read,
> then leave the file pointer in the correct position.
>
>

If you are setup with Cython to build extension modules, and you don't mind
testing an unreleased and experimental reader, you can try the text reader
that I'm working on: https://github.com/WarrenWeckesser/textreader

You can read a file like this, where the first line gives the number of
rows of the following array, and that pattern repeats:

5
1.0, 2.0, 3.0
4.0, 5.0, 6.0
7.0, 8.0, 9.0
10.0, 11.0, 12.0
13.0, 14.0, 15.0
3
1.0, 1.5, 2.0, 2.5
3.0, 3.5, 4.0, 4.5
5.0, 5.5, 6.0, 6.5
1
1.0D2, 1.25D-1, 6.25D-2, 99

with code like this:

import numpy as np
from textreader import readrows

filename = 'data/multi.dat'

f = open(filename, 'r')
line = f.readline()
while len(line) > 0:
    nrows = int(line)
    a = readrows(f, np.float32, numrows=nrows, sci='D', delimiter=',')
    print "a:"
    print a
    print
    line = f.readline()


Warren
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/9ca012e1/attachment.html>

From pierre.haessig at crans.org  Wed Mar  7 11:35:44 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Mar 2012 17:35:44 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>	<4F561590.3000400@crans.org>	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
Message-ID: <4F578E60.7080304@crans.org>

Hi,

Thanks you very much for your lights !

Le 06/03/2012 21:59, Nathaniel Smith a ?crit :
> Right -- R has a very impoverished type system as compared to numpy.
> There's basically four types: "numeric" (meaning double precision
> float), "integer", "logical" (boolean), and "character" (string). And
> in practice the integer type is essentially unused, because R parses
> numbers like "1" as being floating point, not integer; the only way to
> get an integer value is to explicitly cast to it. Each of these types
> has a specific bit-pattern set aside for representing NA. And...
> that's it. It's very simple when it works, but also very limited.
I also suspected R to be less powerful in terms of types.
However, I think  the fact that "It's very simple when it works" is
important to take into account. At the end of the day, when using all
the fanciness it is not only about "can I have some NAs in my array ?"
but also "how *easily* can I have some NAs in my array ?". It's about
balancing the "how easy" and the "how powerful".

The easyness-of-use is the reason of my concern about having separate
types "nafloatNN" and "floatNN". Of course, I won't argue that "not
breaking everything" is even more important !!

Coming back to Travis proposition "bit-pattern approaches to missing
data (*at least* for float64 and int32) need to be implemented.", I
wonder what is the amount of extra work to go from nafloat64 to
nafloat32/16 ? Is there an hardware support NaN payloads with these
smaller floats ? If not, or if it is too complicated, I feel it is
acceptable to say "it's too complicated" and fall back to mask. One may
have to choose between fancy types and fancy NAs...

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/abd0dc35/attachment.sig>

From pierre.haessig at crans.org  Wed Mar  7 11:45:47 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Mar 2012 17:45:47 +0100
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>
Message-ID: <4F5790BB.2050801@crans.org>

Hi,
Le 06/03/2012 22:19, Charles R Harris a ?crit :
> Use polynomial.Polynomial and you won't have this problem.
I was not familiar with the "poly1d vs. Polynomial" choice.

Now, I found in the doc some more or less explicit guidelines in:
http://docs.scipy.org/doc/numpy/reference/routines.polynomials.html
"The polynomial package is newer and more complete than poly1d and the
convenience classes are better behaved in the numpy environment."

However, poly1d, which is nicely documented doesn't mention the
"competitor".
Going further in this transition, do you feel it would make sense adding
a "See Also" section in poly1d function ?

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/171d8dbd/attachment.sig>

From cournape at gmail.com  Wed Mar  7 11:46:13 2012
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 7 Mar 2012 11:46:13 -0500
Subject: [Numpy-discussion] Fixing PyArray_Descr flags member size,
 ABI vs pickling issue
In-Reply-To: <CAF6FJitRuR=ekKAP4KD9NvcBFiko240teGmr1aNjQaC+wh3a5w@mail.gmail.com>
References: <CAGY4rcV3s78=uD-cem0bHL9PsJKLv33CFy7=K75Lvk2RcXdw9A@mail.gmail.com>
	<CAF6FJivPBww2rk+nzCAqcMoStR099uiB4X5Y-gkJW0pE2JKBXA@mail.gmail.com>
	<CAGY4rcUigxhn3xVkEXp7vZRiN8omSQxjzQU3NYqjDcRGAXqQxA@mail.gmail.com>
	<10469DD0-B652-4161-9625-522A391706B2@continuum.io>
	<CAF6FJitRuR=ekKAP4KD9NvcBFiko240teGmr1aNjQaC+wh3a5w@mail.gmail.com>
Message-ID: <CAGY4rcUNyTQh9LApvaZ9zNoNVcRjY2fvLKzzxpKUBGCFrjHM6A@mail.gmail.com>

On Tue, Mar 6, 2012 at 1:44 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Tue, Mar 6, 2012 at 18:25, Travis Oliphant <travis at continuum.io> wrote:
> > Why do we want to return a single string char instead of an int?
>
> I suspect just to ensure that any provided value fits in the range
> 0..255. But that's easily done explicitly.
>

That was not even the issue in the end, my initial analysis was wrong.

In any case, I have now a new PR that fixes both dtypes.flags value and
dtype hashing reported in #2017: https://github.com/numpy/numpy/pull/231

regards,

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/cd9a5968/attachment.html>

From charlesr.harris at gmail.com  Wed Mar  7 12:00:09 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 10:00:09 -0700
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <4F5790BB.2050801@crans.org>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>
	<4F5790BB.2050801@crans.org>
Message-ID: <CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>

On Wed, Mar 7, 2012 at 9:45 AM, Pierre Haessig <pierre.haessig at crans.org>wrote:

> Hi,
> Le 06/03/2012 22:19, Charles R Harris a ?crit :
> > Use polynomial.Polynomial and you won't have this problem.
> I was not familiar with the "poly1d vs. Polynomial" choice.
>
> Now, I found in the doc some more or less explicit guidelines in:
> http://docs.scipy.org/doc/numpy/reference/routines.polynomials.html
> "The polynomial package is newer and more complete than poly1d and the
> convenience classes are better behaved in the numpy environment."
>
> However, poly1d, which is nicely documented doesn't mention the
> "competitor".
> Going further in this transition, do you feel it would make sense adding
> a "See Also" section in poly1d function ?
>
>
That's a good idea, I'll take care of it. Note the caveat about the
coefficients going in the opposite direction.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/b9c85547/attachment.html>

From njs at pobox.com  Wed Mar  7 12:14:15 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 7 Mar 2012 17:14:15 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <4F578E60.7080304@crans.org>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
Message-ID: <CAPJVwBkgTpse_hp1CWKf79cZqa+Q5gmOkOB=w1_h3b7-MnccnQ@mail.gmail.com>

On Wed, Mar 7, 2012 at 4:35 PM, Pierre Haessig <pierre.haessig at crans.org> wrote:
> Hi,
>
> Thanks you very much for your lights !
>
> Le 06/03/2012 21:59, Nathaniel Smith a ?crit :
>> Right -- R has a very impoverished type system as compared to numpy.
>> There's basically four types: "numeric" (meaning double precision
>> float), "integer", "logical" (boolean), and "character" (string). And
>> in practice the integer type is essentially unused, because R parses
>> numbers like "1" as being floating point, not integer; the only way to
>> get an integer value is to explicitly cast to it. Each of these types
>> has a specific bit-pattern set aside for representing NA. And...
>> that's it. It's very simple when it works, but also very limited.
> I also suspected R to be less powerful in terms of types.
> However, I think ?the fact that "It's very simple when it works" is
> important to take into account. At the end of the day, when using all
> the fanciness it is not only about "can I have some NAs in my array ?"
> but also "how *easily* can I have some NAs in my array ?". It's about
> balancing the "how easy" and the "how powerful".
>
> The easyness-of-use is the reason of my concern about having separate
> types "nafloatNN" and "floatNN". Of course, I won't argue that "not
> breaking everything" is even more important !!

It's a good point, I just don't see how we can really tell what the
trade-offs are at this point. You should bring this up again once more
of the big picture stuff is hammered out.

> Coming back to Travis proposition "bit-pattern approaches to missing
> data (*at least* for float64 and int32) need to be implemented.", I
> wonder what is the amount of extra work to go from nafloat64 to
> nafloat32/16 ? Is there an hardware support NaN payloads with these
> smaller floats ? If not, or if it is too complicated, I feel it is
> acceptable to say "it's too complicated" and fall back to mask. One may
> have to choose between fancy types and fancy NAs...

All modern floating point formats can represent NaNs with payloads, so
in principle there's no difficulty in supporting NA the same way for
all of them. If you're using float16 because you want to offload
computation to a GPU then I would test carefully before trusting the
GPU to handle NaNs correctly, and there may need to be a bit of care
to make sure that casts between these types properly map NAs to NAs,
but generally it should be fine.

-- Nathaniel


From charlesr.harris at gmail.com  Wed Mar  7 12:17:02 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 10:17:02 -0700
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <4F578E60.7080304@crans.org>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
Message-ID: <CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>

On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig <pierre.haessig at crans.org>wrote:

> Hi,
>
> Thanks you very much for your lights !
>
> Le 06/03/2012 21:59, Nathaniel Smith a ?crit :
> > Right -- R has a very impoverished type system as compared to numpy.
> > There's basically four types: "numeric" (meaning double precision
> > float), "integer", "logical" (boolean), and "character" (string). And
> > in practice the integer type is essentially unused, because R parses
> > numbers like "1" as being floating point, not integer; the only way to
> > get an integer value is to explicitly cast to it. Each of these types
> > has a specific bit-pattern set aside for representing NA. And...
> > that's it. It's very simple when it works, but also very limited.
> I also suspected R to be less powerful in terms of types.
> However, I think  the fact that "It's very simple when it works" is
> important to take into account. At the end of the day, when using all
> the fanciness it is not only about "can I have some NAs in my array ?"
> but also "how *easily* can I have some NAs in my array ?". It's about
> balancing the "how easy" and the "how powerful".
>
> The easyness-of-use is the reason of my concern about having separate
> types "nafloatNN" and "floatNN". Of course, I won't argue that "not
> breaking everything" is even more important !!
>
> Coming back to Travis proposition "bit-pattern approaches to missing
> data (*at least* for float64 and int32) need to be implemented.", I
> wonder what is the amount of extra work to go from nafloat64 to
> nafloat32/16 ? Is there an hardware support NaN payloads with these
> smaller floats ? If not, or if it is too complicated, I feel it is
> acceptable to say "it's too complicated" and fall back to mask. One may
> have to choose between fancy types and fancy NAs...
>
>
I'm in agreement here, and that was a major consideration in making a
'masked' implementation first. Also, different folks adopt different values
for 'missing' data, and distributing one or several masks along with the
data is another common practice.

One inconvenience I have run into with the current API is that is should be
easier to clear the mask from an "ignored" value without taking a new view
or assigning known data. So maybe two types of masks (different payloads),
or an additional flag could be helpful. The process of assigning masks
could also be made a bit easier than using fancy indexing.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/67ab43c2/attachment.html>

From jsseabold at gmail.com  Wed Mar  7 12:35:59 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 7 Mar 2012 12:35:59 -0500
Subject: [Numpy-discussion] checking for c compiler during build
Message-ID: <CAKF=DjubDu2iJp4qCxih11BbFujahHktsVFSJn7FsUbwJnsdUQ@mail.gmail.com>

Is there a way to use numpy.distuils to programmatically check for a C
compiler at build time in a platform independent way?


From xscript at gmx.net  Wed Mar  7 13:21:54 2012
From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=)
Date: Wed, 07 Mar 2012 19:21:54 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	(Charles R. Harris's message of "Wed, 7 Mar 2012 10:17:02 -0700")
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
Message-ID: <87y5rcff0d.fsf@ginnungagap.bsc.es>

Charles R Harris writes:
[...]
> One inconvenience I have run into with the current API is that is should be
> easier to clear the mask from an "ignored" value without taking a new view or
> assigning known data.

AFAIR, the inability to directly access a "mask" attribute was intentional to
make bit-patterns and masks indistinguishable from the POV of the array user.

What's the workflow that leads you to un-ignore specific elements?


> So maybe two types of masks (different payloads), or an additional flag could
> be helpful.

Do you mean different NA values? If that's the case, I think it was taken into
account when implementing the current mechanisms (and was also mentioned in the
NEP), so that it could be supported by both bit-patterns and masks (as one of
the main design points was to make them indistinguishable in the common case).

I think the name was "parametrized dtypes".


> The process of assigning masks could also be made a bit easier than using
> fancy indexing.

I don't get what you mean here, sorry.

Do you mean here that this is too cumbersome to use?

    >>> a[a < 5] = np.NA

(obviously oversimplified example where everything looks sufficiently simple :))


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


From jordens at gmail.com  Wed Mar  7 13:39:46 2012
From: jordens at gmail.com (=?UTF-8?Q?Robert_J=C3=B6rdens?=)
Date: Wed, 7 Mar 2012 11:39:46 -0700
Subject: [Numpy-discussion] [enhancement] sum_angle() and sum_polar()
Message-ID: <CANb+zoH9ukH26BXhTvE2SL8YF8ZCvyu9JybB1K6vMa2HUbQhjA@mail.gmail.com>

Hi everyone,
I am proposing to add the the two following functions to
numpy/lib/twodim_base.py:

sum_angle() computes the sum of a 2-d array along an angled axis
sum_polar() computes the sum of a 2-d array along radial lines or
along azimuthal circles

https://github.com/numpy/numpy/pull/230

Comments?

When I was looking for a solution to these problems of calculating
special sums of 2-d arrays I could not find anything and it took me a
while to figure out a (hopefully) useful and consistent algorithm.
I can see how one would extend these to higher dimensions but that
would preclude using bincount() to do the heavy lifting.
Looking at some other functions, the doctests might need to be split
into real examples and unittests.

Best,

-- 
Robert Jordens.


From charlesr.harris at gmail.com  Wed Mar  7 13:48:05 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 11:48:05 -0700
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <87y5rcff0d.fsf@ginnungagap.bsc.es>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<87y5rcff0d.fsf@ginnungagap.bsc.es>
Message-ID: <CAB6mnx+H42J-eqRAtnYCTc-n1sKbGmYOO+pRWWrUoT0ucDKayQ@mail.gmail.com>

On Wed, Mar 7, 2012 at 11:21 AM, Llu?s <xscript at gmx.net> wrote:

> Charles R Harris writes:
> [...]
> > One inconvenience I have run into with the current API is that is should
> be
> > easier to clear the mask from an "ignored" value without taking a new
> view or
> > assigning known data.
>
> AFAIR, the inability to directly access a "mask" attribute was intentional
> to
> make bit-patterns and masks indistinguishable from the POV of the array
> user.
>
> What's the workflow that leads you to un-ignore specific elements?
>
>
>
Because they are not 'unknown', just (temporarily) 'ignored'. This might be
the case if you are experimenting with what happens if certain data is left
out of a fit. The current implementation tries to handle both these case,
and can do so, I would just like the 'ignored' use to be more convenient
than it is.


> > So maybe two types of masks (different payloads), or an additional flag
> could
> > be helpful.
>
> Do you mean different NA values? If that's the case, I think it was taken
> into
> account when implementing the current mechanisms (and was also mentioned
> in the
> NEP), so that it could be supported by both bit-patterns and masks (as one
> of
> the main design points was to make them indistinguishable in the common
> case).
>
>
No, the mask as currently implemented is eight bits and can be extended to
handle different mask values, aka, payloads.


> I think the name was "parametrized dtypes".
>
>
They don't interest me in the least. But that is a whole different area of
discussion.


>
> > The process of assigning masks could also be made a bit easier than using
> > fancy indexing.
>
> I don't get what you mean here, sorry.
>
>
Suppose I receive a data set, say an hdf file, that also includes a mask.
I'd like to load the data and apply the mask directly without doing
something like

data[mask] = np.NA


Do you mean here that this is too cumbersome to use?
>
>    >>> a[a < 5] = np.NA
>
> (obviously oversimplified example where everything looks sufficiently
> simple :))
>
>
Mostly speed and memory.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/888760aa/attachment.html>

From charlesr.harris at gmail.com  Wed Mar  7 13:54:52 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 11:54:52 -0700
Subject: [Numpy-discussion] Github key audit.
Message-ID: <CAB6mnxLyCv0OBR3kAxei6kwMBxDjdZho6vm=K8Q++qeAXOPu=A@mail.gmail.com>

Hi All,

Many here have probably received the message from github about push/pull
access being blocked until you have auditied your ssh keys. To generate a
key fingerprint on fedora, I did the following:

$charris at f16 ~$ ssh-keygen -l -f .ssh/id_dsa.pub

I don't how this looks for those of you using windows.

The compromise looks to be a 'whitehat' attack telling the rails developers
to "FIX THE G*DD*MNED PROBLEM". More here, Github
compromized<http://lwn.net/Articles/485162/>
.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/ddd92f9e/attachment.html>

From jsseabold at gmail.com  Wed Mar  7 14:03:45 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Wed, 7 Mar 2012 14:03:45 -0500
Subject: [Numpy-discussion] checking for c compiler during build
In-Reply-To: <CAKF=DjubDu2iJp4qCxih11BbFujahHktsVFSJn7FsUbwJnsdUQ@mail.gmail.com>
References: <CAKF=DjubDu2iJp4qCxih11BbFujahHktsVFSJn7FsUbwJnsdUQ@mail.gmail.com>
Message-ID: <CAKF=DjsviucwuzCNiG405Qmoc1O0t9=GQ+H3M7hdfOrbK0uBRA@mail.gmail.com>

On Wed, Mar 7, 2012 at 12:35 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> Is there a way to use numpy.distuils to programmatically check for a C
> compiler at build time in a platform independent way?

Wading through the numpy/distutils code some more. Would something as
simple as this work all the time? Seems to do the trick for me.

from distutils.dist import Distribution
from distutils.command.config import config as distutils_config
from distutils import log

dummy_c_text = r'''
void do_nothing(void);
int main(void) {
    do_nothing();
    return 0;
}
'''

def has_c_compiler():
    c = distutils_config(Distribution())
    try:
        success = c.try_compile(dummy_c_text)
        return True
    except:
        log.info("No C compiler detected of files.")
        return False

Skipper


From njs at pobox.com  Wed Mar  7 14:26:53 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 7 Mar 2012 19:26:53 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
Message-ID: <CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>

On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig <pierre.haessig at crans.org>
>> Coming back to Travis proposition "bit-pattern approaches to missing
>> data (*at least* for float64 and int32) need to be implemented.", I
>> wonder what is the amount of extra work to go from nafloat64 to
>> nafloat32/16 ? Is there an hardware support NaN payloads with these
>> smaller floats ? If not, or if it is too complicated, I feel it is
>> acceptable to say "it's too complicated" and fall back to mask. One may
>> have to choose between fancy types and fancy NAs...
>
> I'm in agreement here, and that was a major consideration in making a
> 'masked' implementation first.

When it comes to "missing data", bitpatterns can do everything that
masks can do, are no more complicated to implement, and have better
performance characteristics.

> Also, different folks adopt different values
> for 'missing' data, and distributing one or several masks along with the
> data is another common practice.

True, but not really relevant to the current debate, because you have
to handle such issues as part of your general data import workflow
anyway, and none of these is any more complicated no matter which
implementations are available.

> One inconvenience I have run into with the current API is that is should be
> easier to clear the mask from an "ignored" value without taking a new view
> or assigning known data. So maybe two types of masks (different payloads),
> or an additional flag could be helpful. The process of assigning masks could
> also be made a bit easier than using fancy indexing.

So this, uh... this was actually the whole goal of the "alterNEP"
design for masks -- making all this stuff easy for people (like you,
apparently?) that want support for ignored values, separately from
missing data, and want a nice clean API for it. Basically having a
separate .mask attribute which was an ordinary, assignable array
broadcastable to the attached array's shape. Nobody seemed interested
in talking about it much then but maybe there's interest now?

-- Nathaniel


From charlesr.harris at gmail.com  Wed Mar  7 14:37:15 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 12:37:15 -0700
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
Message-ID: <CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>

On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig <pierre.haessig at crans.org
> >
> >> Coming back to Travis proposition "bit-pattern approaches to missing
> >> data (*at least* for float64 and int32) need to be implemented.", I
> >> wonder what is the amount of extra work to go from nafloat64 to
> >> nafloat32/16 ? Is there an hardware support NaN payloads with these
> >> smaller floats ? If not, or if it is too complicated, I feel it is
> >> acceptable to say "it's too complicated" and fall back to mask. One may
> >> have to choose between fancy types and fancy NAs...
> >
> > I'm in agreement here, and that was a major consideration in making a
> > 'masked' implementation first.
>
> When it comes to "missing data", bitpatterns can do everything that
> masks can do, are no more complicated to implement, and have better
> performance characteristics.
>
>
Maybe for float, for other things, no. And we have lots of otherthings. The
performance is a strawman, and it *isn't* easier to implement.


> > Also, different folks adopt different values
> > for 'missing' data, and distributing one or several masks along with the
> > data is another common practice.
>
> True, but not really relevant to the current debate, because you have
> to handle such issues as part of your general data import workflow
> anyway, and none of these is any more complicated no matter which
> implementations are available.
>
> > One inconvenience I have run into with the current API is that is should
> be
> > easier to clear the mask from an "ignored" value without taking a new
> view
> > or assigning known data. So maybe two types of masks (different
> payloads),
> > or an additional flag could be helpful. The process of assigning masks
> could
> > also be made a bit easier than using fancy indexing.
>
> So this, uh... this was actually the whole goal of the "alterNEP"
> design for masks -- making all this stuff easy for people (like you,
> apparently?) that want support for ignored values, separately from
> missing data, and want a nice clean API for it. Basically having a
> separate .mask attribute which was an ordinary, assignable array
> broadcastable to the attached array's shape. Nobody seemed interested
> in talking about it much then but maybe there's interest now?
>
>
Come off it, Nathaniel, the problem is minor and fixable. The intent of the
initial implementation was to discover such things. These things are less
accessible with the current API *precisely* because of the feedback from R
users. It didn't start that way.

We now have something to evolve into what we want. That is a heck of a lot
more useful than endless discussion.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/8ae928d7/attachment.html>

From ben.root at ou.edu  Wed Mar  7 14:39:04 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 7 Mar 2012 13:39:04 -0600
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
Message-ID: <CANNq6Fnsh0kasNVGRq3z-YSUeL0Zt3Lgxyvajchk0fj0cw0VgA@mail.gmail.com>

On Wed, Mar 7, 2012 at 1:26 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig <pierre.haessig at crans.org
> >
> >> Coming back to Travis proposition "bit-pattern approaches to missing
> >> data (*at least* for float64 and int32) need to be implemented.", I
> >> wonder what is the amount of extra work to go from nafloat64 to
> >> nafloat32/16 ? Is there an hardware support NaN payloads with these
> >> smaller floats ? If not, or if it is too complicated, I feel it is
> >> acceptable to say "it's too complicated" and fall back to mask. One may
> >> have to choose between fancy types and fancy NAs...
> >
> > I'm in agreement here, and that was a major consideration in making a
> > 'masked' implementation first.
>
> When it comes to "missing data", bitpatterns can do everything that
> masks can do, are no more complicated to implement, and have better
> performance characteristics.
>
>
Not true.  bitpatterns inherently destroys the data, while masks do not.
For matplotlib, we can not use bitpatterns because it could over-write user
data (or we have to copy the data).  I would imagine other extension
writers would have similar issues when they need to play around with input
data in a safe manner.

Also, I doubt that the performance characteristics for strings and integers
are the same as it is for masks.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/487553d6/attachment.html>

From matthew.brett at gmail.com  Wed Mar  7 14:54:43 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 7 Mar 2012 11:54:43 -0800
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>
Message-ID: <CAH6Pt5pUwgxFc6mTQ5hEdGbvg4S9v=tpav5XeYweC9SSk+3Cwg@mail.gmail.com>

Hi,

On Wed, Mar 7, 2012 at 11:37 AM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig
>> > <pierre.haessig at crans.org>
>> >> Coming back to Travis proposition "bit-pattern approaches to missing
>> >> data (*at least* for float64 and int32) need to be implemented.", I
>> >> wonder what is the amount of extra work to go from nafloat64 to
>> >> nafloat32/16 ? Is there an hardware support NaN payloads with these
>> >> smaller floats ? If not, or if it is too complicated, I feel it is
>> >> acceptable to say "it's too complicated" and fall back to mask. One may
>> >> have to choose between fancy types and fancy NAs...
>> >
>> > I'm in agreement here, and that was a major consideration in making a
>> > 'masked' implementation first.
>>
>> When it comes to "missing data", bitpatterns can do everything that
>> masks can do, are no more complicated to implement, and have better
>> performance characteristics.
>>
>
> Maybe for float, for other things, no. And we have lots of otherthings. The
> performance is a strawman, and it *isn't* easier to implement.
>
>>
>> > Also, different folks adopt different values
>> > for 'missing' data, and distributing one or several masks along with the
>> > data is another common practice.
>>
>> True, but not really relevant to the current debate, because you have
>> to handle such issues as part of your general data import workflow
>> anyway, and none of these is any more complicated no matter which
>> implementations are available.
>>
>> > One inconvenience I have run into with the current API is that is should
>> > be
>> > easier to clear the mask from an "ignored" value without taking a new
>> > view
>> > or assigning known data. So maybe two types of masks (different
>> > payloads),
>> > or an additional flag could be helpful. The process of assigning masks
>> > could
>> > also be made a bit easier than using fancy indexing.
>>
>> So this, uh... this was actually the whole goal of the "alterNEP"
>> design for masks -- making all this stuff easy for people (like you,
>> apparently?) that want support for ignored values, separately from
>> missing data, and want a nice clean API for it. Basically having a
>> separate .mask attribute which was an ordinary, assignable array
>> broadcastable to the attached array's shape. Nobody seemed interested
>> in talking about it much then but maybe there's interest now?
>>
>
> Come off it, Nathaniel, the problem is minor and fixable. The intent of the
> initial implementation was to discover such things. These things are less
> accessible with the current API *precisely* because of the feedback from R
> users. It didn't start that way.
>
> We now have something to evolve into what we want. That is a heck of a lot
> more useful than endless discussion.

The endless discussion is for the following reason:

- The discussion was never adequately resolved.

The discussion was never adequately resolved because there was not
enough work done to understand the various arguments.   In particular,
you've several times said things that indicate to me, as to Nathaniel,
that you either have not read or have not understood the points that
Nathaniel was making.

Travis' recent email - to me - also indicates that there is still a
genuine problem here that has not been adequately explored.

There is no future in trying to stop discussion, and trying to do so
will only prolong it and make it less useful.  It will make the
discussion - endless.

If you want to help - read the alterNEP, respond to it directly, and
further the discussion by engaged debate.

Best,

Matthew


From efiring at hawaii.edu  Wed Mar  7 14:57:35 2012
From: efiring at hawaii.edu (Eric Firing)
Date: Wed, 07 Mar 2012 09:57:35 -1000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
Message-ID: <4F57BDAF.2000406@hawaii.edu>

On 03/07/2012 09:26 AM, Nathaniel Smith wrote:
> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
> <charlesr.harris at gmail.com>  wrote:
>> On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig<pierre.haessig at crans.org>
>>> Coming back to Travis proposition "bit-pattern approaches to missing
>>> data (*at least* for float64 and int32) need to be implemented.", I
>>> wonder what is the amount of extra work to go from nafloat64 to
>>> nafloat32/16 ? Is there an hardware support NaN payloads with these
>>> smaller floats ? If not, or if it is too complicated, I feel it is
>>> acceptable to say "it's too complicated" and fall back to mask. One may
>>> have to choose between fancy types and fancy NAs...
>>
>> I'm in agreement here, and that was a major consideration in making a
>> 'masked' implementation first.
>
> When it comes to "missing data", bitpatterns can do everything that
> masks can do, are no more complicated to implement, and have better
> performance characteristics.
>
>> Also, different folks adopt different values
>> for 'missing' data, and distributing one or several masks along with the
>> data is another common practice.
>
> True, but not really relevant to the current debate, because you have
> to handle such issues as part of your general data import workflow
> anyway, and none of these is any more complicated no matter which
> implementations are available.
>
>> One inconvenience I have run into with the current API is that is should be
>> easier to clear the mask from an "ignored" value without taking a new view
>> or assigning known data. So maybe two types of masks (different payloads),
>> or an additional flag could be helpful. The process of assigning masks could
>> also be made a bit easier than using fancy indexing.
>
> So this, uh... this was actually the whole goal of the "alterNEP"
> design for masks -- making all this stuff easy for people (like you,
> apparently?) that want support for ignored values, separately from
> missing data, and want a nice clean API for it. Basically having a
> separate .mask attribute which was an ordinary, assignable array
> broadcastable to the attached array's shape. Nobody seemed interested
> in talking about it much then but maybe there's interest now?

In other words, good low-level support for numpy.ma functionality?  With 
a migration path so that a separate numpy.ma might wither away?  Yes, 
there is interest; this is exactly what I think is needed for my own 
style of applications (which I think are common at least in geoscience), 
and for matplotlib.  The question is how to achieve it as simply and 
cleanly as possible while also satisfying the needs of the R users, and 
while making it easy for matplotlib, for example, to handle *any* 
reasonable input: ma, other masking, nan, or NA-bitpattern.

It may be that a rather pragmatic approach to implementation will prove 
better than a highly idealized set of data models.  Or, it may be that a 
dual approach is best, in which the flag value missing data 
implementation is tightly bound to the R model and the mask 
implementation is explicitly designed for the numpy.ma model. In any 
case, a reasonable level of agreement on the goals is needed.  I presume 
Travis's involvement will facilitate a clarification of the goals and of 
the implementation; and I expect that much of Mark's work will end up 
serving well, even if much needs to be added and the API evolves 
considerably.

Eric

>
> -- Nathaniel
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From sole at esrf.fr  Wed Mar  7 15:02:29 2012
From: sole at esrf.fr (=?UTF-8?B?IlYuIEFybWFuZG8gU29sw6ki?=)
Date: Wed, 07 Mar 2012 21:02:29 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy	C API
In-Reply-To: <4F566C29.3050209@molden.no>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
Message-ID: <4F57BED5.7050305@esrf.fr>

On 06/03/2012 20:57, Sturla Molden wrote:
> On 05.03.2012 14:26, "V. Armando Sol?" wrote:
>
>> In 2009 there was a thread in this mailing list concerning the access to
>> BLAS from C extension modules.
>>
>> If I have properly understood the thread:
>>
>> http://mail.scipy.org/pipermail/numpy-discussion/2009-November/046567.html
>>
>> the answer by then was that those functions were not exposed (only f2py
>> functions).
>>
>> I just wanted to know if the situation has changed since 2009 because it
>> is not uncommon that to optimize some operations one has to sooner or
>> later access BLAS functions that are already wrapped in numpy (either
>> from ATLAS, from the Intel MKL, ...)
> Why do you want to do this? It does not make your life easier to use
> NumPy or SciPy's Python wrappers from C. Just use BLAS directly from C
> instead.
>
Wow! It certainly makes my life much, much easier. I can compile and 
distribute my python extension *even without having ATLAS, BLAS or MKL 
installed*.
Please note I am not using the python wrappers from C. That would make 
no sense. I am using the underlying libraries supplied with python from C.

I had already used the information Robert Kern provided on the 2009 
thread and obtained the PyCObject as:

from scipy.linalg.blas import fblas
dgemm = fblas.dgemm._cpointer
sgemm = fblas.sgemm._cpointer

but I did not find a way to obtain those pointers from numpy. That was 
the goal of my post. My extension needs SciPy installed just to fetch 
the pointer. It would be very nice to have a way to get similar 
information from numpy.

I have made a test on a Debian machine with BLAS installed but no 
ATLAS-> Extension slow but working.
Then the system maintainer has installed ATLAS -> The extension flies. 
So, one can distribute a python extension that works on its own but that 
can take profit of any advanced library the end user might have installed.

Your point of view is valid if one is not going to distribute the 
extension module but I *have to* distribute the module for Linux and for 
windows. To have a proper fortran compiler for windows 64 bit compatible 
with python is already an issue. If I have to distribute my own ATLAS or 
MKL then it gets even worse. All those issues are solved just by using 
the pointer to the function.

Concerning licenses, if the end user has the right to use MKL, then he 
has the right to use it via my extension. It is not me who is using MKL

Armando
PS. The only issue I see with the whole approach is safety because the 
extension might be used to call some nasty function.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/d533c7a2/attachment.html>

From ndbecker2 at gmail.com  Wed Mar  7 15:05:47 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Mar 2012 15:05:47 -0500
Subject: [Numpy-discussion] use for missing (ignored) data?
Message-ID: <jj8f2s$74a$1@dough.gmane.org>

I'm wondering what is the use for the ignored data feature?

I can use:

A[valid_A_indexes] = whatever

to process only the 'non-ignored' portions of A.  So at least some simple cases 
of ignored data are already supported without introducing a new type.

OTOH:

w = A[valid_A_indexes]

will copy A's data, and subsequent use of 

w[:] = something

will not update A.

Is this the reason for wanting the ignored data feature?


From charlesr.harris at gmail.com  Wed Mar  7 15:17:49 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 7 Mar 2012 13:17:49 -0700
Subject: [Numpy-discussion] use for missing (ignored) data?
In-Reply-To: <jj8f2s$74a$1@dough.gmane.org>
References: <jj8f2s$74a$1@dough.gmane.org>
Message-ID: <CAB6mnxJpikzyVjd7gVdxa9M6btTh-t71UOrRD9jOBv07rqnYbg@mail.gmail.com>

On Wed, Mar 7, 2012 at 1:05 PM, Neal Becker <ndbecker2 at gmail.com> wrote:

> I'm wondering what is the use for the ignored data feature?
>
> I can use:
>
> A[valid_A_indexes] = whatever
>
> to process only the 'non-ignored' portions of A.  So at least some simple
> cases
> of ignored data are already supported without introducing a new type.
>
> OTOH:
>
> w = A[valid_A_indexes]
>
> will copy A's data, and subsequent use of
>
> w[:] = something
>
> will not update A.
>
> Is this the reason for wanting the ignored data feature?
>

Suppose you are working with plotted data and want to turn points on/off by
clicking on them interactively to see how that affects a fit. Why make
multiple copies, change sizes, destroy data, and all that nonsense? Just
have the click update the mask and redraw.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/2d6950bb/attachment.html>

From ndbecker2 at gmail.com  Wed Mar  7 15:21:49 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Mar 2012 15:21:49 -0500
Subject: [Numpy-discussion] use for missing (ignored) data?
References: <jj8f2s$74a$1@dough.gmane.org>
	<CAB6mnxJpikzyVjd7gVdxa9M6btTh-t71UOrRD9jOBv07rqnYbg@mail.gmail.com>
Message-ID: <jj8g0t$e0r$1@dough.gmane.org>

Charles R Harris wrote:

> On Wed, Mar 7, 2012 at 1:05 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
> 
>> I'm wondering what is the use for the ignored data feature?
>>
>> I can use:
>>
>> A[valid_A_indexes] = whatever
>>
>> to process only the 'non-ignored' portions of A.  So at least some simple
>> cases
>> of ignored data are already supported without introducing a new type.
>>
>> OTOH:
>>
>> w = A[valid_A_indexes]
>>
>> will copy A's data, and subsequent use of
>>
>> w[:] = something
>>
>> will not update A.
>>
>> Is this the reason for wanting the ignored data feature?
>>
> 
> Suppose you are working with plotted data and want to turn points on/off by
> clicking on them interactively to see how that affects a fit. Why make
> multiple copies, change sizes, destroy data, and all that nonsense? Just
> have the click update the mask and redraw.
> 
> Chuck

But does

some_func (A[valid_data_mask])

actually perform a copy?


From pierre.haessig at crans.org  Wed Mar  7 16:15:19 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Mar 2012 22:15:19 +0100
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <4F57BDAF.2000406@hawaii.edu>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>	<4F561590.3000400@crans.org>	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>	<4F578E60.7080304@crans.org>	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<4F57BDAF.2000406@hawaii.edu>
Message-ID: <4F57CFE7.90006@crans.org>

Hi,
Le 07/03/2012 20:57, Eric Firing a ?crit :
> In other words, good low-level support for numpy.ma functionality?
Coming back to *existing* ma support, I was just wondering whether it
was now possible to "np.save" a masked array.
(I'm using numpy 1.5)
In the end, this is the most annoying problem I have with the existing
ma module which otherwise is pretty useful to me. I'm happy not to need
to process 100% of my data though.

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/eabefdd9/attachment.sig>

From efiring at hawaii.edu  Wed Mar  7 16:35:12 2012
From: efiring at hawaii.edu (Eric Firing)
Date: Wed, 07 Mar 2012 11:35:12 -1000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <4F57CFE7.90006@crans.org>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<4F57BDAF.2000406@hawaii.edu> <4F57CFE7.90006@crans.org>
Message-ID: <4F57D490.6090104@hawaii.edu>

On 03/07/2012 11:15 AM, Pierre Haessig wrote:
> Hi,
> Le 07/03/2012 20:57, Eric Firing a ?crit :
>> In other words, good low-level support for numpy.ma functionality?
> Coming back to *existing* ma support, I was just wondering whether it
> was now possible to "np.save" a masked array.
> (I'm using numpy 1.5)

No, not with the mask preserved.  This is one of the improvements I am 
hoping for with the upcoming missing data work.

Eric

> In the end, this is the most annoying problem I have with the existing
> ma module which otherwise is pretty useful to me. I'm happy not to need
> to process 100% of my data though.
>
> Best,
> Pierre
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From ben.root at ou.edu  Wed Mar  7 16:40:13 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 7 Mar 2012 15:40:13 -0600
Subject: [Numpy-discussion] use for missing (ignored) data?
In-Reply-To: <jj8g0t$e0r$1@dough.gmane.org>
References: <jj8f2s$74a$1@dough.gmane.org>
	<CAB6mnxJpikzyVjd7gVdxa9M6btTh-t71UOrRD9jOBv07rqnYbg@mail.gmail.com>
	<jj8g0t$e0r$1@dough.gmane.org>
Message-ID: <CANNq6F=id1GG+bAHerWKzbbugN4FBBSLwuq234YnPKFgpBjU3w@mail.gmail.com>

On Wednesday, March 7, 2012, Neal Becker <ndbecker2 at gmail.com> wrote:
> Charles R Harris wrote:
>
>> On Wed, Mar 7, 2012 at 1:05 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
>>
>>> I'm wondering what is the use for the ignored data feature?
>>>
>>> I can use:
>>>
>>> A[valid_A_indexes] = whatever
>>>
>>> to process only the 'non-ignored' portions of A.  So at least some
simple
>>> cases
>>> of ignored data are already supported without introducing a new type.
>>>
>>> OTOH:
>>>
>>> w = A[valid_A_indexes]
>>>
>>> will copy A's data, and subsequent use of
>>>
>>> w[:] = something
>>>
>>> will not update A.
>>>
>>> Is this the reason for wanting the ignored data feature?
>>>
>>
>> Suppose you are working with plotted data and want to turn points on/off
by
>> clicking on them interactively to see how that affects a fit. Why make
>> multiple copies, change sizes, destroy data, and all that nonsense? Just
>> have the click update the mask and redraw.
>>
>> Chuck
>
> But does
>
> some_func (A[valid_data_mask])
>
> actually perform a copy?
>
>

Yes! If it isn't sliced, or accessed by a scalar index, then you are given
a copy.  Fancy indexing and Boolean indexing will not return a view.

Note that assignments to a Boolean-indexed array by a scalar is
special-cased. I.e.,

A[valid_points] = 5

will do what you expect. But,

A[valid_points] += 5

may not, IIRC.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/4d0bea78/attachment.html>

From pierre.haessig at crans.org  Wed Mar  7 17:02:39 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 07 Mar 2012 23:02:39 +0100
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>	<4F5790BB.2050801@crans.org>
	<CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>
Message-ID: <4F57DAFF.3030302@crans.org>

Hi Charles,
Le 07/03/2012 18:00, Charles R Harris a ?crit :
>
> That's a good idea, I'll take care of it. Note the caveat about the
> coefficients going in the opposite direction.
Great ! In the mean time I changed a bit the root polynomials reference
to emphasize the new Polynomial class.

http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.polynomials.rst/

I feel like that the preexisting text was more targeting people with a
preexisting knowledge of Poly1d than people looking for Polynomials in
NumPy in general. (Poly1d is unfortunately the number 1 Google result
for "polynomials numpy"...)

Please make sure I didn't write something completely stupid. Also, I
tried to include links using :doc: but the editor complains. I hope it
will work after a sphinx compilation. There is a build bot, isn't it ?
The aim of the first link is that the "user-in-a-hurry" ends up on the
best piece of documentation available, namely "Using the Convenience
Classes" after a short intro. Does this sounds good ?

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/6302de66/attachment.sig>

From josef.pktd at gmail.com  Wed Mar  7 17:56:47 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 7 Mar 2012 17:56:47 -0500
Subject: [Numpy-discussion] Github key audit.
In-Reply-To: <CAB6mnxLyCv0OBR3kAxei6kwMBxDjdZho6vm=K8Q++qeAXOPu=A@mail.gmail.com>
References: <CAB6mnxLyCv0OBR3kAxei6kwMBxDjdZho6vm=K8Q++qeAXOPu=A@mail.gmail.com>
Message-ID: <CAMMTP+BLMd4PxHZctWQCkSYyozWE5FjxxsFxtryoNqnvZ7BhGg@mail.gmail.com>

On Wed, Mar 7, 2012 at 1:54 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> Hi All,
>
> Many here have probably received the message from github about push/pull
> access being blocked until you have auditied your ssh keys. To generate a
> key fingerprint on fedora, I did the following:
>
> $charris at f16 ~$ ssh-keygen -l -f .ssh/id_dsa.pub
>
> I don't how this looks for those of you using windows.

I didn't do much, just confirmed with github that my ssh key is really
mine, which git gui was nice enough to tell me.

Josef

>
> The compromise looks to be a 'whitehat' attack telling the rails developers
> to "FIX THE G*DD*MNED PROBLEM". More here, Github compromized.
>
> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From njs at pobox.com  Wed Mar  7 18:10:51 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 7 Mar 2012 23:10:51 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>
Message-ID: <CAPJVwB=wHMJbdDjMksK7=jdtP5eLkpe=BaPtnnX-r2-4tkfrkA@mail.gmail.com>

On Wed, Mar 7, 2012 at 7:37 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> When it comes to "missing data", bitpatterns can do everything that
>> masks can do, are no more complicated to implement, and have better
>> performance characteristics.
>>
>
> Maybe for float, for other things, no. And we have lots of otherthings.

It would be easier to discuss this if you'd, like, discuss :-(. If you
know of some advantage that masks have over bitpatterns when it comes
to missing data, can you please share it, instead of just asserting
it?

Not that I'm immune... I perhaps should have been more explicit
myself, when I said "performance characteristics", let me clarify that
I was thinking of both speed (for floats) and memory (for
most-but-not-all things).

> The
> performance is a strawman,

How many users need to speak up to say that this is a serious problem
they have with the current implementation before you stop calling it a
strawman? Because when Wes says that it's not going to fly for his
stats/econometics cases, and the neuroimaging folk like Gary and Matt
say it's not going to fly for their use cases... surely just waving
that away is a bit dismissive?

I'm not saying that we *have* to implement bitpatterns because
performance is *the most important feature* -- I'm just saying, well,
what I said. For *missing data use* cases, bitpatterns have better
performance characteristics than masks. If we decide that these use
cases are important, then we should take this into account and weigh
it against other considerations. Maybe what you think is that these
use cases shouldn't be the focus of this feature and it should focus
on the "ignored" use cases instead? That would be a legitimate
argument... but if that's what you want to say, say it, don't just
dismiss your users!

> and it *isn't* easier to implement.

If I thought bitpatterns would be easier to implement, I would have
said so... What I said was that they're not harder. You have some
extra complexity, mostly in casting, and some reduced complexity -- no
need to allocate and manipulate the mask. (E.g., simple same-type
assignments and slicing require special casing for masks, but not for
bitpatterns.) In many places the complexity is identical -- printing
routines need to check for either special bitpatterns or masked
values, whatever. Ufunc loops need to either find the appropriate part
of the mask, or create a temporary mask buffer by calling a dtype
func, whatever. On net they seem about equivalent, complexity-wise.

...I assume you disagree with this analysis, since I've said it
before, wrote up a sketch for how the implementation would work at the
C level, etc., and you continue to claim that simplicity is a
compelling advantage for the masked approach. But I still don't know
why you think that :-(.

>> > Also, different folks adopt different values
>> > for 'missing' data, and distributing one or several masks along with the
>> > data is another common practice.
>>
>> True, but not really relevant to the current debate, because you have
>> to handle such issues as part of your general data import workflow
>> anyway, and none of these is any more complicated no matter which
>> implementations are available.
>>
>> > One inconvenience I have run into with the current API is that is should
>> > be
>> > easier to clear the mask from an "ignored" value without taking a new
>> > view
>> > or assigning known data. So maybe two types of masks (different
>> > payloads),
>> > or an additional flag could be helpful. The process of assigning masks
>> > could
>> > also be made a bit easier than using fancy indexing.
>>
>> So this, uh... this was actually the whole goal of the "alterNEP"
>> design for masks -- making all this stuff easy for people (like you,
>> apparently?) that want support for ignored values, separately from
>> missing data, and want a nice clean API for it. Basically having a
>> separate .mask attribute which was an ordinary, assignable array
>> broadcastable to the attached array's shape. Nobody seemed interested
>> in talking about it much then but maybe there's interest now?
>>
>
> Come off it, Nathaniel, the problem is minor and fixable. The intent of the
> initial implementation was to discover such things.

Implementation can be wonderful, I absolutely agree. But you
understand that I'd be more impressed by this example if your
discovery weren't something I had been arguing for since before the
implementation began :-).

> These things are less
> accessible with the current API *precisely* because of the feedback from R
> users. It didn't start that way.
>
> We now have something to evolve into what we want. That is a heck of a lot
> more useful than endless discussion.

No, you are still missing the point completely! There is no "what *we*
want", because what you want is different than what I want. The
masking stuff in the alterNEP was an attempt to give people like you
who wanted "ignored" support what they wanted, and the bitpattern
stuff was to satisfy people like me who want "missing data" support.
The NEP took a different approach to trying to make everyone happy...
unfortunately it sounds like it made no-one happy. Blaming the R users
for this isn't *wrong*, exactly, but it's a bit one-sided.

If you have a proposal for how the current code can be "evolved" into
something that will make the neuro/econ/stats people happy, then
please tell us. But I don't see how it's possible, and your current
proposals are going in the wrong direction. Unless we can actually
talk about these disagreements, we're just going to have more endless
discussion.

-- Nathaniel


From njs at pobox.com  Wed Mar  7 18:21:33 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 7 Mar 2012 23:21:33 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CANNq6Fnsh0kasNVGRq3z-YSUeL0Zt3Lgxyvajchk0fj0cw0VgA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<CANNq6Fnsh0kasNVGRq3z-YSUeL0Zt3Lgxyvajchk0fj0cw0VgA@mail.gmail.com>
Message-ID: <CAPJVwBk4AhffK9SxCAWba_Cqp+mdRQAadKkf8xcQE3QzLhN+bg@mail.gmail.com>

On Wed, Mar 7, 2012 at 7:39 PM, Benjamin Root <ben.root at ou.edu> wrote:
> On Wed, Mar 7, 2012 at 1:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> When it comes to "missing data", bitpatterns can do everything that
>> masks can do, are no more complicated to implement, and have better
>> performance characteristics.
>>
>
> Not true.? bitpatterns inherently destroys the data, while masks do not.

Yes, that's why I only wrote that this is true for "missing data", not
in general :-). If you have data that is being destroyed, then that's
not missing data, by definition. We don't have consensus yet on
whether that's the use case we are aiming for, but it's the one that
Pierre was worrying about.

> For matplotlib, we can not use bitpatterns because it could over-write user
> data (or we have to copy the data).? I would imagine other extension writers
> would have similar issues when they need to play around with input data in a
> safe manner.

Right. You clearly need some sort of masking, either an explicit mask
array that you keep somewhere, or one that gets attached to the
underlying ndarray in some non-destructive way.

> Also, I doubt that the performance characteristics for strings and integers
> are the same as it is for masks.

Not sure what you mean by this, but I'd be happy to hear more.

-- Nathaniel


From njs at pobox.com  Wed Mar  7 18:41:01 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 7 Mar 2012 23:41:01 +0000
Subject: [Numpy-discussion] use for missing (ignored) data?
In-Reply-To: <jj8f2s$74a$1@dough.gmane.org>
References: <jj8f2s$74a$1@dough.gmane.org>
Message-ID: <CAPJVwBmBTXdALVUt0qR3sqX7bgkhSoZ4vbssm0kqRxBc-+H5fg@mail.gmail.com>

On Wed, Mar 7, 2012 at 8:05 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
> I'm wondering what is the use for the ignored data feature?
>
> I can use:
>
> A[valid_A_indexes] = whatever
>
> to process only the 'non-ignored' portions of A. ?So at least some simple cases
> of ignored data are already supported without introducing a new type.
>
> OTOH:
>
> w = A[valid_A_indexes]
>
> will copy A's data, and subsequent use of
>
> w[:] = something
>
> will not update A.
>
> Is this the reason for wanting the ignored data feature?

Hi Neal,

There are a few reasons that I know of why people want more support
from numpy for ignored data/masks, specifically (as opposed to missing
data or other related concepts):

1) If you're often working on some subset of your data, then it's
convenient to set the mask once and have it stay in effect for further
operations. Anything you can accomplish this way can also be
accomplished by keeping an explicit mask array and using it for
indexing "by hand", but in some situations it may be more convenient
not to.

2) Operating on subsets of an array without making a copy. Like
Benjamin pointed out, indexing with a mask makes a copy. This is slow,
and what's worse, people who work with large data sets (e.g., big fMRI
volumes) may not have enough memory to afford such a copy. This
problem can be solved by using the new where= argument to ufuncs
(which skips the copy). (But then see (1) -- passing where= to a bunch
of functions takes more typing than just setting it once and leaving
it.)

3) Suppose there's a 3rd-party function that takes an array --
borrowing Charles example, say it's draw_points(arr). Now you want to
apply it to just a subset of your data, and want to avoid a copy. It
would be nice if the original author had made it draw_points(arr,
mask), but they didn't. Well, if you have masking "built in" to your
array type, then maybe you can call this as draw_points(masked_arr)
and it will Just Work. I.e., maybe people who aren't thinking about
masking will sometimes write code that accidentally works with masking
anyway. I'm not sure how much I'd trust this, but I guess it's nice
when it happens. And if it does work, then implementing the show/hide
point functionality will be easier. (And if it doesn't work, and
masking is built into numpy.ndarray, then maybe you can use this to
argue with the original author that this is a bug, not just a missing
feature. Again, I'm not sure if this is a good thing on net: one could
argue that people shouldn't be forced to think about masking every
time they write any function, just in case it becomes relevant later.
But certainly it'd be useful sometimes.)

There may be other motivations that I'm not aware of, of course.

-- Nathaniel


From matthew.brett at gmail.com  Wed Mar  7 19:08:39 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 7 Mar 2012 16:08:39 -0800
Subject: [Numpy-discussion] Casting rules changed in trunk?
Message-ID: <CAH6Pt5qJPn0f+T4b=ZbNZcq=FgpAoOG-XvAZH7=d5nZco-wKeg@mail.gmail.com>

Hi,

I noticed a casting change running the test suite on our image reader,
nibabel: https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py

For this script:

<pre>
import numpy as np

Adata = np.zeros((2,), dtype=np.uint8)
Bdata = np.zeros((2,), dtype=np.int16)
Bzero = np.int16(0)
Bbig = np.int16(256)

print np.__version__
print 'Array add', (Adata + Bdata).dtype
print 'Scalar 0 add', (Adata + Bzero).dtype
print 'Scalar 256 add', (Adata + Bbig).dtype
</pre>

1.4.1
Array add int16
Scalar 0 add uint8
Scalar 256 add uint8

1.5.1
Array add int16
Scalar 0 add uint8
Scalar 256 add uint8

1.6.1
Array add int16
Scalar 0 add uint8
Scalar 256 add int16

1.7.0.dev-aae5b0a
Array add int16
Scalar 0 add uint8
Scalar 256 add uint16

I can understand the uint8 outputs from numpy < 1.6 - the rule being
not to upcast for scalars.

I can understand the int16 output from 1.6.1 on the basis that the
value is outside uint8 range and therefore we might prefer a type that
can handle values from both uint8 and int16.

Was the current change intended?  It has the following odd effect:

In [5]: Adata + np.int16(257)
Out[5]: array([257, 257], dtype=uint16)

In [7]: Adata + np.int16(-257)
Out[7]: array([-257, -257], dtype=int16)

In [8]: Adata - np.int16(257)
Out[8]: array([65279, 65279], dtype=uint16)

but I guess you can argue that there are odd effects for other choices too,

Best,

Matthew


From christoph.gohle at mpq.mpg.de  Wed Mar  7 20:04:53 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Thu, 8 Mar 2012 02:04:53 +0100
Subject: [Numpy-discussion] subclassing array in c
Message-ID: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi, 

I have been struggeling for quite some time now. Desperate as I am, now I need help. 

I was trying to subclass ndarrays in a c extension (see code below) and do constantly get segfaults. I have been checking my INCREF and DECREF stuff up and down but can't find the error. Probably I got something completely wrong... anybody able to help?

Thanks,
Christoph
- -----------------
#include <Python.h>
#include <structmember.h>

#include <numpy/arrayobject.h>

static PyObject *SpamError;

typedef struct {
  PyArrayObject base;
  PyDictObject* unit;
} UnitArrayObject;

PyTypeObject unitArrayObjectType;

static int
checkunit(PyObject *unit1, PyObject *unit2) {
  return PyObject_Compare(unit1, unit2);
}

static PyObject *
unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
	PyObject *data = NULL,
			 *unit = NULL;
  PyArray_Descr* dtype = NULL;
  PyObject *res = NULL, *tmp = NULL;

	if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit, PyArray_DescrConverter, &dtype)) {
		Py_XDECREF(dtype);
		return NULL;
	}
  
  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
	if (res == NULL) {
		Py_XDECREF(dtype);
    //TODO: raise exception?
		return NULL;
	}

	if (PyObject_IsInstance(data, (PyObject*)cls)) {
		if (unit!=NULL && !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
			Py_XDECREF(res);
 			//TODO: raise exception
			return NULL;
		}
	} else {
		if (PyObject_IsTrue(unit)) {
      tmp = res;
			res = PyArray_View((PyArrayObject*)res, NULL, &unitArrayObjectType);
      if (tmp!=res) {
        Py_XDECREF(tmp);
      }
      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
      Py_INCREF(unit);
      if (unit!=NULL) {
      }
		}
	}
	return res;
}

static PyObject*
unitArray__array_finalize__(PyObject* new, PyObject* args) {
	PyObject *attr = NULL, *tmp = NULL;
  PyObject *parent = NULL;

  if (!PyArg_ParseTuple(args, "O", &parent)) {
    return NULL;
  }
   if (parent!=NULL) {
     attr = PyObject_GetAttrString(parent, "unit");
     if (attr == NULL) {
        //parent has no 'unit' so we make a new empty one
       attr = PyDict_New();
       PyErr_Clear();
     }
   } 
  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;

  Py_INCREF(Py_None);
  return Py_None;
}

static PyObject* 
unitArray__array_wrap__(PyObject *self, PyObject *args) {
	PyObject *array = NULL, *context = NULL;
	
	if (!PyArg_ParseTuple(args, "OO", array, context)) {
		//TODO: raise exception
		return NULL;
	}
	
	printf("%s",PyString_AsString(PyObject_Str(context)));
	
  Py_INCREF(array);
  return array;
}


static PyMethodDef unitArrayMethods[] = {
  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS, "array finalize method"},
  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap method"},
  {NULL, NULL, 0, NULL}
};

static PyMemberDef unitArrayMembers[] = {
  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary containing unit info."},
  {NULL, 0, 0, 0, NULL}
};

PyTypeObject unitArrayObjectType = {
	PyObject_HEAD_INIT(NULL)
	0,				/* ob_size        */
	"spam.UnitArray",		/* tp_name        */
	sizeof(UnitArrayObject),		/* tp_basicsize   */
	0,				/* tp_itemsize    */
	0,				/* tp_dealloc     */
	0,				/* tp_print       */
	0,				/* tp_getattr     */
	0,				/* tp_setattr     */
	0,				/* tp_compare     */
	0,				/* tp_repr        */
	0,				/* tp_as_number   */
	0,				/* tp_as_sequence */
	0,				/* tp_as_mapping  */
	0,				/* tp_hash        */
	0,				/* tp_call        */
	0,				/* tp_str         */
	0,				/* tp_getattro    */
	0,				/* tp_setattro    */
	0,				/* tp_as_buffer   */
	Py_TPFLAGS_DEFAULT,		/* tp_flags       */
	"A numpy array with units",	/* tp_doc         */
  0,        /* traverseproc */
  0,        /* tp_clear*/
  0,        /* tp_richcompare */
  0,        /* tp_weaklistoffset */
  0,        /* tp_iter */
  0,        /* tp_iternext */
  unitArrayMethods,        /* tp_methods */
  unitArrayMembers,        /* tp_members */
  0,        /* tp_getset */
  0,        /* tp_base*/
  0,        /* tp_dict */
  0,        /* tp_descr_get*/
  0,        /* tp_descr_set */
  0,        /* tp_dictoffset */
  0,        /* tp_init */
  0,        /* tp_alloc */
  unitArray_new /* tp_new */
};


static PyMethodDef SpamMethods[] = {
    {NULL, NULL, 0, NULL}        /* Sentinel */
};

PyObject *unitArray = NULL;
PyMODINIT_FUNC
initspampub(void)
{
  import_array();

  PyObject *m;

  Py_INCREF(&PyArray_Type);
  unitArrayObjectType.tp_base = &PyArray_Type;

  if (PyType_Ready(&unitArrayObjectType) < 0)
    return;


  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type with units.");
  if (m == NULL)
    return;

  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
  Py_INCREF(SpamError);
  PyModule_AddObject(m, "error", SpamError);
  Py_INCREF(&unitArrayObjectType);
  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
  (void) Py_InitModule("spampub", SpamMethods);

}


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
=Qtek
-----END PGP SIGNATURE-----


From kalatsky at gmail.com  Wed Mar  7 20:15:15 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Wed, 7 Mar 2012 19:15:15 -0600
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
Message-ID: <CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>

Seeing the backtrace would be helpful.
Can you do whatever leads to the segfault
from python run from gdb?
Val

On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle
<christoph.gohle at mpq.mpg.de>wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I have been struggeling for quite some time now. Desperate as I am, now I
> need help.
>
> I was trying to subclass ndarrays in a c extension (see code below) and do
> constantly get segfaults. I have been checking my INCREF and DECREF stuff
> up and down but can't find the error. Probably I got something completely
> wrong... anybody able to help?
>
> Thanks,
> Christoph
> - -----------------
> #include <Python.h>
> #include <structmember.h>
>
> #include <numpy/arrayobject.h>
>
> static PyObject *SpamError;
>
> typedef struct {
>  PyArrayObject base;
>  PyDictObject* unit;
> } UnitArrayObject;
>
> PyTypeObject unitArrayObjectType;
>
> static int
> checkunit(PyObject *unit1, PyObject *unit2) {
>  return PyObject_Compare(unit1, unit2);
> }
>
> static PyObject *
> unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>        PyObject *data = NULL,
>                         *unit = NULL;
>  PyArray_Descr* dtype = NULL;
>  PyObject *res = NULL, *tmp = NULL;
>
>        if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit,
> PyArray_DescrConverter, &dtype)) {
>                Py_XDECREF(dtype);
>                return NULL;
>        }
>
>  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>        if (res == NULL) {
>                Py_XDECREF(dtype);
>    //TODO: raise exception?
>                return NULL;
>        }
>
>        if (PyObject_IsInstance(data, (PyObject*)cls)) {
>                if (unit!=NULL &&
> !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>                        Py_XDECREF(res);
>                        //TODO: raise exception
>                        return NULL;
>                }
>        } else {
>                if (PyObject_IsTrue(unit)) {
>      tmp = res;
>                        res = PyArray_View((PyArrayObject*)res, NULL,
> &unitArrayObjectType);
>      if (tmp!=res) {
>        Py_XDECREF(tmp);
>      }
>      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>      Py_INCREF(unit);
>      if (unit!=NULL) {
>      }
>                }
>        }
>        return res;
> }
>
> static PyObject*
> unitArray__array_finalize__(PyObject* new, PyObject* args) {
>        PyObject *attr = NULL, *tmp = NULL;
>  PyObject *parent = NULL;
>
>  if (!PyArg_ParseTuple(args, "O", &parent)) {
>    return NULL;
>  }
>   if (parent!=NULL) {
>     attr = PyObject_GetAttrString(parent, "unit");
>     if (attr == NULL) {
>        //parent has no 'unit' so we make a new empty one
>       attr = PyDict_New();
>       PyErr_Clear();
>     }
>   }
>  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
>
>  Py_INCREF(Py_None);
>  return Py_None;
> }
>
> static PyObject*
> unitArray__array_wrap__(PyObject *self, PyObject *args) {
>        PyObject *array = NULL, *context = NULL;
>
>        if (!PyArg_ParseTuple(args, "OO", array, context)) {
>                //TODO: raise exception
>                return NULL;
>        }
>
>        printf("%s",PyString_AsString(PyObject_Str(context)));
>
>  Py_INCREF(array);
>  return array;
> }
>
>
> static PyMethodDef unitArrayMethods[] = {
>  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS, "array
> finalize method"},
>  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap
> method"},
>  {NULL, NULL, 0, NULL}
> };
>
> static PyMemberDef unitArrayMembers[] = {
>  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary
> containing unit info."},
>  {NULL, 0, 0, 0, NULL}
> };
>
> PyTypeObject unitArrayObjectType = {
>        PyObject_HEAD_INIT(NULL)
>        0,                              /* ob_size        */
>        "spam.UnitArray",               /* tp_name        */
>        sizeof(UnitArrayObject),                /* tp_basicsize   */
>        0,                              /* tp_itemsize    */
>        0,                              /* tp_dealloc     */
>        0,                              /* tp_print       */
>        0,                              /* tp_getattr     */
>        0,                              /* tp_setattr     */
>        0,                              /* tp_compare     */
>        0,                              /* tp_repr        */
>        0,                              /* tp_as_number   */
>        0,                              /* tp_as_sequence */
>        0,                              /* tp_as_mapping  */
>        0,                              /* tp_hash        */
>        0,                              /* tp_call        */
>        0,                              /* tp_str         */
>        0,                              /* tp_getattro    */
>        0,                              /* tp_setattro    */
>        0,                              /* tp_as_buffer   */
>        Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>        "A numpy array with units",     /* tp_doc         */
>  0,        /* traverseproc */
>  0,        /* tp_clear*/
>  0,        /* tp_richcompare */
>  0,        /* tp_weaklistoffset */
>  0,        /* tp_iter */
>  0,        /* tp_iternext */
>  unitArrayMethods,        /* tp_methods */
>  unitArrayMembers,        /* tp_members */
>  0,        /* tp_getset */
>  0,        /* tp_base*/
>  0,        /* tp_dict */
>  0,        /* tp_descr_get*/
>  0,        /* tp_descr_set */
>  0,        /* tp_dictoffset */
>  0,        /* tp_init */
>  0,        /* tp_alloc */
>  unitArray_new /* tp_new */
> };
>
>
> static PyMethodDef SpamMethods[] = {
>    {NULL, NULL, 0, NULL}        /* Sentinel */
> };
>
> PyObject *unitArray = NULL;
> PyMODINIT_FUNC
> initspampub(void)
> {
>  import_array();
>
>  PyObject *m;
>
>  Py_INCREF(&PyArray_Type);
>  unitArrayObjectType.tp_base = &PyArray_Type;
>
>  if (PyType_Ready(&unitArrayObjectType) < 0)
>    return;
>
>
>  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type
> with units.");
>  if (m == NULL)
>    return;
>
>  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>  Py_INCREF(SpamError);
>  PyModule_AddObject(m, "error", SpamError);
>  Py_INCREF(&unitArrayObjectType);
>  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
>  (void) Py_InitModule("spampub", SpamMethods);
>
> }
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>
> iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
> qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
> =Qtek
> -----END PGP SIGNATURE-----
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/4a0bb2ab/attachment.html>

From kalatsky at gmail.com  Wed Mar  7 20:36:39 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Wed, 7 Mar 2012 19:36:39 -0600
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
Message-ID: <CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>

Tried it on my Ubuntu 10.10 box, no problem:

1) Saved as spampub.c
2) Compiled with (setup.py attached): python setup.py build_ext -i
3) Tested from ipython:
In [1]: import spampub
In [2]: ua=spampub.UnitArray([0,1,2,3.0],'liter')
In [3]: ua
Out[3]: UnitArray([ 0.,  1.,  2.,  3.])
In [4]: ua.unit
Out[4]: 'liter'


On Wed, Mar 7, 2012 at 7:15 PM, Val Kalatsky <kalatsky at gmail.com> wrote:

>
> Seeing the backtrace would be helpful.
> Can you do whatever leads to the segfault
> from python run from gdb?
> Val
>
>
> On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle <
> christoph.gohle at mpq.mpg.de> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hi,
>>
>> I have been struggeling for quite some time now. Desperate as I am, now I
>> need help.
>>
>> I was trying to subclass ndarrays in a c extension (see code below) and
>> do constantly get segfaults. I have been checking my INCREF and DECREF
>> stuff up and down but can't find the error. Probably I got something
>> completely wrong... anybody able to help?
>>
>> Thanks,
>> Christoph
>> - -----------------
>> #include <Python.h>
>> #include <structmember.h>
>>
>> #include <numpy/arrayobject.h>
>>
>> static PyObject *SpamError;
>>
>> typedef struct {
>>  PyArrayObject base;
>>  PyDictObject* unit;
>> } UnitArrayObject;
>>
>> PyTypeObject unitArrayObjectType;
>>
>> static int
>> checkunit(PyObject *unit1, PyObject *unit2) {
>>  return PyObject_Compare(unit1, unit2);
>> }
>>
>> static PyObject *
>> unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>>        PyObject *data = NULL,
>>                         *unit = NULL;
>>  PyArray_Descr* dtype = NULL;
>>  PyObject *res = NULL, *tmp = NULL;
>>
>>        if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit,
>> PyArray_DescrConverter, &dtype)) {
>>                Py_XDECREF(dtype);
>>                return NULL;
>>        }
>>
>>  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>>        if (res == NULL) {
>>                Py_XDECREF(dtype);
>>    //TODO: raise exception?
>>                return NULL;
>>        }
>>
>>        if (PyObject_IsInstance(data, (PyObject*)cls)) {
>>                if (unit!=NULL &&
>> !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>>                        Py_XDECREF(res);
>>                        //TODO: raise exception
>>                        return NULL;
>>                }
>>        } else {
>>                if (PyObject_IsTrue(unit)) {
>>      tmp = res;
>>                        res = PyArray_View((PyArrayObject*)res, NULL,
>> &unitArrayObjectType);
>>      if (tmp!=res) {
>>        Py_XDECREF(tmp);
>>      }
>>      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>>      Py_INCREF(unit);
>>      if (unit!=NULL) {
>>      }
>>                }
>>        }
>>        return res;
>> }
>>
>> static PyObject*
>> unitArray__array_finalize__(PyObject* new, PyObject* args) {
>>        PyObject *attr = NULL, *tmp = NULL;
>>  PyObject *parent = NULL;
>>
>>  if (!PyArg_ParseTuple(args, "O", &parent)) {
>>    return NULL;
>>  }
>>   if (parent!=NULL) {
>>     attr = PyObject_GetAttrString(parent, "unit");
>>     if (attr == NULL) {
>>        //parent has no 'unit' so we make a new empty one
>>       attr = PyDict_New();
>>       PyErr_Clear();
>>     }
>>   }
>>  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>>    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
>>
>>  Py_INCREF(Py_None);
>>  return Py_None;
>> }
>>
>> static PyObject*
>> unitArray__array_wrap__(PyObject *self, PyObject *args) {
>>        PyObject *array = NULL, *context = NULL;
>>
>>        if (!PyArg_ParseTuple(args, "OO", array, context)) {
>>                //TODO: raise exception
>>                return NULL;
>>        }
>>
>>        printf("%s",PyString_AsString(PyObject_Str(context)));
>>
>>  Py_INCREF(array);
>>  return array;
>> }
>>
>>
>> static PyMethodDef unitArrayMethods[] = {
>>  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS, "array
>> finalize method"},
>>  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap
>> method"},
>>  {NULL, NULL, 0, NULL}
>> };
>>
>> static PyMemberDef unitArrayMembers[] = {
>>  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary
>> containing unit info."},
>>  {NULL, 0, 0, 0, NULL}
>> };
>>
>> PyTypeObject unitArrayObjectType = {
>>        PyObject_HEAD_INIT(NULL)
>>        0,                              /* ob_size        */
>>        "spam.UnitArray",               /* tp_name        */
>>        sizeof(UnitArrayObject),                /* tp_basicsize   */
>>        0,                              /* tp_itemsize    */
>>        0,                              /* tp_dealloc     */
>>        0,                              /* tp_print       */
>>        0,                              /* tp_getattr     */
>>        0,                              /* tp_setattr     */
>>        0,                              /* tp_compare     */
>>        0,                              /* tp_repr        */
>>        0,                              /* tp_as_number   */
>>        0,                              /* tp_as_sequence */
>>        0,                              /* tp_as_mapping  */
>>        0,                              /* tp_hash        */
>>        0,                              /* tp_call        */
>>        0,                              /* tp_str         */
>>        0,                              /* tp_getattro    */
>>        0,                              /* tp_setattro    */
>>        0,                              /* tp_as_buffer   */
>>        Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>>        "A numpy array with units",     /* tp_doc         */
>>  0,        /* traverseproc */
>>  0,        /* tp_clear*/
>>  0,        /* tp_richcompare */
>>  0,        /* tp_weaklistoffset */
>>  0,        /* tp_iter */
>>  0,        /* tp_iternext */
>>  unitArrayMethods,        /* tp_methods */
>>  unitArrayMembers,        /* tp_members */
>>  0,        /* tp_getset */
>>  0,        /* tp_base*/
>>  0,        /* tp_dict */
>>  0,        /* tp_descr_get*/
>>  0,        /* tp_descr_set */
>>  0,        /* tp_dictoffset */
>>  0,        /* tp_init */
>>  0,        /* tp_alloc */
>>  unitArray_new /* tp_new */
>> };
>>
>>
>> static PyMethodDef SpamMethods[] = {
>>    {NULL, NULL, 0, NULL}        /* Sentinel */
>> };
>>
>> PyObject *unitArray = NULL;
>> PyMODINIT_FUNC
>> initspampub(void)
>> {
>>  import_array();
>>
>>  PyObject *m;
>>
>>  Py_INCREF(&PyArray_Type);
>>  unitArrayObjectType.tp_base = &PyArray_Type;
>>
>>  if (PyType_Ready(&unitArrayObjectType) < 0)
>>    return;
>>
>>
>>  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type
>> with units.");
>>  if (m == NULL)
>>    return;
>>
>>  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>>  Py_INCREF(SpamError);
>>  PyModule_AddObject(m, "error", SpamError);
>>  Py_INCREF(&unitArrayObjectType);
>>  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
>>  (void) Py_InitModule("spampub", SpamMethods);
>>
>> }
>>
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>>
>> iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
>> qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
>> =Qtek
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/a7726319/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: setup.py
Type: application/octet-stream
Size: 550 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/a7726319/attachment.obj>

From cgohlke at uci.edu  Wed Mar  7 21:01:14 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Wed, 07 Mar 2012 18:01:14 -0800
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
Message-ID: <4F5812EA.5020306@uci.edu>

FWIW, this crashes on Windows with numpy 1.6.1 but not numpy 1.7-git 
debug build.

Christoph Gohlke


On 3/7/2012 5:36 PM, Val Kalatsky wrote:
>
> Tried it on my Ubuntu 10.10 box, no problem:
>
> 1) Saved as spampub.c
> 2) Compiled with (setup.py attached): python setup.py build_ext -i
> 3) Tested from ipython:
> In [1]: import spampub
> In [2]: ua=spampub.UnitArray([0,1,2,3.0],'liter')
> In [3]: ua
> Out[3]: UnitArray([ 0.,  1.,  2.,  3.])
> In [4]: ua.unit
> Out[4]: 'liter'
>
>
> On Wed, Mar 7, 2012 at 7:15 PM, Val Kalatsky <kalatsky at gmail.com
> <mailto:kalatsky at gmail.com>> wrote:
>
>
>     Seeing the backtrace would be helpful.
>     Can you do whatever leads to the segfault
>     from python run from gdb?
>     Val
>
>
>     On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle
>     <christoph.gohle at mpq.mpg.de <mailto:christoph.gohle at mpq.mpg.de>> wrote:
>
>         -----BEGIN PGP SIGNED MESSAGE-----
>         Hash: SHA1
>
>         Hi,
>
>         I have been struggeling for quite some time now. Desperate as I
>         am, now I need help.
>
>         I was trying to subclass ndarrays in a c extension (see code
>         below) and do constantly get segfaults. I have been checking my
>         INCREF and DECREF stuff up and down but can't find the error.
>         Probably I got something completely wrong... anybody able to help?
>
>         Thanks,
>         Christoph
>         - -----------------
>         #include <Python.h>
>         #include <structmember.h>
>
>         #include <numpy/arrayobject.h>
>
>         static PyObject *SpamError;
>
>         typedef struct {
>           PyArrayObject base;
>           PyDictObject* unit;
>         } UnitArrayObject;
>
>         PyTypeObject unitArrayObjectType;
>
>         static int
>         checkunit(PyObject *unit1, PyObject *unit2) {
>           return PyObject_Compare(unit1, unit2);
>         }
>
>         static PyObject *
>         unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>                 PyObject *data = NULL,
>                                  *unit = NULL;
>           PyArray_Descr* dtype = NULL;
>           PyObject *res = NULL, *tmp = NULL;
>
>                 if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit,
>         PyArray_DescrConverter, &dtype)) {
>                         Py_XDECREF(dtype);
>                         return NULL;
>                 }
>
>           res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>                 if (res == NULL) {
>                         Py_XDECREF(dtype);
>             //TODO: raise exception?
>                         return NULL;
>                 }
>
>                 if (PyObject_IsInstance(data, (PyObject*)cls)) {
>                         if (unit!=NULL &&
>         !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>                                 Py_XDECREF(res);
>                                 //TODO: raise exception
>                                 return NULL;
>                         }
>                 } else {
>                         if (PyObject_IsTrue(unit)) {
>               tmp = res;
>                                 res = PyArray_View((PyArrayObject*)res,
>         NULL, &unitArrayObjectType);
>               if (tmp!=res) {
>                 Py_XDECREF(tmp);
>               }
>               ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>               Py_INCREF(unit);
>               if (unit!=NULL) {
>               }
>                         }
>                 }
>                 return res;
>         }
>
>         static PyObject*
>         unitArray__array_finalize__(PyObject* new, PyObject* args) {
>                 PyObject *attr = NULL, *tmp = NULL;
>           PyObject *parent = NULL;
>
>           if (!PyArg_ParseTuple(args, "O", &parent)) {
>             return NULL;
>           }
>            if (parent!=NULL) {
>              attr = PyObject_GetAttrString(parent, "unit");
>              if (attr == NULL) {
>                 //parent has no 'unit' so we make a new empty one
>                attr = PyDict_New();
>                PyErr_Clear();
>              }
>            }
>           tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>             ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
>
>           Py_INCREF(Py_None);
>           return Py_None;
>         }
>
>         static PyObject*
>         unitArray__array_wrap__(PyObject *self, PyObject *args) {
>                 PyObject *array = NULL, *context = NULL;
>
>                 if (!PyArg_ParseTuple(args, "OO", array, context)) {
>                         //TODO: raise exception
>                         return NULL;
>                 }
>
>                 printf("%s",PyString_AsString(PyObject_Str(context)));
>
>           Py_INCREF(array);
>           return array;
>         }
>
>
>         static PyMethodDef unitArrayMethods[] = {
>           {"__array_finalize__", unitArray__array_finalize__,
>         METH_VARARGS, "array finalize method"},
>           {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS,
>         "array wrap method"},
>           {NULL, NULL, 0, NULL}
>         };
>
>         static PyMemberDef unitArrayMembers[] = {
>           {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0,
>         "dictionary containing unit info."},
>           {NULL, 0, 0, 0, NULL}
>         };
>
>         PyTypeObject unitArrayObjectType = {
>                 PyObject_HEAD_INIT(NULL)
>                 0,                              /* ob_size        */
>         "spam.UnitArray",               /* tp_name        */
>                 sizeof(UnitArrayObject),                /* tp_basicsize   */
>                 0,                              /* tp_itemsize    */
>                 0,                              /* tp_dealloc     */
>                 0,                              /* tp_print       */
>                 0,                              /* tp_getattr     */
>                 0,                              /* tp_setattr     */
>                 0,                              /* tp_compare     */
>                 0,                              /* tp_repr        */
>                 0,                              /* tp_as_number   */
>                 0,                              /* tp_as_sequence */
>                 0,                              /* tp_as_mapping  */
>                 0,                              /* tp_hash        */
>                 0,                              /* tp_call        */
>                 0,                              /* tp_str         */
>                 0,                              /* tp_getattro    */
>                 0,                              /* tp_setattro    */
>                 0,                              /* tp_as_buffer   */
>                 Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>         "A numpy array with units",     /* tp_doc         */
>           0,        /* traverseproc */
>           0,        /* tp_clear*/
>           0,        /* tp_richcompare */
>           0,        /* tp_weaklistoffset */
>           0,        /* tp_iter */
>           0,        /* tp_iternext */
>           unitArrayMethods,        /* tp_methods */
>           unitArrayMembers,        /* tp_members */
>           0,        /* tp_getset */
>           0,        /* tp_base*/
>           0,        /* tp_dict */
>           0,        /* tp_descr_get*/
>           0,        /* tp_descr_set */
>           0,        /* tp_dictoffset */
>           0,        /* tp_init */
>           0,        /* tp_alloc */
>           unitArray_new /* tp_new */
>         };
>
>
>         static PyMethodDef SpamMethods[] = {
>             {NULL, NULL, 0, NULL}        /* Sentinel */
>         };
>
>         PyObject *unitArray = NULL;
>         PyMODINIT_FUNC
>         initspampub(void)
>         {
>           import_array();
>
>           PyObject *m;
>
>           Py_INCREF(&PyArray_Type);
>           unitArrayObjectType.tp_base = &PyArray_Type;
>
>           if (PyType_Ready(&unitArrayObjectType) < 0)
>             return;
>
>
>           m = Py_InitModule3("spampub", SpamMethods, "some tests and a
>         array type with units.");
>           if (m == NULL)
>             return;
>
>           SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>           Py_INCREF(SpamError);
>           PyModule_AddObject(m, "error", SpamError);
>           Py_INCREF(&unitArrayObjectType);
>           PyModule_AddObject(m, "UnitArray", (PyObject
>         *)&unitArrayObjectType);
>           (void) Py_InitModule("spampub", SpamMethods);
>
>         }
>
>
>         -----BEGIN PGP SIGNATURE-----
>         Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>
>         iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
>         qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
>         =Qtek
>         -----END PGP SIGNATURE-----
>         _______________________________________________
>         NumPy-Discussion mailing list
>         NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From ben.root at ou.edu  Wed Mar  7 21:13:18 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 7 Mar 2012 20:13:18 -0600
Subject: [Numpy-discussion] use for missing (ignored) data?
In-Reply-To: <CAPJVwBmBTXdALVUt0qR3sqX7bgkhSoZ4vbssm0kqRxBc-+H5fg@mail.gmail.com>
References: <jj8f2s$74a$1@dough.gmane.org>
	<CAPJVwBmBTXdALVUt0qR3sqX7bgkhSoZ4vbssm0kqRxBc-+H5fg@mail.gmail.com>
Message-ID: <CANNq6F=cVxMOEyxLNTqoqArgFJ6ODe_KsGHvAPHhUK2k_Wx3_A@mail.gmail.com>

On Wednesday, March 7, 2012, Nathaniel Smith <njs at pobox.com> wrote:
> On Wed, Mar 7, 2012 at 8:05 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> I'm wondering what is the use for the ignored data feature?
>>
>> I can use:
>>
>> A[valid_A_indexes] = whatever
>>
>> to process only the 'non-ignored' portions of A.  So at least some
simple cases
>> of ignored data are already supported without introducing a new type.
>>
>> OTOH:
>>
>> w = A[valid_A_indexes]
>>
>> will copy A's data, and subsequent use of
>>
>> w[:] = something
>>
>> will not update A.
>>
>> Is this the reason for wanting the ignored data feature?
>
> Hi Neal,
>
> There are a few reasons that I know of why people want more support
> from numpy for ignored data/masks, specifically (as opposed to missing
> data or other related concepts):
>
> 1) If you're often working on some subset of your data, then it's
> convenient to set the mask once and have it stay in effect for further
> operations. Anything you can accomplish this way can also be
> accomplished by keeping an explicit mask array and using it for
> indexing "by hand", but in some situations it may be more convenient
> not to.
>
> 2) Operating on subsets of an array without making a copy. Like
> Benjamin pointed out, indexing with a mask makes a copy. This is slow,
> and what's worse, people who work with large data sets (e.g., big fMRI
> volumes) may not have enough memory to afford such a copy. This
> problem can be solved by using the new where= argument to ufuncs
> (which skips the copy). (But then see (1) -- passing where= to a bunch
> of functions takes more typing than just setting it once and leaving
> it.)
>
> 3) Suppose there's a 3rd-party function that takes an array --
> borrowing Charles example, say it's draw_points(arr). Now you want to
> apply it to just a subset of your data, and want to avoid a copy. It
> would be nice if the original author had made it draw_points(arr,
> mask), but they didn't. Well, if you have masking "built in" to your
> array type, then maybe you can call this as draw_points(masked_arr)
> and it will Just Work. I.e., maybe people who aren't thinking about
> masking will sometimes write code that accidentally works with masking
> anyway. I'm not sure how much I'd trust this, but I guess it's nice
> when it happens. And if it does work, then implementing the show/hide
> point functionality will be easier. (And if it doesn't work, and
> masking is built into numpy.ndarray, then maybe you can use this to
> argue with the original author that this is a bug, not just a missing
> feature. Again, I'm not sure if this is a good thing on net: one could
> argue that people shouldn't be forced to think about masking every
> time they write any function, just in case it becomes relevant later.
> But certainly it'd be useful sometimes.)
>
> There may be other motivations that I'm not aware of, of course.
>
> -- Nathaniel
>

I think you got most of the motivations right. I would say on the last
point that extension authors should be able to say "does not support NA!".
 The important thing is that it makes it more up-front.

An additional motivation is with regards to mathematical operations.
 Personally, I hate getting bitten by a function that takes a max(), and I
have a NaN in the array.  In addition, what about adding two arrays
together that may or may not have different masks?  This has been the major
advantage of no.ma.  All of Mostly Works.

Cheers!
Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120307/a798bec7/attachment.html>

From christoph.gohle at mpq.mpg.de  Thu Mar  8 02:49:18 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Thu, 8 Mar 2012 08:49:18 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
Message-ID: <80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>

Dear Val,

I agree that more detail is needed. Sorry for that it was late yesterday.

 I am running Python 2.6.1, numpy development branch (numpy-2.0.0.dev_20101104-py2.6-macosx-10.6-universal.egg). maybe I should switch to release?

I compile with your setup.py using 'python setup.py build_ext -i' and run the commands below in ipython. As you can see, I don't get a crash for a single call to the constructor, consistent with your observation. And it looks like, one has to use dict in the unit to make it crash.

Can you make sense out of that?


In [1]: import spampub

In [4]: spampub.UnitArray(1,{'s':1})
Out[4]: UnitArray(1)

In [6]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(1000)]
Segmentation fault

backtrace is the following:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000021
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   org.python.python             	0x00000001000b0b8e PyObject_GC_Track + 1473
1   org.python.python             	0x00000001000b1255 _PyObject_GC_Malloc + 191
2   org.python.python             	0x00000001000b12d0 _PyObject_GC_New + 21
3   org.python.python             	0x000000010003a856 PyDict_New + 116
4   org.python.python             	0x000000010003a99c _PyDict_NewPresized + 24
5   org.python.python             	0x00000001000880cb PyEval_EvalFrameEx + 11033
6   org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
7   org.python.python             	0x000000010008735d PyEval_EvalFrameEx + 7595
8   org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
9   org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
10  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
11  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
12  org.python.python             	0x00000001000892e1 PyEval_EvalFrameEx + 15663
13  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
14  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
15  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
16  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
17  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
18  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
19  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
20  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
21  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
22  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
23  org.python.python             	0x00000001000a265a Py_CompileString + 78
24  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
25  org.python.python             	0x00000001000a423d PyRun_SimpleFileExFlags + 704
26  org.python.python             	0x00000001000b0286 Py_Main + 2718
27  org.python.python.app         	0x0000000100000e6c start + 52

Am 08.03.2012 um 02:36 schrieb Val Kalatsky:

> 
> Tried it on my Ubuntu 10.10 box, no problem:
> 
> 1) Saved as spampub.c
> 2) Compiled with (setup.py attached): python setup.py build_ext -i
> 3) Tested from ipython:
> In [1]: import spampub
> In [2]: ua=spampub.UnitArray([0,1,2,3.0],'liter')
> In [3]: ua
> Out[3]: UnitArray([ 0.,  1.,  2.,  3.])
> In [4]: ua.unit
> Out[4]: 'liter'
> 
> 
> On Wed, Mar 7, 2012 at 7:15 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
> 
> Seeing the backtrace would be helpful. 
> Can you do whatever leads to the segfault 
> from python run from gdb?
> Val
> 
> 
> On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle <christoph.gohle at mpq.mpg.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi,
> 
> I have been struggeling for quite some time now. Desperate as I am, now I need help.
> 
> I was trying to subclass ndarrays in a c extension (see code below) and do constantly get segfaults. I have been checking my INCREF and DECREF stuff up and down but can't find the error. Probably I got something completely wrong... anybody able to help?
> 
> Thanks,
> Christoph
> - -----------------
> #include <Python.h>
> #include <structmember.h>
> 
> #include <numpy/arrayobject.h>
> 
> static PyObject *SpamError;
> 
> typedef struct {
>  PyArrayObject base;
>  PyDictObject* unit;
> } UnitArrayObject;
> 
> PyTypeObject unitArrayObjectType;
> 
> static int
> checkunit(PyObject *unit1, PyObject *unit2) {
>  return PyObject_Compare(unit1, unit2);
> }
> 
> static PyObject *
> unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>        PyObject *data = NULL,
>                         *unit = NULL;
>  PyArray_Descr* dtype = NULL;
>  PyObject *res = NULL, *tmp = NULL;
> 
>        if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit, PyArray_DescrConverter, &dtype)) {
>                Py_XDECREF(dtype);
>                return NULL;
>        }
> 
>  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>        if (res == NULL) {
>                Py_XDECREF(dtype);
>    //TODO: raise exception?
>                return NULL;
>        }
> 
>        if (PyObject_IsInstance(data, (PyObject*)cls)) {
>                if (unit!=NULL && !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>                        Py_XDECREF(res);
>                        //TODO: raise exception
>                        return NULL;
>                }
>        } else {
>                if (PyObject_IsTrue(unit)) {
>      tmp = res;
>                        res = PyArray_View((PyArrayObject*)res, NULL, &unitArrayObjectType);
>      if (tmp!=res) {
>        Py_XDECREF(tmp);
>      }
>      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>      Py_INCREF(unit);
>      if (unit!=NULL) {
>      }
>                }
>        }
>        return res;
> }
> 
> static PyObject*
> unitArray__array_finalize__(PyObject* new, PyObject* args) {
>        PyObject *attr = NULL, *tmp = NULL;
>  PyObject *parent = NULL;
> 
>  if (!PyArg_ParseTuple(args, "O", &parent)) {
>    return NULL;
>  }
>   if (parent!=NULL) {
>     attr = PyObject_GetAttrString(parent, "unit");
>     if (attr == NULL) {
>        //parent has no 'unit' so we make a new empty one
>       attr = PyDict_New();
>       PyErr_Clear();
>     }
>   }
>  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
> 
>  Py_INCREF(Py_None);
>  return Py_None;
> }
> 
> static PyObject*
> unitArray__array_wrap__(PyObject *self, PyObject *args) {
>        PyObject *array = NULL, *context = NULL;
> 
>        if (!PyArg_ParseTuple(args, "OO", array, context)) {
>                //TODO: raise exception
>                return NULL;
>        }
> 
>        printf("%s",PyString_AsString(PyObject_Str(context)));
> 
>  Py_INCREF(array);
>  return array;
> }
> 
> 
> static PyMethodDef unitArrayMethods[] = {
>  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS, "array finalize method"},
>  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap method"},
>  {NULL, NULL, 0, NULL}
> };
> 
> static PyMemberDef unitArrayMembers[] = {
>  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary containing unit info."},
>  {NULL, 0, 0, 0, NULL}
> };
> 
> PyTypeObject unitArrayObjectType = {
>        PyObject_HEAD_INIT(NULL)
>        0,                              /* ob_size        */
>        "spam.UnitArray",               /* tp_name        */
>        sizeof(UnitArrayObject),                /* tp_basicsize   */
>        0,                              /* tp_itemsize    */
>        0,                              /* tp_dealloc     */
>        0,                              /* tp_print       */
>        0,                              /* tp_getattr     */
>        0,                              /* tp_setattr     */
>        0,                              /* tp_compare     */
>        0,                              /* tp_repr        */
>        0,                              /* tp_as_number   */
>        0,                              /* tp_as_sequence */
>        0,                              /* tp_as_mapping  */
>        0,                              /* tp_hash        */
>        0,                              /* tp_call        */
>        0,                              /* tp_str         */
>        0,                              /* tp_getattro    */
>        0,                              /* tp_setattro    */
>        0,                              /* tp_as_buffer   */
>        Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>        "A numpy array with units",     /* tp_doc         */
>  0,        /* traverseproc */
>  0,        /* tp_clear*/
>  0,        /* tp_richcompare */
>  0,        /* tp_weaklistoffset */
>  0,        /* tp_iter */
>  0,        /* tp_iternext */
>  unitArrayMethods,        /* tp_methods */
>  unitArrayMembers,        /* tp_members */
>  0,        /* tp_getset */
>  0,        /* tp_base*/
>  0,        /* tp_dict */
>  0,        /* tp_descr_get*/
>  0,        /* tp_descr_set */
>  0,        /* tp_dictoffset */
>  0,        /* tp_init */
>  0,        /* tp_alloc */
>  unitArray_new /* tp_new */
> };
> 
> 
> static PyMethodDef SpamMethods[] = {
>    {NULL, NULL, 0, NULL}        /* Sentinel */
> };
> 
> PyObject *unitArray = NULL;
> PyMODINIT_FUNC
> initspampub(void)
> {
>  import_array();
> 
>  PyObject *m;
> 
>  Py_INCREF(&PyArray_Type);
>  unitArrayObjectType.tp_base = &PyArray_Type;
> 
>  if (PyType_Ready(&unitArrayObjectType) < 0)
>    return;
> 
> 
>  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type with units.");
>  if (m == NULL)
>    return;
> 
>  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>  Py_INCREF(SpamError);
>  PyModule_AddObject(m, "error", SpamError);
>  Py_INCREF(&unitArrayObjectType);
>  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
>  (void) Py_InitModule("spampub", SpamMethods);
> 
> }
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
> 
> iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
> qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
> =Qtek
> -----END PGP SIGNATURE-----
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> <setup.py>_______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


Christoph Gohle
--
Max-Planck-Institut f?r Quantenoptik
Abteilung Quantenvielteilchensysteme
Hans-Kopfermann-Strasse 1
85748 Garching

christoph.gohle at mpq.mpg.de
tel: +49 89 32905 283
fax: +49 89 32905 313


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120308/4a95ce0e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 203 bytes
Desc: Signierter Teil der Nachricht
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120308/4a95ce0e/attachment.sig>

From kalatsky at gmail.com  Thu Mar  8 03:08:19 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Thu, 8 Mar 2012 02:08:19 -0600
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
Message-ID: <CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>

Hi Christoph,

I've just tried
a=[spampub.UnitArray(i,{'s':i}) for i in xrange(1000)]
and everything looks fine on my side.
Probably my test environment is too different to give comparable results:

In [3]: call(["uname", "-a"])
Linux ubuntu 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:32:27 UTC 2010
x86_64 GNU/Linux
In [4]: np.__version__
Out[4]: '1.5.1'

I am afraid I will not be able to help testing it on super-nova versions of
numpy.
Cheers
Val


On Thu, Mar 8, 2012 at 1:49 AM, Christoph Gohle
<christoph.gohle at mpq.mpg.de>wrote:

> Dear Val,
>
> I agree that more detail is needed. Sorry for that it was late yesterday.
>
>  I am running Python 2.6.1, numpy development branch
> (numpy-2.0.0.dev_20101104-py2.6-macosx-10.6-universal.egg). maybe I should
> switch to release?
>
> I compile with your setup.py using 'python setup.py build_ext -i' and run
> the commands below in ipython. As you can see, I don't get a crash for a
> single call to the constructor, consistent with your observation. And it
> looks like, one has to use dict in the unit to make it crash.
>
> Can you make sense out of that?
>
>
> In [1]: import spampub
>
> In [4]: spampub.UnitArray(1,{'s':1})
> Out[4]: UnitArray(1)
>
> In [6]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(1000)]
> Segmentation fault
>
> backtrace is the following:
>
> Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000021
> Crashed Thread:  0  Dispatch queue: com.apple.main-thread
>
> Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
> 0   org.python.python             0x00000001000b0b8e PyObject_GC_Track +
> 1473
> 1   org.python.python             0x00000001000b1255 _PyObject_GC_Malloc
> + 191
> 2   org.python.python             0x00000001000b12d0 _PyObject_GC_New + 21
> 3   org.python.python             0x000000010003a856 PyDict_New + 116
> 4   org.python.python             0x000000010003a99c _PyDict_NewPresized
> + 24
> 5   org.python.python             0x00000001000880cb PyEval_EvalFrameEx +
> 11033
> 6   org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 7   org.python.python             0x000000010008735d PyEval_EvalFrameEx +
> 7595
> 8   org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 9   org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 10  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 11  org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 12  org.python.python             0x00000001000892e1 PyEval_EvalFrameEx +
> 15663
> 13  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 14  org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 15  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 16  org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 17  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 18  org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 19  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 20  org.python.python             0x000000010008935e PyEval_EvalFrameEx +
> 15788
> 21  org.python.python             0x000000010008acce PyEval_EvalCodeEx +
> 1803
> 22  org.python.python             0x000000010008ad61 PyEval_EvalCode + 54
> 23  org.python.python             0x00000001000a265a Py_CompileString + 78
> 24  org.python.python             0x00000001000a2723 PyRun_FileExFlags +
> 150
> 25  org.python.python             0x00000001000a423d
> PyRun_SimpleFileExFlags + 704
> 26  org.python.python             0x00000001000b0286 Py_Main + 2718
> 27  org.python.python.app         0x0000000100000e6c start + 52
>
> Am 08.03.2012 um 02:36 schrieb Val Kalatsky:
>
>
> Tried it on my Ubuntu 10.10 box, no problem:
>
> 1) Saved as spampub.c
> 2) Compiled with (setup.py attached): python setup.py build_ext -i
> 3) Tested from ipython:
> In [1]: import spampub
> In [2]: ua=spampub.UnitArray([0,1,2,3.0],'liter')
> In [3]: ua
> Out[3]: UnitArray([ 0.,  1.,  2.,  3.])
> In [4]: ua.unit
> Out[4]: 'liter'
>
>
> On Wed, Mar 7, 2012 at 7:15 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
>
>>
>> Seeing the backtrace would be helpful.
>> Can you do whatever leads to the segfault
>> from python run from gdb?
>> Val
>>
>>
>> On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle <
>> christoph.gohle at mpq.mpg.de> wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> Hi,
>>>
>>> I have been struggeling for quite some time now. Desperate as I am, now
>>> I need help.
>>>
>>> I was trying to subclass ndarrays in a c extension (see code below) and
>>> do constantly get segfaults. I have been checking my INCREF and DECREF
>>> stuff up and down but can't find the error. Probably I got something
>>> completely wrong... anybody able to help?
>>>
>>> Thanks,
>>> Christoph
>>> - -----------------
>>> #include <Python.h>
>>> #include <structmember.h>
>>>
>>> #include <numpy/arrayobject.h>
>>>
>>> static PyObject *SpamError;
>>>
>>> typedef struct {
>>>  PyArrayObject base;
>>>  PyDictObject* unit;
>>> } UnitArrayObject;
>>>
>>> PyTypeObject unitArrayObjectType;
>>>
>>> static int
>>> checkunit(PyObject *unit1, PyObject *unit2) {
>>>  return PyObject_Compare(unit1, unit2);
>>> }
>>>
>>> static PyObject *
>>> unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>>>        PyObject *data = NULL,
>>>                         *unit = NULL;
>>>  PyArray_Descr* dtype = NULL;
>>>  PyObject *res = NULL, *tmp = NULL;
>>>
>>>        if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit,
>>> PyArray_DescrConverter, &dtype)) {
>>>                Py_XDECREF(dtype);
>>>                return NULL;
>>>        }
>>>
>>>  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>>>        if (res == NULL) {
>>>                Py_XDECREF(dtype);
>>>    //TODO: raise exception?
>>>                return NULL;
>>>        }
>>>
>>>        if (PyObject_IsInstance(data, (PyObject*)cls)) {
>>>                if (unit!=NULL &&
>>> !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>>>                        Py_XDECREF(res);
>>>                        //TODO: raise exception
>>>                        return NULL;
>>>                }
>>>        } else {
>>>                if (PyObject_IsTrue(unit)) {
>>>      tmp = res;
>>>                        res = PyArray_View((PyArrayObject*)res, NULL,
>>> &unitArrayObjectType);
>>>      if (tmp!=res) {
>>>        Py_XDECREF(tmp);
>>>      }
>>>      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>>>      Py_INCREF(unit);
>>>      if (unit!=NULL) {
>>>      }
>>>                }
>>>        }
>>>        return res;
>>> }
>>>
>>> static PyObject*
>>> unitArray__array_finalize__(PyObject* new, PyObject* args) {
>>>        PyObject *attr = NULL, *tmp = NULL;
>>>  PyObject *parent = NULL;
>>>
>>>  if (!PyArg_ParseTuple(args, "O", &parent)) {
>>>    return NULL;
>>>  }
>>>   if (parent!=NULL) {
>>>     attr = PyObject_GetAttrString(parent, "unit");
>>>     if (attr == NULL) {
>>>        //parent has no 'unit' so we make a new empty one
>>>       attr = PyDict_New();
>>>       PyErr_Clear();
>>>     }
>>>   }
>>>  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>>>    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
>>>
>>>  Py_INCREF(Py_None);
>>>  return Py_None;
>>> }
>>>
>>> static PyObject*
>>> unitArray__array_wrap__(PyObject *self, PyObject *args) {
>>>        PyObject *array = NULL, *context = NULL;
>>>
>>>        if (!PyArg_ParseTuple(args, "OO", array, context)) {
>>>                //TODO: raise exception
>>>                return NULL;
>>>        }
>>>
>>>        printf("%s",PyString_AsString(PyObject_Str(context)));
>>>
>>>  Py_INCREF(array);
>>>  return array;
>>> }
>>>
>>>
>>> static PyMethodDef unitArrayMethods[] = {
>>>  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS,
>>> "array finalize method"},
>>>  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap
>>> method"},
>>>  {NULL, NULL, 0, NULL}
>>> };
>>>
>>> static PyMemberDef unitArrayMembers[] = {
>>>  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary
>>> containing unit info."},
>>>  {NULL, 0, 0, 0, NULL}
>>> };
>>>
>>> PyTypeObject unitArrayObjectType = {
>>>        PyObject_HEAD_INIT(NULL)
>>>        0,                              /* ob_size        */
>>>        "spam.UnitArray",               /* tp_name        */
>>>        sizeof(UnitArrayObject),                /* tp_basicsize   */
>>>        0,                              /* tp_itemsize    */
>>>        0,                              /* tp_dealloc     */
>>>        0,                              /* tp_print       */
>>>        0,                              /* tp_getattr     */
>>>        0,                              /* tp_setattr     */
>>>        0,                              /* tp_compare     */
>>>        0,                              /* tp_repr        */
>>>        0,                              /* tp_as_number   */
>>>        0,                              /* tp_as_sequence */
>>>        0,                              /* tp_as_mapping  */
>>>        0,                              /* tp_hash        */
>>>        0,                              /* tp_call        */
>>>        0,                              /* tp_str         */
>>>        0,                              /* tp_getattro    */
>>>        0,                              /* tp_setattro    */
>>>        0,                              /* tp_as_buffer   */
>>>        Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>>>        "A numpy array with units",     /* tp_doc         */
>>>  0,        /* traverseproc */
>>>  0,        /* tp_clear*/
>>>  0,        /* tp_richcompare */
>>>  0,        /* tp_weaklistoffset */
>>>  0,        /* tp_iter */
>>>  0,        /* tp_iternext */
>>>  unitArrayMethods,        /* tp_methods */
>>>  unitArrayMembers,        /* tp_members */
>>>  0,        /* tp_getset */
>>>  0,        /* tp_base*/
>>>  0,        /* tp_dict */
>>>  0,        /* tp_descr_get*/
>>>  0,        /* tp_descr_set */
>>>  0,        /* tp_dictoffset */
>>>  0,        /* tp_init */
>>>  0,        /* tp_alloc */
>>>  unitArray_new /* tp_new */
>>> };
>>>
>>>
>>> static PyMethodDef SpamMethods[] = {
>>>    {NULL, NULL, 0, NULL}        /* Sentinel */
>>> };
>>>
>>> PyObject *unitArray = NULL;
>>> PyMODINIT_FUNC
>>> initspampub(void)
>>> {
>>>  import_array();
>>>
>>>  PyObject *m;
>>>
>>>  Py_INCREF(&PyArray_Type);
>>>  unitArrayObjectType.tp_base = &PyArray_Type;
>>>
>>>  if (PyType_Ready(&unitArrayObjectType) < 0)
>>>    return;
>>>
>>>
>>>  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type
>>> with units.");
>>>  if (m == NULL)
>>>    return;
>>>
>>>  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>>>  Py_INCREF(SpamError);
>>>  PyModule_AddObject(m, "error", SpamError);
>>>  Py_INCREF(&unitArrayObjectType);
>>>  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
>>>  (void) Py_InitModule("spampub", SpamMethods);
>>>
>>> }
>>>
>>>
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>>>
>>> iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
>>> qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
>>> =Qtek
>>> -----END PGP SIGNATURE-----
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>
>>
> <setup.py>_______________________________________________
>
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> Christoph Gohle
> --
> Max-Planck-Institut f?r Quantenoptik
> Abteilung Quantenvielteilchensysteme
> Hans-Kopfermann-Strasse 1
> 85748 Garching
>
> christoph.gohle at mpq.mpg.de
> tel: +49 89 32905 283
> fax: +49 89 32905 313
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120308/e935714a/attachment.html>

From christoph.gohle at mpq.mpg.de  Thu Mar  8 11:37:50 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Thu, 8 Mar 2012 17:37:50 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
Message-ID: <345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>

Dear Val,

thanks for testing. I have now tried on different platforms. I get all kinds of crashes on os x (now with numpy 1.6.1) and windows with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory leak...

Any hints would be welcome.

Thanks,
Christoph
Am 08.03.2012 um 09:08 schrieb Val Kalatsky:

> Hi Christoph,
> 
> I've just tried 
> a=[spampub.UnitArray(i,{'s':i}) for i in xrange(1000)] 
> and everything looks fine on my side.
> Probably my test environment is too different to give comparable results:
> 
> In [3]: call(["uname", "-a"])
> Linux ubuntu 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:32:27 UTC 2010 x86_64 GNU/Linux
> In [4]: np.__version__
> Out[4]: '1.5.1'
> 
> I am afraid I will not be able to help testing it on super-nova versions of numpy.
> Cheers
> Val
> 
> 
> On Thu, Mar 8, 2012 at 1:49 AM, Christoph Gohle <christoph.gohle at mpq.mpg.de> wrote:
> Dear Val,
> 
> I agree that more detail is needed. Sorry for that it was late yesterday.
> 
>  I am running Python 2.6.1, numpy development branch (numpy-2.0.0.dev_20101104-py2.6-macosx-10.6-universal.egg). maybe I should switch to release?
> 
> I compile with your setup.py using 'python setup.py build_ext -i' and run the commands below in ipython. As you can see, I don't get a crash for a single call to the constructor, consistent with your observation. And it looks like, one has to use dict in the unit to make it crash.
> 
> Can you make sense out of that?
> 
> 
> In [1]: import spampub
> 
> In [4]: spampub.UnitArray(1,{'s':1})
> Out[4]: UnitArray(1)
> 
> In [6]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(1000)]
> Segmentation fault
> 
> backtrace is the following:
> 
> Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000021
> Crashed Thread:  0  Dispatch queue: com.apple.main-thread
> 
> Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
> 0   org.python.python             	0x00000001000b0b8e PyObject_GC_Track + 1473
> 1   org.python.python             	0x00000001000b1255 _PyObject_GC_Malloc + 191
> 2   org.python.python             	0x00000001000b12d0 _PyObject_GC_New + 21
> 3   org.python.python             	0x000000010003a856 PyDict_New + 116
> 4   org.python.python             	0x000000010003a99c _PyDict_NewPresized + 24
> 5   org.python.python             	0x00000001000880cb PyEval_EvalFrameEx + 11033
> 6   org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 7   org.python.python             	0x000000010008735d PyEval_EvalFrameEx + 7595
> 8   org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 9   org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 10  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 11  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 12  org.python.python             	0x00000001000892e1 PyEval_EvalFrameEx + 15663
> 13  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 14  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 15  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 16  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 17  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 18  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 19  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 20  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
> 21  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
> 22  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
> 23  org.python.python             	0x00000001000a265a Py_CompileString + 78
> 24  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
> 25  org.python.python             	0x00000001000a423d PyRun_SimpleFileExFlags + 704
> 26  org.python.python             	0x00000001000b0286 Py_Main + 2718
> 27  org.python.python.app         	0x0000000100000e6c start + 52
> 
> Am 08.03.2012 um 02:36 schrieb Val Kalatsky:
> 
>> 
>> Tried it on my Ubuntu 10.10 box, no problem:
>> 
>> 1) Saved as spampub.c
>> 2) Compiled with (setup.py attached): python setup.py build_ext -i
>> 3) Tested from ipython:
>> In [1]: import spampub
>> In [2]: ua=spampub.UnitArray([0,1,2,3.0],'liter')
>> In [3]: ua
>> Out[3]: UnitArray([ 0.,  1.,  2.,  3.])
>> In [4]: ua.unit
>> Out[4]: 'liter'
>> 
>> 
>> On Wed, Mar 7, 2012 at 7:15 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
>> 
>> Seeing the backtrace would be helpful. 
>> Can you do whatever leads to the segfault 
>> from python run from gdb?
>> Val
>> 
>> 
>> On Wed, Mar 7, 2012 at 7:04 PM, Christoph Gohle <christoph.gohle at mpq.mpg.de> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>> 
>> Hi,
>> 
>> I have been struggeling for quite some time now. Desperate as I am, now I need help.
>> 
>> I was trying to subclass ndarrays in a c extension (see code below) and do constantly get segfaults. I have been checking my INCREF and DECREF stuff up and down but can't find the error. Probably I got something completely wrong... anybody able to help?
>> 
>> Thanks,
>> Christoph
>> - -----------------
>> #include <Python.h>
>> #include <structmember.h>
>> 
>> #include <numpy/arrayobject.h>
>> 
>> static PyObject *SpamError;
>> 
>> typedef struct {
>>  PyArrayObject base;
>>  PyDictObject* unit;
>> } UnitArrayObject;
>> 
>> PyTypeObject unitArrayObjectType;
>> 
>> static int
>> checkunit(PyObject *unit1, PyObject *unit2) {
>>  return PyObject_Compare(unit1, unit2);
>> }
>> 
>> static PyObject *
>> unitArray_new(PyTypeObject *cls, PyObject *args, PyObject *kwargs) {
>>        PyObject *data = NULL,
>>                         *unit = NULL;
>>  PyArray_Descr* dtype = NULL;
>>  PyObject *res = NULL, *tmp = NULL;
>> 
>>        if (!PyArg_ParseTuple(args, "OO|O&", &data, &unit, PyArray_DescrConverter, &dtype)) {
>>                Py_XDECREF(dtype);
>>                return NULL;
>>        }
>> 
>>  res = PyArray_FromAny(data, dtype, 0, 0, NPY_ENSURECOPY, NULL);
>>        if (res == NULL) {
>>                Py_XDECREF(dtype);
>>    //TODO: raise exception?
>>                return NULL;
>>        }
>> 
>>        if (PyObject_IsInstance(data, (PyObject*)cls)) {
>>                if (unit!=NULL && !checkunit((PyObject*)((UnitArrayObject*)data)->unit,unit)) {
>>                        Py_XDECREF(res);
>>                        //TODO: raise exception
>>                        return NULL;
>>                }
>>        } else {
>>                if (PyObject_IsTrue(unit)) {
>>      tmp = res;
>>                        res = PyArray_View((PyArrayObject*)res, NULL, &unitArrayObjectType);
>>      if (tmp!=res) {
>>        Py_XDECREF(tmp);
>>      }
>>      ((UnitArrayObject*)res)->unit = (PyDictObject*)unit;
>>      Py_INCREF(unit);
>>      if (unit!=NULL) {
>>      }
>>                }
>>        }
>>        return res;
>> }
>> 
>> static PyObject*
>> unitArray__array_finalize__(PyObject* new, PyObject* args) {
>>        PyObject *attr = NULL, *tmp = NULL;
>>  PyObject *parent = NULL;
>> 
>>  if (!PyArg_ParseTuple(args, "O", &parent)) {
>>    return NULL;
>>  }
>>   if (parent!=NULL) {
>>     attr = PyObject_GetAttrString(parent, "unit");
>>     if (attr == NULL) {
>>        //parent has no 'unit' so we make a new empty one
>>       attr = PyDict_New();
>>       PyErr_Clear();
>>     }
>>   }
>>  tmp = (PyObject*)((UnitArrayObject*)new)->unit;
>>    ((UnitArrayObject*)new)->unit = (PyDictObject*)attr;
>> 
>>  Py_INCREF(Py_None);
>>  return Py_None;
>> }
>> 
>> static PyObject*
>> unitArray__array_wrap__(PyObject *self, PyObject *args) {
>>        PyObject *array = NULL, *context = NULL;
>> 
>>        if (!PyArg_ParseTuple(args, "OO", array, context)) {
>>                //TODO: raise exception
>>                return NULL;
>>        }
>> 
>>        printf("%s",PyString_AsString(PyObject_Str(context)));
>> 
>>  Py_INCREF(array);
>>  return array;
>> }
>> 
>> 
>> static PyMethodDef unitArrayMethods[] = {
>>  {"__array_finalize__", unitArray__array_finalize__, METH_VARARGS, "array finalize method"},
>>  {"__array_wrap__", unitArray__array_wrap__, METH_VARARGS, "array wrap method"},
>>  {NULL, NULL, 0, NULL}
>> };
>> 
>> static PyMemberDef unitArrayMembers[] = {
>>  {"unit", T_OBJECT, offsetof(UnitArrayObject, unit),  0, "dictionary containing unit info."},
>>  {NULL, 0, 0, 0, NULL}
>> };
>> 
>> PyTypeObject unitArrayObjectType = {
>>        PyObject_HEAD_INIT(NULL)
>>        0,                              /* ob_size        */
>>        "spam.UnitArray",               /* tp_name        */
>>        sizeof(UnitArrayObject),                /* tp_basicsize   */
>>        0,                              /* tp_itemsize    */
>>        0,                              /* tp_dealloc     */
>>        0,                              /* tp_print       */
>>        0,                              /* tp_getattr     */
>>        0,                              /* tp_setattr     */
>>        0,                              /* tp_compare     */
>>        0,                              /* tp_repr        */
>>        0,                              /* tp_as_number   */
>>        0,                              /* tp_as_sequence */
>>        0,                              /* tp_as_mapping  */
>>        0,                              /* tp_hash        */
>>        0,                              /* tp_call        */
>>        0,                              /* tp_str         */
>>        0,                              /* tp_getattro    */
>>        0,                              /* tp_setattro    */
>>        0,                              /* tp_as_buffer   */
>>        Py_TPFLAGS_DEFAULT,             /* tp_flags       */
>>        "A numpy array with units",     /* tp_doc         */
>>  0,        /* traverseproc */
>>  0,        /* tp_clear*/
>>  0,        /* tp_richcompare */
>>  0,        /* tp_weaklistoffset */
>>  0,        /* tp_iter */
>>  0,        /* tp_iternext */
>>  unitArrayMethods,        /* tp_methods */
>>  unitArrayMembers,        /* tp_members */
>>  0,        /* tp_getset */
>>  0,        /* tp_base*/
>>  0,        /* tp_dict */
>>  0,        /* tp_descr_get*/
>>  0,        /* tp_descr_set */
>>  0,        /* tp_dictoffset */
>>  0,        /* tp_init */
>>  0,        /* tp_alloc */
>>  unitArray_new /* tp_new */
>> };
>> 
>> 
>> static PyMethodDef SpamMethods[] = {
>>    {NULL, NULL, 0, NULL}        /* Sentinel */
>> };
>> 
>> PyObject *unitArray = NULL;
>> PyMODINIT_FUNC
>> initspampub(void)
>> {
>>  import_array();
>> 
>>  PyObject *m;
>> 
>>  Py_INCREF(&PyArray_Type);
>>  unitArrayObjectType.tp_base = &PyArray_Type;
>> 
>>  if (PyType_Ready(&unitArrayObjectType) < 0)
>>    return;
>> 
>> 
>>  m = Py_InitModule3("spampub", SpamMethods, "some tests and a array type with units.");
>>  if (m == NULL)
>>    return;
>> 
>>  SpamError = PyErr_NewException("spampub.error", NULL, NULL);
>>  Py_INCREF(SpamError);
>>  PyModule_AddObject(m, "error", SpamError);
>>  Py_INCREF(&unitArrayObjectType);
>>  PyModule_AddObject(m, "UnitArray", (PyObject *)&unitArrayObjectType);
>>  (void) Py_InitModule("spampub", SpamMethods);
>> 
>> }
>> 
>> 
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>> 
>> iEYEARECAAYFAk9YBbkACgkQLYu25rCEIzsExQCggHg0e0ibYP3bEsAE0Ce8Rbm3
>> qOcAoKwuTuE4BDZgrqNpwISgMS6uSMnO
>> =Qtek
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> 
>> 
>> <setup.py>_______________________________________________
>> 
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> Christoph Gohle
> --
> Max-Planck-Institut f?r Quantenoptik
> Abteilung Quantenvielteilchensysteme
> Hans-Kopfermann-Strasse 1
> 85748 Garching
> 
> christoph.gohle at mpq.mpg.de
> tel: +49 89 32905 283
> fax: +49 89 32905 313
> 
> 
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


Christoph Gohle
--
Max-Planck-Institut f?r Quantenoptik
Abteilung Quantenvielteilchensysteme
Hans-Kopfermann-Strasse 1
85748 Garching

christoph.gohle at mpq.mpg.de
tel: +49 89 32905 283
fax: +49 89 32905 313


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120308/ff143589/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 203 bytes
Desc: Signierter Teil der Nachricht
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120308/ff143589/attachment.sig>

From pav at iki.fi  Thu Mar  8 14:39:23 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 08 Mar 2012 20:39:23 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
Message-ID: <jjb1tb$ru8$1@dough.gmane.org>

08.03.2012 17:37, Christoph Gohle kirjoitti:
> thanks for testing. I have now tried on different platforms. I get
> all kinds of crashes on os x (now with numpy 1.6.1) and windows
> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
> leak...
> 
> Any hints would be welcome.

The type object inherits `tp_alloc` from Numpy. This routine always
allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
PyArrayObject. Therefore, the write to new->unit in your
__array_finalize__ goes to unallocated memory.

This is probably a bug in Numpy --- arrayobject.c:array_alloc should
respect the size specified by the subtype.

A workaround is probably to specify a suitable tp_alloc routine yourself:

   PyType_GenericAlloc,        /* tp_alloc */
    unitArray_new,              /* tp_new */
    _PyObject_Del               /* tp_free */

-- 
Pauli Virtanen


From matthew.brett at gmail.com  Thu Mar  8 18:14:09 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 8 Mar 2012 15:14:09 -0800
Subject: [Numpy-discussion] Casting rules changed in trunk?
In-Reply-To: <CAH6Pt5qJPn0f+T4b=ZbNZcq=FgpAoOG-XvAZH7=d5nZco-wKeg@mail.gmail.com>
References: <CAH6Pt5qJPn0f+T4b=ZbNZcq=FgpAoOG-XvAZH7=d5nZco-wKeg@mail.gmail.com>
Message-ID: <CAH6Pt5rq794KK3uTwkD+oxKUP5x=6Z5zgosfTior0O7oAozMPg@mail.gmail.com>

Hi,

On Wed, Mar 7, 2012 at 4:08 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> I noticed a casting change running the test suite on our image reader,
> nibabel: https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py
>
> For this script:
>
> <pre>
> import numpy as np
>
> Adata = np.zeros((2,), dtype=np.uint8)
> Bdata = np.zeros((2,), dtype=np.int16)
> Bzero = np.int16(0)
> Bbig = np.int16(256)
>
> print np.__version__
> print 'Array add', (Adata + Bdata).dtype
> print 'Scalar 0 add', (Adata + Bzero).dtype
> print 'Scalar 256 add', (Adata + Bbig).dtype
> </pre>
>
> 1.4.1
> Array add int16
> Scalar 0 add uint8
> Scalar 256 add uint8
>
> 1.5.1
> Array add int16
> Scalar 0 add uint8
> Scalar 256 add uint8
>
> 1.6.1
> Array add int16
> Scalar 0 add uint8
> Scalar 256 add int16
>
> 1.7.0.dev-aae5b0a
> Array add int16
> Scalar 0 add uint8
> Scalar 256 add uint16
>
> I can understand the uint8 outputs from numpy < 1.6 - the rule being
> not to upcast for scalars.
>
> I can understand the int16 output from 1.6.1 on the basis that the
> value is outside uint8 range and therefore we might prefer a type that
> can handle values from both uint8 and int16.
>
> Was the current change intended? ?It has the following odd effect:
>
> In [5]: Adata + np.int16(257)
> Out[5]: array([257, 257], dtype=uint16)
>
> In [7]: Adata + np.int16(-257)
> Out[7]: array([-257, -257], dtype=int16)
>
> In [8]: Adata - np.int16(257)
> Out[8]: array([65279, 65279], dtype=uint16)
>
> but I guess you can argue that there are odd effects for other choices too,

In case it wasn't clear, this, in numpy 1.6.1:

In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
Out[2]: dtype('int16')

changed to this in current trunk:

In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
Out[2]: dtype('uint16')

which is different still in previous versions of numpy (e.g. 1.4.1):

In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
Out[2]: dtype('uint8')

My impression had been that the plan was to avoid changes in the
casting rules if possible.

Was this change in trunk intentional?  If not, I am happy to bisect,

Best,

Matthew


From matthew.brett at gmail.com  Thu Mar  8 18:27:30 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 8 Mar 2012 15:27:30 -0800
Subject: [Numpy-discussion] More SPARC pain
In-Reply-To: <CAH6Pt5qzienQuOrFhSLB=omhNnooQXpJaXeF6bMm0S=iimGeJw@mail.gmail.com>
References: <CAH6Pt5qzienQuOrFhSLB=omhNnooQXpJaXeF6bMm0S=iimGeJw@mail.gmail.com>
Message-ID: <CAH6Pt5ri05RFAAqPO5R60eZBSOe_acJ24yOs+m0NBUe7XLygHA@mail.gmail.com>

Hi,

On Tue, Mar 6, 2012 at 8:07 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> I found this test caused a bus error on current trunk:
>
> <pre>
> import numpy as np
>
> from StringIO import StringIO as BytesIO
>
> from numpy.testing import assert_array_equal
>
>
> def test_2d_buf():
> ? ?dtt = np.complex64
> ? ?arr = np.arange(10, dtype=dtt)
> ? ?# 2D array
> ? ?arr2 = np.reshape(arr, (2, 5))
> ? ?# Fortran write followed by (C or F) read caused bus error
> ? ?data_str = arr2.tostring('F')
> ? ?data_back = np.ndarray(arr2.shape,
> ? ? ? ? ? ? ? ? ? ? ? ? ? arr2.dtype,
> ? ? ? ? ? ? ? ? ? ? ? ? ? buffer=data_str,
> ? ? ? ? ? ? ? ? ? ? ? ? ? order='F')
> ? ?assert_array_equal(arr2, data_back)
> </pre>
>
> gdb run gives ...
>
> test_me3.test_2d_buf ...
> Program received signal SIGBUS, Bus error.
> 0xf78f5458 in _aligned_strided_to_contig_size8 (
> ? ?dst=0xdc0e08
> "\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\373\373\373\373",
> dst_stride=8, src=0xcdfc44 "", src_stride=16, N=5,
> __NPY_UNUSED_TAGGEDsrc_itemsize=8,
> ? ?__NPY_UNUSED_TAGGEDdata=0x0) at
> numpy/core/src/multiarray/lowlevel_strided_loops.c.src:137
> 137 ? ? ? ? ? ? (*((@type@ *)dst)) = @swap@@elsize@(*((@type@ *)src));
>
> Debug log attached. ?Shall I make an issue?

http://projects.scipy.org/numpy/ticket/2076

Best,

Matthew


From christoph.gohle at mpq.mpg.de  Thu Mar  8 19:22:44 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Fri, 9 Mar 2012 01:22:44 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <jjb1tb$ru8$1@dough.gmane.org>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
Message-ID: <8D44D2DD-A3D5-4F91-9BF1-E63BE10546AF@mpq.mpg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi
Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:

> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>> thanks for testing. I have now tried on different platforms. I get
>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>> leak...
>> 
>> Any hints would be welcome.
> 
> The type object inherits `tp_alloc` from Numpy. This routine always
> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
> PyArrayObject. Therefore, the write to new->unit in your
> __array_finalize__ goes to unallocated memory.
> 
> This is probably a bug in Numpy --- arrayobject.c:array_alloc should
> respect the size specified by the subtype.
> 
> A workaround is probably to specify a suitable tp_alloc routine yourself:
> 
>   PyType_GenericAlloc,        /* tp_alloc */
>    unitArray_new,              /* tp_new */
>    _PyObject_Del               /* tp_free */
> 
OK, I did that. And I get no more segfaults as far as I can tell. But there is still a memory leak:

In [1]: import spampub

In [2]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(100000)]

In [3]: del a

after the last two statements, python uses ~60MB more memory than before.

Thanks for your help
Christoph
> -- 
> Pauli Virtanen
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


Christoph Gohle
- --
Max-Planck-Institut f?r Quantenoptik
Abteilung Quantenvielteilchensysteme
Hans-Kopfermann-Strasse 1
85748 Garching

christoph.gohle at mpq.mpg.de
tel: +49 89 32905 283
fax: +49 89 32905 313


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAk9ZTVgACgkQLYu25rCEIztbcwCfcyeQ+FtKTOwFUGbleX/CrjPi
nZcAnj86kejcAO45YbX+I+rxhU9kq4PU
=KGdt
-----END PGP SIGNATURE-----


From salahudeen03 at gmail.com  Fri Mar  9 01:33:16 2012
From: salahudeen03 at gmail.com (salahudeen razac)
Date: Fri, 9 Mar 2012 12:03:16 +0530
Subject: [Numpy-discussion] solving linear equations
In-Reply-To: <mailman.1347.1331203588.1208.numpy-discussion@scipy.org>
References: <mailman.1347.1331203588.1208.numpy-discussion@scipy.org>
Message-ID: <CAENU5GH6Fk53POwQSrZEeDG=BOqHVLyVgmEE7CAs=15PvUkn9Q@mail.gmail.com>

I wrote a script to solve the equation ?*P =Kd .V2 + Kl.VA.BT + C?* . To
solve the equation I have used the matrix method.


Ie, K*X = P          where

*[P] = [             P1*

*                                **P2
a column matrix with the total powers P1, P2,P3??.PN at (V1,T1),(V2,T2)??(
VN.TN) respectively,\*

*                        P3*

*                                ..*

*                                **PN  ]*

* *

*                                    [K] = [            V12      V1A.BT1
1*

*                                                            V22      V2A.B
T2    1*

*                                                            V32      V3A.B
T3    1*

*                                                            ?????????..*

*                                                            VN2      VNA.B
TN    1 ]*

* *

*                                    [X]= [ Kd                           A
column matrix with the unknowns*

*                                                                **Kl*

*                                                                **C  ]*


Now in order to solve multiply both sides with KT, ie             *KT.K.X =
KTP *or *A.X = B* where *A*=* KT.K *and* B*=* KTP*

*
*

Now we will get the matrix X by using *linalg.solve(A,B*) function which
will eventually solve for a set of linear equations of the form Ax=B*
*


I was able to get the solution. But it is not acceptable since some of the
values are negative. SO I want to know is there any way to give some
constraints like the solution should contain only positive values and give
any range for the solution?


I also tried with linalg.lstsq and I am getting the same results
*
-----
Salahudeen razak* *| +917760902602*

*o__*
* _> /__
(_) \(_)*... *Burn fat not fuel
*
*
*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/84a43492/attachment.html>

From christoph.gohle at mpq.mpg.de  Fri Mar  9 02:00:01 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Fri, 9 Mar 2012 08:00:01 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <8D44D2DD-A3D5-4F91-9BF1-E63BE10546AF@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<8D44D2DD-A3D5-4F91-9BF1-E63BE10546AF@mpq.mpg.de>
Message-ID: <14B53E79-54F6-464C-96B9-E72F120D42D0@mpq.mpg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi again,

I don't want to look as if I want other people do my work, so I would like to ask if there is a simple way of tracing memory leaks (without recompiling the python interpreter)?

Cheers,
Christoph
Am 09.03.2012 um 01:22 schrieb Christoph Gohle:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi
> Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:
> 
>> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>>> thanks for testing. I have now tried on different platforms. I get
>>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>>> leak...
>>> 
>>> Any hints would be welcome.
>> 
>> The type object inherits `tp_alloc` from Numpy. This routine always
>> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
>> PyArrayObject. Therefore, the write to new->unit in your
>> __array_finalize__ goes to unallocated memory.
>> 
>> This is probably a bug in Numpy --- arrayobject.c:array_alloc should
>> respect the size specified by the subtype.
>> 
>> A workaround is probably to specify a suitable tp_alloc routine yourself:
>> 
>>  PyType_GenericAlloc,        /* tp_alloc */
>>   unitArray_new,              /* tp_new */
>>   _PyObject_Del               /* tp_free */
>> 
> OK, I did that. And I get no more segfaults as far as I can tell. But there is still a memory leak:
> 
> In [1]: import spampub
> 
> In [2]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(100000)]
> 
> In [3]: del a
> 
> after the last two statements, python uses ~60MB more memory than before.
> 
> Thanks for your help
> Christoph
>> -- 
>> Pauli Virtanen
>> 
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> Christoph Gohle
> - --
> Max-Planck-Institut f?r Quantenoptik
> Abteilung Quantenvielteilchensysteme
> Hans-Kopfermann-Strasse 1
> 85748 Garching
> 
> christoph.gohle at mpq.mpg.de
> tel: +49 89 32905 283
> fax: +49 89 32905 313
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
> 
> iEYEARECAAYFAk9ZTVgACgkQLYu25rCEIztbcwCfcyeQ+FtKTOwFUGbleX/CrjPi
> nZcAnj86kejcAO45YbX+I+rxhU9kq4PU
> =KGdt
> -----END PGP SIGNATURE-----
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


Christoph Gohle
- --
Max-Planck-Institut f?r Quantenoptik
Abteilung Quantenvielteilchensysteme
Hans-Kopfermann-Strasse 1
85748 Garching

christoph.gohle at mpq.mpg.de
tel: +49 89 32905 283
fax: +49 89 32905 313


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAk9ZqnQACgkQLYu25rCEIzthWACgi0dYy2nh83w57Ho8emkvJZ8z
KrkAnistJfaU29tzul8nrJBYsrdmksJk
=Iyr4
-----END PGP SIGNATURE-----


From francesc at continuum.io  Fri Mar  9 02:11:54 2012
From: francesc at continuum.io (Francesc Alted)
Date: Thu, 8 Mar 2012 23:11:54 -0800
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <14B53E79-54F6-464C-96B9-E72F120D42D0@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<8D44D2DD-A3D5-4F91-9BF1-E63BE10546AF@mpq.mpg.de>
	<14B53E79-54F6-464C-96B9-E72F120D42D0@mpq.mpg.de>
Message-ID: <0D96F2CF-069A-44BE-BF2F-F8ACEA8A150B@continuum.io>

Sure.  Check the memcheck tool of Valgrind:

 http://valgrind.org/info/tools.html#memcheck

It is a really amazing tool.

Francesc

On Mar 8, 2012, at 11:00 PM, Christoph Gohle wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hi again,
> 
> I don't want to look as if I want other people do my work, so I would like to ask if there is a simple way of tracing memory leaks (without recompiling the python interpreter)?
> 
> Cheers,
> Christoph
> Am 09.03.2012 um 01:22 schrieb Christoph Gohle:
> 
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>> 
>> Hi
>> Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:
>> 
>>> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>>>> thanks for testing. I have now tried on different platforms. I get
>>>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>>>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>>>> leak...
>>>> 
>>>> Any hints would be welcome.
>>> 
>>> The type object inherits `tp_alloc` from Numpy. This routine always
>>> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
>>> PyArrayObject. Therefore, the write to new->unit in your
>>> __array_finalize__ goes to unallocated memory.
>>> 
>>> This is probably a bug in Numpy --- arrayobject.c:array_alloc should
>>> respect the size specified by the subtype.
>>> 
>>> A workaround is probably to specify a suitable tp_alloc routine yourself:
>>> 
>>> PyType_GenericAlloc,        /* tp_alloc */
>>>  unitArray_new,              /* tp_new */
>>>  _PyObject_Del               /* tp_free */
>>> 
>> OK, I did that. And I get no more segfaults as far as I can tell. But there is still a memory leak:
>> 
>> In [1]: import spampub
>> 
>> In [2]: a=[spampub.UnitArray(i,{'s':i}) for i in xrange(100000)]
>> 
>> In [3]: del a
>> 
>> after the last two statements, python uses ~60MB more memory than before.
>> 
>> Thanks for your help
>> Christoph
>>> -- 
>>> Pauli Virtanen
>>> 
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> 
>> 
>> Christoph Gohle
>> - --
>> Max-Planck-Institut f?r Quantenoptik
>> Abteilung Quantenvielteilchensysteme
>> Hans-Kopfermann-Strasse 1
>> 85748 Garching
>> 
>> christoph.gohle at mpq.mpg.de
>> tel: +49 89 32905 283
>> fax: +49 89 32905 313
>> 
>> 
>> 
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>> 
>> iEYEARECAAYFAk9ZTVgACgkQLYu25rCEIztbcwCfcyeQ+FtKTOwFUGbleX/CrjPi
>> nZcAnj86kejcAO45YbX+I+rxhU9kq4PU
>> =KGdt
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> Christoph Gohle
> - --
> Max-Planck-Institut f?r Quantenoptik
> Abteilung Quantenvielteilchensysteme
> Hans-Kopfermann-Strasse 1
> 85748 Garching
> 
> christoph.gohle at mpq.mpg.de
> tel: +49 89 32905 283
> fax: +49 89 32905 313
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
> 
> iEYEARECAAYFAk9ZqnQACgkQLYu25rCEIzthWACgi0dYy2nh83w57Ho8emkvJZ8z
> KrkAnistJfaU29tzul8nrJBYsrdmksJk
> =Iyr4
> -----END PGP SIGNATURE-----
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- Francesc Alted


From pav at iki.fi  Fri Mar  9 04:08:27 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 09 Mar 2012 10:08:27 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <14B53E79-54F6-464C-96B9-E72F120D42D0@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<8D44D2DD-A3D5-4F91-9BF1-E63BE10546AF@mpq.mpg.de>
	<14B53E79-54F6-464C-96B9-E72F120D42D0@mpq.mpg.de>
Message-ID: <jjchac$asc$1@dough.gmane.org>

09.03.2012 08:00, Christoph Gohle kirjoitti:
> I don't want to look as if I want other people do my work, 
> so I would like to ask if there is a simple way of tracing
> memory leaks (without recompiling the python interpreter)?

The easiest way probably is to compile Numpy with debug symbols on, set
breakpoints to relevant routines in your class and inside Numpy, and
then step through the execution with (c)gdb.

Automatic memory leak detection tools maybe don't help so much here, as
the problem is likely in wrong reference counting.

-- 
Pauli Virtanen


From klemm at phys.ethz.ch  Fri Mar  9 04:35:18 2012
From: klemm at phys.ethz.ch (Hanno Klemm)
Date: Fri, 9 Mar 2012 10:35:18 +0100
Subject: [Numpy-discussion] solving linear equations
In-Reply-To: <CAENU5GH6Fk53POwQSrZEeDG=BOqHVLyVgmEE7CAs=15PvUkn9Q@mail.gmail.com>
References: <mailman.1347.1331203588.1208.numpy-discussion@scipy.org>
	<CAENU5GH6Fk53POwQSrZEeDG=BOqHVLyVgmEE7CAs=15PvUkn9Q@mail.gmail.com>
Message-ID: <A6174E6A-4EF9-4B52-9DA7-C27D7226F16D@phys.ethz.ch>


Hi,

have a look at scipy optimize. For a solution with only positive values you could consider using scipy.optimize.nnls, if you want more general (linear) constraints, have a look at the linear programming functions. 

Another possibility would be looking at openOpt, which has probably more general solvers.


On Mar 9, 2012, at 7:33, salahudeen razac wrote:

> 
> I wrote a script to solve the equation ?P =Kd .V2 + Kl.VA.BT + C? . To solve the equation I have used the matrix method.
> 
>  
> Ie, K*X = P          where
> 
> [P] = [             P1
> 
>                                 P2                                           a column matrix with the total powers P1, P2,P3??.PN at (V1,T1),(V2,T2)??(VN.TN) respectively,\
> 
>                         P3
> 
>                                 ..
> 
>                                 PN  ]
> 
>  
> 
>                                     [K] = [            V12      V1A.BT1    1
> 
>                                                             V22      V2A.BT2    1
> 
>                                                             V32      V3A.BT3    1
> 
>                                                             ?????????..
> 
>                                                             VN2      VNA.BTN    1 ]
> 
>  
> 
>                                     [X]= [ Kd                           A column matrix with the unknowns
> 
>                                                                 Kl
> 
>                                                                 C  ]
> 
>  
> Now in order to solve multiply both sides with KT, ie             KT.K.X = KTP or A.X = B where A= KT.K and B= KTP
> 
> 
> 
> Now we will get the matrix X by using linalg.solve(A,B) function which will eventually solve for a set of linear equations of the form Ax=B
> 
> 
> 
> I was able to get the solution. But it is not acceptable since some of the values are negative. SO I want to know is there any way to give some constraints like the solution should contain only positive values and give any range for the solution?
> 
> 
> 
> I also tried with linalg.lstsq and I am getting the same results
> 
> -----
> Salahudeen razak | +917760902602 
> 
> o__
>  _> /__
> (_) \(_)... Burn fat not fuel
> 
> 
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/c77c341d/attachment.html>

From tmp50 at ukr.net  Fri Mar  9 06:22:30 2012
From: tmp50 at ukr.net (Dmitrey)
Date: Fri, 09 Mar 2012 13:22:30 +0200
Subject: [Numpy-discussion] memory leak in numpy.take
Message-ID: <76769.1331292150.15115500847125430272@ffe1.ukr.net>


memory leak was observed in numpy versions 1.5.1 and latest git trunc

from numpy import *
for i in range(100000):
if i % 100 == 0:
print(i)
a = empty(10000,object)
for j in range(10000):
a[j] = array(1)
a = take(a, range(9000),out=a[:9000])
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/05a4a38c/attachment.html>

From ncreati at inogs.it  Fri Mar  9 08:26:18 2012
From: ncreati at inogs.it (Nicola Creati)
Date: Fri, 09 Mar 2012 14:26:18 +0100
Subject: [Numpy-discussion] Fromfile Issue
Message-ID: <4F5A04FA.9010302@inogs.it>

Hello,
I'm writing a library able to read LAS lidar files. I generally use it 
under Linux without any problems. I'm now testing my library on a 
Windows 7 64 bit computer and I meet some problems reading the file. I 
generally use fromfile function to read the file. I noted that the 
fromfile returns an arry with some zeros at the end if I read the whole 
file or if i read only a finite number of data i get only a fraction of 
them. By the way I have created a sample script that reproduces the 
problem I noted:

""" Sample code """
import numpy as np

sz = 1000000
data = np.random.random(sz)

output = open('test.bin', 'w')
data.tofile(output)
output.close()
del output

print 'Original data:', data

fid = open('test.bin', 'r')
new_data = np.fromfile(fid)
fid.close()
del fid
print 'Data read:',  new_data

fid = open('test.bin', 'r')
slice = np.fromfile(fid, count=100)
fid.close()
print '100 read data:',  slice


The "new_data" array has several zeros at the end. The "slice" array has 
not 100 items. At every code run the "slice" array size changes.

I'm using numpy 1.6.1 64bit (taken from Christoph Gohlke website) , with 
python 2.7.2 64 bit.
Under Linux  Ubuntu 10.10 64bit and numpy 1.6.1 the code works as expected.
Thanks.

Nicola

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/b4ae4a72/attachment.html>

From warren.weckesser at enthought.com  Fri Mar  9 08:33:05 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Fri, 9 Mar 2012 07:33:05 -0600
Subject: [Numpy-discussion] Fromfile Issue
In-Reply-To: <4F5A04FA.9010302@inogs.it>
References: <4F5A04FA.9010302@inogs.it>
Message-ID: <CAM-+wY_BjQ4V-UY_FaReJgKXUo4AjEJqu_CCBTicwrw+H0A5Nw@mail.gmail.com>

On Fri, Mar 9, 2012 at 7:26 AM, Nicola Creati <ncreati at inogs.it> wrote:

>  Hello,
> I'm writing a library able to read LAS lidar files. I generally use it
> under Linux without any problems. I'm now testing my library on a Windows 7
> 64 bit computer and I meet some problems reading the file. I generally use
> fromfile function to read the file. I noted that the fromfile returns an
> arry with some zeros at the end if I read the whole file or if i read only
> a finite number of data i get only a fraction of them. By the way I have
> created a sample script that reproduces the problem I noted:
>
> """ Sample code """
> import numpy as np
>
> sz = 1000000
> data = np.random.random(sz)
>
> output = open('test.bin', 'w')
> data.tofile(output)
> output.close()
> del output
>
> print 'Original data:', data
>
> fid = open('test.bin', 'r')
> new_data = np.fromfile(fid)
> fid.close()
> del fid
> print 'Data read:',  new_data
>
> fid = open('test.bin', 'r')
> slice = np.fromfile(fid, count=100)
> fid.close()
> print '100 read data:',  slice
>
>
> The "new_data" array has several zeros at the end. The "slice" array has
> not 100 items. At every code run the "slice" array size changes.
>
> I'm using numpy 1.6.1 64bit (taken from Christoph Gohlke website) , with
> python 2.7.2 64 bit.
> Under Linux  Ubuntu 10.10 64bit and numpy 1.6.1 the code works as expected.
> Thanks.
>
> Nicola
>
>

Use the binary mode ('wb' and 'rb') when you open the files.

Warren


>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/edd86828/attachment.html>

From ncreati at inogs.it  Fri Mar  9 08:53:46 2012
From: ncreati at inogs.it (Nicola Creati)
Date: Fri, 09 Mar 2012 14:53:46 +0100
Subject: [Numpy-discussion] Fromfile Issue
In-Reply-To: <CAM-+wY_BjQ4V-UY_FaReJgKXUo4AjEJqu_CCBTicwrw+H0A5Nw@mail.gmail.com>
References: <4F5A04FA.9010302@inogs.it>
	<CAM-+wY_BjQ4V-UY_FaReJgKXUo4AjEJqu_CCBTicwrw+H0A5Nw@mail.gmail.com>
Message-ID: <4F5A0B6A.5070304@inogs.it>

Il 09/03/2012 14:33, Warren Weckesser ha scritto:
>
>
> On Fri, Mar 9, 2012 at 7:26 AM, Nicola Creati <ncreati at inogs.it 
> <mailto:ncreati at inogs.it>> wrote:
>
>     Hello,
>     I'm writing a library able to read LAS lidar files. I generally
>     use it under Linux without any problems. I'm now testing my
>     library on a Windows 7 64 bit computer and I meet some problems
>     reading the file. I generally use fromfile function to read the
>     file. I noted that the fromfile returns an arry with some zeros at
>     the end if I read the whole file or if i read only a finite number
>     of data i get only a fraction of them. By the way I have created a
>     sample script that reproduces the problem I noted:
>
>     """ Sample code """
>     import numpy as np
>
>     sz = 1000000
>     data = np.random.random(sz)
>
>     output = open('test.bin', 'w')
>     data.tofile(output)
>     output.close()
>     del output
>
>     print 'Original data:', data
>
>     fid = open('test.bin', 'r')
>     new_data = np.fromfile(fid)
>     fid.close()
>     del fid
>     print 'Data read:',  new_data
>
>     fid = open('test.bin', 'r')
>     slice = np.fromfile(fid, count=100)
>     fid.close()
>     print '100 read data:',  slice
>
>
>     The "new_data" array has several zeros at the end. The "slice"
>     array has not 100 items. At every code run the "slice" array size
>     changes.
>
>     I'm using numpy 1.6.1 64bit (taken from Christoph Gohlke website)
>     , with python 2.7.2 64 bit.
>     Under Linux  Ubuntu 10.10 64bit and numpy 1.6.1 the code works as
>     expected.
>     Thanks.
>
>     Nicola
>
>
>
> Use the binary mode ('wb' and 'rb') when you open the files.
>
> Warren
>
>
>     _______________________________________________
>     NumPy-Discussion mailing list
>     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>

Hello,
thank you, I did not know that on Window you must explicitly use 'rb' or 
'wb'.

Cheers.

Nicola


>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/98c70ad0/attachment.html>

From Andreas.FRENK at 3ds.com  Fri Mar  9 09:07:56 2012
From: Andreas.FRENK at 3ds.com (FRENK Andreas)
Date: Fri, 9 Mar 2012 14:07:56 +0000
Subject: [Numpy-discussion] Memory leak in numpy
Message-ID: <5F84D7606A50BF4BA04EB53E5E4AF2F5A1E007CA@EU-DCC-MBX03.dsone.3ds.com>

Dear Community,

I have an issue with numpy consuming more and more memory.
According to ticket: http://projects.scipy.org/numpy/ticket/1427

This is a known issue. It should be fixed in 2.0.0.dev-9451260. What does this mean? It sound like a development release.
Is there any chance to fix it in 1.6.1 also?
Or, would it be possible to send me the fixes to include it in 1.6.1 (or 1.4.1) on my own?!

Thanks,
Andreas

This email and any attachments are intended solely for the use of the individual or entity to whom it is addressed and may be confidential and/or privileged.

If you are not one of the named recipients or have received this email in error,

(i) you should not read, disclose, or copy it,

(ii) please notify sender of your receipt by reply email and delete this email and all attachments,

(iii) Dassault Systemes does not accept or assume any liability or responsibility for any use of or reliance on this email.

For other languages, go to http://www.3ds.com/terms/email-disclaimer
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/a53e144d/attachment.html>

From bryanv at continuum.io  Fri Mar  9 11:55:00 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Fri, 09 Mar 2012 10:55:00 -0600
Subject: [Numpy-discussion] draft enum NEP
Message-ID: <4F5A35E4.6080501@continuum.io>

Hi all,

I have started working on a NEP for adding an enumerated type to NumPy. 
It is on my GitHub:

     https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst

It is still very rough, and incomplete in places. But I would like to 
get feedback sooner rather than later in order to refine it. In 
particular there are a few questions inline in the document that I would 
like input on. Any comments, suggestions, questions, concerns, etc. are 
very welcome.

Thanks,

Bryan


From sameer.grover.1 at gmail.com  Fri Mar  9 14:50:06 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Sat, 10 Mar 2012 01:20:06 +0530
Subject: [Numpy-discussion] f2py and pygtk on windows
Message-ID: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>

>>>import gtk
>>>import foo # where foo is any f2py-wrapped program

Subsequently, on exiting python interpreter, the interpreter crashes
with this error message - "This application has requested the Runtime
to terminate it in an unusual way. Please contact the application's
support team for more information."

Strangely enough, interchanging the order of the import statements,
i.e. importing the f2py wrapped program before gtk works fine.
Furthermore, each module works fine individually.

This is a windows-only problem. I'm using Windows 7, Python 2.7,
latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
(mentioned on the pygtk download page).

This happens even for very simple fortran programs such as this one -
subroutine hello ()
    write(*,*)'Hello from Fortran90!!!'
end subroutine hello

I don't know whether the problem is with f2py or with gtk or with
python but maybe somebody can shed some light on this.

Regards,
Sameer Grover


From njs at pobox.com  Fri Mar  9 15:00:48 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 9 Mar 2012 20:00:48 +0000
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <4F5A35E4.6080501@continuum.io>
References: <4F5A35E4.6080501@continuum.io>
Message-ID: <CAPJVwBn8BeWcPOUdoeak-B5HuVqq3z+NrhRjxEnS0Qbe-PB1OQ@mail.gmail.com>

On Fri, Mar 9, 2012 at 4:55 PM, Bryan Van de Ven <bryanv at continuum.io> wrote:
> Hi all,
>
> I have started working on a NEP for adding an enumerated type to NumPy.
> It is on my GitHub:
>
> ? ? https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>
> It is still very rough, and incomplete in places. But I would like to
> get feedback sooner rather than later in order to refine it. In
> particular there are a few questions inline in the document that I would
> like input on. Any comments, suggestions, questions, concerns, etc. are
> very welcome.

Hi Bryan,

That's excellent, an enumerated type would be very useful. From a
quick read, though, what I'd really like to see is some discussion of
the goals here -- like some example situations where you see these
being used, and the problems they're intended to solve? Because for
example, C "enums" are designed to solve a completely different
problem than something like an R "factor", and off the top of my head
I don't know how well either maps onto hdf5 enumerated types. Another
example is that I can't tell from the document what the motivation for
having both "open" and "closed" enums is?

(Also, general question: is there some technical advantage to being
able to represent more complicated dtypes as strings, that justifies
making up these mini-languages like "enum:uint16[A, B, C, D, E:128]"?
It can't be necessary for pickling or anything, right, since AFAICT
there's already no string representation for structured dtypes? It
just seems like it'd be simpler and more elegant to use some Python
syntax like 'dtype(Enum(["a", "b", "c"], storage=np.uint16))' instead
of writing a tiny one-off parser and wedging what's really a data
structure into a string, but I may be missing something.)

-- Nathaniel


From sameer.grover.1 at gmail.com  Fri Mar  9 15:05:33 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Sat, 10 Mar 2012 01:35:33 +0530
Subject: [Numpy-discussion] f2py function name issues
Message-ID: <CAOxZyWP+R8rv5t6OY4OZ5fDaVpw4j2wezr-g9=469xbAW+Twcw@mail.gmail.com>

subroutine union ()
   write(*,*)'Hello from Fortran90!!!'
end subroutine union

f2py is unable to wrap this function. Changing the name of the
subroutine from 'union' to something else works. I understand that
problems like these have been reported a long time ago
(http://cens.ioc.ee/pipermail/f2py-users/2003-November/000605.html)
with some other function names as well.

This looks like a bug to me or is there something I'm missing?


Regards,
Sameer Grover


From ralf.gommers at googlemail.com  Fri Mar  9 15:28:58 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 9 Mar 2012 21:28:58 +0100
Subject: [Numpy-discussion] Memory leak in numpy
In-Reply-To: <5F84D7606A50BF4BA04EB53E5E4AF2F5A1E007CA@EU-DCC-MBX03.dsone.3ds.com>
References: <5F84D7606A50BF4BA04EB53E5E4AF2F5A1E007CA@EU-DCC-MBX03.dsone.3ds.com>
Message-ID: <CABL7CQiT430a+mZB8Stb6PDGmPwGhMyaAGuDATag-9g2JZOcOw@mail.gmail.com>

On Fri, Mar 9, 2012 at 3:07 PM, FRENK Andreas <Andreas.FRENK at 3ds.com> wrote:

>  Dear Community,
>
>
>
> I have an issue with numpy consuming more and more memory.
>
> According to ticket: http://projects.scipy.org/numpy/ticket/1427
>
>
>
> This is a known issue. It should be fixed in 2.0.0.dev-9451260. What does
> this mean? It sound like a development release.
>
> Is there any chance to fix it in 1.6.1 also?
>
> Or, would it be possible to send me the fixes to include it in 1.6.1 (or
> 1.4.1) on my own?!
>

The ticket you linked indicates it was fixed in the 1.6.0 release. If
you're sure that you still have the same issue, please reopen that ticket
with a self-contained example that reproducibly demonstrates your issue.
Also include your OS, compilers used and Python version. If you're not sure
that it's the same issue, you can open a new ticket instead.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/08618451/attachment.html>

From pav at iki.fi  Fri Mar  9 15:29:37 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 09 Mar 2012 21:29:37 +0100
Subject: [Numpy-discussion] f2py function name issues
In-Reply-To: <CAOxZyWP+R8rv5t6OY4OZ5fDaVpw4j2wezr-g9=469xbAW+Twcw@mail.gmail.com>
References: <CAOxZyWP+R8rv5t6OY4OZ5fDaVpw4j2wezr-g9=469xbAW+Twcw@mail.gmail.com>
Message-ID: <jjdp7h$7p2$1@dough.gmane.org>

09.03.2012 21:05, Sameer Grover kirjoitti:
> subroutine union ()
>    write(*,*)'Hello from Fortran90!!!'
> end subroutine union
> 
> f2py is unable to wrap this function. Changing the name of the
> subroutine from 'union' to something else works. I understand that
> problems like these have been reported a long time ago
> (http://cens.ioc.ee/pipermail/f2py-users/2003-November/000605.html)
> with some other function names as well.
> 
> This looks like a bug to me or is there something I'm missing?

f2py indeed seems to skip names that collide with C keywords. The reason
probably is that it's not guaranteed that the symbol name is mangled to
"union_" but it could also become "union". If the name matches a C
keyword, there's probably no (sane) way to call the function from the C
language, and since f2py goes through C, it's not able to call that.

Whether this is a real problem in practice or not is then a different
question.

The full list of "bad" names is

'int','double','float','char','short','long','void','case','while',
'return','signed','unsigned','if','for','typedef','sizeof','union',
'struct','static','register','new','break','do','goto','switch',
'continue','else','inline','extern','delete','const','auto',
'len','rank','shape','index','slen','size','_i',
'max', 'min', 'flen','fshape',
'string','complex_double','float_double','stdin','stderr','stdout',
'type','default'


From sameer.grover.1 at gmail.com  Fri Mar  9 15:42:05 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Sat, 10 Mar 2012 02:12:05 +0530
Subject: [Numpy-discussion] f2py function name issues
In-Reply-To: <jjdp7h$7p2$1@dough.gmane.org>
References: <CAOxZyWP+R8rv5t6OY4OZ5fDaVpw4j2wezr-g9=469xbAW+Twcw@mail.gmail.com>
	<jjdp7h$7p2$1@dough.gmane.org>
Message-ID: <CAOxZyWPPy2Wo+S=Jwz_2mV2LyEFYie_LQuJnW-HPXb-UC2ZxhA@mail.gmail.com>

Thank you very much for the clarification. It would be nice if this
were mentioned in the documentation.

Sameer

On 10 March 2012 01:59, Pauli Virtanen <pav at iki.fi> wrote:
> 09.03.2012 21:05, Sameer Grover kirjoitti:
>> subroutine union ()
>> ? ?write(*,*)'Hello from Fortran90!!!'
>> end subroutine union
>>
>> f2py is unable to wrap this function. Changing the name of the
>> subroutine from 'union' to something else works. I understand that
>> problems like these have been reported a long time ago
>> (http://cens.ioc.ee/pipermail/f2py-users/2003-November/000605.html)
>> with some other function names as well.
>>
>> This looks like a bug to me or is there something I'm missing?
>
> f2py indeed seems to skip names that collide with C keywords. The reason
> probably is that it's not guaranteed that the symbol name is mangled to
> "union_" but it could also become "union". If the name matches a C
> keyword, there's probably no (sane) way to call the function from the C
> language, and since f2py goes through C, it's not able to call that.
>
> Whether this is a real problem in practice or not is then a different
> question.
>
> The full list of "bad" names is
>
> 'int','double','float','char','short','long','void','case','while',
> 'return','signed','unsigned','if','for','typedef','sizeof','union',
> 'struct','static','register','new','break','do','goto','switch',
> 'continue','else','inline','extern','delete','const','auto',
> 'len','rank','shape','index','slen','size','_i',
> 'max', 'min', 'flen','fshape',
> 'string','complex_double','float_double','stdin','stderr','stdout',
> 'type','default'
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From cgohlke at uci.edu  Fri Mar  9 15:53:57 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Fri, 09 Mar 2012 12:53:57 -0800
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
Message-ID: <4F5A6DE5.6000602@uci.edu>


On 3/9/2012 11:50 AM, Sameer Grover wrote:
>>>> import gtk
>>>> import foo # where foo is any f2py-wrapped program
>
> Subsequently, on exiting python interpreter, the interpreter crashes
> with this error message - "This application has requested the Runtime
> to terminate it in an unusual way. Please contact the application's
> support team for more information."
>
> Strangely enough, interchanging the order of the import statements,
> i.e. importing the f2py wrapped program before gtk works fine.
> Furthermore, each module works fine individually.
>
> This is a windows-only problem. I'm using Windows 7, Python 2.7,
> latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
> (mentioned on the pygtk download page).
>
> This happens even for very simple fortran programs such as this one -
> subroutine hello ()
>      write(*,*)'Hello from Fortran90!!!'
> end subroutine hello
>
> I don't know whether the problem is with f2py or with gtk or with
> python but maybe somebody can shed some light on this.
>
> Regards,
> Sameer Grover

The error can be due to memory corruption. It works for me with 
msvc9/ifort builds of the pygtk and f2py extensions.

Which DLLs does foo.pyd depend on (use DependencyWalker)?

Christoph


From 00ai99 at gmail.com  Fri Mar  9 17:48:42 2012
From: 00ai99 at gmail.com (David Gowers (kampu))
Date: Sat, 10 Mar 2012 09:18:42 +1030
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <4F5A35E4.6080501@continuum.io>
References: <4F5A35E4.6080501@continuum.io>
Message-ID: <CAKcu9cdxLeE8O5dRjKJ02ryhO_w6WdACLDptyCCj3v_ZSnuwCQ@mail.gmail.com>

Hi,

On Sat, Mar 10, 2012 at 3:25 AM, Bryan Van de Ven <bryanv at continuum.io> wrote:
> Hi all,
>
> I have started working on a NEP for adding an enumerated type to NumPy.
> It is on my GitHub:
>
> ? ? https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>
> It is still very rough, and incomplete in places. But I would like to
> get feedback sooner rather than later in order to refine it. In
> particular there are a few questions inline in the document that I would
> like input on. Any comments, suggestions, questions, concerns, etc. are
> very welcome.

"t = np.dtype('enum', map=(n,v))"

^ Is this supposed to be indicating 'this is an enum with values
ranging between n and v'? It could be a bit more clear.

Is it possible to partially define an enum? That is, give the maximum
and minimum values, and only some of the enumeration value:name
mappings?
For example, an enum where 0 means 'n/a', +n means 'Type A Object
#(n-1)' and -n means 'Type B Object #(abs(n) - 1)'. I just want to map
the non-scalar values, while having a way to avoid treating valid
scalar values (eg +64) as out-of-range.
Example of what I mean:

"t = np.dtype('enum[N_A:0]', range = (-127, 127))"
(defined values being printed as a string, undefined being printed as a number.)

David


From ralf.gommers at googlemail.com  Fri Mar  9 17:57:26 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 9 Mar 2012 23:57:26 +0100
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <4F57DAFF.3030302@crans.org>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>
	<4F5790BB.2050801@crans.org>
	<CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>
	<4F57DAFF.3030302@crans.org>
Message-ID: <CABL7CQgDjboUHUYWs8BLyDuQRWCxDmnsLh2vZp26+u8Ve1XXFQ@mail.gmail.com>

On Wed, Mar 7, 2012 at 11:02 PM, Pierre Haessig <pierre.haessig at crans.org>wrote:

> Hi Charles,
> Le 07/03/2012 18:00, Charles R Harris a ?crit :
> >
> > That's a good idea, I'll take care of it. Note the caveat about the
> > coefficients going in the opposite direction.
> Great ! In the mean time I changed a bit the root polynomials reference
> to emphasize the new Polynomial class.
>
>
> http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.polynomials.rst/
>
>
Thanks Pierre.


> I feel like that the preexisting text was more targeting people with a
> preexisting knowledge of Poly1d than people looking for Polynomials in
> NumPy in general. (Poly1d is unfortunately the number 1 Google result
> for "polynomials numpy"...)
>
> Please make sure I didn't write something completely stupid. Also, I
> tried to include links using :doc: but the editor complains. I hope it
> will work after a sphinx compilation. There is a build bot, isn't it ?
> The aim of the first link is that the "user-in-a-hurry" ends up on the
> best piece of documentation available, namely "Using the Convenience
> Classes" after a short intro. Does this sounds good ?
>

The buildbot doesn't check the doc build. I've edited a few of the links.
The :doc: link should work I think, but I'll make sure to check when
merging your edit into the master branch. The text looks good.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/e60758e5/attachment.html>

From charlesr.harris at gmail.com  Fri Mar  9 18:13:53 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 9 Mar 2012 16:13:53 -0700
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <CABL7CQgDjboUHUYWs8BLyDuQRWCxDmnsLh2vZp26+u8Ve1XXFQ@mail.gmail.com>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>
	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>
	<4F5790BB.2050801@crans.org>
	<CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>
	<4F57DAFF.3030302@crans.org>
	<CABL7CQgDjboUHUYWs8BLyDuQRWCxDmnsLh2vZp26+u8Ve1XXFQ@mail.gmail.com>
Message-ID: <CAB6mnxJaxYtNo444Vk7EFa-JAQ8zJ91JQK2jziBDF+=J9TDzWw@mail.gmail.com>

On Fri, Mar 9, 2012 at 3:57 PM, Ralf Gommers <ralf.gommers at googlemail.com>wrote:

>
>
> On Wed, Mar 7, 2012 at 11:02 PM, Pierre Haessig <pierre.haessig at crans.org>wrote:
>
>> Hi Charles,
>> Le 07/03/2012 18:00, Charles R Harris a ?crit :
>> >
>> > That's a good idea, I'll take care of it. Note the caveat about the
>> > coefficients going in the opposite direction.
>> Great ! In the mean time I changed a bit the root polynomials reference
>> to emphasize the new Polynomial class.
>>
>>
>> http://docs.scipy.org/numpy/docs/numpy-docs/reference/routines.polynomials.rst/
>>
>>
> Thanks Pierre.
>
>
>> I feel like that the preexisting text was more targeting people with a
>> preexisting knowledge of Poly1d than people looking for Polynomials in
>> NumPy in general. (Poly1d is unfortunately the number 1 Google result
>> for "polynomials numpy"...)
>>
>> Please make sure I didn't write something completely stupid. Also, I
>> tried to include links using :doc: but the editor complains. I hope it
>> will work after a sphinx compilation. There is a build bot, isn't it ?
>> The aim of the first link is that the "user-in-a-hurry" ends up on the
>> best piece of documentation available, namely "Using the Convenience
>> Classes" after a short intro. Does this sounds good ?
>>
>
> The buildbot doesn't check the doc build. I've edited a few of the links.
> The :doc: link should work I think, but I'll make sure to check when
> merging your edit into the master branch. The text looks good.
>
>
Thanks Pierre, that reads nicely. The lack of knowledge about the package
is probably my fault, since I neglected to do more than the minimal amount
of reference documentation until recently. Undocumented features are
ghostly creatures without tangible reality...

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120309/06fd9b7a/attachment.html>

From sturla at molden.no  Sat Mar 10 16:27:42 2012
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Mar 2012 22:27:42 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy	C API
In-Reply-To: <4F57BED5.7050305@esrf.fr>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
	<4F57BED5.7050305@esrf.fr>
Message-ID: <4F5BC74E.2080901@molden.no>

Den 07.03.2012 21:02, skrev "V. Armando Sol?":
>
> I had already used the information Robert Kern provided on the 2009 
> thread and obtained the PyCObject as:
>
> from scipy.linalg.blas import fblas
> dgemm = fblas.dgemm._cpointer
> sgemm = fblas.sgemm._cpointer
>
> but I did not find a way to obtain those pointers from numpy. That was 
> the goal of my post. My extension needs SciPy installed just to fetch 
> the pointer. It would be very nice to have a way to get similar 
> information from numpy.


By the way, here is a code I wrote to use DGELS instead of DGELSS for 
least-squares (i.e. QR instead of SVD).  It shows how we can grab a 
LAPACK function from MKL when using EPD.

Sturla


import numpy as np
import scipy as sp
from scipy.linalg import LinAlgError

try:

     import ctypes
     from numpy.ctypeslib import ndpointer

     _c_int_p = ctypes.POINTER(ctypes.c_int)
     _c_double_p = ctypes.POINTER(ctypes.c_double)
     _double_array_1d = ndpointer(dtype=np.float64, ndim=1, 
flags='C_CONTIGUOUS' )
     _int_array_1d = ndpointer(dtype=np.int32, ndim=1, 
flags='C_CONTIGUOUS' )
     _double_array_2d = ndpointer(dtype=np.float64, ndim=2, 
flags='F_CONTIGUOUS' )

     intel_mkl = ctypes.CDLL('mk2_rt.dll')

     dgels = intel_mkl.DGELS
     dgels.restype = None
     dgels.argtypes = (ctypes.c_char_p, _c_int_p, _c_int_p, _c_int_p, # 
TRANS, M, N, NRHS
                         _double_array_2d, _c_int_p, # A, LDA
                         _double_array_2d, _c_int_p, # b, LDB
                         _double_array_1d, _c_int_p, _c_int_p) # WORK, 
LWORK, INFO

     _one = ctypes.c_int(1)
     _minus_one = ctypes.c_int(-1)
     _no_transpose = ctypes.c_char_p("N")

     def copy_fortran(x, dtype=np.float64):
         return np.array(x, dtype=dtype, copy=True, order='F')

     def lstsq( X, y ):
         assert(X.ndim == 2)
         assert(y.ndim == 1)
         X = copy_fortran(X)
         y = copy_fortran(y[:,np.newaxis])
         assert(X.shape[0] == y.shape[0])
         m = ctypes.c_int(X.shape[0])
         n = ctypes.c_int(X.shape[1])
         ldx = ctypes.c_int(m.value)
         ldy = ctypes.c_int(m.value)
         info = ctypes.c_int(0)

         swork = np.zeros(1, dtype=np.float64)
         dgels(_no_transpose, ctypes.byref(m), ctypes.byref(n), 
ctypes.byref(_one),
                 X, ctypes.byref(ldx), y,  ctypes.byref(ldy),
                 swork, ctypes.byref(_minus_one),
                 ctypes.byref(info))

         if info.value < 0:
             raise ValueError, 'illegal argument to lapack dgels: arg 
no. %d' % (-info.value,)
         if info.value > 0:
             raise LinAlgError, 'lapack error %d, X does not have full 
rank' % (info.value,)

         lwork = ctypes.c_int(int(swork[0]))
         work = np.zeros(lwork.value, dtype=np.float64)

         dgels(_no_transpose, ctypes.byref(m), ctypes.byref(n), 
ctypes.byref(_one),
                 X, ctypes.byref(ldx), y,  ctypes.byref(ldy),
                 work, ctypes.byref(lwork),
                 ctypes.byref(info))

         if info.value < 0:
             raise ValueError, 'illegal argument to lapack dgels: arg 
no. %d' % (-info.value,)
         if info.value != 0:
             raise LinAlgError, 'lapack error %d, X does not have full 
rank' % (info.value,)

         return (y[:X.shape[1],0],)

except:
     from scipy.linalg import lstsq


def linreg(y, X):
     return lstsq(X,y)[0]


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120310/8c1d4d61/attachment.html>

From sturla at molden.no  Sat Mar 10 16:56:34 2012
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Mar 2012 22:56:34 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy	C API
In-Reply-To: <4F57BED5.7050305@esrf.fr>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
	<4F57BED5.7050305@esrf.fr>
Message-ID: <4F5BCE12.6070303@molden.no>

Den 07.03.2012 21:02, skrev "V. Armando Sol?":
>
> I had already used the information Robert Kern provided on the 2009 
> thread and obtained the PyCObject as:
>
> from scipy.linalg.blas import fblas
> dgemm = fblas.dgemm._cpointer
> sgemm = fblas.sgemm._cpointer
>
> but I did not find a way to obtain those pointers from numpy. That was 
> the goal of my post. My extension needs SciPy installed just to fetch 
> the pointer. It would be very nice to have a way to get similar 
> information from numpy.

The problem here is that NumPy's lapack_lite is compiled to C with f2c, 
and there is a handwritten C extension module (lapack_litemodule.c) to 
use it. Unlike SciPy, it does not use f2py to call Fortran LAPACK directly.

I am not sure why NumPy uses f2c'd routines instead of a dependency on 
BLAS and LAPACK like SciPy.

And by Murphy's law, function pointers to the f2c'd LAPACK routines are 
not exported from lapack_lite.pyd. So it does not even help to load it 
with ctypes as an ordinary DLL.

Sturla


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120310/8e0a76a4/attachment.html>

From sturla at molden.no  Sat Mar 10 17:19:27 2012
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Mar 2012 23:19:27 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy	C API
In-Reply-To: <4F5BCE12.6070303@molden.no>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
	<4F57BED5.7050305@esrf.fr> <4F5BCE12.6070303@molden.no>
Message-ID: <4F5BD36F.5010502@molden.no>

Den 10.03.2012 22:56, skrev Sturla Molden:
>
> I am not sure why NumPy uses f2c'd routines instead of a dependency on 
> BLAS and LAPACK like SciPy.

Actually, np.dot does depend on the CBLAS interface to BLAS (_dotblas.c).

But the lapack methods in lapack_lite seems to use f2c'd code. I am not 
sure if they will use an optimized BLAS or just link to f2c's BLAS in 
blas_lite.c.

If the intention is to avoid Fortran dependency in NumPy, I am not sure 
why this is better than a dependency on CBLAS and LAPACKE.

Sturla


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120310/89b4ea4d/attachment.html>

From pav at iki.fi  Sat Mar 10 19:12:06 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 01:12:06 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
	numpy C API
In-Reply-To: <4F5BD36F.5010502@molden.no>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
	<4F57BED5.7050305@esrf.fr> <4F5BCE12.6070303@molden.no>
	<4F5BD36F.5010502@molden.no>
Message-ID: <jjgqkm$2e1$1@dough.gmane.org>

10.03.2012 23:19, Sturla Molden kirjoitti:
[clip]
> If the intention is to avoid Fortran dependency in NumPy, I am not sure
> why this is better than a dependency on CBLAS and LAPACKE.

The CBLAS parts aren't compiled if the library is not available --- in
that case Numpy just falls back to a totally naive implementation of the
matrix multiplication.

	Pauli


From sole at esrf.fr  Sat Mar 10 19:12:24 2012
From: sole at esrf.fr (sole at esrf.fr)
Date: Sun, 11 Mar 2012 01:12:24 +0100
Subject: [Numpy-discussion] (2012) Accessing LAPACK and BLAS from the
 numpy C API
In-Reply-To: <4F5BD36F.5010502@molden.no>
References: <4F54BEEA.4070503@esrf.fr> <4F566C29.3050209@molden.no>
	<4F57BED5.7050305@esrf.fr> <4F5BCE12.6070303@molden.no>
	<4F5BD36F.5010502@molden.no>
Message-ID: <20120311011224.Horde.XpQPfnXVVqxPW_3o0RHHaxA@160.103.2.152>

Hi Sturla,

Quoting Sturla Molden <sturla at molden.no>:

> Den 10.03.2012 22:56, skrev Sturla Molden:
>>
>> I am not sure why NumPy uses f2c'd routines instead of a dependency  
>> on BLAS and LAPACK like SciPy.
>
> Actually, np.dot does depend on the CBLAS interface to BLAS (_dotblas.c).
>
> But the lapack methods in lapack_lite seems to use f2c'd code. I am  
> not sure if they will use an optimized BLAS or just link to f2c's  
> BLAS in blas_lite.c.
>
> If the intention is to avoid Fortran dependency in NumPy, I am not  
> sure why this is better than a dependency on CBLAS and LAPACKE.
>

Thanks for the more complete information. Now I understand better why  
it is more difficult to access the underlying libraries when using  
numpy instead of scipy.

My main objective was to avoid having to ship libraries with my python  
extension modules. Using those already available via numpy seemed to  
me the most natural option since my C extension depends on numpy but  
not on scipy. In addition, people using my extension module could  
benefit from better libraries than those I would be able to ship.

Eliminating the fortran dependency appeared to me as an added bonus: I  
have a 64 bit intel fortran compiler license for the only reason of  
not being limited by a fortran dependency.

Best regards,

Armando


From sameer.grover.1 at gmail.com  Sun Mar 11 00:31:53 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Sun, 11 Mar 2012 11:01:53 +0530
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <4F5A6DE5.6000602@uci.edu>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
	<4F5A6DE5.6000602@uci.edu>
Message-ID: <CAOxZyWNsY69G6UzbwUsb0vKgyUnTTTMQv8LwmrGG-kPNMwAK2w@mail.gmail.com>

On 10 March 2012 02:23, Christoph Gohlke <cgohlke at uci.edu> wrote:
>
>
> On 3/9/2012 11:50 AM, Sameer Grover wrote:
>>>>> import gtk
>>>>> import foo # where foo is any f2py-wrapped program
>>
>> Subsequently, on exiting python interpreter, the interpreter crashes
>> with this error message - "This application has requested the Runtime
>> to terminate it in an unusual way. Please contact the application's
>> support team for more information."
>>
>> Strangely enough, interchanging the order of the import statements,
>> i.e. importing the f2py wrapped program before gtk works fine.
>> Furthermore, each module works fine individually.
>>
>> This is a windows-only problem. I'm using Windows 7, Python 2.7,
>> latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
>> (mentioned on the pygtk download page).
>>
>> This happens even for very simpl fortran programs such as this one -
>> subroutine hello ()
>> ? ? ?write(*,*)'Hello from Fortran90!!!'
>> end subroutine hello
>>
>> I don't know whether the problem is with f2py or with gtk or with
>> python but maybe somebody can shed some light on this.
>>
>> Regards,
>> Sameer Grover
>
> The error can be due to memory corruption. It works for me with
> msvc9/ifort builds of the pygtk and f2py extensions.
>
> Which DLLs does foo.pyd depend on (use DependencyWalker)?
>
> Christoph
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

The .pyd file generated by f2py depends on the following

KERNEL32.DLL
MSVCRT.DLL
LIBGCC_S_DW2-1.DLL
LIBGFORTRAN-3.DLL
PYTHON27.DLL

It is possible that this is a mingw32 problem. I haven't been able to
try using ifort/msvc.(mscv is somehow not getting installed on my
system, but that's a separate issue).

Sameer


From teoliphant at gmail.com  Sun Mar 11 01:35:36 2012
From: teoliphant at gmail.com (Travis Oliphant)
Date: Sun, 11 Mar 2012 00:35:36 -0600
Subject: [Numpy-discussion] Looking for people interested in helping with
	Python compiler to LLVM
Message-ID: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>

Hey all, 

I gave a lightning talk this morning on numba which is the start of a Python compiler to machine code through the LLVM tool-chain.   It is proof of concept stage only at this point (use it only if you are interested in helping develop the code at this point).   The only thing that works is a fast-vectorize capability on a few functions (without for-loops).   But, it shows how creating functions in Python that can be used by the NumPy runtime in various ways.   Several NEPS that will be discussed in the coming months will use this concept.  

Right now there is very little design documentation, but I will be adding some in the days ahead, especially if I get people who are interested in collaborating on the project.   I did talk to Fijal and Alex of the PyPy project at PyCon and they both graciously suggested that I look at some of the PyPy code which walks the byte-code and does translation to their intermediate representation for inspiration. 

Again, the code is not ready for use, it is only proof of concept, but I would like to get feedback and help especially from people who might have written compilers before.  The code lives at:   https://github.com/ContinuumIO/numba

-Travis


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120311/df5a1ddd/attachment.html>

From telukpalu at gmail.com  Sun Mar 11 03:06:12 2012
From: telukpalu at gmail.com (aa)
Date: Sun, 11 Mar 2012 14:06:12 +0700
Subject: [Numpy-discussion] about sympy
Message-ID: <4F5C4EE4.9010500@gmail.com>

why sympy cannot integrate sin(x**2)??


From sameer.grover.1 at gmail.com  Sun Mar 11 03:27:01 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Sun, 11 Mar 2012 12:57:01 +0530
Subject: [Numpy-discussion] about sympy
In-Reply-To: <4F5C4EE4.9010500@gmail.com>
References: <4F5C4EE4.9010500@gmail.com>
Message-ID: <CAOxZyWP0oY9o85GUekPcmrZ19Y_E2FWA+SxNU=2-f+ApoDbdrA@mail.gmail.com>

It does not have an anti-derivative in terms of elementary functions.
You could do a series expansion and integrate each term.

Note that this is the numpy mailing list. The sympy mailing list is to
be found at http://sympy.org/en/support.html

Sameer

On 11 March 2012 12:36, aa <telukpalu at gmail.com> wrote:
> why sympy cannot integrate sin(x**2)??
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From kalatsky at gmail.com  Sun Mar 11 03:31:41 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Sun, 11 Mar 2012 01:31:41 -0600
Subject: [Numpy-discussion] about sympy
In-Reply-To: <4F5C4EE4.9010500@gmail.com>
References: <4F5C4EE4.9010500@gmail.com>
Message-ID: <CAE8bXE=kXk21jbLQ-NfTHGzfNNw0a1uU7H6oPMzz48n3L_CbCw@mail.gmail.com>

Can you?
The question should be: Why sympy does not have Fresnel integrals?

On Sun, Mar 11, 2012 at 1:06 AM, aa <telukpalu at gmail.com> wrote:

> why sympy cannot integrate sin(x**2)??
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120311/a6ffa82a/attachment.html>

From mictadlo at gmail.com  Sun Mar 11 03:59:00 2012
From: mictadlo at gmail.com (Mic)
Date: Sun, 11 Mar 2012 17:59:00 +1000
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
Message-ID: <CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>

what is the difference to http://www.python.org/dev/peps/pep-3146/ ?

On Sun, Mar 11, 2012 at 4:35 PM, Travis Oliphant <teoliphant at gmail.com>wrote:

> Hey all,
>
> I gave a lightning talk this morning on numba which is the start of a
> Python compiler to machine code through the LLVM tool-chain.   It is proof
> of concept stage only at this point (use it only if you are interested in
> helping develop the code at this point).   The only thing that works is a
> fast-vectorize capability on a few functions (without for-loops).   But, it
> shows how creating functions in Python that can be used by the NumPy
> runtime in various ways.   Several NEPS that will be discussed in the
> coming months will use this concept.
>
> Right now there is very little design documentation, but I will be adding
> some in the days ahead, especially if I get people who are interested in
> collaborating on the project.   I did talk to Fijal and Alex of the PyPy
> project at PyCon and they both graciously suggested that I look at some of
> the PyPy code which walks the byte-code and does translation to their
> intermediate representation for inspiration.
>
> Again, the code is not ready for use, it is only proof of concept, but I
> would like to get feedback and help especially from people who might have
> written compilers before.  The code lives at:
> https://github.com/ContinuumIO/numba
>
> -Travis
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120311/418a9558/attachment.html>

From cgohlke at uci.edu  Sun Mar 11 04:09:19 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Sun, 11 Mar 2012 00:09:19 -0800
Subject: [Numpy-discussion] ArgumentError when using
	numpy.ctypeslib.ndpointer
Message-ID: <4F5C5DAF.8030105@uci.edu>

Hello,

I am unable to get the simple numpy.ctypeslib.ndpointer docstring 
example from 
<http://docs.scipy.org/doc/numpy/reference/routines.ctypeslib.html> 
working on Windows.

Given a DLL `foo.dll` that exports a function `int bar(double *)`, 
calling foo.bar using the np.ctypeslib.ndpointer approach always fails 
with a ctypes.ArgumentError:

In [5]: foo = ctypes.CDLL('foo.dll')
In [6]: foo.bar.argtpes = [np.ctypeslib.ndpointer(dtype=np.float64, 
ndim=1, flags='C_CONTIGUOUS')]
In [7]: a = np.array([1, 2, 3], dtype=np.float64)
In [8]: foo.bar(a)
ArgumentError: argument 1: <type 'exceptions.TypeError'>: Don't know how 
to convert parameter 1

This works as expected:

In [9]: c_double_p = ctypes.POINTER(ctypes.c_double)
In [10]: foo.bar.argtpes = [c_double_p]
In [11]: foo.bar(a.ctypes.data_as(c_double_p))
Out[11]: 0

What am I missing? I also tried ndpointer(), several numpy and Python 
versions, real DLLs, and argument types. A self contained script is 
attached.

Thank you,

Christoph
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ndpointer_test.py
Type: text/x-python
Size: 695 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120311/f3978ba0/attachment.py>

From pav at iki.fi  Sun Mar 11 07:33:26 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 12:33:26 +0100
Subject: [Numpy-discussion] ArgumentError when using
	numpy.ctypeslib.ndpointer
In-Reply-To: <4F5C5DAF.8030105@uci.edu>
References: <4F5C5DAF.8030105@uci.edu>
Message-ID: <jji2i6$t7p$1@dough.gmane.org>

11.03.2012 09:09, Christoph Gohlke kirjoitti:
> I am unable to get the simple numpy.ctypeslib.ndpointer docstring
> example from
> <http://docs.scipy.org/doc/numpy/reference/routines.ctypeslib.html>
> working on Windows.
> 
> Given a DLL `foo.dll` that exports a function `int bar(double *)`,
> calling foo.bar using the np.ctypeslib.ndpointer approach always fails
> with a ctypes.ArgumentError:
> 
> In [5]: foo = ctypes.CDLL('foo.dll')
> In [6]: foo.bar.argtpes = [np.ctypeslib.ndpointer(dtype=np.float64,
> ndim=1, flags='C_CONTIGUOUS')]

Typo: "argtpes" -> "argtypes"

Probably not Windows-specific :)

-- 
Pauli Virtanen


From xavier.gnata at gmail.com  Sun Mar 11 10:12:22 2012
From: xavier.gnata at gmail.com (xavier.gnata at gmail.com)
Date: Sun, 11 Mar 2012 15:12:22 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
Message-ID: <4F5CB2C6.9080809@gmail.com>

It says : " Numba is a NumPy aware optimizing compiler for Python. It 
uses the remarkable LLVM compiler infrastructure to compile Python 
byte-code to machine code especially for use in the NumPy run-time and 
SciPy modules."*
*
If this description is correct, Numba is an additional pass once the 
cpython  bytecode has be produced by cpython.
Is it correct??
Is python bytecote a good intermediate representation to perform numpy 
related optimization?

One current big issue with numpy is that C=A+B+D produces temporaries.
numexpr addresses this issue and it would be great to get the same 
result by default in numpy.
numexpr also optimizes polynomials using Horner's method. It is hard to 
do that at bytecode level, isn't it?

Unladen swallow wanted to replace the full cpython implementation by a 
jit compiler built using LLVM...but unladen swallow is dead.

Xavier


> what is the difference to http://www.python.org/dev/peps/pep-3146/ ?
>
> On Sun, Mar 11, 2012 at 4:35 PM, Travis Oliphant <teoliphant at gmail.com 
> <mailto:teoliphant at gmail.com>> wrote:
>
>     Hey all,
>
>     I gave a lightning talk this morning on numba which is the start
>     of a Python compiler to machine code through the LLVM tool-chain.
>       It is proof of concept stage only at this point (use it only if
>     you are interested in helping develop the code at this point).  
>     The only thing that works is a fast-vectorize capability on a few
>     functions (without for-loops).   But, it shows how creating
>     functions in Python that can be used by the NumPy runtime in
>     various ways.   Several NEPS that will be discussed in the coming
>     months will use this concept.
>
>     Right now there is very little design documentation, but I will be
>     adding some in the days ahead, especially if I get people who are
>     interested in collaborating on the project.   I did talk to Fijal
>     and Alex of the PyPy project at PyCon and they both graciously
>     suggested that I look at some of the PyPy code which walks the
>     byte-code and does translation to their intermediate
>     representation for inspiration.
>
>     Again, the code is not ready for use, it is only proof of concept,
>     but I would like to get feedback and help especially from people
>     who might have written compilers before.  The code lives at:
>     https://github.com/ContinuumIO/numba
>
>     -Travis
>
>
>
>
>     _______________________________________________
>     NumPy-Discussion mailing list
>     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From pav at iki.fi  Sun Mar 11 10:52:40 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 15:52:40 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <4F5CB2C6.9080809@gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<4F5CB2C6.9080809@gmail.com>
Message-ID: <jjie7p$b6n$1@dough.gmane.org>

11.03.2012 15:12, xavier.gnata at gmail.com kirjoitti:
[clip]
> If this description is correct, Numba is an additional pass once the 
> cpython  bytecode has be produced by cpython.
> Is it correct??
> Is python bytecote a good intermediate representation to perform numpy 
> related optimization?
> 
> One current big issue with numpy is that C=A+B+D produces temporaries.
> numexpr addresses this issue and it would be great to get the same 
> result by default in numpy.
> numexpr also optimizes polynomials using Horner's method. It is hard to 
> do that at bytecode level, isn't it?

My impression is that dealing with Python's bytecode is not necessarily
significantly harder than dealing with the AST.

Your example reads

  1           0 LOAD_NAME                0 (A)
              3 LOAD_NAME                1 (B)
              6 BINARY_ADD
              7 LOAD_NAME                2 (D)
             10 BINARY_ADD
             11 STORE_NAME               3 (C)

For instance, interpreting the bytecode (e.g. loop body) once with dummy
objects lets you know what the final compound expression is.

> Unladen swallow wanted to replace the full cpython implementation by a 
> jit compiler built using LLVM... but unladen swallow is dead.

To get speed gains, you need to optimize not only the bytecode
interpreter side, but also the object space --- Python classes, strings
and all that. Keeping in mind Python's dynamism, there are potential
side effects everywhere. I guess this is what sunk the swallow.

Just speeding up effectively statically typed code dealing with arrays
and basic types, on the other hand, sounds much easier.

The PyPy guys have a much more ambitious approach, and are getting nice
results. Also with arrays --- as I understand, the fact that they want
to be able to do this sort of optimization is the main reason why they
want to reimplement the core parts of Numpy in RPython.

The second issue is that unfortunately their emulation of CPython's
C-API is at the moment seems to have quite large overheads. Porting
Numpy on top of that is possible --- I managed to get basic things
(apart from string/unicode arrays) to work, but things took very large
speed hits (of the order ~ 100x for things like `arange(10000).sum()`).
This pushes the speed advantage of Numpy to a bit too large array sizes.
The reason is probably that Numpy uses PyObjects internally heavily,
which accumulates the cost of passing objects through the emulation layer.

-- 
Pauli Virtanen


From ronan.lamy at gmail.com  Sun Mar 11 11:46:39 2012
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Sun, 11 Mar 2012 15:46:39 +0000
Subject: [Numpy-discussion] about sympy
In-Reply-To: <CAE8bXE=kXk21jbLQ-NfTHGzfNNw0a1uU7H6oPMzz48n3L_CbCw@mail.gmail.com>
References: <4F5C4EE4.9010500@gmail.com>
	<CAE8bXE=kXk21jbLQ-NfTHGzfNNw0a1uU7H6oPMzz48n3L_CbCw@mail.gmail.com>
Message-ID: <1331480799.8452.92.camel@ronan-desktop>

Le dimanche 11 mars 2012 ? 01:31 -0600, Val Kalatsky a ?crit :
> Can you?
> The question should be: Why sympy does not have Fresnel integrals?
> 
Well, the development version has hypergeometric functions, which is
enough to integrate sin(x**2):

>>> from sympy import *
>>> x = Symbol('x')
>>> integrate(sin(x**2))
x**3*gamma(3/4)*hyper((3/4,), (3/2, 7/4), -x**4/4)/(4*gamma(7/4))
>>> integrate(sin(x**2), (x, 0, oo))
sqrt(2)*sqrt(pi)/4

It can also create a numerical function for the result, but it uses
mpmath, so it won't play well with numpy:

>>> f = lambdify(x, integrate(sin(x**2)))
>>> f(4567.)
mpf('0.62676518399179759')
>>> f(np.linspace(-5., 5., 10))
Traceback (most recent call last):
  File "<ipython-input-35-97e0996f77c3>", line 1, in <module>
    f(np.linspace(-5., 5., 10))
  File "<string>", line 1, in <lambda>
  File "/home/ronan/dev/sympy/sympy/mpmath/functions/hypergeometric.py",
line 199, in hyper
    z = ctx.convert(z)
  File "/home/ronan/dev/sympy/sympy/mpmath/ctx_mp_python.py", line 662,
in convert
    return ctx._convert_fallback(x, strings)
  File "/home/ronan/dev/sympy/sympy/mpmath/ctx_mp.py", line 556, in
_convert_fallback
    raise TypeError("cannot create mpf from " + repr(x))
TypeError: cannot create mpf from array([ -1.56250000e+02,  -5.71797363e
+01,  -1.48843545e+01,
        -1.92901235e+00,  -2.38149672e-02,  -2.38149672e-02,
        -1.92901235e+00,  -1.48843545e+01,  -5.71797363e+01,
        -1.56250000e+02])


> On Sun, Mar 11, 2012 at 1:06 AM, aa <telukpalu at gmail.com> wrote:
>         why sympy cannot integrate sin(x**2)??
>         _______________________________________________
>         NumPy-Discussion mailing list
>         NumPy-Discussion at scipy.org
>         http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From xavier.gnata at gmail.com  Sun Mar 11 12:22:17 2012
From: xavier.gnata at gmail.com (xavier.gnata at gmail.com)
Date: Sun, 11 Mar 2012 17:22:17 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <jjie7p$b6n$1@dough.gmane.org>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<4F5CB2C6.9080809@gmail.com> <jjie7p$b6n$1@dough.gmane.org>
Message-ID: <4F5CD139.7080004@gmail.com>

On 03/11/2012 03:52 PM, Pauli Virtanen wrote:
> 11.03.2012 15:12, xavier.gnata at gmail.com kirjoitti:
> [clip]
>> If this description is correct, Numba is an additional pass once the
>> cpython  bytecode has be produced by cpython.
>> Is it correct??
>> Is python bytecote a good intermediate representation to perform numpy
>> related optimization?
>>
>> One current big issue with numpy is that C=A+B+D produces temporaries.
>> numexpr addresses this issue and it would be great to get the same
>> result by default in numpy.
>> numexpr also optimizes polynomials using Horner's method. It is hard to
>> do that at bytecode level, isn't it?
> My impression is that dealing with Python's bytecode is not necessarily
> significantly harder than dealing with the AST.
>
> Your example reads
>
>    1           0 LOAD_NAME                0 (A)
>                3 LOAD_NAME                1 (B)
>                6 BINARY_ADD
>                7 LOAD_NAME                2 (D)
>               10 BINARY_ADD
>               11 STORE_NAME               3 (C)
>
> For instance, interpreting the bytecode (e.g. loop body) once with dummy
> objects lets you know what the final compound expression is.

ok.
>
>> Unladen swallow wanted to replace the full cpython implementation by a
>> jit compiler built using LLVM... but unladen swallow is dead.
> To get speed gains, you need to optimize not only the bytecode
> interpreter side, but also the object space --- Python classes, strings
> and all that. Keeping in mind Python's dynamism, there are potential
> side effects everywhere. I guess this is what sunk the swallow.
Correct : impact of dynamism underestimated and unexpected bugs in LLVM 
--> dead swallow.

> Just speeding up effectively statically typed code dealing with arrays
> and basic types, on the other hand, sounds much easier.
it is :)

> The PyPy guys have a much more ambitious approach, and are getting nice
> results. Also with arrays --- as I understand, the fact that they want
> to be able to do this sort of optimization is the main reason why they
> want to reimplement the core parts of Numpy in RPython.
It sonds ok for numpy but it is no sense because numpy is only the 
beginning of the story.
Numpy is not *that* useful without scipy, matplotlib and so on.

> The second issue is that unfortunately their emulation of CPython's
> C-API is at the moment seems to have quite large overheads. Porting
> Numpy on top of that is possible --- I managed to get basic things
> (apart from string/unicode arrays) to work, but things took very large
> speed hits (of the order ~ 100x for things like `arange(10000).sum()`).
> This pushes the speed advantage of Numpy to a bit too large array sizes.
> The reason is probably that Numpy uses PyObjects internally heavily,
> which accumulates the cost of passing objects through the emulation layer.
Yes. We have two types of users :
1) Guys using python without numpy. These guys want the python core 
language to go faster (with the full complexity of a dynamic language).
2) science users. These guys are fine with python speed because their 
codes spend 99% the time in 
numpy/scipy/any_other_science_module_using_c-api. Who cares if it takes 
1.1s (in python) instead of 1.0 (in C) to read the data input file if it 
takes hours (or even minutes) to process them (using the C-api and doing 
the basics with temporaries and not optimized loops)?


From cgohlke at uci.edu  Sun Mar 11 13:44:03 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Sun, 11 Mar 2012 10:44:03 -0700
Subject: [Numpy-discussion] ArgumentError when using
	numpy.ctypeslib.ndpointer
In-Reply-To: <jji2i6$t7p$1@dough.gmane.org>
References: <4F5C5DAF.8030105@uci.edu> <jji2i6$t7p$1@dough.gmane.org>
Message-ID: <4F5CE463.2060604@uci.edu>


On 3/11/2012 4:33 AM, Pauli Virtanen wrote:
> 11.03.2012 09:09, Christoph Gohlke kirjoitti:
>> I am unable to get the simple numpy.ctypeslib.ndpointer docstring
>> example from
>> <http://docs.scipy.org/doc/numpy/reference/routines.ctypeslib.html>
>> working on Windows.
>>
>> Given a DLL `foo.dll` that exports a function `int bar(double *)`,
>> calling foo.bar using the np.ctypeslib.ndpointer approach always fails
>> with a ctypes.ArgumentError:
>>
>> In [5]: foo = ctypes.CDLL('foo.dll')
>> In [6]: foo.bar.argtpes = [np.ctypeslib.ndpointer(dtype=np.float64,
>> ndim=1, flags='C_CONTIGUOUS')]
>
> Typo: "argtpes" ->  "argtypes"
>
> Probably not Windows-specific :)
>

Thank you! Funny enough, I had this typo all over in a wrapper for a 
real DLL, but all the unit tests (about a hundred) passed...

Christoph


From sturla at molden.no  Sun Mar 11 13:46:28 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 11 Mar 2012 18:46:28 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <jjie7p$b6n$1@dough.gmane.org>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<4F5CB2C6.9080809@gmail.com> <jjie7p$b6n$1@dough.gmane.org>
Message-ID: <A8B0540B-400C-456F-B7EB-6209411FD1D0@molden.no>


Den 11. mars 2012 kl. 15:52 skrev Pauli Virtanen <pav at iki.fi>:

> To get speed gains, you need to optimize not only the bytecode
> interpreter side, but also the object space --- Python classes, strings
> and all that. Keeping in mind Python's dynamism, there are potential
> side effects everywhere. I guess this is what sunk the swallow.
> 
> Just speeding up effectively statically typed code dealing with arrays
> and basic types, on the other hand, sounds much easier.

Yes, it would be like psyco, except it has LLVM as compiler and knows about NumPy.

Sturla


From silideba at gmail.com  Sun Mar 11 14:50:38 2012
From: silideba at gmail.com (silideba at gmail.com)
Date: Mon, 12 Mar 2012 00:20:38 +0530 (IST)
Subject: [Numpy-discussion] Planning to Study Abroad?
Message-ID: <16246792.3454.1331491874227.JavaMail.javamailuser@localhost>

An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/00dccd5f/attachment.html>

From cgohlke at uci.edu  Sun Mar 11 14:56:59 2012
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Sun, 11 Mar 2012 11:56:59 -0700
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <CAOxZyWNsY69G6UzbwUsb0vKgyUnTTTMQv8LwmrGG-kPNMwAK2w@mail.gmail.com>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
	<4F5A6DE5.6000602@uci.edu>
	<CAOxZyWNsY69G6UzbwUsb0vKgyUnTTTMQv8LwmrGG-kPNMwAK2w@mail.gmail.com>
Message-ID: <4F5CF57B.9070407@uci.edu>


On 3/10/2012 9:31 PM, Sameer Grover wrote:
> On 10 March 2012 02:23, Christoph Gohlke<cgohlke at uci.edu>  wrote:
>>
>>
>> On 3/9/2012 11:50 AM, Sameer Grover wrote:
>>>>>> import gtk
>>>>>> import foo # where foo is any f2py-wrapped program
>>>
>>> Subsequently, on exiting python interpreter, the interpreter crashes
>>> with this error message - "This application has requested the Runtime
>>> to terminate it in an unusual way. Please contact the application's
>>> support team for more information."
>>>
>>> Strangely enough, interchanging the order of the import statements,
>>> i.e. importing the f2py wrapped program before gtk works fine.
>>> Furthermore, each module works fine individually.
>>>
>>> This is a windows-only problem. I'm using Windows 7, Python 2.7,
>>> latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
>>> (mentioned on the pygtk download page).
>>>
>>> This happens even for very simpl fortran programs such as this one -
>>> subroutine hello ()
>>>       write(*,*)'Hello from Fortran90!!!'
>>> end subroutine hello
>>>
>>> I don't know whether the problem is with f2py or with gtk or with
>>> python but maybe somebody can shed some light on this.
>>>
>>> Regards,
>>> Sameer Grover
>>
>> The error can be due to memory corruption. It works for me with
>> msvc9/ifort builds of the pygtk and f2py extensions.
>>
>> Which DLLs does foo.pyd depend on (use DependencyWalker)?
>>
>> Christoph
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> The .pyd file generated by f2py depends on the following
>
> KERNEL32.DLL
> MSVCRT.DLL
> LIBGCC_S_DW2-1.DLL
> LIBGFORTRAN-3.DLL
> PYTHON27.DLL
>
> It is possible that this is a mingw32 problem. I haven't been able to
> try using ifort/msvc.(mscv is somehow not getting installed on my
> system, but that's a separate issue).
>
> Sameer


Can you try linking to the msvc9 runtime DLL? E.g.

f2py.py -c --fcompiler=gnu95 --compiler=mingw32 -lmsvcr90 -m foo foo.f

Christoph


From travis at continuum.io  Sun Mar 11 18:11:31 2012
From: travis at continuum.io (Travis Oliphant)
Date: Sun, 11 Mar 2012 17:11:31 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
Message-ID: <26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>


On Mar 11, 2012, at 1:59 AM, Mic wrote:

> what is the difference to http://www.python.org/dev/peps/pep-3146/ ?
> 

To fully expound the difference would take a lot of discussion.   But, summarizing: 

	* Numba is not nearly as ambitious as US (Numba will be fine with some user-directed information and with a subset of the language that gets compiled)
	* Numba will focus on compilation rather than JITing --- in other words it won't be trying to detect hot-spots and compile segments (actually LLVM is not a good candidate for that sort of thing as it is not a very fast or memory-efficient compiler for a lot of reasons,  I believe this is why PyPy does not use it anymore, for example). 
	* Numba will be much closer to Cython in spirit than Unladen Swallow (or PyPy) --- people who just use Cython for a loop or two will be able to use Numba instead
	* Numba will borrow the idea of call-sites from IronPython wherein a function is replaced by a generic function that dispatches based on input types to either cached compiled code for the types specified or the generic function.
	* Numba will be mainly about trying to leverage the code-generation of LLVM which multiple hardware manufacturers are using (Intel's OpenCL support, Nvidia's PTX backend, Apple's CLang, etc.) for NumPy arrays


-Travis


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120311/b6aae55a/attachment.html>

From sturla at molden.no  Sun Mar 11 19:01:26 2012
From: sturla at molden.no (Sturla Molden)
Date: Mon, 12 Mar 2012 00:01:26 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <jjie7p$b6n$1@dough.gmane.org>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<4F5CB2C6.9080809@gmail.com> <jjie7p$b6n$1@dough.gmane.org>
Message-ID: <4F5D2EC6.7000505@molden.no>

Den 11.03.2012 15:52, skrev Pauli Virtanen:
> To get speed gains, you need to optimize not only the bytecode 
> interpreter side, but also the object space --- Python classes, 
> strings and all that. Keeping in mind Python's dynamism, there are 
> potential side effects everywhere. I guess this is what sunk the swallow. 

I'm not sure what scared off or killed the bird. Maybe they just 
approached it from the wrong side.

Psyco has proven that some "algorithmic" Python code can be accelerated 
by an order of magnitude or two. But Psyco did not use an optimizing 
compiler like LLVM, it just generated unoptimized x86 binary code from 
Python.

Also, a JIT for a dynamic language is possible. One example is the 
Strongtalk JIT compiler for Smalltalk. It was purchased by Sun and 
renamed "Java VM Hotspot", because Sun could not produce a JIT for its 
static language Java. The static nature of Java does not have anything 
to do with the performance of its JIT compiler.

Maybe there are differences between Python and Smalltalk that makes the 
latter more easy to JIT compile? Or perhaps there are differences 
between Hotspot/Strongtalk and LLVM that makes the latter less efficient 
for dynamic languages? I don't know. But making a performant JIT for all 
of Python is obviously not easy.

A third example of a fast JIT for a dynamic language is LuaJIT. It can 
often make "interpreted" and duck-typed Lua run faster than statically 
compiled C. Like psyco, LuaJIT just focuses on making algorithmic code 
with a few elementary types run fast.

Another approach to speeding up dynamic languages is optional static 
typing and static compilation. Cython, Bigloo (et al.), and CMUCL/SBCL 
are exellent examples of that. Maybe one could use type annotations like 
Cython in pure Python mode? Python 3 has even got type annotations in 
the syntax.

I think (but I am not 100% sure) that the main problem with JIT'ing 
Python is its dynamic attributes. So maybe some magic with __slots__ or 
__metaclass__ could be used to turn that dynamicity off? So a JIT like 
Numba would only work with builtin types (int, float, list, dict, set, 
tuple), NumPy arrays, and some less-dynamic classes. It would not speed 
up all of Python, but a sufficient subset to make scientists happy.

Sturla


From wesmckinn at gmail.com  Sun Mar 11 19:03:17 2012
From: wesmckinn at gmail.com (Wes McKinney)
Date: Sun, 11 Mar 2012 19:03:17 -0400
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAKcu9cdxLeE8O5dRjKJ02ryhO_w6WdACLDptyCCj3v_ZSnuwCQ@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAKcu9cdxLeE8O5dRjKJ02ryhO_w6WdACLDptyCCj3v_ZSnuwCQ@mail.gmail.com>
Message-ID: <CAJPUwMCpnYyOe7rXLO5Ef4MM_CZyY3BOd=2SqEX0S-Z8TNn1og@mail.gmail.com>

On Fri, Mar 9, 2012 at 5:48 PM, David Gowers (kampu) <00ai99 at gmail.com> wrote:
> Hi,
>
> On Sat, Mar 10, 2012 at 3:25 AM, Bryan Van de Ven <bryanv at continuum.io> wrote:
>> Hi all,
>>
>> I have started working on a NEP for adding an enumerated type to NumPy.
>> It is on my GitHub:
>>
>> ? ? https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>>
>> It is still very rough, and incomplete in places. But I would like to
>> get feedback sooner rather than later in order to refine it. In
>> particular there are a few questions inline in the document that I would
>> like input on. Any comments, suggestions, questions, concerns, etc. are
>> very welcome.
>
> "t = np.dtype('enum', map=(n,v))"
>
> ^ Is this supposed to be indicating 'this is an enum with values
> ranging between n and v'? It could be a bit more clear.
>
> Is it possible to partially define an enum? That is, give the maximum
> and minimum values, and only some of the enumeration value:name
> mappings?
> For example, an enum where 0 means 'n/a', +n means 'Type A Object
> #(n-1)' and -n means 'Type B Object #(abs(n) - 1)'. I just want to map
> the non-scalar values, while having a way to avoid treating valid
> scalar values (eg +64) as out-of-range.
> Example of what I mean:
>
> "t = np.dtype('enum[N_A:0]', range = (-127, 127))"
> (defined values being printed as a string, undefined being printed as a number.)
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

I'll have to think about this (a little brain dump here). I have many
use cases in pandas where this would be useful which are basically
direct translations of R's factor data type. Note that R always
coerces the levels (the unique values) AFAICT to string type. However,
mapping back to a well-dtyped array is important, too. So the
temptation might be to do something like this:

ndarray: dtype storage type (uint32 or something)
mapping : khash with type PyObject* -> uint32

Now, one problem with this is that you want the mapping + dtype to be
invertible (otherwise you're left doing some type inference). The way
that I implement the mapping is to restrict the labeling to be from 0
to N - 1 which makes things easier. If we decide that having an
explicit value mapping

The nice thing about this is that the same set of core algorithms can
be used to fix numpy.unique. For example you would like to be able to
do:

enum_arr = np.enum(arr)

(this seems like a reasonable API to me) and that is a direct
equivalent of R's factor function. You need to be able to pass an
explicit ordering when calling the enum/factor function. If not
specified, you should have an option to either sort or not-- for
example suppose you convert an array of 1 million integers to enum but
you don't particularly care about the uniques (which could be very
large, up to the size of the array) being ordered (no need to pay N
log N for large N).

One nice thing about khash is that it can be serialized fairly easily.

Have you looked much at how I use enum-like ideas in pandas? It would
be great if I could offload some of this data algorithmic work to
NumPy.

We will want the enum data type to integrate with text file readers--
if you "factorize as you go" you can drastically reduce the memory
usage of a structured array (or pandas DataFrame) columns with
long-ish strings and relatively few unique values.

- Wes


From sturla at molden.no  Sun Mar 11 19:21:27 2012
From: sturla at molden.no (Sturla Molden)
Date: Mon, 12 Mar 2012 00:21:27 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>
Message-ID: <4F5D3377.7030207@molden.no>

Den 11.03.2012 23:11, skrev Travis Oliphant:
>
> * Numba will be much closer to Cython in spirit than Unladen Swallow 
> (or PyPy) --- people who just use Cython for a loop or two will be 
> able to use Numba instead

This is perhaps the most important issue for scientific and algorithmic 
codes.

Not having to resort to Cython, C, C++ or Fortran to get decent 
performance for algorithmic code would be great.

It could also put Python/Numba high up on the Debian shootout ;-)

Or one could benchmark a pure Python version of timsort against the 
standard library.

Sturla
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/34b24042/attachment.html>

From 00ai99 at gmail.com  Sun Mar 11 19:50:12 2012
From: 00ai99 at gmail.com (David Gowers (kampu))
Date: Mon, 12 Mar 2012 10:20:12 +1030
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAJPUwMCpnYyOe7rXLO5Ef4MM_CZyY3BOd=2SqEX0S-Z8TNn1og@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAKcu9cdxLeE8O5dRjKJ02ryhO_w6WdACLDptyCCj3v_ZSnuwCQ@mail.gmail.com>
	<CAJPUwMCpnYyOe7rXLO5Ef4MM_CZyY3BOd=2SqEX0S-Z8TNn1og@mail.gmail.com>
Message-ID: <CAKcu9cfsA1YQGcXghWK=+khvLqQsO05yhZSnbYZkgKOCKS4vEw@mail.gmail.com>

Hi Wes,

On Mon, Mar 12, 2012 at 9:33 AM, Wes McKinney <wesmckinn at gmail.com> wrote:
>
> Now, one problem with this is that you want the mapping + dtype to be
> invertible (otherwise you're left doing some type inference). The way
> that I implement the mapping is to restrict the labeling to be from 0
> to N - 1 which makes things easier.

> If we decide that having an explicit value mapping
(...?)

You might want to finish whatever thought that was :)


From gias98 at gmail.com  Mon Mar 12 00:15:04 2012
From: gias98 at gmail.com (Gias)
Date: Mon, 12 Mar 2012 04:15:04 +0000 (UTC)
Subject: [Numpy-discussion] nltk dispersion plot problem
Message-ID: <loom.20120312T051324-676@post.gmane.org>

I am using Ubuntu 11.04 (natty) in my laptop and Python 2.7. I installed nltk 
(2.09b), numpy (1.5.1), and matplotlib(1.1.0). The installation is global and I 
am not using virtualenv.When I try (text4.dispersion_plot(["citizens",
"democracy", "freedom", "duties", 
"America"])) in terminal (gnome
terminal 2.32.1), the plot 
is not showing up. There is no error message, just a second or two interval 
before the last (>>>) shows up.


From wardefar at iro.umontreal.ca  Mon Mar 12 03:42:13 2012
From: wardefar at iro.umontreal.ca (David Warde-Farley)
Date: Mon, 12 Mar 2012 03:42:13 -0400
Subject: [Numpy-discussion] nltk dispersion plot problem
In-Reply-To: <loom.20120312T051324-676@post.gmane.org>
References: <loom.20120312T051324-676@post.gmane.org>
Message-ID: <20120312074212.GA30612@ravage>

On Mon, Mar 12, 2012 at 04:15:04AM +0000, Gias wrote:
> I am using Ubuntu 11.04 (natty) in my laptop and Python 2.7. I installed nltk 
> (2.09b), numpy (1.5.1), and matplotlib(1.1.0). The installation is global and I 
> am not using virtualenv.When I try (text4.dispersion_plot(["citizens",
> "democracy", "freedom", "duties", 
> "America"])) in terminal (gnome
> terminal 2.32.1), the plot 
> is not showing up. There is no error message, just a second or two interval 
> before the last (>>>) shows up.

Of those three packages, I'd say that the least likely to be implicated is
NumPy, making this one probably the list where you'll get the least help.

Since it's a plotting problem I would try the matplotlib-users mailing list,
and include the source of dispersion_plot, or a link to it in the Google
Code code browser for the nltk project, e.g.

http://code.google.com/p/nltk/source/browse/trunk/nltk/nltk/draw/dispersion.py

David


From mail.till at gmx.de  Mon Mar 12 12:57:33 2012
From: mail.till at gmx.de (Till Stensitzki)
Date: Mon, 12 Mar 2012 16:57:33 +0000 (UTC)
Subject: [Numpy-discussion]
	=?utf-8?q?Looking_for_people_interested_in_hel?=
	=?utf-8?q?ping=09with_Python_compiler_to_LLVM?=
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>
Message-ID: <loom.20120312T175647-108@post.gmane.org>

Doesent Theano does the same, only via GCC compilation?


From shish at keba.be  Mon Mar 12 13:29:10 2012
From: shish at keba.be (Olivier Delalleau)
Date: Mon, 12 Mar 2012 13:29:10 -0400
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <loom.20120312T175647-108@post.gmane.org>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>
	<26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>
	<loom.20120312T175647-108@post.gmane.org>
Message-ID: <CAFXk4bppyDqfqYqWZS42JYSYxEAo3NeQ+tqpi6MFg2p4FkuR8A@mail.gmail.com>

One major difference is that Theano doesn't attempt to parse existing
Python (byte)code: you need to explicitly code with the Theano syntax
(which tries to be close to Numpy, but can end up looking quite different,
especially if you want to control the program flow with loops and ifs for
instance).

A potentially interesting avenue would be to parse Python (byte)code to
generate a Theano graph. It'd be nice if numba could output some
intermediate information that would represent the computational graph being
compiled, so that Theano could re-use it directly :) (probably much easier
said than done though)

-=- Olivier

Le 12 mars 2012 12:57, Till Stensitzki <mail.till at gmx.de> a ?crit :

> Doesent Theano does the same, only via GCC compilation?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/9b66019d/attachment.html>

From ndbecker2 at gmail.com  Mon Mar 12 15:04:41 2012
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 12 Mar 2012 15:04:41 -0400
Subject: [Numpy-discussion] unique along axis?
Message-ID: <jjlhc9$rek$1@dough.gmane.org>

I see unique does not take an axis arg.

Suggested way to apply unique to each column of a 2d array?


From pierre.haessig at crans.org  Mon Mar 12 16:25:04 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Mon, 12 Mar 2012 21:25:04 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <4F5D3377.7030207@molden.no>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>	<CAOP6n=h_R9hEg3GLFbrz0P=7bd9GDz5zfhVDQOeByGUfw=tL3g@mail.gmail.com>	<26F46EC1-9118-428E-BF46-30F407722F2F@continuum.io>
	<4F5D3377.7030207@molden.no>
Message-ID: <4F5E5BA0.3080502@crans.org>

Hi,
Le 12/03/2012 00:21, Sturla Molden a ?crit :
>
> It could also put Python/Numba high up on the Debian shootout ;-)
Can you tell a bit more about it ? (I just didn't understand the whole
sentence in fact ;-) )

Thanks !
-- 
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/d128d1d2/attachment.sig>

From apratap at lbl.gov  Mon Mar 12 18:04:40 2012
From: apratap at lbl.gov (Abhishek Pratap)
Date: Mon, 12 Mar 2012 15:04:40 -0700
Subject: [Numpy-discussion] numpy videos
Message-ID: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>

Hey Guys

Few days with folks at my first pycon has made me wonder how much of
cool things I was missing ..

I am looking to do some quick catch up on numpy and wondering if there
are any set of videos that I can refer to. I learn quicker seeing
videos  and would appreciate if you guys can point me to anything
available it will be of great help.

Thanks!
-Abhi


From jsseabold at gmail.com  Mon Mar 12 18:08:37 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 12 Mar 2012 18:08:37 -0400
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
Message-ID: <CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>

On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap <apratap at lbl.gov> wrote:
>
> Hey Guys
>
> Few days with folks at my first pycon has made me wonder how much of
> cool things I was missing ..
>
> I am looking to do some quick catch up on numpy and wondering if there
> are any set of videos that I can refer to. I learn quicker seeing
> videos ?and would appreciate if you guys can point me to anything
> available it will be of great help.
>

You'll find a lot of videos here. The tutorials in particular may
interest you from past conferences.

http://conference.scipy.org/index.html

Oddly though it doesn't look like there's a straight link to the 2011
conference there.

http://conference.scipy.org/scipy2011/

Skipper


From hugadams at gwmail.gwu.edu  Mon Mar 12 18:18:25 2012
From: hugadams at gwmail.gwu.edu (Adam Hughes)
Date: Mon, 12 Mar 2012 18:18:25 -0400
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
Message-ID: <CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>

Abhi,

One thing I would suggest is to tackle numpy with a particular focus.  Once
you've gotten the basics down through tutorials and videos, do you have a
research project in mind to use with numpy?

On Mon, Mar 12, 2012 at 6:08 PM, Skipper Seabold <jsseabold at gmail.com>wrote:

> On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap <apratap at lbl.gov> wrote:
> >
> > Hey Guys
> >
> > Few days with folks at my first pycon has made me wonder how much of
> > cool things I was missing ..
> >
> > I am looking to do some quick catch up on numpy and wondering if there
> > are any set of videos that I can refer to. I learn quicker seeing
> > videos  and would appreciate if you guys can point me to anything
> > available it will be of great help.
> >
>
> You'll find a lot of videos here. The tutorials in particular may
> interest you from past conferences.
>
> http://conference.scipy.org/index.html
>
> Oddly though it doesn't look like there's a straight link to the 2011
> conference there.
>
> http://conference.scipy.org/scipy2011/
>
> Skipper
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/88f722a5/attachment.html>

From apratap at lbl.gov  Mon Mar 12 18:23:05 2012
From: apratap at lbl.gov (Abhishek Pratap)
Date: Mon, 12 Mar 2012 15:23:05 -0700
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
Message-ID: <CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>

Super awesome. I love how the python community in general keeps the
recordings available for free.

@Adam : I do have some problems that I can hit numpy with, mainly
bigData based. So in summary I have millions/billions of rows of
biological data on which I want to run some computation but at the
same time have a capability to do quick lookup. I am not sure if numpy
will be applicable for quick lookups  by a string based key right ??

-Abhi

On Mon, Mar 12, 2012 at 3:18 PM, Adam Hughes <hugadams at gwmail.gwu.edu> wrote:
> Abhi,
>
> One thing I would suggest is to tackle numpy with a particular focus.? Once
> you've gotten the basics down through tutorials and videos, do you have a
> research project in mind to use with numpy?
>
>
> On Mon, Mar 12, 2012 at 6:08 PM, Skipper Seabold <jsseabold at gmail.com>
> wrote:
>>
>> On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap <apratap at lbl.gov> wrote:
>> >
>> > Hey Guys
>> >
>> > Few days with folks at my first pycon has made me wonder how much of
>> > cool things I was missing ..
>> >
>> > I am looking to do some quick catch up on numpy and wondering if there
>> > are any set of videos that I can refer to. I learn quicker seeing
>> > videos ?and would appreciate if you guys can point me to anything
>> > available it will be of great help.
>> >
>>
>> You'll find a lot of videos here. The tutorials in particular may
>> interest you from past conferences.
>>
>> http://conference.scipy.org/index.html
>>
>> Oddly though it doesn't look like there's a straight link to the 2011
>> conference there.
>>
>> http://conference.scipy.org/scipy2011/
>>
>> Skipper
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From hugadams at gwmail.gwu.edu  Mon Mar 12 18:42:14 2012
From: hugadams at gwmail.gwu.edu (Adam Hughes)
Date: Mon, 12 Mar 2012 18:42:14 -0400
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
	<CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
Message-ID: <CAGOQHEODDaLPHdP72ucs2=X8-ikb8SrcDV475fT80ygWse-DTw@mail.gmail.com>

This is a probably an area that is quite common, so I'd be interested to
hear some other chime in.  I refer to the lookup and storage in numpy data.

Your implementation will of course be unique, but there are several avenues
that you can consider.  Here is how I handle a similar problem.

Imagine I have data, probably similar to yours, where there is qualitative
data (maybe biological or experimental parameters and other things), as
well as numerical data.  I would define a dictionary object that stores
both of these to a unique key.  In my work, I use the original file that
all the information was taken from as my key.  So for example:

dict{ key: (file_info), (data_array, dtype='float')}

The value of the item in the dictionary is split so that the information
and the actually data arrays are kept separate.  Notice my use of
dtype...it is also possible to build your own numpy data type that gives
you a bit more flexibility for storing your data.  This is very useful if
your data is not all that standardized, or if you want to quickly look up
data by reference.  For example, if you have a column in your file called
"counts" and you want later to access this, having a custom datatype will
let you do this with ease.  Anyway, you can read into that later.

This storage type is also highly useful if you need to make new data
structures later.  For example, if you want to plot all of your data in a
multiplot, you can design a method to take this object and return the
formatted multi-array data, as well as any axis arrays that can be
extracted from this data.  Generally, if you can this object built, than
any other representation of the data that you need can be taken from this.
This approach is useful to me, but may not be ideal if your dataset is so
large that you cannot afford to have several data structures that are
holding it simultanesouly in your code.

On Mon, Mar 12, 2012 at 6:23 PM, Abhishek Pratap <apratap at lbl.gov> wrote:

> Super awesome. I love how the python community in general keeps the
> recordings available for free.
>
> @Adam : I do have some problems that I can hit numpy with, mainly
> bigData based. So in summary I have millions/billions of rows of
> biological data on which I want to run some computation but at the
> same time have a capability to do quick lookup. I am not sure if numpy
> will be applicable for quick lookups  by a string based key right ??
>
> -Abhi
>
> On Mon, Mar 12, 2012 at 3:18 PM, Adam Hughes <hugadams at gwmail.gwu.edu>
> wrote:
> > Abhi,
> >
> > One thing I would suggest is to tackle numpy with a particular focus.
> Once
> > you've gotten the basics down through tutorials and videos, do you have a
> > research project in mind to use with numpy?
> >
> >
> > On Mon, Mar 12, 2012 at 6:08 PM, Skipper Seabold <jsseabold at gmail.com>
> > wrote:
> >>
> >> On Mon, Mar 12, 2012 at 6:04 PM, Abhishek Pratap <apratap at lbl.gov>
> wrote:
> >> >
> >> > Hey Guys
> >> >
> >> > Few days with folks at my first pycon has made me wonder how much of
> >> > cool things I was missing ..
> >> >
> >> > I am looking to do some quick catch up on numpy and wondering if there
> >> > are any set of videos that I can refer to. I learn quicker seeing
> >> > videos  and would appreciate if you guys can point me to anything
> >> > available it will be of great help.
> >> >
> >>
> >> You'll find a lot of videos here. The tutorials in particular may
> >> interest you from past conferences.
> >>
> >> http://conference.scipy.org/index.html
> >>
> >> Oddly though it doesn't look like there's a straight link to the 2011
> >> conference there.
> >>
> >> http://conference.scipy.org/scipy2011/
> >>
> >> Skipper
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120312/1a030182/attachment.html>

From francesc at continuum.io  Tue Mar 13 01:07:37 2012
From: francesc at continuum.io (Francesc Alted)
Date: Tue, 13 Mar 2012 00:07:37 -0500
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
	<CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
Message-ID: <5FF4D8FE-F325-4991-AF84-61D578E7BBF4@continuum.io>

On Mar 12, 2012, at 5:23 PM, Abhishek Pratap wrote:

> Super awesome. I love how the python community in general keeps the
> recordings available for free.
> 
> @Adam : I do have some problems that I can hit numpy with, mainly
> bigData based. So in summary I have millions/billions of rows of
> biological data on which I want to run some computation but at the
> same time have a capability to do quick lookup. I am not sure if numpy
> will be applicable for quick lookups  by a string based key right ??

PyTables does precisely that.  Allows to do out-of-core operations with large arrays, store tables with an unlimited number of rows on-disk and, by using its integrated indexing engine (OPSI), you can perform quick lookups based on strings (or whatever other type).  Look into these examples:

http://www.pytables.org/moin/HowToUse#Selectingvalues

HTH,

-- Francesc Alted


From d.s.seljebotn at astro.uio.no  Tue Mar 13 01:58:49 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Mon, 12 Mar 2012 22:58:49 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
Message-ID: <4F5EE219.8010108@astro.uio.no>

On 03/10/2012 10:35 PM, Travis Oliphant wrote:
> Hey all,
>
> I gave a lightning talk this morning on numba which is the start of a
> Python compiler to machine code through the LLVM tool-chain. It is proof
> of concept stage only at this point (use it only if you are interested
> in helping develop the code at this point). The only thing that works is
> a fast-vectorize capability on a few functions (without for-loops). But,
> it shows how creating functions in Python that can be used by the NumPy
> runtime in various ways. Several NEPS that will be discussed in the
> coming months will use this concept.
>
> Right now there is very little design documentation, but I will be
> adding some in the days ahead, especially if I get people who are
> interested in collaborating on the project. I did talk to Fijal and Alex
> of the PyPy project at PyCon and they both graciously suggested that I
> look at some of the PyPy code which walks the byte-code and does
> translation to their intermediate representation for inspiration.
>
> Again, the code is not ready for use, it is only proof of concept, but I
> would like to get feedback and help especially from people who might
> have written compilers before. The code lives at:
> https://github.com/ContinuumIO/numba

Hi Travis,

me and Mark F. has been talking today about whether some of numba and
Cython development could overlap -- not right away, but in the sense
that if Cython gets some features for optimization of numerical code,
then make it easy for numba to reuse that functionality.

This may be sort of off-topic re: the above-- but part of the goal of 
this post is to figure out numba's intended scope. If there isn't an 
overlap, that's good to know in itself.

Question 1: Did you look at Clyther and/or Copperhead? Though similar, 
they target GPUs...but at first glance they look as though they may be 
parsing Python bytecode to get their ASTs... (didn't check though)

Question 2: What kind of performance are you targeting -- in the short
term, and in the long term? Is competing with "Fortran-level"
performance a goal at all?

E.g., for ufunc computations with different iteration orders such
as "a + b.T" (a and b in C-order), one must do blocking to get good
performance. And when dealing with strided arrays, copying small chunks 
at the time will sometimes help performance (and sometimes not).

This is optimization strategies which (as I understand it) is quite
beyond what NumPy iterators etc. can provide. And the LLVM level could
be too low -- one has quite a lot of information when generating the
ufunc/reduction/etc. that would be thrown away when generating LLVM
code. Vectorizing compilers do their best to reconstruct this
information; I know nothing about what actually exists here for
LLVM. They are certainly a lot more complicated to implement and work
with than making use of on higher-level information available before
code generation.

The idea we've been playing with is for Cython to define a limited
subset of its syntax tree (essentially the "GIL-less" subset) seperate
from the rest of Cython, with a more well-defined API for optimization
passes etc., and targeted for a numerical optimization pipeline.

This subset would actually be pretty close to what numba needs to
compile, even if the overlap isn't perfect. So such a pipeline could
possibly be shared between Cython and numba, even if Cython would use
it at compile-time and numba at runtime, and even if the code
generation backend is different (the code generation backend is
probably not the hard part...). To be concrete, the idea is:

(Cython|numba) -> high-level numerical compiler and
loop-structure/blocking optimizer (by us on a shared parse tree
representation) -> (LLVM/C/OpenCL) -> low-level optimization (by the
respective compilers)

Some algorithms that could be shareable are iteration strategies
(already in NumPy though), blocking strategies, etc.

Even if this may be beyond numba's (and perhaps Cython's) current
ambition, it may be worth thinking about, if nothing else then just
for how Cython's code should be structured.

(Mark F., how does the above match how you feel about this?)

Dag


From travis at continuum.io  Tue Mar 13 04:19:47 2012
From: travis at continuum.io (Travis Oliphant)
Date: Tue, 13 Mar 2012 03:19:47 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <4F5EE219.8010108@astro.uio.no>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
Message-ID: <C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>


On Mar 13, 2012, at 12:58 AM, Dag Sverre Seljebotn wrote:

> On 03/10/2012 10:35 PM, Travis Oliphant wrote:
>> Hey all,
>> 
>> I gave a lightning talk this morning on numba which is the start of a
>> Python compiler to machine code through the LLVM tool-chain. It is proof
>> of concept stage only at this point (use it only if you are interested
>> in helping develop the code at this point). The only thing that works is
>> a fast-vectorize capability on a few functions (without for-loops). But,
>> it shows how creating functions in Python that can be used by the NumPy
>> runtime in various ways. Several NEPS that will be discussed in the
>> coming months will use this concept.
>> 
>> Right now there is very little design documentation, but I will be
>> adding some in the days ahead, especially if I get people who are
>> interested in collaborating on the project. I did talk to Fijal and Alex
>> of the PyPy project at PyCon and they both graciously suggested that I
>> look at some of the PyPy code which walks the byte-code and does
>> translation to their intermediate representation for inspiration.
>> 
>> Again, the code is not ready for use, it is only proof of concept, but I
>> would like to get feedback and help especially from people who might
>> have written compilers before. The code lives at:
>> https://github.com/ContinuumIO/numba
> 
> Hi Travis,
> 
> me and Mark F. has been talking today about whether some of numba and
> Cython development could overlap -- not right away, but in the sense
> that if Cython gets some features for optimization of numerical code,
> then make it easy for numba to reuse that functionality.

That would be very, very interesting.  

> 
> This may be sort of off-topic re: the above-- but part of the goal of 
> this post is to figure out numba's intended scope. If there isn't an 
> overlap, that's good to know in itself.
> 
> Question 1: Did you look at Clyther and/or Copperhead? Though similar, 
> they target GPUs...but at first glance they look as though they may be 
> parsing Python bytecode to get their ASTs... (didn't check though)

I have looked at both projects although Clyther more in depth.    Clyther is parsing bytecode to get the AST (through a sub-project by the same author called Meta:  http://srossross.github.com/Meta/html/index.html). 

> 
> Question 2: What kind of performance are you targeting -- in the short
> term, and in the long term? Is competing with "Fortran-level"
> performance a goal at all?

In the short-term, I'm targeting C-equivalent performance (like weave).   In the long-term, I'm targeting optimized high-level expressions (i.e. Fortran-level) with GPU and mulit-core. 

> 
> E.g., for ufunc computations with different iteration orders such
> as "a + b.T" (a and b in C-order), one must do blocking to get good
> performance. And when dealing with strided arrays, copying small chunks 
> at the time will sometimes help performance (and sometimes not).
> 
> This is optimization strategies which (as I understand it) is quite
> beyond what NumPy iterators etc. can provide.


> And the LLVM level could
> be too low -- one has quite a lot of information when generating the
> ufunc/reduction/etc. that would be thrown away when generating LLVM
> code.

It doesn't need to be thrown away at all.   It could be used to generate appropriate code for the arrays being used.   The long-term idea is to actually be aware of NumPy arrays and encourage expression of high-level constructs which generate optimized code using chunking, blocking, AVX instructions, multiple threads, etc.  

To do this, it may make more sense to actually emit OpenMP (unless LLVM grows standard threading intrinsics).   This is not out of the question. 

> Vectorizing compilers do their best to reconstruct this
> information; I know nothing about what actually exists here for
> LLVM. They are certainly a lot more complicated to implement and work
> with than making use of on higher-level information available before
> code generation.
> 
> The idea we've been playing with is for Cython to define a limited
> subset of its syntax tree (essentially the "GIL-less" subset) seperate
> from the rest of Cython, with a more well-defined API for optimization
> passes etc., and targeted for a numerical optimization pipeline.
> 
> This subset would actually be pretty close to what numba needs to
> compile, even if the overlap isn't perfect. So such a pipeline could
> possibly be shared between Cython and numba, even if Cython would use
> it at compile-time and numba at runtime, and even if the code
> generation backend is different (the code generation backend is
> probably not the hard part...). To be concrete, the idea is:

> 
> (Cython|numba) -> high-level numerical compiler and
> loop-structure/blocking optimizer (by us on a shared parse tree
> representation) -> (LLVM/C/OpenCL) -> low-level optimization (by the
> respective compilers)
> 
> Some algorithms that could be shareable are iteration strategies
> (already in NumPy though), blocking strategies, etc.
> 
> Even if this may be beyond numba's (and perhaps Cython's) current
> ambition, it may be worth thinking about, if nothing else then just
> for how Cython's code should be structured.

This kind of collaboration would be very nice.  I agree, there might be some kind of intermediate representation that would be good for both projects.  

-Travis


> 
> (Mark F., how does the above match how you feel about this?)
> 
> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120313/0b405084/attachment.html>

From sturla at molden.no  Tue Mar 13 08:31:50 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 13 Mar 2012 13:31:50 +0100
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
	<CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
Message-ID: <4F5F3E36.3080102@molden.no>

On 12.03.2012 23:23, Abhishek Pratap wrote:
> Super awesome. I love how the python community in general keeps the
> recordings available for free.
>
> @Adam : I do have some problems that I can hit numpy with, mainly
> bigData based. So in summary I have millions/billions of rows of
> biological data on which I want to run some computation but at the
> same time have a capability to do quick lookup. I am not sure if numpy
> will be applicable for quick lookups  by a string based key right ??


Jason Kinser's book on Python for bioinformatics might be of interest. 
Though I don't always agree with his NumPy coding style.

As for "big data", it is a problem regardless of language. The HDF5 
library might be of help (cf. PyTables or h5py, I actually prefer the 
latter).

With a 64 bit system it is also possible to memory map a temporary file, 
and tell the OS to keep as much of it in memory if possible. That way we 
can "fake" more RAM than we actually have. (The Linux equivalent of the 
code in bigmem.c would be to mmap from tmpfs.) A usecase for bigmem.c is 
e.g. if you need to use 10 tables that each are 1-2 GB in size, but only 
have 4 GB of RAM on the desktop computer.


Sturla


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bigmem.c
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120313/09ae1b17/attachment.c>

From francesc at continuum.io  Tue Mar 13 12:24:11 2012
From: francesc at continuum.io (Francesc Alted)
Date: Tue, 13 Mar 2012 11:24:11 -0500
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <4F5F3E36.3080102@molden.no>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
	<CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
	<4F5F3E36.3080102@molden.no>
Message-ID: <1805A8E7-C470-406B-8227-5B3C9D56F372@continuum.io>

On Mar 13, 2012, at 7:31 AM, Sturla Molden wrote:

> On 12.03.2012 23:23, Abhishek Pratap wrote:
>> Super awesome. I love how the python community in general keeps the
>> recordings available for free.
>> 
>> @Adam : I do have some problems that I can hit numpy with, mainly
>> bigData based. So in summary I have millions/billions of rows of
>> biological data on which I want to run some computation but at the
>> same time have a capability to do quick lookup. I am not sure if numpy
>> will be applicable for quick lookups  by a string based key right ??
> 
> 
> Jason Kinser's book on Python for bioinformatics might be of interest. Though I don't always agree with his NumPy coding style.
> 
> As for "big data", it is a problem regardless of language. The HDF5 library might be of help (cf. PyTables or h5py, I actually prefer the latter).

Yes, however IMO PyTables does adapt better to the OP lookup user case.  For example, let's suppose a very simple key-value problem, where we need to locate a certain value by using a key.  Using h5py I get:

In [1]: import numpy as np

In [2]: N = 100*1000

In [3]: sa = np.fromiter((('key'+str(i), i) for i in xrange(N)), dtype="S8,i4")

In [4]: import h5py

In [5]: f = h5py.File('h5py.h5', 'w')

In [6]: d = f.create_dataset('sa', data=sa)

In [7]: time [val for val in d if val[0] == 'key500']
CPU times: user 28.34 s, sys: 0.06 s, total: 28.40 s
Wall time: 29.25 s
Out[7]: [('key500', 500)]

Another option is to use fancy selection:

In [8]: time d[d['f0']=='key500']
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.01 s
Out[8]: 
array([('key500', 500)], 
      dtype=[('f0', 'S8'), ('f1', '<i4')])

Hmm, time resolution is too poor here.  Let's use the %timeit magic:

In [9]: timeit d[d['f0']=='key500']
100 loops, best of 3: 9.3 ms per loop

which is much better.  But, in this case you need to load the column d['f0'] completely in-memory, and this is *not* what you want when you have large tables that does not fit in-memory.

Using PyTables:

In [10]: import tables

In [11]: ft = tables.openFile('pytables.h5', 'w')

In [12]: dt = ft.createTable(ft.root, 'sa', sa)

In [13]: time [val[:] for val in dt if val[0] == 'key500']
CPU times: user 0.04 s, sys: 0.01 s, total: 0.05 s
Wall time: 0.04 s
Out[13]: [('key500', 500)]

That's almost a 100x of speed-up compared with h5py.  But, in addition, PyTables has specific machinery to optimize these queries by using the numexpr behind the scenes:

In [14]: time [val[:] for val in dt.where("f0=='key500'")]
CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
Wall time: 0.00 s
Out[14]: [('key500', 500)]

Again, time resolution is too poor here.  Let's use timeit magic:

In [15]: timeit [val[:] for val in dt.where("f0=='key500'")]
100 loops, best of 3: 2.36 ms per loop

This is an additional 10x speed-up.  In fact, this is almost as fast as performing the query using NumPy directly:

In [16]: timeit sa[sa['f0']=='key500']
100 loops, best of 3: 2.14 ms per loop

with the difference that PyTables uses an out-of-core paradigm (i.e. it does not need to load the datasets completely in-memory).  And finally, PyTables does support true indexing capabilities, so that you do not have to read the complete dataset for getting results:

In [17]: dt.cols.f0.createIndex()
Out[17]: 100000

In [18]: timeit [val[:] for val in dt.where("f0=='key500'")]
1000 loops, best of 3: 213 us per loop

which accounts for another additional 10x speedup.  Of course, this speed up can be *much* more larger for bigger datasets, and specially for those that does not fit in-memory.  See:

http://pytables.github.com/usersguide/optimization.html#accelerating-your-searches

for more detailed rational and benchmarks in big datasets.

-- Francesc Alted


From markflorisson88 at gmail.com  Tue Mar 13 12:28:26 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 13 Mar 2012 17:28:26 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
Message-ID: <CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>

On 13 March 2012 09:19, Travis Oliphant <travis at continuum.io> wrote:
>
> On Mar 13, 2012, at 12:58 AM, Dag Sverre Seljebotn wrote:
>
> On 03/10/2012 10:35 PM, Travis Oliphant wrote:
>
> Hey all,
>
>
> I gave a lightning talk this morning on numba which is the start of a
>
> Python compiler to machine code through the LLVM tool-chain. It is proof
>
> of concept stage only at this point (use it only if you are interested
>
> in helping develop the code at this point). The only thing that works is
>
> a fast-vectorize capability on a few functions (without for-loops). But,
>
> it shows how creating functions in Python that can be used by the NumPy
>
> runtime in various ways. Several NEPS that will be discussed in the
>
> coming months will use this concept.
>
>
> Right now there is very little design documentation, but I will be
>
> adding some in the days ahead, especially if I get people who are
>
> interested in collaborating on the project. I did talk to Fijal and Alex
>
> of the PyPy project at PyCon and they both graciously suggested that I
>
> look at some of the PyPy code which walks the byte-code and does
>
> translation to their intermediate representation for inspiration.
>
>
> Again, the code is not ready for use, it is only proof of concept, but I
>
> would like to get feedback and help especially from people who might
>
> have written compilers before. The code lives at:
>
> https://github.com/ContinuumIO/numba
>
>
> Hi Travis,
>
> me and Mark F. has been talking today about whether some of numba and
> Cython development could overlap -- not right away, but in the sense
> that if Cython gets some features for optimization of numerical code,
> then make it easy for numba to reuse that functionality.
>
>
> That would be very, very interesting.
>
>
> This may be sort of off-topic re: the above-- but part of the goal of
> this post is to figure out numba's intended scope. If there isn't an
> overlap, that's good to know in itself.
>
> Question 1: Did you look at Clyther and/or Copperhead? Though similar,
> they target GPUs...but at first glance they look as though they may be
> parsing Python bytecode to get their ASTs... (didn't check though)
>
>
> I have looked at both projects although Clyther more in depth. ? ?Clyther is
> parsing bytecode to get the AST (through a sub-project by the same author
> called Meta: ?http://srossross.github.com/Meta/html/index.html).
>
>
> Question 2: What kind of performance are you targeting -- in the short
> term, and in the long term? Is competing with "Fortran-level"
> performance a goal at all?
>
>
> In the short-term, I'm targeting C-equivalent performance (like weave). ? In
> the long-term, I'm targeting optimized high-level expressions (i.e.
> Fortran-level) with GPU and mulit-core.
>
>
> E.g., for ufunc computations with different iteration orders such
> as "a + b.T" (a and b in C-order), one must do blocking to get good
> performance. And when dealing with strided arrays, copying small chunks
> at the time will sometimes help performance (and sometimes not).
>
> This is optimization strategies which (as I understand it) is quite
> beyond what NumPy iterators etc. can provide.
>

As for blocking, this could be done by the numpy iterators themselves,
by simply introducing more dimensions with appropriate shape and
strides (not saying that's a solution :).

>
>
> And the LLVM level could
> be too low -- one has quite a lot of information when generating the
> ufunc/reduction/etc. that would be thrown away when generating LLVM
> code.
>
>
> It doesn't need to be thrown away at all. ? It could be used to generate
> appropriate code for the arrays being used. ? The long-term idea is to
> actually be aware of NumPy arrays and encourage expression of high-level
> constructs which generate optimized code using chunking, blocking, AVX
> instructions, multiple threads, etc.
>
> To do this, it may make more sense to actually emit OpenMP (unless LLVM
> grows standard threading intrinsics). ? This is not out of the question.

That would be interesting, my experience with OpenMP is that the
standard doesn't define (ironically enough) the use of OpenMP in the
context of threading, and indeed, trying to use OpenMP outside of the
main thread simply segfaults your program. If llvm would get such
features, one must be prepared to make the OpenMP runtime thread-safe
as well (hopefully it will be in the first place, like I believe
Intel's implementation).

> Vectorizing compilers do their best to reconstruct this
> information; I know nothing about what actually exists here for
> LLVM. They are certainly a lot more complicated to implement and work
> with than making use of on higher-level information available before
> code generation.
>
> The idea we've been playing with is for Cython to define a limited
> subset of its syntax tree (essentially the "GIL-less" subset) seperate
> from the rest of Cython, with a more well-defined API for optimization
> passes etc., and targeted for a numerical optimization pipeline.
>
> This subset would actually be pretty close to what numba needs to
> compile, even if the overlap isn't perfect. So such a pipeline could
> possibly be shared between Cython and numba, even if Cython would use
> it at compile-time and numba at runtime, and even if the code
> generation backend is different (the code generation backend is
> probably not the hard part...). To be concrete, the idea is:
>
>
>
> (Cython|numba) -> high-level numerical compiler and
> loop-structure/blocking optimizer (by us on a shared parse tree
> representation) -> (LLVM/C/OpenCL) -> low-level optimization (by the
> respective compilers)
>
> Some algorithms that could be shareable are iteration strategies
> (already in NumPy though), blocking strategies, etc.
>
> Even if this may be beyond numba's (and perhaps Cython's) current
> ambition, it may be worth thinking about, if nothing else then just
> for how Cython's code should be structured.
>
>
> This kind of collaboration would be very nice. ?I agree, there might be some
> kind of intermediate representation that would be good for both projects.
>
> -Travis
>
>
>
> (Mark F., how does the above match how you feel about this?)

I would like collaboration, but from a technical perspective I think
this would be much more involved than just dumping the AST to an IR
and generating some code from there. For vector expressions I think
sharing code would be more feasible than arbitrary (parallel) loops,
etc. Cython as a compiler can make many decisions that a Python
(bytecode) compiler can't make (at least without annotations and a
well-defined subset of the language (not so much the syntax as the
semantics)). I think in numba, if parallelism is to be supported, you
will want a prange-like construct, as proving independence between
iterations can be very hard to near impossible for a compiler.

As for code generation, I'm not sure how llvm would do things like
slicing arrays, reshaping, resizing etc (for vector expressions you
can first evaluate all slicing and indexing operations and then
compile the remaining vector expression), but for loops and array
reassignment within loops this would have to invoke the actual slicing
code from the llvm code (I presume). There are many other things, like
bounds checking, wraparound, etc, that are all supported in both numpy
and Cython, but going through an llvm layer would as far as I can see,
require re-implementing those, at least if you want top-notch
performance. Personally, I think for non-trivial performance-critical
code (for loops with indexing, slicing, function calls, etc) Cython is
a better target.

So for vector expressions I think Cython and Numba could work together
by specifying AST transformations that operate on vector expressions.
For the purposes of Cython it would go from the Cython AST to the IR
and after transformations either back to the Cython AST, or directly
to llvm. For Cython, going from that code to llvm is not necessarily
more useful than C and OpenCL, as you will know the types anyway at
compile time and you can immediately exploit multicore as well as SIMD
parallelism. In the face of blocking and chunking etc, certain
specializations may be created in advance for Cython, or it could even
generate a C version (+ openmp + auto-vectorization appeasing
pragmas), an OpenCL version for the CPU and possibly a different one
for the GPU, and a numba + numba IR version, i.e. feed the IR at
runtime to numba and have it compile to llvm. If the compiler
additionally fuses vector expressions together, this will be even more
powerful.

Finally, as for non-vector-expression code, I really believe Cython is
a better target. cython.inline can have high overhead (at least the
first time it has to compile), but with better (numpy-aware) type
inference or profile guided optimizations (see recent threads on the
cython-dev mailing list), in addition to things like prange, I
personally believe Cython targets most of the use cases where numba
would be able to generate performing code.

> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From travis at continuum.io  Tue Mar 13 13:18:19 2012
From: travis at continuum.io (Travis Oliphant)
Date: Tue, 13 Mar 2012 12:18:19 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
Message-ID: <FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>

>> 
>> (Mark F., how does the above match how you feel about this?)
> 
> I would like collaboration, but from a technical perspective I think
> this would be much more involved than just dumping the AST to an IR
> and generating some code from there. For vector expressions I think
> sharing code would be more feasible than arbitrary (parallel) loops,
> etc. Cython as a compiler can make many decisions that a Python
> (bytecode) compiler can't make (at least without annotations and a
> well-defined subset of the language (not so much the syntax as the
> semantics)). I think in numba, if parallelism is to be supported, you
> will want a prange-like construct, as proving independence between
> iterations can be very hard to near impossible for a compiler.

I completely agree that you have to define some kind of syntax to get parallelism.  But, a prange construct would not be out of the question, of course.  

> 
> As for code generation, I'm not sure how llvm would do things like
> slicing arrays, reshaping, resizing etc (for vector expressions you
> can first evaluate all slicing and indexing operations and then
> compile the remaining vector expression), but for loops and array
> reassignment within loops this would have to invoke the actual slicing
> code from the llvm code (I presume).

There could be some analysis on the byte-code, prior to emitting the llvm code in order to handle lots of things.   Basically, you have to "play" the byte-code on a simple machine anyway in order to emit the correct code.   The big thing about Cython is you have to typedef too many things that are really quite knowable from the code.   If Cython could improve it's type inference, then it would be a more suitable target. 

> There are many other things, like
> bounds checking, wraparound, etc, that are all supported in both numpy
> and Cython, but going through an llvm layer would as far as I can see,
> require re-implementing those, at least if you want top-notch
> performance. Personally, I think for non-trivial performance-critical
> code (for loops with indexing, slicing, function calls, etc) Cython is
> a better target.

With libclang it is really quite possible to imagine a cython -> C target that itself compiles to llvm so that you can do everything at that intermediate layer.   However,  LLVM is a much better layer for optimization than C now that there are a lot of people collaborating on that layer.   I think it would be great if Cython targeted LLVM actually instead of C.  

> 
> Finally, as for non-vector-expression code, I really believe Cython is
> a better target. cython.inline can have high overhead (at least the
> first time it has to compile), but with better (numpy-aware) type
> inference or profile guided optimizations (see recent threads on the
> cython-dev mailing list), in addition to things like prange, I
> personally believe Cython targets most of the use cases where numba
> would be able to generate performing code.

Cython and Numba certainly overlap.  However, Cython requires:

	1) learning another language
	2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code. 

These aren't show-stoppers obviously.   But, I think some users would prefer an even simpler approach to getting fast-code than Cython (which currently doesn't do enought type-inference and requires building a dlopen extension module). 

-Travis


> 
>> Dag
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> 
>> 
>> 
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From njs at pobox.com  Tue Mar 13 14:39:01 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 13 Mar 2012 18:39:01 +0000
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
Message-ID: <CAPJVwBkzYG6Ybe7SF3PdvrRGaAFq11dVYw4XWm6480mSxVW1Fg@mail.gmail.com>

On Tue, Mar 13, 2012 at 5:18 PM, Travis Oliphant <travis at continuum.io> wrote:
> Cython and Numba certainly overlap. ?However, Cython requires:
>
> ? ? ? ?1) learning another language

So is the goal for numba to actually handle arbitrary Python code with
correct semantics, i.e., it's actually a compiled implementation of
Python-the-language? (I feel like most of where Cython-the-language
differs from Python-the-language is that Cython adds extensions for
stuff where getting speed out of Python-the-language would just be too
hard. Dynamic type inference for numpy arrays is definitely a good
start, but you can't even, say, promote a Python integer to a C
integer without changing semantics...)

> ? ? ? ?2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code.

Totally agreed on this point, the workflow for Cython could definitely
be smoother.

-- Nathaniel


From ischnell at enthought.com  Tue Mar 13 16:47:10 2012
From: ischnell at enthought.com (Ilan Schnell)
Date: Tue, 13 Mar 2012 15:47:10 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CAPJVwBkzYG6Ybe7SF3PdvrRGaAFq11dVYw4XWm6480mSxVW1Fg@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CAPJVwBkzYG6Ybe7SF3PdvrRGaAFq11dVYw4XWm6480mSxVW1Fg@mail.gmail.com>
Message-ID: <CAAUn5qLRf6yAbX3go_Bo3fXpVQ9E+ok3n9eF34x5qr-ZmgvXFQ@mail.gmail.com>

As far as I understand, the goal is not to handle arbitrary
Python code, because this would become too difficult as is
not necessary when you have a simple math oriented function
which you want to speed up.  The idea is to create something
similar to http://www.enthought.com/~ischnell/paper.html which
only handles some restricted Python code, but using LLVM
(instead of C) as the target language.

- Ilan


On Tue, Mar 13, 2012 at 1:39 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Tue, Mar 13, 2012 at 5:18 PM, Travis Oliphant <travis at continuum.io> wrote:
>> Cython and Numba certainly overlap. ?However, Cython requires:
>>
>> ? ? ? ?1) learning another language
>
> So is the goal for numba to actually handle arbitrary Python code with
> correct semantics, i.e., it's actually a compiled implementation of
> Python-the-language? (I feel like most of where Cython-the-language
> differs from Python-the-language is that Cython adds extensions for
> stuff where getting speed out of Python-the-language would just be too
> hard. Dynamic type inference for numpy arrays is definitely a good
> start, but you can't even, say, promote a Python integer to a C
> integer without changing semantics...)
>
>> ? ? ? ?2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code.
>
> Totally agreed on this point, the workflow for Cython could definitely
> be smoother.
>
> -- Nathaniel
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From apratap at lbl.gov  Tue Mar 13 18:58:11 2012
From: apratap at lbl.gov (Abhishek Pratap)
Date: Tue, 13 Mar 2012 15:58:11 -0700
Subject: [Numpy-discussion] numpy videos
In-Reply-To: <1805A8E7-C470-406B-8227-5B3C9D56F372@continuum.io>
References: <CA+7hxbzaA2tL-kZrLrQrcHJE+eQ4OVkCwZUcakMTGj_nxQi7fA@mail.gmail.com>
	<CAKF=DjtoMhh44DvtyK_ru=9oJw+wcc3aLfSxN=rBcXGTZWwMgg@mail.gmail.com>
	<CAGOQHEOkAcOsaRsr_GWqQ0b1QiP0KuVJR+sYj2ugzfm+A_9-xg@mail.gmail.com>
	<CA+7hxbyFb+xwhZ-7xQ4eVWJWqWhKv1n=NtzP0BenEJ+-RxzkiA@mail.gmail.com>
	<4F5F3E36.3080102@molden.no>
	<1805A8E7-C470-406B-8227-5B3C9D56F372@continuum.io>
Message-ID: <CA+7hxbxmdyyxCEU34oYnQ2m8kenKFyc1Kh-ZpDpbh1xB0i2M+A@mail.gmail.com>

Thanks guys..very handy examples by Francesc. I need to bookmark them
until I reach this point.

best,
-Abhi

On Tue, Mar 13, 2012 at 9:24 AM, Francesc Alted <francesc at continuum.io> wrote:
> On Mar 13, 2012, at 7:31 AM, Sturla Molden wrote:
>
>> On 12.03.2012 23:23, Abhishek Pratap wrote:
>>> Super awesome. I love how the python community in general keeps the
>>> recordings available for free.
>>>
>>> @Adam : I do have some problems that I can hit numpy with, mainly
>>> bigData based. So in summary I have millions/billions of rows of
>>> biological data on which I want to run some computation but at the
>>> same time have a capability to do quick lookup. I am not sure if numpy
>>> will be applicable for quick lookups ?by a string based key right ??
>>
>>
>> Jason Kinser's book on Python for bioinformatics might be of interest. Though I don't always agree with his NumPy coding style.
>>
>> As for "big data", it is a problem regardless of language. The HDF5 library might be of help (cf. PyTables or h5py, I actually prefer the latter).
>
> Yes, however IMO PyTables does adapt better to the OP lookup user case. ?For example, let's suppose a very simple key-value problem, where we need to locate a certain value by using a key. ?Using h5py I get:
>
> In [1]: import numpy as np
>
> In [2]: N = 100*1000
>
> In [3]: sa = np.fromiter((('key'+str(i), i) for i in xrange(N)), dtype="S8,i4")
>
> In [4]: import h5py
>
> In [5]: f = h5py.File('h5py.h5', 'w')
>
> In [6]: d = f.create_dataset('sa', data=sa)
>
> In [7]: time [val for val in d if val[0] == 'key500']
> CPU times: user 28.34 s, sys: 0.06 s, total: 28.40 s
> Wall time: 29.25 s
> Out[7]: [('key500', 500)]
>
> Another option is to use fancy selection:
>
> In [8]: time d[d['f0']=='key500']
> CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
> Wall time: 0.01 s
> Out[8]:
> array([('key500', 500)],
> ? ? ?dtype=[('f0', 'S8'), ('f1', '<i4')])
>
> Hmm, time resolution is too poor here. ?Let's use the %timeit magic:
>
> In [9]: timeit d[d['f0']=='key500']
> 100 loops, best of 3: 9.3 ms per loop
>
> which is much better. ?But, in this case you need to load the column d['f0'] completely in-memory, and this is *not* what you want when you have large tables that does not fit in-memory.
>
> Using PyTables:
>
> In [10]: import tables
>
> In [11]: ft = tables.openFile('pytables.h5', 'w')
>
> In [12]: dt = ft.createTable(ft.root, 'sa', sa)
>
> In [13]: time [val[:] for val in dt if val[0] == 'key500']
> CPU times: user 0.04 s, sys: 0.01 s, total: 0.05 s
> Wall time: 0.04 s
> Out[13]: [('key500', 500)]
>
> That's almost a 100x of speed-up compared with h5py. ?But, in addition, PyTables has specific machinery to optimize these queries by using the numexpr behind the scenes:
>
> In [14]: time [val[:] for val in dt.where("f0=='key500'")]
> CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
> Wall time: 0.00 s
> Out[14]: [('key500', 500)]
>
> Again, time resolution is too poor here. ?Let's use timeit magic:
>
> In [15]: timeit [val[:] for val in dt.where("f0=='key500'")]
> 100 loops, best of 3: 2.36 ms per loop
>
> This is an additional 10x speed-up. ?In fact, this is almost as fast as performing the query using NumPy directly:
>
> In [16]: timeit sa[sa['f0']=='key500']
> 100 loops, best of 3: 2.14 ms per loop
>
> with the difference that PyTables uses an out-of-core paradigm (i.e. it does not need to load the datasets completely in-memory). ?And finally, PyTables does support true indexing capabilities, so that you do not have to read the complete dataset for getting results:
>
> In [17]: dt.cols.f0.createIndex()
> Out[17]: 100000
>
> In [18]: timeit [val[:] for val in dt.where("f0=='key500'")]
> 1000 loops, best of 3: 213 us per loop
>
> which accounts for another additional 10x speedup. ?Of course, this speed up can be *much* more larger for bigger datasets, and specially for those that does not fit in-memory. ?See:
>
> http://pytables.github.com/usersguide/optimization.html#accelerating-your-searches
>
> for more detailed rational and benchmarks in big datasets.
>
> -- Francesc Alted
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From mwwiebe at gmail.com  Tue Mar 13 21:44:39 2012
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Tue, 13 Mar 2012 18:44:39 -0700
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <4F5A35E4.6080501@continuum.io>
References: <4F5A35E4.6080501@continuum.io>
Message-ID: <CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>

On Fri, Mar 9, 2012 at 8:55 AM, Bryan Van de Ven <bryanv at continuum.io>wrote:

> Hi all,
>
> I have started working on a NEP for adding an enumerated type to NumPy.
> It is on my GitHub:
>
>     https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>
> It is still very rough, and incomplete in places. But I would like to
> get feedback sooner rather than later in order to refine it. In
> particular there are a few questions inline in the document that I would
> like input on. Any comments, suggestions, questions, concerns, etc. are
> very welcome.
>

This looks like a great start to me.

I think the open/closed enum distinction will need to be explored a little
bit more, because it interacts with dtype immutability/hashability. Do you
know if there are any examples of Python objects in the wild that
dynamically convert from not being hashable (i.e. raising an exception if
used as a dict key) to become hashable?

It might be worth adding a section which briefly compares and contrasts the
proposed functionality with enums in various programming languages. Here
are two links I found to try and get an idea:

MS on C# enum usage:
http://msdn.microsoft.com/en-us/library/cc138362.aspx
Wikipedia on C++ enum class:
http://en.wikipedia.org/wiki/C%2B%2B11#Strongly_typed_enumerations

For example, the C# enum has a way to enable a "flags" mode, which will
create successive powers of 2. This may not be a feature NumPy needs, but
if people are finding it useful in C#, maybe it would be useful here too.

Cheers,
Mark


>
> Thanks,
>
> Bryan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120313/b68c73cc/attachment.html>

From d.s.seljebotn at astro.uio.no  Wed Mar 14 01:08:38 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 13 Mar 2012 22:08:38 -0700
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
Message-ID: <4F6027D6.1090308@astro.uio.no>

On 03/13/2012 06:44 PM, Mark Wiebe wrote:
> On Fri, Mar 9, 2012 at 8:55 AM, Bryan Van de Ven <bryanv at continuum.io
> <mailto:bryanv at continuum.io>> wrote:
>
>     Hi all,
>
>     I have started working on a NEP for adding an enumerated type to NumPy.
>     It is on my GitHub:
>
>     https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>
>     It is still very rough, and incomplete in places. But I would like to
>     get feedback sooner rather than later in order to refine it. In
>     particular there are a few questions inline in the document that I would
>     like input on. Any comments, suggestions, questions, concerns, etc. are
>     very welcome.
>
>
> This looks like a great start to me.
>
> I think the open/closed enum distinction will need to be explored a
> little bit more, because it interacts with dtype
> immutability/hashability. Do you know if there are any examples of
> Python objects in the wild that dynamically convert from not being
> hashable (i.e. raising an exception if used as a dict key) to become
> hashable?

In Sage, the matrix objects are mutable when constructed, and you can 
set_immutable to make them immutable.

The way I look at that though is that it is part of the construction 
phase of the object, you'd typically construct, fill it in, then 
set_immutable (to finish construction), then use it.

set/frozenset is an example of the opposite, and a design I personally 
like better (i.e., "frozen_dtype" :-)).

Dag

>
> It might be worth adding a section which briefly compares and contrasts
> the proposed functionality with enums in various programming languages.
> Here are two links I found to try and get an idea:
>
> MS on C# enum usage:
> http://msdn.microsoft.com/en-us/library/cc138362.aspx
> Wikipedia on C++ enum class:
> http://en.wikipedia.org/wiki/C%2B%2B11#Strongly_typed_enumerations
>
> For example, the C# enum has a way to enable a "flags" mode, which will
> create successive powers of 2. This may not be a feature NumPy needs,
> but if people are finding it useful in C#, maybe it would be useful here
> too.
>
> Cheers,
> Mark
>
>
>     Thanks,
>
>     Bryan
>     _______________________________________________
>     NumPy-Discussion mailing list
>     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From sameer.grover.1 at gmail.com  Wed Mar 14 05:01:11 2012
From: sameer.grover.1 at gmail.com (Sameer Grover)
Date: Wed, 14 Mar 2012 14:31:11 +0530
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <4F5CF57B.9070407@uci.edu>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
	<4F5A6DE5.6000602@uci.edu>
	<CAOxZyWNsY69G6UzbwUsb0vKgyUnTTTMQv8LwmrGG-kPNMwAK2w@mail.gmail.com>
	<4F5CF57B.9070407@uci.edu>
Message-ID: <4F605E57.2030206@gmail.com>

On Monday 12 March 2012 12:26 AM, Christoph Gohlke wrote:
>
> On 3/10/2012 9:31 PM, Sameer Grover wrote:
>> On 10 March 2012 02:23, Christoph Gohlke<cgohlke at uci.edu>   wrote:
>>>
>>> On 3/9/2012 11:50 AM, Sameer Grover wrote:
>>>>>>> import gtk
>>>>>>> import foo # where foo is any f2py-wrapped program
>>>> Subsequently, on exiting python interpreter, the interpreter crashes
>>>> with this error message - "This application has requested the Runtime
>>>> to terminate it in an unusual way. Please contact the application's
>>>> support team for more information."
>>>>
>>>> Strangely enough, interchanging the order of the import statements,
>>>> i.e. importing the f2py wrapped program before gtk works fine.
>>>> Furthermore, each module works fine individually.
>>>>
>>>> This is a windows-only problem. I'm using Windows 7, Python 2.7,
>>>> latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
>>>> (mentioned on the pygtk download page).
>>>>
>>>> This happens even for very simpl fortran programs such as this one -
>>>> subroutine hello ()
>>>>        write(*,*)'Hello from Fortran90!!!'
>>>> end subroutine hello
>>>>
>>>> I don't know whether the problem is with f2py or with gtk or with
>>>> python but maybe somebody can shed some light on this.
>>>>
>>>> Regards,
>>>> Sameer Grover
>>> The error can be due to memory corruption. It works for me with
>>> msvc9/ifort builds of the pygtk and f2py extensions.
>>>
>>> Which DLLs does foo.pyd depend on (use DependencyWalker)?
>>>
>>> Christoph
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> The .pyd file generated by f2py depends on the following
>>
>> KERNEL32.DLL
>> MSVCRT.DLL
>> LIBGCC_S_DW2-1.DLL
>> LIBGFORTRAN-3.DLL
>> PYTHON27.DLL
>>
>> It is possible that this is a mingw32 problem. I haven't been able to
>> try using ifort/msvc.(mscv is somehow not getting installed on my
>> system, but that's a separate issue).
>>
>> Sameer
>
> Can you try linking to the msvc9 runtime DLL? E.g.
>
> f2py.py -c --fcompiler=gnu95 --compiler=mingw32 -lmsvcr90 -m foo foo.f
>
> Christoph
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

That gives the same error as before.

I was also trying to perform the compilation using the msvc compiler but 
am getting the these errors at the final linking stage.

f2py.py -c --fcompiler=gnu95 --compiler=msvc -lmsvcr90 -m foo foo.f
->Cannot open input file 'msvcr90.lib'

f2py.py -c --fcompiler=gnu95 --compiler=msvc -m foo foo.f
->Cannot open input file 'mingw32.lib'

Where am I going wrong?

Thanks for the help.

Sameer


From pgmdevlist at gmail.com  Wed Mar 14 09:28:38 2012
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 14 Mar 2012 13:28:38 +0000 (UTC)
Subject: [Numpy-discussion] Recovering from a STOP ?
Message-ID: <loom.20120314T142725-938@post.gmane.org>

THYC


Dear all,

I'm working with some large inherited F90 code that needs to be wrapped in
Python. if the code base itself cannot be modified (it's a static archive), some
additional F90 files were written to help the interaction with the code. Writing
a python extension combining the archive and these additional files is rather
straightforward using F2PY, the extension loads, all is well but...
...when a problem occurs in a subroutine deep in the base code, it calls another
routine (say, ENDRUN) which itself uses a STOP statement. As you know, this
statement not only exits the fortran code, but makes the extension and then the
interpreter crash...

I patched ENDRUN use STOP, I tried to call a C external function instead of
using STOP:
"""
*- ENDRUN.f90 -*
subroutine ENDRUN(C_NOM)
  use endrun_wrap
  Implicit none
  character(LEN=*), intent(in) :: C_NOM
  ! write(*, *) C_NOM
  ! stop
  call pyraise_runtime(C_NOM)
end subroutine ENDRUN

*- ENDRUNWRAP.f90 -*
module endrun_wrap
  interface
    subroutine pyraise_runtime(message)
        character(LEN=*), intent(in) :: message
    end subroutine pyraise_runtime
  end interface
end module endrun_wrap

*- pyraise_runtime.c -*
#include <Python.h>

void pyraise_runtime_(char *message);

void pyraise_runtime_(char *message)
{
   printf("calling PyErr_SetString(PyExc_RuntimeError, message)\n");
   PyErr_SetString(PyExc_RuntimeError, message);
}
"""

Alas, the RuntimeError doesn't look like it's passed back to the interpreter,
which still crashes. (Adding a Py_Exit(-1) at the end of pyraise_runtime at
least let the interpreter do some extra cleaning after the fortran code stopped,
but still...)

Note that ENDRUN is never supposed to be called directly by the user (so no
point to define a callback function via f2py, right ?).

Any idea would be quite welcome.
Thanks an awful lot in advance.

P.
PS: Oh, BTW, I'm limited to numpy 1.4.1 and the f2py that comes with it... 


From pav at iki.fi  Wed Mar 14 12:25:31 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 14 Mar 2012 17:25:31 +0100
Subject: [Numpy-discussion] Recovering from a STOP ?
In-Reply-To: <loom.20120314T142725-938@post.gmane.org>
References: <loom.20120314T142725-938@post.gmane.org>
Message-ID: <jjqgpr$3n2$1@dough.gmane.org>

14.03.2012 14:28, Pierre GM kirjoitti:
[clip]
> Alas, the RuntimeError doesn't look like it's passed back to the interpreter,
> which still crashes. (Adding a Py_Exit(-1) at the end of pyraise_runtime at
> least let the interpreter do some extra cleaning after the fortran code stopped,
> but still...)
> 
> Note that ENDRUN is never supposed to be called directly by the user (so no
> point to define a callback function via f2py, right ?).

The crash maybe occurs because if the code is not stopped, it writes out
of bounds or something?

You can try to use longjmp in ENDRUN to jump back to the beginning, and
return an error code. YMMV, this probably plays all hell with cleanup etc.

Or, maybe the whole Fortran stuff can be run in a separate process, so
that crashing doesn't matter.

	Pauli


From chris.barker at noaa.gov  Wed Mar 14 16:49:38 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 14 Mar 2012 13:49:38 -0700
Subject: [Numpy-discussion] Recovering from a STOP ?
In-Reply-To: <jjqgpr$3n2$1@dough.gmane.org>
References: <loom.20120314T142725-938@post.gmane.org>
	<jjqgpr$3n2$1@dough.gmane.org>
Message-ID: <CALGmxEL6QFZYbUTdWrrNgSbcJ7-E+EhwiZb06q_TVwDErsJ7uw@mail.gmail.com>

On Wed, Mar 14, 2012 at 9:25 AM, Pauli Virtanen <pav at iki.fi> wrote:
> Or, maybe the whole Fortran stuff can be run in a separate process, so
> that crashing doesn't matter.

That's what I was going to suggest -- even if you can get it not to
crash, it may well be in a bad state -- memory leaks, and who know
what else.

We did something similar with some C code that called the system
exit() function when it encountered errors -- it may not have been too
hard to replace those calls, but making sure the memory was all
cleaned up was going to be a trick -- so we just used the multiprocess
module to call it in another process.

HTH,

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From jonas.ruebsam at web.de  Wed Mar 14 19:35:47 2012
From: jonas.ruebsam at web.de (jonasr)
Date: Wed, 14 Mar 2012 16:35:47 -0700 (PDT)
Subject: [Numpy-discussion] Re place array values
Message-ID: <33506446.post@talk.nabble.com>


Hello, 

my problem is that i want to remove some small numbers of an 2d array,
for example if i want to sort out all numbers smaller then 1 of an array i
get

x=[[0,1,2],[3,4,5][6,7,8]]

c=x>=1

In [213]: c
Out[213]: 
array([[False,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]], dtype=bool)

In [214]: x[c]
Out[214]: array([1, 2, 3, 4, 5, 6, 7, 8])

the problem ist that i now have a 1d array, is there any possibility to
keep the 2d structure ? 

greets jonas
-- 
View this message in context: http://old.nabble.com/Replace-array-values-tp33506446p33506446.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.


From gokhansever at gmail.com  Wed Mar 14 19:51:04 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Wed, 14 Mar 2012 17:51:04 -0600
Subject: [Numpy-discussion] Re place array values
In-Reply-To: <33506446.post@talk.nabble.com>
References: <33506446.post@talk.nabble.com>
Message-ID: <CAE5kuyi+k8gtu5T=6MHhH_19g=ZKUaT6eotupZ6n0cknTzp7wg@mail.gmail.com>

Hi,

You can try masked_array module:

x = np.array([[0,1,2],[3,4,5],[6,7,8]])

I3 np.ma.masked_where(x<1, x)
O3
masked_array(data =
 [[-- 1 2]
 [3 4 5]
 [6 7 8]],
             mask =
 [[ True False False]
 [False False False]
 [False False False]],
       fill_value = 999999)

There might be a smarter solution than this, since ma tends to get slower
when you deal with big data arrays. But you retain the 2D information
instead of getting a flattened np.array.


On Wed, Mar 14, 2012 at 5:35 PM, jonasr <jonas.ruebsam at web.de> wrote:

>
> Hello,
>
> my problem is that i want to remove some small numbers of an 2d array,
> for example if i want to sort out all numbers smaller then 1 of an array i
> get
>
> x=[[0,1,2],[3,4,5][6,7,8]]
>
> c=x>=1
>
> In [213]: c
> Out[213]:
> array([[False,  True,  True],
>       [ True,  True,  True],
>       [ True,  True,  True]], dtype=bool)
>
> In [214]: x[c]
> Out[214]: array([1, 2, 3, 4, 5, 6, 7, 8])
>
> the problem ist that i now have a 1d array, is there any possibility to
> keep the 2d structure ?
>
> greets jonas
> --
> View this message in context:
> http://old.nabble.com/Replace-array-values-tp33506446p33506446.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
G?khan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120314/967f0015/attachment.html>

From derekathomas at gmail.com  Wed Mar 14 19:57:48 2012
From: derekathomas at gmail.com (Derek Ashley Thomas)
Date: Thu, 15 Mar 2012 08:57:48 +0900
Subject: [Numpy-discussion] Re place array values
In-Reply-To: <33506446.post@talk.nabble.com>
References: <33506446.post@talk.nabble.com>
Message-ID: <CAEoiczfAGp3x-Q4a3R-eEna07hHSM6aJa_gudA4ibRKK5Mc_yQ@mail.gmail.com>

Hi,
I'm completely new to this list (and fairly new to numpy in general), but I
was wondering if you tried multiplying the bool array and the original
array.

Try:
x=array([[0,1,2],[3,4,5],[6,7,8]])

if rank(x) <= 1
  dot(x>=1, x)
else
  (x>=1)*x

This will give you a completely numerical array of the same structure, but
all numbers less than 1 will be zero fo easy handling. Is this what you're
trying to accomplish?

--
Derek


On Thu, Mar 15, 2012 at 08:35, jonasr <jonas.ruebsam at web.de> wrote:

>
> Hello,
>
> my problem is that i want to remove some small numbers of an 2d array,
> for example if i want to sort out all numbers smaller then 1 of an array i
> get
>
> x=[[0,1,2],[3,4,5][6,7,8]]
>
> c=x>=1
>
> In [213]: c
> Out[213]:
> array([[False,  True,  True],
>       [ True,  True,  True],
>       [ True,  True,  True]], dtype=bool)
>
> In [214]: x[c]
> Out[214]: array([1, 2, 3, 4, 5, 6, 7, 8])
>
> the problem ist that i now have a 1d array, is there any possibility to
> keep the 2d structure ?
>
> greets jonas
> --
> View this message in context:
> http://old.nabble.com/Replace-array-values-tp33506446p33506446.html
> Sent from the Numpy-discussion mailing list archive at Nabble.com.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/18dcdf84/attachment.html>

From tmp50 at ukr.net  Thu Mar 15 06:48:04 2012
From: tmp50 at ukr.net (Dmitrey)
Date: Thu, 15 Mar 2012 12:48:04 +0200
Subject: [Numpy-discussion] [ANN] new release 0.38 of OpenOpt, FuncDesigner,
	SpaceFuncs, DerApproximator
Message-ID: <73423.1331808484.17417893843813335040@ffe8.ukr.net>


Hi,
I'm glad to inform you about new release 0.38 (2012-March-15):


OpenOpt:

interalg can handle discrete variables (see MINLP for examples)
interalg can handle multiobjective problems (MOP)
interalg can handle problems with parameters fixedVars/freeVars
Many interalg improvements and some bugfixes
Add another EIG solver: numpy.linalg.eig
New LLSP solver pymls with box bounds handling

FuncDesigner:

Some improvements for sum()
Add funcs tanh, arctanh, arcsinh, arccosh
Can solve EIG built from derivatives of several functions, obtained by
automatic differentiation by FuncDesigner

SpaceFuncs:

Add method point.symmetry(Point|Line|Plane)
Add method LineSegment.middle
Add method Point.rotate(Center, angle)

DerApproximator:

Minor changes


See http://openopt.org for more details.


Regards, D.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/73f69cc5/attachment.html>

From pierre.haessig at crans.org  Thu Mar 15 09:08:00 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Thu, 15 Mar 2012 14:08:00 +0100
Subject: [Numpy-discussion] addition,
 multiplication of a polynomial and np.{float, int}
In-Reply-To: <CABL7CQgDjboUHUYWs8BLyDuQRWCxDmnsLh2vZp26+u8Ve1XXFQ@mail.gmail.com>
References: <slrnjlcusj.ul5.dlaxalde@157-washington.campus.mcgill.ca>	<CAB6mnxJ=u7QGTwSwUZfibbSieh8caZ+3_yTmuWF_-3xzJmONYg@mail.gmail.com>	<4F5790BB.2050801@crans.org>	<CAB6mnx+kJys-F76wKP+6KpF2rQObLmQD7g8G4nOWL03Ks_UDpA@mail.gmail.com>	<4F57DAFF.3030302@crans.org>
	<CABL7CQgDjboUHUYWs8BLyDuQRWCxDmnsLh2vZp26+u8Ve1XXFQ@mail.gmail.com>
Message-ID: <4F61E9B0.5060704@crans.org>

Le 09/03/2012 23:57, Ralf Gommers a ?crit :
>
> The buildbot doesn't check the doc build. I've edited a few of the links. 
Thanks for checking !

I had not realized that simply using the `numpy.package` notation was
enough to get a link to the package.

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/e74ff145/attachment.sig>

From langton2 at llnl.gov  Thu Mar 15 13:38:14 2012
From: langton2 at llnl.gov (Asher Langton)
Date: Thu, 15 Mar 2012 10:38:14 -0700
Subject: [Numpy-discussion] Followup on Python+MPI import performance
In-Reply-To: <CAB079H+kfzYNDqzaKGUNZmgzThHn=irvmc3Rx58dPy-CUEjvpQ@mail.gmail.com>
References: <CAB079H+kfzYNDqzaKGUNZmgzThHn=irvmc3Rx58dPy-CUEjvpQ@mail.gmail.com>
Message-ID: <CAB079HJc0Oz6D1eAfY2KpWCwcpkEKLNe+G5Td1vtptOQT36waA@mail.gmail.com>

On Mon, Mar 5, 2012 at 10:17 AM, Asher Langton <langton at gmail.com> wrote:
> This is a followup to my post from January
> (http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html)
> and the panel discussion at PyData this weekend. As a few people have
> suggested, a better approach than the MPI-broadcasted lookups is to
> cache the locations of all the modules found in sys.path.
> [...]
> I'll put an initial implementation of this importer on github sometime
> this week, and I'll follow up this post with some performance numbers
> when I have them.

Here are some numbers for the PEP302-based cached importer on an IBM
BlueGene/P machine. Numbers are wallclock measurements by the time
utility in minutes:seconds, one run for each test (not an average),
with no attempt to take into account other activity on the system or
fileservers. (With that said, I ran a variety of other tests, and the
results have been consistent.) I still need to run some larger tests,
particularly in the 16k-64k range, where Python imports start to scale
very poorly on this machine.

The tests use the code currently at github.com/langton/MPI_Import with
a script that simply imports 100 small C-extension modules.

With 1k cores/MPI processes:
cached_import.finder: 14:19.98
- skip actual import [1]: 13:37.77
- with checks [2]: 27:09.60
- w/checks, no import: 26:23.63

cached_import.mpi4py_finder [3]: 2:32.51
- skip actual import: 1:42.55
- with checks: 2:32.38
- w/checks, no import: 1:42.94

MPI_Import [4]: 2:22.20

standard import : 15:43.63
- skip actual imports [5]: 0:56.59


With 4k cores/MPI processes:
cached_import.finder: 27:34.45
- skip actual import: 27:40.58
- with checks: 52:14.83
- w/checks, no import: 50:04.73

cached_import.mpi4py_finder: 4:03.02
- skip actual import: 3:12.75
- with checks: 4:13.65
- w/checks, no import: 3:18.46

MPI_Import: 4:02.76

standard import : 35:24.77
- skip actual imports: 1:56.36

Notes:
[1] Builds the cache, but omits the actual imports.
[2] Check whether modules in sys.path are readable while building the
cache. Because filesystem operations are expensive, these checks are
off by default.
[3] Only the rank 0 process builds the initial cache, which is then
broadcasted over MPI.
[4] The other import replacement.
[5] This is roughly the interpreter startup/initialization time.


-Asher


From gokhansever at gmail.com  Thu Mar 15 14:56:01 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Thu, 15 Mar 2012 12:56:01 -0600
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
Message-ID: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>

Hello,

>From the masked_values() documentation ->
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ma.masked_values.html

I10 np.ma.masked_values(x, 1.5)
O10
masked_array(data = [ 1.   1.1  2.   1.1  3. ],
             mask = False,
       fill_value = 1.5)


I12 np.ma.masked_values(x, 1.5, shrink=False)
O12
masked_array(data = [ 1.   1.1  2.   1.1  3. ],
             mask = False,
       fill_value = 1.5)

Shouldn't setting the 'shrink' to False return an array of False values for
the mask field?
If not so, how can I return a set of False values if my masking condition
is not met?

Using:
I16 np.__version__
O16 '2.0.0.dev-7e202a2'

Thanks.


-- 
G?khan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/7b4fd887/attachment.html>

From gokhansever at gmail.com  Thu Mar 15 15:06:48 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Thu, 15 Mar 2012 13:06:48 -0600
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
Message-ID: <CAE5kuyi85K-MSo-qGbOv2BO9-RdnNgo+vu6KPm+=ypBVzZkEKw@mail.gmail.com>

On Thu, Mar 15, 2012 at 12:56 PM, G?khan Sever <gokhansever at gmail.com>wrote:

If not so, how can I return a set of False values if my masking condition
> is not met?
>

Self-answer: I can force the mask to be filled with False's, however unsure
if this is a safe operation.

I50 x = np.array([1, 1.1, 2, 1.1, 3])

I51 y = np.ma.masked_values(x, 1.5, shrink=0)

I52 y
O52
masked_array(data = [1.0 1.1 2.0 1.1 3.0],
             mask = False,
       fill_value = 1.5)


I53 y.mask = np.zeros(len(x), dtype=np.bool)*True

I54 y
O54
masked_array(data = [1.0 1.1 2.0 1.1 3.0],
             mask = [False False False False False],
       fill_value = 1.5)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/bbf7201a/attachment.html>

From pgmdevlist at gmail.com  Thu Mar 15 15:12:40 2012
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 15 Mar 2012 20:12:40 +0100
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
Message-ID: <CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>

Ciao G?khan,
AFAIR, shrink is used only to force a collapse of a mask full of False, not
to force the creation of such a mask.
Now, it should work as you expected, meaning that it needs to be fixed.
Could you open a ticket? And put me in copy, just in case.
Anyhow:
Your trick is a tad dangerous, as it erases the previous mask. I'd prefer
to create x w/ a full mask, then use masked_values w/ shrink=False... Now,
if you're sure there's no masked values, go for it.
Cheers
On Mar 15, 2012 7:56 PM, "G?khan Sever" <gokhansever at gmail.com> wrote:

> Hello,
>
> From the masked_values() documentation ->
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ma.masked_values.html
>
> I10 np.ma.masked_values(x, 1.5)
> O10
> masked_array(data = [ 1.   1.1  2.   1.1  3. ],
>              mask = False,
>        fill_value = 1.5)
>
>
> I12 np.ma.masked_values(x, 1.5, shrink=False)
> O12
> masked_array(data = [ 1.   1.1  2.   1.1  3. ],
>              mask = False,
>        fill_value = 1.5)
>
> Shouldn't setting the 'shrink' to False return an array of False values
> for the mask field?
> If not so, how can I return a set of False values if my masking condition
> is not met?
>
> Using:
> I16 np.__version__
> O16 '2.0.0.dev-7e202a2'
>
> Thanks.
>
>
> --
> G?khan
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/937bcc21/attachment.html>

From gokhansever at gmail.com  Thu Mar 15 15:24:37 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Thu, 15 Mar 2012 13:24:37 -0600
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
	<CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>
Message-ID: <CAE5kuygc-jMx0ozx0L4s98dU8oKO49gQ720KOPNGSu00CAWqvg@mail.gmail.com>

On Thu, Mar 15, 2012 at 1:12 PM, Pierre GM <pgmdevlist at gmail.com> wrote:

> Ciao G?khan,
> AFAIR, shrink is used only to force a collapse of a mask full of False,
> not to force the creation of such a mask.
> Now, it should work as you expected, meaning that it needs to be fixed.
> Could you open a ticket? And put me in copy, just in case.
> Anyhow:
> Your trick is a tad dangerous, as it erases the previous mask. I'd prefer
> to create x w/ a full mask, then use masked_values w/ shrink=False... Now,
> if you're sure there's x= no masked values, go for it.
> Cheers
>
This condition checking should make it stronger:

I7 x = np.array([1, 1.1, 2, 1.1, 3])

I8 y = np.ma.masked_values(x, 1.5)

I9 if y.mask == False:
    y.mask = np.zeros(len(x), dtype=np.bool)*True
   ...:

I10 y.mask
O10 array([False, False, False, False, False], dtype=bool)

I11 y
O11
masked_array(data = [1.0 1.1 2.0 1.1 3.0],
             mask = [False False False False False],
       fill_value = 1.5)

How do you create "x w/ a full mask"?

-- 
G?khan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/402a2f3e/attachment.html>

From gokhansever at gmail.com  Thu Mar 15 15:41:16 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Thu, 15 Mar 2012 13:41:16 -0600
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CAE5kuygc-jMx0ozx0L4s98dU8oKO49gQ720KOPNGSu00CAWqvg@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
	<CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>
	<CAE5kuygc-jMx0ozx0L4s98dU8oKO49gQ720KOPNGSu00CAWqvg@mail.gmail.com>
Message-ID: <CAE5kuyjMhdzRJ13AbGu3MfuQPwid2mGu5Nzm-KUV8Y3t=6GmGQ@mail.gmail.com>

Submitted the ticket at http://projects.scipy.org/numpy/ticket/2082


On Thu, Mar 15, 2012 at 1:24 PM, G?khan Sever <gokhansever at gmail.com> wrote:

>
>
> On Thu, Mar 15, 2012 at 1:12 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>
>> Ciao G?khan,
>> AFAIR, shrink is used only to force a collapse of a mask full of False,
>> not to force the creation of such a mask.
>> Now, it should work as you expected, meaning that it needs to be fixed.
>> Could you open a ticket? And put me in copy, just in case.
>> Anyhow:
>> Your trick is a tad dangerous, as it erases the previous mask. I'd prefer
>> to create x w/ a full mask, then use masked_values w/ shrink=False... Now,
>> if you're sure there's x= no masked values, go for it.
>> Cheers
>>
>> This condition checking should make it stronger:
>
> I7 x = np.array([1, 1.1, 2, 1.1, 3])
>
> I8 y = np.ma.masked_values(x, 1.5)
>
> I9 if y.mask == False:
>     y.mask = np.zeros(len(x), dtype=np.bool)*True
>    ...:
>
> I10 y.mask
> O10 array([False, False, False, False, False], dtype=bool)
>
> I11 y
> O11
> masked_array(data = [1.0 1.1 2.0 1.1 3.0],
>              mask = [False False False False False],
>        fill_value = 1.5)
>
> How do you create "x w/ a full mask"?
>
> --
> G?khan
>


-- 
G?khan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/19eb6674/attachment.html>

From njs at pobox.com  Thu Mar 15 19:02:30 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 15 Mar 2012 23:02:30 +0000
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
Message-ID: <CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>

On Wed, Mar 14, 2012 at 1:44 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Fri, Mar 9, 2012 at 8:55 AM, Bryan Van de Ven <bryanv at continuum.io>
> wrote:
>>
>> Hi all,
>>
>> I have started working on a NEP for adding an enumerated type to NumPy.
>> It is on my GitHub:
>>
>> ? ? https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>>
>> It is still very rough, and incomplete in places. But I would like to
>> get feedback sooner rather than later in order to refine it. In
>> particular there are a few questions inline in the document that I would
>> like input on. Any comments, suggestions, questions, concerns, etc. are
>> very welcome.
>
>
> This looks like a great start to me.
>
> I think the open/closed enum distinction will need to be explored a little
> bit more, because it interacts with dtype immutability/hashability. Do you
> know if there are any examples of Python objects in the wild that
> dynamically convert from not being hashable (i.e. raising an exception if
> used as a dict key) to become hashable?

I haven't run into any...

Thinking about it, I'm not sure I have any use case for this type
being mutable. Maybe someone else can think of one? The first case
that came to mind was in reading a large text file, where you want to
(1) auto-create an enum, (2) use a pre-allocated array, and (3) don't
know ahead of time what the levels are:

  a = np.empty(lines_in_file, dtype=np.dtype(Enum()))
  for i, line in enumerate(f):
    field = line.split()[0]
    a.dtype.add_level(field)
    a[i] = field
  a.dtype.seal()

But really this is just can be done just as easily and efficiently
without a mutable dtype:

  a = np.empty(lines_in_file, dtype=np.int32)
  intern_table = {}
  next_level = 0
  for i, line in enumerate(f):
    field = line.split()[0]
    val = intern_table.setdefault(field, next_level)
    if val == next_level:
      next_level += 1
    a[i] = val
  a = a.view(dtype=np.dtype(Enum(map=intern_table)))

I notice that the HDF5 C library has a concept of open versus closed
enums, but I can't tell from the documentation at hand why this is; it
looks like it might just be a limitation of the implementation. (Like,
a workaround for C's lack of a standard mapping type, which makes it
inconvenient to pass in all the mappings in to a single API call.)

> It might be worth adding a section which briefly compares and contrasts the
> proposed functionality with enums in various programming languages. Here are
> two links I found to try and get an idea:
>
> MS on C# enum usage:
> http://msdn.microsoft.com/en-us/library/cc138362.aspx
> Wikipedia on C++ enum class:
> http://en.wikipedia.org/wiki/C%2B%2B11#Strongly_typed_enumerations
>
> For example, the C# enum has a way to enable a "flags" mode, which will
> create successive powers of 2. This may not be a feature NumPy needs, but if
> people are finding it useful in C#, maybe it would be useful here too.

There's also a long, ongoing debate about how to do enums in Python -- e.g.:
  http://www.python.org/dev/peps/pep-0354/
  http://pypi.python.org/pypi/enum/
  http://pypi.python.org/pypi/enum_meta/
  http://pypi.python.org/pypi/flufl.enum/
  http://pypi.python.org/pypi/lazr.enum/
  http://pypi.python.org/pypi/pyutilib.enum/
  http://pypi.python.org/pypi/coding/
  http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python
I guess Guido likes flufl.enum:
  http://mail.python.org/pipermail/python-ideas/2011-July/010909.html

BUT, I'm not sure any of this is relevant at all. "Enums" are a
programming language feature that are, first and foremost, about
injecting names into your code's namespace. What I'm hoping to see is
a dtype for holding categorical data, similar to an R "factor"
  http://stat.ethz.ch/R-manual/R-devel/library/base/html/factor.html
  https://svn.r-project.org/R/trunk/src/library/base/R/factor.R (NB:
This is GPL code if anyone is paranoid about contamination, but also
the most complete API description available)
or an HDF5 "enum"
  http://www.hdfgroup.org/HDF5/doc/H5.user/Datatypes.html#Datatypes_Enum
I believe pandas has some functionality along these lines too, though
I can't find it in the online docs -- hopefully Wes will fill us in.

These are basically objects that act for most purposes like string
arrays, but in which all strings are required to come from a finite,
specified list. This list acts like some metadata attached to the
array; it's order may or may not be significant. And they're
implemented internally as integer arrays.

I'm not sure what it would even mean to treat this kind of data as
"flags", since you can't take the bitwise-or of two strings...

-- Nathaniel


From charlesr.harris at gmail.com  Thu Mar 15 19:45:26 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 15 Mar 2012 17:45:26 -0600
Subject: [Numpy-discussion] Commit PR 229 breaks build
Message-ID: <CAB6mnxKn9GWb++bzc9KkD2tqXQ+KoexP4ghJmYux=xWJpHhWxg@mail.gmail.com>

On my machines anyway.

Running from numpy source directory.
non-existing path in 'numpy/distutils': 'site.cfg'
F2PY Version 2
numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected,
the C API version numbers have to be updated. Current C api version is 6,
with checksum eb54c77ff4149bab310324cd7c0cb176, but recorded checksum for C
API version 6 in codegen_dir/cversions.txt is
e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you
have to update C_API_VERSION  in numpy/core/setup_common.pyc.
  MismatchCAPIWarning)
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in ['/usr/local/lib64',
'/usr/local/lib', '/usr/lib64', '/usr/lib']
  NOT AVAILABLE

atlas_blas_threads_info:
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Traceback (most recent call last):
  File "setup.py", line 214, in <module>
    setup_package()
  File "setup.py", line 207, in setup_package
    configuration=configuration )
  File "/home/charris/Workspace/numpy.git/numpy/distutils/core.py", line
152, in setup
    config = configuration()
  File "setup.py", line 147, in configuration
    config.add_subpackage('numpy')
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 1002, in add_subpackage
    caller_level = 2)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 971, in get_subpackage
    caller_level = caller_level + 1)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 908, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "numpy/setup.py", line 9, in configuration
    config.add_subpackage('core')
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 1002, in add_subpackage
    caller_level = 2)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 971, in get_subpackage
    caller_level = caller_level + 1)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
line 908, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "numpy/core/setup.py", line 901, in configuration
    blas_info = get_info('blas_opt',0)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
line 325, in get_info
    return cl().get_info(notfound_action)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
line 478, in get_info
    self.calc_info()
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
line 1465, in calc_info
    atlas_info = get_info('atlas_blas_threads')
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
line 325, in get_info
    return cl().get_info(notfound_action)
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
line 478, in get_info
    self.calc_info()
  File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py
numpy/distutils/system_info.py", line 1090, in calc_info
    dict_append(atlas, **atlas_extra_info)
NameError: global name 'atlas_extra_info' is not defined


Commenting out line 1090 in numpy/distutils/system_info.py fixes the
problem, but I'm pretty sure that isn't the right fix.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/c0aed0e0/attachment.html>

From matthew.brett at gmail.com  Thu Mar 15 20:09:53 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 17:09:53 -0700
Subject: [Numpy-discussion] Casting rules changed in trunk?
In-Reply-To: <CAH6Pt5rq794KK3uTwkD+oxKUP5x=6Z5zgosfTior0O7oAozMPg@mail.gmail.com>
References: <CAH6Pt5qJPn0f+T4b=ZbNZcq=FgpAoOG-XvAZH7=d5nZco-wKeg@mail.gmail.com>
	<CAH6Pt5rq794KK3uTwkD+oxKUP5x=6Z5zgosfTior0O7oAozMPg@mail.gmail.com>
Message-ID: <CAH6Pt5ovn_bcQTbU9dt7DHDecPdEvgx1wNYVVEvAY-9uhixVyg@mail.gmail.com>

Hi,

On Thu, Mar 8, 2012 at 3:14 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Wed, Mar 7, 2012 at 4:08 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> Hi,
>>
>> I noticed a casting change running the test suite on our image reader,
>> nibabel: https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py
>>
>> For this script:
>>
>> <pre>
>> import numpy as np
>>
>> Adata = np.zeros((2,), dtype=np.uint8)
>> Bdata = np.zeros((2,), dtype=np.int16)
>> Bzero = np.int16(0)
>> Bbig = np.int16(256)
>>
>> print np.__version__
>> print 'Array add', (Adata + Bdata).dtype
>> print 'Scalar 0 add', (Adata + Bzero).dtype
>> print 'Scalar 256 add', (Adata + Bbig).dtype
>> </pre>
>>
>> 1.4.1
>> Array add int16
>> Scalar 0 add uint8
>> Scalar 256 add uint8
>>
>> 1.5.1
>> Array add int16
>> Scalar 0 add uint8
>> Scalar 256 add uint8
>>
>> 1.6.1
>> Array add int16
>> Scalar 0 add uint8
>> Scalar 256 add int16
>>
>> 1.7.0.dev-aae5b0a
>> Array add int16
>> Scalar 0 add uint8
>> Scalar 256 add uint16
>>
>> I can understand the uint8 outputs from numpy < 1.6 - the rule being
>> not to upcast for scalars.
>>
>> I can understand the int16 output from 1.6.1 on the basis that the
>> value is outside uint8 range and therefore we might prefer a type that
>> can handle values from both uint8 and int16.
>>
>> Was the current change intended? ?It has the following odd effect:
>>
>> In [5]: Adata + np.int16(257)
>> Out[5]: array([257, 257], dtype=uint16)
>>
>> In [7]: Adata + np.int16(-257)
>> Out[7]: array([-257, -257], dtype=int16)
>>
>> In [8]: Adata - np.int16(257)
>> Out[8]: array([65279, 65279], dtype=uint16)
>>
>> but I guess you can argue that there are odd effects for other choices too,
>
> In case it wasn't clear, this, in numpy 1.6.1:
>
> In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
> Out[2]: dtype('int16')
>
> changed to this in current trunk:
>
> In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
> Out[2]: dtype('uint16')
>
> which is different still in previous versions of numpy (e.g. 1.4.1):
>
> In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype
> Out[2]: dtype('uint8')
>
> My impression had been that the plan was to avoid changes in the
> casting rules if possible.
>
> Was this change in trunk intentional? ?If not, I am happy to bisect,

OK - I will assume it was unintentional and make a bug report,

Best,

Matthew


From njs at pobox.com  Thu Mar 15 20:27:32 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 16 Mar 2012 00:27:32 +0000
Subject: [Numpy-discussion] Missing data again
In-Reply-To: <CAPJVwB=wHMJbdDjMksK7=jdtP5eLkpe=BaPtnnX-r2-4tkfrkA@mail.gmail.com>
References: <291B3A7C-A5CB-4559-A143-F4B4FB002B71@continuum.io>
	<CAMRnEmoiTZcVpBZKU1B-kOGMsmN8SxghJ=oc5v7Y8s7wRvd+BA@mail.gmail.com>
	<4F561590.3000400@crans.org>
	<CAMRnEmqVP9N6_fNEbhcrQGOmVaMo3wagEeTigWN3pTy95c546w@mail.gmail.com>
	<CAPJVwBnKuA0NC3FvEuzmxgpA8FOT2fver98Ec5ow3oPZACfiyA@mail.gmail.com>
	<4F578E60.7080304@crans.org>
	<CAB6mnxK47+s5FBS+ej=U0CpYWFK4nTjicuHz03qUxwC2F+Wieg@mail.gmail.com>
	<CAPJVwBnzTAk1t-okcNz1_zyEvehhy2DFASaVWBaChmC4q-6aVA@mail.gmail.com>
	<CAB6mnxKJ=imqNgfBAh3hEvqVCVWjtT=OaomF5ue-1k8qRB+qUA@mail.gmail.com>
	<CAPJVwB=wHMJbdDjMksK7=jdtP5eLkpe=BaPtnnX-r2-4tkfrkA@mail.gmail.com>
Message-ID: <CAPJVwBkQ-veCJQjd1J7OHTZrh4zAnzTUP8+OgDFV-O3a2XhbAQ@mail.gmail.com>

Hi Chuck,

I think I let my frustration get the better of me, and the message
below is too confrontational. I apologize.

I truly would like to understand where you're coming from on this,
though, so I'll try to make this more productive. My summary of points
that no-one has disagreed with yet is here:
  https://github.com/njsmith/numpy/wiki/NA-discussion-status
Of course, this means that there's lots that's left out. Instead of
getting into all those contentious details, I'll stick to just a few
basic questions that might let us get at least of bit of common
ground:
1) Do you disagree with anything that is stated there?
2) Do you feel like that document accurately summarises your basic
idea of what this feature is supposed to do (I assume under the
IGNORED heading)?

Thanks,
-- Nathaniel

On Wed, Mar 7, 2012 at 11:10 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Wed, Mar 7, 2012 at 7:37 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>> On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> When it comes to "missing data", bitpatterns can do everything that
>>> masks can do, are no more complicated to implement, and have better
>>> performance characteristics.
>>>
>>
>> Maybe for float, for other things, no. And we have lots of otherthings.
>
> It would be easier to discuss this if you'd, like, discuss :-(. If you
> know of some advantage that masks have over bitpatterns when it comes
> to missing data, can you please share it, instead of just asserting
> it?
>
> Not that I'm immune... I perhaps should have been more explicit
> myself, when I said "performance characteristics", let me clarify that
> I was thinking of both speed (for floats) and memory (for
> most-but-not-all things).
>
>> The
>> performance is a strawman,
>
> How many users need to speak up to say that this is a serious problem
> they have with the current implementation before you stop calling it a
> strawman? Because when Wes says that it's not going to fly for his
> stats/econometics cases, and the neuroimaging folk like Gary and Matt
> say it's not going to fly for their use cases... surely just waving
> that away is a bit dismissive?
>
> I'm not saying that we *have* to implement bitpatterns because
> performance is *the most important feature* -- I'm just saying, well,
> what I said. For *missing data use* cases, bitpatterns have better
> performance characteristics than masks. If we decide that these use
> cases are important, then we should take this into account and weigh
> it against other considerations. Maybe what you think is that these
> use cases shouldn't be the focus of this feature and it should focus
> on the "ignored" use cases instead? That would be a legitimate
> argument... but if that's what you want to say, say it, don't just
> dismiss your users!
>
>> and it *isn't* easier to implement.
>
> If I thought bitpatterns would be easier to implement, I would have
> said so... What I said was that they're not harder. You have some
> extra complexity, mostly in casting, and some reduced complexity -- no
> need to allocate and manipulate the mask. (E.g., simple same-type
> assignments and slicing require special casing for masks, but not for
> bitpatterns.) In many places the complexity is identical -- printing
> routines need to check for either special bitpatterns or masked
> values, whatever. Ufunc loops need to either find the appropriate part
> of the mask, or create a temporary mask buffer by calling a dtype
> func, whatever. On net they seem about equivalent, complexity-wise.
>
> ...I assume you disagree with this analysis, since I've said it
> before, wrote up a sketch for how the implementation would work at the
> C level, etc., and you continue to claim that simplicity is a
> compelling advantage for the masked approach. But I still don't know
> why you think that :-(.
>
>>> > Also, different folks adopt different values
>>> > for 'missing' data, and distributing one or several masks along with the
>>> > data is another common practice.
>>>
>>> True, but not really relevant to the current debate, because you have
>>> to handle such issues as part of your general data import workflow
>>> anyway, and none of these is any more complicated no matter which
>>> implementations are available.
>>>
>>> > One inconvenience I have run into with the current API is that is should
>>> > be
>>> > easier to clear the mask from an "ignored" value without taking a new
>>> > view
>>> > or assigning known data. So maybe two types of masks (different
>>> > payloads),
>>> > or an additional flag could be helpful. The process of assigning masks
>>> > could
>>> > also be made a bit easier than using fancy indexing.
>>>
>>> So this, uh... this was actually the whole goal of the "alterNEP"
>>> design for masks -- making all this stuff easy for people (like you,
>>> apparently?) that want support for ignored values, separately from
>>> missing data, and want a nice clean API for it. Basically having a
>>> separate .mask attribute which was an ordinary, assignable array
>>> broadcastable to the attached array's shape. Nobody seemed interested
>>> in talking about it much then but maybe there's interest now?
>>>
>>
>> Come off it, Nathaniel, the problem is minor and fixable. The intent of the
>> initial implementation was to discover such things.
>
> Implementation can be wonderful, I absolutely agree. But you
> understand that I'd be more impressed by this example if your
> discovery weren't something I had been arguing for since before the
> implementation began :-).
>
>> These things are less
>> accessible with the current API *precisely* because of the feedback from R
>> users. It didn't start that way.
>>
>> We now have something to evolve into what we want. That is a heck of a lot
>> more useful than endless discussion.
>
> No, you are still missing the point completely! There is no "what *we*
> want", because what you want is different than what I want. The
> masking stuff in the alterNEP was an attempt to give people like you
> who wanted "ignored" support what they wanted, and the bitpattern
> stuff was to satisfy people like me who want "missing data" support.
> The NEP took a different approach to trying to make everyone happy...
> unfortunately it sounds like it made no-one happy. Blaming the R users
> for this isn't *wrong*, exactly, but it's a bit one-sided.
>
> If you have a proposal for how the current code can be "evolved" into
> something that will make the neuro/econ/stats people happy, then
> please tell us. But I don't see how it's possible, and your current
> proposals are going in the wrong direction. Unless we can actually
> talk about these disagreements, we're just going to have more endless
> discussion.
>
> -- Nathaniel


From ben.root at ou.edu  Thu Mar 15 20:56:36 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 15 Mar 2012 19:56:36 -0500
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
	<CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
Message-ID: <CANNq6FniRNKLpg2+nNJa-UsehHpAnnAazLTSKtJ-rCpYm3y7ZQ@mail.gmail.com>

On Thursday, March 15, 2012, Nathaniel Smith <njs at pobox.com> wrote:
> On Wed, Mar 14, 2012 at 1:44 AM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> On Fri, Mar 9, 2012 at 8:55 AM, Bryan Van de Ven <bryanv at continuum.io>
>> wrote:
>>>
>>> Hi all,
>>>
>>> I have started working on a NEP for adding an enumerated type to NumPy.
>>> It is on my GitHub:
>>>
>>>     https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>>>
>>> It is still very rough, and incomplete in places. But I would like to
>>> get feedback sooner rather than later in order to refine it. In
>>> particular there are a few questions inline in the document that I would
>>> like input on. Any comments, suggestions, questions, concerns, etc. are
>>> very welcome.
>>
>>
>> This looks like a great start to me.
>>
>> I think the open/closed enum distinction will need to be explored a
little
>> bit more, because it interacts with dtype immutability/hashability. Do
you
>> know if there are any examples of Python objects in the wild that
>> dynamically convert from not being hashable (i.e. raising an exception if
>> used as a dict key) to become hashable?
>
> I haven't run into any...
>
> Thinking about it, I'm not sure I have any use case for this type
> being mutable. Maybe someone else can think of one? The first case
> that came to mind was in reading a large text file, where you want to
> (1) auto-create an enum, (2) use a pre-allocated array, and (3) don't
> know ahead of time what the levels are:
>
>  a = np.empty(lines_in_file, dtype=np.dtype(Enum()))
>  for i, line in enumerate(f):
>    field = line.split()[0]
>    a.dtype.add_level(field)
>    a[i] = field
>  a.dtype.seal()
>
> But really this is just can be done just as easily and efficiently
> without a mutable dtype:
>
>  a = np.empty(lines_in_file, dtype=np.int32)
>  intern_table = {}
>  next_level = 0
>  for i, line in enumerate(f):
>    field = line.split()[0]
>    val = intern_table.setdefault(field, next_level)
>    if val == next_level:
>      next_level += 1
>    a[i] = val
>  a = a.view(dtype=np.dtype(Enum(map=intern_table)))
>
> I notice that the HDF5 C library has a concept of open versus closed
> enums, but I can't tell from the documentation at hand why this is; it
> looks like it might just be a limitation of the implementation. (Like,
> a workaround for C's lack of a standard mapping type, which makes it
> inconvenient to pass in all the mappings in to a single API call.)
>
>> It might be worth adding a section which briefly compares and contrasts
the
>> proposed functionality with enums in various programming languages. Here
are
>> two links I found to try and get an idea:
>>
>> MS on C# enum usage:
>> http://msdn.microsoft.com/en-us/library/cc138362.aspx
>> Wikipedia on C++ enum class:
>> http://en.wikipedia.org/wiki/C%2B%2B11#Strongly_typed_enumerations
>>
>> For example, the C# enum has a way to enable a "flags" mode, which will
>> create successive powers of 2. This may not be a feature NumPy needs,
but if
>> people are finding it useful in C#, maybe it would be useful here too.
>
> There's also a long, ongoing debate about how to do enums in Python --
e.g.:
>  http://www.python.org/dev/peps/pep-0354/
>  http://pypi.python.org/pypi/enum/
>  http://pypi.python.org/pypi/enum_meta/
>  http://pypi.python.org/pypi/flufl.enum/
>  http://pypi.python.org/pypi/lazr.enum/
>  http://pypi.python.org/pypi/pyutilib.enum/
>  http://pypi.python.org/pypi/coding/
>
http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python
> I guess Guido likes flufl.enum:
>  http://mail.python.org/pipermail/python-ideas/2011-July/010909.html
>
> BUT, I'm not sure any of this is relevant at all. "Enums" are a
> programming language feature that are, first and foremost, about
> injecting names into your code's namespace. What I'm hoping to see is
> a dtype for holding categorical data, similar to an R "factor"
>  http://stat.ethz.ch/R-manual/R-devel/library/base/html/factor.html
>  https://svn.r-project.org/R/trunk/src/library/base/R/factor.R (NB:
> This is GPL code if anyone is paranoid about contamination, but also
> the most complete API description available)
> or an HDF5 "enum"
>  http://www.hdfgroup.org/HDF5/doc/H5.user/Datatypes.html#Datatypes_Enum
> I believe pandas has some functionality along these lines too, though
> I can't find it in the online docs -- hopefully Wes will fill us in.
>
> These are basically objects that act for most purposes like string
> arrays, but in which all strings are required to come from a finite,
> specified list. This list acts like some metadata attached to the
> array; it's order may or may not be significant. And they're
> implemented internally as integer arrays.
>
> I'm not sure what it would even mean to treat this kind of data as
> "flags", since you can't take the bitwise-or of two strings...
>
> -- Nathaniel
>

I guess my problem is that this isn't _quite_ like an enum that I am
familiar with (but not quite unlike it either).  Should we call it
"factor", to avoid confusion or are there going to be too many that won't
know what that is, but would be drawn in by a name of "enum"?

Just a thought.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/c71bb7b3/attachment.html>

From stefan at sun.ac.za  Thu Mar 15 21:02:23 2012
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Thu, 15 Mar 2012 18:02:23 -0700
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
	<CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
Message-ID: <CABDkGQneh12fC9UrT_aAHX7LHQSY7+k8ehO6b2cwN4gjSxVuvw@mail.gmail.com>

On Thu, Mar 15, 2012 at 4:02 PM, Nathaniel Smith <njs at pobox.com> wrote:
> I'm not sure what it would even mean to treat this kind of data as
> "flags", since you can't take the bitwise-or of two strings...

This makes a more sense outside of ndarrays, where you would do something like:

enum FLAG0 = 1, FLAG1 = 2, FLAG2 = 4
do_something(data, mode=FLAG0 & FLAG2)

The enum is therefore just a handle for its numerical value.  While it
may not be that useful in an array, I think Mark was just pointing out
that there may be other similar use cases, such as enumerating from 0
to N-1, or in reverse from N-1 down to 0, or in steps of 2, or in
powers of 2, etc.

St?fan


From matthew.brett at gmail.com  Fri Mar 16 00:10:03 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 21:10:03 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
Message-ID: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>

Hi,

Am I right in thinking that float96 on windows 32 bit is a float64
padded to 96 bits?  If so, is it useful?  Has anyone got a windows64
box to check float128 ?

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.6.1'
>>> info = np.finfo(np.float96)
>>> print info
Machine parameters for float96
---------------------------------------------------------------------
precision= 15   resolution= 1e-15
machep=   -52   eps=        2.22044604925e-16
negep =   -53   epsneg=     1.11022302463e-16
minexp=-16382   tiny=       0.0
maxexp= 16384   max=        1.#INF
nexp  =    15   min=        -max
---------------------------------------------------------------------

>>> info.nmant
52

Confirming 52 (+1 implicit) significand digits

>>> np.float96(2**52)+1
4503599627370497.0
>>> np.float96(2**53)+1
9007199254740992.0

float96 claims 15 exponent digits (nexp above), but in fact it appears
to have 11, as does float64

>>> np.float64(2**1022) * 2
8.9884656743115795e+307
>>> np.float64(2**1022) * 4
__main__:1: RuntimeWarning: overflow encountered in double_scalars
inf
>>> np.float96(2**1022) * 2
8.9884656743115795e+307
>>> np.float96(2**1022) * 4
1.#INF

It does take up 12 bytes (96 bits)

>>> np.dtype(np.float96).itemsize
12

Thanks,

Matthew


From cournape at gmail.com  Fri Mar 16 00:17:21 2012
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 15 Mar 2012 23:17:21 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
Message-ID: <CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>

On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> Am I right in thinking that float96 on windows 32 bit is a float64
> padded to 96 bits?


Yes


>  If so, is it useful?


Yes: this is what allows you to use dtype to parse complex binary files
directly in numpy without having to care so much about those details. And
that's how it is defined on windows in any case (C standard only forces you
to have sizeof(long double) >= sizeof(double)).


>  Has anyone got a windows64
> box to check float128 ?
>

Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows
64.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/e8173ae0/attachment.html>

From charlesr.harris at gmail.com  Fri Mar 16 00:24:29 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 15 Mar 2012 22:24:29 -0600
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
Message-ID: <CAB6mnxLNMHLEMvreXgG_PBf-VTmsoy6xt-tnDda1OSfzbtMsdQ@mail.gmail.com>

On Thu, Mar 15, 2012 at 10:17 PM, David Cournapeau <cournape at gmail.com>wrote:

>
>
> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com>wrote:
>
>> Hi,
>>
>> Am I right in thinking that float96 on windows 32 bit is a float64
>> padded to 96 bits?
>
>
> Yes
>
>
>>  If so, is it useful?
>
>
> Yes: this is what allows you to use dtype to parse complex binary files
> directly in numpy without having to care so much about those details. And
> that's how it is defined on windows in any case (C standard only forces you
> to have sizeof(long double) >= sizeof(double)).
>
>
>
>>  Has anyone got a windows64
>> box to check float128 ?
>>
>
> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows
> 64.
>
>
Wait, MSVC doesn't support extended precision, so how do we get doubles
padded to 96 bits? I think MINGW supports extended precision but the MS
libraries won't. Still, if it's doubles it should be 64 bits and float96
shouldn't exist. Doubles padded to 96 bits are 150% pointless.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/ac7d8f4f/attachment.html>

From matthew.brett at gmail.com  Fri Mar 16 00:25:41 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 21:25:41 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
Message-ID: <CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 9:17 PM, David Cournapeau <cournape at gmail.com> wrote:
>
>
> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> Am I right in thinking that float96 on windows 32 bit is a float64
>> padded to 96 bits?
>
>
> Yes
>
>>
>> ?If so, is it useful?
>
>
> Yes: this is what allows you to use dtype to parse complex binary files
> directly in numpy without having to care so much about those details. And
> that's how it is defined on windows in any case (C standard only forces you
> to have sizeof(long double) >= sizeof(double)).

I propose then to rename this one to float64_96 .

The nexp value in finfo(np.float96) is incorrect I believe, I'll make
a ticket for it.

>> ?Has anyone got a windows64
>> box to check float128 ?
>
>
> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows
> 64.

Should you have time to do that, could you confirm it's also a padded
float64 and that nexp is still (incorrectly) 15?   That would be a
great help,

Thanks,

Matthew


From matthew.brett at gmail.com  Fri Mar 16 00:29:43 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 21:29:43 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAB6mnxLNMHLEMvreXgG_PBf-VTmsoy6xt-tnDda1OSfzbtMsdQ@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAB6mnxLNMHLEMvreXgG_PBf-VTmsoy6xt-tnDda1OSfzbtMsdQ@mail.gmail.com>
Message-ID: <CAH6Pt5ouuUXrREhjBza=kfA2eoPQ5TcOhGRXe3AbC3QpW5ph1g@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 9:24 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Thu, Mar 15, 2012 at 10:17 PM, David Cournapeau <cournape at gmail.com>
> wrote:
>>
>>
>>
>> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> Am I right in thinking that float96 on windows 32 bit is a float64
>>> padded to 96 bits?
>>
>>
>> Yes
>>
>>>
>>> ?If so, is it useful?
>>
>>
>> Yes: this is what allows you to use dtype to parse complex binary files
>> directly in numpy without having to care so much about those details. And
>> that's how it is defined on windows in any case (C standard only forces you
>> to have sizeof(long double) >= sizeof(double)).
>>
>>
>>>
>>> ?Has anyone got a windows64
>>> box to check float128 ?
>>
>>
>> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows
>> 64.
>>
>
> Wait, MSVC doesn't support extended precision, so how do we get doubles
> padded to 96 bits? I think MINGW supports extended precision but the MS
> libraries won't. Still, if it's doubles it should be 64 bits and float96
> shouldn't exist. Doubles padded to 96 bits are 150% pointless.

I think David is arguing that longdouble for MSVC is indeed a 96 bit
padded float64 unless I misunderstand him.

If we were thinking of trade-offs I suppose one could argue that the
confusion and wasted memory of float96 might outweigh the simple
ability to read in binary files containing these values, on the basis
that one can do it anyway (by using a structured array dtype) and that
such files must be very rare in practice.

See you,

Matthew


From kalatsky at gmail.com  Fri Mar 16 00:33:49 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Thu, 15 Mar 2012 23:33:49 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
Message-ID: <CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>

I just happened to have an xp64 VM running:
My version of numpy (1.6.1) does not have float128 (see more below what I
get in ipython session).
If you need to test something else please let me know.
Val
---

Enthought Python Distribution -- www.enthought.com

Python 2.7.2 |EPD 7.2-2 (64-bit)| (default, Jul  3 2011, 15:34:33) [MSC
v.1500 6
4 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.

IPython 0.12 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

Welcome to pylab, a matplotlib-based Python environment [backend: WXAgg].
For more information, type 'help(pylab)'.

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.6.1'

In [3]: np.flo
np.float        np.float32      np.float_       np.floor
np.float16      np.float64      np.floating     np.floor_divide

In [3]: np.flo

On Thu, Mar 15, 2012 at 11:25 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Thu, Mar 15, 2012 at 9:17 PM, David Cournapeau <cournape at gmail.com>
> wrote:
> >
> >
> > On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com
> >
> > wrote:
> >>
> >> Hi,
> >>
> >> Am I right in thinking that float96 on windows 32 bit is a float64
> >> padded to 96 bits?
> >
> >
> > Yes
> >
> >>
> >>  If so, is it useful?
> >
> >
> > Yes: this is what allows you to use dtype to parse complex binary files
> > directly in numpy without having to care so much about those details. And
> > that's how it is defined on windows in any case (C standard only forces
> you
> > to have sizeof(long double) >= sizeof(double)).
>
> I propose then to rename this one to float64_96 .
>
> The nexp value in finfo(np.float96) is incorrect I believe, I'll make
> a ticket for it.
>
> >>  Has anyone got a windows64
> >> box to check float128 ?
> >
> >
> > Too lazy to check on my vm, but I am pretty sure it is 16 bytes on
> windows
> > 64.
>
> Should you have time to do that, could you confirm it's also a padded
> float64 and that nexp is still (incorrectly) 15?   That would be a
> great help,
>
> Thanks,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/4e8b3f41/attachment.html>

From ben.root at ou.edu  Fri Mar 16 00:36:28 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 15 Mar 2012 23:36:28 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAB6mnxLNMHLEMvreXgG_PBf-VTmsoy6xt-tnDda1OSfzbtMsdQ@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAB6mnxLNMHLEMvreXgG_PBf-VTmsoy6xt-tnDda1OSfzbtMsdQ@mail.gmail.com>
Message-ID: <CANNq6FnP0tQiYbA9=1dL61MtXuO5TAuNvXh3jBdPpJ7meNLsbQ@mail.gmail.com>

On Thursday, March 15, 2012, Charles R Harris <charlesr.harris at gmail.com>
wrote:
>
>
> On Thu, Mar 15, 2012 at 10:17 PM, David Cournapeau <cournape at gmail.com>
wrote:
>>
>>
>> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett <matthew.brett at gmail.com>
wrote:
>>>
>>> Hi,
>>>
>>> Am I right in thinking that float96 on windows 32 bit is a float64
>>> padded to 96 bits?
>>
>> Yes
>>
>>>
>>>  If so, is it useful?
>>
>> Yes: this is what allows you to use dtype to parse complex binary files
directly in numpy without having to care so much about those details. And
that's how it is defined on windows in any case (C standard only forces you
to have sizeof(long double) >= sizeof(double)).
>>
>>>
>>>  Has anyone got a windows64
>>> box to check float128 ?
>>
>> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on
windows 64.
>
> Wait, MSVC doesn't support extended precision, so how do we get doubles
padded to 96 bits? I think MINGW supports extended precision but the MS
libraries won't. Still, if it's doubles it should be 64 bits and float96
shouldn't exist. Doubles padded to 96 bits are 150% pointless.
>
> Chuck
>

There is a Microsoft joke in there, somewhere...

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/107bc9cc/attachment.html>

From matthew.brett at gmail.com  Fri Mar 16 00:38:24 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 21:38:24 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
Message-ID: <CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 9:33 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
>
> I just happened to have an xp64 VM running:
> My version of numpy (1.6.1) does not have float128 (see more below what I
> get in ipython session).
> If you need to test something else please let me know.

Thanks a lot - that's helpful.  What do you get for:

print np.finfo(np.longdouble)

?

Best,

Matthew


From kalatsky at gmail.com  Fri Mar 16 00:41:17 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Thu, 15 Mar 2012 23:41:17 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
Message-ID: <CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>

I does look like a joke.
Here is print np.finfo(np.longdouble)

In [2]: np.__version__
Out[2]: '1.6.1'

In [3]: np.flo
np.float        np.float32      np.float_       np.floor
np.float16      np.float64      np.floating     np.floor_divide

In [3]: print np.finfo(np.longdouble)
Machine parameters for float64
---------------------------------------------------------------------
precision= 15   resolution= 1e-15
machep=   -52   eps=        2.22044604925e-16
negep =   -53   epsneg=     1.11022302463e-16
minexp= -1022   tiny=       2.22507385851e-308
maxexp=  1024   max=        1.79769313486e+308
nexp  =    11   min=        -max
---------------------------------------------------------------------


On Thu, Mar 15, 2012 at 11:38 PM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> On Thu, Mar 15, 2012 at 9:33 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
> >
> > I just happened to have an xp64 VM running:
> > My version of numpy (1.6.1) does not have float128 (see more below what I
> > get in ipython session).
> > If you need to test something else please let me know.
>
> Thanks a lot - that's helpful.  What do you get for:
>
> print np.finfo(np.longdouble)
>
> ?
>
> Best,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/4695f4a1/attachment.html>

From warren.weckesser at enthought.com  Fri Mar 16 00:50:07 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 15 Mar 2012 23:50:07 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
Message-ID: <CAM-+wY_1ouGaiukzkAm3TO_kZFK77zYvCHLSN2pdb-oPzxMkPQ@mail.gmail.com>

On Thu, Mar 15, 2012 at 11:41 PM, Val Kalatsky <kalatsky at gmail.com> wrote:

> I does look like a joke.
> Here is print np.finfo(np.longdouble)
>
> In [2]: np.__version__
> Out[2]: '1.6.1'
>
> In [3]: np.flo
> np.float        np.float32      np.float_       np.floor
> np.float16      np.float64      np.floating     np.floor_divide
>
> In [3]: print np.finfo(np.longdouble)
> Machine parameters for float64
> ---------------------------------------------------------------------
> precision= 15   resolution= 1e-15
> machep=   -52   eps=        2.22044604925e-16
> negep =   -53   epsneg=     1.11022302463e-16
> minexp= -1022   tiny=       2.22507385851e-308
> maxexp=  1024   max=        1.79769313486e+308
> nexp  =    11   min=        -max
> ---------------------------------------------------------------------
>
>

In this case (Win64), np.longdouble is just an alias for np.float64:

In [16]: sys.version
Out[16]: '2.7.2 |EPD 7.2-2 (64-bit)| (default, Sep 14 2011, 11:25:00) [MSC
v.1500 64 bit (AMD64)]'

In [17]: np.__version__
Out[17]: '1.6.1'

In [18]: np.longdouble
Out[18]: numpy.float64


Warren


> On Thu, Mar 15, 2012 at 11:38 PM, Matthew Brett <matthew.brett at gmail.com>wrote:
>
>> Hi,
>>
>> On Thu, Mar 15, 2012 at 9:33 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
>> >
>> > I just happened to have an xp64 VM running:
>> > My version of numpy (1.6.1) does not have float128 (see more below what
>> I
>> > get in ipython session).
>> > If you need to test something else please let me know.
>>
>> Thanks a lot - that's helpful.  What do you get for:
>>
>> print np.finfo(np.longdouble)
>>
>> ?
>>
>> Best,
>>
>> Matthew
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120315/45e75bee/attachment.html>

From matthew.brett at gmail.com  Fri Mar 16 00:52:00 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 21:52:00 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
Message-ID: <CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 9:41 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
> I does look like a joke.
> Here is?print np.finfo(np.longdouble)
>
> In [2]: np.__version__
> Out[2]: '1.6.1'
>
> In [3]: np.flo
> np.float ? ? ? ?np.float32 ? ? ?np.float_ ? ? ? np.floor
> np.float16 ? ? ?np.float64 ? ? ?np.floating ? ? np.floor_divide
>
> In [3]: print np.finfo(np.longdouble)
> Machine parameters for float64
> ---------------------------------------------------------------------
> precision= 15 ? resolution= 1e-15
> machep= ? -52 ? eps= ? ? ? ?2.22044604925e-16
> negep = ? -53 ? epsneg= ? ? 1.11022302463e-16
> minexp= -1022 ? tiny= ? ? ? 2.22507385851e-308
> maxexp= ?1024 ? max= ? ? ? ?1.79769313486e+308
> nexp ?= ? ?11 ? min= ? ? ? ?-max
> ---------------------------------------------------------------------

Great - much easier on the eye - longdouble is float64 as expected.

Thanks,

Matthew


From ischnell at enthought.com  Fri Mar 16 01:17:08 2012
From: ischnell at enthought.com (Ilan Schnell)
Date: Fri, 16 Mar 2012 00:17:08 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
Message-ID: <CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>

I'm seeing the same thing on both (64 and 32-bit) Windows
EPD test machines.  I guess Windows does not support 128
bit floats.
I did some tests a few weeks ago, and discovered that also
on the Mac and Linux long long double is not really 128 bits.
If I remember correctly it was 80 bits: 1 (sign) + 16 (exp) + 63 (mantissa)

- Ilan


On Thu, Mar 15, 2012 at 11:52 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Thu, Mar 15, 2012 at 9:41 PM, Val Kalatsky <kalatsky at gmail.com> wrote:
>> I does look like a joke.
>> Here is?print np.finfo(np.longdouble)
>>
>> In [2]: np.__version__
>> Out[2]: '1.6.1'
>>
>> In [3]: np.flo
>> np.float ? ? ? ?np.float32 ? ? ?np.float_ ? ? ? np.floor
>> np.float16 ? ? ?np.float64 ? ? ?np.floating ? ? np.floor_divide
>>
>> In [3]: print np.finfo(np.longdouble)
>> Machine parameters for float64
>> ---------------------------------------------------------------------
>> precision= 15 ? resolution= 1e-15
>> machep= ? -52 ? eps= ? ? ? ?2.22044604925e-16
>> negep = ? -53 ? epsneg= ? ? 1.11022302463e-16
>> minexp= -1022 ? tiny= ? ? ? 2.22507385851e-308
>> maxexp= ?1024 ? max= ? ? ? ?1.79769313486e+308
>> nexp ?= ? ?11 ? min= ? ? ? ?-max
>> ---------------------------------------------------------------------
>
> Great - much easier on the eye - longdouble is float64 as expected.
>
> Thanks,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From matthew.brett at gmail.com  Fri Mar 16 01:22:48 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 22:22:48 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
Message-ID: <CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 10:17 PM, Ilan Schnell <ischnell at enthought.com> wrote:
> I'm seeing the same thing on both (64 and 32-bit) Windows
> EPD test machines. ?I guess Windows does not support 128
> bit floats.

Do you mean there is no float96 on windows 32 bit as I described at
the beginning of the thread?

> I did some tests a few weeks ago, and discovered that also
> on the Mac and Linux long long double is not really 128 bits.
> If I remember correctly it was 80 bits: 1 (sign) + 16 (exp) + 63 (mantissa)

Yes, that's right, on Intel for linux and OSX longdouble is 80 bit precision.

Thanks,

Matthew


From ischnell at enthought.com  Fri Mar 16 01:26:53 2012
From: ischnell at enthought.com (Ilan Schnell)
Date: Fri, 16 Mar 2012 00:26:53 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
	<CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
Message-ID: <CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>

To be more precise.  On both 32-bit and 64-bit Windows
machines I don't see.float96 as well as np.float128

- Ilan


On Fri, Mar 16, 2012 at 12:22 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Thu, Mar 15, 2012 at 10:17 PM, Ilan Schnell <ischnell at enthought.com> wrote:
>> I'm seeing the same thing on both (64 and 32-bit) Windows
>> EPD test machines. ?I guess Windows does not support 128
>> bit floats.
>
> Do you mean there is no float96 on windows 32 bit as I described at
> the beginning of the thread?
>
>> I did some tests a few weeks ago, and discovered that also
>> on the Mac and Linux long long double is not really 128 bits.
>> If I remember correctly it was 80 bits: 1 (sign) + 16 (exp) + 63 (mantissa)
>
> Yes, that's right, on Intel for linux and OSX longdouble is 80 bit precision.
>
> Thanks,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From matthew.brett at gmail.com  Fri Mar 16 01:40:27 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 15 Mar 2012 22:40:27 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
	<CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
	<CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>
Message-ID: <CAH6Pt5rVkF2UEUUoHA0-SoZ+zNFrDkdgFwqLxzjVV3_Gm+isKw@mail.gmail.com>

Hi,

On Thu, Mar 15, 2012 at 10:26 PM, Ilan Schnell <ischnell at enthought.com> wrote:
> To be more precise. ?On both 32-bit and 64-bit Windows
> machines I don't see.float96 as well as np.float128

Do you have any idea why I am seeing float96 and you are not?  I'm on
XP with the current sourceforge 1.6.1 exe installer with python.org
2.7 (and same for python.org 2.6 and numpy 1.5.1).

Thanks,

Matthew


From ischnell at enthought.com  Fri Mar 16 02:10:12 2012
From: ischnell at enthought.com (Ilan Schnell)
Date: Fri, 16 Mar 2012 01:10:12 -0500
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAH6Pt5rVkF2UEUUoHA0-SoZ+zNFrDkdgFwqLxzjVV3_Gm+isKw@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
	<CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
	<CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>
	<CAH6Pt5rVkF2UEUUoHA0-SoZ+zNFrDkdgFwqLxzjVV3_Gm+isKw@mail.gmail.com>
Message-ID: <CAAUn5qJ09dgSH-zYfFuQiFBe1Y5oERh=f0trow7ih7MrFLC-4Q@mail.gmail.com>

I just did a quick test across all supported EPD platforms:
win-64: float96 No, float128 No
win-32: float96 No, float128 No
osx-64: float96 No, float128 Yes
osx-32: float96 No, float128 Yes
rh3-64: float96 No, float128 Yes
rh3-32: float96 Yes, float128 No
rh5-64: float96 No, float128 Yes
rh5-32: float96 Yes, float128 No
sol-64: float96 No, float128 Yes
sol-32: float96 Yes, float128 No

I have no explanation for this, but I'm guessing David C. has.
I'll look more into this tomorrow.

- Ilan


On Fri, Mar 16, 2012 at 12:40 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Thu, Mar 15, 2012 at 10:26 PM, Ilan Schnell <ischnell at enthought.com> wrote:
>> To be more precise. ?On both 32-bit and 64-bit Windows
>> machines I don't see.float96 as well as np.float128
>
> Do you have any idea why I am seeing float96 and you are not? ?I'm on
> XP with the current sourceforge 1.6.1 exe installer with python.org
> 2.7 (and same for python.org 2.6 and numpy 1.5.1).
>
> Thanks,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From ralf.gommers at googlemail.com  Fri Mar 16 03:44:13 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 16 Mar 2012 08:44:13 +0100
Subject: [Numpy-discussion] Commit PR 229 breaks build
In-Reply-To: <CAB6mnxKn9GWb++bzc9KkD2tqXQ+KoexP4ghJmYux=xWJpHhWxg@mail.gmail.com>
References: <CAB6mnxKn9GWb++bzc9KkD2tqXQ+KoexP4ghJmYux=xWJpHhWxg@mail.gmail.com>
Message-ID: <CABL7CQjV0Ji6bf9fgZrsqicAmegd9x_oEDyzrJ7Vo-fSAswhmA@mail.gmail.com>

On Fri, Mar 16, 2012 at 12:45 AM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

> On my machines anyway.
>
> Running from numpy source directory.
> non-existing path in 'numpy/distutils': 'site.cfg'
> F2PY Version 2
> numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected,
> the C API version numbers have to be updated. Current C api version is 6,
> with checksum eb54c77ff4149bab310324cd7c0cb176, but recorded checksum for C
> API version 6 in codegen_dir/cversions.txt is
> e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you
> have to update C_API_VERSION  in numpy/core/setup_common.pyc.
>   MismatchCAPIWarning)
> blas_opt_info:
> blas_mkl_info:
>   libraries mkl,vml,guide not found in ['/usr/local/lib64',
> '/usr/local/lib', '/usr/lib64', '/usr/lib']
>   NOT AVAILABLE
>
> atlas_blas_threads_info:
> Setting PTATLAS=ATLAS
> Setting PTATLAS=ATLAS
> Traceback (most recent call last):
>   File "setup.py", line 214, in <module>
>     setup_package()
>   File "setup.py", line 207, in setup_package
>     configuration=configuration )
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/core.py", line
> 152, in setup
>     config = configuration()
>   File "setup.py", line 147, in configuration
>     config.add_subpackage('numpy')
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 1002, in add_subpackage
>     caller_level = 2)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 971, in get_subpackage
>     caller_level = caller_level + 1)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 908, in _get_configuration_from_setup_py
>     config = setup_module.configuration(*args)
>   File "numpy/setup.py", line 9, in configuration
>     config.add_subpackage('core')
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 1002, in add_subpackage
>     caller_level = 2)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 971, in get_subpackage
>     caller_level = caller_level + 1)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
> line 908, in _get_configuration_from_setup_py
>     config = setup_module.configuration(*args)
>   File "numpy/core/setup.py", line 901, in configuration
>     blas_info = get_info('blas_opt',0)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
> line 325, in get_info
>     return cl().get_info(notfound_action)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
> line 478, in get_info
>     self.calc_info()
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
> line 1465, in calc_info
>     atlas_info = get_info('atlas_blas_threads')
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
> line 325, in get_info
>     return cl().get_info(notfound_action)
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py",
> line 478, in get_info
>     self.calc_info()
>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py
> numpy/distutils/system_info.py", line 1090, in calc_info
>     dict_append(atlas, **atlas_extra_info)
> NameError: global name 'atlas_extra_info' is not defined
>
>
> Commenting out line 1090 in numpy/distutils/system_info.py fixes the
> problem, but I'm pretty sure that isn't the right fix.
>

Does it work when you add:
    atlas_version, atlas_extra_info = get_atlas_version(**atlas)
on line 1090?

I thought the testing was extensive enough (we did 3 machines, different
OSes and site.cfg's), but apparently not. Sorry about that.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/b37d2a4e/attachment.html>

From madsipsen at gmail.com  Fri Mar 16 07:04:17 2012
From: madsipsen at gmail.com (Mads Ipsen)
Date: Fri, 16 Mar 2012 12:04:17 +0100
Subject: [Numpy-discussion] Minus zero
Message-ID: <4F631E31.8010709@gmail.com>

Here's an array

a = [[ -2.66453526e-15,   4.49564793e-02,   1.14401980e+00],
      [  2.02475000e+00,   2.06970648e+00,   1.14401980e+00],
      [  2.02475000e+00,   4.49564793e-02,   3.16876980e+00],
      [ -2.66453526e-15,   2.06970648e+00,   3.16876980e+00],
      [ -2.66453526e-15,   4.49564793e-02,   5.19351980e+00],
      [  2.02475000e+00,   2.06970648e+00,   5.19351980e+00],
      [  2.02475000e+00,   4.49564793e-02,   7.21826980e+00],
      [  0.00000000e+00,   2.02475000e+00,   7.08662500e+00]]

Now, call

   array_repr(a, precision=12, suppress_small=True)

which gives the output

array([[-0.          ,  0.0449564793,  1.1440198   ],
        [ 2.02475     ,  2.06970648  ,  1.1440198   ],
        [ 2.02475     ,  0.0449564793,  3.1687698   ],
        [-0.          ,  2.06970648  ,  3.1687698   ],
        [-0.          ,  0.0449564793,  5.1935198   ],
        [ 2.02475     ,  2.06970648  ,  5.1935198   ],
        [ 2.02475     ,  0.0449564793,  7.2182698   ],
        [ 0.          ,  2.02475     ,  7.086625    ]])

* Since the flag 'suppress_small=True' is used, I'd expect a '0.' 
instead of a '-0.' or maybe this is intentional.
* Also, wouldn't '0.0' instead of '0.' be more readable (and there's 
certainly space enough to add it)?

Best regards,

Mads


-- 
+-----------------------------------------------------+
| Mads Ipsen                                          |
+----------------------+------------------------------+
| G?seb?ksvej 7, 4. tv |                              |
| DK-2500 Valby        | phone:          +45-29716388 |
| Denmark              | email:  mads.ipsen at gmail.com |
+----------------------+------------------------------+


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/15a2e721/attachment.html>

From charlesr.harris at gmail.com  Fri Mar 16 07:49:56 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 16 Mar 2012 05:49:56 -0600
Subject: [Numpy-discussion] Commit PR 229 breaks build
In-Reply-To: <CABL7CQjV0Ji6bf9fgZrsqicAmegd9x_oEDyzrJ7Vo-fSAswhmA@mail.gmail.com>
References: <CAB6mnxKn9GWb++bzc9KkD2tqXQ+KoexP4ghJmYux=xWJpHhWxg@mail.gmail.com>
	<CABL7CQjV0Ji6bf9fgZrsqicAmegd9x_oEDyzrJ7Vo-fSAswhmA@mail.gmail.com>
Message-ID: <CAB6mnxLFnxNfnJP743BXsLEE=ziEuQW042vsrZ8sjEzL=j4fBA@mail.gmail.com>

On Fri, Mar 16, 2012 at 1:44 AM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

>
>
> On Fri, Mar 16, 2012 at 12:45 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> On my machines anyway.
>>
>> Running from numpy source directory.
>> non-existing path in 'numpy/distutils': 'site.cfg'
>> F2PY Version 2
>> numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch
>> detected, the C API version numbers have to be updated. Current C api
>> version is 6, with checksum eb54c77ff4149bab310324cd7c0cb176, but recorded
>> checksum for C API version 6 in codegen_dir/cversions.txt is
>> e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you
>> have to update C_API_VERSION  in numpy/core/setup_common.pyc.
>>   MismatchCAPIWarning)
>> blas_opt_info:
>> blas_mkl_info:
>>   libraries mkl,vml,guide not found in ['/usr/local/lib64',
>> '/usr/local/lib', '/usr/lib64', '/usr/lib']
>>   NOT AVAILABLE
>>
>> atlas_blas_threads_info:
>> Setting PTATLAS=ATLAS
>> Setting PTATLAS=ATLAS
>> Traceback (most recent call last):
>>   File "setup.py", line 214, in <module>
>>     setup_package()
>>   File "setup.py", line 207, in setup_package
>>     configuration=configuration )
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/core.py", line
>> 152, in setup
>>     config = configuration()
>>   File "setup.py", line 147, in configuration
>>     config.add_subpackage('numpy')
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 1002, in add_subpackage
>>     caller_level = 2)
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 971, in get_subpackage
>>     caller_level = caller_level + 1)
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 908, in _get_configuration_from_setup_py
>>     config = setup_module.configuration(*args)
>>   File "numpy/setup.py", line 9, in configuration
>>     config.add_subpackage('core')
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 1002, in add_subpackage
>>     caller_level = 2)
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 971, in get_subpackage
>>     caller_level = caller_level + 1)
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/misc_util.py",
>> line 908, in _get_configuration_from_setup_py
>>     config = setup_module.configuration(*args)
>>   File "numpy/core/setup.py", line 901, in configuration
>>     blas_info = get_info('blas_opt',0)
>>   File
>> "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py", line
>> 325, in get_info
>>     return cl().get_info(notfound_action)
>>   File
>> "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py", line
>> 478, in get_info
>>     self.calc_info()
>>   File
>> "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py", line
>> 1465, in calc_info
>>     atlas_info = get_info('atlas_blas_threads')
>>   File
>> "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py", line
>> 325, in get_info
>>     return cl().get_info(notfound_action)
>>   File
>> "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py", line
>> 478, in get_info
>>     self.calc_info()
>>   File "/home/charris/Workspace/numpy.git/numpy/distutils/system_info.py
>> numpy/distutils/system_info.py", line 1090, in calc_info
>>     dict_append(atlas, **atlas_extra_info)
>> NameError: global name 'atlas_extra_info' is not defined
>>
>>
>> Commenting out line 1090 in numpy/distutils/system_info.py fixes the
>> problem, but I'm pretty sure that isn't the right fix.
>>
>
> Does it work when you add:
>     atlas_version, atlas_extra_info = get_atlas_version(**atlas)
> on line 1090?
>
> I thought the testing was extensive enough (we did 3 machines, different
> OSes and site.cfg's), but apparently not. Sorry about that.
>
>
That fixes the problem for me.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/1840d6b7/attachment.html>

From josef.pktd at gmail.com  Fri Mar 16 08:36:20 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 16 Mar 2012 08:36:20 -0400
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAAUn5qJ09dgSH-zYfFuQiFBe1Y5oERh=f0trow7ih7MrFLC-4Q@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
	<CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
	<CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>
	<CAH6Pt5rVkF2UEUUoHA0-SoZ+zNFrDkdgFwqLxzjVV3_Gm+isKw@mail.gmail.com>
	<CAAUn5qJ09dgSH-zYfFuQiFBe1Y5oERh=f0trow7ih7MrFLC-4Q@mail.gmail.com>
Message-ID: <CAMMTP+Ar04b=Aui6orXHK=PyF9CZv2ds5SGRDQ9qVspWYTLNjA@mail.gmail.com>

On Fri, Mar 16, 2012 at 2:10 AM, Ilan Schnell <ischnell at enthought.com> wrote:
> I just did a quick test across all supported EPD platforms:
> win-64: float96 No, float128 No
> win-32: float96 No, float128 No
> osx-64: float96 No, float128 Yes
> osx-32: float96 No, float128 Yes
> rh3-64: float96 No, float128 Yes
> rh3-32: float96 Yes, float128 No
> rh5-64: float96 No, float128 Yes
> rh5-32: float96 Yes, float128 No
> sol-64: float96 No, float128 Yes
> sol-32: float96 Yes, float128 No


numpy 1.5.1 MingW, on python 2.6 win32 has float96, float128 no
numpy 1.6.1 Gohlke (MKL I think) on python 3.2 win64   no float96, no float128

Josef

>
> I have no explanation for this, but I'm guessing David C. has.
> I'll look more into this tomorrow.
>
> - Ilan
>
>
> On Fri, Mar 16, 2012 at 12:40 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> Hi,
>>
>> On Thu, Mar 15, 2012 at 10:26 PM, Ilan Schnell <ischnell at enthought.com> wrote:
>>> To be more precise. ?On both 32-bit and 64-bit Windows
>>> machines I don't see.float96 as well as np.float128
>>
>> Do you have any idea why I am seeing float96 and you are not? ?I'm on
>> XP with the current sourceforge 1.6.1 exe installer with python.org
>> 2.7 (and same for python.org 2.6 and numpy 1.5.1).
>>
>> Thanks,
>>
>> Matthew
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From njs at pobox.com  Fri Mar 16 11:07:25 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 16 Mar 2012 15:07:25 +0000
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CABDkGQneh12fC9UrT_aAHX7LHQSY7+k8ehO6b2cwN4gjSxVuvw@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
	<CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
	<CABDkGQneh12fC9UrT_aAHX7LHQSY7+k8ehO6b2cwN4gjSxVuvw@mail.gmail.com>
Message-ID: <CAPJVwBm0+VnRcqWhGfJPfQ9Rm75S1jc0pC9JjeEDo=dAbcPsRw@mail.gmail.com>

On Mar 16, 2012 1:02 AM, "St?fan van der Walt" <stefan
<stefan at sun.ac.za>@<stefan at sun.ac.za>
sun.ac.za <stefan at sun.ac.za>> wrote:
>
> On Thu, Mar 15, 2012 at 4:02 PM, Nathaniel Smith <njs <njs at pobox.com>@<njs at pobox.com>
pobox.com <njs at pobox.com>> wrote:
> > I'm not sure what it would even mean to treat this kind of data as
> > "flags", since you can't take the bitwise-or of two strings...
>
> This makes a more sense outside of ndarrays, where you would do something
like:
>
> enum FLAG0 = 1, FLAG1 = 2, FLAG2 = 4
> do_something(data, mode=FLAG0 & FLAG2)
>
> The enum is therefore just a handle for its numerical value.  While it
> may not be that useful in an array, I think Mark was just pointing out
> that there may be other similar use cases, such as enumerating from 0
> to N-1, or in reverse from N-1 down to 0, or in steps of 2, or in
> powers of 2, etc.

Right, there may be. But are there? That's the question :-)

It looks like R doesn't support anything except 1, ..., N numbering.
There's really no reason it would either, since in their design the
underlying integer values are almost entirely hidden from users. You could
get at them if you wanted, but I bet less than 1% of users are even aware
that factors and integers have anything to do with each other. Factors are
just documented to be a way to store an array of strings drawn from a
limited ordered list. (The ordering is important for things like polynomial
coding and treatment versus baseline coding.)

HDF5 supports arbitrary symbol<->integer mappings.

0, ..., N-1 coding makes the common problem of creating an indicator matrix
very convenient:
ind = np.zeros((enum_a.length, len(enum_.dtype.levels)), dtype=bool)
ind[:, enum_a.view(dtype=np.int32)] = True

But we can't restrict ourselves to only this coding if we want
compatibility with HDF5 or R (because R is 1-based). So I guess supporting
arbitrary mappings is worth it - though I doubt this flexibility will be
used much. I'm curious if anyone can think of a reason they'd use it
besides interoperability.

Cheers,
- Nathaniel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/499ebe24/attachment.html>

From bryanv at continuum.io  Fri Mar 16 12:26:01 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Fri, 16 Mar 2012 11:26:01 -0500
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <CAPJVwBm0+VnRcqWhGfJPfQ9Rm75S1jc0pC9JjeEDo=dAbcPsRw@mail.gmail.com>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
	<CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
	<CABDkGQneh12fC9UrT_aAHX7LHQSY7+k8ehO6b2cwN4gjSxVuvw@mail.gmail.com>
	<CAPJVwBm0+VnRcqWhGfJPfQ9Rm75S1jc0pC9JjeEDo=dAbcPsRw@mail.gmail.com>
Message-ID: <4F636999.90507@continuum.io>

Hi all,

I have spent some time thinking about things, and discussing them with 
folks nearby. I actually got to wondering whether we really need new 
dtypes for this. It seems like enumerated values or factor levels could 
be cast as an annotation or metadata that could be attached to any 
existing integral dtypes. It spells differently enough that I have put 
up an alternate version that reflects this notion. I'd like to see what 
folks think of this direction:

     https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum_alt.rst

So this would require adding machinery to existing dtypes to behave 
properly when there is factor metadata present. Perhaps that is not an 
acceptable trade-off, but it seems worth discussing.

I think a very similar approach could be used to add categorical ranges 
to any numerical or string types (I think they are called "shingles" in R?)

Please let me know what you think.

Bryan


From chaoyuejoy at gmail.com  Fri Mar 16 13:57:33 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Fri, 16 Mar 2012 18:57:33 +0100
Subject: [Numpy-discussion] remove redundant dimension in a ndarray?
Message-ID: <CAAN-aRFMbs6JMS2e6V+6Vu5Tw=OGGJL2A9L93G7af3RCzWLnTg@mail.gmail.com>

Dear all,

Do we have a function in numpy that can automatically "shrink" a ndarray
with redundant dimension?

like I have a ndarray with shape of (13,1,1,160,1), now I have written a
small function to change the array to dimension of (13,160) [reduce the
extra dimension with length as 1].
but I just would like to know maybe there is already something which can do
this there ?

cheers,

Chao

-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/03b570d4/attachment.html>

From derek at astro.physik.uni-goettingen.de  Fri Mar 16 13:59:15 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Fri, 16 Mar 2012 18:59:15 +0100
Subject: [Numpy-discussion] remove redundant dimension in a ndarray?
In-Reply-To: <CAAN-aRFMbs6JMS2e6V+6Vu5Tw=OGGJL2A9L93G7af3RCzWLnTg@mail.gmail.com>
References: <CAAN-aRFMbs6JMS2e6V+6Vu5Tw=OGGJL2A9L93G7af3RCzWLnTg@mail.gmail.com>
Message-ID: <CA63FA25-A17D-4364-8424-7F18FDDB89E4@astro.physik.uni-goettingen.de>

Dear Chao,

> Do we have a function in numpy that can automatically "shrink" a ndarray with redundant dimension?
> 
> like I have a ndarray with shape of (13,1,1,160,1), now I have written a small function to change the array to dimension of (13,160) [reduce the extra dimension with length as 1].
> but I just would like to know maybe there is already something which can do this there ?

np.squeeze

Cheers,
				Derek


From ralf.gommers at googlemail.com  Fri Mar 16 18:00:31 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 16 Mar 2012 23:00:31 +0100
Subject: [Numpy-discussion] unique along axis?
In-Reply-To: <jjlhc9$rek$1@dough.gmane.org>
References: <jjlhc9$rek$1@dough.gmane.org>
Message-ID: <CABL7CQiy9v-mq5idKRBFT_b_3rtge5VHqzu68tLCSY37zHy-UQ@mail.gmail.com>

On Mon, Mar 12, 2012 at 8:04 PM, Neal Becker <ndbecker2 at gmail.com> wrote:

> I see unique does not take an axis arg.
>
> Suggested way to apply unique to each column of a 2d array?
>

A for-loop?

I'm guessing that there isn't an axis keyword because the number of unique
elements per column may not be the same, so you can't return an ndarray.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/1adba99e/attachment.html>

From glenjenness at gmail.com  Fri Mar 16 21:00:28 2012
From: glenjenness at gmail.com (Glen Jenness)
Date: Fri, 16 Mar 2012 20:00:28 -0500
Subject: [Numpy-discussion] numpy + MKL problems
Message-ID: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>

Dear users,
I was playing around with my numpy configuration, and it is now no longer
working.  Whenever I try to run anything with it, I get:

...MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/em64t/: cannot read file
data: Is a directory

My site.cfg file is (in case it helps!):
[DEFAULT]
library_dirs = /usr/lib
include_dirs = /usr/include
[fftw]
libraries = fftw3
[mkl]
library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t
include_dirs = /opt/intel/mkl/10.0.3.020/include
mkl_libs = mkl_intel_lp64,mkl_intel_thread,mkl_core

I have followed the directions for installing it from source, and it was
working fine earlier today.  But I'm not really sure what I had changed so
it would go from working to not working.

Any advice would be appreciated!
Dr. Glen Jenness
Schmidt Group
Department of Chemistry
University of Wisconsin - Madison
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/1b6aafb7/attachment.html>

From francesc at continuum.io  Fri Mar 16 21:17:05 2012
From: francesc at continuum.io (Francesc Alted)
Date: Fri, 16 Mar 2012 20:17:05 -0500
Subject: [Numpy-discussion] numpy + MKL problems
In-Reply-To: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>
References: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>
Message-ID: <3B030CF2-5B85-4220-AA41-090D2D28F7B1@continuum.io>

On Mar 16, 2012, at 8:00 PM, Glen Jenness wrote:

> Dear users,
> I was playing around with my numpy configuration, and it is now no longer working.  Whenever I try to run anything with it, I get:
> 
> ...MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/em64t/: cannot read file data: Is a directory
> 
> My site.cfg file is (in case it helps!):
> [DEFAULT]
> library_dirs = /usr/lib
> include_dirs = /usr/include
> [fftw]
> libraries = fftw3
> [mkl]
> library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t
> include_dirs = /opt/intel/mkl/10.0.3.020/include
> mkl_libs = mkl_intel_lp64,mkl_intel_thread,mkl_core
> 
> I have followed the directions for installing it from source, and it was working fine earlier today.  But I'm not really sure what I had changed so it would go from working to not working.

This might be related with:

http://projects.scipy.org/numpy/ticket/993

being fixed in the last few hours.  Could you please bisect (http://webchick.net/node/99) and tell us which commit is the bad one?

Thanks!

-- Francesc Alted


From glenjenness at gmail.com  Fri Mar 16 21:28:39 2012
From: glenjenness at gmail.com (Glen Jenness)
Date: Fri, 16 Mar 2012 20:28:39 -0500
Subject: [Numpy-discussion] numpy + MKL problems
In-Reply-To: <3B030CF2-5B85-4220-AA41-090D2D28F7B1@continuum.io>
References: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>
	<3B030CF2-5B85-4220-AA41-090D2D28F7B1@continuum.io>
Message-ID: <CAHrrQppo4YVGfRL+Zu+6jMxd599gu_np_hWN4+ETHrB4Z858vA@mail.gmail.com>

Fransesc,
I don't think that's the problem-I'm working with a version of numpy I had
downloaded back in Dec/Jan and have not updated it since.

However, doing a "LD_PRELOAD" seems to have fixed the problem, but now
whenever I'm running a script I get:

 python 2.4.3 GCC 4.1.2 20070626 (Red Hat 4.1.2-14) 64bit ELF on Linux
x86_64 redhat 5.2 Final
python: symbol lookup error: /opt/intel/mkl/
10.0.3.020/lib/em64t/libmkl_lapack.so: undefined symbol: mkl_lapack_dgetrf

Glen

On Fri, Mar 16, 2012 at 8:17 PM, Francesc Alted <francesc at continuum.io>wrote:

> On Mar 16, 2012, at 8:00 PM, Glen Jenness wrote:
>
> > Dear users,
> > I was playing around with my numpy configuration, and it is now no
> longer working.  Whenever I try to run anything with it, I get:
> >
> > ...MKL FATAL ERROR: /opt/intel/mkl/10.0.3.020/lib/em64t/: cannot read
> file data: Is a directory
> >
> > My site.cfg file is (in case it helps!):
> > [DEFAULT]
> > library_dirs = /usr/lib
> > include_dirs = /usr/include
> > [fftw]
> > libraries = fftw3
> > [mkl]
> > library_dirs = /opt/intel/mkl/10.0.3.020/lib/em64t
> > include_dirs = /opt/intel/mkl/10.0.3.020/include
> > mkl_libs = mkl_intel_lp64,mkl_intel_thread,mkl_core
> >
> > I have followed the directions for installing it from source, and it was
> working fine earlier today.  But I'm not really sure what I had changed so
> it would go from working to not working.
>
> This might be related with:
>
> http://projects.scipy.org/numpy/ticket/993
>
> being fixed in the last few hours.  Could you please bisect (
> http://webchick.net/node/99) and tell us which commit is the bad one?
>
> Thanks!
>
> -- Francesc Alted
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
Dr. Glen Jenness
Schmidt Group
Department of Chemistry
University of Wisconsin - Madison
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/bd5969b6/attachment.html>

From francesc at continuum.io  Fri Mar 16 21:35:54 2012
From: francesc at continuum.io (Francesc Alted)
Date: Fri, 16 Mar 2012 20:35:54 -0500
Subject: [Numpy-discussion] numpy + MKL problems
In-Reply-To: <CAHrrQppo4YVGfRL+Zu+6jMxd599gu_np_hWN4+ETHrB4Z858vA@mail.gmail.com>
References: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>
	<3B030CF2-5B85-4220-AA41-090D2D28F7B1@continuum.io>
	<CAHrrQppo4YVGfRL+Zu+6jMxd599gu_np_hWN4+ETHrB4Z858vA@mail.gmail.com>
Message-ID: <19593150-7F9E-40F6-BD13-D1074BE87D5A@continuum.io>

On Mar 16, 2012, at 8:28 PM, Glen Jenness wrote:

> Fransesc,
> I don't think that's the problem-I'm working with a version of numpy I had downloaded back in Dec/Jan and have not updated it since.
> 
> However, doing a "LD_PRELOAD" seems to have fixed the problem, but now whenever I'm running a script I get:
> 
>  python 2.4.3 GCC 4.1.2 20070626 (Red Hat 4.1.2-14) 64bit ELF on Linux x86_64 redhat 5.2 Final
> python: symbol lookup error: /opt/intel/mkl/10.0.3.020/lib/em64t/libmkl_lapack.so: undefined symbol: mkl_lapack_dgetrf

So, if numpy has not changed, then something else does, right?  Have you upgraded MKL? GCC? Installed Intel C compiler?

-- Francesc Alted


From matthew.brett at gmail.com  Fri Mar 16 21:39:18 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 16 Mar 2012 18:39:18 -0700
Subject: [Numpy-discussion] float96 on windows32 is float64?
In-Reply-To: <CAMMTP+Ar04b=Aui6orXHK=PyF9CZv2ds5SGRDQ9qVspWYTLNjA@mail.gmail.com>
References: <CAH6Pt5oqmk3EVL4_ee8BZjnS53tpfNfTJeuS__RsGtZYJCvr2w@mail.gmail.com>
	<CAGY4rcXYcQoHnxiLTQrJHOwp5UvgfP2xqtUozSzU0jxH-D3=yA@mail.gmail.com>
	<CAH6Pt5oVh2-msSNnZt+TPrtKRq2vGV5-Sp37vaJjigxjGsb5nw@mail.gmail.com>
	<CAE8bXEmT88FW8==nuUyAXAZ_75Ub8ah0gY=XG8GjTURQd1dULA@mail.gmail.com>
	<CAH6Pt5p__vi=_mqB9v3nfspKBr4S+5UQ6CdQ_pPgFBDTgtrkgw@mail.gmail.com>
	<CAE8bXEmQiSzJFNoXoOb9y=rYJWT1K6YT-9jNS7ddYKW3FnJ+qQ@mail.gmail.com>
	<CAH6Pt5pFHLSUAui1XoTztGh3GwGXP+3x2ao_TO0QCDZzRO4X5g@mail.gmail.com>
	<CAAUn5qKg0RPP_tM8uQ85i=Eai197pWKNxNMn1YzzUc=yZD=JOw@mail.gmail.com>
	<CAH6Pt5qNJd+qnWKtVvuLZZAHpSGiScusG0k2jLpJv6tXhsFM2Q@mail.gmail.com>
	<CAAUn5q+G1bUVm5byhJNDzof_Vay2UyYRGt7DKwGONSwb0MDt7A@mail.gmail.com>
	<CAH6Pt5rVkF2UEUUoHA0-SoZ+zNFrDkdgFwqLxzjVV3_Gm+isKw@mail.gmail.com>
	<CAAUn5qJ09dgSH-zYfFuQiFBe1Y5oERh=f0trow7ih7MrFLC-4Q@mail.gmail.com>
	<CAMMTP+Ar04b=Aui6orXHK=PyF9CZv2ds5SGRDQ9qVspWYTLNjA@mail.gmail.com>
Message-ID: <CAH6Pt5ota1GFdGdmc6-aNKFXizi38wO=GJfU+LykzXsURRw3nA@mail.gmail.com>

Hi,

On Fri, Mar 16, 2012 at 5:36 AM,  <josef.pktd at gmail.com> wrote:
> On Fri, Mar 16, 2012 at 2:10 AM, Ilan Schnell <ischnell at enthought.com> wrote:
>> I just did a quick test across all supported EPD platforms:
>> win-64: float96 No, float128 No
>> win-32: float96 No, float128 No
>> osx-64: float96 No, float128 Yes
>> osx-32: float96 No, float128 Yes
>> rh3-64: float96 No, float128 Yes
>> rh3-32: float96 Yes, float128 No
>> rh5-64: float96 No, float128 Yes
>> rh5-32: float96 Yes, float128 No
>> sol-64: float96 No, float128 Yes
>> sol-32: float96 Yes, float128 No
>
>
> numpy 1.5.1 MingW, on python 2.6 win32 has float96, float128 no
> numpy 1.6.1 Gohlke (MKL I think) on python 3.2 win64 ? no float96, no float128
>
> Josef
>
>>
>> I have no explanation for this, but I'm guessing David C. has.
>> I'll look more into this tomorrow.
>>
>> - Ilan

Oh dear - I completely forgot the previous thread that I started on
this : http://mail.scipy.org/pipermail/numpy-discussion/2011-November/059233.html

You young people, don't laugh, this will happen to you one day.

Anyway, summarizing, it appears that windows float96:

a) Is stored as an 80 bit extended precision number
b) Uses float64 precision for all calculations.
c) Is specific to MingW builds of numpy - I think.

Perhaps David C you'll correct me if I've got that wrong,

Best,

Matthew


From glenjenness at gmail.com  Fri Mar 16 21:56:37 2012
From: glenjenness at gmail.com (Glen Jenness)
Date: Fri, 16 Mar 2012 20:56:37 -0500
Subject: [Numpy-discussion] numpy + MKL problems
In-Reply-To: <19593150-7F9E-40F6-BD13-D1074BE87D5A@continuum.io>
References: <CAHrrQpqiSsBY1wyjpY-wwbnkesGxY-3hyyAUUuWeK2oK7g8czg@mail.gmail.com>
	<3B030CF2-5B85-4220-AA41-090D2D28F7B1@continuum.io>
	<CAHrrQppo4YVGfRL+Zu+6jMxd599gu_np_hWN4+ETHrB4Z858vA@mail.gmail.com>
	<19593150-7F9E-40F6-BD13-D1074BE87D5A@continuum.io>
Message-ID: <CAHrrQpoi+NzAh3Fg-W26KhQW7sz69Qd7c+PenNgCDp3KrXZVGQ@mail.gmail.com>

None of that-this is why it has been so frustrating!

Only thing I did was remove the folders in the ./build directory in order
to get a cleaner slate, and just changed a couple libraries in site.cfg,
and got the error, and it persisted even when I went back to the libraries
I had given it when I got it to work.  So I am really confused as to what
could have caused the issue to start with.

Anyways, it seems okay now-I had to add mkl_lapack to the library list.
 Everything isn't 100% yet, but I can now work with it.

On Fri, Mar 16, 2012 at 8:35 PM, Francesc Alted <francesc at continuum.io>wrote:

> On Mar 16, 2012, at 8:28 PM, Glen Jenness wrote:
>
> > Fransesc,
> > I don't think that's the problem-I'm working with a version of numpy I
> had downloaded back in Dec/Jan and have not updated it since.
> >
> > However, doing a "LD_PRELOAD" seems to have fixed the problem, but now
> whenever I'm running a script I get:
> >
> >  python 2.4.3 GCC 4.1.2 20070626 (Red Hat 4.1.2-14) 64bit ELF on Linux
> x86_64 redhat 5.2 Final
> > python: symbol lookup error: /opt/intel/mkl/
> 10.0.3.020/lib/em64t/libmkl_lapack.so: undefined symbol: mkl_lapack_dgetrf
>
> So, if numpy has not changed, then something else does, right?  Have you
> upgraded MKL? GCC? Installed Intel C compiler?
>
> -- Francesc Alted
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
Dr. Glen Jenness
Schmidt Group
Department of Chemistry
University of Wisconsin - Madison
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120316/d9b0312f/attachment.html>

From matthew.brett at gmail.com  Sat Mar 17 04:24:34 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 17 Mar 2012 01:24:34 -0700
Subject: [Numpy-discussion] Build error on OSX from commit 72c6fbd
Message-ID: <CAH6Pt5r_o8mRnRHhx1qon151666vh3sSy-kxu_FsvLzXTDj8zQ@mail.gmail.com>

Hi,

As of commit 72c6fbd, I am getting the appended build error on OSX
10.6.8.  I couldn't immediately see what might have caused the
problem.

Cheers,

Matthew

...
creating build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot
compile options: '-DNO_ATLAS_INFO=3 -Inumpy/core/blasdot
-Inumpy/core/include
-Ibuild/src.macosx-10.3-fat-2.6/numpy/core/include/numpy
-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
-Inumpy/core/src/npymath -Inumpy/core/src/multiarray
-Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include
-I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6
-Ibuild/src.macosx-10.3-fat-2.6/numpy/core/src/multiarray
-Ibuild/src.macosx-10.3-fat-2.6/numpy/core/src/umath -c'
extra options: '-msse3 -I/System/Library/Frameworks/vecLib.framework/Headers'
gcc-4.0: numpy/core/blasdot/_dotblas.c
In file included from numpy/core/include/numpy/ndarraytypes.h:1972,
                 from numpy/core/include/numpy/ndarrayobject.h:17,
                 from numpy/core/include/numpy/arrayobject.h:15,
                 from numpy/core/blasdot/_dotblas.c:6:
numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning
"Using deprecated NumPy API, disable it by #defining
NPY_NO_DEPRECATED_API"
In file included from numpy/core/include/numpy/ndarraytypes.h:1972,
                 from numpy/core/include/numpy/ndarrayobject.h:17,
                 from numpy/core/include/numpy/arrayobject.h:15,
                 from numpy/core/blasdot/_dotblas.c:6:
numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning
"Using deprecated NumPy API, disable it by #defining
NPY_NO_DEPRECATED_API"
numpy/core/blasdot/_dotblas.c: In function ?dotblas_matrixproduct?:
numpy/core/blasdot/_dotblas.c: In function ?dotblas_matrixproduct?:
numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct
pointer types lacks a cast
numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of
?*(PyArray_API + 1120u)? from incompatible pointer type
numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of
?*(PyArray_API + 1120u)? from incompatible pointer type
numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct
pointer types lacks a cast
numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of
?*(PyArray_API + 1120u)? from incompatible pointer type
numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of
?*(PyArray_API + 1120u)? from incompatible pointer type
gcc-4.0 -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk
-g -bundle -undefined dynamic_lookup
build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot/_dotblas.o
-Lbuild/temp.macosx-10.3-fat-2.6 -o
build/lib.macosx-10.3-fat-2.6/numpy/core/_dotblas.so -Wl, -framework
-Wl,Accelerate
ld: file not found:
collect2: ld returned 1 exit status
ld: file not found:
collect2: ld returned 1 exit status
lipo: can't open input file:
/var/folders/jg/jgfZ12ZXHwGSFKD85xLpLk+++TI/-Tmp-//ccZil7bP.out (No
such file or directory)
ld: file not found:
collect2: ld returned 1 exit status
ld: file not found:
collect2: ld returned 1 exit status
lipo: can't open input file:
/var/folders/jg/jgfZ12ZXHwGSFKD85xLpLk+++TI/-Tmp-//ccZil7bP.out (No
such file or directory)
error: Command "gcc-4.0 -arch ppc -arch i386 -isysroot
/Developer/SDKs/MacOSX10.4u.sdk -g -bundle -undefined dynamic_lookup
build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot/_dotblas.o
-Lbuild/temp.macosx-10.3-fat-2.6 -o
build/lib.macosx-10.3-fat-2.6/numpy/core/_dotblas.so -Wl, -framework
-Wl,Accelerate" failed with exit status 1


From ralf.gommers at googlemail.com  Sat Mar 17 05:10:48 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 17 Mar 2012 10:10:48 +0100
Subject: [Numpy-discussion] Build error on OSX from commit 72c6fbd
In-Reply-To: <CAH6Pt5r_o8mRnRHhx1qon151666vh3sSy-kxu_FsvLzXTDj8zQ@mail.gmail.com>
References: <CAH6Pt5r_o8mRnRHhx1qon151666vh3sSy-kxu_FsvLzXTDj8zQ@mail.gmail.com>
Message-ID: <CABL7CQgBhscwEEf7_aZEfaWmW0zTv_GEYr05Q_pAqU1OW2kbow@mail.gmail.com>

On Sat, Mar 17, 2012 at 9:24 AM, Matthew Brett <matthew.brett at gmail.com>wrote:

> Hi,
>
> As of commit 72c6fbd, I am getting the appended build error on OSX
> 10.6.8.  I couldn't immediately see what might have caused the
> problem.
>

I can't reproduce it, but it should be fixed by
https://github.com/rgommers/numpy/commit/bca298fb3. Can you confirm that?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120317/631d4d57/attachment.html>

From njs at pobox.com  Sat Mar 17 13:11:13 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 17 Mar 2012 17:11:13 +0000
Subject: [Numpy-discussion] draft enum NEP
In-Reply-To: <4F636999.90507@continuum.io>
References: <4F5A35E4.6080501@continuum.io>
	<CAMRnEmrOhZ7-DNGyrQ_oRrJEzd=zeR8HjpZOfwEX5sEyYoSVDA@mail.gmail.com>
	<CAPJVwBmdXhyvmke52DJm22N1M=jDn6Tz6saUj3Nsccd9qiPu3Q@mail.gmail.com>
	<CABDkGQneh12fC9UrT_aAHX7LHQSY7+k8ehO6b2cwN4gjSxVuvw@mail.gmail.com>
	<CAPJVwBm0+VnRcqWhGfJPfQ9Rm75S1jc0pC9JjeEDo=dAbcPsRw@mail.gmail.com>
	<4F636999.90507@continuum.io>
Message-ID: <CAPJVwB=UpusQ15p-0vxasSnR0-14=LXv5nWwAmBYXJsMrx9n=A@mail.gmail.com>

On Fri, Mar 16, 2012 at 4:26 PM, Bryan Van de Ven <bryanv at continuum.io>
wrote:
> Hi all,
>
> I have spent some time thinking about things, and discussing them with
folks
> nearby. I actually got to wondering whether we really need new dtypes for
> this. It seems like enumerated values or factor levels could be cast as an
> annotation or metadata that could be attached to any existing integral
> dtypes. It spells differently enough that I have put up an alternate
version
> that reflects this notion. I'd like to see what folks think of this
> direction:
>
>    https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum_alt.rst
>
> So this would require adding machinery to existing dtypes to behave
properly
> when there is factor metadata present. Perhaps that is not an acceptable
> trade-off, but it seems worth discussing.

I took a look at this, but I think something was lost in the translation
from your head to text :-). Your description here makes it sound like
what's different about this proposal is that there's very different
underlying mechanics, but the enum_alt file just seems to describe an
alternative and more-or-less equivalent user-level API. Unless you told me,
I would have assumed that it just created a new dtype, rather than modified
existing ones.

What mechanism are you thinking of? Or did I miss something?

> I think a very similar approach could be used to add categorical ranges to
> any numerical or string types (I think they are called "shingles" in R?)

A 'shingle' is a way of mapping (floating point) numbers into categories.
However, they generally allow a single number to fall into multiple
categories. So for example, you might take these data points:

 1  2  3  4  5  6  7  8  9  10 11

And divide them into categories A, B, C like this:

 1  2  3  4  5  6  7  8  9  10  11
 AAAAAAAAAAAAA
          BBBBBBBBBBBBB
                   CCCCCCCCCCCCCCC

Which is why they're called "shingles" :-)

http://www.floridadisaster.org/hrg/images/roofs/shingle_loose_tab_large.jpg
This can be a very convenient data structure for various sorts of
visualizations, but I'm not sure how it would make sense to integrate it
into basic numerical types.

R has a more basic function called 'cut' which takes a numerical array plus
some specified breakpoints, and returns a factor array. But that's a simple
utility function that doesn't need any special features in the underlying
representation.

-- Nathaniel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120317/60f94bfb/attachment.html>

From matthew.brett at gmail.com  Sat Mar 17 15:45:10 2012
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sat, 17 Mar 2012 12:45:10 -0700
Subject: [Numpy-discussion] Build error on OSX from commit 72c6fbd
In-Reply-To: <CABL7CQgBhscwEEf7_aZEfaWmW0zTv_GEYr05Q_pAqU1OW2kbow@mail.gmail.com>
References: <CAH6Pt5r_o8mRnRHhx1qon151666vh3sSy-kxu_FsvLzXTDj8zQ@mail.gmail.com>
	<CABL7CQgBhscwEEf7_aZEfaWmW0zTv_GEYr05Q_pAqU1OW2kbow@mail.gmail.com>
Message-ID: <CAH6Pt5q5YjH9Xst4ZwOFT6Meg4WiM11rucfiLLL+_rbBCP3kzQ@mail.gmail.com>

Hi,

On Sat, Mar 17, 2012 at 2:10 AM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Sat, Mar 17, 2012 at 9:24 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> As of commit 72c6fbd, I am getting the appended build error on OSX
>> 10.6.8. ?I couldn't immediately see what might have caused the
>> problem.
>
>
> I can't reproduce it, but it should be fixed by
> https://github.com/rgommers/numpy/commit/bca298fb3. Can you confirm that?

Yes that fixes it - thanks for the quick reply,

Matthew


From avalle at famaf.unc.edu.ar  Sun Mar 18 10:05:04 2012
From: avalle at famaf.unc.edu.ar (avalle at famaf.unc.edu.ar)
Date: Sun, 18 Mar 2012 11:05:04 -0300
Subject: [Numpy-discussion] numpy installation problem?
Message-ID: <3c2ee5be3ebf58873f5944f9ab07d2d6.squirrel@webmail.famaf.unc.edu.ar>

Dear list,
I am having problems installing matplotlib (from source) and fipy.
I had installed numpy from source and it is running well:
:~$ python -c "import numpy; print numpy.__version__"
1.6.1
After being trying to solve this problem on matplotlib list, I was
recommended to ask numpy list.
I copy-paste the errors below and forward the mails from matplotlib users.

...
....
creating /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
Extracting FiPy-2.1.2-py2.7.egg to /usr/local/lib/python2.7/dist-packages
FiPy 2.1.2 is already the active version in easy-install.pth

Installed /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
Processing dependencies for FiPy==2.1.2
Finished processing dependencies for FiPy==2.1.2
Traceback (most recent call last):
  File "setup.py", line 594, in <module>
    __import__(pkg)
  File
"/usr/local/lib/python2.7/dist-packages/pysparse-1.2_dev224-py2.7-linux-x86_64.egg/pysparse/__init__.py",
line 6, in <module>
    from numpy._import_tools import PackageLoader
  File "/usr/local/lib/python2.7/dist-packages/numpy/__init__.py", line
137, in <module>
    import add_newdocs
  File "/usr/local/lib/python2.7/dist-packages/numpy/add_newdocs.py", line
9, in <module>
    from numpy.lib import add_newdoc
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/__init__.py",
line 13, in <module>
    from polynomial import *
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/polynomial.py",
line 11, in <module>
    import numpy.core.numeric as NX
AttributeError: 'module' object has no attribute 'core'
:~/FiPy-2.1.2$


:~/matplotlib$ sudo python setup.py install
basedirlist is: ['/usr/local', '/usr']
============================================================================
BUILDING MATPLOTLIB
            matplotlib: 1.2.x
                python: 2.7.2+ (default, Oct  4 2011, 20:06:09)  [GCC 4.6.1]
              platform: linux2

REQUIRED DEPENDENCIES
                 numpy: no
                        * You must install numpy 1.4 or later to build
                        * matplotlib.


I do not understand why matplotlib can not see numpy. Please, see the
forwarded message. My notebook description:
Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64
x86_64 x86_64 GNU/Linux
Thank in advance for your help
Regards,
Lucia
-------------- next part --------------
An embedded message was scrubbed...
From: avalle at famaf.unc.edu.ar
Subject: Re: [Matplotlib-users] problems installaling matplotlib on ubuntu 	11.10
Date: Wed, 7 Mar 2012 07:37:34 -0300
Size: 5253
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120318/1d9cc69c/attachment.mht>

From ben.root at ou.edu  Sun Mar 18 11:14:54 2012
From: ben.root at ou.edu (Benjamin Root)
Date: Sun, 18 Mar 2012 10:14:54 -0500
Subject: [Numpy-discussion] numpy installation problem?
In-Reply-To: <3c2ee5be3ebf58873f5944f9ab07d2d6.squirrel@webmail.famaf.unc.edu.ar>
References: <3c2ee5be3ebf58873f5944f9ab07d2d6.squirrel@webmail.famaf.unc.edu.ar>
Message-ID: <CANNq6F=XJYMSBCSZNoMBHCD3o+t9mPUadeKu-B-7h+fth-eM_A@mail.gmail.com>

On Sunday, March 18, 2012,  <avalle at famaf.unc.edu.ar> wrote:
> Dear list,
> I am having problems installing matplotlib (from source) and fipy.
> I had installed numpy from source and it is running well:
> :~$ python -c "import numpy; print numpy.__version__"
> 1.6.1
> After being trying to solve this problem on matplotlib list, I was
> recommended to ask numpy list.
> I copy-paste the errors below and forward the mails from matplotlib users.
>
> ...
> ....
> creating /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
> Extracting FiPy-2.1.2-py2.7.egg to /usr/local/lib/python2.7/dist-packages
> FiPy 2.1.2 is already the active version in easy-install.pth
>
> Installed /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
> Processing dependencies for FiPy==2.1.2
> Finished processing dependencies for FiPy==2.1.2
> Traceback (most recent call last):
>  File "setup.py", line 594, in <module>
>    __import__(pkg)
>  File
>
"/usr/local/lib/python2.7/dist-packages/pysparse-1.2_dev224-py2.7-linux-x86_64.egg/pysparse/__init__.py",
> line 6, in <module>
>    from numpy._import_tools import PackageLoader
>  File "/usr/local/lib/python2.7/dist-packages/numpy/__init__.py", line
> 137, in <module>
>    import add_newdocs
>  File "/usr/local/lib/python2.7/dist-packages/numpy/add_newdocs.py", line
> 9, in <module>
>    from numpy.lib import add_newdoc
>  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/__init__.py",
> line 13, in <module>
>    from polynomial import *
>  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/polynomial.py",
> line 11, in <module>
>    import numpy.core.numeric as NX
> AttributeError: 'module' object has no attribute 'core'
> :~/FiPy-2.1.2$
>
>
> :~/matplotlib$ sudo python setup.py install
> basedirlist is: ['/usr/local', '/usr']
>
============================================================================
> BUILDING MATPLOTLIB
>            matplotlib: 1.2.x
>                python: 2.7.2+ (default, Oct  4 2011, 20:06:09)  [GCC
4.6.1]
>              platform: linux2
>
> REQUIRED DEPENDENCIES
>                 numpy: no
>                        * You must install numpy 1.4 or later to build
>                        * matplotlib.
>
>
> I do not understand why matplotlib can not see numpy. Please, see the
> forwarded message. My notebook description:
> Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64
> x86_64 x86_64 GNU/Linux
> Thank in advance for your help
> Regards,
> Lucia
>

Just making sure, where/when did you get your mpl source?  There was a bug
in mpl master for a while that would not parse numpy's development version
number correctly, but it has been fixed since then.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120318/53d56ca6/attachment.html>

From rigal at rapideye.net  Mon Mar 19 07:04:12 2012
From: rigal at rapideye.net (Matthieu Rigal)
Date: Mon, 19 Mar 2012 12:04:12 +0100
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
	availability of a "between" function ?
Message-ID: <201203191204.12564.rigal@rapideye.net>

Dear Numpy fellows,

I have actually a double question, which only aims to answer a single one : 
how to get the following line being processed more efficiently :

array = numpy.logical_and(numpy.logical_and(aBlueChannel < 1.0, aNirChannel > 
(aBlueChannel * 1.0)), aNirChannel < (aBlueChannel * 1.8))

One possibility would have been to have the logical_and being able to handle 
more than two arrays

Another one would have been to be able to make a "double comparison" or a 
"between", like following one :

array = numpy.logical_and((aBlueChannel < 1.0), (1.0 < 
aNirChannel/aBlueChannel < 1.8))

Is there any way to get the things work this way ? Would it else be a possible 
improvement for 1.7 or a later version ?

Best Regards,
Matthieu Rigal

RapidEye AG
Molkenmarkt 30
14776 Brandenburg an der Havel
Germany
 
Follow us on Twitter! www.twitter.com/rapideye_ag
 
Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
Management Board/Vorstand: Ryan Johnson
Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates: 
Robert Johnson
Commercial Register/Handelsregister Potsdam HRB 24742 P
Tax Number/Steuernummer: 048/100/00053
VAT-Ident-Number/Ust.-ID: DE 199331235
DIN EN ISO 9001 certified
 

From rhattersley at gmail.com  Mon Mar 19 09:20:23 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Mon, 19 Mar 2012 13:20:23 +0000
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <201203191204.12564.rigal@rapideye.net>
References: <201203191204.12564.rigal@rapideye.net>
Message-ID: <CAP=RS9=UBOc6Kmtmnne7W093t19w=T=oSrXUAW0WF8B49hqcXQ@mail.gmail.com>

What do you mean by "efficient"? Are you trying to get it execute
faster? Or using less memory? Or have more concise source code?

Less memory:
 - numpy.vectorize would let you get to the end result without any
intermediate arrays but will be slow.
 - Using the "out" parameter of numpy.logical_and will let you avoid
one of the intermediate arrays.

More speed?:
Perhaps putting all three boolean temporary results into a single
boolean array (using the "out" parameter of numpy.greater, etc) and
using numpy.all might benefit from logical short-circuiting.

And watch out for divide-by-zero from "aNirChannel/aBlueChannel".

Regards,
Richard Hattersley

On 19 March 2012 11:04, Matthieu Rigal <rigal at rapideye.net> wrote:
> Dear Numpy fellows,
>
> I have actually a double question, which only aims to answer a single one :
> how to get the following line being processed more efficiently :
>
> array = numpy.logical_and(numpy.logical_and(aBlueChannel < 1.0, aNirChannel >
> (aBlueChannel * 1.0)), aNirChannel < (aBlueChannel * 1.8))
>
> One possibility would have been to have the logical_and being able to handle
> more than two arrays
>
> Another one would have been to be able to make a "double comparison" or a
> "between", like following one :
>
> array = numpy.logical_and((aBlueChannel < 1.0), (1.0 <
> aNirChannel/aBlueChannel < 1.8))
>
> Is there any way to get the things work this way ? Would it else be a possible
> improvement for 1.7 or a later version ?
>
> Best Regards,
> Matthieu Rigal
>
> RapidEye AG
> Molkenmarkt 30
> 14776 Brandenburg an der Havel
> Germany
>
> Follow us on Twitter! www.twitter.com/rapideye_ag
>
> Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
> Management Board/Vorstand: Ryan Johnson
> Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates:
> Robert Johnson
> Commercial Register/Handelsregister Potsdam HRB 24742 P
> Tax Number/Steuernummer: 048/100/00053
> VAT-Ident-Number/Ust.-ID: DE 199331235
> DIN EN ISO 9001 certified
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From lists at hilboll.de  Mon Mar 19 13:45:44 2012
From: lists at hilboll.de (Andreas H.)
Date: Mon, 19 Mar 2012 18:45:44 +0100
Subject: [Numpy-discussion] Trouble building NumPy on PPC64
Message-ID: <0c1b056b6bd43c833d13fdf4495289cb.squirrel@srv2.s4y.tournesol-consulting.eu>

Hi all,

I have troube installing numpy in a virtual environment on a SuSE
Enterprise 11 server (ppc64).

Here is what I did:

    curl -O https://raw.github.com/pypa/virtualenv/master/virtualenv.py
    python virtualenv.py --distribute --no-site-packages .virtualenvs/pydoas
    source .virtualenvs/pydoas/bin/activate
    pip install numpy

And here is the outcome:

    SystemError: Cannot compile 'Python.h'. Perhaps you need to install
python-dev|python-devel.

However, Python.h exists, because I did install the python-devel package:

    (pydoas)hilboll at odin:~/.virtualenvs/pydoas/build/numpy> find
/usr/include/ | grep Python.h
    /usr/include/python2.6/Python.h

I also tried without the --distribute --no-site-packages flags, with the
same result.

Any hints are very welcome :)

Cheers,
Andreas.


From pgmdevlist at gmail.com  Tue Mar 20 05:53:49 2012
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 20 Mar 2012 10:53:49 +0100
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CAE5kuyjMhdzRJ13AbGu3MfuQPwid2mGu5Nzm-KUV8Y3t=6GmGQ@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
	<CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>
	<CAE5kuygc-jMx0ozx0L4s98dU8oKO49gQ720KOPNGSu00CAWqvg@mail.gmail.com>
	<CAE5kuyjMhdzRJ13AbGu3MfuQPwid2mGu5Nzm-KUV8Y3t=6GmGQ@mail.gmail.com>
Message-ID: <CA+X_USWVCBLP7KDeFkqGoYsa0T9ju6kUq6H9Oh70hvbKY3qdZg@mail.gmail.com>

G?khan,
By default, the mask of a MaskedArray is set to the special value
`np.ma.nomask`. In other terms::
    np.ma.array(...) <=> np.ma.array(..., mask=np.ma.nomask)

In practice, np.ma.nomask lets us quickly check whether a MaskedArray
has a masked value : if its .mask is np.ma.nomask, then no masked
value, otherwise it's a full boolean array and we can use any.

If you want to create a MaskedArray w/ a full boolean mask, just use::
   np.ma.array(..., mask=False)
In that case, the mask is automatically created as a boolean array
with the same shape as the data, with False everywhere. If you used
True, the mask would be full of True...


Now, just to be clear, you'd want
'np.ma.masked_values(...,shrink=False) to create a maked array w/ a
full boolean mask by default, right ?

On 3/15/12, G?khan Sever <gokhansever at gmail.com> wrote:
> Submitted the ticket at http://projects.scipy.org/numpy/ticket/2082
>
>
>
> On Thu, Mar 15, 2012 at 1:24 PM, G?khan Sever <gokhansever at gmail.com> wrote:
>
>>
>>
>> On Thu, Mar 15, 2012 at 1:12 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>>
>>> Ciao G?khan,
>>> AFAIR, shrink is used only to force a collapse of a mask full of False,
>>> not to force the creation of such a mask.
>>> Now, it should work as you expected, meaning that it needs to be fixed.
>>> Could you open a ticket? And put me in copy, just in case.
>>> Anyhow:
>>> Your trick is a tad dangerous, as it erases the previous mask. I'd prefer
>>> to create x w/ a full mask, then use masked_values w/ shrink=False...
>>> Now,
>>> if you're sure there's x= no masked values, go for it.
>>> Cheers
>>>
>>> This condition checking should make it stronger:
>>
>> I7 x = np.array([1, 1.1, 2, 1.1, 3])
>>
>> I8 y = np.ma.masked_values(x, 1.5)
>>
>> I9 if y.mask == False:
>>     y.mask = np.zeros(len(x), dtype=np.bool)*True
>>    ...:
>>
>> I10 y.mask
>> O10 array([False, False, False, False, False], dtype=bool)
>>
>> I11 y
>> O11
>> masked_array(data = [1.0 1.1 2.0 1.1 3.0],
>>              mask = [False False False False False],
>>        fill_value = 1.5)
>>
>> How do you create "x w/ a full mask"?
>>
>> --
>> G?khan
>>
>
>
>
> --
> G?khan
>


From pgmdevlist at gmail.com  Tue Mar 20 06:20:05 2012
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 20 Mar 2012 11:20:05 +0100
Subject: [Numpy-discussion] Recovering from a STOP ?
In-Reply-To: <CALGmxEL6QFZYbUTdWrrNgSbcJ7-E+EhwiZb06q_TVwDErsJ7uw@mail.gmail.com>
References: <loom.20120314T142725-938@post.gmane.org>
	<jjqgpr$3n2$1@dough.gmane.org>
	<CALGmxEL6QFZYbUTdWrrNgSbcJ7-E+EhwiZb06q_TVwDErsJ7uw@mail.gmail.com>
Message-ID: <CA+X_USX9H1jF1LKOQxC8senKM5zKFQ1GQvjHjDrDhrJx3CctBQ@mail.gmail.com>

Pauli, Chris,
Thanks for your inputs.

Pauli, I think that when f2py encounters a STOP statement, it just
stops the execution of the process. Alas, it's the same process as the
interpreter... So we need a trick not to interrupt the whole process.

I eventually resorted to patching f2py as suggested in
http://article.gmane.org/gmane.comp.python.f2py.user/1204/
which amounts to let f2py trap the SIGINT. That's good enough for what I need.

However, that stresses the need to have a more robust way to deal with
these interruptions and make sure that several objects calling one (or
several temporary) fortran extension don't interact the ones with the
others.

And I'm coming to the same conclusion as Chris', that I have to use
the `multiprocessing` module, with several processes calling their own
fortran extension. But that's another story.

Thanks y'all for your help
P.


On 3/14/12, Chris Barker <chris.barker at noaa.gov> wrote:
> On Wed, Mar 14, 2012 at 9:25 AM, Pauli Virtanen <pav at iki.fi> wrote:
>> Or, maybe the whole Fortran stuff can be run in a separate process, so
>> that crashing doesn't matter.
>
> That's what I was going to suggest -- even if you can get it not to
> crash, it may well be in a bad state -- memory leaks, and who know
> what else.
>
> We did something similar with some C code that called the system
> exit() function when it encountered errors -- it may not have been too
> hard to replace those calls, but making sure the memory was all
> cleaned up was going to be a trick -- so we just used the multiprocess
> module to call it in another process.
>
> HTH,
>
> -Chris
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
> 7600 Sand Point Way NE ??(206) 526-6329?? fax
> Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From rigal at rapideye.net  Tue Mar 20 08:13:15 2012
From: rigal at rapideye.net (Matthieu Rigal)
Date: Tue, 20 Mar 2012 13:13:15 +0100
Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 66, Issue 61
In-Reply-To: <mailman.17.1332176402.13169.numpy-discussion@scipy.org>
References: <mailman.17.1332176402.13169.numpy-discussion@scipy.org>
Message-ID: <201203201313.15488.rigal@rapideye.net>

Hi Richard,

Thanks for your answer and the related help !

In fact, I was hoping to have a less memory and more speed solution. Something 
equivalent to a "raster calculator" for numpy. Wouldn't it make sense to have 
some optimized function to work on more than 2 arrays for numpy anyway ?

At the end, I am rather interested by more speed.

I tried first a code-sparing version :
array = numpy.asarray([(aBlueChannel < 1.0),(aNirChannel > aBlueChannel * 
1.0),(aNirChannel < aBlueChannel * 1.8)]).all()

But this one is at the end more than 2 times slower than :
array1 = numpy.empty([3,6566,6682], dtype=numpy.bool)
numpy.less(aBlueChannel, 1.0, out=array1[0])
numpy.greater(aNirChannel, (aBlueChannel * 1.0), out=array1[1])
numpy.less(aNirChannel, (aBlueChannel * 1.8), out=array1[2])
array = array1.all()

(and this solution is about 30% faster than the original one)

I could find another way which was fine for me too:
array = (aBlueChannel < 1.0) * (aNirChannel > (aBlueChannel * 1.0)) * 
(aNirChannel < (aBlueChannel * 1.8))

But this one is only 5-10% faster than the original solution, even if probably 
using less memory than the 2 previous ones. (same was possible with operator 
+, but slower than operator *)

Regards,
Matthieu Rigal


On Monday 19 March 2012 18:00:02 numpy-discussion-request at scipy.org wrote:
> Message: 2
> Date: Mon, 19 Mar 2012 13:20:23 +0000
> From: Richard Hattersley <rhattersley at gmail.com>
> Subject: Re: [Numpy-discussion] Using logical function on more than 2
>         arrays, availability of a "between" function ?
> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> Message-ID:
>         <CAP=RS9=UBOc6Kmtmnne7W093t19w=T=oSrXUAW0WF8B49hqcXQ at mail.gmail.com
> > Content-Type: text/plain; charset=ISO-8859-1
> 
> What do you mean by "efficient"? Are you trying to get it execute
> faster? Or using less memory? Or have more concise source code?
> 
> Less memory:
>  - numpy.vectorize would let you get to the end result without any
> intermediate arrays but will be slow.
>  - Using the "out" parameter of numpy.logical_and will let you avoid
> one of the intermediate arrays.
> 
> More speed?:
> Perhaps putting all three boolean temporary results into a single
> boolean array (using the "out" parameter of numpy.greater, etc) and
> using numpy.all might benefit from logical short-circuiting.
> 
> And watch out for divide-by-zero from "aNirChannel/aBlueChannel".
> 
> Regards,
> Richard Hattersley
> 

RapidEye AG
Molkenmarkt 30
14776 Brandenburg an der Havel
Germany
 
Follow us on Twitter! www.twitter.com/rapideye_ag
 
Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
Management Board/Vorstand: Ryan Johnson
Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates: 
Robert Johnson
Commercial Register/Handelsregister Potsdam HRB 24742 P
Tax Number/Steuernummer: 048/100/00053
VAT-Ident-Number/Ust.-ID: DE 199331235
DIN EN ISO 9001 certified
 

From chaoyuejoy at gmail.com  Tue Mar 20 08:33:56 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Tue, 20 Mar 2012 13:33:56 +0100
Subject: [Numpy-discussion] Trying to read 500M txt file using
	numpy.genfromtxt within ipython shell
Message-ID: <CAAN-aRF3Esp1sZQrA86+HttGvbT+b_LY-Ws1OReFoWWVVRUToA@mail.gmail.com>

Dear all,

I received a file from others which contains ~30 million lines and in size
of ~500M.
I try read it with numpy.genfromtxt in ipython interactive mode. Then
ipython crashed.
The data contains lat,lon,var1,year, the year ranges from 1001 to 2006.
Finally I want to write the
data to netcdf for separate years and feed them into the model. I guess I
need a better way to do this?
anyone would be any idea is highly appreciated.


lon,lat,year,area_burned
-180.0,65.0,1001,0
-180.0,65.0,1002,0
-180.0,65.0,1003,0
-180.0,65.0,1004,0
-180.0,65.0,1005,0
-180.0,65.0,1006,0
-180.0,65.0,1007,0

thanks and cheers,

Chao

-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/f21e9b4b/attachment.html>

From rigal at rapideye.net  Tue Mar 20 09:05:22 2012
From: rigal at rapideye.net (Matthieu Rigal)
Date: Tue, 20 Mar 2012 14:05:22 +0100
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
	availability of a "between" function ?
Message-ID: <201203201405.22737.rigal@rapideye.net>

Auto-answer, sorry,

Well, actually I made a mistake lower... that you may have noticed...
On the faster (your) solution, even with a cleaner use of the out parameter, 
the fact that the all has then to be used with parameter axis=0 takes more 
time and makes it actually slower than the initial solution...

So I may go for the "multiplier" solution.

Regards,
Matthieu

On Tuesday 20 March 2012 13:13:15 you wrote:
> Hi Richard,
> 
> Thanks for your answer and the related help !
> 
> In fact, I was hoping to have a less memory and more speed solution.
>  Something equivalent to a "raster calculator" for numpy. Wouldn't it make
>  sense to have some optimized function to work on more than 2 arrays for
>  numpy anyway ?
> 
> At the end, I am rather interested by more speed.
> 
> I tried first a code-sparing version :
> array = numpy.asarray([(aBlueChannel < 1.0),(aNirChannel > aBlueChannel *
> 1.0),(aNirChannel < aBlueChannel * 1.8)]).all()
> 
> But this one is at the end more than 2 times slower than :
> array1 = numpy.empty([3,6566,6682], dtype=numpy.bool)
> numpy.less(aBlueChannel, 1.0, out=array1[0])
> numpy.greater(aNirChannel, (aBlueChannel * 1.0), out=array1[1])
> numpy.less(aNirChannel, (aBlueChannel * 1.8), out=array1[2])
> array = array1.all()
> 
> (and this solution is about 30% faster than the original one)
> 
> I could find another way which was fine for me too:
> array = (aBlueChannel < 1.0) * (aNirChannel > (aBlueChannel * 1.0)) *
> (aNirChannel < (aBlueChannel * 1.8))
> 
> But this one is only 5-10% faster than the original solution, even if
>  probably using less memory than the 2 previous ones. (same was possible
>  with operator +, but slower than operator *)
> 
> Regards,
> Matthieu Rigal
> 
> On Monday 19 March 2012 18:00:02 numpy-discussion-request at scipy.org wrote:
> > Message: 2
> > Date: Mon, 19 Mar 2012 13:20:23 +0000
> > From: Richard Hattersley <rhattersley at gmail.com>
> > Subject: Re: [Numpy-discussion] Using logical function on more than 2
> >         arrays, availability of a "between" function ?
> > To: Discussion of Numerical Python <numpy-discussion at scipy.org>
> > Message-ID:
> >        
> > <CAP=RS9=UBOc6Kmtmnne7W093t19w=T=oSrXUAW0WF8B49hqcXQ at mail.gmail.com
> >
> > > Content-Type: text/plain; charset=ISO-8859-1
> >
> > What do you mean by "efficient"? Are you trying to get it execute
> > faster? Or using less memory? Or have more concise source code?
> >
> > Less memory:
> >  - numpy.vectorize would let you get to the end result without any
> > intermediate arrays but will be slow.
> >  - Using the "out" parameter of numpy.logical_and will let you avoid
> > one of the intermediate arrays.
> >
> > More speed?:
> > Perhaps putting all three boolean temporary results into a single
> > boolean array (using the "out" parameter of numpy.greater, etc) and
> > using numpy.all might benefit from logical short-circuiting.
> >
> > And watch out for divide-by-zero from "aNirChannel/aBlueChannel".
> >
> > Regards,
> > Richard Hattersley
> 

RapidEye AG
Molkenmarkt 30
14776 Brandenburg an der Havel
Germany
 
Follow us on Twitter! www.twitter.com/rapideye_ag
 
Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
Management Board/Vorstand: Ryan Johnson
Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates: 
Robert Johnson
Commercial Register/Handelsregister Potsdam HRB 24742 P
Tax Number/Steuernummer: 048/100/00053
VAT-Ident-Number/Ust.-ID: DE 199331235
DIN EN ISO 9001 certified
 

From david.froger at gmail.com  Tue Mar 20 09:17:12 2012
From: david.froger at gmail.com (David Froger)
Date: Tue, 20 Mar 2012 14:17:12 +0100
Subject: [Numpy-discussion] Trying to read 500M txt file using
	numpy.genfromtxt within ipython shell
In-Reply-To: <CAAN-aRF3Esp1sZQrA86+HttGvbT+b_LY-Ws1OReFoWWVVRUToA@mail.gmail.com>
References: <CAAN-aRF3Esp1sZQrA86+HttGvbT+b_LY-Ws1OReFoWWVVRUToA@mail.gmail.com>
Message-ID: <1332249107-sup-5881@david-desktop>

Hi,

I think writing a Python script that convert your txt file to one netcdf file,
reading the txt file one line at a time, and then use the netcdf file normally 
would be a good solution!

Best,
David

Excerpts from Chao YUE's message of mar. mars 20 13:33:56 +0100 2012:
> Dear all,
> 
> I received a file from others which contains ~30 million lines and in size
> of ~500M.
> I try read it with numpy.genfromtxt in ipython interactive mode. Then
> ipython crashed.
> The data contains lat,lon,var1,year, the year ranges from 1001 to 2006.
> Finally I want to write the
> data to netcdf for separate years and feed them into the model. I guess I
> need a better way to do this?
> anyone would be any idea is highly appreciated.
> 
> 
> lon,lat,year,area_burned
> -180.0,65.0,1001,0
> -180.0,65.0,1002,0
> -180.0,65.0,1003,0
> -180.0,65.0,1004,0
> -180.0,65.0,1005,0
> -180.0,65.0,1006,0
> -180.0,65.0,1007,0
> 
> thanks and cheers,
> 
> Chao
> 


From chaoyuejoy at gmail.com  Tue Mar 20 09:40:33 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Tue, 20 Mar 2012 14:40:33 +0100
Subject: [Numpy-discussion] Trying to read 500M txt file using
 numpy.genfromtxt within ipython shell
In-Reply-To: <1332249107-sup-5881@david-desktop>
References: <CAAN-aRF3Esp1sZQrA86+HttGvbT+b_LY-Ws1OReFoWWVVRUToA@mail.gmail.com>
	<1332249107-sup-5881@david-desktop>
Message-ID: <CAAN-aRGKhAAo5BWH8hSiCMjB97zjA0WE+RVVY5td7aUv02TJKw@mail.gmail.com>

I would be in agree. thanks!
I use gawk to separate the file into many files by year, then it would be
easier to handle.
anyway, it's not a good practice to produce such huge line txt files....

Chao

2012/3/20 David Froger <david.froger at gmail.com>

> Hi,
>
> I think writing a Python script that convert your txt file to one netcdf
> file,
> reading the txt file one line at a time, and then use the netcdf file
> normally
> would be a good solution!
>
> Best,
> David
>
> Excerpts from Chao YUE's message of mar. mars 20 13:33:56 +0100 2012:
> > Dear all,
> >
> > I received a file from others which contains ~30 million lines and in
> size
> > of ~500M.
> > I try read it with numpy.genfromtxt in ipython interactive mode. Then
> > ipython crashed.
> > The data contains lat,lon,var1,year, the year ranges from 1001 to 2006.
> > Finally I want to write the
> > data to netcdf for separate years and feed them into the model. I guess I
> > need a better way to do this?
> > anyone would be any idea is highly appreciated.
> >
> >
> > lon,lat,year,area_burned
> > -180.0,65.0,1001,0
> > -180.0,65.0,1002,0
> > -180.0,65.0,1003,0
> > -180.0,65.0,1004,0
> > -180.0,65.0,1005,0
> > -180.0,65.0,1006,0
> > -180.0,65.0,1007,0
> >
> > thanks and cheers,
> >
> > Chao
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/e3ebd7ee/attachment.html>

From derek at astro.physik.uni-goettingen.de  Tue Mar 20 11:23:48 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Tue, 20 Mar 2012 16:23:48 +0100
Subject: [Numpy-discussion] Trying to read 500M txt file using
	numpy.genfromtxt within ipython shell
In-Reply-To: <CAAN-aRGKhAAo5BWH8hSiCMjB97zjA0WE+RVVY5td7aUv02TJKw@mail.gmail.com>
References: <CAAN-aRF3Esp1sZQrA86+HttGvbT+b_LY-Ws1OReFoWWVVRUToA@mail.gmail.com>
	<1332249107-sup-5881@david-desktop>
	<CAAN-aRGKhAAo5BWH8hSiCMjB97zjA0WE+RVVY5td7aUv02TJKw@mail.gmail.com>
Message-ID: <2F688827-E4E4-417E-8EFB-693B12FEEAA7@astro.physik.uni-goettingen.de>

On 20 Mar 2012, at 14:40, Chao YUE wrote:

> I would be in agree. thanks!
> I use gawk to separate the file into many files by year, then it would be easier to handle.
> anyway, it's not a good practice to produce such huge line txt files....

Indeed it's not, but it's also not good practice to load the entire content 
of text files as python lists into memory, as unfortunately all the numpy 
readers are still doing. But this has been discussed on this list and 
improvements are under way. 
For your problem at hand the textreader Warren Weckesser recently 
made known - can't find the post right now, but you can find it at

https://github.com/WarrenWeckesser/textreader

might be helpful. It is still under construction, but for a plain csv file such 
as yours it should be working already. And since the text parsing is 
implemented in C, it should also give you a huge speedup for your 1/2 GB!

For additional profiling, similar to what David suggested, it would certainly 
be a good idea to read in smaller chunks of the file and write it directly to 
the netCDF file. Note that you can already read single lines at a time with the 
likes of

from StringIO import StringIO
f = open('file.txt'. 'r')
np.genfromtxt(StringIO(f.next()), delimiter=',')

but I don't think it would work this way with textreader, and iterating such a small 
loop over lines in Python would beat the point of using a fast reader. 
As your actual data would be < 1GB in numpy, memory usage with textreader 
should also not be critical yet.

Cheers,
				Derek


From nouiz at nouiz.org  Tue Mar 20 12:01:17 2012
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Tue, 20 Mar 2012 12:01:17 -0400
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <201203201405.22737.rigal@rapideye.net>
References: <201203201405.22737.rigal@rapideye.net>
Message-ID: <CADKKbtjkk1_iq_NadJrt4wg+nDbC36-=-rE4pW9UA3qJMrcsKg@mail.gmail.com>

I didn't try it, but I think that Theano and numexpr should be able to
make them faster.

[1] http://deeplearning.net/software/theano/
[2] https://code.google.com/p/numexpr/

Fred

On Tue, Mar 20, 2012 at 9:05 AM, Matthieu Rigal <rigal at rapideye.net> wrote:
> Auto-answer, sorry,
>
> Well, actually I made a mistake lower... that you may have noticed...
> On the faster (your) solution, even with a cleaner use of the out parameter,
> the fact that the all has then to be used with parameter axis=0 takes more
> time and makes it actually slower than the initial solution...
>
> So I may go for the "multiplier" solution.
>
> Regards,
> Matthieu
>
> On Tuesday 20 March 2012 13:13:15 you wrote:
>> Hi Richard,
>>
>> Thanks for your answer and the related help !
>>
>> In fact, I was hoping to have a less memory and more speed solution.
>> ?Something equivalent to a "raster calculator" for numpy. Wouldn't it make
>> ?sense to have some optimized function to work on more than 2 arrays for
>> ?numpy anyway ?
>>
>> At the end, I am rather interested by more speed.
>>
>> I tried first a code-sparing version :
>> array = numpy.asarray([(aBlueChannel < 1.0),(aNirChannel > aBlueChannel *
>> 1.0),(aNirChannel < aBlueChannel * 1.8)]).all()
>>
>> But this one is at the end more than 2 times slower than :
>> array1 = numpy.empty([3,6566,6682], dtype=numpy.bool)
>> numpy.less(aBlueChannel, 1.0, out=array1[0])
>> numpy.greater(aNirChannel, (aBlueChannel * 1.0), out=array1[1])
>> numpy.less(aNirChannel, (aBlueChannel * 1.8), out=array1[2])
>> array = array1.all()
>>
>> (and this solution is about 30% faster than the original one)
>>
>> I could find another way which was fine for me too:
>> array = (aBlueChannel < 1.0) * (aNirChannel > (aBlueChannel * 1.0)) *
>> (aNirChannel < (aBlueChannel * 1.8))
>>
>> But this one is only 5-10% faster than the original solution, even if
>> ?probably using less memory than the 2 previous ones. (same was possible
>> ?with operator +, but slower than operator *)
>>
>> Regards,
>> Matthieu Rigal
>>
>> On Monday 19 March 2012 18:00:02 numpy-discussion-request at scipy.org wrote:
>> > Message: 2
>> > Date: Mon, 19 Mar 2012 13:20:23 +0000
>> > From: Richard Hattersley <rhattersley at gmail.com>
>> > Subject: Re: [Numpy-discussion] Using logical function on more than 2
>> > ? ? ? ? arrays, availability of a "between" function ?
>> > To: Discussion of Numerical Python <numpy-discussion at scipy.org>
>> > Message-ID:
>> >
>> > <CAP=RS9=UBOc6Kmtmnne7W093t19w=T=oSrXUAW0WF8B49hqcXQ at mail.gmail.com
>> >
>> > > Content-Type: text/plain; charset=ISO-8859-1
>> >
>> > What do you mean by "efficient"? Are you trying to get it execute
>> > faster? Or using less memory? Or have more concise source code?
>> >
>> > Less memory:
>> > ?- numpy.vectorize would let you get to the end result without any
>> > intermediate arrays but will be slow.
>> > ?- Using the "out" parameter of numpy.logical_and will let you avoid
>> > one of the intermediate arrays.
>> >
>> > More speed?:
>> > Perhaps putting all three boolean temporary results into a single
>> > boolean array (using the "out" parameter of numpy.greater, etc) and
>> > using numpy.all might benefit from logical short-circuiting.
>> >
>> > And watch out for divide-by-zero from "aNirChannel/aBlueChannel".
>> >
>> > Regards,
>> > Richard Hattersley
>>
>
> RapidEye AG
> Molkenmarkt 30
> 14776 Brandenburg an der Havel
> Germany
>
> Follow us on Twitter! www.twitter.com/rapideye_ag
>
> Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
> Management Board/Vorstand: Ryan Johnson
> Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates:
> Robert Johnson
> Commercial Register/Handelsregister Potsdam HRB 24742 P
> Tax Number/Steuernummer: 048/100/00053
> VAT-Ident-Number/Ust.-ID: DE 199331235
> DIN EN ISO 9001 certified
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From markflorisson88 at gmail.com  Tue Mar 20 13:49:09 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 20 Mar 2012 18:49:09 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
Message-ID: <CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>

On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
>>>
>>> (Mark F., how does the above match how you feel about this?)
>>
>> I would like collaboration, but from a technical perspective I think
>> this would be much more involved than just dumping the AST to an IR
>> and generating some code from there. For vector expressions I think
>> sharing code would be more feasible than arbitrary (parallel) loops,
>> etc. Cython as a compiler can make many decisions that a Python
>> (bytecode) compiler can't make (at least without annotations and a
>> well-defined subset of the language (not so much the syntax as the
>> semantics)). I think in numba, if parallelism is to be supported, you
>> will want a prange-like construct, as proving independence between
>> iterations can be very hard to near impossible for a compiler.
>
> I completely agree that you have to define some kind of syntax to get parallelism. ?But, a prange construct would not be out of the question, of course.
>
>>
>> As for code generation, I'm not sure how llvm would do things like
>> slicing arrays, reshaping, resizing etc (for vector expressions you
>> can first evaluate all slicing and indexing operations and then
>> compile the remaining vector expression), but for loops and array
>> reassignment within loops this would have to invoke the actual slicing
>> code from the llvm code (I presume).
>
> There could be some analysis on the byte-code, prior to emitting the llvm code in order to handle lots of things. ? Basically, you have to "play" the byte-code on a simple machine anyway in order to emit the correct code. ? The big thing about Cython is you have to typedef too many things that are really quite knowable from the code. ? If Cython could improve it's type inference, then it would be a more suitable target.
>
>> There are many other things, like
>> bounds checking, wraparound, etc, that are all supported in both numpy
>> and Cython, but going through an llvm layer would as far as I can see,
>> require re-implementing those, at least if you want top-notch
>> performance. Personally, I think for non-trivial performance-critical
>> code (for loops with indexing, slicing, function calls, etc) Cython is
>> a better target.
>
> With libclang it is really quite possible to imagine a cython -> C target that itself compiles to llvm so that you can do everything at that intermediate layer. ? However, ?LLVM is a much better layer for optimization than C now that there are a lot of people collaborating on that layer. ? I think it would be great if Cython targeted LLVM actually instead of C.
>
>>
>> Finally, as for non-vector-expression code, I really believe Cython is
>> a better target. cython.inline can have high overhead (at least the
>> first time it has to compile), but with better (numpy-aware) type
>> inference or profile guided optimizations (see recent threads on the
>> cython-dev mailing list), in addition to things like prange, I
>> personally believe Cython targets most of the use cases where numba
>> would be able to generate performing code.
>
> Cython and Numba certainly overlap. ?However, Cython requires:
>
> ? ? ? ?1) learning another language
> ? ? ? ?2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code.
>
> These aren't show-stoppers obviously. ? But, I think some users would prefer an even simpler approach to getting fast-code than Cython (which currently doesn't do enought type-inference and requires building a dlopen extension module).

Dag and I have been discussing this at PyCon, and here is my take on
it (at this moment :).

Definitely, if you can avoid Cython then that is easier and more
desirable in many ways. So perhaps we can create a third project
called X (I'm not very creative, maybe ArrayExprOpt), that takes an
abstract syntax tree in a rather simple form, performs code
optimizations such as rewriting loops with array accesses to vector
expressions, fusing vector expressions and loops, etc, and spits out a
transformed AST containing these optimizations. If runtime information
is given such as actual shape and stride information the
transformations could figure out there and then whether to do things
like collapsing, axes swapping, blocking (as in, introducing more axes
or loops to retain discontiguous blocks in the cache), blocked memory
copies to contiguous chunks, etc. The AST could then also say whether
the final expressions are vectorizable. Part of this functionality is
already in numpy's nditer, except that this would be implicit and do
more (and hopefully with minimal overhead).

So numba, Cython and maybe numexpr could use the functionality, simply
by building the AST from Python and converting back (if necessary) to
its own AST. As such, the AST optimizer would be only part of any
(runtime) compiler's pipeline, and it should be very flexible to
retain any information (metadata regarding actual types, control flow
information, etc) provided by the original AST. It would not do
control flow analysis, type inference or promotion, etc, but only deal
with abstract types like integers, reals and arrays (C, Fortran or
partly contiguous or strided). It would not deal with objects, but
would allow to insert nodes like UnreorderableNode and SideEffectNode
wrapping parts of the original AST. In short, it should be as easy as
possible to convert from an original AST to this project's AST and
back again afterwards.

As the project matures many optimizations may be added that deal with
all sorts of loop restructuring and ways to efficiently utilize the
cache as well as enable vectorization and possibly parallelism.
Perhaps it could even generate a different AST depending on whether
execution target the CPU or the GPU (with optionally available
information such as cache sizes, GPU shared/local memory sizes, etc).

Seeing that this would be a part of my master dissertation, my
supervisor would require me to write the code, so at least until
August I think I would have to write (at least the bulk of) this.
Otherwise I can also make other parts of my dissertation's project
more prominent to make up for it. Anyway, my question is, is there
interest from at least the numba and numexpr projects (if code can be
transformed into vector operations, it makes sense to use numexpr for
that, I'm not sure what numba's interest is in that).

> -Travis
>
>
>
>
>>
>>> Dag
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From chris.barker at noaa.gov  Tue Mar 20 13:49:49 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 20 Mar 2012 10:49:49 -0700
Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 66, Issue 61
In-Reply-To: <201203201313.15488.rigal@rapideye.net>
References: <mailman.17.1332176402.13169.numpy-discussion@scipy.org>
	<201203201313.15488.rigal@rapideye.net>
Message-ID: <CALGmxEJdqhD8i97bs+Htat+htoqcqu6QCH+_1eAY=Oxg5RTtiw@mail.gmail.com>

On Tue, Mar 20, 2012 at 5:13 AM, Matthieu Rigal <rigal at rapideye.net> wrote:

> In fact, I was hoping to have a less memory and more speed solution.

which do often go together, at least for big problems -- pushingm
emory around often takes more time than the computation itself.

> At the end, I am rather interested by more speed.
>
> I tried first a code-sparing version :
array = numpy.asarray([(aBlueChannel < 1.0),(aNirChannel > aBlueChannel *
> 1.0),(aNirChannel < aBlueChannel * 1.8)]).all()

(by the way -- it is MUCH better if you post example code with actual
data (the example data can be much smaller) -- while with small data
sets you can't test performance, you can test correctness -- the
easier you make it for us to try stuff out, the more we can help you.


re-formatting so I can read this, I get:

array = numpy.asarray([(aBlueChannel < 1.0),
                                      (aNirChannel > aBlueChannel * 1.0),
                                      (aNirChannel < aBlueChannel * 1.8)]).all()

a few notes:

asarray  will work, but is pointless here -- I'd just use "array" --
asarray() is for when you may or may not have an array as input, and
you want to preserve it if you do.

I'd probably use numpy.vstack or rstack here, rather than the array
(or asarray) function -- there is a larger parsing overhead to array()

I suppose this is a special case, but multiplying by 1.0 is kind of
pointless there (or are  you doing that to cast to a float array)?

if you are doing an all() -- not much reason to put them all in the
same array first anyway.


> But this one is at the end more than 2 times slower than :

array1 = numpy.empty([3,6566,6682], dtype=numpy.bool)
numpy.less(aBlueChannel, 1.0, out=array1[0])
numpy.greater(aNirChannel, (aBlueChannel * 1.0), out=array1[1])
numpy.less(aNirChannel, (aBlueChannel * 1.8), out=array1[2])
array = array1.all()

yup -- creating temporaries can be slow for big data -- there is the
trade-off between compact code and performance some times.

I think you can be more memory efficient here, though -- if in the end
all you want is the final "all" check, no need to store all checks for
each channel -- something like:

#allocate a bool array:
array1 = numpy.empty( (6566,6682),  dtype=numpy.bool)

result = numpy.less(aBlueChannel, 1.0, out=array1).all()
result &= numpy.greater(aNirChannel, (aBlueChannel * 1.0), out=array1).all()
result &= numpy.less(aNirChannel, (aBlueChannel * 1.8), out=array1[2]).all()

three loops for the all(), but less memory to push around -- may be faster.

I'd also take a look at numexpr for this, it could be very helpful:

http://code.google.com/p/numexpr/

-Chris


> (and this solution is about 30% faster than the original one)
>
> I could find another way which was fine for me too:
> array = (aBlueChannel < 1.0) * (aNirChannel > (aBlueChannel * 1.0)) *
> (aNirChannel < (aBlueChannel * 1.8))
>
> But this one is only 5-10% faster than the original solution, even if probably
> using less memory than the 2 previous ones. (same was possible with operator
> +, but slower than operator *)
>
> Regards,
> Matthieu Rigal
>
>
> On Monday 19 March 2012 18:00:02 numpy-discussion-request at scipy.org wrote:
>> Message: 2
>> Date: Mon, 19 Mar 2012 13:20:23 +0000
>> From: Richard Hattersley <rhattersley at gmail.com>
>> Subject: Re: [Numpy-discussion] Using logical function on more than 2
>> ? ? ? ? arrays, availability of a "between" function ?
>> To: Discussion of Numerical Python <numpy-discussion at scipy.org>
>> Message-ID:
>> ? ? ? ? <CAP=RS9=UBOc6Kmtmnne7W093t19w=T=oSrXUAW0WF8B49hqcXQ at mail.gmail.com
>> > Content-Type: text/plain; charset=ISO-8859-1
>>
>> What do you mean by "efficient"? Are you trying to get it execute
>> faster? Or using less memory? Or have more concise source code?
>>
>> Less memory:
>> ?- numpy.vectorize would let you get to the end result without any
>> intermediate arrays but will be slow.
>> ?- Using the "out" parameter of numpy.logical_and will let you avoid
>> one of the intermediate arrays.
>>
>> More speed?:
>> Perhaps putting all three boolean temporary results into a single
>> boolean array (using the "out" parameter of numpy.greater, etc) and
>> using numpy.all might benefit from logical short-circuiting.
>>
>> And watch out for divide-by-zero from "aNirChannel/aBlueChannel".
>>
>> Regards,
>> Richard Hattersley
>>
>
> RapidEye AG
> Molkenmarkt 30
> 14776 Brandenburg an der Havel
> Germany
>
> Follow us on Twitter! www.twitter.com/rapideye_ag
>
> Head Office/Sitz der Gesellschaft: Brandenburg an der Havel
> Management Board/Vorstand: Ryan Johnson
> Chairman of Supervisory Board/Vorsitzender des Aufsichtsrates:
> Robert Johnson
> Commercial Register/Handelsregister Potsdam HRB 24742 P
> Tax Number/Steuernummer: 048/100/00053
> VAT-Ident-Number/Ust.-ID: DE 199331235
> DIN EN ISO 9001 certified
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From shish at keba.be  Tue Mar 20 13:58:44 2012
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Mar 2012 13:58:44 -0400
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
Message-ID: <CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>

This sounds a lot like Theano, did you look into it?

-=- Olivier

Le 20 mars 2012 13:49, mark florisson <markflorisson88 at gmail.com> a ?crit :

> On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
> >>>
> >>> (Mark F., how does the above match how you feel about this?)
> >>
> >> I would like collaboration, but from a technical perspective I think
> >> this would be much more involved than just dumping the AST to an IR
> >> and generating some code from there. For vector expressions I think
> >> sharing code would be more feasible than arbitrary (parallel) loops,
> >> etc. Cython as a compiler can make many decisions that a Python
> >> (bytecode) compiler can't make (at least without annotations and a
> >> well-defined subset of the language (not so much the syntax as the
> >> semantics)). I think in numba, if parallelism is to be supported, you
> >> will want a prange-like construct, as proving independence between
> >> iterations can be very hard to near impossible for a compiler.
> >
> > I completely agree that you have to define some kind of syntax to get
> parallelism.  But, a prange construct would not be out of the question, of
> course.
> >
> >>
> >> As for code generation, I'm not sure how llvm would do things like
> >> slicing arrays, reshaping, resizing etc (for vector expressions you
> >> can first evaluate all slicing and indexing operations and then
> >> compile the remaining vector expression), but for loops and array
> >> reassignment within loops this would have to invoke the actual slicing
> >> code from the llvm code (I presume).
> >
> > There could be some analysis on the byte-code, prior to emitting the
> llvm code in order to handle lots of things.   Basically, you have to
> "play" the byte-code on a simple machine anyway in order to emit the
> correct code.   The big thing about Cython is you have to typedef too many
> things that are really quite knowable from the code.   If Cython could
> improve it's type inference, then it would be a more suitable target.
> >
> >> There are many other things, like
> >> bounds checking, wraparound, etc, that are all supported in both numpy
> >> and Cython, but going through an llvm layer would as far as I can see,
> >> require re-implementing those, at least if you want top-notch
> >> performance. Personally, I think for non-trivial performance-critical
> >> code (for loops with indexing, slicing, function calls, etc) Cython is
> >> a better target.
> >
> > With libclang it is really quite possible to imagine a cython -> C
> target that itself compiles to llvm so that you can do everything at that
> intermediate layer.   However,  LLVM is a much better layer for
> optimization than C now that there are a lot of people collaborating on
> that layer.   I think it would be great if Cython targeted LLVM actually
> instead of C.
> >
> >>
> >> Finally, as for non-vector-expression code, I really believe Cython is
> >> a better target. cython.inline can have high overhead (at least the
> >> first time it has to compile), but with better (numpy-aware) type
> >> inference or profile guided optimizations (see recent threads on the
> >> cython-dev mailing list), in addition to things like prange, I
> >> personally believe Cython targets most of the use cases where numba
> >> would be able to generate performing code.
> >
> > Cython and Numba certainly overlap.  However, Cython requires:
> >
> >        1) learning another language
> >        2) creating an extension module --- loading bit-code files and
> dynamically executing (even on a different machine from the one that
> initially created them) can be a powerful alternative for run-time
> compilation and distribution of code.
> >
> > These aren't show-stoppers obviously.   But, I think some users would
> prefer an even simpler approach to getting fast-code than Cython (which
> currently doesn't do enought type-inference and requires building a dlopen
> extension module).
>
> Dag and I have been discussing this at PyCon, and here is my take on
> it (at this moment :).
>
> Definitely, if you can avoid Cython then that is easier and more
> desirable in many ways. So perhaps we can create a third project
> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> abstract syntax tree in a rather simple form, performs code
> optimizations such as rewriting loops with array accesses to vector
> expressions, fusing vector expressions and loops, etc, and spits out a
> transformed AST containing these optimizations. If runtime information
> is given such as actual shape and stride information the
> transformations could figure out there and then whether to do things
> like collapsing, axes swapping, blocking (as in, introducing more axes
> or loops to retain discontiguous blocks in the cache), blocked memory
> copies to contiguous chunks, etc. The AST could then also say whether
> the final expressions are vectorizable. Part of this functionality is
> already in numpy's nditer, except that this would be implicit and do
> more (and hopefully with minimal overhead).
>
> So numba, Cython and maybe numexpr could use the functionality, simply
> by building the AST from Python and converting back (if necessary) to
> its own AST. As such, the AST optimizer would be only part of any
> (runtime) compiler's pipeline, and it should be very flexible to
> retain any information (metadata regarding actual types, control flow
> information, etc) provided by the original AST. It would not do
> control flow analysis, type inference or promotion, etc, but only deal
> with abstract types like integers, reals and arrays (C, Fortran or
> partly contiguous or strided). It would not deal with objects, but
> would allow to insert nodes like UnreorderableNode and SideEffectNode
> wrapping parts of the original AST. In short, it should be as easy as
> possible to convert from an original AST to this project's AST and
> back again afterwards.
>
> As the project matures many optimizations may be added that deal with
> all sorts of loop restructuring and ways to efficiently utilize the
> cache as well as enable vectorization and possibly parallelism.
> Perhaps it could even generate a different AST depending on whether
> execution target the CPU or the GPU (with optionally available
> information such as cache sizes, GPU shared/local memory sizes, etc).
>
> Seeing that this would be a part of my master dissertation, my
> supervisor would require me to write the code, so at least until
> August I think I would have to write (at least the bulk of) this.
> Otherwise I can also make other parts of my dissertation's project
> more prominent to make up for it. Anyway, my question is, is there
> interest from at least the numba and numexpr projects (if code can be
> transformed into vector operations, it makes sense to use numexpr for
> that, I'm not sure what numba's interest is in that).
>
> > -Travis
> >
> >
> >
> >
> >>
> >>> Dag
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion at scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> NumPy-Discussion mailing list
> >>> NumPy-Discussion at scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/b994d4da/attachment.html>

From gokhansever at gmail.com  Tue Mar 20 14:18:13 2012
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Tue, 20 Mar 2012 12:18:13 -0600
Subject: [Numpy-discussion] Question on numpy.ma.masked_values
In-Reply-To: <CA+X_USWVCBLP7KDeFkqGoYsa0T9ju6kUq6H9Oh70hvbKY3qdZg@mail.gmail.com>
References: <CAE5kuyjTj8DiBUS_VRcDok9qem-_gR6uCt7p1J9O5St-Csov2A@mail.gmail.com>
	<CA+X_USUbcB81rcqp7jA4dFGxZUEscBjmTNR938_FqRzf8=E84w@mail.gmail.com>
	<CAE5kuygc-jMx0ozx0L4s98dU8oKO49gQ720KOPNGSu00CAWqvg@mail.gmail.com>
	<CAE5kuyjMhdzRJ13AbGu3MfuQPwid2mGu5Nzm-KUV8Y3t=6GmGQ@mail.gmail.com>
	<CA+X_USWVCBLP7KDeFkqGoYsa0T9ju6kUq6H9Oh70hvbKY3qdZg@mail.gmail.com>
Message-ID: <CAE5kuyj=MuB69y-kqaNSnXKVfq6sjJvFxSatJvdeT1DRmHhGMA@mail.gmail.com>

Yes, that's the behaviour that I expect setting the 'shrink' keyword to 'False'

> Now, just to be clear, you'd want
> 'np.ma.masked_values(...,shrink=False) to create a maked array w/ a
> full boolean mask by default, right ?


From francesc at continuum.io  Tue Mar 20 14:24:26 2012
From: francesc at continuum.io (Francesc Alted)
Date: Tue, 20 Mar 2012 13:24:26 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
Message-ID: <74EF7B8D-FADC-4091-A7EB-09472A85B75B@continuum.io>

On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>> Cython and Numba certainly overlap.  However, Cython requires:
>> 
>>        1) learning another language
>>        2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code.
>> 
>> These aren't show-stoppers obviously.   But, I think some users would prefer an even simpler approach to getting fast-code than Cython (which currently doesn't do enought type-inference and requires building a dlopen extension module).
> 
> Dag and I have been discussing this at PyCon, and here is my take on
> it (at this moment :).
> 
> Definitely, if you can avoid Cython then that is easier and more
> desirable in many ways. So perhaps we can create a third project
> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> abstract syntax tree in a rather simple form, performs code
> optimizations such as rewriting loops with array accesses to vector
> expressions, fusing vector expressions and loops, etc, and spits out a
> transformed AST containing these optimizations. If runtime information
> is given such as actual shape and stride information the
> transformations could figure out there and then whether to do things
> like collapsing, axes swapping, blocking (as in, introducing more axes
> or loops to retain discontiguous blocks in the cache), blocked memory
> copies to contiguous chunks, etc. The AST could then also say whether
> the final expressions are vectorizable. Part of this functionality is
> already in numpy's nditer, except that this would be implicit and do
> more (and hopefully with minimal overhead).
> 
> So numba, Cython and maybe numexpr could use the functionality, simply
> by building the AST from Python and converting back (if necessary) to
> its own AST. As such, the AST optimizer would be only part of any
> (runtime) compiler's pipeline, and it should be very flexible to
> retain any information (metadata regarding actual types, control flow
> information, etc) provided by the original AST. It would not do
> control flow analysis, type inference or promotion, etc, but only deal
> with abstract types like integers, reals and arrays (C, Fortran or
> partly contiguous or strided). It would not deal with objects, but
> would allow to insert nodes like UnreorderableNode and SideEffectNode
> wrapping parts of the original AST. In short, it should be as easy as
> possible to convert from an original AST to this project's AST and
> back again afterwards.

I think this is a very interesting project, and certainly projects like numba can benefit of it.  So, in order to us have an idea on what you are after, can we assume that your project (call it X) would be kind of an compiler optimizer, and then the produced, optimized code could be feed into numba for optimized LLVM code generation (that on its turn, can be run on top of CPUs or GPUs or a combination)?  Is that correct?

Giving that my interpretation above is correct, it is bit more difficult to me to see how your X project could be of benefit for numexpr.  In fact, I actually see this the other way round: once the optimizer has discovered the vectorization parts, then go one step further and generate code that uses numexpr automatically (call this, vectorization through numexpr).  This is what you mean, or I'm missing something?

> As the project matures many optimizations may be added that deal with
> all sorts of loop restructuring and ways to efficiently utilize the
> cache as well as enable vectorization and possibly parallelism.
> Perhaps it could even generate a different AST depending on whether
> execution target the CPU or the GPU (with optionally available
> information such as cache sizes, GPU shared/local memory sizes, etc).
> 
> Seeing that this would be a part of my master dissertation, my
> supervisor would require me to write the code, so at least until
> August I think I would have to write (at least the bulk of) this.
> Otherwise I can also make other parts of my dissertation's project
> more prominent to make up for it. Anyway, my question is, is there
> interest from at least the numba and numexpr projects (if code can be
> transformed into vector operations, it makes sense to use numexpr for
> that, I'm not sure what numba's interest is in that).

I'm definitely interested for the numexpr part.  It is just that I'm still struggling to see the big picture on this.  But the general idea is really appealing.

Thanks,

-- Francesc Alted


From d.s.seljebotn at astro.uio.no  Tue Mar 20 15:40:58 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 20 Mar 2012 12:40:58 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
Message-ID: <2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>

We talked some about Theano. There are some differences in project goals which means that it makes sense to make this a seperate project: Cython wants to use this to generate C code up front from the Cython AST at compilation time; numba also has a different frontend (parsing of python bytecode) and a different backend (LLVM).

However, it may very well be possible that Theano could be refactored so that the more essential algorithms working on the syntax tree could be pulled out and shared with cython and numba. Then the question is whether the core of Theano is smart enough to compete with Fortran compilers and support arbitraily strided inputs optimally. Otherwise one might as well start from scratch. I'll leave that for Mark to figure out...

Dag
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Olivier Delalleau <shish at keba.be> wrote:

This sounds a lot like Theano, did you look into it?

-=- Olivier

Le 20 mars 2012 13:49, mark florisson <markflorisson88 at gmail.com> a ?crit :

On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
>>>
>>> (Mark F., how does the above match how you feel about this?)
>>
>> I would like collaboration, but from a technical perspective I think
>> this would be much more involved than just dumping the AST to an IR
>> and generating some code from there. For vector expressions I think
>> sharing code would be more feasible than arbitrary (parallel) loops,
>> etc. Cython as a compiler can make many decisions that a Python
>> (bytecode) compiler can't make (at least without annotations and a
>> well-defined subset of the language (not so much the syntax as the
>> semantics)). I think in numba, if parallelism is to be supported, you
>> will want a prange-like construct, as proving independence between
>> iterations can be very hard to near impossible for a compiler.
>
> I completely agree that you have to define some kind of syntax to get parallelism.  But, a prange construct would not be out of the question, of course.
>
>>
>> As for code generation, I'm not sure how llvm would do things like
>> slicing arrays, reshaping, resizing etc (for vector expressions you
>> can first evaluate all slicing and indexing operations and then
>> compile the remaining vector expression), but for loops and array
>> reassignment within loops this would have to invoke the actual slicing
>> code from the llvm code (I presume).
>
> There could be some analysis on the byte-code, prior to emitting the llvm code in order to handle lots of things.   Basically, you have to "play" the byte-code on a simple machine anyway in order to emit the correct code.   The big thing about Cython is you have to typedef too many things that are really quite knowable from the code.   If Cython could improve it's type inference, then it would be a more suitable target.
>
>> There are many other things, like
>> bounds checking, wraparound, etc, that are all supported in both numpy
>> and Cython, but going through an llvm layer would as far as I can see,
>> require re-implementing those, at least if you want top-notch
>> performance. Personally, I think for non-trivial performance-critical
>> code (for loops with indexing, slicing, function calls, etc) Cython is
>> a better target.
>
> With libclang it is really quite possible to imagine a cython -> C target that itself compiles to llvm so that you can do everything at that intermediate layer.   However,  LLVM is a much better layer for optimization than C now that there are a lot of people collaborating on that layer.   I think it would be great if Cython targeted LLVM actually instead of C.
>
>>
>> Finally, as for non-vector-expression code, I really believe Cython is
>> a better target. cython.inline can have high overhead (at least the
>> first time it has to compile), but with better (numpy-aware) type
>> inference or profile guided optimizations (see recent threads on the
>> cython-dev mailing list), in addition to things like prange, I
>> personally believe Cython targets most of the use cases where numba
>> would be able to generate performing code.
>
> Cython and Numba certainly overlap.  However, Cython requires:
>
>        1) learning another language
>        2) creating an extension module --- loading bit-code files and dynamically executing (even on a different machine from the one that initially created them) can be a powerful alternative for run-time compilation and distribution of code.
>
> These aren't show-stoppers obviously.   But, I think some users would prefer an even simpler approach to getting fast-code than Cython (which currently doesn't do enought type-inference and requires building a dlopen extension module).

Dag and I have been discussing this at PyCon, and here is my take on
it (at this moment :).

Definitely, if you can avoid Cython then that is easier and more
desirable in many ways. So perhaps we can create a third project
called X (I'm not very creative, maybe ArrayExprOpt), that takes an
abstract syntax tree in a rather simple form, performs code
optimizations such as rewriting loops with array accesses to vector
expressions, fusing vector expressions and loops, etc, and spits out a
transformed AST containing these optimizations. If runtime information
is given such as actual shape and stride information the
transformations could figure out there and then whether to do things
like collapsing, axes swapping, blocking (as in, introducing more axes
or loops to retain discontiguous blocks in the cache), blocked memory
copies to contiguous chunks, etc. The AST could then also say whether
the final expressions are vectorizable. Part of this functionality is
already in numpy's nditer, except that this would be implicit and do
more (and hopefully with minimal overhead).

So numba, Cython and maybe numexpr could use the functionality, simply
by building the AST from Python and converting back (if necessary) to
its own AST. As such, the AST optimizer would be only part of any
(runtime) compiler's pipeline, and it should be very flexible to
retain any information (metadata regarding actual types, control flow
information, etc) provided by the original AST. It would not do
control flow analysis, type inference or promotion, etc, but only deal
with abstract types like integers, reals and arrays (C, Fortran or
partly contiguous or strided). It would not deal with objects, but
would allow to insert nodes like UnreorderableNode and SideEffectNode
wrapping parts of the original AST. In short, it should be as easy as
possible to convert from an original AST to this project's AST and
back again afterwards.

As the project matures many optimizations may be added that deal with
all sorts of loop restructuring and ways to efficiently utilize the
cache as well as enable vectorization and possibly parallelism.
Perhaps it could even generate a different AST depending on whether
execution target the CPU or the GPU (with optionally available
information such as cache sizes, GPU shared/local memory sizes, etc).

Seeing that this would be a part of my master dissertation, my
supervisor would require me to write the code, so at least until
August I think I would have to write (at least the bulk of) this.
Otherwise I can also make other parts of my dissertation's project
more prominent to make up for it. Anyway, my question is, is there
interest from at least the numba and numexpr projects (if code can be
transformed into vector operations, it makes sense to use numexpr for
that, I'm not sure what numba's interest is in that).


> -Travis
>
>
>
>
>>
>>> Dag
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/f6dfec29/attachment.html>

From d.s.seljebotn at astro.uio.no  Tue Mar 20 15:44:17 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 20 Mar 2012 12:44:17 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
Message-ID: <6d1ad0e3-1523-416c-b344-016c9a0b4071@email.android.com>

Sorry, forgot to CC list on this. Lines staring with single greater-than are mine.
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:


Francesc Alted <francesc at continuum.io> wrote:

>On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>>> Cython and Numba certainly overlap. However, Cython requires:
>>> 
>>> 1) learning another language
>>> 2) creating an extension module --- loading bit-code files
>and dynamically executing (even on a different machine from the one
>that initially created them) can be a powerful alternative for run-time
>compilation and distribution of code.
>>> 
>>> These aren't show-stoppers obviously. But, I think some users
>would prefer an even simpler approach to getting fast-code than Cython
>(which currently doesn't do enought type-inference and requires
>building a dlopen extension module).
>> 
>> Dag and I have been discussing this at PyCon, and here is my take on
>> it (at this moment :).
>> 
>> Definitely, if you can avoid Cython then that is easier and more
>> desirable in many ways. So perhaps we can create a third project
>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>> abstract syntax tree in a rather simple form, performs code
>> optimizations such as rewriting loops with array accesses to vector
>> expressions, fusing vector expressions and loops, etc, and spits out
>a
>> transformed AST containing these optimizations. If runtime
>information
>> is given such as actual shape and stride information the
>> transformations could figure out there and then whether to do things
>> like collapsing, axes swapping, blocking (as in, introducing more
>axes
>> or loops to retain discontiguous blocks in the cache), blocked memory
>> copies to contiguous chunks, etc. The AST could then also say whether
>> the final expressions are vectorizable. Part of this functionality is
>> already in numpy's nditer, except that this would be implicit and do
>> more (and hopefully with minimal overhead).
>> 
>> So numba, Cython and maybe numexpr could use the functionality,
>simply
>> by building the AST from Python and converting back (if necessary) to
>> its own AST. As such, the AST optimizer would be only part of any
>> (runtime) compiler's pipeline, and it should be very flexible to
>> retain any information (metadata regarding actual types, control flow
>> information, etc) provided by the original AST. It would not do
>> control flow analysis, type inference or promotion, etc, but only
>deal
>> with abstract types like integers, reals and arrays (C, Fortran or
>> partly contiguous or strided). It would not deal with objects, but
>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>> wrapping parts of the original AST. In short, it should be as easy as
>> possible to convert from an original AST to this project's AST and
>> back again afterwards.
>
>I think this is a very interesting project, and certainly projects like
>numba can benefit of it. So, in order to us have an idea on what you
>are after, can we assume that your project (call it X) would be kind of
>an compiler optimizer, and then the produced, optimized code could be
>feed into numba for optimized LLVM code generation (that on its turn,
>can be run on top of CPUs or GPUs or a combination)? Is that correct?

I think so. Another way of thinking about it is that it is a reimplementation of the logic in the (good and closed source) Fortran 90 compilers, in a reusable component for inclusion in various compilers.

Various c++ metaprogramming libraries (like Blitz++) are similar too.

>
>Giving that my interpretation above is correct, it is bit more
>difficult to me to see how your X project could be of benefit for
>numexpr. In fact, I actually see this the other way round: once the
>optimizer has discovered the vectorization parts, then go one step
>further and generate code that uses numexpr automatically (call this,
>vectorization through numexpr). This is what you mean, or I'm missing
>something?

No. I think in some ways this is a competitor to numexpr -- you would gut out the middle of numexpr and keep the frontend and backend, but use this to optimize iteration order and blocking strategies.

I think the goal is for higher performance than what I understand numexpr can provide (in some cases, not all!). For instance, can numexpr deal well with

a + a.T

where a is a c-contiguous array? Any numpy-like iteration order will not work well, one needs to use higher-dimensional (eg 2D) blocking, not 1D blocking.

(if numexpr can do this then great, the task might then reduce to refactoring numexpr so that cython and numba can use the same logic)

Dag

>
>> As the project matures many optimizations may be added that deal with
>> all sorts of loop restructuring and ways to efficiently utilize the
>> cache as well as enable vectorization and possibly parallelism.
>> Perhaps it could even generate a different AST depending on whether
>> execution target the CPU or the GPU (with optionally available
>> information such as cache sizes, GPU shared/local memory sizes, etc).
>> 
>> Seeing that this would be a part of my master dissertation, my
>> supervisor would require me to write the code, so at least until
>> August I think I would have to write (at least the bulk of) this.
>> Otherwise I can also make other parts of my dissertation's project
>> more prominent to make up for it. Anyway, my question is, is there
>> interest from at least the numba and numexpr projects (if code can be
>> transformed into vector operations, it makes sense to use numexpr for
>> that, I'm not sure what numba's interest is in that).
>
>I'm definitely interested for the numexpr part. It is just that I'm
>still struggling to see the big picture on this. But the general idea
>is really appealing.
>
>Thanks,
>
>-- Francesc Alted

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/f428a229/attachment.html>

From shish at keba.be  Tue Mar 20 15:49:18 2012
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 20 Mar 2012 15:49:18 -0400
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
	<2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
Message-ID: <CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>

I doubt Theano is already as smart as you'd want it to be right now,
however the core mechanisms are there to perform graph optimizations and
move computations to GPU. It may save time to start from there instead of
starting all over from scratch. I'm not sure though, but it looks like it
would be worth considering it at least.

-=- Olivier

Le 20 mars 2012 15:40, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> a
?crit :

> ** We talked some about Theano. There are some differences in project
> goals which means that it makes sense to make this a seperate project:
> Cython wants to use this to generate C code up front from the Cython AST at
> compilation time; numba also has a different frontend (parsing of python
> bytecode) and a different backend (LLVM).
>
> However, it may very well be possible that Theano could be refactored so
> that the more essential algorithms working on the syntax tree could be
> pulled out and shared with cython and numba. Then the question is whether
> the core of Theano is smart enough to compete with Fortran compilers and
> support arbitraily strided inputs optimally. Otherwise one might as well
> start from scratch. I'll leave that for Mark to figure out...
>
> Dag
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>
>
> Olivier Delalleau <shish at keba.be> wrote:
>>
>> This sounds a lot like Theano, did you look into it?
>>
>> -=- Olivier
>>
>> Le 20 mars 2012 13:49, mark florisson <markflorisson88 at gmail.com> a
>> ?crit :
>>
>>> On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
>>> >>>
>>> >>> (Mark F., how does the above match how you feel about this?)
>>> >>
>>> >> I would like collaboration, but from a technical perspective I think
>>> >> this would be much more involved than just dumping the AST to an IR
>>> >> and generating some code from there. For vector expressions I think
>>> >> sharing code would be more feasible than arbitrary (parallel) loops,
>>> >> etc. Cython as a compiler can make many decisions that a Python
>>> >> (bytecode) compiler can't make (at least without annotations and a
>>> >> well-defined subset of the language (not so much the syntax as the
>>> >> semantics)). I think in numba, if parallelism is to be supported, you
>>> >> will want a prange-like construct, as proving independence between
>>> >> iterations can be very hard to near impossible for a compiler.
>>> >
>>> > I completely agree that you have to define some kind of syntax to get
>>> parallelism.  But, a prange construct would not be out of the question, of
>>> course.
>>> >
>>> >>
>>> >> As for code generation, I'm not sure how llvm would do things like
>>> >> slicing arrays, reshaping, resizing etc (for vector expressions you
>>> >> can first evaluate all slicing and indexing operations and then
>>> >> compile the remaining vector expression), but for loops and array
>>> >> reassignment within loops this would have to invoke the actual slicing
>>> >> code from the llvm code (I presume).
>>> >
>>> > There could be some analysis on the byte-code, prior to emitting the
>>> llvm code in order to handle lots of things.   Basically, you have to
>>> "play" the byte-code on a simple machine anyway in order to emit the
>>> correct code.   The big thing about Cython is you have to typedef too many
>>> things that are really quite knowable from the code.   If Cython could
>>> improve it's type inference, then it would be a more suitable target.
>>> >
>>> >> There are many other things, like
>>> >> bounds checking, wraparound, etc, that are all supported in both numpy
>>> >> and Cython, but going through an llvm layer would as far as I can see,
>>> >> require re-implementing those, at least if you want top-notch
>>> >> performance. Personally, I think for non-trivial performance-critical
>>> >> code (for loops with indexing, slicing, function calls, etc) Cython is
>>> >> a better target.
>>> >
>>> > With libclang it is really quite possible to imagine a cython -> C
>>> target that itself compiles to llvm so that you can do everything at that
>>> intermediate layer.   However,  LLVM is a much better layer for
>>> optimization than C now that there are a lot of people collaborating on
>>> that layer.   I think it would be great if Cython targeted LLVM actually
>>> instead of C.
>>> >
>>> >>
>>> >> Finally, as for non-vector-expression code, I really believe Cython is
>>> >> a better target. cython.inline can have high overhead (at least the
>>> >> first time it has to compile), but with better (numpy-aware) type
>>> >> inference or profile guided optimizations (see recent threads on the
>>> >> cython-dev mailing list), in addition to things like prange, I
>>> >> personally believe Cython targets most of the use cases where numba
>>> >> would be able to generate performing code.
>>> >
>>> > Cython and Numba certainly overlap.  However, Cython requires:
>>> >
>>> >        1) learning another language
>>> >        2) creating an extension module --- loading bit-code files and
>>> dynamically executing (even on a different machine from the one that
>>> initially created them) can be a powerful alternative for run-time
>>> compilation and distribution of code.
>>> >
>>> > These aren't show-stoppers obviously.   But, I think some users would
>>> prefer an even simpler approach to getting fast-code than Cython (which
>>> currently doesn't do enought type-inference and requires building a dlopen
>>> extension module).
>>>
>>> Dag and I have been discussing this at PyCon, and here is my take on
>>> it (at this moment :).
>>>
>>> Definitely, if you can avoid Cython then that is easier and more
>>> desirable in many ways. So perhaps we can create a third project
>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>> abstract syntax tree in a rather simple form, performs code
>>> optimizations such as rewriting loops with array accesses to vector
>>> expressions, fusing vector expressions and loops, etc, and spits out a
>>> transformed AST containing these optimizations. If runtime information
>>> is given such as actual shape and stride information the
>>> transformations could figure out there and then whether to do things
>>> like collapsing, axes swapping, blocking (as in, introducing more axes
>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>> copies to contiguous chunks, etc. The AST could then also say whether
>>> the final expressions are vectorizable. Part of this functionality is
>>> already in numpy's nditer, except that this would be implicit and do
>>> more (and hopefully with minimal overhead).
>>>
>>> So numba, Cython and maybe numexpr could use the functionality, simply
>>> by building the AST from Python and converting back (if necessary) to
>>> its own AST. As such, the AST optimizer would be only part of any
>>> (runtime) compiler's pipeline, and it should be very flexible to
>>> retain any information (metadata regarding actual types, control flow
>>> information, etc) provided by the original AST. It would not do
>>> control flow analysis, type inference or promotion, etc, but only deal
>>> with abstract types like integers, reals and arrays (C, Fortran or
>>> partly contiguous or strided). It would not deal with objects, but
>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>> wrapping parts of the original AST. In short, it should be as easy as
>>> possible to convert from an original AST to this project's AST and
>>> back again afterwards.
>>>
>>> As the project matures many optimizations may be added that deal with
>>> all sorts of loop restructuring and ways to efficiently utilize the
>>> cache as well as enable vectorization and possibly parallelism.
>>> Perhaps it could even generate a different AST depending on whether
>>> execution target the CPU or the GPU (with optionally available
>>> information such as cache sizes, GPU shared/local memory sizes, etc).
>>>
>>> Seeing that this would be a part of my master dissertation, my
>>> supervisor would require me to write the code, so at least until
>>> August I think I would have to write (at least the bulk of) this.
>>> Otherwise I can also make other parts of my dissertation's project
>>> more prominent to make up for it. Anyway, my question is, is there
>>> interest from at least the numba and numexpr projects (if code can be
>>> transformed into vector operations, it makes sense to use numexpr for
>>> that, I'm not sure what numba's interest is in that).
>>>
>>> > -Travis
>>> >
>>> >
>>> >
>>> >
>>> >>
>>> >>> Dag
>>> >>> _______________________________________________
>>> >>> NumPy-Discussion mailing list
>>> >>> NumPy-Discussion at scipy.org
>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >>>
>>> >>>
>>> >>>
>>> >>> _______________________________________________
>>> >>> NumPy-Discussion mailing list
>>> >>> NumPy-Discussion at scipy.org
>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >>>
>>> >> _______________________________________________
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion at scipy.org
>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> >
>>> > _______________________________________________
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion at scipy.org
>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/cd2aeba8/attachment.html>

From francesc at continuum.io  Tue Mar 20 15:56:04 2012
From: francesc at continuum.io (Francesc Alted)
Date: Tue, 20 Mar 2012 14:56:04 -0500
Subject: [Numpy-discussion] Looking for people interested in helping
	with Python compiler to LLVM
In-Reply-To: <47467832-971b-4a39-b25f-dbb2e1779c9b@email.android.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<74EF7B8D-FADC-4091-A7EB-09472A85B75B@continuum.io>
	<47467832-971b-4a39-b25f-dbb2e1779c9b@email.android.com>
Message-ID: <6CAE2B1E-9EFB-4B23-8D81-0AC013C35EFB@continuum.io>

On Mar 20, 2012, at 2:29 PM, Dag Sverre Seljebotn wrote:
> Francesc Alted <francesc at continuum.io> wrote:
> 
>> On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>>>> Cython and Numba certainly overlap.  However, Cython requires:
>>>> 
>>>>       1) learning another language
>>>>       2) creating an extension module --- loading bit-code files
>> and dynamically executing (even on a different machine from the one
>> that initially created them) can be a powerful alternative for run-time
>> compilation and distribution of code.
>>>> 
>>>> These aren't show-stoppers obviously.   But, I think some users
>> would prefer an even simpler approach to getting fast-code than Cython
>> (which currently doesn't do enought type-inference and requires
>> building a dlopen extension module).
>>> 
>>> Dag and I have been discussing this at PyCon, and here is my take on
>>> it (at this moment :).
>>> 
>>> Definitely, if you can avoid Cython then that is easier and more
>>> desirable in many ways. So perhaps we can create a third project
>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>> abstract syntax tree in a rather simple form, performs code
>>> optimizations such as rewriting loops with array accesses to vector
>>> expressions, fusing vector expressions and loops, etc, and spits out
>> a
>>> transformed AST containing these optimizations. If runtime
>> information
>>> is given such as actual shape and stride information the
>>> transformations could figure out there and then whether to do things
>>> like collapsing, axes swapping, blocking (as in, introducing more
>> axes
>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>> copies to contiguous chunks, etc. The AST could then also say whether
>>> the final expressions are vectorizable. Part of this functionality is
>>> already in numpy's nditer, except that this would be implicit and do
>>> more (and hopefully with minimal overhead).
>>> 
>>> So numba, Cython and maybe numexpr could use the functionality,
>> simply
>>> by building the AST from Python and converting back (if necessary) to
>>> its own AST. As such, the AST optimizer would be only part of any
>>> (runtime) compiler's pipeline, and it should be very flexible to
>>> retain any information (metadata regarding actual types, control flow
>>> information, etc) provided by the original AST. It would not do
>>> control flow analysis, type inference or promotion, etc, but only
>> deal
>>> with abstract types like integers, reals and arrays (C, Fortran or
>>> partly contiguous or strided). It would not deal with objects, but
>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>> wrapping parts of the original AST. In short, it should be as easy as
>>> possible to convert from an original AST to this project's AST and
>>> back again afterwards.
>> 
>> I think this is a very interesting project, and certainly projects like
>> numba can benefit of it.  So, in order to us have an idea on what you
>> are after, can we assume that your project (call it X) would be kind of
>> an compiler optimizer, and then the produced, optimized code could be
>> feed into numba for optimized LLVM code generation (that on its turn,
>> can be run on top of CPUs or GPUs or a combination)?  Is that correct?
> 
> I think so. Another way of thinking about it is that it is a reimplementation of the logic in the (good and closed source) Fortran 90 compilers, in a reusable component for inclusion in various compilers.
> 
> Various c++ metaprogramming libraries (like Blitz++) are similar too.

Aha, thanks.

>> Giving that my interpretation above is correct, it is bit more
>> difficult to me to see how your X project could be of benefit for
>> numexpr.  In fact, I actually see this the other way round: once the
>> optimizer has discovered the vectorization parts, then go one step
>> further and generate code that uses numexpr automatically (call this,
>> vectorization through numexpr).  This is what you mean, or I'm missing
>> something?
> 
> No. I think in some ways this is a competitor to numexpr -- you would gut out the middle of numexpr and keep the frontend and backend, but use this to optimize iteration order and blocking strategies.

I see.  Yes, I can easily see Mark's project X + numba more as a competitor (killer?) to numexpr too.

> 
> I think the goal is for higher performance than what I understand numexpr can provide (in some cases, not all!). For instance, can numexpr deal well with
> 
> a + a.T
> 
> where a is a c-contiguous array? Any numpy-like iteration order will not work well, one needs to use higher-dimensional (eg 2D) blocking, not 1D blocking.

No.  numexpr cannot deal with the above problem efficiently.  numexpr is about 1d blocking, so its approach is pretty naive (but effective for these 1d blocking tasks).  

> (if numexpr can do this then great, the task might then reduce to refactoring numexpr so that cython and numba can use the same logic)

Well, now that I think about it, the virtual machine in latest numexpr (2.0) is based on the new nditer iterator included in NumPy 1.6, so I'm wondering if with a little bit of more effort, the 2d blocking could be implemented.  While I'm pretty sure this is the case, I don't know if it would be really worth the effort.  Perhaps concentrating on things like numba + projectX, or just Theano would be better targets.  Perhaps Mark Wiebe (who contributed the new nditer-aware VM) could say a bit more about this.

Hmm, some good food for brains :)

-- Francesc Alted


From ralf.gommers at googlemail.com  Tue Mar 20 18:02:38 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 20 Mar 2012 23:02:38 +0100
Subject: [Numpy-discussion] Trouble building NumPy on PPC64
In-Reply-To: <0c1b056b6bd43c833d13fdf4495289cb.squirrel@srv2.s4y.tournesol-consulting.eu>
References: <0c1b056b6bd43c833d13fdf4495289cb.squirrel@srv2.s4y.tournesol-consulting.eu>
Message-ID: <CABL7CQj2SBsehehSyC18HzFa62NsbPgxF7pCHZ8ws_KdHzn33A@mail.gmail.com>

On Mon, Mar 19, 2012 at 6:45 PM, Andreas H. <lists at hilboll.de> wrote:

> Hi all,
>
> I have troube installing numpy in a virtual environment on a SuSE
> Enterprise 11 server (ppc64).
>
> Here is what I did:
>
>    curl -O https://raw.github.com/pypa/virtualenv/master/virtualenv.py
>    python virtualenv.py --distribute --no-site-packages .virtualenvs/pydoas
>    source .virtualenvs/pydoas/bin/activate
>    pip install numpy
>
> And here is the outcome:
>
>    SystemError: Cannot compile 'Python.h'. Perhaps you need to install
> python-dev|python-devel.
>
> However, Python.h exists, because I did install the python-devel package:
>
>    (pydoas)hilboll at odin:~/.virtualenvs/pydoas/build/numpy> find
> /usr/include/ | grep Python.h
>    /usr/include/python2.6/Python.h
>
> I also tried without the --distribute --no-site-packages flags, with the
> same result.
>
> Any hints are very welcome :)
>

Hard to say what's going wrong. Perhaps pip is not for python2.6 but
another python version? Can you install without pip, so normal "python
setup.py install"? Can you post the full build log?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/8fb17c3f/attachment.html>

From chris.barker at noaa.gov  Tue Mar 20 18:59:25 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 20 Mar 2012 15:59:25 -0700
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CAM-+wY8ieo0qgUGXNjTDWk3c_O8nr2c0Ha3sDVmxJJYyvdV8mg@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<1330092347-sup-3918@rohan>
	<CAJPUwMCTCy29N2iv2HGy4r1uPcCz6d0_mFi9hh7JkgwdbEkVFQ@mail.gmail.com>
	<1330207186-sup-1957@rohan> <loom.20120227T062258-15@post.gmane.org>
	<1330351883-sup-9943@rohan>
	<CAPJVwB=H2P68CU_oKAf7n_fvj2q1Ne8TSw153_twAY0aq69H6A@mail.gmail.com>
	<1330365437-sup-7898@rohan>
	<CAPJVwBnaEd=EF9xD_-=VJeCMTYpb=7gOwxRYoQ9xce7xqiwWEg@mail.gmail.com>
	<1330387831-sup-839@rohan>
	<CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
	<CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>
	<CALGmxEJZjfu_X6M_hU6sqsPNTX=hX+gL_iZc92kpdpikyupapg@mail.gmail.com>
	<CAM-+wY8ieo0qgUGXNjTDWk3c_O8nr2c0Ha3sDVmxJJYyvdV8mg@mail.gmail.com>
Message-ID: <CALGmxE+PpHL+g1gkx1hq-KmTotX=cGF2yes0agzEbY93w8-oUA@mail.gmail.com>

Warren et al:

On Wed, Mar 7, 2012 at 7:49 AM, Warren Weckesser
<warren.weckesser at enthought.com> wrote:
> If you are setup with Cython to build extension modules,

I am

> and you don't mind
> testing an unreleased and experimental reader,

and I don't.

> you can try the text reader
> that I'm working on: https://github.com/WarrenWeckesser/textreader

It just took me a while to get around to it!

First of all: this is pretty much exactly what I've been looking for
for years, and never got around to writing myself - thanks!

My comments/suggestions:

1) a docstring for the textreader module would be nice.

2) "tzoffset" -- this is tricky stuff. Ideally, it should be able to
parse an ISO datetime string timezone specifier, but short of that, I
think the default should be None or UTC -- time zones are too ugly to
presume anything!

3) it breaks with the old MacOS style line endings: \r only. Maybe no
need to support that, but it turns out one of my old test files still
had them!

4) when I try to read more rows than are in the file, I get:
   File "textreader.pyx", line 247, in textreader.readrows
(python/textreader.c:3469)
  ValueError: negative dimensions are not allowed

good to get an error, but it's not very informative!

5) for reading float64 values -- I get something different with
textreader than with the python "float()":
  input: "678.901"
  float("") :  678.90099999999995
  textreader : 678.90100000000007

as close as the number of figures available, but curious...


5) Performance issue: in my case, I'm reading a big file that's in
chunks -- each one has a header indicating how many rows follow, then
the rows, so I parse it out bit by bit.
For smallish files, it's much faster than pure python, and almost as
fast as some old C code of mine that is far less flexible.

But for large files,  -- it's much slower -- indeed slower than a pure
python version for my use case.

I did a simplified test -- with 10,000 rows:

total number of rows:  10000
pure python took: 1.410408 seconds
pure python chunks took: 1.613094 seconds
textreader all at once took: 0.067098 seconds
textreader in chunks took : 0.131802 seconds

but with 1,000,000 rows:

total number of rows:  1000000
total number of chunks:  1000
pure python took: 30.712564 seconds
pure python chunks took: 31.313225 seconds
textreader all at once took: 1.314924 seconds
textreader in chunks took : 9.684819 seconds

then it gets even worse with the chunk size smaller:

total number of rows:  1000000
total number of chunks:  10000
pure python took: 30.032246 seconds
pure python chunks took: 42.010589 seconds
textreader all at once took: 1.318613 seconds
textreader in chunks took : 87.743729 seconds

my code, which is C that essentially runs fscanf over the file, has
essentially no performance hit from doing it in chunks -- so I think
something is wrong here.

Sorry, I haven't dug into the code to try to figure out what yet --
does it rewind the file each time maybe?

Enclosed is my test code.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_performance.py
Type: application/octet-stream
Size: 3294 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120320/28bd9049/attachment.obj>

From matrixhasu at gmail.com  Tue Mar 20 19:28:54 2012
From: matrixhasu at gmail.com (Sandro Tosi)
Date: Wed, 21 Mar 2012 00:28:54 +0100
Subject: [Numpy-discussion] Testsuite fails with Python 2.7.3rc1 and
	3.2.3rc1 (Debian)
Message-ID: <CAB4XWXwUfnmoGOUtMBKrC1CyVgu+hnaK=fGi=6EE_vwbDO8zUA@mail.gmail.com>

Hello,
I've reported http://projects.scipy.org/numpy/ticket/2085 and Ralf
asked for bringing that up here: is anyone able to replicate the
problem described in that ticket?

The debian bug tracking the problem is:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=664672

Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi


From d.s.seljebotn at astro.uio.no  Wed Mar 21 00:20:52 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 20 Mar 2012 21:20:52 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <6CAE2B1E-9EFB-4B23-8D81-0AC013C35EFB@continuum.io>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<74EF7B8D-FADC-4091-A7EB-09472A85B75B@continuum.io>
	<47467832-971b-4a39-b25f-dbb2e1779c9b@email.android.com>
	<6CAE2B1E-9EFB-4B23-8D81-0AC013C35EFB@continuum.io>
Message-ID: <4F695724.8050309@astro.uio.no>

On 03/20/2012 12:56 PM, Francesc Alted wrote:
> On Mar 20, 2012, at 2:29 PM, Dag Sverre Seljebotn wrote:
>> Francesc Alted<francesc at continuum.io>  wrote:
>>
>>> On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>>>>> Cython and Numba certainly overlap.  However, Cython requires:
>>>>>
>>>>>        1) learning another language
>>>>>        2) creating an extension module --- loading bit-code files
>>> and dynamically executing (even on a different machine from the one
>>> that initially created them) can be a powerful alternative for run-time
>>> compilation and distribution of code.
>>>>>
>>>>> These aren't show-stoppers obviously.   But, I think some users
>>> would prefer an even simpler approach to getting fast-code than Cython
>>> (which currently doesn't do enought type-inference and requires
>>> building a dlopen extension module).
>>>>
>>>> Dag and I have been discussing this at PyCon, and here is my take on
>>>> it (at this moment :).
>>>>
>>>> Definitely, if you can avoid Cython then that is easier and more
>>>> desirable in many ways. So perhaps we can create a third project
>>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>>> abstract syntax tree in a rather simple form, performs code
>>>> optimizations such as rewriting loops with array accesses to vector
>>>> expressions, fusing vector expressions and loops, etc, and spits out
>>> a
>>>> transformed AST containing these optimizations. If runtime
>>> information
>>>> is given such as actual shape and stride information the
>>>> transformations could figure out there and then whether to do things
>>>> like collapsing, axes swapping, blocking (as in, introducing more
>>> axes
>>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>>> copies to contiguous chunks, etc. The AST could then also say whether
>>>> the final expressions are vectorizable. Part of this functionality is
>>>> already in numpy's nditer, except that this would be implicit and do
>>>> more (and hopefully with minimal overhead).
>>>>
>>>> So numba, Cython and maybe numexpr could use the functionality,
>>> simply
>>>> by building the AST from Python and converting back (if necessary) to
>>>> its own AST. As such, the AST optimizer would be only part of any
>>>> (runtime) compiler's pipeline, and it should be very flexible to
>>>> retain any information (metadata regarding actual types, control flow
>>>> information, etc) provided by the original AST. It would not do
>>>> control flow analysis, type inference or promotion, etc, but only
>>> deal
>>>> with abstract types like integers, reals and arrays (C, Fortran or
>>>> partly contiguous or strided). It would not deal with objects, but
>>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>>> wrapping parts of the original AST. In short, it should be as easy as
>>>> possible to convert from an original AST to this project's AST and
>>>> back again afterwards.
>>>
>>> I think this is a very interesting project, and certainly projects like
>>> numba can benefit of it.  So, in order to us have an idea on what you
>>> are after, can we assume that your project (call it X) would be kind of
>>> an compiler optimizer, and then the produced, optimized code could be
>>> feed into numba for optimized LLVM code generation (that on its turn,
>>> can be run on top of CPUs or GPUs or a combination)?  Is that correct?
>>
>> I think so. Another way of thinking about it is that it is a reimplementation of the logic in the (good and closed source) Fortran 90 compilers, in a reusable component for inclusion in various compilers.
>>
>> Various c++ metaprogramming libraries (like Blitz++) are similar too.
>
> Aha, thanks.
>
>>> Giving that my interpretation above is correct, it is bit more
>>> difficult to me to see how your X project could be of benefit for
>>> numexpr.  In fact, I actually see this the other way round: once the
>>> optimizer has discovered the vectorization parts, then go one step
>>> further and generate code that uses numexpr automatically (call this,
>>> vectorization through numexpr).  This is what you mean, or I'm missing
>>> something?
>>
>> No. I think in some ways this is a competitor to numexpr -- you would gut out the middle of numexpr and keep the frontend and backend, but use this to optimize iteration order and blocking strategies.
>
> I see.  Yes, I can easily see Mark's project X + numba more as a competitor (killer?) to numexpr too.
>
>>
>> I think the goal is for higher performance than what I understand numexpr can provide (in some cases, not all!). For instance, can numexpr deal well with
>>
>> a + a.T
>>
>> where a is a c-contiguous array? Any numpy-like iteration order will not work well, one needs to use higher-dimensional (eg 2D) blocking, not 1D blocking.
>
> No.  numexpr cannot deal with the above problem efficiently.  numexpr is about 1d blocking, so its approach is pretty naive (but effective for these 1d blocking tasks).
>
>> (if numexpr can do this then great, the task might then reduce to refactoring numexpr so that cython and numba can use the same logic)
>
> Well, now that I think about it, the virtual machine in latest numexpr (2.0) is based on the new nditer iterator included in NumPy 1.6, so I'm wondering if with a little bit of more effort, the 2d blocking could be implemented.  While I'm pretty sure this is the case, I don't know if it would be really worth the effort.  Perhaps concentrating on things like numba + projectX, or just Theano would be better targets.  Perhaps Mark Wiebe (who contributed the new nditer-aware VM) could say a bit more about this.

My guess is that the answer is that nditer works well in many 
situations, but can be sub-optimal for some array shapes, particularly 
if the arrays fit in cache.

Here's a contrived bad case (not sure if it is the worst): Consider

arr[::2, :]

where arr is C-contiguous with shape (n, 2), and let n be such that the 
array fits in L2 cache :-)

The ::2 ensures that the array can't be flattened. Any nditer approach, 
as I understand it, would need to execute the inner subroutine for each 
row of 4 elements, and then invoke the iteration machinery, which I 
imagine is quite slow compared to LLVM code generated specifically for 
this array, which could even unroll the inner 4-element loop.

If one really wants to compete with Fortran, one must take into account 
that the scientific end-user may already be structuring the computation 
in a cache-efficient way. It doesn't always help with good performance 
"as long as arrays are >10 MB".

If the array is large, the memory bottleneck should makes a lot of the 
CPU overhead of nditer irrelevant even for worst-case arrays; though I'd 
be curious to see benchmarks of how much overhead is left, and my 
(rather unfounded) suspicion is that hard-wired compiled LLVM code for a 
specific array should play better with the CPUs cache predictor to 
better mask memory access latencies (?)

(Mark F., apropos benchmarks, let me know what kind of different 
hardware you have access to; I could see if I can give you access to a 
couple of boxes with different memory bus characteristics, i.e., Intel 
vs. AMD and so on.)

Dag


From d.s.seljebotn at astro.uio.no  Wed Mar 21 00:22:07 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 20 Mar 2012 21:22:07 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <4F695724.8050309@astro.uio.no>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<74EF7B8D-FADC-4091-A7EB-09472A85B75B@continuum.io>
	<47467832-971b-4a39-b25f-dbb2e1779c9b@email.android.com>
	<6CAE2B1E-9EFB-4B23-8D81-0AC013C35EFB@continuum.io>
	<4F695724.8050309@astro.uio.no>
Message-ID: <4F69576F.8070903@astro.uio.no>

On 03/20/2012 09:20 PM, Dag Sverre Seljebotn wrote:
> On 03/20/2012 12:56 PM, Francesc Alted wrote:
>> On Mar 20, 2012, at 2:29 PM, Dag Sverre Seljebotn wrote:
>>> Francesc Alted<francesc at continuum.io>   wrote:
>>>
>>>> On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>>>>>> Cython and Numba certainly overlap.  However, Cython requires:
>>>>>>
>>>>>>         1) learning another language
>>>>>>         2) creating an extension module --- loading bit-code files
>>>> and dynamically executing (even on a different machine from the one
>>>> that initially created them) can be a powerful alternative for run-time
>>>> compilation and distribution of code.
>>>>>>
>>>>>> These aren't show-stoppers obviously.   But, I think some users
>>>> would prefer an even simpler approach to getting fast-code than Cython
>>>> (which currently doesn't do enought type-inference and requires
>>>> building a dlopen extension module).
>>>>>
>>>>> Dag and I have been discussing this at PyCon, and here is my take on
>>>>> it (at this moment :).
>>>>>
>>>>> Definitely, if you can avoid Cython then that is easier and more
>>>>> desirable in many ways. So perhaps we can create a third project
>>>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>>>> abstract syntax tree in a rather simple form, performs code
>>>>> optimizations such as rewriting loops with array accesses to vector
>>>>> expressions, fusing vector expressions and loops, etc, and spits out
>>>> a
>>>>> transformed AST containing these optimizations. If runtime
>>>> information
>>>>> is given such as actual shape and stride information the
>>>>> transformations could figure out there and then whether to do things
>>>>> like collapsing, axes swapping, blocking (as in, introducing more
>>>> axes
>>>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>>>> copies to contiguous chunks, etc. The AST could then also say whether
>>>>> the final expressions are vectorizable. Part of this functionality is
>>>>> already in numpy's nditer, except that this would be implicit and do
>>>>> more (and hopefully with minimal overhead).
>>>>>
>>>>> So numba, Cython and maybe numexpr could use the functionality,
>>>> simply
>>>>> by building the AST from Python and converting back (if necessary) to
>>>>> its own AST. As such, the AST optimizer would be only part of any
>>>>> (runtime) compiler's pipeline, and it should be very flexible to
>>>>> retain any information (metadata regarding actual types, control flow
>>>>> information, etc) provided by the original AST. It would not do
>>>>> control flow analysis, type inference or promotion, etc, but only
>>>> deal
>>>>> with abstract types like integers, reals and arrays (C, Fortran or
>>>>> partly contiguous or strided). It would not deal with objects, but
>>>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>>>> wrapping parts of the original AST. In short, it should be as easy as
>>>>> possible to convert from an original AST to this project's AST and
>>>>> back again afterwards.
>>>>
>>>> I think this is a very interesting project, and certainly projects like
>>>> numba can benefit of it.  So, in order to us have an idea on what you
>>>> are after, can we assume that your project (call it X) would be kind of
>>>> an compiler optimizer, and then the produced, optimized code could be
>>>> feed into numba for optimized LLVM code generation (that on its turn,
>>>> can be run on top of CPUs or GPUs or a combination)?  Is that correct?
>>>
>>> I think so. Another way of thinking about it is that it is a reimplementation of the logic in the (good and closed source) Fortran 90 compilers, in a reusable component for inclusion in various compilers.
>>>
>>> Various c++ metaprogramming libraries (like Blitz++) are similar too.
>>
>> Aha, thanks.
>>
>>>> Giving that my interpretation above is correct, it is bit more
>>>> difficult to me to see how your X project could be of benefit for
>>>> numexpr.  In fact, I actually see this the other way round: once the
>>>> optimizer has discovered the vectorization parts, then go one step
>>>> further and generate code that uses numexpr automatically (call this,
>>>> vectorization through numexpr).  This is what you mean, or I'm missing
>>>> something?
>>>
>>> No. I think in some ways this is a competitor to numexpr -- you would gut out the middle of numexpr and keep the frontend and backend, but use this to optimize iteration order and blocking strategies.
>>
>> I see.  Yes, I can easily see Mark's project X + numba more as a competitor (killer?) to numexpr too.
>>
>>>
>>> I think the goal is for higher performance than what I understand numexpr can provide (in some cases, not all!). For instance, can numexpr deal well with
>>>
>>> a + a.T
>>>
>>> where a is a c-contiguous array? Any numpy-like iteration order will not work well, one needs to use higher-dimensional (eg 2D) blocking, not 1D blocking.
>>
>> No.  numexpr cannot deal with the above problem efficiently.  numexpr is about 1d blocking, so its approach is pretty naive (but effective for these 1d blocking tasks).
>>
>>> (if numexpr can do this then great, the task might then reduce to refactoring numexpr so that cython and numba can use the same logic)
>>
>> Well, now that I think about it, the virtual machine in latest numexpr (2.0) is based on the new nditer iterator included in NumPy 1.6, so I'm wondering if with a little bit of more effort, the 2d blocking could be implemented.  While I'm pretty sure this is the case, I don't know if it would be really worth the effort.  Perhaps concentrating on things like numba + projectX, or just Theano would be better targets.  Perhaps Mark Wiebe (who contributed the new nditer-aware VM) could say a bit more about this.
>
> My guess is that the answer is that nditer works well in many
> situations, but can be sub-optimal for some array shapes, particularly
> if the arrays fit in cache.
>
> Here's a contrived bad case (not sure if it is the worst): Consider
>
> arr[::2, :]
>
> where arr is C-contiguous with shape (n, 2), and let n be such that the
> array fits in L2 cache :-)
>
> The ::2 ensures that the array can't be flattened. Any nditer approach,
> as I understand it, would need to execute the inner subroutine for each
> row of 4 elements, and then invoke the iteration machinery, which I
> imagine is quite slow compared to LLVM code generated specifically for
> this array, which could even unroll the inner 4-element loop.

Sorry: "2 elements", "2-element loop".

Dag

>
> If one really wants to compete with Fortran, one must take into account
> that the scientific end-user may already be structuring the computation
> in a cache-efficient way. It doesn't always help with good performance
> "as long as arrays are>10 MB".
>
> If the array is large, the memory bottleneck should makes a lot of the
> CPU overhead of nditer irrelevant even for worst-case arrays; though I'd
> be curious to see benchmarks of how much overhead is left, and my
> (rather unfounded) suspicion is that hard-wired compiled LLVM code for a
> specific array should play better with the CPUs cache predictor to
> better mask memory access latencies (?)
>
> (Mark F., apropos benchmarks, let me know what kind of different
> hardware you have access to; I could see if I can give you access to a
> couple of boxes with different memory bus characteristics, i.e., Intel
> vs. AMD and so on.)
>
> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From warren.weckesser at enthought.com  Wed Mar 21 01:41:25 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Wed, 21 Mar 2012 00:41:25 -0500
Subject: [Numpy-discussion] Possible roadmap addendum: building better
 text file readers
In-Reply-To: <CALGmxE+PpHL+g1gkx1hq-KmTotX=cGF2yes0agzEbY93w8-oUA@mail.gmail.com>
References: <CAJPUwMD=q4kxjUJ4+cyC=hrPx8i7AUtg=ULmvngScDX3czGc9w@mail.gmail.com>
	<ji65gb$i5l$1@dough.gmane.org>
	<BBC7E390-BF69-42F9-B74E-2E673BE53F53@continuum.io>
	<1330092347-sup-3918@rohan>
	<CAJPUwMCTCy29N2iv2HGy4r1uPcCz6d0_mFi9hh7JkgwdbEkVFQ@mail.gmail.com>
	<1330207186-sup-1957@rohan> <loom.20120227T062258-15@post.gmane.org>
	<1330351883-sup-9943@rohan>
	<CAPJVwB=H2P68CU_oKAf7n_fvj2q1Ne8TSw153_twAY0aq69H6A@mail.gmail.com>
	<1330365437-sup-7898@rohan>
	<CAPJVwBnaEd=EF9xD_-=VJeCMTYpb=7gOwxRYoQ9xce7xqiwWEg@mail.gmail.com>
	<1330387831-sup-839@rohan>
	<CAPJVwBnO6xbBWREL-XNTfhhKzhawWzBTheZyeLm-=TOxrXBRiA@mail.gmail.com>
	<CACHfV1s-z6=hM_j+8t8VeLzbEs8kKXfiWrf3MxiNpioqKtseig@mail.gmail.com>
	<CALGmxEJZjfu_X6M_hU6sqsPNTX=hX+gL_iZc92kpdpikyupapg@mail.gmail.com>
	<CAM-+wY8ieo0qgUGXNjTDWk3c_O8nr2c0Ha3sDVmxJJYyvdV8mg@mail.gmail.com>
	<CALGmxE+PpHL+g1gkx1hq-KmTotX=cGF2yes0agzEbY93w8-oUA@mail.gmail.com>
Message-ID: <CAM-+wY_Fiq_cpmEvSjtTL6R315jmfGsLFOpwhXZJaBLuxcvGLg@mail.gmail.com>

On Tue, Mar 20, 2012 at 5:59 PM, Chris Barker <chris.barker at noaa.gov> wrote:

> Warren et al:
>
> On Wed, Mar 7, 2012 at 7:49 AM, Warren Weckesser
> <warren.weckesser at enthought.com> wrote:
> > If you are setup with Cython to build extension modules,
>
> I am
>
> > and you don't mind
> > testing an unreleased and experimental reader,
>
> and I don't.
>
> > you can try the text reader
> > that I'm working on: https://github.com/WarrenWeckesser/textreader
>
> It just took me a while to get around to it!
>
> First of all: this is pretty much exactly what I've been looking for
> for years, and never got around to writing myself - thanks!
>
> My comments/suggestions:
>
> 1) a docstring for the textreader module would be nice.
>
> 2) "tzoffset" -- this is tricky stuff. Ideally, it should be able to
> parse an ISO datetime string timezone specifier, but short of that, I
> think the default should be None or UTC -- time zones are too ugly to
> presume anything!
>
> 3) it breaks with the old MacOS style line endings: \r only. Maybe no
> need to support that, but it turns out one of my old test files still
> had them!
>
> 4) when I try to read more rows than are in the file, I get:
>   File "textreader.pyx", line 247, in textreader.readrows
> (python/textreader.c:3469)
>  ValueError: negative dimensions are not allowed
>
> good to get an error, but it's not very informative!
>
> 5) for reading float64 values -- I get something different with
> textreader than with the python "float()":
>  input: "678.901"
>  float("") :  678.90099999999995
>  textreader : 678.90100000000007
>
> as close as the number of figures available, but curious...
>
>
> 5) Performance issue: in my case, I'm reading a big file that's in
> chunks -- each one has a header indicating how many rows follow, then
> the rows, so I parse it out bit by bit.
> For smallish files, it's much faster than pure python, and almost as
> fast as some old C code of mine that is far less flexible.
>
> But for large files,  -- it's much slower -- indeed slower than a pure
> python version for my use case.
>
> I did a simplified test -- with 10,000 rows:
>
> total number of rows:  10000
> pure python took: 1.410408 seconds
> pure python chunks took: 1.613094 seconds
> textreader all at once took: 0.067098 seconds
> textreader in chunks took : 0.131802 seconds
>
> but with 1,000,000 rows:
>
> total number of rows:  1000000
> total number of chunks:  1000
> pure python took: 30.712564 seconds
> pure python chunks took: 31.313225 seconds
> textreader all at once took: 1.314924 seconds
> textreader in chunks took : 9.684819 seconds
>
> then it gets even worse with the chunk size smaller:
>
> total number of rows:  1000000
> total number of chunks:  10000
> pure python took: 30.032246 seconds
> pure python chunks took: 42.010589 seconds
> textreader all at once took: 1.318613 seconds
> textreader in chunks took : 87.743729 seconds
>
> my code, which is C that essentially runs fscanf over the file, has
> essentially no performance hit from doing it in chunks -- so I think
> something is wrong here.
>
> Sorry, I haven't dug into the code to try to figure out what yet --
> does it rewind the file each time maybe?
>
> Enclosed is my test code.
>
> -Chris
>


Chris,

Thanks!  The feedback is great.  I won't have time to get back to this for
another week or so, but then I'll look into the issues you reported.

Warren


>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120321/d5ffa56c/attachment.html>

From markflorisson88 at gmail.com  Wed Mar 21 09:12:04 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Wed, 21 Mar 2012 14:12:04 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <4F695724.8050309@astro.uio.no>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<74EF7B8D-FADC-4091-A7EB-09472A85B75B@continuum.io>
	<47467832-971b-4a39-b25f-dbb2e1779c9b@email.android.com>
	<6CAE2B1E-9EFB-4B23-8D81-0AC013C35EFB@continuum.io>
	<4F695724.8050309@astro.uio.no>
Message-ID: <CANg26EWuLjyVOH9LXA_10o3g=qJjyR5=oRKRF7_ZzSAg+8JiFA@mail.gmail.com>

On 21 March 2012 05:20, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> On 03/20/2012 12:56 PM, Francesc Alted wrote:
>> On Mar 20, 2012, at 2:29 PM, Dag Sverre Seljebotn wrote:
>>> Francesc Alted<francesc at continuum.io> ?wrote:
>>>
>>>> On Mar 20, 2012, at 12:49 PM, mark florisson wrote:
>>>>>> Cython and Numba certainly overlap. ?However, Cython requires:
>>>>>>
>>>>>> ? ? ? ?1) learning another language
>>>>>> ? ? ? ?2) creating an extension module --- loading bit-code files
>>>> and dynamically executing (even on a different machine from the one
>>>> that initially created them) can be a powerful alternative for run-time
>>>> compilation and distribution of code.
>>>>>>
>>>>>> These aren't show-stoppers obviously. ? But, I think some users
>>>> would prefer an even simpler approach to getting fast-code than Cython
>>>> (which currently doesn't do enought type-inference and requires
>>>> building a dlopen extension module).
>>>>>
>>>>> Dag and I have been discussing this at PyCon, and here is my take on
>>>>> it (at this moment :).
>>>>>
>>>>> Definitely, if you can avoid Cython then that is easier and more
>>>>> desirable in many ways. So perhaps we can create a third project
>>>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>>>> abstract syntax tree in a rather simple form, performs code
>>>>> optimizations such as rewriting loops with array accesses to vector
>>>>> expressions, fusing vector expressions and loops, etc, and spits out
>>>> a
>>>>> transformed AST containing these optimizations. If runtime
>>>> information
>>>>> is given such as actual shape and stride information the
>>>>> transformations could figure out there and then whether to do things
>>>>> like collapsing, axes swapping, blocking (as in, introducing more
>>>> axes
>>>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>>>> copies to contiguous chunks, etc. The AST could then also say whether
>>>>> the final expressions are vectorizable. Part of this functionality is
>>>>> already in numpy's nditer, except that this would be implicit and do
>>>>> more (and hopefully with minimal overhead).
>>>>>
>>>>> So numba, Cython and maybe numexpr could use the functionality,
>>>> simply
>>>>> by building the AST from Python and converting back (if necessary) to
>>>>> its own AST. As such, the AST optimizer would be only part of any
>>>>> (runtime) compiler's pipeline, and it should be very flexible to
>>>>> retain any information (metadata regarding actual types, control flow
>>>>> information, etc) provided by the original AST. It would not do
>>>>> control flow analysis, type inference or promotion, etc, but only
>>>> deal
>>>>> with abstract types like integers, reals and arrays (C, Fortran or
>>>>> partly contiguous or strided). It would not deal with objects, but
>>>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>>>> wrapping parts of the original AST. In short, it should be as easy as
>>>>> possible to convert from an original AST to this project's AST and
>>>>> back again afterwards.
>>>>
>>>> I think this is a very interesting project, and certainly projects like
>>>> numba can benefit of it. ?So, in order to us have an idea on what you
>>>> are after, can we assume that your project (call it X) would be kind of
>>>> an compiler optimizer, and then the produced, optimized code could be
>>>> feed into numba for optimized LLVM code generation (that on its turn,
>>>> can be run on top of CPUs or GPUs or a combination)? ?Is that correct?
>>>
>>> I think so. Another way of thinking about it is that it is a reimplementation of the logic in the (good and closed source) Fortran 90 compilers, in a reusable component for inclusion in various compilers.
>>>
>>> Various c++ metaprogramming libraries (like Blitz++) are similar too.
>>
>> Aha, thanks.
>>
>>>> Giving that my interpretation above is correct, it is bit more
>>>> difficult to me to see how your X project could be of benefit for
>>>> numexpr. ?In fact, I actually see this the other way round: once the
>>>> optimizer has discovered the vectorization parts, then go one step
>>>> further and generate code that uses numexpr automatically (call this,
>>>> vectorization through numexpr). ?This is what you mean, or I'm missing
>>>> something?
>>>
>>> No. I think in some ways this is a competitor to numexpr -- you would gut out the middle of numexpr and keep the frontend and backend, but use this to optimize iteration order and blocking strategies.
>>
>> I see. ?Yes, I can easily see Mark's project X + numba more as a competitor (killer?) to numexpr too.
>>
>>>
>>> I think the goal is for higher performance than what I understand numexpr can provide (in some cases, not all!). For instance, can numexpr deal well with
>>>
>>> a + a.T
>>>
>>> where a is a c-contiguous array? Any numpy-like iteration order will not work well, one needs to use higher-dimensional (eg 2D) blocking, not 1D blocking.
>>
>> No. ?numexpr cannot deal with the above problem efficiently. ?numexpr is about 1d blocking, so its approach is pretty naive (but effective for these 1d blocking tasks).
>>
>>> (if numexpr can do this then great, the task might then reduce to refactoring numexpr so that cython and numba can use the same logic)
>>
>> Well, now that I think about it, the virtual machine in latest numexpr (2.0) is based on the new nditer iterator included in NumPy 1.6, so I'm wondering if with a little bit of more effort, the 2d blocking could be implemented. ?While I'm pretty sure this is the case, I don't know if it would be really worth the effort. ?Perhaps concentrating on things like numba + projectX, or just Theano would be better targets. ?Perhaps Mark Wiebe (who contributed the new nditer-aware VM) could say a bit more about this.
>
> My guess is that the answer is that nditer works well in many
> situations, but can be sub-optimal for some array shapes, particularly
> if the arrays fit in cache.
>
> Here's a contrived bad case (not sure if it is the worst): Consider
>
> arr[::2, :]
>
> where arr is C-contiguous with shape (n, 2), and let n be such that the
> array fits in L2 cache :-)
>
> The ::2 ensures that the array can't be flattened. Any nditer approach,
> as I understand it, would need to execute the inner subroutine for each
> row of 4 elements, and then invoke the iteration machinery, which I
> imagine is quite slow compared to LLVM code generated specifically for
> this array, which could even unroll the inner 4-element loop.
>
> If one really wants to compete with Fortran, one must take into account
> that the scientific end-user may already be structuring the computation
> in a cache-efficient way. It doesn't always help with good performance
> "as long as arrays are >10 MB".
>
> If the array is large, the memory bottleneck should makes a lot of the
> CPU overhead of nditer irrelevant even for worst-case arrays; though I'd
> be curious to see benchmarks of how much overhead is left, and my
> (rather unfounded) suspicion is that hard-wired compiled LLVM code for a
> specific array should play better with the CPUs cache predictor to
> better mask memory access latencies (?)

Definitely. An other issue is that nditer, although it could perform
blocking if implemented, cannot look at the larger structure of the
memory access pattern. Often the elementwise match-up of elements is
not at an outermost level, and you may want to do the copying to
contiguous memory both to avoid cache conflicts within your blocks,
but you may also want multiple blocks to match up (consider for
instance LU-factorization or matrix multiplication, although no one in
their right minds would write those by hand :). Unless you reuse the
same data enough in your expressions to warrant copying to gain
cache-reuse and possibly vectorization, you're probably not going to
perform copies.

> (Mark F., apropos benchmarks, let me know what kind of different
> hardware you have access to; I could see if I can give you access to a
> couple of boxes with different memory bus characteristics, i.e., Intel
> vs. AMD and so on.)
>
> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From markflorisson88 at gmail.com  Wed Mar 21 09:14:39 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Wed, 21 Mar 2012 14:14:39 +0100
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
	<2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
	<CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>
Message-ID: <CANg26EX+KotYUKwdHuQ1oBog0OrFzf=eCHauXGB236Bmv9P1iw@mail.gmail.com>

On 20 March 2012 20:49, Olivier Delalleau <shish at keba.be> wrote:
> I doubt Theano is already as smart as you'd want it to be right now, however
> the core mechanisms are there to perform graph optimizations and move
> computations to GPU. It may save time to start from there instead of
> starting all over from scratch. I'm not sure though, but it looks like it
> would be worth considering it at least.

Thanks for the suggestion Olivier, as Dag said we discusses it, and
indeed we (or I) should look a lot deeper into it and see what
components are reusable there and discuss with the Theano community if
and how we can collaborate.

> -=- Olivier
>
> Le 20 mars 2012 15:40, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> a
> ?crit :
>
>> We talked some about Theano. There are some differences in project goals
>> which means that it makes sense to make this a seperate project: Cython
>> wants to use this to generate C code up front from the Cython AST at
>> compilation time; numba also has a different frontend (parsing of python
>> bytecode) and a different backend (LLVM).
>>
>> However, it may very well be possible that Theano could be refactored so
>> that the more essential algorithms working on the syntax tree could be
>> pulled out and shared with cython and numba. Then the question is whether
>> the core of Theano is smart enough to compete with Fortran compilers and
>> support arbitraily strided inputs optimally. Otherwise one might as well
>> start from scratch. I'll leave that for Mark to figure out...
>>
>> Dag
>> --
>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>
>>
>> Olivier Delalleau <shish at keba.be> wrote:
>>>
>>> This sounds a lot like Theano, did you look into it?
>>>
>>> -=- Olivier
>>>
>>> Le 20 mars 2012 13:49, mark florisson <markflorisson88 at gmail.com> a ?crit
>>> :
>>>>
>>>> On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
>>>> >>>
>>>> >>> (Mark F., how does the above match how you feel about this?)
>>>> >>
>>>> >> I would like collaboration, but from a technical perspective I think
>>>> >> this would be much more involved than just dumping the AST to an IR
>>>> >> and generating some code from there. For vector expressions I think
>>>> >> sharing code would be more feasible than arbitrary (parallel) loops,
>>>> >> etc. Cython as a compiler can make many decisions that a Python
>>>> >> (bytecode) compiler can't make (at least without annotations and a
>>>> >> well-defined subset of the language (not so much the syntax as the
>>>> >> semantics)). I think in numba, if parallelism is to be supported, you
>>>> >> will want a prange-like construct, as proving independence between
>>>> >> iterations can be very hard to near impossible for a compiler.
>>>> >
>>>> > I completely agree that you have to define some kind of syntax to get
>>>> > parallelism. ?But, a prange construct would not be out of the question, of
>>>> > course.
>>>> >
>>>> >>
>>>> >> As for code generation, I'm not sure how llvm would do things like
>>>> >> slicing arrays, reshaping, resizing etc (for vector expressions you
>>>> >> can first evaluate all slicing and indexing operations and then
>>>> >> compile the remaining vector expression), but for loops and array
>>>> >> reassignment within loops this would have to invoke the actual
>>>> >> slicing
>>>> >> code from the llvm code (I presume).
>>>> >
>>>> > There could be some analysis on the byte-code, prior to emitting the
>>>> > llvm code in order to handle lots of things. ? Basically, you have to "play"
>>>> > the byte-code on a simple machine anyway in order to emit the correct code.
>>>> > ? The big thing about Cython is you have to typedef too many things that are
>>>> > really quite knowable from the code. ? If Cython could improve it's type
>>>> > inference, then it would be a more suitable target.
>>>> >
>>>> >> There are many other things, like
>>>> >> bounds checking, wraparound, etc, that are all supported in both
>>>> >> numpy
>>>> >> and Cython, but going through an llvm layer would as far as I can
>>>> >> see,
>>>> >> require re-implementing those, at least if you want top-notch
>>>> >> performance. Personally, I think for non-trivial performance-critical
>>>> >> code (for loops with indexing, slicing, function calls, etc) Cython
>>>> >> is
>>>> >> a better target.
>>>> >
>>>> > With libclang it is really quite possible to imagine a cython -> C
>>>> > target that itself compiles to llvm so that you can do everything at that
>>>> > intermediate layer. ? However, ?LLVM is a much better layer for optimization
>>>> > than C now that there are a lot of people collaborating on that layer. ? I
>>>> > think it would be great if Cython targeted LLVM actually instead of C.
>>>> >
>>>> >>
>>>> >> Finally, as for non-vector-expression code, I really believe Cython
>>>> >> is
>>>> >> a better target. cython.inline can have high overhead (at least the
>>>> >> first time it has to compile), but with better (numpy-aware) type
>>>> >> inference or profile guided optimizations (see recent threads on the
>>>> >> cython-dev mailing list), in addition to things like prange, I
>>>> >> personally believe Cython targets most of the use cases where numba
>>>> >> would be able to generate performing code.
>>>> >
>>>> > Cython and Numba certainly overlap. ?However, Cython requires:
>>>> >
>>>> > ? ? ? ?1) learning another language
>>>> > ? ? ? ?2) creating an extension module --- loading bit-code files and
>>>> > dynamically executing (even on a different machine from the one that
>>>> > initially created them) can be a powerful alternative for run-time
>>>> > compilation and distribution of code.
>>>> >
>>>> > These aren't show-stoppers obviously. ? But, I think some users would
>>>> > prefer an even simpler approach to getting fast-code than Cython (which
>>>> > currently doesn't do enought type-inference and requires building a dlopen
>>>> > extension module).
>>>>
>>>> Dag and I have been discussing this at PyCon, and here is my take on
>>>> it (at this moment :).
>>>>
>>>> Definitely, if you can avoid Cython then that is easier and more
>>>> desirable in many ways. So perhaps we can create a third project
>>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
>>>> abstract syntax tree in a rather simple form, performs code
>>>> optimizations such as rewriting loops with array accesses to vector
>>>> expressions, fusing vector expressions and loops, etc, and spits out a
>>>> transformed AST containing these optimizations. If runtime information
>>>> is given such as actual shape and stride information the
>>>> transformations could figure out there and then whether to do things
>>>> like collapsing, axes swapping, blocking (as in, introducing more axes
>>>> or loops to retain discontiguous blocks in the cache), blocked memory
>>>> copies to contiguous chunks, etc. The AST could then also say whether
>>>> the final expressions are vectorizable. Part of this functionality is
>>>> already in numpy's nditer, except that this would be implicit and do
>>>> more (and hopefully with minimal overhead).
>>>>
>>>> So numba, Cython and maybe numexpr could use the functionality, simply
>>>> by building the AST from Python and converting back (if necessary) to
>>>> its own AST. As such, the AST optimizer would be only part of any
>>>> (runtime) compiler's pipeline, and it should be very flexible to
>>>> retain any information (metadata regarding actual types, control flow
>>>> information, etc) provided by the original AST. It would not do
>>>> control flow analysis, type inference or promotion, etc, but only deal
>>>> with abstract types like integers, reals and arrays (C, Fortran or
>>>> partly contiguous or strided). It would not deal with objects, but
>>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
>>>> wrapping parts of the original AST. In short, it should be as easy as
>>>> possible to convert from an original AST to this project's AST and
>>>> back again afterwards.
>>>>
>>>> As the project matures many optimizations may be added that deal with
>>>> all sorts of loop restructuring and ways to efficiently utilize the
>>>> cache as well as enable vectorization and possibly parallelism.
>>>> Perhaps it could even generate a different AST depending on whether
>>>> execution target the CPU or the GPU (with optionally available
>>>> information such as cache sizes, GPU shared/local memory sizes, etc).
>>>>
>>>> Seeing that this would be a part of my master dissertation, my
>>>> supervisor would require me to write the code, so at least until
>>>> August I think I would have to write (at least the bulk of) this.
>>>> Otherwise I can also make other parts of my dissertation's project
>>>> more prominent to make up for it. Anyway, my question is, is there
>>>> interest from at least the numba and numexpr projects (if code can be
>>>> transformed into vector operations, it makes sense to use numexpr for
>>>> that, I'm not sure what numba's interest is in that).
>>>>
>>>> > -Travis
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >>
>>>> >>> Dag
>>>> >>> _______________________________________________
>>>> >>> NumPy-Discussion mailing list
>>>> >>> NumPy-Discussion at scipy.org
>>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> NumPy-Discussion mailing list
>>>> >>> NumPy-Discussion at scipy.org
>>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>> >>>
>>>> >> _______________________________________________
>>>> >> NumPy-Discussion mailing list
>>>> >> NumPy-Discussion at scipy.org
>>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>> >
>>>> > _______________________________________________
>>>> > NumPy-Discussion mailing list
>>>> > NumPy-Discussion at scipy.org
>>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From sourceforge.numpy at user.fastmail.fm  Wed Mar 21 11:09:55 2012
From: sourceforge.numpy at user.fastmail.fm (Hugo Gagnon)
Date: Wed, 21 Mar 2012 11:09:55 -0400
Subject: [Numpy-discussion] Viewing a float64 array with a float32 array
Message-ID: <1332342595.28097.140661052168005.614BB73F@webmail.messagingengine.com>

Hi,

Is it possible to have a view of a float64 array that is itself float32?
So that:

>>> A = np.arange(5, dtype='d')
>>> A.view(dtype='f')

would return a size 5 float32 array looking at A's data?

Thanks, 
--
  Hugo Gagnon


From zachary.pincus at yale.edu  Wed Mar 21 11:19:38 2012
From: zachary.pincus at yale.edu (Zachary Pincus)
Date: Wed, 21 Mar 2012 11:19:38 -0400
Subject: [Numpy-discussion] Viewing a float64 array with a float32 array
In-Reply-To: <1332342595.28097.140661052168005.614BB73F@webmail.messagingengine.com>
References: <1332342595.28097.140661052168005.614BB73F@webmail.messagingengine.com>
Message-ID: <64BB231B-FCE8-415A-A063-3F5F299DF249@yale.edu>

> Hi,
> 
> Is it possible to have a view of a float64 array that is itself float32?
> So that:
> 
>>>> A = np.arange(5, dtype='d')
>>>> A.view(dtype='f')
> 
> would return a size 5 float32 array looking at A's data?

I think not. The memory layout of a 32-bit IEEE float is not a subset of that of a 64-bit float -- see e.g. the first table in:
http://steve.hollasch.net/cgindex/coding/ieeefloat.html

This would work for int8/int16 or other int types (so long as the number doesn't exceed the range of the smaller type), but AFAIK not floats.

Note how the subset relationship works for the int8/int16 case, but not float32/float64:

str(numpy.array(100,dtype=numpy.int8).data)
'd'

str(numpy.array(100,dtype=numpy.int16).data)
'd\x00'

str(numpy.array(-100,dtype=numpy.int16).data)
'\x9c\xff'

str(numpy.array(-100,dtype=numpy.int8).data)
'\x9c'

str(numpy.array(100,dtype=numpy.float32).data)
'\x00\x00\xc8B'

str(numpy.array(100,dtype=numpy.float64).data)
'\x00\x00\x00\x00\x00\x00Y@'


Zach


From sourceforge.numpy at user.fastmail.fm  Wed Mar 21 11:48:00 2012
From: sourceforge.numpy at user.fastmail.fm (Hugo Gagnon)
Date: Wed, 21 Mar 2012 11:48:00 -0400
Subject: [Numpy-discussion] Viewing a float64 array with a float32 array
In-Reply-To: <64BB231B-FCE8-415A-A063-3F5F299DF249@yale.edu>
References: <1332342595.28097.140661052168005.614BB73F@webmail.messagingengine.com>
	<64BB231B-FCE8-415A-A063-3F5F299DF249@yale.edu>
Message-ID: <1332344880.5195.140661052191801.24272A44@webmail.messagingengine.com>

I'm not sure if you are referring to rounding errors but that's OK with
me.

I was thinking something along the lines of changing how numpy looks at
the data of A's view by modifying say the stride attribute, etc.
--
  Hugo Gagnon


On Wed, Mar 21, 2012, at 11:19, Zachary Pincus wrote:
> > Hi,
> > 
> > Is it possible to have a view of a float64 array that is itself float32?
> > So that:
> > 
> >>>> A = np.arange(5, dtype='d')
> >>>> A.view(dtype='f')
> > 
> > would return a size 5 float32 array looking at A's data?
> 
> I think not. The memory layout of a 32-bit IEEE float is not a subset of
> that of a 64-bit float -- see e.g. the first table in:
> http://steve.hollasch.net/cgindex/coding/ieeefloat.html
> 
> This would work for int8/int16 or other int types (so long as the number
> doesn't exceed the range of the smaller type), but AFAIK not floats.
> 
> Note how the subset relationship works for the int8/int16 case, but not
> float32/float64:
> 
> str(numpy.array(100,dtype=numpy.int8).data)
> 'd'
> 
> str(numpy.array(100,dtype=numpy.int16).data)
> 'd\x00'
> 
> str(numpy.array(-100,dtype=numpy.int16).data)
> '\x9c\xff'
> 
> str(numpy.array(-100,dtype=numpy.int8).data)
> '\x9c'
> 
> str(numpy.array(100,dtype=numpy.float32).data)
> '\x00\x00\xc8B'
> 
> str(numpy.array(100,dtype=numpy.float64).data)
> '\x00\x00\x00\x00\x00\x00Y@'
> 
> 
> Zach
> 
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From zachary.pincus at yale.edu  Wed Mar 21 11:58:06 2012
From: zachary.pincus at yale.edu (Zachary Pincus)
Date: Wed, 21 Mar 2012 11:58:06 -0400
Subject: [Numpy-discussion] Viewing a float64 array with a float32 array
In-Reply-To: <1332344880.5195.140661052191801.24272A44@webmail.messagingengine.com>
References: <1332342595.28097.140661052168005.614BB73F@webmail.messagingengine.com>
	<64BB231B-FCE8-415A-A063-3F5F299DF249@yale.edu>
	<1332344880.5195.140661052191801.24272A44@webmail.messagingengine.com>
Message-ID: <846A2C6F-8E8D-4AC2-B40A-74C57A315C95@yale.edu>

> I'm not sure if you are referring to rounding errors but that's OK with
> me.
> 
> I was thinking something along the lines of changing how numpy looks at
> the data of A's view by modifying say the stride attribute, etc.

Yes, so was I. As you can see in my example with ints below, you could skip every other byte of the int16 array to "look" at an int8 array. This is because the memory layout of an int8 is a proper subset of int16. (Modulo endian-concerns of course...)

But looking at the link I provided, you can see that taking the first 32 bits of an float64 (or the last 32 or any 32) does not yield something that can be interpreted as a float32. So there's no subset relationship, and you can't do the strides-trick.

To be extra clear, look at the memory layout of a float that's expressible without rounding error:
str(numpy.array(128,dtype=numpy.float64).data)
'\x00\x00\x00\x00\x00\x00`@'

str(numpy.array(128,dtype=numpy.float32).data)
'\x00\x00\x00C'

There's obviously no stride trick whereby one will "look" like the other. 

Zach


> 
> On Wed, Mar 21, 2012, at 11:19, Zachary Pincus wrote:
>>> Hi,
>>> 
>>> Is it possible to have a view of a float64 array that is itself float32?
>>> So that:
>>> 
>>>>>> A = np.arange(5, dtype='d')
>>>>>> A.view(dtype='f')
>>> 
>>> would return a size 5 float32 array looking at A's data?
>> 
>> I think not. The memory layout of a 32-bit IEEE float is not a subset of
>> that of a 64-bit float -- see e.g. the first table in:
>> http://steve.hollasch.net/cgindex/coding/ieeefloat.html
>> 
>> This would work for int8/int16 or other int types (so long as the number
>> doesn't exceed the range of the smaller type), but AFAIK not floats.
>> 
>> Note how the subset relationship works for the int8/int16 case, but not
>> float32/float64:
>> 
>> str(numpy.array(100,dtype=numpy.int8).data)
>> 'd'
>> 
>> str(numpy.array(100,dtype=numpy.int16).data)
>> 'd\x00'
>> 
>> str(numpy.array(-100,dtype=numpy.int16).data)
>> '\x9c\xff'
>> 
>> str(numpy.array(-100,dtype=numpy.int8).data)
>> '\x9c'
>> 
>> str(numpy.array(100,dtype=numpy.float32).data)
>> '\x00\x00\xc8B'
>> 
>> str(numpy.array(100,dtype=numpy.float64).data)
>> '\x00\x00\x00\x00\x00\x00Y@'
>> 
>> 
>> Zach
>> 
>> 
>> 
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From tang.yan at gmail.com  Thu Mar 22 00:48:26 2012
From: tang.yan at gmail.com (Yan Tang)
Date: Thu, 22 Mar 2012 00:48:26 -0400
Subject: [Numpy-discussion] About np array and recarray
Message-ID: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>

Hi,

I am really confused on the np array or record array, and cannot understand
how it works.

What I want to do is that I have a normal python two dimensional array/list:

a = [['2000-01-01', 2],['2000-01-02', 3]]

I want to convert it to a recarray with this dtype [('date', 'object'),
('count', 'int')].  I tried multiple ways and none of them works.  And some
of the tests show pretty odd behavior.

This is good, and it is almost what i want:

>>> import numpy as np
>>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
>>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
array([('2000-01-01', 2), ('2000-01-02', 3)],
      dtype=[('date', '|O8'), ('count', '<i8')])

Why this doesn't work?!

>>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: tried to set void-array with object members using buffer.

Why can this cause segmentation fault?!

>>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
('count', 'int')])
Segmentation fault (And python quit!)

Python version 2.6.5

On this reference page,
http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

>>> x = np.array([(1,2),(3,4)])
>>> x
array([[1, 2],
       [3, 4]])
>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])

Can anyone help me about this?

Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/7169fd91/attachment.html>

From travis at continuum.io  Thu Mar 22 01:16:12 2012
From: travis at continuum.io (Travis Oliphant)
Date: Thu, 22 Mar 2012 00:16:12 -0500
Subject: [Numpy-discussion] About np array and recarray
In-Reply-To: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
References: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
Message-ID: <4A3ED445-4241-4E9B-9C00-8E5FC05BD412@continuum.io>


On Mar 21, 2012, at 11:48 PM, Yan Tang wrote:

> Hi,
> 
> I am really confused on the np array or record array, and cannot understand how it works.
> 
> What I want to do is that I have a normal python two dimensional array/list:
> 
> a = [['2000-01-01', 2],['2000-01-02', 3]]
> 
> I want to convert it to a recarray with this dtype [('date', 'object'), ('count', 'int')].  I tried multiple ways and none of them works.  And some of the tests show pretty odd behavior.
> 
> This is good, and it is almost what i want:
> 
> >>> import numpy as np
> >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> array([('2000-01-01', 2), ('2000-01-02', 3)], 
>       dtype=[('date', '|O8'), ('count', '<i8')])

This is the correct way to initiate the record array, or structured array, from a Python object.   

> 
> Why this doesn't work?!
> 
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: tried to set void-array with object members using buffer.
> 

The error here could be more instructive, but the problems is that to simplify the np.array factory function (which is already somewhat complex) it was decided to force records to be input as "tuples" and not as lists.   You *must* use tuples to specify records for a structured array.   

> Why can this cause segmentation fault?!
> 
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'), ('count', 'int')])
> Segmentation fault (And python quit!)

The np.ndarray constructor should not be used directly unless you know what you are doing.

The np.array factory function is the standard way to create arrays.   The problem here is that you are explicitly asking NumPy to point to a particular region of memory to use as it's data-buffer.   This memory is the data buffer of an array of "strings".   The np.array factory function will try and auto-detect the data-type of the array if you do not specify it --- which in this case results in an array of strings.    Then, with the dtype specification you are asking it to interpret a portion of that array of strings as a pointer to a Python object.   This will cause a segmentation fault when the printing code tries to dereference a pointer which is actually 4 characters of a string. 

This should probably be checked for in the ndarray constructor.   I don't think it ever really makes sense to use an "object" dtype when you also supply the buffer unless that buffer actually held Python object pointers in the first place.   Even in this case you could do what you wanted without calling the constructor.  So, likely a check should be made so that you can't have an object array and also supply a buffer. 

> 
> Python version 2.6.5
> 
> On this reference page, http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html
> 
> >>> x = np.array([(1,2),(3,4)])
> >>> x
> array([[1, 2],
>        [3, 4]])
> >>> np.array([[1, 2], [3, 4]])
> array([[1, 2],
>        [3, 4]])
> 
> Can anyone help me about this?

I'm not sure what you are asking for here?   Yes, for arrays with non-structured dtypes, numpy will treat tuples as lists. 

Best regards,

-Travis


> 
> Thanks.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/03dcefa7/attachment.html>

From kalatsky at gmail.com  Thu Mar 22 01:32:15 2012
From: kalatsky at gmail.com (Val Kalatsky)
Date: Thu, 22 Mar 2012 00:32:15 -0500
Subject: [Numpy-discussion] About np array and recarray
In-Reply-To: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
References: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
Message-ID: <CAE8bXEkmgBySx1dcJK5KNJNpgKvvaXK4dHGFq9rrQF49C_zk6A@mail.gmail.com>

Will this do what you need to accomplish?

import datetime
np.array([(datetime.datetime.strptime(i[0], "%Y-%m-%d").date(), i[1]) for i
in a], dtype=[('date', 'object'), ('count', 'int')])

Val

On Wed, Mar 21, 2012 at 11:48 PM, Yan Tang <tang.yan at gmail.com> wrote:

> Hi,
>
> I am really confused on the np array or record array, and cannot
> understand how it works.
>
> What I want to do is that I have a normal python two dimensional
> array/list:
>
> a = [['2000-01-01', 2],['2000-01-02', 3]]
>
> I want to convert it to a recarray with this dtype [('date', 'object'),
> ('count', 'int')].  I tried multiple ways and none of them works.  And some
> of the tests show pretty odd behavior.
>
> This is good, and it is almost what i want:
>
> >>> import numpy as np
> >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> array([('2000-01-01', 2), ('2000-01-02', 3)],
>       dtype=[('date', '|O8'), ('count', '<i8')])
>
> Why this doesn't work?!
>
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: tried to set void-array with object members using buffer.
>
> Why can this cause segmentation fault?!
>
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
> ('count', 'int')])
> Segmentation fault (And python quit!)
>
> Python version 2.6.5
>
> On this reference page,
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html
>
> >>> x = np.array([(1,2),(3,4)])
> >>> x
> array([[1, 2],
>        [3, 4]])
> >>> np.array([[1, 2], [3, 4]])
> array([[1, 2],
>        [3, 4]])
>
> Can anyone help me about this?
>
> Thanks.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/5c9515b0/attachment.html>

From tang.yan at gmail.com  Thu Mar 22 12:42:13 2012
From: tang.yan at gmail.com (Yan Tang)
Date: Thu, 22 Mar 2012 12:42:13 -0400
Subject: [Numpy-discussion] About np array and recarray
In-Reply-To: <4A3ED445-4241-4E9B-9C00-8E5FC05BD412@continuum.io>
References: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
	<4A3ED445-4241-4E9B-9C00-8E5FC05BD412@continuum.io>
Message-ID: <CAAV4V2C8f4vCt0Uc8Yhmz=e_D=ErRFXWU=YEjmjMmGypzvbs4w@mail.gmail.com>

Thank you very much.  Very detailed explanation.

On Thu, Mar 22, 2012 at 1:16 AM, Travis Oliphant <travis at continuum.io>wrote:

>
> On Mar 21, 2012, at 11:48 PM, Yan Tang wrote:
>
> Hi,
>
> I am really confused on the np array or record array, and cannot
> understand how it works.
>
> What I want to do is that I have a normal python two dimensional
> array/list:
>
> a = [['2000-01-01', 2],['2000-01-02', 3]]
>
> I want to convert it to a recarray with this dtype [('date', 'object'),
> ('count', 'int')].  I tried multiple ways and none of them works.  And some
> of the tests show pretty odd behavior.
>
> This is good, and it is almost what i want:
>
> >>> import numpy as np
> >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> array([('2000-01-01', 2), ('2000-01-02', 3)],
>       dtype=[('date', '|O8'), ('count', '<i8')])
>
>
> This is the correct way to initiate the record array, or structured array,
> from a Python object.
>
>
> Why this doesn't work?!
>
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: tried to set void-array with object members using buffer.
>
>
> The error here could be more instructive, but the problems is that to
> simplify the np.array factory function (which is already somewhat complex)
> it was decided to force records to be input as "tuples" and not as lists.
> You *must* use tuples to specify records for a structured array.
>
> Why can this cause segmentation fault?!
>
> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
> >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
> ('count', 'int')])
> Segmentation fault (And python quit!)
>
>
> The np.ndarray constructor should not be used directly unless you know
> what you are doing.
>
> The np.array factory function is the standard way to create arrays.   The
> problem here is that you are explicitly asking NumPy to point to a
> particular region of memory to use as it's data-buffer.   This memory is
> the data buffer of an array of "strings".   The np.array factory function
> will try and auto-detect the data-type of the array if you do not specify
> it --- which in this case results in an array of strings.    Then, with the
> dtype specification you are asking it to interpret a portion of that array
> of strings as a pointer to a Python object.   This will cause a
> segmentation fault when the printing code tries to dereference a pointer
> which is actually 4 characters of a string.
>
> This should probably be checked for in the ndarray constructor.   I don't
> think it ever really makes sense to use an "object" dtype when you also
> supply the buffer unless that buffer actually held Python object pointers
> in the first place.   Even in this case you could do what you wanted
> without calling the constructor.  So, likely a check should be made so that
> you can't have an object array and also supply a buffer.
>
>
> Python version 2.6.5
>
> On this reference page,
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html
>
> >>> x = np.array([(1,2),(3,4)])
> >>> x
> array([[1, 2],
>        [3, 4]])
> >>> np.array([[1, 2], [3, 4]])
> array([[1, 2],
>        [3, 4]])
>
> Can anyone help me about this?
>
>
> I'm not sure what you are asking for here?   Yes, for arrays with
> non-structured dtypes, numpy will treat tuples as lists.
>

The thing I am asking for is, it looks like from my example, [[1,2],[3,4]],
and [(1,2),(3,4)], after constructing the np.array, the result looks the
same.  Then go back to my first question, why it looks like only the tuple
works, not the list one.

As you explained, it looks like we have to use tuple instead of list.
 That's OK.  But I didn't find it any place in the document, ;).


>
> Best regards,
>
> -Travis
>
>
>
> Thanks.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/5f57b800/attachment.html>

From tang.yan at gmail.com  Thu Mar 22 12:42:58 2012
From: tang.yan at gmail.com (Yan Tang)
Date: Thu, 22 Mar 2012 12:42:58 -0400
Subject: [Numpy-discussion] About np array and recarray
In-Reply-To: <CAE8bXEkmgBySx1dcJK5KNJNpgKvvaXK4dHGFq9rrQF49C_zk6A@mail.gmail.com>
References: <CAAV4V2A6_agL77DQ3xLLX-w5pBTihDuZQowHXN2ErWqUtHGMdg@mail.gmail.com>
	<CAE8bXEkmgBySx1dcJK5KNJNpgKvvaXK4dHGFq9rrQF49C_zk6A@mail.gmail.com>
Message-ID: <CAAV4V2DsdBpD4_Mz0W2c1n1xLHH9bO7WKwBCERH8kFgjPQ8_+A@mail.gmail.com>

Yes, this is finally how I work around it.  I just want to save the
conversion from list to tuple.

On Thu, Mar 22, 2012 at 1:32 AM, Val Kalatsky <kalatsky at gmail.com> wrote:

>
> Will this do what you need to accomplish?
>
> import datetime
> np.array([(datetime.datetime.strptime(i[0], "%Y-%m-%d").date(), i[1]) for
> i in a], dtype=[('date', 'object'), ('count', 'int')])
>
> Val
>
> On Wed, Mar 21, 2012 at 11:48 PM, Yan Tang <tang.yan at gmail.com> wrote:
>
>> Hi,
>>
>> I am really confused on the np array or record array, and cannot
>> understand how it works.
>>
>> What I want to do is that I have a normal python two dimensional
>> array/list:
>>
>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>>
>> I want to convert it to a recarray with this dtype [('date', 'object'),
>> ('count', 'int')].  I tried multiple ways and none of them works.  And some
>> of the tests show pretty odd behavior.
>>
>> This is good, and it is almost what i want:
>>
>> >>> import numpy as np
>> >>> a = [('2000-01-01', 2), ('2000-01-02', 3)]
>> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
>> array([('2000-01-01', 2), ('2000-01-02', 3)],
>>       dtype=[('date', '|O8'), ('count', '<i8')])
>>
>> Why this doesn't work?!
>>
>> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>> >>> np.array(a, dtype=[('date', 'object'), ('count', 'int')])
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> ValueError: tried to set void-array with object members using buffer.
>>
>> Why can this cause segmentation fault?!
>>
>> >>> a = [['2000-01-01', 2],['2000-01-02', 3]]
>> >>> np.ndarray((len(a),), buffer=np.array(a), dtype=[('date', 'object'),
>> ('count', 'int')])
>> Segmentation fault (And python quit!)
>>
>> Python version 2.6.5
>>
>> On this reference page,
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html
>>
>> >>> x = np.array([(1,2),(3,4)])
>> >>> x
>> array([[1, 2],
>>        [3, 4]])
>> >>> np.array([[1, 2], [3, 4]])
>> array([[1, 2],
>>        [3, 4]])
>>
>> Can anyone help me about this?
>>
>> Thanks.
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/fce263ea/attachment.html>

From lists at hilboll.de  Thu Mar 22 12:49:44 2012
From: lists at hilboll.de (Andreas H.)
Date: Thu, 22 Mar 2012 17:49:44 +0100
Subject: [Numpy-discussion] Trouble building NumPy on PPC64
In-Reply-To: <CABL7CQj2SBsehehSyC18HzFa62NsbPgxF7pCHZ8ws_KdHzn33A@mail.gmail.com>
References: <0c1b056b6bd43c833d13fdf4495289cb.squirrel@srv2.s4y.tournesol-consulting.eu>
	<CABL7CQj2SBsehehSyC18HzFa62NsbPgxF7pCHZ8ws_KdHzn33A@mail.gmail.com>
Message-ID: <34459c48a741e17bbd5049953d2c6210.squirrel@srv2.s4y.tournesol-consulting.eu>

> On Mon, Mar 19, 2012 at 6:45 PM, Andreas H. <lists at hilboll.de> wrote:
>
>> Hi all,
>>
>> I have troube installing numpy in a virtual environment on a SuSE
>> Enterprise 11 server (ppc64).
>>
>> Here is what I did:
>>
>>    curl -O https://raw.github.com/pypa/virtualenv/master/virtualenv.py
>>    python virtualenv.py --distribute --no-site-packages
>> .virtualenvs/pydoas
>>    source .virtualenvs/pydoas/bin/activate
>>    pip install numpy
>>
>> And here is the outcome:
>>
>>    SystemError: Cannot compile 'Python.h'. Perhaps you need to install
>> python-dev|python-devel.
>>
>> However, Python.h exists, because I did install the python-devel
>> package:
>>
>>    (pydoas)hilboll at odin:~/.virtualenvs/pydoas/build/numpy> find
>> /usr/include/ | grep Python.h
>>    /usr/include/python2.6/Python.h
>>
>> I also tried without the --distribute --no-site-packages flags, with the
>> same result.
>>
>> Any hints are very welcome :)
>>
>
> Hard to say what's going wrong. Perhaps pip is not for python2.6 but
> another python version? Can you install without pip, so normal "python
> setup.py install"? Can you post the full build log?

Thanks for your ideas, Ralf! Actually, it turned out to be a problem with
the gcc installation on the machine I was building on. The sysadmin fixed
that, and now NumPy is running fine :)

Cheers,
A.


From gael.varoquaux at normalesup.org  Thu Mar 22 13:07:36 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 22 Mar 2012 18:07:36 +0100
Subject: [Numpy-discussion] Crashes and test failures
Message-ID: <20120322170736.GA29845@phare.normalesup.org>


Hi list,

I just got a new laptop, running Ubuntu 11.10 64bit on an Intel i7.
I am a bit intriged by the test results of numpy on this box.

First of all, master builds and imports OK, but the simplest test case
crashes with a segfault:

    import numpy as np
    a = np.ones(10, dtype=np.bool)
    np.all(a)

Am I the only one to see this? If so, I am wiling to investigate, but
I'll need a bit of help, as I had a look with gcc, and the crash
seems burried deep in the guts of numpy.

Second, I backed out a bit, and checkout out the 1.6.1 tag. It builds
alright and doesn't segfault, but I am getting a few test failures
(pasted at the end of the mail). I haven't been following the mailing
list well-enough lately, so I am unsure: are these known failure. I was
surprised to find a tag that looked like a release with failures.

Sorry if I am asking stupid questions, finding master unusable at first
glance for me was a bit offseting.

Gael


======================================================================
ERROR: Failure: ImportError (cannot import name fib2)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/f2py_ext/tests/test_fib2.py",
line 3, in <module>
    from f2py_ext import fib2
ImportError: cannot import name fib2

======================================================================
ERROR: Failure: ImportError (cannot import name foo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/f2py_f90_ext/tests/test_foo.py",
line 3, in <module>
    from f2py_f90_ext import foo
ImportError: cannot import name foo

======================================================================
ERROR: Failure: ImportError (cannot import name fib3)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/gen_ext/tests/test_fib3.py",
line 3, in <module>
    from gen_ext import fib3
ImportError: cannot import name fib3

======================================================================
ERROR: Failure: ImportError (No module named primes)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/pyrex_ext/tests/test_primes.py",
line 3, in <module>
    from pyrex_ext.primes import primes
ImportError: No module named primes

======================================================================
ERROR: Failure: ImportError (cannot import name example)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/swig_ext/tests/test_example.py",
line 3, in <module>
    from swig_ext import example
ImportError: cannot import name example

======================================================================
ERROR: Failure: ImportError (cannot import name example2)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File
"/home/varoquau/dev/numpy/numpy/distutils/tests/swig_ext/tests/test_example2.py",
line 3, in <module>
    from swig_ext import example2
ImportError: cannot import name example2

======================================================================
FAIL: test_kind.TestKind.test_all
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
    self.test(*self.arg)
  File "/home/varoquau/dev/numpy/numpy/f2py/tests/test_kind.py", line 30,
in test_all
    'selectedrealkind(%s): expected %r but got %r' %  (i,
selected_real_kind(i), selectedrealkind(i)))
  File "/home/varoquau/dev/numpy/numpy/testing/utils.py", line 34, in
assert_
    raise AssertionError(msg)
AssertionError: selectedrealkind(19): expected -1 but got 16

----------------------------------------------------------------------


From charlesr.harris at gmail.com  Thu Mar 22 13:20:10 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 22 Mar 2012 11:20:10 -0600
Subject: [Numpy-discussion] Crashes and test failures
In-Reply-To: <20120322170736.GA29845@phare.normalesup.org>
References: <20120322170736.GA29845@phare.normalesup.org>
Message-ID: <CAB6mnxJvfChix3CUjTam4y9ZiLoQsiwC8pyGxYkRu3v-TRT7Ag@mail.gmail.com>

On Thu, Mar 22, 2012 at 11:07 AM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

>
> Hi list,
>
> I just got a new laptop, running Ubuntu 11.10 64bit on an Intel i7.
> I am a bit intriged by the test results of numpy on this box.
>
> First of all, master builds and imports OK, but the simplest test case
> crashes with a segfault:
>
>    import numpy as np
>    a = np.ones(10, dtype=np.bool)
>    np.all(a)
>
> Am I the only one to see this? If so, I am wiling to investigate, but
> I'll need a bit of help, as I had a look with gcc, and the crash
> seems burried deep in the guts of numpy.
>
> Second, I backed out a bit, and checkout out the 1.6.1 tag. It builds
> alright and doesn't segfault, but I am getting a few test failures
> (pasted at the end of the mail). I haven't been following the mailing
> list well-enough lately, so I am unsure: are these known failure. I was
> surprised to find a tag that looked like a release with failures.
>
> Sorry if I am asking stupid questions, finding master unusable at first
> glance for me was a bit offseting.
>
> Gael
>
>
>
> ======================================================================
> ERROR: Failure: ImportError (cannot import name fib2)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/f2py_ext/tests/test_fib2.py",
> line 3, in <module>
>    from f2py_ext import fib2
> ImportError: cannot import name fib2
>
> ======================================================================
> ERROR: Failure: ImportError (cannot import name foo)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/f2py_f90_ext/tests/test_foo.py",
> line 3, in <module>
>    from f2py_f90_ext import foo
> ImportError: cannot import name foo
>
> ======================================================================
> ERROR: Failure: ImportError (cannot import name fib3)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/gen_ext/tests/test_fib3.py",
> line 3, in <module>
>    from gen_ext import fib3
> ImportError: cannot import name fib3
>
> ======================================================================
> ERROR: Failure: ImportError (No module named primes)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/pyrex_ext/tests/test_primes.py",
> line 3, in <module>
>    from pyrex_ext.primes import primes
> ImportError: No module named primes
>
> ======================================================================
> ERROR: Failure: ImportError (cannot import name example)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/swig_ext/tests/test_example.py",
> line 3, in <module>
>    from swig_ext import example
> ImportError: cannot import name example
>
> ======================================================================
> ERROR: Failure: ImportError (cannot import name example2)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/loader.py", line 390, in
> loadTestsFromName
>    addr.filename, addr.module)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 39, in
> importFromPath
>    return self.importFromDir(dir_path, fqname)
>  File "/usr/lib/pymodules/python2.7/nose/importer.py", line 86, in
> importFromDir
>    mod = load_module(part_fqname, fh, filename, desc)
>  File
>
> "/home/varoquau/dev/numpy/numpy/distutils/tests/swig_ext/tests/test_example2.py",
> line 3, in <module>
>    from swig_ext import example2
> ImportError: cannot import name example2
>
> ======================================================================
> FAIL: test_kind.TestKind.test_all
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "/usr/lib/pymodules/python2.7/nose/case.py", line 187, in runTest
>    self.test(*self.arg)
>  File "/home/varoquau/dev/numpy/numpy/f2py/tests/test_kind.py", line 30,
> in test_all
>    'selectedrealkind(%s): expected %r but got %r' %  (i,
> selected_real_kind(i), selectedrealkind(i)))
>  File "/home/varoquau/dev/numpy/numpy/testing/utils.py", line 34, in
> assert_
>    raise AssertionError(msg)
> AssertionError: selectedrealkind(19): expected -1 but got 16
>
> ----------------------------------------------------------------------
>
>
This one was fixed a few days ago, so you aren't running the latest. That
might be connected to all the import errors you are seeing.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/e763a23e/attachment.html>

From gael.varoquaux at normalesup.org  Thu Mar 22 13:24:04 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 22 Mar 2012 18:24:04 +0100
Subject: [Numpy-discussion] Crashes and test failures
In-Reply-To: <CAB6mnxJvfChix3CUjTam4y9ZiLoQsiwC8pyGxYkRu3v-TRT7Ag@mail.gmail.com>
References: <20120322170736.GA29845@phare.normalesup.org>
	<CAB6mnxJvfChix3CUjTam4y9ZiLoQsiwC8pyGxYkRu3v-TRT7Ag@mail.gmail.com>
Message-ID: <20120322172404.GI21185@phare.normalesup.org>

On Thu, Mar 22, 2012 at 11:20:10AM -0600, Charles R Harris wrote:
> This one was fixed a few days ago, so you aren't running the latest.

Indeed, but if I run the latest, any operation on arrays segfaults on me,
so I am running 0.6.1. But you are saying that it's known and fixed.
Great!

> That might be connected to all the import errors you are seeing.

In a sens that good news.

Gael


From fperez.net at gmail.com  Thu Mar 22 17:11:14 2012
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 22 Mar 2012 14:11:14 -0700
Subject: [Numpy-discussion] [ANN] PyData workshop videos are up online,
	including panel with Guido
Message-ID: <CAHAreOpXhKKR4KW13b3hw2M=zqABWgg0+muEnBdVnzXJNeEG2Q@mail.gmail.com>

Hi all,

just to let you know that the videos from the PyData workshop we held
at Google a couple of weeks ago are now online (not all talks are up
yet, so watch the page over the next few days if a talk you wanted to
see isn't posted yet):

http://marakana.com/s/2012_pydata_workshop,1090/index.html

The panel discussion with Guido that we talked about on these lists is
in there; I hope to write up a short summary about it soon.

Many thanks to Simeon Franklin and the rest of the Marakana team for
doing all this work (for free)!

Cheers,

f


From charlesr.harris at gmail.com  Thu Mar 22 20:24:13 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 22 Mar 2012 18:24:13 -0600
Subject: [Numpy-discussion] Crashes and test failures
In-Reply-To: <20120322172404.GI21185@phare.normalesup.org>
References: <20120322170736.GA29845@phare.normalesup.org>
	<CAB6mnxJvfChix3CUjTam4y9ZiLoQsiwC8pyGxYkRu3v-TRT7Ag@mail.gmail.com>
	<20120322172404.GI21185@phare.normalesup.org>
Message-ID: <CAB6mnxJbYZXnDXBnPNaD111_tVn9hWsyjCqziiDa7vsB6WGaxQ@mail.gmail.com>

On Thu, Mar 22, 2012 at 11:24 AM, Gael Varoquaux <
gael.varoquaux at normalesup.org> wrote:

> On Thu, Mar 22, 2012 at 11:20:10AM -0600, Charles R Harris wrote:
> > This one was fixed a few days ago, so you aren't running the latest.
>
> Indeed, but if I run the latest, any operation on arrays segfaults on me,
> so I am running 0.6.1. But you are saying that it's known and fixed.
> Great!
>
> > That might be connected to all the import errors you are seeing.
>
> In a sens that good news.
>
>
Since you are running on Ubuntu, it's probably a good idea to

python setup.py install --user

which will put everything in .local and avoid the mess with dist-packages
and site-packages. Maybe you do that already?

I didn't have any problems with Ubuntu 11.04 64 bit apart for unity, which
drove me back to Fedora. So if the problems presist I'll guess
compiler/i7/atlas problems.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120322/1b943160/attachment.html>

From gael.varoquaux at normalesup.org  Fri Mar 23 01:58:28 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 23 Mar 2012 06:58:28 +0100
Subject: [Numpy-discussion] Crashes and test failures
In-Reply-To: <CAB6mnxJbYZXnDXBnPNaD111_tVn9hWsyjCqziiDa7vsB6WGaxQ@mail.gmail.com>
References: <20120322170736.GA29845@phare.normalesup.org>
	<CAB6mnxJvfChix3CUjTam4y9ZiLoQsiwC8pyGxYkRu3v-TRT7Ag@mail.gmail.com>
	<20120322172404.GI21185@phare.normalesup.org>
	<CAB6mnxJbYZXnDXBnPNaD111_tVn9hWsyjCqziiDa7vsB6WGaxQ@mail.gmail.com>
Message-ID: <20120323055828.GB14072@phare.normalesup.org>

On Thu, Mar 22, 2012 at 06:24:13PM -0600, Charles R Harris wrote:
>    Since you are running on Ubuntu, it's probably a good idea to

>    python setup.py install --user

>    which will put everything in .local and avoid the mess with dist-packages
>    and site-packages. Maybe you do that already?

I actually did 

    python setupegg.py develop --user

I would think that it is close enough.

>    I didn't have any problems with Ubuntu 11.04 64 bit apart for unity, which
>    drove me back to Fedora. So if the problems presist I'll guess
>    compiler/i7/atlas problems.

OK, so if it is confirmed that the crash is specific to my computer, I'll
bisect to find out where it appeared.

Cheers,

Ga?l


From tim at cerazone.net  Fri Mar 23 11:31:17 2012
From: tim at cerazone.net (Tim Cera)
Date: Fri, 23 Mar 2012 11:31:17 -0400
Subject: [Numpy-discussion] Pull request for pad functions
Message-ID: <CAO5s+D9gJ2=x42qZbNkxA8z0a_LDE_PZP+xGYdW3ArM0FWfZvQ@mail.gmail.com>

Hello,

I have a pull request to add a n-dimensional array padding feature at
https://github.com/numpy/numpy/pull/198

This message is only to prod for a final review.

Much thanks to Warren Weckesser and Travis Oliphant for their help in
finding some bugs and working on style and API issues. I think it looks
pretty good right now.

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120323/c558cc37/attachment.html>

From david.froger at gmail.com  Sat Mar 24 09:28:29 2012
From: david.froger at gmail.com (David Froger)
Date: Sat, 24 Mar 2012 14:28:29 +0100
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
Message-ID: <1332595543-sup-5122@david-desktop>

Excerpts from Sameer Grover's message of ven. mars 09 20:50:06 +0100 2012:
> >>>import gtk
> >>>import foo # where foo is any f2py-wrapped program
> 
> Subsequently, on exiting python interpreter, the interpreter crashes
> with this error message - "This application has requested the Runtime
> to terminate it in an unusual way. Please contact the application's
> support team for more information."
> 
> Strangely enough, interchanging the order of the import statements,
> i.e. importing the f2py wrapped program before gtk works fine.
> Furthermore, each module works fine individually.
> 
> This is a windows-only problem. I'm using Windows 7, Python 2.7,
> latest numpy, mingw32 compiler and the "pygtk all-in-one installer"
> (mentioned on the pygtk download page).
> 
> This happens even for very simple fortran programs such as this one -
> subroutine hello ()
>     write(*,*)'Hello from Fortran90!!!'
> end subroutine hello
> 
> I don't know whether the problem is with f2py or with gtk or with
> python but maybe somebody can shed some light on this.
> 
> Regards,
> Sameer Grover

Hi,

I've had similar problem in the past (but on Gnu/Linux), which was solve
to by setting the LC_ALL environnement variable to C. (never undersant why).
(with Bash: export LC_ALL=C)


From takowl at gmail.com  Sat Mar 24 10:09:45 2012
From: takowl at gmail.com (Thomas Kluyver)
Date: Sat, 24 Mar 2012 14:09:45 +0000
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <1332595543-sup-5122@david-desktop>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
	<1332595543-sup-5122@david-desktop>
Message-ID: <CAOvn4qgZqY_KwCLhsNzd_GC9YUZEYU3iPb4u3MnpgfCn4J=GsQ@mail.gmail.com>

On 24 March 2012 13:28, David Froger <david.froger at gmail.com> wrote:
> I've had similar problem in the past (but on Gnu/Linux), which was solve
> to by setting the LC_ALL environnement variable to C. (never undersant why).
> (with Bash: export LC_ALL=C)

This rings a bell. I have a feeling importing pygtk can change the
Python default encoding (sys.getdefaultencoding()). That's never
normally changed, so Python 2 code can assume it's always ascii. I
guess pygtk sets the encoding from the system locale, so if it's set
to C, it will use ascii, and the problem won't appear.

Thomas


From charlesr.harris at gmail.com  Sat Mar 24 11:34:12 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 24 Mar 2012 09:34:12 -0600
Subject: [Numpy-discussion] f2py and pygtk on windows
In-Reply-To: <CAOvn4qgZqY_KwCLhsNzd_GC9YUZEYU3iPb4u3MnpgfCn4J=GsQ@mail.gmail.com>
References: <CAOxZyWMc7wwjpTT1BXEzaX-ihLcTK7MNYSAGOcdYdwdhCOVygg@mail.gmail.com>
	<1332595543-sup-5122@david-desktop>
	<CAOvn4qgZqY_KwCLhsNzd_GC9YUZEYU3iPb4u3MnpgfCn4J=GsQ@mail.gmail.com>
Message-ID: <CAB6mnxKXk_BpMnv=doPcg4kLLahirDqDSJxTzXO1bczUVJK88A@mail.gmail.com>

On Sat, Mar 24, 2012 at 8:09 AM, Thomas Kluyver <takowl at gmail.com> wrote:

> On 24 March 2012 13:28, David Froger <david.froger at gmail.com> wrote:
> > I've had similar problem in the past (but on Gnu/Linux), which was solve
> > to by setting the LC_ALL environnement variable to C. (never undersant
> why).
> > (with Bash: export LC_ALL=C)
>
> This rings a bell. I have a feeling importing pygtk can change the
> Python default encoding (sys.getdefaultencoding()). That's never
> normally changed, so Python 2 code can assume it's always ascii. I
> guess pygtk sets the encoding from the system locale, so if it's set
> to C, it will use ascii, and the problem won't appear.
>
>
That's interesting. I've noticed several different encodings used in numpy,
'ascii', 'latin1', and utf8. Might be worth taking a closer look at those
cases.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120324/a27d4030/attachment.html>

From bburan at cns.nyu.edu  Sat Mar 24 13:58:06 2012
From: bburan at cns.nyu.edu (Brad Buran)
Date: Sat, 24 Mar 2012 13:58:06 -0400
Subject: [Numpy-discussion] numpydoc.traitsdoc error
Message-ID: <CAHb_y2LSDHBtLKfZ0XjJf0hMs+_cqdi-3Dtnf+HVtAgV2jaMWQ@mail.gmail.com>

Not sure if this is the appropriate place to report the issue, but
I've been getting the following error when trying to build my docs
using Sphinx 1.1.3:

  File "C:\Python27\lib\site-packages\numpydoc\numpydoc.py", line 36,
in mangle_docstrings
    doc = get_doc_object(obj, what, u"\n".join(lines), config=cfg)
TypeError: get_doc_object() got multiple values for keyword argument 'config'

Full traceback is attached.  Is anyone else having a similar issue?

Thanks,
Brad
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sphinx-err-kpakmw.log
Type: application/octet-stream
Size: 4531 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120324/8bbe9d55/attachment.obj>

From charlesr.harris at gmail.com  Sat Mar 24 16:17:07 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 24 Mar 2012 14:17:07 -0600
Subject: [Numpy-discussion] Inconsistent type name formatting.
Message-ID: <CAB6mnxLxO5Fjm1YnopizxyeQjKjdeSi4YoSf5UeMt1Q-q8hLzA@mail.gmail.com>

Example:

In [40]: array([1, 2], dtype('>l')).dtype
Out[40]: dtype('>i8')

In [41]: array([1, 2], dtype('<l')).dtype
Out[41]: dtype('int64')

In [42]: array([1, 2], dtype('>q')).dtype
Out[42]: dtype('>i8')

In [43]: array([1, 2], dtype('<q')).dtype
Out[43]: dtype('int64')

In [44]: array([1, 2], dtype('>M8[D]')).dtype
Out[44]: dtype('>M8[D]')

In [45]: array([1, 2], dtype('<M8[D]')).dtype
Out[45]: dtype('<M8[D]')

I rather like the '>i8' form with '<' if appropriate, although it's
probably a bit late to do anything about it. The datetime requires a patch
to generate the swapped form. Long term, it might be good to strive for
consistency. Thoughts?

Chuck.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120324/dfb7e715/attachment.html>

From charlesr.harris at gmail.com  Sat Mar 24 17:13:53 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 24 Mar 2012 15:13:53 -0600
Subject: [Numpy-discussion] 1.7 blockers
Message-ID: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>

Hi All,

There several problems with numpy master that need to be fixed before a
release can be considered.

   1. Datetime on windows with mingw.
   2. Bus error on SPARC, ticket #2076.
   3. NA and real/complex views of complex arrays.

Number 1 has been proved to be particularly difficult, any help or
suggestions for that would be much appreciated. The current work has been
going in pull request 214 <https://github.com/numpy/numpy/pull/214>.

This isn't to say that there aren't a ton of other things that need fixing
or that we can skip out on the current stack of pull requests, but I think
it is impossible to consider a release while those three problems are
outstanding.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120324/cb95e9bd/attachment.html>

From ralf.gommers at googlemail.com  Sat Mar 24 17:39:02 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 24 Mar 2012 22:39:02 +0100
Subject: [Numpy-discussion] numpydoc.traitsdoc error
In-Reply-To: <CAHb_y2LSDHBtLKfZ0XjJf0hMs+_cqdi-3Dtnf+HVtAgV2jaMWQ@mail.gmail.com>
References: <CAHb_y2LSDHBtLKfZ0XjJf0hMs+_cqdi-3Dtnf+HVtAgV2jaMWQ@mail.gmail.com>
Message-ID: <CABL7CQiGVSCz9dy13-bZMMw8Jkcj6aAx3fY=CHyOxagjboFMqA@mail.gmail.com>

On Sat, Mar 24, 2012 at 6:58 PM, Brad Buran <bburan at cns.nyu.edu> wrote:

> Not sure if this is the appropriate place to report the issue, but
> I've been getting the following error when trying to build my docs
> using Sphinx 1.1.3:
>
>  File "C:\Python27\lib\site-packages\numpydoc\numpydoc.py", line 36,
> in mangle_docstrings
>    doc = get_doc_object(obj, what, u"\n".join(lines), config=cfg)
> TypeError: get_doc_object() got multiple values for keyword argument
> 'config'
>
> Full traceback is attached.  Is anyone else having a similar issue?
>
> To determine if the problem is in your toolchain or in your code, you can
try to build the numpy docs. I just checked that they build for me with
Sphinx 1.1.3. If you have numpy installed, simply type "make html" or "make
latex" in the doc/ dir.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120324/a405b976/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar 25 08:57:20 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 14:57:20 +0200
Subject: [Numpy-discussion] label NA and datetime as experimental
Message-ID: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>

Hi,

We decided to label both NA and datetime APIs as experimental for the 1.7.0
release. I made a PR that does this, please review:
https://github.com/numpy/numpy/pull/240

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/8572831e/attachment.html>

From bburan at cns.nyu.edu  Sun Mar 25 10:46:48 2012
From: bburan at cns.nyu.edu (Brad Buran)
Date: Sun, 25 Mar 2012 10:46:48 -0400
Subject: [Numpy-discussion] numpydoc.traitsdoc error
In-Reply-To: <CABL7CQiGVSCz9dy13-bZMMw8Jkcj6aAx3fY=CHyOxagjboFMqA@mail.gmail.com>
References: <CAHb_y2LSDHBtLKfZ0XjJf0hMs+_cqdi-3Dtnf+HVtAgV2jaMWQ@mail.gmail.com>
	<CABL7CQiGVSCz9dy13-bZMMw8Jkcj6aAx3fY=CHyOxagjboFMqA@mail.gmail.com>
Message-ID: <CAHb_y2+kAr++Q2YgCm4RC6CANybkB_6QPa3wT+JExs-Ls+6bkQ@mail.gmail.com>

Hi Ralf:

The Numpy documentation built fine for me.  I actually think the
problem is with the traitsdoc extension (when I change my conf.py file
to use the numpydoc extension instead of numpydoc.traitsdoc the build
works fine).  I checked the traits and traitsui packages to see if I
could use them for testing, but it appears that they do not use the
traitsdoc extension.

Brad

On Sat, Mar 24, 2012 at 5:39 PM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Sat, Mar 24, 2012 at 6:58 PM, Brad Buran <bburan at cns.nyu.edu> wrote:
>>
>> Not sure if this is the appropriate place to report the issue, but
>> I've been getting the following error when trying to build my docs
>> using Sphinx 1.1.3:
>>
>> ?File "C:\Python27\lib\site-packages\numpydoc\numpydoc.py", line 36,
>> in mangle_docstrings
>> ? ?doc = get_doc_object(obj, what, u"\n".join(lines), config=cfg)
>> TypeError: get_doc_object() got multiple values for keyword argument
>> 'config'
>>
>> Full traceback is attached. ?Is anyone else having a similar issue?
>>
> To determine if the problem is in your toolchain or in your code, you can
> try to build the numpy docs. I just checked that they build for me with
> Sphinx 1.1.3. If you have numpy installed, simply type "make html" or "make
> latex" in the doc/ dir.
>
> Ralf
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From ralf.gommers at googlemail.com  Sun Mar 25 12:12:12 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 18:12:12 +0200
Subject: [Numpy-discussion] empty chararrays (ticket 1948)
Message-ID: <CABL7CQgdW2Gcpr3MhiEBn-1xk3MAmwkCe2FdL=mbwujYrSOkoQ@mail.gmail.com>

Hi,

In ticket 1948 a backwards compatibility issue with chararray is reported.
Indexing a chararray with [] or a bool array of False used to return [] in
numpy 1.2.1 (consistent with ndarray behavior), but now returns an empty
string. Unfortunately this changed behavior has been present for the 1.5.x
and 1.6.x releases.

So the question is if this should be changed back or not? The change was
likely unintentional; there's no test for it.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/52b8b84a/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar 25 12:33:49 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 18:33:49 +0200
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
Message-ID: <CABL7CQi2bE+bEf3sYvAzs=SD677RQnbGSSmBD_=wr7UQUq5Sww@mail.gmail.com>

On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

> Hi All,
>
> There several problems with numpy master that need to be fixed before a
> release can be considered.
>
>    1. Datetime on windows with mingw.
>    2. Bus error on SPARC, ticket #2076.
>    3. NA and real/complex views of complex arrays.
>
> Number 1 has been proved to be particularly difficult, any help or
> suggestions for that would be much appreciated. The current work has been
> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>
> This isn't to say that there aren't a ton of other things that need fixing
> or that we can skip out on the current stack of pull requests, but I think
> it is impossible to consider a release while those three problems are
> outstanding.
>
There's one more ticket that hasn't been looked at AFAIK and that has been
keeping the buildbots (except the Linux one) on red:
http://projects.scipy.org/numpy/ticket/1755 (floating point errors).

The other tickets have been looked at and either have PRs already or should
be fixable with not too much effort.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/73217771/attachment.html>

From charlesr.harris at gmail.com  Sun Mar 25 12:48:39 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 25 Mar 2012 10:48:39 -0600
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CABL7CQi2bE+bEf3sYvAzs=SD677RQnbGSSmBD_=wr7UQUq5Sww@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
	<CABL7CQi2bE+bEf3sYvAzs=SD677RQnbGSSmBD_=wr7UQUq5Sww@mail.gmail.com>
Message-ID: <CAB6mnxJh1QxUN-yz-nDrzKKXVLgDCjgnTrkd2VsUW8nLSQ1Cpg@mail.gmail.com>

On Sun, Mar 25, 2012 at 10:33 AM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

>
>
> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> Hi All,
>>
>> There several problems with numpy master that need to be fixed before a
>> release can be considered.
>>
>>    1. Datetime on windows with mingw.
>>    2. Bus error on SPARC, ticket #2076.
>>    3. NA and real/complex views of complex arrays.
>>
>> Number 1 has been proved to be particularly difficult, any help or
>> suggestions for that would be much appreciated. The current work has been
>> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>>
>> This isn't to say that there aren't a ton of other things that need
>> fixing or that we can skip out on the current stack of pull requests, but I
>> think it is impossible to consider a release while those three problems are
>> outstanding.
>>
> There's one more ticket that hasn't been looked at AFAIK and that has been
> keeping the buildbots (except the Linux one) on red:
> http://projects.scipy.org/numpy/ticket/1755 (floating point errors).
>
>
I don't know what to do about that one.There may be some compiler flags
that would help.

The other tickets have been looked at and either have PRs already or should
> be fixable with not too much effort.
>
>
Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/7e19a63d/attachment.html>

From pierre.haessig at crans.org  Sun Mar 25 12:55:02 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Sun, 25 Mar 2012 18:55:02 +0200
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <201203191204.12564.rigal@rapideye.net>
References: <201203191204.12564.rigal@rapideye.net>
Message-ID: <4F6F4DE6.90203@crans.org>

Hi,

I have an off topic but somehow related question :

Le 19/03/2012 12:04, Matthieu Rigal a ?crit :
> array = numpy.logical_and(numpy.logical_and(aBlueChannel < 1.0, aNirChannel > 
> (aBlueChannel * 1.0)), aNirChannel < (aBlueChannel * 1.8))
Is there any significant difference between :

z = np.logical_and(x,y) and
z= x & y (assuming x and y are already numpy arrays and not just list)

I've always used the & (and | and ~) operator because it's of course
much shorter ;-)

I've seen no mention of the "&" operator in np.logical_and docstring so
I wonder...

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/41308ed3/attachment.sig>

From charlesr.harris at gmail.com  Sun Mar 25 13:03:39 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 25 Mar 2012 11:03:39 -0600
Subject: [Numpy-discussion] empty chararrays (ticket 1948)
In-Reply-To: <CABL7CQgdW2Gcpr3MhiEBn-1xk3MAmwkCe2FdL=mbwujYrSOkoQ@mail.gmail.com>
References: <CABL7CQgdW2Gcpr3MhiEBn-1xk3MAmwkCe2FdL=mbwujYrSOkoQ@mail.gmail.com>
Message-ID: <CAB6mnxKneJnmzdR5Nm+by2EzaWKwxV2Gy9YbuS8OcLHba0Yuaw@mail.gmail.com>

On Sun, Mar 25, 2012 at 10:12 AM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

> Hi,
>
> In ticket 1948 a backwards compatibility issue with chararray is reported.
> Indexing a chararray with [] or a bool array of False used to return [] in
> numpy 1.2.1 (consistent with ndarray behavior), but now returns an empty
> string. Unfortunately this changed behavior has been present for the 1.5.x
> and 1.6.x releases.
>
> So the question is if this should be changed back or not? The change was
> likely unintentional; there's no test for it.
>
>
I believe the stsci folks were maintaining chararray, although I don't see
anyone from there with commit permissions. Hmm... I'd be inclined to
reinstate the old behavior, but the stsci folks may have deliberately made
the change, I'd like to hear from them first.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/fa12c734/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar 25 13:07:32 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 19:07:32 +0200
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CAB6mnxJh1QxUN-yz-nDrzKKXVLgDCjgnTrkd2VsUW8nLSQ1Cpg@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
	<CABL7CQi2bE+bEf3sYvAzs=SD677RQnbGSSmBD_=wr7UQUq5Sww@mail.gmail.com>
	<CAB6mnxJh1QxUN-yz-nDrzKKXVLgDCjgnTrkd2VsUW8nLSQ1Cpg@mail.gmail.com>
Message-ID: <CABL7CQjT65PBG3qJHgmEpoJSmwA8abSVW-n343y3m6MBid4JBQ@mail.gmail.com>

On Sun, Mar 25, 2012 at 6:48 PM, Charles R Harris <charlesr.harris at gmail.com
> wrote:

>
>
> On Sun, Mar 25, 2012 at 10:33 AM, Ralf Gommers <
> ralf.gommers at googlemail.com> wrote:
>
>>
>>
>> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
>> charlesr.harris at gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> There several problems with numpy master that need to be fixed before a
>>> release can be considered.
>>>
>>>    1. Datetime on windows with mingw.
>>>    2. Bus error on SPARC, ticket #2076.
>>>    3. NA and real/complex views of complex arrays.
>>>
>>> Number 1 has been proved to be particularly difficult, any help or
>>> suggestions for that would be much appreciated. The current work has been
>>> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>>>
>>> This isn't to say that there aren't a ton of other things that need
>>> fixing or that we can skip out on the current stack of pull requests, but I
>>> think it is impossible to consider a release while those three problems are
>>> outstanding.
>>>
>> There's one more ticket that hasn't been looked at AFAIK and that has
>> been keeping the buildbots (except the Linux one) on red:
>> http://projects.scipy.org/numpy/ticket/1755 (floating point errors).
>>
>>
> I don't know what to do about that one.There may be some compiler flags
> that would help.
>

Why do you think this isn't a problem with the implementation? I haven't
tried to look at this in detail, but Python itself does manage to produce a
warning:
>>> 2.0**2048
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Result too large')
>>> import numpy as np
>>> np.power(2.0, 2048)
inf
>>> np.geterr()
{'over': 'warn', 'divide': 'warn', 'invalid': 'warn', 'under': 'ignore'}

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/05f6ba5e/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar 25 13:09:29 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 19:09:29 +0200
Subject: [Numpy-discussion] empty chararrays (ticket 1948)
In-Reply-To: <CAB6mnxKneJnmzdR5Nm+by2EzaWKwxV2Gy9YbuS8OcLHba0Yuaw@mail.gmail.com>
References: <CABL7CQgdW2Gcpr3MhiEBn-1xk3MAmwkCe2FdL=mbwujYrSOkoQ@mail.gmail.com>
	<CAB6mnxKneJnmzdR5Nm+by2EzaWKwxV2Gy9YbuS8OcLHba0Yuaw@mail.gmail.com>
Message-ID: <CABL7CQji1XFY1Me48ikFf9aYS0rJCdDg3iBwpbQKPZdZAJjAug@mail.gmail.com>

On Sun, Mar 25, 2012 at 7:03 PM, Charles R Harris <charlesr.harris at gmail.com
> wrote:

>
>
> On Sun, Mar 25, 2012 at 10:12 AM, Ralf Gommers <
> ralf.gommers at googlemail.com> wrote:
>
>> Hi,
>>
>> In ticket 1948 a backwards compatibility issue with chararray is
>> reported. Indexing a chararray with [] or a bool array of False used to
>> return [] in numpy 1.2.1 (consistent with ndarray behavior), but now
>> returns an empty string. Unfortunately this changed behavior has been
>> present for the 1.5.x and 1.6.x releases.
>>
>> So the question is if this should be changed back or not? The change was
>> likely unintentional; there's no test for it.
>>
>>
> I believe the stsci folks were maintaining chararray, although I don't see
> anyone from there with commit permissions. Hmm... I'd be inclined to
> reinstate the old behavior, but the stsci folks may have deliberately made
> the change, I'd like to hear from them first.
>

The change was made by Michael Droettboom (CC'd), who did have commit
permissions for this. They got lost with the Github move it seems.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/3aef4caf/attachment.html>

From efiring at hawaii.edu  Sun Mar 25 14:33:30 2012
From: efiring at hawaii.edu (Eric Firing)
Date: Sun, 25 Mar 2012 08:33:30 -1000
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <4F6F4DE6.90203@crans.org>
References: <201203191204.12564.rigal@rapideye.net> <4F6F4DE6.90203@crans.org>
Message-ID: <4F6F64FA.6020005@hawaii.edu>

On 03/25/2012 06:55 AM, Pierre Haessig wrote:
> Hi,
>
> I have an off topic but somehow related question :
>
> Le 19/03/2012 12:04, Matthieu Rigal a ?crit :
>> array = numpy.logical_and(numpy.logical_and(aBlueChannel<  1.0, aNirChannel>
>> (aBlueChannel * 1.0)), aNirChannel<  (aBlueChannel * 1.8))
> Is there any significant difference between :
>
> z = np.logical_and(x,y) and
> z= x&  y (assuming x and y are already numpy arrays and not just list)
>
> I've always used the&  (and | and ~) operator because it's of course
> much shorter ;-)
>
> I've seen no mention of the "&" operator in np.logical_and docstring so
> I wonder...

There is a big difference: &, |, and ~ are bitwise operators, not 
logical operators, so they work like logical operators only if operating 
on booleans (or at least arrays containing nothing but integer zeros and 
ones) and only if you bear in mind that & and | have lower precedence 
than their logical counterparts.  Therefore you often need to use more 
parentheses than you might have expected.

In [1]: a = np.array([1])

In [2]: b = np.array([2])

In [5]: np.logical_and(a,b)
Out[5]: array([ True], dtype=bool)

In [6]: a & b
Out[6]: array([0])


Using the bitwise operators in place of logical operators is a hack to 
get around limitations of the language; but, if done carefully, it is a 
useful one.

Eric

>
> Best,
> Pierre
>
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From ralf.gommers at googlemail.com  Sun Mar 25 16:57:28 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 22:57:28 +0200
Subject: [Numpy-discussion] numpydoc.traitsdoc error
In-Reply-To: <CAHb_y2+kAr++Q2YgCm4RC6CANybkB_6QPa3wT+JExs-Ls+6bkQ@mail.gmail.com>
References: <CAHb_y2LSDHBtLKfZ0XjJf0hMs+_cqdi-3Dtnf+HVtAgV2jaMWQ@mail.gmail.com>
	<CABL7CQiGVSCz9dy13-bZMMw8Jkcj6aAx3fY=CHyOxagjboFMqA@mail.gmail.com>
	<CAHb_y2+kAr++Q2YgCm4RC6CANybkB_6QPa3wT+JExs-Ls+6bkQ@mail.gmail.com>
Message-ID: <CABL7CQhPAW5zqk+8PhULr3fQUhXWb1jGkHrRP4WtHZrnAk0MoA@mail.gmail.com>

On Sun, Mar 25, 2012 at 4:46 PM, Brad Buran <bburan at cns.nyu.edu> wrote:

> Hi Ralf:
>
> The Numpy documentation built fine for me.  I actually think the
> problem is with the traitsdoc extension (when I change my conf.py file
> to use the numpydoc extension instead of numpydoc.traitsdoc the build
> works fine).  I checked the traits and traitsui packages to see if I
> could use them for testing, but it appears that they do not use the
> traitsdoc extension.
>

Hmm, I don't know anything about that. If you don't get an answer here, you
can try asking on an Enthought mailing list.

Ralf


>
> Brad
>
> On Sat, Mar 24, 2012 at 5:39 PM, Ralf Gommers
> <ralf.gommers at googlemail.com> wrote:
> >
> >
> > On Sat, Mar 24, 2012 at 6:58 PM, Brad Buran <bburan at cns.nyu.edu> wrote:
> >>
> >> Not sure if this is the appropriate place to report the issue, but
> >> I've been getting the following error when trying to build my docs
> >> using Sphinx 1.1.3:
> >>
> >>  File "C:\Python27\lib\site-packages\numpydoc\numpydoc.py", line 36,
> >> in mangle_docstrings
> >>    doc = get_doc_object(obj, what, u"\n".join(lines), config=cfg)
> >> TypeError: get_doc_object() got multiple values for keyword argument
> >> 'config'
> >>
> >> Full traceback is attached.  Is anyone else having a similar issue?
> >>
> > To determine if the problem is in your toolchain or in your code, you can
> > try to build the numpy docs. I just checked that they build for me with
> > Sphinx 1.1.3. If you have numpy installed, simply type "make html" or
> "make
> > latex" in the doc/ dir.
> >
> > Ralf
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/bdfad7b7/attachment.html>

From ralf.gommers at googlemail.com  Sun Mar 25 17:14:24 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 25 Mar 2012 23:14:24 +0200
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
Message-ID: <CABL7CQhFkTj+J1XBpanKPzgzrDc=S6PXJc+PzZ76thaGDSttFw@mail.gmail.com>

On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

> Hi All,
>
> There several problems with numpy master that need to be fixed before a
> release can be considered.
>
>    1. Datetime on windows with mingw.
>    2. Bus error on SPARC, ticket #2076.
>    3. NA and real/complex views of complex arrays.
>
> Number 1 has been proved to be particularly difficult, any help or
> suggestions for that would be much appreciated. The current work has been
> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>
> This isn't to say that there aren't a ton of other things that need fixing
> or that we can skip out on the current stack of pull requests, but I think
> it is impossible to consider a release while those three problems are
> outstanding.
>
Why do you consider (2) a blocker? Not saying it's not important, but there
are eight other open tickets with segfaults. Some are more esoteric than
other, but I don't see why for example #1713 and #1808 are less important
than this one.

#1522 provides a patch that fixes a segfault by the way, could use a review.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/47a62564/attachment.html>

From pierre.haessig at crans.org  Sun Mar 25 18:22:25 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Mon, 26 Mar 2012 00:22:25 +0200
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <4F6F64FA.6020005@hawaii.edu>
References: <201203191204.12564.rigal@rapideye.net> <4F6F4DE6.90203@crans.org>
	<4F6F64FA.6020005@hawaii.edu>
Message-ID: <4F6F9AA1.3060007@crans.org>

Hi Eric,

Thanks for the hints !

Le 25/03/2012 20:33, Eric Firing a ?crit :
> Using the bitwise operators in place of logical operators is a hack to 
> get around limitations of the language; but, if done carefully, it is a 
> useful one.
What is the rationale behind not overloading __and__ & other logical
operations ?
Is it a requirement that boolean operators should always return *a bool*
and not an *array of bools* ?

-- 
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/bbf179ca/attachment.sig>

From pierre.haessig at crans.org  Sun Mar 25 18:30:43 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Mon, 26 Mar 2012 00:30:43 +0200
Subject: [Numpy-discussion] different percentile implementations ?
Message-ID: <4F6F9C93.4050208@crans.org>

Hi,

A quick question I've had in mind for some time but didn't find a solution :
Is there a significant difference between "numpy.percentile" and
"scipy.stats.scoreatpercentile" ?

Of course the signatures are somewhat different, but I have the feeling
that the overall purpose is the same. Am I missing something ?

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/c982a235/attachment.sig>

From charlesr.harris at gmail.com  Sun Mar 25 19:27:19 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 25 Mar 2012 17:27:19 -0600
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CABL7CQhFkTj+J1XBpanKPzgzrDc=S6PXJc+PzZ76thaGDSttFw@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
	<CABL7CQhFkTj+J1XBpanKPzgzrDc=S6PXJc+PzZ76thaGDSttFw@mail.gmail.com>
Message-ID: <CAB6mnxLHZ9RM81jdzPMZ1n90bzosPu2X6Tz8dSaCfk9-ZQ=f8A@mail.gmail.com>

On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

>
>
> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> Hi All,
>>
>> There several problems with numpy master that need to be fixed before a
>> release can be considered.
>>
>>    1. Datetime on windows with mingw.
>>    2. Bus error on SPARC, ticket #2076.
>>    3. NA and real/complex views of complex arrays.
>>
>> Number 1 has been proved to be particularly difficult, any help or
>> suggestions for that would be much appreciated. The current work has been
>> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>>
>> This isn't to say that there aren't a ton of other things that need
>> fixing or that we can skip out on the current stack of pull requests, but I
>> think it is impossible to consider a release while those three problems are
>> outstanding.
>>
> Why do you consider (2) a blocker? Not saying it's not important, but
> there are eight other open tickets with segfaults. Some are more esoteric
> than other, but I don't see why for example #1713 and #1808 are less
> important than this one.
>
> #1522 provides a patch that fixes a segfault by the way, could use a
> review.
>
>
I wasn't aware of the other segfaults, I'd like to get them all fixed...
The list was meant to elicit additions.

I don't know where the missed floating point errors come from, but they are
somewhat dependent on the compiler doing the right thing and hardware
support. I'd welcome any insight into why we get them on SPARC (underflow)
and Windows (overflow). The windows buildbot doesn't seem to be updating
correctly since it is still missing the combinations method that is now
part of the test module.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120325/2d81b84e/attachment.html>

From efiring at hawaii.edu  Sun Mar 25 20:01:07 2012
From: efiring at hawaii.edu (Eric Firing)
Date: Sun, 25 Mar 2012 14:01:07 -1000
Subject: [Numpy-discussion] Using logical function on more than 2 arrays,
 availability of a "between" function ?
In-Reply-To: <4F6F9AA1.3060007@crans.org>
References: <201203191204.12564.rigal@rapideye.net> <4F6F4DE6.90203@crans.org>
	<4F6F64FA.6020005@hawaii.edu> <4F6F9AA1.3060007@crans.org>
Message-ID: <4F6FB1C3.80204@hawaii.edu>

On 03/25/2012 12:22 PM, Pierre Haessig wrote:
> Hi Eric,
>
> Thanks for the hints !
>
> Le 25/03/2012 20:33, Eric Firing a ?crit :
>> Using the bitwise operators in place of logical operators is a hack to
>> get around limitations of the language; but, if done carefully, it is a
>> useful one.
> What is the rationale behind not overloading __and__&  other logical
> operations ?
> Is it a requirement that boolean operators should always return *a bool*
> and not an *array of bools* ?

Pierre,

See http://www.python.org/dev/peps/pep-0335/

Eric


From josef.pktd at gmail.com  Sun Mar 25 20:06:14 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 25 Mar 2012 20:06:14 -0400
Subject: [Numpy-discussion] GSoC 2012
Message-ID: <CAMMTP+BiytvVCcPC3R-MOhUAdGDSuNms42kQOVEqWVO7g0SmHQ@mail.gmail.com>

What happened to any plans for GSOC?

http://wiki.python.org/moin/SummerOfCode/2012

Josef


From mictadlo at gmail.com  Sun Mar 25 23:44:41 2012
From: mictadlo at gmail.com (Mic)
Date: Mon, 26 Mar 2012 13:44:41 +1000
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CANg26EX+KotYUKwdHuQ1oBog0OrFzf=eCHauXGB236Bmv9P1iw@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
	<2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
	<CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>
	<CANg26EX+KotYUKwdHuQ1oBog0OrFzf=eCHauXGB236Bmv9P1iw@mail.gmail.com>
Message-ID: <CAOP6n=jx-efOuG=baiyqpt4YdeKU9-SEGrwi=4O-pJifNRgisg@mail.gmail.com>

How about:
* http://www.hotpy.org/
* http://pypy.org/numpydonate.html


On Wed, Mar 21, 2012 at 11:14 PM, mark florisson
<markflorisson88 at gmail.com>wrote:

> On 20 March 2012 20:49, Olivier Delalleau <shish at keba.be> wrote:
> > I doubt Theano is already as smart as you'd want it to be right now,
> however
> > the core mechanisms are there to perform graph optimizations and move
> > computations to GPU. It may save time to start from there instead of
> > starting all over from scratch. I'm not sure though, but it looks like it
> > would be worth considering it at least.
>
> Thanks for the suggestion Olivier, as Dag said we discusses it, and
> indeed we (or I) should look a lot deeper into it and see what
> components are reusable there and discuss with the Theano community if
> and how we can collaborate.
>
> > -=- Olivier
> >
> > Le 20 mars 2012 15:40, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no>
> a
> > ?crit :
> >
> >> We talked some about Theano. There are some differences in project goals
> >> which means that it makes sense to make this a seperate project: Cython
> >> wants to use this to generate C code up front from the Cython AST at
> >> compilation time; numba also has a different frontend (parsing of python
> >> bytecode) and a different backend (LLVM).
> >>
> >> However, it may very well be possible that Theano could be refactored so
> >> that the more essential algorithms working on the syntax tree could be
> >> pulled out and shared with cython and numba. Then the question is
> whether
> >> the core of Theano is smart enough to compete with Fortran compilers and
> >> support arbitraily strided inputs optimally. Otherwise one might as well
> >> start from scratch. I'll leave that for Mark to figure out...
> >>
> >> Dag
> >> --
> >> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
> >>
> >>
> >> Olivier Delalleau <shish at keba.be> wrote:
> >>>
> >>> This sounds a lot like Theano, did you look into it?
> >>>
> >>> -=- Olivier
> >>>
> >>> Le 20 mars 2012 13:49, mark florisson <markflorisson88 at gmail.com> a
> ?crit
> >>> :
> >>>>
> >>>> On 13 March 2012 18:18, Travis Oliphant <travis at continuum.io> wrote:
> >>>> >>>
> >>>> >>> (Mark F., how does the above match how you feel about this?)
> >>>> >>
> >>>> >> I would like collaboration, but from a technical perspective I
> think
> >>>> >> this would be much more involved than just dumping the AST to an IR
> >>>> >> and generating some code from there. For vector expressions I think
> >>>> >> sharing code would be more feasible than arbitrary (parallel)
> loops,
> >>>> >> etc. Cython as a compiler can make many decisions that a Python
> >>>> >> (bytecode) compiler can't make (at least without annotations and a
> >>>> >> well-defined subset of the language (not so much the syntax as the
> >>>> >> semantics)). I think in numba, if parallelism is to be supported,
> you
> >>>> >> will want a prange-like construct, as proving independence between
> >>>> >> iterations can be very hard to near impossible for a compiler.
> >>>> >
> >>>> > I completely agree that you have to define some kind of syntax to
> get
> >>>> > parallelism.  But, a prange construct would not be out of the
> question, of
> >>>> > course.
> >>>> >
> >>>> >>
> >>>> >> As for code generation, I'm not sure how llvm would do things like
> >>>> >> slicing arrays, reshaping, resizing etc (for vector expressions you
> >>>> >> can first evaluate all slicing and indexing operations and then
> >>>> >> compile the remaining vector expression), but for loops and array
> >>>> >> reassignment within loops this would have to invoke the actual
> >>>> >> slicing
> >>>> >> code from the llvm code (I presume).
> >>>> >
> >>>> > There could be some analysis on the byte-code, prior to emitting the
> >>>> > llvm code in order to handle lots of things.   Basically, you have
> to "play"
> >>>> > the byte-code on a simple machine anyway in order to emit the
> correct code.
> >>>> >   The big thing about Cython is you have to typedef too many things
> that are
> >>>> > really quite knowable from the code.   If Cython could improve it's
> type
> >>>> > inference, then it would be a more suitable target.
> >>>> >
> >>>> >> There are many other things, like
> >>>> >> bounds checking, wraparound, etc, that are all supported in both
> >>>> >> numpy
> >>>> >> and Cython, but going through an llvm layer would as far as I can
> >>>> >> see,
> >>>> >> require re-implementing those, at least if you want top-notch
> >>>> >> performance. Personally, I think for non-trivial
> performance-critical
> >>>> >> code (for loops with indexing, slicing, function calls, etc) Cython
> >>>> >> is
> >>>> >> a better target.
> >>>> >
> >>>> > With libclang it is really quite possible to imagine a cython -> C
> >>>> > target that itself compiles to llvm so that you can do everything
> at that
> >>>> > intermediate layer.   However,  LLVM is a much better layer for
> optimization
> >>>> > than C now that there are a lot of people collaborating on that
> layer.   I
> >>>> > think it would be great if Cython targeted LLVM actually instead of
> C.
> >>>> >
> >>>> >>
> >>>> >> Finally, as for non-vector-expression code, I really believe Cython
> >>>> >> is
> >>>> >> a better target. cython.inline can have high overhead (at least the
> >>>> >> first time it has to compile), but with better (numpy-aware) type
> >>>> >> inference or profile guided optimizations (see recent threads on
> the
> >>>> >> cython-dev mailing list), in addition to things like prange, I
> >>>> >> personally believe Cython targets most of the use cases where numba
> >>>> >> would be able to generate performing code.
> >>>> >
> >>>> > Cython and Numba certainly overlap.  However, Cython requires:
> >>>> >
> >>>> >        1) learning another language
> >>>> >        2) creating an extension module --- loading bit-code files
> and
> >>>> > dynamically executing (even on a different machine from the one that
> >>>> > initially created them) can be a powerful alternative for run-time
> >>>> > compilation and distribution of code.
> >>>> >
> >>>> > These aren't show-stoppers obviously.   But, I think some users
> would
> >>>> > prefer an even simpler approach to getting fast-code than Cython
> (which
> >>>> > currently doesn't do enought type-inference and requires building a
> dlopen
> >>>> > extension module).
> >>>>
> >>>> Dag and I have been discussing this at PyCon, and here is my take on
> >>>> it (at this moment :).
> >>>>
> >>>> Definitely, if you can avoid Cython then that is easier and more
> >>>> desirable in many ways. So perhaps we can create a third project
> >>>> called X (I'm not very creative, maybe ArrayExprOpt), that takes an
> >>>> abstract syntax tree in a rather simple form, performs code
> >>>> optimizations such as rewriting loops with array accesses to vector
> >>>> expressions, fusing vector expressions and loops, etc, and spits out a
> >>>> transformed AST containing these optimizations. If runtime information
> >>>> is given such as actual shape and stride information the
> >>>> transformations could figure out there and then whether to do things
> >>>> like collapsing, axes swapping, blocking (as in, introducing more axes
> >>>> or loops to retain discontiguous blocks in the cache), blocked memory
> >>>> copies to contiguous chunks, etc. The AST could then also say whether
> >>>> the final expressions are vectorizable. Part of this functionality is
> >>>> already in numpy's nditer, except that this would be implicit and do
> >>>> more (and hopefully with minimal overhead).
> >>>>
> >>>> So numba, Cython and maybe numexpr could use the functionality, simply
> >>>> by building the AST from Python and converting back (if necessary) to
> >>>> its own AST. As such, the AST optimizer would be only part of any
> >>>> (runtime) compiler's pipeline, and it should be very flexible to
> >>>> retain any information (metadata regarding actual types, control flow
> >>>> information, etc) provided by the original AST. It would not do
> >>>> control flow analysis, type inference or promotion, etc, but only deal
> >>>> with abstract types like integers, reals and arrays (C, Fortran or
> >>>> partly contiguous or strided). It would not deal with objects, but
> >>>> would allow to insert nodes like UnreorderableNode and SideEffectNode
> >>>> wrapping parts of the original AST. In short, it should be as easy as
> >>>> possible to convert from an original AST to this project's AST and
> >>>> back again afterwards.
> >>>>
> >>>> As the project matures many optimizations may be added that deal with
> >>>> all sorts of loop restructuring and ways to efficiently utilize the
> >>>> cache as well as enable vectorization and possibly parallelism.
> >>>> Perhaps it could even generate a different AST depending on whether
> >>>> execution target the CPU or the GPU (with optionally available
> >>>> information such as cache sizes, GPU shared/local memory sizes, etc).
> >>>>
> >>>> Seeing that this would be a part of my master dissertation, my
> >>>> supervisor would require me to write the code, so at least until
> >>>> August I think I would have to write (at least the bulk of) this.
> >>>> Otherwise I can also make other parts of my dissertation's project
> >>>> more prominent to make up for it. Anyway, my question is, is there
> >>>> interest from at least the numba and numexpr projects (if code can be
> >>>> transformed into vector operations, it makes sense to use numexpr for
> >>>> that, I'm not sure what numba's interest is in that).
> >>>>
> >>>> > -Travis
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >>
> >>>> >>> Dag
> >>>> >>> _______________________________________________
> >>>> >>> NumPy-Discussion mailing list
> >>>> >>> NumPy-Discussion at scipy.org
> >>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>> _______________________________________________
> >>>> >>> NumPy-Discussion mailing list
> >>>> >>> NumPy-Discussion at scipy.org
> >>>> >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>> >>>
> >>>> >> _______________________________________________
> >>>> >> NumPy-Discussion mailing list
> >>>> >> NumPy-Discussion at scipy.org
> >>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>> >
> >>>> > _______________________________________________
> >>>> > NumPy-Discussion mailing list
> >>>> > NumPy-Discussion at scipy.org
> >>>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>> _______________________________________________
> >>>> NumPy-Discussion mailing list
> >>>> NumPy-Discussion at scipy.org
> >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>>
> >>>
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/5e4cb8a0/attachment.html>

From d.s.seljebotn at astro.uio.no  Sun Mar 25 23:53:31 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Sun, 25 Mar 2012 20:53:31 -0700
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <CAOP6n=jx-efOuG=baiyqpt4YdeKU9-SEGrwi=4O-pJifNRgisg@mail.gmail.com>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
	<2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
	<CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>
	<CANg26EX+KotYUKwdHuQ1oBog0OrFzf=eCHauXGB236Bmv9P1iw@mail.gmail.com>
	<CAOP6n=jx-efOuG=baiyqpt4YdeKU9-SEGrwi=4O-pJifNRgisg@mail.gmail.com>
Message-ID: <4F6FE83B.109@astro.uio.no>

On 03/25/2012 08:44 PM, Mic wrote:
> How about:
> * http://www.hotpy.org/

The front page says a 10x speedup. That's a bit short of the almost 
1000x speedup required for numerical code (that is, for some examples 
Python is thousands of times slower than C or Fortran).

Well -- I'm sure hotpy could get beyond 10x as well -- it's just to say 
that they have probably not looked much on the case of numerical 
computation.

> * http://pypy.org/numpydonate.html

Well, you can start with these...:

http://technicaldiscovery.blogspot.com/2011/10/thoughts-on-porting-numpy-to-pypy.html

http://blog.streamitive.com/2011/10/19/more-thoughts-on-arrays-in-pypy/

Dag


From rhattersley at gmail.com  Mon Mar 26 04:29:20 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Mon, 26 Mar 2012 09:29:20 +0100
Subject: [Numpy-discussion] label NA and datetime as experimental
In-Reply-To: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>
References: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>
Message-ID: <CAP=RS9=aFgoc2kFAKem8YgvOBaCSweBQP0ggBxy_-k3uZ+CKZQ@mail.gmail.com>

Hi,

My team are currently experimenting with extending datetime to allow
alternative, non-physical calendars (e.g. 360-day used by climate
modellers). Once we've got a handle on the options we'd like to
propose the extensions/changes back to NumPy. Obviously we'd like to
avoid wasted effort, so are there some aspects of datetime64 which are
more experimental than others? Is there a summary of unresolved issues
and/or plans for change?

Thanks,
Richard Hattersley

On 25 March 2012 13:57, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
> Hi,
>
> We decided to label both NA and datetime APIs as experimental for the 1.7.0
> release. I made a PR that does this, please review:
> https://github.com/numpy/numpy/pull/240
>
> Ralf
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From ch.ro.dann at googlemail.com  Mon Mar 26 07:37:04 2012
From: ch.ro.dann at googlemail.com (Christoph Dann)
Date: Mon, 26 Mar 2012 13:37:04 +0200
Subject: [Numpy-discussion] Linking against MKL but still slow?
Message-ID: <CAOxAswnJhb7_YHGxisVNLDnSm5bqxVzp9wUB6S61zsZTdCTF=w@mail.gmail.com>

Dear list,

so far I used Enthoughts Python Distribution which contains a compiled
version of numpy linked against MKL. Now, I want to implement my own
extensions to numpy, so I need my build numpy on my own. So, I
installed Intel Parallel studio including MKL and the C / Fortran
compilers.

I linked against the same libraries as Enthought:

In [2]: np.show_config()
lapack_opt_info:
? ? libraries = ['mkl_lapack95_lp64', 'iomp5', 'mkl_def',
'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'pthread']
? ? library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
? ? define_macros = [('SCIPY_MKL_H', None)]
? ? include_dirs = ['/opt/intel/include/']
blas_opt_info:
? ? libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64',
'mkl_intel_thread', 'mkl_core', 'pthread']
? ? library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
? ? define_macros = [('SCIPY_MKL_H', None)]
? ? include_dirs = ['/opt/intel/include/']
lapack_mkl_info:
? ? libraries = ['mkl_lapack95_lp64', 'iomp5', 'mkl_def',
'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'pthread']
? ? library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
? ? define_macros = [('SCIPY_MKL_H', None)]
? ? include_dirs = ['/opt/intel/include/']
blas_mkl_info:
? ? libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64',
'mkl_intel_thread', 'mkl_core', 'pthread']
? ? library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
? ? define_macros = [('SCIPY_MKL_H', None)]
? ? include_dirs = ['/opt/intel/include/']
mkl_info:
? ? libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64',
'mkl_intel_thread', 'mkl_core', 'pthread']
? ? library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
? ? define_macros = [('SCIPY_MKL_H', None)]
? ? include_dirs = ['/opt/intel/include/']


and used the intel compilers to build numpy. I have an Intel i7
processor and compile including AVX instructions. Yet, EPD is double
as fast as my own build executing the simple benchmark from:
http://dpinte.wordpress.com/2010/01/15/numpy-performance-improvement-with-the-mkl/
I expected at least comparable performance.?How is such a decrease
possible? Did I miss a significant part to make numpy really fast?

Thanks
Christoph


From ch.ro.dann at googlemail.com  Mon Mar 26 05:06:56 2012
From: ch.ro.dann at googlemail.com (Christoph Dann)
Date: Mon, 26 Mar 2012 02:06:56 -0700 (PDT)
Subject: [Numpy-discussion] Linking against MKL but still slow?
Message-ID: <6471907.1930.1332752816588.JavaMail.geo-discussion-forums@vbiz13>

Dear list,

so far I used Enthoughts Python Distribution which contains a compiled 
version of numpy linked against MKL. Now, I want to implement my own 
extensions to numpy, so I need my build numpy on my own. So, I installed 
Intel Parallel studio including MKL and the C / Fortran compilers.

I linked against the same libraries as Enthought:

In [2]: np.show_config()
lapack_opt_info:
    libraries = ['mkl_lapack95_lp64', 'iomp5', 'mkl_def', 'mkl_intel_lp64', 
'mkl_intel_thread', 'mkl_core', 'pthread']
    library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/include/']
blas_opt_info:
    libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 
'mkl_core', 'pthread']
    library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/include/']
lapack_mkl_info:
    libraries = ['mkl_lapack95_lp64', 'iomp5', 'mkl_def', 'mkl_intel_lp64', 
'mkl_intel_thread', 'mkl_core', 'pthread']
    library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/include/']
blas_mkl_info:
    libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 
'mkl_core', 'pthread']
    library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/include/']
mkl_info:
    libraries = ['iomp5', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 
'mkl_core', 'pthread']
    library_dirs = ['/opt/intel/lib/intel64', '/opt/intel/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None)]
    include_dirs = ['/opt/intel/include/']


and used the intel compilers to build numpy. Since I have an i7 processor I 
compiled including AVX instructions. Yet, EPD is double as fast as my own 
build executing the simple benchmark from: 
http://dpinte.wordpress.com/2010/01/15/numpy-performance-improvement-with-the-mkl/
I expected at least comparable performance. How is such a decrease 
possible? Did I miss a significant part to make numpy really fast?

Thanks
Christoph
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/2b74deb7/attachment.html>

From cournape at gmail.com  Mon Mar 26 10:22:37 2012
From: cournape at gmail.com (David Cournapeau)
Date: Mon, 26 Mar 2012 15:22:37 +0100
Subject: [Numpy-discussion] Linking against MKL but still slow?
In-Reply-To: <6471907.1930.1332752816588.JavaMail.geo-discussion-forums@vbiz13>
References: <6471907.1930.1332752816588.JavaMail.geo-discussion-forums@vbiz13>
Message-ID: <CAGY4rcX9UsHMZpfT5oPsZYTgo8j7OC24EDY1KcuWQ4sO4g1gLw@mail.gmail.com>

Hi Christoph,

On Mon, Mar 26, 2012 at 10:06 AM, Christoph Dann
<ch.ro.dann at googlemail.com>wrote:

> Dear list,
>
> so far I used Enthoughts Python Distribution which contains a compiled
> version of numpy linked against MKL. Now, I want to implement my own
> extensions to numpy, so I need my build numpy on my own. So, I installed
> Intel Parallel studio including MKL and the C / Fortran compilers.
>

What do you mean by own extensions to NumPy ? If you mean building
extensions against the C API of NumPy, then you don't need to build your
own NumPy. Building NumPy with Intel Compilers and MKL is a non-trivial
process, so I would rather avoid it.

If you still want to build it by yourself, could you give us the full
output of your build ?

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/d5833453/attachment.html>

From charlesr.harris at gmail.com  Mon Mar 26 11:42:27 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 26 Mar 2012 09:42:27 -0600
Subject: [Numpy-discussion] label NA and datetime as experimental
In-Reply-To: <CAP=RS9=aFgoc2kFAKem8YgvOBaCSweBQP0ggBxy_-k3uZ+CKZQ@mail.gmail.com>
References: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>
	<CAP=RS9=aFgoc2kFAKem8YgvOBaCSweBQP0ggBxy_-k3uZ+CKZQ@mail.gmail.com>
Message-ID: <CAB6mnxKun0bXehNLPHgX+cshXSt67Wsd1+Qm4fvBZRHECe8pKg@mail.gmail.com>

On Mon, Mar 26, 2012 at 2:29 AM, Richard Hattersley
<rhattersley at gmail.com>wrote:

> Hi,
>
> My team are currently experimenting with extending datetime to allow
> alternative, non-physical calendars (e.g. 360-day used by climate
> modellers). Once we've got a handle on the options we'd like to
> propose the extensions/changes back to NumPy. Obviously we'd like to
> avoid wasted effort, so are there some aspects of datetime64 which are
> more experimental than others? Is there a summary of unresolved issues
> and/or plans for change?
>
>
I believe datetime is already used by Pandas, so I don't think there will
be major changes there. I'm not aware of open issues, but I could be wrong.
The calenders are a bit independent, so I think the best procedure is to go
ahead with your work. We want to leave some wiggle room since new features
often need a little time to mature. That's how it looks to me anyway.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/95b537cc/attachment.html>

From ralf.gommers at googlemail.com  Mon Mar 26 16:03:27 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 26 Mar 2012 22:03:27 +0200
Subject: [Numpy-discussion] label NA and datetime as experimental
In-Reply-To: <CAB6mnxKun0bXehNLPHgX+cshXSt67Wsd1+Qm4fvBZRHECe8pKg@mail.gmail.com>
References: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>
	<CAP=RS9=aFgoc2kFAKem8YgvOBaCSweBQP0ggBxy_-k3uZ+CKZQ@mail.gmail.com>
	<CAB6mnxKun0bXehNLPHgX+cshXSt67Wsd1+Qm4fvBZRHECe8pKg@mail.gmail.com>
Message-ID: <CABL7CQgSJU--sVwnPz65JE+KgfUFWiLAHm7BBs9quc-txJc3Jw@mail.gmail.com>

On Mon, Mar 26, 2012 at 5:42 PM, Charles R Harris <charlesr.harris at gmail.com
> wrote:

>
>
> On Mon, Mar 26, 2012 at 2:29 AM, Richard Hattersley <rhattersley at gmail.com
> > wrote:
>
>> Hi,
>>
>> My team are currently experimenting with extending datetime to allow
>> alternative, non-physical calendars (e.g. 360-day used by climate
>> modellers). Once we've got a handle on the options we'd like to
>> propose the extensions/changes back to NumPy. Obviously we'd like to
>> avoid wasted effort, so are there some aspects of datetime64 which are
>> more experimental than others? Is there a summary of unresolved issues
>> and/or plans for change?
>>
>>
> I believe datetime is already used by Pandas, so I don't think there will
> be major changes there. I'm not aware of open issues, but I could be wrong.
> The calenders are a bit independent, so I think the best procedure is to go
> ahead with your work. We want to leave some wiggle room since new features
> often need a little time to mature. That's how it looks to me anyway.
>

That's my understanding too. Perhaps Mark can comment on the current
status. That status and changes need to still be described in the release
notes by the way.

The experimental tag is mostly due to the datetime history: it was
introduced in 1.4.0, removed again in 1.4.1, reintroduced in 1.6.0, the API
then labeled not useful (
http://thread.gmane.org/gmane.comp.python.numeric.general/44162/focus=44385),
then more changes for this release. I hope it's stable now, but seeing what
came before and that it still doesn't work with MinGW it's hard to be sure.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/0d5f039f/attachment.html>

From rhattersley at gmail.com  Mon Mar 26 16:13:51 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Mon, 26 Mar 2012 21:13:51 +0100
Subject: [Numpy-discussion] label NA and datetime as experimental
In-Reply-To: <CABL7CQgSJU--sVwnPz65JE+KgfUFWiLAHm7BBs9quc-txJc3Jw@mail.gmail.com>
References: <CABL7CQgBB99WopVgUGnQGxEOosMCaODpBdi=4HGGzY6JFUtAsg@mail.gmail.com>
	<CAP=RS9=aFgoc2kFAKem8YgvOBaCSweBQP0ggBxy_-k3uZ+CKZQ@mail.gmail.com>
	<CAB6mnxKun0bXehNLPHgX+cshXSt67Wsd1+Qm4fvBZRHECe8pKg@mail.gmail.com>
	<CABL7CQgSJU--sVwnPz65JE+KgfUFWiLAHm7BBs9quc-txJc3Jw@mail.gmail.com>
Message-ID: <CAP=RS9kvN0sO_6+bGmLt42cncBHKEtBmsuw7i5ERFT9bg96Ang@mail.gmail.com>

OK - that's useful feedback.

Thanks!

On 26 March 2012 21:03, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
>
>
> On Mon, Mar 26, 2012 at 5:42 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>>
>> On Mon, Mar 26, 2012 at 2:29 AM, Richard Hattersley
>> <rhattersley at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> My team are currently experimenting with extending datetime to allow
>>> alternative, non-physical calendars (e.g. 360-day used by climate
>>> modellers). Once we've got a handle on the options we'd like to
>>> propose the extensions/changes back to NumPy. Obviously we'd like to
>>> avoid wasted effort, so are there some aspects of datetime64 which are
>>> more experimental than others? Is there a summary of unresolved issues
>>> and/or plans for change?
>>>
>>
>> I believe datetime is already used by Pandas, so I don't think there will
>> be major changes there. I'm not aware of open issues, but I could be wrong.
>> The calenders are a bit independent, so I think the best procedure is to go
>> ahead with your work. We want to leave some wiggle room since new features
>> often need a little time to mature. That's how it looks to me anyway.
>
>
> That's my understanding too. Perhaps Mark can comment on the current status.
> That status and changes need to still be described in the release notes by
> the way.
>
> The experimental tag is mostly due to the datetime history: it was
> introduced in 1.4.0, removed again in 1.4.1, reintroduced in 1.6.0, the API
> then labeled not useful
> (http://thread.gmane.org/gmane.comp.python.numeric.general/44162/focus=44385),
> then more changes for this release. I hope it's stable now, but seeing what
> came before and that it still doesn't work with MinGW it's hard to be sure.
>
> Ralf
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From bryanv at continuum.io  Mon Mar 26 17:34:11 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Mon, 26 Mar 2012 16:34:11 -0500
Subject: [Numpy-discussion] "arg" searchsorted
Message-ID: <4F70E0D3.2090202@continuum.io>

I recently got asked about an "arg" version of searchsorted, basically a 
version that could take a sorter as an additional argument. For instance:

     In [13]: a = np.array([5,6,8,1,6,9,0])

     In [14]: s = np.argsort(a)

     In [17]: s
     Out[17]: array([6, 3, 0, 1, 4, 2, 5])

     In [18]: np.searchsorted(a, [1.5, 7.2], sorter=sorter)
     Out[18]: array([2, 5])

     In [19]: np.searchsorted(a, -1.5, sorter=s)
     Out[19]: 0

     In [20]: np.searchsorted(a, 11.5, sorter=s)
     Out[20]: 7

     In [21]: np.searchsorted(a, 8.5, sorter=s)
     Out[21]: 6

     In [32]: np.searchsorted(a, 6, side='left', sorter=sorter)
     Out[32]: 3

     In [33]: np.searchsorted(a, 6, side='right', sorter=sorter)
     Out[33]: 5

I've already implemented this, I wanted to ping the list and see if 
there were any thoughts about this small feature. Or should I just 
submit a PR for consideration?

Thanks,

Bryan


From charlesr.harris at gmail.com  Mon Mar 26 17:57:01 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 26 Mar 2012 15:57:01 -0600
Subject: [Numpy-discussion] "arg" searchsorted
In-Reply-To: <4F70E0D3.2090202@continuum.io>
References: <4F70E0D3.2090202@continuum.io>
Message-ID: <CAB6mnxJPiygsJNrp9VH_FZA2uNa=FFJHWu0aUL1b6nsxsPrEuA@mail.gmail.com>

On Mon, Mar 26, 2012 at 3:34 PM, Bryan Van de Ven <bryanv at continuum.io>wrote:

> I recently got asked about an "arg" version of searchsorted, basically a
> version that could take a sorter as an additional argument. For instance:
>
>     In [13]: a = np.array([5,6,8,1,6,9,0])
>
>     In [14]: s = np.argsort(a)
>
>     In [17]: s
>     Out[17]: array([6, 3, 0, 1, 4, 2, 5])
>
>     In [18]: np.searchsorted(a, [1.5, 7.2], sorter=sorter)
>     Out[18]: array([2, 5])
>
>     In [19]: np.searchsorted(a, -1.5, sorter=s)
>     Out[19]: 0
>
>     In [20]: np.searchsorted(a, 11.5, sorter=s)
>     Out[20]: 7
>
>     In [21]: np.searchsorted(a, 8.5, sorter=s)
>     Out[21]: 6
>
>     In [32]: np.searchsorted(a, 6, side='left', sorter=sorter)
>     Out[32]: 3
>
>     In [33]: np.searchsorted(a, 6, side='right', sorter=sorter)
>     Out[33]: 5
>
> I've already implemented this, I wanted to ping the list and see if
> there were any thoughts about this small feature. Or should I just
> submit a PR for consideration?
>


An alternate API would be to pass a pair (a, sorter) as the first argument.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/f131a04a/attachment.html>

From bryanv at continuum.io  Mon Mar 26 18:32:26 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Mon, 26 Mar 2012 17:32:26 -0500
Subject: [Numpy-discussion] "arg" searchsorted
In-Reply-To: <CAB6mnxJPiygsJNrp9VH_FZA2uNa=FFJHWu0aUL1b6nsxsPrEuA@mail.gmail.com>
References: <4F70E0D3.2090202@continuum.io>
	<CAB6mnxJPiygsJNrp9VH_FZA2uNa=FFJHWu0aUL1b6nsxsPrEuA@mail.gmail.com>
Message-ID: <4F70EE7A.1060308@continuum.io>

On 3/26/12 4:57 PM, Charles R Harris wrote:
>
> On Mon, Mar 26, 2012 at 3:34 PM, Bryan Van de Ven <bryanv at continuum.io 
> <mailto:bryanv at continuum.io>> wrote:
>
>     I recently got asked about an "arg" version of searchsorted,
>     basically a
>     version that could take a sorter as an additional argument. For
>     instance:
>
>         In [13]: a = np.array([5,6,8,1,6,9,0])
>
>         In [14]: s = np.argsort(a)
>
>         In [17]: s
>         Out[17]: array([6, 3, 0, 1, 4, 2, 5])
>
>         In [18]: np.searchsorted(a, [1.5, 7.2], sorter=sorter)
>         Out[18]: array([2, 5])
>
>
> An alternate API would be to pass a pair (a, sorter) as the first 
> argument.
>
> Chuck

Sure, that would be easy enough to implement. I don't really have a 
preference, is there a reason you would prefer that API?

Bryan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/ab2df228/attachment.html>

From cooke.stephanie at gmail.com  Mon Mar 26 19:03:02 2012
From: cooke.stephanie at gmail.com (Stephanie Cooke)
Date: Mon, 26 Mar 2012 19:03:02 -0400
Subject: [Numpy-discussion] How to Extract the Number of Rows and Columns in
	a Matrix
Message-ID: <CAKp4H6wGAY_W3WQ93mode-nmKhx5kQQYw4OZPLHG6DqjMwTWDg@mail.gmail.com>

Hello,

I would like to extract the number of rows and columns of a matrix
individually. The shape command outputs the rows and columns together,
but are there commands that will separately give the rows and
separately give the columns?

Thanks


From shish at keba.be  Mon Mar 26 19:26:38 2012
From: shish at keba.be (Olivier Delalleau)
Date: Mon, 26 Mar 2012 19:26:38 -0400
Subject: [Numpy-discussion] How to Extract the Number of Rows and
 Columns in a Matrix
In-Reply-To: <CAKp4H6wGAY_W3WQ93mode-nmKhx5kQQYw4OZPLHG6DqjMwTWDg@mail.gmail.com>
References: <CAKp4H6wGAY_W3WQ93mode-nmKhx5kQQYw4OZPLHG6DqjMwTWDg@mail.gmail.com>
Message-ID: <CAFXk4bqwSqQtUHT0VxryQgaO2SjtT4AwcAPpFoXZKhrq+BW0Jw@mail.gmail.com>

len(M) will give you the number of rows of M.
For columns I just use M.shape[1] myself, I don't know if there exists a
shortcut.

-=- Olivier

Le 26 mars 2012 19:03, Stephanie Cooke <cooke.stephanie at gmail.com> a ?crit :

> Hello,
>
> I would like to extract the number of rows and columns of a matrix
> individually. The shape command outputs the rows and columns together,
> but are there commands that will separately give the rows and
> separately give the columns?
>
> Thanks
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/c47972d0/attachment.html>

From charlesr.harris at gmail.com  Mon Mar 26 19:34:36 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 26 Mar 2012 17:34:36 -0600
Subject: [Numpy-discussion] "arg" searchsorted
In-Reply-To: <4F70EE7A.1060308@continuum.io>
References: <4F70E0D3.2090202@continuum.io>
	<CAB6mnxJPiygsJNrp9VH_FZA2uNa=FFJHWu0aUL1b6nsxsPrEuA@mail.gmail.com>
	<4F70EE7A.1060308@continuum.io>
Message-ID: <CAB6mnxJDhQyZFNFJeHXM=fdAOu-n=ufTrJ7-F+0vDB=sYi6LXA@mail.gmail.com>

On Mon, Mar 26, 2012 at 4:32 PM, Bryan Van de Ven <bryanv at continuum.io>wrote:

>  On 3/26/12 4:57 PM, Charles R Harris wrote:
>
>
> On Mon, Mar 26, 2012 at 3:34 PM, Bryan Van de Ven <bryanv at continuum.io>wrote:
>
>> I recently got asked about an "arg" version of searchsorted, basically a
>> version that could take a sorter as an additional argument. For instance:
>>
>>     In [13]: a = np.array([5,6,8,1,6,9,0])
>>
>>     In [14]: s = np.argsort(a)
>>
>>     In [17]: s
>>     Out[17]: array([6, 3, 0, 1, 4, 2, 5])
>>
>>     In [18]: np.searchsorted(a, [1.5, 7.2], sorter=sorter)
>>     Out[18]: array([2, 5])
>>
>
> An alternate API would be to pass a pair (a, sorter) as the first argument.
>
> Chuck
>
>
> Sure, that would be easy enough to implement. I don't really have a
> preference, is there a reason you would prefer that API?
>

No, just exploring possibilities. Another would be a different name,
searchargsorted or some such. I actually think that is a better alternative
than the pair, but it would add another method to ndarray.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/093e4ac0/attachment.html>

From derek at astro.physik.uni-goettingen.de  Mon Mar 26 19:38:33 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Tue, 27 Mar 2012 01:38:33 +0200
Subject: [Numpy-discussion] How to Extract the Number of Rows and
	Columns in a Matrix
In-Reply-To: <CAFXk4bqwSqQtUHT0VxryQgaO2SjtT4AwcAPpFoXZKhrq+BW0Jw@mail.gmail.com>
References: <CAKp4H6wGAY_W3WQ93mode-nmKhx5kQQYw4OZPLHG6DqjMwTWDg@mail.gmail.com>
	<CAFXk4bqwSqQtUHT0VxryQgaO2SjtT4AwcAPpFoXZKhrq+BW0Jw@mail.gmail.com>
Message-ID: <648324AE-E3CB-4716-B116-974F7B97D64F@astro.physik.uni-goettingen.de>

On 27.03.2012, at 1:26AM, Olivier Delalleau wrote:

> len(M) will give you the number of rows of M.
> For columns I just use M.shape[1] myself, I don't know if there exists a shortcut.
> 

You can use tuple unpacking, if that helps keeping your code conciser?

nrow, ncol = M.shape

Cheers,
					Derek

> Le 26 mars 2012 19:03, Stephanie Cooke <cooke.stephanie at gmail.com> a ?crit :
> Hello,
> 
> I would like to extract the number of rows and columns of a matrix
> individually. The shape command outputs the rows and columns together,
> but are there commands that will separately give the rows and
> separately give the columns?
> 
> Thanks


From cooke.stephanie at gmail.com  Mon Mar 26 20:07:13 2012
From: cooke.stephanie at gmail.com (Stephanie Cooke)
Date: Mon, 26 Mar 2012 20:07:13 -0400
Subject: [Numpy-discussion] AttributeError with shape command
Message-ID: <CAKp4H6zSQFTTujf1Eq5nhBCLbwz4aEK7jpwYbW7KXGf4Bmh3-A@mail.gmail.com>

Hello,

I am new to numpy. When I try to use the command array.shape, I get
the following error:

AttributeError: 'list' object has no attribute 'shape'

Is anyone familiar with this type of error?

Thanks


From derek at astro.physik.uni-goettingen.de  Mon Mar 26 20:18:46 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Tue, 27 Mar 2012 02:18:46 +0200
Subject: [Numpy-discussion] AttributeError with shape command
In-Reply-To: <CAKp4H6zSQFTTujf1Eq5nhBCLbwz4aEK7jpwYbW7KXGf4Bmh3-A@mail.gmail.com>
References: <CAKp4H6zSQFTTujf1Eq5nhBCLbwz4aEK7jpwYbW7KXGf4Bmh3-A@mail.gmail.com>
Message-ID: <32F2F3E2-32E3-4A7E-A24F-CFA157DC236F@astro.physik.uni-goettingen.de>

On 27.03.2012, at 2:07AM, Stephanie Cooke wrote:

> I am new to numpy. When I try to use the command array.shape, I get
> the following error:
> 
> AttributeError: 'list' object has no attribute 'shape'
> 
> Is anyone familiar with this type of error?

It means 'array' actually is not one, more precisely, not an object of type np.ndarray. 
How did you create your array? If it originates just from a list of numbers, you can 
create an array from it by 'np.array(list)' (assuming previous 'import numpy as np'). 

It's also possible that a function has returned a list of arrays where you might have 
expected a single array - so it really depends on the circumstances. 

HTH,
						Derek


From shish at keba.be  Mon Mar 26 20:45:19 2012
From: shish at keba.be (Olivier Delalleau)
Date: Mon, 26 Mar 2012 20:45:19 -0400
Subject: [Numpy-discussion] AttributeError with shape command
In-Reply-To: <CAKp4H6zSQFTTujf1Eq5nhBCLbwz4aEK7jpwYbW7KXGf4Bmh3-A@mail.gmail.com>
References: <CAKp4H6zSQFTTujf1Eq5nhBCLbwz4aEK7jpwYbW7KXGf4Bmh3-A@mail.gmail.com>
Message-ID: <CAFXk4bqvA7MsbNnOkSzAEnCsz_x32P6+H7AGg0x54evxQ=KVAA@mail.gmail.com>

It means "array" is a regular Python list and not a numpy array. Use
numpy.array(array) to convert it into an array.

-=- Olivier

Le 26 mars 2012 20:07, Stephanie Cooke <cooke.stephanie at gmail.com> a ?crit :

> Hello,
>
> I am new to numpy. When I try to use the command array.shape, I get
> the following error:
>
> AttributeError: 'list' object has no attribute 'shape'
>
> Is anyone familiar with this type of error?
>
> Thanks
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120326/edf798dd/attachment.html>

From ch.ro.dann at googlemail.com  Tue Mar 27 03:11:15 2012
From: ch.ro.dann at googlemail.com (Christoph Dann)
Date: Tue, 27 Mar 2012 09:11:15 +0200
Subject: [Numpy-discussion] Linking against MKL but still slow?
In-Reply-To: <CAGY4rcX9UsHMZpfT5oPsZYTgo8j7OC24EDY1KcuWQ4sO4g1gLw@mail.gmail.com>
References: <6471907.1930.1332752816588.JavaMail.geo-discussion-forums@vbiz13>
	<CAGY4rcX9UsHMZpfT5oPsZYTgo8j7OC24EDY1KcuWQ4sO4g1gLw@mail.gmail.com>
Message-ID: <CAOxAsw=HbdEiuUPKH1w8t5Wwjv3RJqfDZuix4-tMyavJ+cHr0Q@mail.gmail.com>

Thanks for your response, David.
> What do you mean by own extensions to NumPy ? If you mean building
> extensions against the C API of NumPy, then you don't need to build your own
> NumPy.?Building NumPy with Intel Compilers and MKL is a non-trivial process,
> so I would rather avoid it.
I want to have exact control which version of numpy I use. In
particular, I want to use the current development version.

> If you still want to build it by yourself, could you give us the full output
> of your build ?
My site.cfg looks like:
[mkl]
library_dirs = /opt/intel/lib/intel64:/opt/intel/mkl/lib/intel64
include_dirs = /opt/intel/include/
mkl_libs = mkl_def, mkl_intel_lp64, mkl_intel_thread, mkl_core, iomp5
lapack_libs = mkl_lapack95_lp64

and I build numpy by
python setup.py config --compiler=intelem --fcompiler=intelem
build_clib --compiler=intelem --fcompiler=intelem build_ext
--compiler=intelem --fcompiler=intelem install > log.txt

https://gist.github.com/2213552 is the content of log.txt. There were
no outputs on stderr.
I asume that the warnings can be ignored. However, there is a frequent error:

_configtest.c(5): error: the size of an array must be greater than zero
      static int test_array [1 - 2 * !(((long) (sizeof
(npy_check_sizeof_type))) == 12)];
                             ^

compilation aborted for _configtest.c (code 2)

How can I get rid of it?

Thanks,
Christoph


From nicole.stoffels at forwind.de  Tue Mar 27 05:12:40 2012
From: nicole.stoffels at forwind.de (Nicole Stoffels)
Date: Tue, 27 Mar 2012 11:12:40 +0200
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
Message-ID: <4F718488.1040400@forwind.de>

Dear all,

I get the following memory error while running my program:

*Traceback (most recent call last):
   File 
"/home/nistl/Software/Netzbetreiber/FLOW/src/MemoryError_Debug.py", line 
9, in <module>
     correlation = corrcoef(data_records)
   File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", 
line 1992, in corrcoef
     c = cov(x, y, rowvar, bias)
   File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", 
line 1973, in cov
     return (dot(X, X.T.conj()) / fact).squeeze()
MemoryError*

Here an easy example how to reproduce the error:

*#!/usr/bin/env python2.7

from pybinsel import open
from numpy import *

if __name__ == '__main__':

     data_records = random.random((459375, 24))
     correlation = corrcoef(data_records)

*My real data has the same dimension. Is this a size problem of the 
array or did I simply make a mistake in the application of corrcoef?

I hope that you can help me! Thanks!

Best regards,

Nicole Stoffels

-- 

Dipl.-Met. Nicole Stoffels

Wind Power Forecasting and Simulation

ForWind - Center for Wind Energy Research
Institute of Physics
Carl von Ossietzky University Oldenburg

Ammerl?nder Heerstr. 136
D-26129 Oldenburg

Tel: +49(0)441 798 - 5079
Fax: +49(0)441 798 - 5099

Web  : www.ForWind.de
Email: nicole.stoffels at forwind.de

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/8eb6e227/attachment.html>

From pierre.haessig at crans.org  Tue Mar 27 05:38:06 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Tue, 27 Mar 2012 11:38:06 +0200
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
In-Reply-To: <4F718488.1040400@forwind.de>
References: <4F718488.1040400@forwind.de>
Message-ID: <4F718A7E.1010107@crans.org>

Hi Nicole,

Le 27/03/2012 11:12, Nicole Stoffels a ?crit :
> *if __name__ == '__main__':
>    
>     data_records = random.random((459375, 24))
>     correlation = corrcoef(data_records)*

May I assume that your data_record is made of 24 different variables of
which you have 459375 observations ?

If this is so and if you expect corrcoeff to return a 24*24 matrix, you
need to either transpose data_records :

>>> correlation = corrcoef(data_records.T)

or use the rowvar=0 argument (see np.corrcoef or np.cov docstring)

>>> correlation = corrcoef(data_records, rowvar = 0)

Both work on my computer, while your example indeed leads to a
MemoryError (because shape 459375*459375 would be a decently big matrix...)

I don't know if it's your case, but for those used to the Matlab (and
textbooks) convention of having variables stored in columns, the default
behaviour of numpy's covariance function is a bit surprising. I guess
historical reasons are involved in this choice. Just a matter of getting
used to it !

Best,
Pierre
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/2085ebfc/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/2085ebfc/attachment.sig>

From nicole.stoffels at forwind.de  Tue Mar 27 06:04:01 2012
From: nicole.stoffels at forwind.de (Nicole Stoffels)
Date: Tue, 27 Mar 2012 12:04:01 +0200
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
In-Reply-To: <4F718A7E.1010107@crans.org>
References: <4F718488.1040400@forwind.de> <4F718A7E.1010107@crans.org>
Message-ID: <4F719091.7090004@forwind.de>

Hi Pierre,

thanks for the fast answer!

I actually have timeseries of 24 hours for 459375 gridpoints in Europe. 
The timeseries of every grid point is stored in a column. That's why in 
my real program I already transposed the data, so that the correlation 
is made column by column. What I finally need is the correlation of each 
gridpoint with every other gridpoint. I'm afraid that this results in a 
459375*459375 matrix.

The correlation is actually just an interim result. So I'm currently 
trying to loop over every gridpoint to get single correlations which 
will then be processed further. Is this the right approach?

for column in range(len(data_records)):
     for columnnumber in range(len(data_records)):
         correlation = corrcoef(data_records[column], 
data_records[columnnumber])

Best wished,
Nicole

On 27.03.2012 11:38, Pierre Haessig wrote:
> Hi Nicole,
>
> Le 27/03/2012 11:12, Nicole Stoffels a ?crit :
>> *if __name__ == '__main__':
>>
>>     data_records = random.random((459375, 24))
>>     correlation = corrcoef(data_records)*
>
> May I assume that your data_record is made of 24 different variables 
> of which you have 459375 observations ?
>
> If this is so and if you expect corrcoeff to return a 24*24 matrix, 
> you need to either transpose data_records :
>
> >>> correlation = corrcoef(data_records.T)
>
> or use the rowvar=0 argument (see np.corrcoef or np.cov docstring)
>
> >>> correlation = corrcoef(data_records, rowvar = 0)
>
> Both work on my computer, while your example indeed leads to a 
> MemoryError (because shape 459375*459375 would be a decently big 
> matrix...)
>
> I don't know if it's your case, but for those used to the Matlab (and 
> textbooks) convention of having variables stored in columns, the 
> default behaviour of numpy's covariance function is a bit surprising. 
> I guess historical reasons are involved in this choice. Just a matter 
> of getting used to it !
>
> Best,
> Pierre
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- 

Dipl.-Met. Nicole Stoffels

Wind Power Forecasting and Simulation

ForWind - Center for Wind Energy Research
Institute of Physics
Carl von Ossietzky University Oldenburg

Ammerl?nder Heerstr. 136
D-26129 Oldenburg

Tel: +49(0)441 798 - 5079
Fax: +49(0)441 798 - 5099

Web  : www.ForWind.de
Email: nicole.stoffels at forwind.de

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/f6070d67/attachment.html>

From rhattersley at gmail.com  Tue Mar 27 07:42:57 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Tue, 27 Mar 2012 12:42:57 +0100
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
In-Reply-To: <4F718A7E.1010107@crans.org>
References: <4F718488.1040400@forwind.de>
	<4F718A7E.1010107@crans.org>
Message-ID: <CAP=RS9kdSQ0TQ6o=9z5L_Eg=s0=D1+MSPcoWBKQ8G7gvkihX0A@mail.gmail.com>

> Both work on my computer, while your example indeed leads to a MemoryError
> (because shape 459375*459375 would be a decently big matrix...)

Nicely understated :)

For 32-bit values "decently big" => 786GB


From shish at keba.be  Tue Mar 27 09:30:53 2012
From: shish at keba.be (Olivier Delalleau)
Date: Tue, 27 Mar 2012 09:30:53 -0400
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
In-Reply-To: <4F719091.7090004@forwind.de>
References: <4F718488.1040400@forwind.de> <4F718A7E.1010107@crans.org>
	<4F719091.7090004@forwind.de>
Message-ID: <CAFXk4bpBOdCGLa-kea=fYV_Ltcuh_wva49ZZid+sOmwvmc-qpg@mail.gmail.com>

Le 27 mars 2012 06:04, Nicole Stoffels <nicole.stoffels at forwind.de> a ?crit
:

> **
> Hi Pierre,
>
> thanks for the fast answer!
>
> I actually have timeseries of 24 hours for 459375 gridpoints in Europe.
> The timeseries of every grid point is stored in a column. That's why in my
> real program I already transposed the data, so that the correlation is made
> column by column. What I finally need is the correlation of each gridpoint
> with every other gridpoint. I'm afraid that this results in a 459375*459375
> matrix.
>
> The correlation is actually just an interim result. So I'm currently
> trying to loop over every gridpoint to get single correlations which will
> then be processed further. Is this the right approach?
>
> for column in range(len(data_records)):
>     for columnnumber in range(len(data_records)):
>         correlation = corrcoef(data_records[column],
> data_records[columnnumber])
>
> Best wished,
> Nicole
>

It may be painfully slow... You should make sure you don't compute twice
each off-diagonal element.
Also, if all your computations can be vectorized, you'll probably get a
significant performance boost by computing your matrix by blocks instead of
element-by-element. Take blocks as big as can fit in memory.

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/9c9b1734/attachment.html>

From bryanv at continuum.io  Tue Mar 27 11:59:31 2012
From: bryanv at continuum.io (Bryan Van de Ven)
Date: Tue, 27 Mar 2012 10:59:31 -0500
Subject: [Numpy-discussion] "arg" searchsorted
In-Reply-To: <CAB6mnxJDhQyZFNFJeHXM=fdAOu-n=ufTrJ7-F+0vDB=sYi6LXA@mail.gmail.com>
References: <4F70E0D3.2090202@continuum.io>
	<CAB6mnxJPiygsJNrp9VH_FZA2uNa=FFJHWu0aUL1b6nsxsPrEuA@mail.gmail.com>
	<4F70EE7A.1060308@continuum.io>
	<CAB6mnxJDhQyZFNFJeHXM=fdAOu-n=ufTrJ7-F+0vDB=sYi6LXA@mail.gmail.com>
Message-ID: <4F71E3E3.9080405@continuum.io>


>     Sure, that would be easy enough to implement. I don't really have
>     a preference, is there a reason you would prefer that API?
>
>
> No, just exploring possibilities. Another would be a different name, 
> searchargsorted or some such. I actually think that is a better 
> alternative than the pair, but it would add another method to ndarray.

I guess after thinking about it my own personal preference is for just 
adding the sorter argument, perhaps others can weigh in. Just to 
summarize, the proposals fo far:

     np.searchsorted(a, 12, sorter=s)

     np.searchsorted((a,s), 12)

     np.searchargsorted(a, s, 12)


Bryan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/0fee548f/attachment.html>

From josef.pktd at gmail.com  Tue Mar 27 12:56:03 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 27 Mar 2012 12:56:03 -0400
Subject: [Numpy-discussion] different percentile implementations ?
In-Reply-To: <4F6F9C93.4050208@crans.org>
References: <4F6F9C93.4050208@crans.org>
Message-ID: <CAMMTP+B7+w=KEDsT2jHKeZCa62tTonPzCqWRDbHEiNfZi9ReQA@mail.gmail.com>

On Sun, Mar 25, 2012 at 6:30 PM, Pierre Haessig
<pierre.haessig at crans.org> wrote:
> Hi,
>
> A quick question I've had in mind for some time but didn't find a solution :
> Is there a significant difference between "numpy.percentile" and
> "scipy.stats.scoreatpercentile" ?
>
> Of course the signatures are somewhat different, but I have the feeling
> that the overall purpose is the same. Am I missing something ?

similar to std, var, histogram, ... some functions from scipy.stats
are now in numpy.

However, in contrast to std, var, I think scoreatpercentile should be
enhanced and not removed (similar to histogram), for example my
attempt:
http://projects.scipy.org/scipy/ticket/1329

Josef

>
> Best,
> Pierre
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From e.antero.tammi at gmail.com  Tue Mar 27 14:22:12 2012
From: e.antero.tammi at gmail.com (eat)
Date: Tue, 27 Mar 2012 21:22:12 +0300
Subject: [Numpy-discussion] Numpy Memory Error with corrcoef
In-Reply-To: <4F718488.1040400@forwind.de>
References: <4F718488.1040400@forwind.de>
Message-ID: <CAKa=AYQu6KwpM-ML2WEy9YSUuEcuJwFBNik1d+iYaTc6yzeJJQ@mail.gmail.com>

Hi,

On Tue, Mar 27, 2012 at 12:12 PM, Nicole Stoffels <
nicole.stoffels at forwind.de> wrote:

> **
> Dear all,
>
> I get the following memory error while running my program:
>
> *Traceback (most recent call last):
>   File "/home/nistl/Software/Netzbetreiber/FLOW/src/MemoryError_Debug.py",
> line 9, in <module>
>     correlation = corrcoef(data_records)
>   File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line
> 1992, in corrcoef
>     c = cov(x, y, rowvar, bias)
>   File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line
> 1973, in cov
>     return (dot(X, X.T.conj()) / fact).squeeze()
> MemoryError*
>
> Here an easy example how to reproduce the error:
>
> *#!/usr/bin/env python2.7
>
> from pybinsel import open
> from numpy import *
>
> if __name__ == '__main__':
>
>     data_records = random.random((459375, 24))
>     correlation = corrcoef(data_records)
>
> *My real data has the same dimension. Is this a size problem of the array
> or did I simply make a mistake in the application of corrcoef?
>
> I hope that you can help me! Thanks!
>
As other ones has explained this approach yields an enormous matrix.
However, if I have understood your problem correctly you could implement a
helper class to iterate over all of your observations. Something like along
the lines (although it will take hours? with your data size) to iterate
over all correlations:

"""A helper class for correlations between observations."""

import numpy as np


 class Correlations(object):

    def __init__(self, data):

        self.m= data.shape[0]

        # compatible with corrcoef

        self.scale= self.m- 1

        self.data= data- np.mean(data, 1)[:, None]

        # but you may actually need to scale and translate data

        # more application speficic manner

        self.var= (self.data** 2.).sum(1)/ self.scale


     def obs_kth(self, k):

        c= np.dot(self.data, self.data[k])/ self.scale

        return c/ (self.var[k]* self.var)** .5


     def obs_iterate(self):

        for k in xrange(self.m):

            yield self.obs_kth(k)


 if __name__ == '__main__':

    data= np.random.randn(5, 3)

    print np.corrcoef(data).round(3)

    print

    c= Correlations(data)

    print np.array([p for p in c.obs_iterate()]).round(3)


My 2 cents,
-eat

>
> Best regards,
>
> Nicole Stoffels
>
> --
>
> Dipl.-Met. Nicole Stoffels
>
> Wind Power Forecasting and Simulation
>
> ForWind - Center for Wind Energy Research
> Institute of Physics
> Carl von Ossietzky University Oldenburg
>
> Ammerl?nder Heerstr. 136
> D-26129 Oldenburg
>
> Tel: +49(0)441 798 - 5079
> Fax: +49(0)441 798 - 5099
>
> Web  : www.ForWind.de
> Email: nicole.stoffels at forwind.de
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/57bd9efd/attachment.html>

From ralf.gommers at googlemail.com  Tue Mar 27 16:19:02 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 27 Mar 2012 22:19:02 +0200
Subject: [Numpy-discussion] 1.7 blockers
In-Reply-To: <CAB6mnxLHZ9RM81jdzPMZ1n90bzosPu2X6Tz8dSaCfk9-ZQ=f8A@mail.gmail.com>
References: <CAB6mnx+WOfWDGrjpUZHfr_Lp0o2yvwoT6028CR4udZ=1nnHTBA@mail.gmail.com>
	<CABL7CQhFkTj+J1XBpanKPzgzrDc=S6PXJc+PzZ76thaGDSttFw@mail.gmail.com>
	<CAB6mnxLHZ9RM81jdzPMZ1n90bzosPu2X6Tz8dSaCfk9-ZQ=f8A@mail.gmail.com>
Message-ID: <CABL7CQivOddJYH4QhGLERFRuvf=XRdMXt5CU=RQkgW8OHA1r6g@mail.gmail.com>

On Mon, Mar 26, 2012 at 1:27 AM, Charles R Harris <charlesr.harris at gmail.com
> wrote:

>
>
> On Sun, Mar 25, 2012 at 3:14 PM, Ralf Gommers <ralf.gommers at googlemail.com
> > wrote:
>
>>
>>
>> On Sat, Mar 24, 2012 at 10:13 PM, Charles R Harris <
>> charlesr.harris at gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> There several problems with numpy master that need to be fixed before a
>>> release can be considered.
>>>
>>>    1. Datetime on windows with mingw.
>>>    2. Bus error on SPARC, ticket #2076.
>>>    3. NA and real/complex views of complex arrays.
>>>
>>> Number 1 has been proved to be particularly difficult, any help or
>>> suggestions for that would be much appreciated. The current work has been
>>> going in pull request 214 <https://github.com/numpy/numpy/pull/214>.
>>>
>>> This isn't to say that there aren't a ton of other things that need
>>> fixing or that we can skip out on the current stack of pull requests, but I
>>> think it is impossible to consider a release while those three problems are
>>> outstanding.
>>>
>> Why do you consider (2) a blocker? Not saying it's not important, but
>> there are eight other open tickets with segfaults. Some are more esoteric
>> than other, but I don't see why for example #1713 and #1808 are less
>> important than this one.
>>
>> #1522 provides a patch that fixes a segfault by the way, could use a
>> review.
>>
>>
> I wasn't aware of the other segfaults, I'd like to get them all fixed...
> The list was meant to elicit additions.
>
> There are actually even more, I just searched for "segfault" to find the
8. Search for "segmentation fault" or "bus error" in Trac.

I would hope these have a high priority to get fixed but, unless they're
backwards compatibility issues, I don't consider them blockers. For the
simple reason that then we'd never be able to ship any release.

I don't know where the missed floating point errors come from, but they are
> somewhat dependent on the compiler doing the right thing and hardware
> support. I'd welcome any insight into why we get them on SPARC (underflow)
> and Windows (overflow).
>


> The windows buildbot doesn't seem to be updating correctly since it is
> still missing the combinations method that is now part of the test module.
>

Yeah, none of them are updating, it's a pain. We'll hopefully soon be able
to switch to a shiny new one.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/255b3da4/attachment.html>

From ralf.gommers at googlemail.com  Tue Mar 27 16:29:53 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 27 Mar 2012 22:29:53 +0200
Subject: [Numpy-discussion] Testsuite fails with Python 2.7.3rc1 and
 3.2.3rc1 (Debian)
In-Reply-To: <CAB4XWXwUfnmoGOUtMBKrC1CyVgu+hnaK=fGi=6EE_vwbDO8zUA@mail.gmail.com>
References: <CAB4XWXwUfnmoGOUtMBKrC1CyVgu+hnaK=fGi=6EE_vwbDO8zUA@mail.gmail.com>
Message-ID: <CABL7CQjfbzPMO7p1T+JyzROaau_JNcXNXw-qVJCSqNtL2kaYWQ@mail.gmail.com>

On Wed, Mar 21, 2012 at 12:28 AM, Sandro Tosi <matrixhasu at gmail.com> wrote:

> Hello,
> I've reported http://projects.scipy.org/numpy/ticket/2085 and Ralf
> asked for bringing that up here: is anyone able to replicate the
> problem described in that ticket?
>
> The debian bug tracking the problem is:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=664672
>

We do have an Ubuntu buildbot that runs fine with 2.7.2 (see
buildbot.scipy.org). Is that failure seen on unusual hardware or with a
specific compiler only?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120327/84580ea7/attachment.html>

From pierre.haessig at crans.org  Wed Mar 28 05:44:41 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Wed, 28 Mar 2012 11:44:41 +0200
Subject: [Numpy-discussion] different percentile implementations ?
In-Reply-To: <CAMMTP+B7+w=KEDsT2jHKeZCa62tTonPzCqWRDbHEiNfZi9ReQA@mail.gmail.com>
References: <4F6F9C93.4050208@crans.org>
	<CAMMTP+B7+w=KEDsT2jHKeZCa62tTonPzCqWRDbHEiNfZi9ReQA@mail.gmail.com>
Message-ID: <4F72DD89.6090900@crans.org>

Le 27/03/2012 18:56, josef.pktd at gmail.com a ?crit :
> similar to std, var, histogram, ... some functions from scipy.stats
> are now in numpy.
Ok, historical reasons then. Fair enough.
Would a "See also: numpy.percentile" make sense in stats.scoreatpercentile ?
> However, in contrast to std, var, I think scoreatpercentile should be
> enhanced and not removed (similar to histogram), for example my
> attempt:
> http://projects.scipy.org/scipy/ticket/1329
>
I'm not sure I completely understood what was involved in your ticket.

The overall impression I felt is :
  * for a lot of statistical computations, it is not possible and/or
desirable to have the same code for "regular array" and for
"masked/nans/... arrays".
  * However, it would be possible to have the same api, that is : put
all the entry points in scipy.stats instead of having scipy.stats.mstats
as a separate api. Did I understand you correctly ?

Best,
Pierre

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/46593ede/attachment.sig>

From josef.pktd at gmail.com  Wed Mar 28 07:37:05 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 28 Mar 2012 07:37:05 -0400
Subject: [Numpy-discussion] different percentile implementations ?
In-Reply-To: <4F72DD89.6090900@crans.org>
References: <4F6F9C93.4050208@crans.org>
	<CAMMTP+B7+w=KEDsT2jHKeZCa62tTonPzCqWRDbHEiNfZi9ReQA@mail.gmail.com>
	<4F72DD89.6090900@crans.org>
Message-ID: <CAMMTP+DDkEaMC42y5dvCYRq9dxTPCyrKPZSrkyPDqFzXOSvYVw@mail.gmail.com>

On Wed, Mar 28, 2012 at 5:44 AM, Pierre Haessig
<pierre.haessig at crans.org> wrote:
> Le 27/03/2012 18:56, josef.pktd at gmail.com a ?crit :
>> similar to std, var, histogram, ... some functions from scipy.stats
>> are now in numpy.
> Ok, historical reasons then. Fair enough.
> Would a "See also: numpy.percentile" make sense in stats.scoreatpercentile ?

of course, there are still many opportunities left to improve the
scipy documentation

>> However, in contrast to std, var, I think scoreatpercentile should be
>> enhanced and not removed (similar to histogram), for example my
>> attempt:
>> http://projects.scipy.org/scipy/ticket/1329
>>
> I'm not sure I completely understood what was involved in your ticket.

The main point was that scoreatpercentile/quantile in mstats or in
climpy by Pierre GM has a lot more features that should be in a stats
implementation.

>
> The overall impression I felt is :
> ?* for a lot of statistical computations, it is not possible and/or
> desirable to have the same code for "regular array" and for
> "masked/nans/... arrays".

I think in most cases a pure ndarray implementation without NaNs or
masks will be much faster, so I wouldn't just want to replace
stats.stats by stats.mstats and keep fast paths.

> ?* However, it would be possible to have the same api, that is : put
> all the entry points in scipy.stats instead of having scipy.stats.mstats
> as a separate api. Did I understand you correctly ?

What we should have, but is currently not the case, is that functions
in stats.stats and stats.mstats have the same signature/API.

Whether we can or should merge functions is still a bit open. In the
scoreatpercentile case implementing the limit keyword (which is
currently broken for 2d arrays) requires masking or something
equivalent, so the easiest is to just use the mstats implementation.
Similarly, the truncated statistic like tmean use masked arrays.

Cheers,
Josef

>
> Best,
> Pierre
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From thouis at gmail.com  Wed Mar 28 10:55:47 2012
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Wed, 28 Mar 2012 16:55:47 +0200
Subject: [Numpy-discussion] weird searchsorted behavior for unicode array
Message-ID: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>

I am seeing some very strange behavior searching a unicode array.  The
attached code outputs the following:
UNICODE
Is sorted: True
Search sorted by iteration, left: [0, 1, 2, 4, 4, 6, 6, 8, 8, 10, 10,
12, 12, 13]
Search sorted by iteration, right: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10,
12, 12, 13]
Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 13]
Search sorted by indexing with copy, left: [1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 13]
Search sorted by indexing with copy, right: [1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 13]

If I remove the first print, it produces:
Is sorted: True
Search sorted by iteration, left: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
Search sorted by iteration, right: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13]
Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 13]
Search sorted by indexing with copy, left: [0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13]
Search sorted by indexing with copy, right: [0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13]

Neither answer is correct, since left and right should be offset by 1
when searching for an element in the array, by my reading of the docs.

This is numpy 1.6.1 on OSX 10.6, python 2.7

Am I missing something?

Thanks,
Ray Jones
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicode_searchsorted.py
Type: application/octet-stream
Size: 2927 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/59d65717/attachment.obj>

From nouiz at nouiz.org  Wed Mar 28 11:11:59 2012
From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=)
Date: Wed, 28 Mar 2012 11:11:59 -0400
Subject: [Numpy-discussion] Looking for people interested in helping
 with Python compiler to LLVM
In-Reply-To: <4F6FE83B.109@astro.uio.no>
References: <65BC4424-F48B-4D54-9EC4-76BEC3444433@gmail.com>
	<4F5EE219.8010108@astro.uio.no>
	<C15466AA-DBDC-4582-A522-D50D1A2E4AC1@continuum.io>
	<CANg26EWHOqu2RH1A-jBTpY79h973P9Ob-070NbBwLPeZkY7eKA@mail.gmail.com>
	<FA0A06B7-5EDF-4D2F-8919-A65E310D8AC5@continuum.io>
	<CANg26EVFOoQtkHcP-sYZ2vD13V6WF8cqZLkZOHyjusvMghZXmw@mail.gmail.com>
	<CAFXk4boesOAbRRzyejuKW58avohKK=VVGrmFkgh00f9njVoLEA@mail.gmail.com>
	<2745ec09-9b93-475b-879c-bcf19cee009b@email.android.com>
	<CAFXk4br+FqcVgRdDunqEBmNqU+dNcZGynmz90y7vsSEewOXPEQ@mail.gmail.com>
	<CANg26EX+KotYUKwdHuQ1oBog0OrFzf=eCHauXGB236Bmv9P1iw@mail.gmail.com>
	<CAOP6n=jx-efOuG=baiyqpt4YdeKU9-SEGrwi=4O-pJifNRgisg@mail.gmail.com>
	<4F6FE83B.109@astro.uio.no>
Message-ID: <CADKKbtgOS4grc63D46ODOrFzm=iOL6Dem1YFx_0jY3BfKuJJMA@mail.gmail.com>

Hi,

Up to now, I didn't post frequently to this mailing list. So I present
myself a little bit. I'm Fr?d?ric Bastien, one of Theano senior
developer. I'm not a student nor a professor, but a staff of our lab.
So I more time for Theano then most people in our lab, but I have
other work too.

>From now on,  I'm going to read every weekday email on the numpy
mailing list. I think this will help communication between the Theano
and numpy community. We have many common goal and this can only be
useful to all of us. Also, if possible, I'll try to go to conference
where discussion will occur on the big change coming to numpy.


So now, here is my comments/questions to those change.

Currently Theano include some type of functionality that I didn't saw
in the this thread or others related to the refactoring of Numpy.

1) Theano do numerical stability optimization. (simple example: log(1
+ x) -> log1p(x))
2) It allow to do symbolic "graph generation". We use this to do
symbolic differentiation, to compute the R op and L op used in Hessian
free.
3) Automatically make some operation inplace on there input: a + b -> a += b
4) Mathematical optimization like a*2+a -> a * 3
5) Implementation optimization (planed in the new numpy)

Is there any plan to add 1-4 in the new project? Will it be
doable(especially 2). The Theano interface that ask the user to create
a Theano function was caused by the fact it is hard/time consuming to
work on the AST that we looked a little bit. I didn't looked at the
python AST myself, so I don't know if it was possible or just hard. Do
someone know if there is all the information that we need?

I don't recall discussion on having Theano work at the byte-code
level. I never looked at python byte-code, but I remember having read
that it change at each version of python? If that is true, how do you
plan to handle this? I don't think that rewriting each optimization
again at each release of python is acceptable. Event if python change
the byte-code only at each X.Y release.

I like the idea of having a common library to do array expression
optimization (the arrayExprOpt idea). Theano optimizations cover much
more type of optimization as said then just element-wise optimization.
Up to now, I only remember having read about the element-wise
optimization. Did I forget something? Do you plan more then this? If
the goal stay at this level, it mean in the end it target to replace
numexpr, but I think we should target more then this as there is many
high level optimization possible that are useful.


There is many projects that do optimized code generation that use SSE*
instruction for tensor operation like blitz++(already named) and
eigen2. Redoing all this work is time consuming. So any reuse can be
useful. Don't forget that many implementation optimization are size
and hardware dependent. Do you plan to go that low level? If there is
such low level optimization, I suppose it will be only for the
frequent case.


I'm not sure if this thread is the right one for this, but we started
a new project for a full tensor on the GPU. It will be the used by
Theano, PyCUDA and PyOpenCL. But this ask change to the ndarray
interface. The current code is a C library that use cython for its
interface. We made a mailing list for it[1]. The current version is in
a fork. I think it will be highly valuable that we have in the end the
same gpu nd array with numpy too. It only have been announced as a
goal for the new Numpy up to now. We have some work done. It will be
usable shortly. But don't expect all Numpy function implemented yet :)


Fr?d?ric Bastien

[1] https://github.com/inducer/compyte/wiki


From williamj at tenbase2.com  Wed Mar 28 11:16:45 2012
From: williamj at tenbase2.com (William Johnston)
Date: Wed, 28 Mar 2012 11:16:45 -0400
Subject: [Numpy-discussion] cannot reference NpyAccessLib assembly for DLR
	app
Message-ID: <22BFF1F4C9764AD5B23A714B7C818B03@leviathan>


Hello,

I cannot add a reference to NpyAccessLib in an IronPython C# DLR app.

The error message is:

Failed while initializing NpyCoreApi: DllNotFoundException:Unable to load DLL 'NpyAccessLib': The specified module could not be found. (Exception from HRESULT: 0x8007007E)
NumpyDotNet stack trace:
   at NumpyDotNet.NpyCoreApi.GetNativeTypeInfo(Int32& intSize, Int32& longsize, Int32& longLongSize, Int32& longDoubleSize)
   at NumpyDotNet.NpyCoreApi..cctor()
The type initializer for 'NumpyDotNet.NpyCoreApi' threw an exception.

Again, I cannot add the reference.  Attached is a screenshot.

Any suggestions?

Regards,
William Johnston
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/39eb1545/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: numpy dll add reference screenshot.png
Type: image/png
Size: 306825 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/39eb1545/attachment.png>

From josef.pktd at gmail.com  Wed Mar 28 11:51:06 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 28 Mar 2012 11:51:06 -0400
Subject: [Numpy-discussion] weird searchsorted behavior for unicode array
In-Reply-To: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>
References: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>
Message-ID: <CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>

On Wed, Mar 28, 2012 at 10:55 AM, Thouis (Ray) Jones <thouis at gmail.com> wrote:
> I am seeing some very strange behavior searching a unicode array. ?The
> attached code outputs the following:
> UNICODE
> Is sorted: True
> Search sorted by iteration, left: [0, 1, 2, 4, 4, 6, 6, 8, 8, 10, 10,
> 12, 12, 13]
> Search sorted by iteration, right: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10,
> 12, 12, 13]
> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
> 12, 13, 13]
> Search sorted by indexing with copy, left: [1, 2, 3, 4, 5, 6, 7, 8, 9,
> 10, 11, 12, 13, 13]
> Search sorted by indexing with copy, right: [1, 2, 3, 4, 5, 6, 7, 8,
> 9, 10, 11, 12, 13, 13]
>
> If I remove the first print, it produces:
> Is sorted: True
> Search sorted by iteration, left: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
> Search sorted by iteration, right: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
> 11, 12, 13]
> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
> 12, 13, 13]
> Search sorted by indexing with copy, left: [0, 1, 2, 3, 4, 5, 6, 7, 8,
> 9, 10, 11, 12, 13]
> Search sorted by indexing with copy, right: [0, 1, 2, 3, 4, 5, 6, 7,
> 8, 9, 10, 11, 12, 13]
>
> Neither answer is correct, since left and right should be offset by 1
> when searching for an element in the array, by my reading of the docs.
>
> This is numpy 1.6.1 on OSX 10.6, python 2.7
>
> Am I missing something?

adding this
# -*- coding: utf-8 -*-

produces consistent results for me
maybe the regex for encoding, but I thought it has to be the first line

Josef


>
> Thanks,
> Ray Jones
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From josef.pktd at gmail.com  Wed Mar 28 13:17:04 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 28 Mar 2012 13:17:04 -0400
Subject: [Numpy-discussion] weird searchsorted behavior for unicode array
In-Reply-To: <CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>
References: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>
	<CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>
Message-ID: <CAMMTP+Ddg3UbkgQnHyRcRmCzMrpeojx6u3OeW-Xj7dN+Faqb7g@mail.gmail.com>

On Wed, Mar 28, 2012 at 11:51 AM,  <josef.pktd at gmail.com> wrote:
> On Wed, Mar 28, 2012 at 10:55 AM, Thouis (Ray) Jones <thouis at gmail.com> wrote:
>> I am seeing some very strange behavior searching a unicode array. ?The
>> attached code outputs the following:
>> UNICODE
>> Is sorted: True
>> Search sorted by iteration, left: [0, 1, 2, 4, 4, 6, 6, 8, 8, 10, 10,
>> 12, 12, 13]
>> Search sorted by iteration, right: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10,
>> 12, 12, 13]
>> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
>> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
>> 12, 13, 13]
>> Search sorted by indexing with copy, left: [1, 2, 3, 4, 5, 6, 7, 8, 9,
>> 10, 11, 12, 13, 13]
>> Search sorted by indexing with copy, right: [1, 2, 3, 4, 5, 6, 7, 8,
>> 9, 10, 11, 12, 13, 13]
>>
>> If I remove the first print, it produces:
>> Is sorted: True
>> Search sorted by iteration, left: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
>> Search sorted by iteration, right: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>> 11, 12, 13]
>> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
>> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
>> 12, 13, 13]
>> Search sorted by indexing with copy, left: [0, 1, 2, 3, 4, 5, 6, 7, 8,
>> 9, 10, 11, 12, 13]
>> Search sorted by indexing with copy, right: [0, 1, 2, 3, 4, 5, 6, 7,
>> 8, 9, 10, 11, 12, 13]
>>
>> Neither answer is correct, since left and right should be offset by 1
>> when searching for an element in the array, by my reading of the docs.
>>
>> This is numpy 1.6.1 on OSX 10.6, python 2.7
>>
>> Am I missing something?
>
> adding this
> # -*- coding: utf-8 -*-
>
> produces consistent results for me
> maybe the regex for encoding, but I thought it has to be the first line

consistent means the same with and without commenting out UNICODE, but
searchsorted doesn't distinguish between left and right. that looks
like a bug (in numpy 1.4.1)

using an object array, or a string view a.view('<S524')  produces the
correct left right shift.

Josef

>
> Josef
>
>
>>
>> Thanks,
>> Ray Jones
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>


From tim at cerazone.net  Wed Mar 28 15:31:14 2012
From: tim at cerazone.net (Tim Cera)
Date: Wed, 28 Mar 2012 15:31:14 -0400
Subject: [Numpy-discussion] Style for pad implementation in 'pad' namespace
	or functions under np.lib
Message-ID: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>

I have been developing a set of pad functions to pad arrays in different
ways.  Really close to having it accepted into numpy, but I want to revisit
an implementation issue that I have become worried about.  Should these
functions be collected into a 'pad' namespace or put raw into np.lib?

Do I really care about this?  Not really since it isn't a utility issue,
but I contend that it would look better to pull these functions into their
own namespace.

Is there a consensus?

The current pull request:
https://github.com/numpy/numpy/pull/242

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/42e8f1f5/attachment.html>

From ralf.gommers at googlemail.com  Wed Mar 28 15:57:31 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Wed, 28 Mar 2012 21:57:31 +0200
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
Message-ID: <CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>

On Wed, Mar 28, 2012 at 9:31 PM, Tim Cera <tim at cerazone.net> wrote:

> I have been developing a set of pad functions to pad arrays in different
> ways.  Really close to having it accepted into numpy, but I want to revisit
> an implementation issue that I have become worried about.  Should these
> functions be collected into a 'pad' namespace or put raw into np.lib?
>
> Do I really care about this?  Not really since it isn't a utility issue,
> but I contend that it would look better to pull these functions into their
> own namespace.
>
> +1 one for a separate 'pad' or 'padding' namespace. Adding 10 functions to
the main numpy namespace feels like too much.

Ralf

Is there a consensus?
>
> The current pull request:
> https://github.com/numpy/numpy/pull/242
>
> Kindest regards,
> Tim
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/a76d6c52/attachment.html>

From markbak at gmail.com  Wed Mar 28 16:17:58 2012
From: markbak at gmail.com (Mark Bakker)
Date: Wed, 28 Mar 2012 22:17:58 +0200
Subject: [Numpy-discussion] problem with dot for complex matrices
Message-ID: <CAEX=yaYdK0-BboppGdMi2_BotpTr=FOT7xPU9NmrQNA6S9QRrA@mail.gmail.com>

Have you tried to create your arrays with the numpy.zeros functions rather
than the scipy.zeros functions?
I can imagine that something may get confused here. Maybe some version
mismatch or what not.

Mark

From: Ryan Krauss <ryanlists <at> gmail.com>
Subject: Re: problem with dot for complex
matrices<http://news.gmane.org/find-root.php?message_id=%3cCAAroWSa4ZgEdhb4%3dyq%5fuvxBYxC%5fDCRxcAPdzU79z80PDj%3doGRg%40mail.gmail.com%3e>
Newsgroups: gmane.comp.python.scientific.user<http://news.gmane.org/gmane.comp.python.scientific.user>
Date: 2012-03-27 19:56:14 GMT (1 day and 18 minutes ago)

The matrices are initially created by these lines:

        matout=scipy.zeros((n,n),dtype=complex128)#+0j
        colout=scipy.zeros((n,1),dtype=complex128)#+0j
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/c0cfdb97/attachment.html>

From charlesr.harris at gmail.com  Wed Mar 28 18:08:48 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 28 Mar 2012 16:08:48 -0600
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
Message-ID: <CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>

On Wed, Mar 28, 2012 at 1:57 PM, Ralf Gommers
<ralf.gommers at googlemail.com>wrote:

>
>
> On Wed, Mar 28, 2012 at 9:31 PM, Tim Cera <tim at cerazone.net> wrote:
>
>> I have been developing a set of pad functions to pad arrays in different
>> ways.  Really close to having it accepted into numpy, but I want to revisit
>> an implementation issue that I have become worried about.  Should these
>> functions be collected into a 'pad' namespace or put raw into np.lib?
>>
>> Do I really care about this?  Not really since it isn't a utility issue,
>> but I contend that it would look better to pull these functions into their
>> own namespace.
>>
>> +1 one for a separate 'pad' or 'padding' namespace. Adding 10 functions
> to the main numpy namespace feels like too much.
>
>
I think there is also a question of using a prefix pad_xxx for the function
names as opposed to pad.xxx.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120328/39c2a364/attachment.html>

From christoph.gohle at mpq.mpg.de  Thu Mar 29 04:01:58 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Thu, 29 Mar 2012 10:01:58 +0200
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <jjb1tb$ru8$1@dough.gmane.org>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
Message-ID: <CB3480DC-51BA-48E0-88E6-162E89EC21B1@mpq.mpg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:

> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>> thanks for testing. I have now tried on different platforms. I get
>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>> leak...
>> 
>> Any hints would be welcome.
> 
> The type object inherits `tp_alloc` from Numpy. This routine always
> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
> PyArrayObject. Therefore, the write to new->unit in your
> __array_finalize__ goes to unallocated memory.
> 
> This is probably a bug in Numpy --- arrayobject.c:array_alloc should
> respect the size specified by the subtype.
> 
> A workaround is probably to specify a suitable tp_alloc routine yourself:
> 
>   PyType_GenericAlloc,        /* tp_alloc */
>    unitArray_new,              /* tp_new */
>    _PyObject_Del               /* tp_free */
Now I have been playing around with this configuration and can't seem to get rid of the memory leak I was talking about earlier. Reference counting seems to be fine as far as I can tell. I am using the following tp_dealloc:

static void unitArray_dealloc(UnitArrayObject *obj) {
	Py_XDECREF(obj->unit);
	obj->unit = NULL;
	if (obj->base.ob_type->tp_base != NULL) {
		//call base dealloc
		obj->base.ob_type->tp_base->tp_dealloc((PyObject*)obj);
	}
}

I was also trying to provide my own tp_alloc like below and using the inherited tp_free. This also crashes with a segmentation fault upon freeing saying that the memory block was not allocated. Does this make sense?

static PyObject *
unitArray_alloc(PyTypeObject *type, Py_ssize_t NPY_UNUSED(nitems))
{
	PyObject *obj;
	/* nitems will always be 0 */
	obj = (PyObject *)PyObject_Malloc(type->tp_basicsize);
	PyObject_Init(obj, type);
	return obj;
}

Cheers,
Christoph

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAk90FvkACgkQLYu25rCEIzvd+gCeNRgsv44g8kFJut5OQNXvK9zv
XckAoKWEjj3A34i4H+POOU/EIzzSU1EX
=kJPT
-----END PGP SIGNATURE-----


From christoph.gohle at mpq.mpg.de  Thu Mar 29 04:07:32 2012
From: christoph.gohle at mpq.mpg.de (Christoph Gohle)
Date: Thu, 29 Mar 2012 10:07:32 +0200
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <jjb1tb$ru8$1@dough.gmane.org>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
Message-ID: <F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:

> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>> thanks for testing. I have now tried on different platforms. I get
>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>> leak...
>> 
>> Any hints would be welcome.
> 
> The type object inherits `tp_alloc` from Numpy. This routine always
> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
> PyArrayObject. Therefore, the write to new->unit in your
> __array_finalize__ goes to unallocated memory.
do you think, I should submit a bugreport?

Cheers,
Christoph


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.14 (Darwin)

iEYEARECAAYFAk90GEQACgkQLYu25rCEIztHGwCgi4asYpscBFYp6yfYAc1wioZG
9EAAoIZft1WOiiAj+cyJi+RVuT2U7BzF
=e9cn
-----END PGP SIGNATURE-----


From thouis at gmail.com  Thu Mar 29 05:04:27 2012
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Thu, 29 Mar 2012 11:04:27 +0200
Subject: [Numpy-discussion] weird searchsorted behavior for unicode array
In-Reply-To: <CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>
References: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>
	<CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>
Message-ID: <CAHGWWxEF5pSC+OETksDMPE6_ZT9VESUdxUe2d2B+TXGs8TydoQ@mail.gmail.com>

It seems to be a bug in the unicode string length computation in
arraytypes.c.src:UNICODE_compare(), based on comparison to the code in
arrayobject.c:_myunicmp() and arrayobject.c:_compare_strings().

Patch below (against maintenance/1.6.x, but the bug also looks to be
present in master based on my reading of the code).

---
 numpy/core/src/multiarray/arraytypes.c.src |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/numpy/core/src/multiarray/arraytypes.c.src
b/numpy/core/src/multiarray/arraytypes.c.src
index fde95c4..660d1e5 100644
--- a/numpy/core/src/multiarray/arraytypes.c.src
+++ b/numpy/core/src/multiarray/arraytypes.c.src
@@ -2789,7 +2789,7 @@ static int
 UNICODE_compare(PyArray_UCS4 *ip1, PyArray_UCS4 *ip2,
                 PyArrayObject *ap)
 {
-    int itemsize = ap->descr->elsize;
+    int itemsize = (ap->descr->elsize) >> 2;

     if (itemsize < 0) {
         return 0;
-- 
1.7.9.3


From thouis at gmail.com  Thu Mar 29 06:23:48 2012
From: thouis at gmail.com (Thouis (Ray) Jones)
Date: Thu, 29 Mar 2012 12:23:48 +0200
Subject: [Numpy-discussion] weird searchsorted behavior for unicode array
In-Reply-To: <CAHGWWxEF5pSC+OETksDMPE6_ZT9VESUdxUe2d2B+TXGs8TydoQ@mail.gmail.com>
References: <CAHGWWxFDDsfv8x7WUUW+Lkj_wwnOc0gSZ1Z7223xJ0YhX6M4qg@mail.gmail.com>
	<CAMMTP+C5EHwSJWR8=fkzX2fTZnhdR6X15Ohkhj55z3g+iQTGvw@mail.gmail.com>
	<CAHGWWxEF5pSC+OETksDMPE6_ZT9VESUdxUe2d2B+TXGs8TydoQ@mail.gmail.com>
Message-ID: <CAHGWWxFaH14bMc9QGi1qG-ZJLfLMQA4Dwnv2pJm_Mx9+NCQ3ww@mail.gmail.com>

On Thu, Mar 29, 2012 at 11:04, Thouis (Ray) Jones <thouis at gmail.com> wrote:
> It seems to be a bug in the unicode string length computation in
> arraytypes.c.src:UNICODE_compare(), based on comparison to the code in
> arrayobject.c:_myunicmp() and arrayobject.c:_compare_strings().
>
> Patch below (against maintenance/1.6.x, but the bug also looks to be
> present in master based on my reading of the code).

I just submitted a PR against numpy master for this bug, adding a test
based on the example I posted.
https://github.com/numpy/numpy/pull/243

Ray Jones


From chaoyuejoy at gmail.com  Thu Mar 29 07:54:40 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Thu, 29 Mar 2012 13:54:40 +0200
Subject: [Numpy-discussion] how to check type of array?
Message-ID: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>

Dear all,

how can I check type of array in if condition expression?

In [75]: type(a)
Out[75]: <type 'numpy.ndarray'>

In [76]: a.dtype
Out[76]: dtype('int32')

a.dtype=='int32'?

thanks!

Chao


-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/46c744f3/attachment.html>

From shish at keba.be  Thu Mar 29 08:41:46 2012
From: shish at keba.be (Olivier Delalleau)
Date: Thu, 29 Mar 2012 08:41:46 -0400
Subject: [Numpy-discussion] how to check type of array?
In-Reply-To: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
References: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
Message-ID: <CAFXk4bqSbboiiN1QsMvJuACvcRSPX9bdJr1T6-UiDRVqvL2o+A@mail.gmail.com>

if type(a) == numpy.ndarray:
   ...
if a.dtype == 'int32':
   ...

-=- Olivier

Le 29 mars 2012 07:54, Chao YUE <chaoyuejoy at gmail.com> a ?crit :

> Dear all,
>
> how can I check type of array in if condition expression?
>
> In [75]: type(a)
> Out[75]: <type 'numpy.ndarray'>
>
> In [76]: a.dtype
> Out[76]: dtype('int32')
>
> a.dtype=='int32'?
>
> thanks!
>
> Chao
>
>
> --
>
> ***********************************************************************************
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> ************************************************************************************
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/a8ffc5bf/attachment.html>

From derek at astro.physik.uni-goettingen.de  Thu Mar 29 08:47:30 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Thu, 29 Mar 2012 14:47:30 +0200
Subject: [Numpy-discussion] how to check type of array?
In-Reply-To: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
References: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
Message-ID: <16B10E15-1B63-4AC2-A0BA-9C4CD3916353@astro.physik.uni-goettingen.de>

On 29 Mar 2012, at 13:54, Chao YUE wrote:

> how can I check type of array in if condition expression?
> 
> In [75]: type(a)
> Out[75]: <type 'numpy.ndarray'>
> 
> In [76]: a.dtype
> Out[76]: dtype('int32')
> 
> a.dtype=='int32'?

this and 

a.dtype=='i4'
a.dtype==np.int32

all work. For a more general check (e.g. if it is any type of integer), you can do

np.issubclass_(a.dtype.type, np.integer)

See also "help(np.subdtype)"

Cheers,
					Derek


From robert.kern at gmail.com  Thu Mar 29 08:49:51 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 29 Mar 2012 13:49:51 +0100
Subject: [Numpy-discussion] how to check type of array?
In-Reply-To: <16B10E15-1B63-4AC2-A0BA-9C4CD3916353@astro.physik.uni-goettingen.de>
References: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
	<16B10E15-1B63-4AC2-A0BA-9C4CD3916353@astro.physik.uni-goettingen.de>
Message-ID: <CAF6FJivKagV=_LMn4094OQswRELq6dT1cnvhutewSdOR-1ZooA@mail.gmail.com>

On Thu, Mar 29, 2012 at 13:47, Derek Homeier
<derek at astro.physik.uni-goettingen.de> wrote:
> On 29 Mar 2012, at 13:54, Chao YUE wrote:
>
>> how can I check type of array in if condition expression?
>>
>> In [75]: type(a)
>> Out[75]: <type 'numpy.ndarray'>
>>
>> In [76]: a.dtype
>> Out[76]: dtype('int32')
>>
>> a.dtype=='int32'?
>
> this and
>
> a.dtype=='i4'
> a.dtype==np.int32
>
> all work. For a more general check (e.g. if it is any type of integer), you can do
>
> np.issubclass_(a.dtype.type, np.integer)

I don't recommend using that. Use np.issubdtype(a.dtype, np.integer) instead.

-- 
Robert Kern


From derek at astro.physik.uni-goettingen.de  Thu Mar 29 08:56:50 2012
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Thu, 29 Mar 2012 14:56:50 +0200
Subject: [Numpy-discussion] how to check type of array?
In-Reply-To: <CAF6FJivKagV=_LMn4094OQswRELq6dT1cnvhutewSdOR-1ZooA@mail.gmail.com>
References: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
	<16B10E15-1B63-4AC2-A0BA-9C4CD3916353@astro.physik.uni-goettingen.de>
	<CAF6FJivKagV=_LMn4094OQswRELq6dT1cnvhutewSdOR-1ZooA@mail.gmail.com>
Message-ID: <E648462C-F8B0-4263-8EF3-3C2EA0D8D838@astro.physik.uni-goettingen.de>

On 29 Mar 2012, at 14:49, Robert Kern wrote:

>> all work. For a more general check (e.g. if it is any type of integer), you can do
>> 
>> np.issubclass_(a.dtype.type, np.integer)
> 
> I don't recommend using that. Use np.issubdtype(a.dtype, np.integer) instead.

Sorry, you're right, this works the same way - I had the impression from the documentation 
that tests like np.issubdtype(np.int16, np.integer) would not work, but they do.

Cheers,
						Derek


From chaoyuejoy at gmail.com  Thu Mar 29 09:34:05 2012
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Thu, 29 Mar 2012 15:34:05 +0200
Subject: [Numpy-discussion] how to check type of array?
In-Reply-To: <E648462C-F8B0-4263-8EF3-3C2EA0D8D838@astro.physik.uni-goettingen.de>
References: <CAAN-aRGAZxygvpU223n971_hL4OhTLLOQe1K88foT5UrcUFyog@mail.gmail.com>
	<16B10E15-1B63-4AC2-A0BA-9C4CD3916353@astro.physik.uni-goettingen.de>
	<CAF6FJivKagV=_LMn4094OQswRELq6dT1cnvhutewSdOR-1ZooA@mail.gmail.com>
	<E648462C-F8B0-4263-8EF3-3C2EA0D8D838@astro.physik.uni-goettingen.de>
Message-ID: <CAAN-aREvcSNvr=4h4xzWQk=8meAkW7N+_aET81qU0=Jh9ZbqKw@mail.gmail.com>

thanks to all. more than what I need.

cheers,

chao

2012/3/29 Derek Homeier <derek at astro.physik.uni-goettingen.de>

> On 29 Mar 2012, at 14:49, Robert Kern wrote:
>
> >> all work. For a more general check (e.g. if it is any type of integer),
> you can do
> >>
> >> np.issubclass_(a.dtype.type, np.integer)
> >
> > I don't recommend using that. Use np.issubdtype(a.dtype, np.integer)
> instead.
>
> Sorry, you're right, this works the same way - I had the impression from
> the documentation
> that tests like np.issubdtype(np.int16, np.integer) would not work, but
> they do.
>
> Cheers,
>                                                Derek
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


-- 
***********************************************************************************
Chao YUE
Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
************************************************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/71512654/attachment.html>

From tim at cerazone.net  Thu Mar 29 10:55:20 2012
From: tim at cerazone.net (Tim Cera)
Date: Thu, 29 Mar 2012 10:55:20 -0400
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
Message-ID: <CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>

On Wed, Mar 28, 2012 at 6:08 PM, Charles R Harris <charlesr.harris at gmail.com
> wrote:

>
> I think there is also a question of using a prefix pad_xxx for the
> function names as opposed to pad.xxx.
>

If I had it as pad.mean, pad.median, ...etc. then someone could

    from numpy.pad import *
    a = np.arange(5)
    # then you would have a confusing
    b = mean(a, 2)

Because of that I would vote for 'pad.pad_xxx' instead of 'pad.xxx'.  In
fact at one time I named the functions 'pad.pad_with_xxx' which was a
little overkill.

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/09436966/attachment.html>

From chris.barker at noaa.gov  Thu Mar 29 12:09:07 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 29 Mar 2012 09:09:07 -0700
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
Message-ID: <CALGmxEKxtSUDU9gwqUsEZo6sFq7WxSFS7dthK56ahmonrvAoiQ@mail.gmail.com>

On Thu, Mar 29, 2012 at 7:55 AM, Tim Cera

>> I think there is also a question of using a prefix pad_xxx for the
>> function names as opposed to pad.xxx.

"Namespaces are one honking great idea -- let's do more of those!"


> If I had it as pad.mean, pad.median, ...etc. then someone could
>
> ? ? from numpy.pad import *
> ? ? a = np.arange(5)
> ? ? # then you would have a confusing
> ? ? b = mean(a, 2)

exactly why no one should *ever* do import *!

pad.mean

is not more typing than

pad_mean

so there in NO reason to use the latter, except to have a C style,
rather than python namespaces....

ANd what if someone DOES want to override mean?:

from np.pad import mean

and now they can do

mean(something)

and it's explicite what they mean to do.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From travis at continuum.io  Thu Mar 29 12:13:54 2012
From: travis at continuum.io (Travis Oliphant)
Date: Thu, 29 Mar 2012 11:13:54 -0500
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
	namespace or functions under np.lib
In-Reply-To: <CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
Message-ID: <38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>

While namespaces are a really good idea, I'm not a big fan of both "module" namespaces and underscore namespaces.   It seems pretty redundant to me to have pad.pad_mean. 

On the other hand, one could argue that pad.mean could be confused with calculating the mean of a padded array.   So, it seems like the function names need to be called something more than just "mean, median, etc."   Something like padwith_mean, padwith_median, etc. actually makes more sense.   Or pad.with_mean, pad.with_median.   The with_ in this case is not really a "namespace" it's an indication of functionality. 

With that said, NumPy was designed not to be "deep" in terms of naming.    We should work hard to ensure that functions and Classes to be instantiated by the user are no more than 2 levels down (i.e. either in the NumPy namespace or in the numpy.<subpackage> namespace. 

So, either we need to move pad into a new subpackage and call things 

pad.with_mean
pad.with_median
etc.

or keep the functions

pad_mean,
pad_median,

etc.

in numpy.lib

I don't think this functionality really justifies an entire new sub-package in NumPy, though.  So, I would vote for keeping things accessible as

numpy.lib.pad_mean


-Travis
 

1) The functions in pad.py be imported into the lib namespace
2) The functions all be renamed to padwith_mean

On Mar 29, 2012, at 9:55 AM, Tim Cera wrote:

> 
> On Wed, Mar 28, 2012 at 6:08 PM, Charles R Harris <charlesr.harris at gmail.com> wrote:
> 
> I think there is also a question of using a prefix pad_xxx for the function names as opposed to pad.xxx.
> 
> If I had it as pad.mean, pad.median, ...etc. then someone could
> 
>     from numpy.pad import *
>     a = np.arange(5)
>     # then you would have a confusing
>     b = mean(a, 2)
> 
> Because of that I would vote for 'pad.pad_xxx' instead of 'pad.xxx'.  In fact at one time I named the functions 'pad.pad_with_xxx' which was a little overkill.
> 
> Kindest regards,
> Tim
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/4392d732/attachment.html>

From njs at pobox.com  Thu Mar 29 12:38:42 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 29 Mar 2012 17:38:42 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
Message-ID: <CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>

On Thu, Mar 29, 2012 at 5:13 PM, Travis Oliphant <travis at continuum.io> wrote:
> While namespaces are a really good idea, I'm not a big fan of both "module"
> namespaces and underscore namespaces. ? It seems pretty redundant to me to
> have pad.pad_mean.
>
> On the other hand, one could argue that pad.mean could be confused with
> calculating the mean of a padded array. ? So, it seems like the function
> names need to be called something more than just "mean, median, etc."
> Something like padwith_mean, padwith_median, etc. actually makes more sense.
> ? Or pad.with_mean, pad.with_median. ??The with_ in this case is not really
> a "namespace" it's an indication of functionality.

Perhaps it should be
  pad(..., mode="mean")

, mode="" is only a few more characters than _with_, and this would
make it much easier to write functions whose API looks like:
    wavelet_decompose(..., pad_mode=....)

Also it would solve the namespace question :-).

-- Nathaniel


From charlesr.harris at gmail.com  Thu Mar 29 12:45:35 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 29 Mar 2012 10:45:35 -0600
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
Message-ID: <CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com>

On Thu, Mar 29, 2012 at 10:38 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Thu, Mar 29, 2012 at 5:13 PM, Travis Oliphant <travis at continuum.io>
> wrote:
> > While namespaces are a really good idea, I'm not a big fan of both
> "module"
> > namespaces and underscore namespaces.   It seems pretty redundant to me
> to
> > have pad.pad_mean.
> >
> > On the other hand, one could argue that pad.mean could be confused with
> > calculating the mean of a padded array.   So, it seems like the function
> > names need to be called something more than just "mean, median, etc."
> > Something like padwith_mean, padwith_median, etc. actually makes more
> sense.
> >   Or pad.with_mean, pad.with_median.   The with_ in this case is not
> really
> > a "namespace" it's an indication of functionality.
>
> Perhaps it should be
>  pad(..., mode="mean")
>
> , mode="" is only a few more characters than _with_, and this would
> make it much easier to write functions whose API looks like:
>    wavelet_decompose(..., pad_mode=....)
>
> Also it would solve the namespace question :-).
>
>
I like this idea. I'd leave the current functions in the module for those
who want to import a particular padding type, but import the pad(...,
mode="xxx") into the numpy namespace.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/6a879104/attachment.html>

From njs at pobox.com  Thu Mar 29 12:52:01 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 29 Mar 2012 17:52:01 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
Message-ID: <CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>

On Thu, Mar 29, 2012 at 5:38 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Thu, Mar 29, 2012 at 5:13 PM, Travis Oliphant <travis at continuum.io> wrote:
>> While namespaces are a really good idea, I'm not a big fan of both "module"
>> namespaces and underscore namespaces. ? It seems pretty redundant to me to
>> have pad.pad_mean.
>>
>> On the other hand, one could argue that pad.mean could be confused with
>> calculating the mean of a padded array. ? So, it seems like the function
>> names need to be called something more than just "mean, median, etc."
>> Something like padwith_mean, padwith_median, etc. actually makes more sense.
>> ? Or pad.with_mean, pad.with_median. ??The with_ in this case is not really
>> a "namespace" it's an indication of functionality.
>
> Perhaps it should be
> ?pad(..., mode="mean")
>
> , mode="" is only a few more characters than _with_, and this would
> make it much easier to write functions whose API looks like:
> ? ?wavelet_decompose(..., pad_mode=....)
>
> Also it would solve the namespace question :-).
>
> -- Nathaniel

...And on looking at the pull request for a bit, it would let us
combine 10 distinct single-line functions that (with highly redundant
docstrings) take up ~600 lines of code. The interfaces are *almost*
identical; the exceptions are:
  -- "reflect" and "symmetric" take and "even"/"odd" argument
  -- "pad_constant" takes an argument specifying the constant to pad with
  -- "pad_linear_ramp" takes an argument specifying the end value of
the linear ramp

The first two can be easily handled by just combining it into the
mode, so we have "reflect", "reflect_odd", "symmetric",
"symmetric_odd", so that just leaves two mode-specific arguments. I
wouldn't feel bad about leaving those in as unused arguments for most
modes. We might even want to combine them into a single argument --
normally I'm against such "simplifications", but the semantics really
are quite similar.

And final advantage: we could later extend this interface to also
support mode=user_specified_callable for custom padding modes, since
this is basically how the underlying code already works. (Though I'm
not quite sure the interface for the individual pad mode functions
like _mean and _median is the one we'd want to expose to users.)

-- Nathaniel


From njs at pobox.com  Thu Mar 29 12:56:40 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 29 Mar 2012 17:56:40 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com>
Message-ID: <CAPJVwBkirKyGsnmXyLoFBZ2ifDtA3gQCkw3rMgMNQ+4y2rsiBA@mail.gmail.com>

On Thu, Mar 29, 2012 at 5:45 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Thu, Mar 29, 2012 at 10:38 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Thu, Mar 29, 2012 at 5:13 PM, Travis Oliphant <travis at continuum.io>
>> wrote:
>> > While namespaces are a really good idea, I'm not a big fan of both
>> > "module"
>> > namespaces and underscore namespaces. ? It seems pretty redundant to me
>> > to
>> > have pad.pad_mean.
>> >
>> > On the other hand, one could argue that pad.mean could be confused with
>> > calculating the mean of a padded array. ? So, it seems like the function
>> > names need to be called something more than just "mean, median, etc."
>> > Something like padwith_mean, padwith_median, etc. actually makes more
>> > sense.
>> > ? Or pad.with_mean, pad.with_median. ??The with_ in this case is not
>> > really
>> > a "namespace" it's an indication of functionality.
>>
>> Perhaps it should be
>> ?pad(..., mode="mean")
>>
>> , mode="" is only a few more characters than _with_, and this would
>> make it much easier to write functions whose API looks like:
>> ? ?wavelet_decompose(..., pad_mode=....)
>>
>> Also it would solve the namespace question :-).
>>
>
> I like this idea. I'd leave the current functions in the module for those
> who want to import a particular padding type, but import the pad(...,
> mode="xxx") into the numpy namespace.

Your call, but that's about 500 lines of highly-redundant docstrings
to maintain, just to save the occasional person from having to type
one line:
  def pad_with_mean(*args, **kwargs): np.lib.pad(*args, mode="mean", **kwargs)

The unified implementation really is just
  PADDERS = {"mean": _mean, "median": _median, ...}
  return _loop_across(array, pad_width, _PADDERS[mode],
constant_values=constant_values, end_values=end_values)

-- Nathaniel


From travis at continuum.io  Thu Mar 29 13:07:34 2012
From: travis at continuum.io (Travis Oliphant)
Date: Thu, 29 Mar 2012 12:07:34 -0500
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
	namespace or functions under np.lib
In-Reply-To: <CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
Message-ID: <35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>


On Mar 29, 2012, at 11:52 AM, Nathaniel Smith wrote:

> On Thu, Mar 29, 2012 at 5:38 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Thu, Mar 29, 2012 at 5:13 PM, Travis Oliphant <travis at continuum.io> wrote:
>>> While namespaces are a really good idea, I'm not a big fan of both "module"
>>> namespaces and underscore namespaces.   It seems pretty redundant to me to
>>> have pad.pad_mean.
>>> 
>>> On the other hand, one could argue that pad.mean could be confused with
>>> calculating the mean of a padded array.   So, it seems like the function
>>> names need to be called something more than just "mean, median, etc."
>>> Something like padwith_mean, padwith_median, etc. actually makes more sense.
>>>   Or pad.with_mean, pad.with_median.   The with_ in this case is not really
>>> a "namespace" it's an indication of functionality.
>> 
>> Perhaps it should be
>>  pad(..., mode="mean")
>> 
>> , mode="" is only a few more characters than _with_, and this would
>> make it much easier to write functions whose API looks like:
>>    wavelet_decompose(..., pad_mode=....)
>> 
>> Also it would solve the namespace question :-).
>> 
>> -- Nathaniel
> 
> ...And on looking at the pull request for a bit, it would let us
> combine 10 distinct single-line functions that (with highly redundant
> docstrings) take up ~600 lines of code. The interfaces are *almost*
> identical; the exceptions are:
>  -- "reflect" and "symmetric" take and "even"/"odd" argument
>  -- "pad_constant" takes an argument specifying the constant to pad with
>  -- "pad_linear_ramp" takes an argument specifying the end value of
> the linear ramp
> 
> The first two can be easily handled by just combining it into the
> mode, so we have "reflect", "reflect_odd", "symmetric",
> "symmetric_odd", so that just leaves two mode-specific arguments. I
> wouldn't feel bad about leaving those in as unused arguments for most
> modes. We might even want to combine them into a single argument --
> normally I'm against such "simplifications", but the semantics really
> are quite similar.
> 
> And final advantage: we could later extend this interface to also
> support mode=user_specified_callable for custom padding modes, since
> this is basically how the underlying code already works. (Though I'm
> not quite sure the interface for the individual pad mode functions
> like _mean and _median is the one we'd want to expose to users.)


I like the direction this conversation is going.      A single pad function in numpy or numpy.lib is a good idea. 

-Travis


> 
> -- Nathaniel
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From williamj at tenbase2.com  Thu Mar 29 13:47:18 2012
From: williamj at tenbase2.com (William Johnston)
Date: Thu, 29 Mar 2012 13:47:18 -0400
Subject: [Numpy-discussion] Numpy for IronPython 2.7 DLR app?
In-Reply-To: <CAPJVwBkirKyGsnmXyLoFBZ2ifDtA3gQCkw3rMgMNQ+4y2rsiBA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com><CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com><CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com><CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com><38410503-5EB8-484E-B778-4BFFF558615D@continuum.io><CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com><CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com>
	<CAPJVwBkirKyGsnmXyLoFBZ2ifDtA3gQCkw3rMgMNQ+4y2rsiBA@mail.gmail.com>
Message-ID: <8A50E415A49F4E208EBE78C74915ADA8@leviathan>


Hello,

Can numpy for .NET be used in a DLR C# application?

Regards,
William Johnston


From tim at cerazone.net  Thu Mar 29 13:53:31 2012
From: tim at cerazone.net (Tim Cera)
Date: Thu, 29 Mar 2012 13:53:31 -0400
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
Message-ID: <CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>

I was hoping pad would get finished some day.  Maybe 1.9?

Alright - I do like the idea of passing a function to pad, with a bunch of
pre-made functions in place.

Maybe something like:

    a = np.arange(10)
    b = pad('mean', a, 2, stat_length=3)

where if the first argument is a string, use one of the built in functions.

If instead you passed in a function:

    def padwithzeros(vector, pad_width, iaxis, **kwargs):
        bvector = np.zeros(pad_width[0])
        avector = np.zeros(pad_width[1])
        return bvector, avector

    b = pad(padwithzeros, a, 2)

Would that have some goodness?

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/b050588c/attachment.html>

From ischnell at enthought.com  Thu Mar 29 15:11:36 2012
From: ischnell at enthought.com (Ilan Schnell)
Date: Thu, 29 Mar 2012 14:11:36 -0500
Subject: [Numpy-discussion] Numpy for IronPython 2.7 DLR app?
In-Reply-To: <8A50E415A49F4E208EBE78C74915ADA8@leviathan>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com>
	<CAPJVwBkirKyGsnmXyLoFBZ2ifDtA3gQCkw3rMgMNQ+4y2rsiBA@mail.gmail.com>
	<8A50E415A49F4E208EBE78C74915ADA8@leviathan>
Message-ID: <CAAUn5q+QiOhnnHfUZvUWg512Zr7Lb=3iw0Ete-6Pt_AMgrk5cQ@mail.gmail.com>

Hello William,

It's just a matter of importing numpy into IronPython.  See also,
http://enthought.com/repo/.iron/NumPySciPyforDotNet.pdf

- Ilan


On Thu, Mar 29, 2012 at 12:47 PM, William Johnston
<williamj at tenbase2.com> wrote:
>
> Hello,
>
> Can numpy for .NET be used in a DLR C# application?
>
> Regards,
> William Johnston
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From williamj at tenbase2.com  Thu Mar 29 17:59:02 2012
From: williamj at tenbase2.com (William Johnston)
Date: Thu, 29 Mar 2012 17:59:02 -0400
Subject: [Numpy-discussion] \*\*\*\*\*SPAM\*\*\*\*\* Re: Numpy for
	IronPython 2.7 DLR app?
In-Reply-To: <CAAUn5q+QiOhnnHfUZvUWg512Zr7Lb=3iw0Ete-6Pt_AMgrk5cQ@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com><CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com><CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com><CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com><38410503-5EB8-484E-B778-4BFFF558615D@continuum.io><CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com><CAB6mnxLxCmPvsuuxkpxD_OVhdGK1QCqjETH+N48P6+Fw=qXocA@mail.gmail.com><CAPJVwBkirKyGsnmXyLoFBZ2ifDtA3gQCkw3rMgMNQ+4y2rsiBA@mail.gmail.com><8A50E415A49F4E208EBE78C74915ADA8@leviathan>
	<CAAUn5q+QiOhnnHfUZvUWg512Zr7Lb=3iw0Ete-6Pt_AMgrk5cQ@mail.gmail.com>
Message-ID: <F9CF505E3FD84C9BA765CA8E5FE095C1@leviathan>


Ilan:

Thanks for your post.

I can import from the command-line and IronPython console, but not from a C# 
DLR app (using embedded python scripts.)

Any suggestions?

Regards,
William Johnston


-----Original Message----- 
From: Ilan Schnell
Sent: Thursday, March 29, 2012 3:11 PM
To: Discussion of Numerical Python
Subject: \*\*\*\*\*SPAM\*\*\*\*\* Re: [Numpy-discussion] Numpy for 
IronPython 2.7 DLR app?

Hello William,

It's just a matter of importing numpy into IronPython.  See also,
http://enthought.com/repo/.iron/NumPySciPyforDotNet.pdf

- Ilan


On Thu, Mar 29, 2012 at 12:47 PM, William Johnston
<williamj at tenbase2.com> wrote:
>
> Hello,
>
> Can numpy for .NET be used in a DLR C# application?
>
> Regards,
> William Johnston
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion 


From charlesr.harris at gmail.com  Thu Mar 29 20:19:51 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 29 Mar 2012 18:19:51 -0600
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
Message-ID: <CAB6mnxJU0a1VqjYJ8c3vcLMW2Tou-nf+Gqto-Fz=7LimHhFX1Q@mail.gmail.com>

On Thu, Mar 29, 2012 at 11:53 AM, Tim Cera <tim at cerazone.net> wrote:

> I was hoping pad would get finished some day.  Maybe 1.9?


Well, you *did* ask ;-)


>
> Alright - I do like the idea of passing a function to pad, with a bunch of
> pre-made functions in place.
>
> Maybe something like:
>
>     a = np.arange(10)
>     b = pad('mean', a, 2, stat_length=3)
>
> where if the first argument is a string, use one of the built in functions.
>
>
I like Nathaniel's idea of making the string an optional argument, that way
you can also give it a default value ('mean' ?).


> If instead you passed in a function:
>
>     def padwithzeros(vector, pad_width, iaxis, **kwargs):
>         bvector = np.zeros(pad_width[0])
>         avector = np.zeros(pad_width[1])
>         return bvector, avector
>
>     b = pad(padwithzeros, a, 2)
>
> Would that have some goodness?
>
>
Yeah, I think we're converging on something nice.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/c6af367b/attachment.html>

From travis at continuum.io  Thu Mar 29 21:55:31 2012
From: travis at continuum.io (Travis Oliphant)
Date: Thu, 29 Mar 2012 20:55:31 -0500
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
	namespace or functions under np.lib
In-Reply-To: <CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
Message-ID: <D7C30BE2-B385-4F41-AC6A-DF0075590941@continuum.io>


On Mar 29, 2012, at 12:53 PM, Tim Cera wrote:

> I was hoping pad would get finished some day.  Maybe 1.9?

You have been a great sport about this process.   I think it will result in something quite nice.   

> 
> Alright - I do like the idea of passing a function to pad, with a bunch of pre-made functions in place.  
> 
> Maybe something like:
> 
>     a = np.arange(10)
>     b = pad('mean', a, 2, stat_length=3)
> 
> where if the first argument is a string, use one of the built in functions.
> 
> If instead you passed in a function:
> 
>     def padwithzeros(vector, pad_width, iaxis, **kwargs):
>         bvector = np.zeros(pad_width[0])
>         avector = np.zeros(pad_width[1])
>         return bvector, avector
> 
>     b = pad(padwithzeros, a, 2)
> 
> Would that have some goodness?

+1

-Travis

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120329/cff754ac/attachment.html>

From rhattersley at gmail.com  Fri Mar 30 03:35:49 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Fri, 30 Mar 2012 08:35:49 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <D7C30BE2-B385-4F41-AC6A-DF0075590941@continuum.io>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<D7C30BE2-B385-4F41-AC6A-DF0075590941@continuum.io>
Message-ID: <CAP=RS9kv1RT+E7dE3MghZZmokwf2xrNNCw4OWq_e=V_G4S+cJQ@mail.gmail.com>

I like where this is going.

Driven by a desire to avoid a million different methods on a single
class, we've done something similar in our library.
So instead of
   thing.mean(....)
   thing.max(...)
   etc.
we have:
   thing.scrunch(MEAN, ...)
   thing.scrunch(MAX, ...)
   etc.
Where the constants like MEAN and MAX encapsulate the process to be
performed - including a reference to a NumPy/user-defined aggregation
function, as well as some other transformation details.

We then found we could reuse the same constants in other operations:
   thing.scrunch(MEAN, ...)
   thing.squish(MEAN, ...)
   thing.rolling_squish(MEAN, ...)

So I have two minor concerns with the current proposal.
1) The use of string constants to identify NumPy processes. It would
seem better to use library defined constants (ufuncs?) for better
future-proofing, maintenance, etc.
2) Why does only "pad" use this style of interface? If it's a good
idea for "pad", perhaps it should be applied more generally?
numpy.aggregate(MEAN, ...), numpy.group(MEAN, ...), etc. anyone?

Richard Hattersley


On 30 March 2012 02:55, Travis Oliphant <travis at continuum.io> wrote:
>
> On Mar 29, 2012, at 12:53 PM, Tim Cera wrote:
>
> I was hoping pad would get finished some day. ?Maybe 1.9?
>
>
> You have been a great sport about this process. ? I think it will result in
> something quite nice.
>
>
> Alright - I do like the idea of passing a function to pad, with a bunch of
> pre-made functions in place.
>
> Maybe something like:
>
> ? ? a = np.arange(10)
> ? ? b = pad('mean', a, 2, stat_length=3)
>
> where if the first argument is a string, use one of the built in functions.
>
> If instead you passed in a function:
>
> ? ? def padwithzeros(vector, pad_width, iaxis, **kwargs):
> ? ? ? ? bvector = np.zeros(pad_width[0])
> ? ? ? ? avector = np.zeros(pad_width[1])
> ? ? ? ? return bvector, avector
>
> ? ? b = pad(padwithzeros, a, 2)
>
> Would that have some goodness?
>
>
> +1
>
> -Travis
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


From njs at pobox.com  Fri Mar 30 07:32:56 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 30 Mar 2012 12:32:56 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAP=RS9kv1RT+E7dE3MghZZmokwf2xrNNCw4OWq_e=V_G4S+cJQ@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<D7C30BE2-B385-4F41-AC6A-DF0075590941@continuum.io>
	<CAP=RS9kv1RT+E7dE3MghZZmokwf2xrNNCw4OWq_e=V_G4S+cJQ@mail.gmail.com>
Message-ID: <CAPJVwB=rbp0ysUGuTC_uB5ZZYQK5693Hhad7T-aWr-6kb6RXzw@mail.gmail.com>

On Fri, Mar 30, 2012 at 8:35 AM, Richard Hattersley
<rhattersley at gmail.com> wrote:
> I like where this is going.
>
> Driven by a desire to avoid a million different methods on a single
> class, we've done something similar in our library.
> So instead of
> ? thing.mean(....)
> ? thing.max(...)
> ? etc.
> we have:
> ? thing.scrunch(MEAN, ...)
> ? thing.scrunch(MAX, ...)
> ? etc.
> Where the constants like MEAN and MAX encapsulate the process to be
> performed - including a reference to a NumPy/user-defined aggregation
> function, as well as some other transformation details.
>
> We then found we could reuse the same constants in other operations:
> ? thing.scrunch(MEAN, ...)
> ? thing.squish(MEAN, ...)
> ? thing.rolling_squish(MEAN, ...)
>
> So I have two minor concerns with the current proposal.
> 1) The use of string constants to identify NumPy processes. It would
> seem better to use library defined constants (ufuncs?) for better
> future-proofing, maintenance, etc.

I don't see how this would help with future-proofing or maintenance --
can you elaborate?

If this were C, I'd agree; using an enum would have a number of benefits:
 -- easier to work with than strings (== and switch work, no memory
management hassles)
 -- compiler will notice if you accidentally misspell the enum name
 -- since you always in effect 'import *', getting access to
additional constants doesn't require any extra effort
But in Python none of these advantages apply, so I find it more
convenient to just use strings.

Note also that we couldn't use ufuncs here, because we're specifying a
rather unusual sort of operation -- there is no ufunc for padding with
a linear ramp etc. Using "mean" as the example is misleading in this
respect -- it's not really the same as np.mean.

> 2) Why does only "pad" use this style of interface? If it's a good
> idea for "pad", perhaps it should be applied more generally?
> numpy.aggregate(MEAN, ...), numpy.group(MEAN, ...), etc. anyone?

The mode="foo" interface style is actually used in other places, e.g.,
np.linalg.qr.

-- Nathaniel


From njs at pobox.com  Fri Mar 30 07:41:50 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 30 Mar 2012 12:41:50 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
Message-ID: <CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>

On Thu, Mar 29, 2012 at 6:53 PM, Tim Cera <tim at cerazone.net> wrote:
> If instead you passed in a function:
>
> ? ? def padwithzeros(vector, pad_width, iaxis, **kwargs):
> ? ? ? ? bvector = np.zeros(pad_width[0])
> ? ? ? ? avector = np.zeros(pad_width[1])
> ? ? ? ? return bvector, avector
>
> ? ? b = pad(padwithzeros, a, 2)
>
> Would that have some goodness?

I like the idea, but this interface feels undercooked to me. What is
iaxis? (I couldn't figure that out from the docstrings in the pull
request either.) If padding a matrix, do we need a way to do padding
of all rows simultaneously? (Certainly that'd be just as easy for
something like padwithzeros, and would let us avoid a python-level for
loop.) Why is this function allocating new arrays that will just be
copied into the big array and then discarded, instead of filling in
the big array directly? (Again, this is a speed issue.) If it's
working with single rows at a time, then how will padwithconstant know
which constant to pull out of kwargs["constant_value"] when it's an
array?

That's why I thought you might want to submit this as a second pull
request, rather than letting it hold up the whole thing.

-- Nathaniel


From charlesr.harris at gmail.com  Fri Mar 30 08:22:40 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 30 Mar 2012 06:22:40 -0600
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
Message-ID: <CAB6mnxJout3kBPPS8_HwzkVweJ__3ZxD08N6d1o-kOUR06x3xA@mail.gmail.com>

On Fri, Mar 30, 2012 at 5:41 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Thu, Mar 29, 2012 at 6:53 PM, Tim Cera <tim at cerazone.net> wrote:
> > If instead you passed in a function:
> >
> >     def padwithzeros(vector, pad_width, iaxis, **kwargs):
> >         bvector = np.zeros(pad_width[0])
> >         avector = np.zeros(pad_width[1])
> >         return bvector, avector
> >
> >     b = pad(padwithzeros, a, 2)
> >
> > Would that have some goodness?
>
> I like the idea, but this interface feels undercooked to me. What is
> iaxis? (I couldn't figure that out from the docstrings in the pull
> request either.) If padding a matrix, do we need a way to do padding
> of all rows simultaneously? (Certainly that'd be just as easy for
> something like padwithzeros, and would let us avoid a python-level for
> loop.) Why is this function allocating new arrays that will just be
> copied into the big array and then discarded, instead of filling in
> the big array directly? (Again, this is a speed issue.) If it's
> working with single rows at a time, then how will padwithconstant know
> which constant to pull out of kwargs["constant_value"] when it's an
> array?
>
> That's why I thought you might want to submit this as a second pull
> request, rather than letting it hold up the whole thing.
>
>
Would you suggest committing the current PR and then adding this interface
later?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120330/fdaaf133/attachment.html>

From njs at pobox.com  Fri Mar 30 08:51:21 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 30 Mar 2012 13:51:21 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAB6mnxJout3kBPPS8_HwzkVweJ__3ZxD08N6d1o-kOUR06x3xA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
	<CAB6mnxJout3kBPPS8_HwzkVweJ__3ZxD08N6d1o-kOUR06x3xA@mail.gmail.com>
Message-ID: <CAPJVwBkPAqGgHm5bU24Fm61En-=H=Qdp9SchvYJc4gP+-csQ9g@mail.gmail.com>

On Fri, Mar 30, 2012 at 1:22 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Fri, Mar 30, 2012 at 5:41 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Thu, Mar 29, 2012 at 6:53 PM, Tim Cera <tim at cerazone.net> wrote:
>> > If instead you passed in a function:
>> >
>> > ? ? def padwithzeros(vector, pad_width, iaxis, **kwargs):
>> > ? ? ? ? bvector = np.zeros(pad_width[0])
>> > ? ? ? ? avector = np.zeros(pad_width[1])
>> > ? ? ? ? return bvector, avector
>> >
>> > ? ? b = pad(padwithzeros, a, 2)
>> >
>> > Would that have some goodness?
>>
>> I like the idea, but this interface feels undercooked to me. What is
>> iaxis? (I couldn't figure that out from the docstrings in the pull
>> request either.) If padding a matrix, do we need a way to do padding
>> of all rows simultaneously? (Certainly that'd be just as easy for
>> something like padwithzeros, and would let us avoid a python-level for
>> loop.) Why is this function allocating new arrays that will just be
>> copied into the big array and then discarded, instead of filling in
>> the big array directly? (Again, this is a speed issue.) If it's
>> working with single rows at a time, then how will padwithconstant know
>> which constant to pull out of kwargs["constant_value"] when it's an
>> array?
>>
>> That's why I thought you might want to submit this as a second pull
>> request, rather than letting it hold up the whole thing.
>>
>
> Would you suggest committing the current PR and then adding this interface
> later?

My suggestion is:
Step 1: Change the current PR so that it has only one user-exposed
function, something like pad(..., mode="foo"), and commit that.
Everyone seems to pretty much like that interface, implementing it
would take <1 hour of work, and then the basic functionality would be
landed and done.
Step 2: Add the option to pass a user-defined function as the mode=
argument, since there's still uncertainty about the best way to do it
and working through uncertainty adds time and risk that shouldn't hold
up the parts that we do agree on.

Even if we do want to keep around the pad_with_mean, pad_with_median
etc. functions as additional user-exposed entry-points, I think the
current names in the PR met with objections? (The current names are
like "np.lib.pad.pad_mean".)

-- Nathaniel


From tim at cerazone.net  Fri Mar 30 08:56:12 2012
From: tim at cerazone.net (Tim Cera)
Date: Fri, 30 Mar 2012 08:56:12 -0400
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
Message-ID: <CAO5s+D8Mks6aGCiu-3ZaUuhdgH4xGm5Z0rOYwfWU7mdtRDUudA@mail.gmail.com>

I rearranged your questions.

Why is this function allocating new arrays that will just be
> copied into the big array and then discarded, instead of filling in
> the big array directly? (Again, this is a speed issue.)


My example in the e-mail was incorrect (sorry about that).  The way it
actually has always worked is exactly what you describe.

If padding a matrix, do we need a way to do padding
> of all rows simultaneously? (Certainly that'd be just as easy for
> something like padwithzeros, and would let us avoid a python-level for
> loop.)


The core of pad is apply_along_axis which passes a vector to a function.
apply_along_axis will pass each 'row' into a function, then I use
apply_along_axis again for each 'column' (which now includes padded values
from each 'row') progressing through the axes.  The function just works on
a vector (rank 1 array).  The vector that the function sees has to already
have place holder pad values.  That is why the function also needs the pad
width to know which values to manipulate.  So a working thought untested
function would be something like:

 def padwithzeros(vector, pad_width, iaxis, **kwargs):
         vector[:pad_tuple[0]] = 0
         vector[-pad_tuple[1]:] = 0
         return vector

 b = pad(padwithzeros, a, 2)

Note that iaxis isn't used!  But wait...

 I like the idea, but this interface feels undercooked to me. What is
> iaxis? (I couldn't figure that out from the docstrings in the pull
> request either.)


>   If it's
> working with single rows at a time, then how will padwithconstant know
> which constant to pull out of kwargs["constant_value"] when it's an
> array?


These questions actually go together.  First, keywords like constant_values
are the same for all rows in the axis - they are defined for the entire
axis.

Now iaxis.  It is only needed inside the function to use the correct
constant_values, end_values, or stat_length.  For example something like...

 def padwithconstants(vector, pad_width, iaxis, **kwargs):
         vector[:pad_tuple[0]] = kwargs['constant_values'][iaxis][0]
         vector[-pad_tuple[1]:] =  kwargs['constant_values'][iaxis][1]
         return vector

 constants = ((10,20), (30,40))
 # 10 will be prepended to all rows (axis 0)
 # 20 will be appended to all rows (axis 0)
 # 30
 b = pad(padwithconstants, a, 2, constant_values=constants)

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120330/cefc5a91/attachment.html>

From tim at cerazone.net  Fri Mar 30 09:20:42 2012
From: tim at cerazone.net (Tim Cera)
Date: Fri, 30 Mar 2012 09:20:42 -0400
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwBkPAqGgHm5bU24Fm61En-=H=Qdp9SchvYJc4gP+-csQ9g@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
	<CAB6mnxJout3kBPPS8_HwzkVweJ__3ZxD08N6d1o-kOUR06x3xA@mail.gmail.com>
	<CAPJVwBkPAqGgHm5bU24Fm61En-=H=Qdp9SchvYJc4gP+-csQ9g@mail.gmail.com>
Message-ID: <CAO5s+D8182XxNWdx4dfLxgwKNnxJDnu=esky73mv=MvOC3B0mA@mail.gmail.com>

>
> My suggestion is:
> Step 1: Change the current PR so that it has only one user-exposed
> function, something like pad(..., mode="foo"), and commit that.
> Everyone seems to pretty much like that interface, implementing it
> would take <1 hour of work, and then the basic functionality would be
> landed and done.
>

This is all done in my working directory.

Currently I have 'mode' as the first argument and not a keyword.  Could you
explain the utility of having it be a keyword, if that is indeed what you
were advocating earlier?

Step 2: Add the option to pass a user-defined function as the mode=
> argument, since there's still uncertainty about the best way to do it
> and working through uncertainty adds time and risk that shouldn't hold
> up the parts that we do agree on.
>

This is done also. I don't do any checks.  If it isn't a string, then I
take it to be a function.

The function signature is:
myfunc(vector, pad_tuple, iaxis, kwds)
and it has to return a rank 1 array the same length as the input `vector`


> Even if we do want to keep around the pad_with_mean, pad_with_median
> etc. functions as additional user-exposed entry-points, I think the
> current names in the PR met with objections? (The current names are
> like "np.lib.pad.pad_mean".)


Names?

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120330/5965f247/attachment.html>

From njs at pobox.com  Fri Mar 30 11:02:43 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 30 Mar 2012 16:02:43 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAO5s+D8182XxNWdx4dfLxgwKNnxJDnu=esky73mv=MvOC3B0mA@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<CAPJVwBn8BwL5HqUC_z7tvGaRDwU=AgTbxFwDZDhvf_QayXd=8Q@mail.gmail.com>
	<CAB6mnxJout3kBPPS8_HwzkVweJ__3ZxD08N6d1o-kOUR06x3xA@mail.gmail.com>
	<CAPJVwBkPAqGgHm5bU24Fm61En-=H=Qdp9SchvYJc4gP+-csQ9g@mail.gmail.com>
	<CAO5s+D8182XxNWdx4dfLxgwKNnxJDnu=esky73mv=MvOC3B0mA@mail.gmail.com>
Message-ID: <CAPJVwBkVWSOSD4cr2MtfqnOneUbEXfsVHUyOkND5+UxS_Ok+Fw@mail.gmail.com>

On Fri, Mar 30, 2012 at 2:20 PM, Tim Cera <tim at cerazone.net> wrote:
>> My suggestion is:
>> Step 1: Change the current PR so that it has only one user-exposed
>> function, something like pad(..., mode="foo"), and commit that.
>> Everyone seems to pretty much like that interface, implementing it
>> would take <1 hour of work, and then the basic functionality would be
>> landed and done.
>
> This is all done in my working directory.

Cool.

> Currently I have 'mode' as the first argument and not a keyword. ?Could you
> explain the utility of having it be a keyword, if that is indeed what you
> were advocating earlier?

Eh, just that it feels vaguely more consistent with all the other np
functions, which generally take arrays as their first argument and
sometimes have a mode="string" argument. Examples of the latter:
np.linalg.qr, np.correlate, np.convolve, np.take, np.put, np.choose

And unlike Charles, my first guess at the default would have been
zero-padding (isn't that the most common sort of padding?).

But both of these are minor quibbles -- bikeshedding, really. My
opinion isn't strong.

>> Step 2: Add the option to pass a user-defined function as the mode=
>> argument, since there's still uncertainty about the best way to do it
>> and working through uncertainty adds time and risk that shouldn't hold
>> up the parts that we do agree on.
>
>
> This is done also. I don't do any checks. ?If it isn't a string, then I take
> it to be a function.

Yes, that's probably the Right Way in python.

> The function signature is:
> myfunc(vector, pad_tuple, iaxis, kwds)
> and it has to return a rank 1 array the same length as the input `vector`

>From your other mail, this API does make more sense to me now.
However, looking at the code, I think the returned vector is ignored,
and ideally should be [] so as to prevent apply_along_axis from
generating a large temporary matrix that we will in any case discard?

>> Even if we do want to keep around the pad_with_mean, pad_with_median
>> etc. functions as additional user-exposed entry-points, I think the
>> current names in the PR met with objections? (The current names are
>> like "np.lib.pad.pad_mean".)
>
> Names?

I wasn't sure if Charles was talking about merging the current pull
request as-is, so I was re-raising the original question you started
the thread with: whether we like having the functions referred to like
"np.lib.pad.pad_mean", or something else would be better (e.g.
"np.lib.pad_with_mean"). If you're changing the PR anyway then you can
just ignore that paragraph :-).

Cheers,
-- Nathaniel


From markflorisson88 at gmail.com  Fri Mar 30 13:57:39 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Fri, 30 Mar 2012 18:57:39 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
Message-ID: <CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>

On 29 March 2012 09:07, Christoph Gohle <christoph.gohle at mpq.mpg.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Am 08.03.2012 um 20:39 schrieb Pauli Virtanen:
>
>> 08.03.2012 17:37, Christoph Gohle kirjoitti:
>>> thanks for testing. I have now tried on different platforms. I get
>>> all kinds of crashes on os x (now with numpy 1.6.1) and windows
>>> with numpy 1.6.0. On Ubuntu with numpy 1.3.0 I get a hughe memory
>>> leak...
>>>
>>> Any hints would be welcome.
>>
>> The type object inherits `tp_alloc` from Numpy. This routine always
>> allocates memory of size NPY_SIZEOF_PYARRAYOBJECT for the
>> PyArrayObject. Therefore, the write to new->unit in your
>> __array_finalize__ goes to unallocated memory.
> do you think, I should submit a bugreport?
>
> Cheers,
> Christoph
>

Although the segfault was caused by a bug in NumPy, you should
probably also consider using Cython, which can make a lot of this pain
and boring stuff go away.

> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
>
> iEYEARECAAYFAk90GEQACgkQLYu25rCEIztHGwCgi4asYpscBFYp6yfYAc1wioZG
> 9EAAoIZft1WOiiAj+cyJi+RVuT2U7BzF
> =e9cn
> -----END PGP SIGNATURE-----
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


From chris.barker at noaa.gov  Fri Mar 30 14:53:43 2012
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 30 Mar 2012 11:53:43 -0700
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
	<CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>
Message-ID: <CALGmxEK_cV8cH9o25VYuBKPO_OiR9+SwBpEPvQDxv2oior81PQ@mail.gmail.com>

On Fri, Mar 30, 2012 at 10:57 AM, mark florisson
<markflorisson88 at gmail.com> wrote:
> Although the segfault was caused by a bug in NumPy, you should
> probably also consider using Cython, which can make a lot of this pain
> and boring stuff go away.

Is there a good demo/sample somewhere of an ndarray subclass in Cython?

Some quick googling turned up a number of people asking about it, but
I didn't find (quickly) a wiki page or demo about it.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
7600 Sand Point Way NE ??(206) 526-6329?? fax
Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception

Chris.Barker at noaa.gov


From markflorisson88 at gmail.com  Fri Mar 30 16:38:11 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Fri, 30 Mar 2012 21:38:11 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CALGmxEK_cV8cH9o25VYuBKPO_OiR9+SwBpEPvQDxv2oior81PQ@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
	<CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>
	<CALGmxEK_cV8cH9o25VYuBKPO_OiR9+SwBpEPvQDxv2oior81PQ@mail.gmail.com>
Message-ID: <CANg26EXni-ozdGUgDe_PrsG+z7DFcVYeQmn9q=pX8wZSXaLSTw@mail.gmail.com>

On 30 March 2012 19:53, Chris Barker <chris.barker at noaa.gov> wrote:
> On Fri, Mar 30, 2012 at 10:57 AM, mark florisson
> <markflorisson88 at gmail.com> wrote:
>> Although the segfault was caused by a bug in NumPy, you should
>> probably also consider using Cython, which can make a lot of this pain
>> and boring stuff go away.
>
> Is there a good demo/sample somewhere of an ndarray subclass in Cython?
>
> Some quick googling turned up a number of people asking about it, but
> I didn't find (quickly) a wiki page or demo about it.
>
> -Chris
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
> 7600 Sand Point Way NE ??(206) 526-6329?? fax
> Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

It's not common to do, I tried the following:

cimport numpy

cdef extern from "Python.h":
    ctypedef struct PyTypeObject:
        void *tp_alloc

    object PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)

cdef myalloc(PyTypeObject *type, Py_ssize_t nitems):
    print "allocating"
    return PyType_GenericAlloc(type, nitems)

cdef class MyClass(numpy.ndarray) :
    cdef int array[10000000]

(<PyTypeObject *> MyClass).tp_alloc = <void *> myalloc # This works
around the NumPy bug
cdef MyClass obj = MyClass((10,))
obj.array[999999] = 20

The array attribute is quite large here to cause a segfault if our
trick to replace the tp_alloc isn't working. It's kind of a hack, but
the only alternative is to use composition instead.


From markflorisson88 at gmail.com  Fri Mar 30 16:40:09 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Fri, 30 Mar 2012 21:40:09 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CANg26EXni-ozdGUgDe_PrsG+z7DFcVYeQmn9q=pX8wZSXaLSTw@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
	<CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>
	<CALGmxEK_cV8cH9o25VYuBKPO_OiR9+SwBpEPvQDxv2oior81PQ@mail.gmail.com>
	<CANg26EXni-ozdGUgDe_PrsG+z7DFcVYeQmn9q=pX8wZSXaLSTw@mail.gmail.com>
Message-ID: <CANg26EX25vKFwXYdAEaiFeC5ze0xD=pevUiJnbMP3itHrf9CRA@mail.gmail.com>

On 30 March 2012 21:38, mark florisson <markflorisson88 at gmail.com> wrote:
> On 30 March 2012 19:53, Chris Barker <chris.barker at noaa.gov> wrote:
>> On Fri, Mar 30, 2012 at 10:57 AM, mark florisson
>> <markflorisson88 at gmail.com> wrote:
>>> Although the segfault was caused by a bug in NumPy, you should
>>> probably also consider using Cython, which can make a lot of this pain
>>> and boring stuff go away.
>>
>> Is there a good demo/sample somewhere of an ndarray subclass in Cython?
>>
>> Some quick googling turned up a number of people asking about it, but
>> I didn't find (quickly) a wiki page or demo about it.
>>
>> -Chris
>>
>> --
>>
>> Christopher Barker, Ph.D.
>> Oceanographer
>>
>> Emergency Response Division
>> NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
>> 7600 Sand Point Way NE ??(206) 526-6329?? fax
>> Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception
>>
>> Chris.Barker at noaa.gov
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> It's not common to do, I tried the following:
>
> cimport numpy
>
> cdef extern from "Python.h":
> ? ?ctypedef struct PyTypeObject:
> ? ? ? ?void *tp_alloc
>
> ? ?object PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
>
> cdef myalloc(PyTypeObject *type, Py_ssize_t nitems):
> ? ?print "allocating"
> ? ?return PyType_GenericAlloc(type, nitems)
>
> cdef class MyClass(numpy.ndarray) :
> ? ?cdef int array[10000000]
>
> (<PyTypeObject *> MyClass).tp_alloc = <void *> myalloc # This works
> around the NumPy bug
> cdef MyClass obj = MyClass((10,))
> obj.array[999999] = 20
>
> The array attribute is quite large here to cause a segfault if our
> trick to replace the tp_alloc isn't working. It's kind of a hack, but
> the only alternative is to use composition instead.

(So remove the array attribute, it's just for demonstration :)


From markflorisson88 at gmail.com  Fri Mar 30 16:41:09 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Fri, 30 Mar 2012 21:41:09 +0100
Subject: [Numpy-discussion] subclassing array in c
In-Reply-To: <CANg26EX25vKFwXYdAEaiFeC5ze0xD=pevUiJnbMP3itHrf9CRA@mail.gmail.com>
References: <EE364848-602F-4CBC-942B-592C8C66E4BA@mpq.mpg.de>
	<CAE8bXEkycG4G5PT9kjdHTbzUJVz7NST-HcySGozL+0934g5_mQ@mail.gmail.com>
	<CAE8bXE=hUrTvM96sMXWK5_Z0WbO87JiPJWfS2why+GF-snvvMw@mail.gmail.com>
	<80A2C97D-8386-4BBA-AA5E-14FD18568457@mpq.mpg.de>
	<CAE8bXEkC9uyzb3MWtnJ1eyssOsZgiwzA8YH7X-TxgtOnDqrq9Q@mail.gmail.com>
	<345C8549-457F-4FA8-AECF-38E1BF24BC9D@mpq.mpg.de>
	<jjb1tb$ru8$1@dough.gmane.org>
	<F7257064-4C75-4EE4-8DBE-B9B605B3EE94@mpq.mpg.de>
	<CANg26EXkfAoxxKV42DWEQ=9Yq0g7F3kvc3iiVW9U81vnU84s0Q@mail.gmail.com>
	<CALGmxEK_cV8cH9o25VYuBKPO_OiR9+SwBpEPvQDxv2oior81PQ@mail.gmail.com>
	<CANg26EXni-ozdGUgDe_PrsG+z7DFcVYeQmn9q=pX8wZSXaLSTw@mail.gmail.com>
	<CANg26EX25vKFwXYdAEaiFeC5ze0xD=pevUiJnbMP3itHrf9CRA@mail.gmail.com>
Message-ID: <CANg26EV6m671=pcwGsvVH4suTmKA54XigN80Qtqc_VwenMe7=Q@mail.gmail.com>

On 30 March 2012 21:40, mark florisson <markflorisson88 at gmail.com> wrote:
> On 30 March 2012 21:38, mark florisson <markflorisson88 at gmail.com> wrote:
>> On 30 March 2012 19:53, Chris Barker <chris.barker at noaa.gov> wrote:
>>> On Fri, Mar 30, 2012 at 10:57 AM, mark florisson
>>> <markflorisson88 at gmail.com> wrote:
>>>> Although the segfault was caused by a bug in NumPy, you should
>>>> probably also consider using Cython, which can make a lot of this pain
>>>> and boring stuff go away.
>>>
>>> Is there a good demo/sample somewhere of an ndarray subclass in Cython?
>>>
>>> Some quick googling turned up a number of people asking about it, but
>>> I didn't find (quickly) a wiki page or demo about it.
>>>
>>> -Chris
>>>
>>> --
>>>
>>> Christopher Barker, Ph.D.
>>> Oceanographer
>>>
>>> Emergency Response Division
>>> NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959?? voice
>>> 7600 Sand Point Way NE ??(206) 526-6329?? fax
>>> Seattle, WA ?98115 ? ? ??(206) 526-6317?? main reception
>>>
>>> Chris.Barker at noaa.gov
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> It's not common to do, I tried the following:
>>
>> cimport numpy
>>
>> cdef extern from "Python.h":
>> ? ?ctypedef struct PyTypeObject:
>> ? ? ? ?void *tp_alloc
>>
>> ? ?object PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
>>
>> cdef myalloc(PyTypeObject *type, Py_ssize_t nitems):
>> ? ?print "allocating"
>> ? ?return PyType_GenericAlloc(type, nitems)
>>
>> cdef class MyClass(numpy.ndarray) :
>> ? ?cdef int array[10000000]
>>
>> (<PyTypeObject *> MyClass).tp_alloc = <void *> myalloc # This works
>> around the NumPy bug
>> cdef MyClass obj = MyClass((10,))
>> obj.array[999999] = 20
>>
>> The array attribute is quite large here to cause a segfault if our
>> trick to replace the tp_alloc isn't working. It's kind of a hack, but
>> the only alternative is to use composition instead.
>
> (So remove the array attribute, it's just for demonstration :)

And you can also directly assign PyType_GenericAlloc instead of
writing your own (again, demonstration to see if it works).


From maggie.mari at continuum.io  Fri Mar 30 18:08:19 2012
From: maggie.mari at continuum.io (Maggie Mari)
Date: Fri, 30 Mar 2012 17:08:19 -0500
Subject: [Numpy-discussion] YouTrack testbed
Message-ID: <4F762ED3.9060402@continuum.io>

Hello, everyone.

I work with Travis at Continuum, and he asked me to setup a YouTrack 
server that everyone is welcome to play around with.  There is a test 
project currently set up, with some fake tickets.

Here is the address:

     http://ec2-107-21-65-210.compute-1.amazonaws.com:8011/issues

It's running on an AWS micro instance, so it might be slow at the moment.

Any feedback or comments would be welcome.

Maggie


From charlesr.harris at gmail.com  Fri Mar 30 20:33:57 2012
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 30 Mar 2012 18:33:57 -0600
Subject: [Numpy-discussion] YouTrack testbed
In-Reply-To: <4F762ED3.9060402@continuum.io>
References: <4F762ED3.9060402@continuum.io>
Message-ID: <CAB6mnxKERcjPkxnMewHMvXND48GP8+q-XNHDcavpqi=FA_StFQ@mail.gmail.com>

On Fri, Mar 30, 2012 at 4:08 PM, Maggie Mari <maggie.mari at continuum.io>wrote:

> Hello, everyone.
>
> I work with Travis at Continuum, and he asked me to setup a YouTrack
> server that everyone is welcome to play around with.  There is a test
> project currently set up, with some fake tickets.
>
> Here is the address:
>
>     http://ec2-107-21-65-210.compute-1.amazonaws.com:8011/issues
>
> It's running on an AWS micro instance, so it might be slow at the moment.
>
> Any feedback or comments would be welcome.
>
>
Looks nice, although it will take a little getting used to. It's hard to
tell with these things until you have actually made some use of them. Is it
configurable? I was wondering what sort of feedback you were looking for.
Who will have access to these issues? Is this going to be hosted at
Continuum?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120330/397fb972/attachment.html>

From animator333 at yahoo.com  Sat Mar 31 02:25:46 2012
From: animator333 at yahoo.com (Prashant Saxena)
Date: Sat, 31 Mar 2012 14:25:46 +0800 (SGT)
Subject: [Numpy-discussion] ndarray sub-classing and append function
Message-ID: <1333175146.87934.YahooMailNeo@web193205.mail.sg3.yahoo.com>

Hi,

I am sub-classing numpy.ndarry for vector array representation. The append function is like this:

? ? def append(self, other):
? ? ? ?self = numpy.append(self, [other], axis=0)

Example:
vary = VectorArray([v1, v2])
#vary = numpy.append(vary, [v1], axis=0)
vary.append(v1)

The commented syntax (numpy syntax) is working but "vary.append(v1)" is not working.

Any help?

Cheers

Prashant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120331/4c3db35f/attachment.html>

From matrixhasu at gmail.com  Sat Mar 31 06:39:33 2012
From: matrixhasu at gmail.com (Sandro Tosi)
Date: Sat, 31 Mar 2012 12:39:33 +0200
Subject: [Numpy-discussion] Testsuite fails with Python 2.7.3rc1 and
 3.2.3rc1 (Debian)
In-Reply-To: <CABL7CQjfbzPMO7p1T+JyzROaau_JNcXNXw-qVJCSqNtL2kaYWQ@mail.gmail.com>
References: <CAB4XWXwUfnmoGOUtMBKrC1CyVgu+hnaK=fGi=6EE_vwbDO8zUA@mail.gmail.com>
	<CABL7CQjfbzPMO7p1T+JyzROaau_JNcXNXw-qVJCSqNtL2kaYWQ@mail.gmail.com>
Message-ID: <CAB4XWXxkt0GXL-8rfr-WKr-UTteZH2DmjWpuHWma1XSksmspyQ@mail.gmail.com>

Hi Ralf
sorry for the late reply.

On Tue, Mar 27, 2012 at 22:29, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
>
>
> On Wed, Mar 21, 2012 at 12:28 AM, Sandro Tosi <matrixhasu at gmail.com> wrote:
>>
>> Hello,
>> I've reported http://projects.scipy.org/numpy/ticket/2085 and Ralf
>> asked for bringing that up here: is anyone able to replicate the
>> problem described in that ticket?
>>
>> The debian bug tracking the problem is:
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=664672
>
>
> We do have an Ubuntu buildbot that runs fine with 2.7.2 (see
> buildbot.scipy.org).

ubuntu python and build stack tends to be different than Debian ones,
so they are not exactly comparable.

> Is that failure seen on unusual hardware or with a
> specific compiler only?

well, I don't think so. You can check all our archs build log at
https://buildd.debian.org/status/package.php?p=python-numpy&suite=experimental
but I saw that on my laptop (amd64) an on those logs too. There you
can find all the references to the versions of the tools used for the
build.

Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi


From shish at keba.be  Sat Mar 31 09:51:00 2012
From: shish at keba.be (Olivier Delalleau)
Date: Sat, 31 Mar 2012 09:51:00 -0400
Subject: [Numpy-discussion] ndarray sub-classing and append function
In-Reply-To: <1333175146.87934.YahooMailNeo@web193205.mail.sg3.yahoo.com>
References: <1333175146.87934.YahooMailNeo@web193205.mail.sg3.yahoo.com>
Message-ID: <CAFXk4bob0WyqLPkw6dOmC5PTZgtJXa2OofMdzkOai1a0-FW3AA@mail.gmail.com>

It doesn't work because numpy.append(a, ...) doesn't modify the array a
in-place: it returns a copy.
Then in your append method, doing "self = numpy.append(...)" won't have any
effect: in Python such a syntax means the "self" local variable will now
point to the result of numpy.append, but it won't modify the object that
self previously pointed to.
I didn't try it, but it should work with

def append(self, other):
    numpy.ndarray.append(self, other)

which will call the append method of the parent class numpy.ndarray,
modifying self in-place.

-=- Olivier

Le 31 mars 2012 02:25, Prashant Saxena <animator333 at yahoo.com> a ?crit :

> Hi,
>
> I am sub-classing numpy.ndarry for vector array representation. The append
> function is like this:
>
>     def append(self, other):
>        self = numpy.append(self, [other], axis=0)
>
> Example:
> vary = VectorArray([v1, v2])
> #vary = numpy.append(vary, [v1], axis=0)
> vary.append(v1)
>
> The commented syntax (numpy syntax) is working but "vary.append(v1)" is
> not working.
>
> Any help?
>
> Cheers
>
> Prashant
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120331/10137059/attachment.html>

From pav at iki.fi  Sat Mar 31 12:19:04 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 31 Mar 2012 18:19:04 +0200
Subject: [Numpy-discussion] Trac configuration tweak
Message-ID: <jl7apr$1rb$1@dough.gmane.org>

Hi,

I moved projects.scipy.org Tracs to run on mod_python (instead of CGI),
in order to try to combat the present performance issues. Let's see if
this helps with the "database is locked" problem.

Please drop me a message if something stops working.

	Pauli


From travis at continuum.io  Sat Mar 31 15:45:51 2012
From: travis at continuum.io (Travis Oliphant)
Date: Sat, 31 Mar 2012 14:45:51 -0500
Subject: [Numpy-discussion] YouTrack testbed
In-Reply-To: <CAB6mnxKERcjPkxnMewHMvXND48GP8+q-XNHDcavpqi=FA_StFQ@mail.gmail.com>
References: <4F762ED3.9060402@continuum.io>
	<CAB6mnxKERcjPkxnMewHMvXND48GP8+q-XNHDcavpqi=FA_StFQ@mail.gmail.com>
Message-ID: <2F839812-5320-4390-A51F-53DEF8F47AEF@continuum.io>


The idea is to allow people to test-out YouTrack for a few weeks and get to know it while we migrate bugs to it.   it looks like it is straightforward to export the data out of YouTrack should we eventually decide to use something else. 

The idea is to host it on an external server (Rackspace or AWS that multiple people are able to admin).   So far, I like the keyboard interface and the searchable widget on top.    We will continue to work on moving tickets into the system. 

Please give it a try if you have an interest in the Issue Tracker. 

Best, 

-Travis


On Mar 30, 2012, at 7:33 PM, Charles R Harris wrote:

> 
> 
> On Fri, Mar 30, 2012 at 4:08 PM, Maggie Mari <maggie.mari at continuum.io> wrote:
> Hello, everyone.
> 
> I work with Travis at Continuum, and he asked me to setup a YouTrack
> server that everyone is welcome to play around with.  There is a test
> project currently set up, with some fake tickets.
> 
> Here is the address:
> 
>     http://ec2-107-21-65-210.compute-1.amazonaws.com:8011/issues
> 
> It's running on an AWS micro instance, so it might be slow at the moment.
> 
> Any feedback or comments would be welcome.
> 
> 
> Looks nice, although it will take a little getting used to. It's hard to tell with these things until you have actually made some use of them. Is it configurable? I was wondering what sort of feedback you were looking for. Who will have access to these issues? Is this going to be hosted at Continuum?
> 
> Chuck 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120331/7f1794bb/attachment.html>

From rhattersley at gmail.com  Sat Mar 31 16:07:16 2012
From: rhattersley at gmail.com (Richard Hattersley)
Date: Sat, 31 Mar 2012 21:07:16 +0100
Subject: [Numpy-discussion] Style for pad implementation in 'pad'
 namespace or functions under np.lib
In-Reply-To: <CAPJVwB=rbp0ysUGuTC_uB5ZZYQK5693Hhad7T-aWr-6kb6RXzw@mail.gmail.com>
References: <CAO5s+D_ctYTo2r_COS7pZLHWxBj=Oz3AVZdjb3JAz=4HoY4AYA@mail.gmail.com>
	<CABL7CQgZ8pVeG0p-1zWqsLh7DVOunT_qwaqwZCsMrpY1txO5Dg@mail.gmail.com>
	<CAB6mnxKRht9Ek+_mv4ok+pjpAVbBpZ=Jq8i0yQGxcQoXL5FUsg@mail.gmail.com>
	<CAO5s+D8-KHiQMtHGaJ63VhQDsFKU5BehGVdAEKD9zWfBXA5uWQ@mail.gmail.com>
	<38410503-5EB8-484E-B778-4BFFF558615D@continuum.io>
	<CAPJVwBn-BqiVPYFNrYnnvsYTA6KMheE6wD0guVAzXX-r_sQRKA@mail.gmail.com>
	<CAPJVwBmL1y8eJJf_71kkFnQ7jMFszLMCj6NvDAeiT3YBwjxsVg@mail.gmail.com>
	<35298DC7-58D4-4505-81E8-2EAAB7FD12AA@continuum.io>
	<CAO5s+D8cz1ZdassbndnnqCxZBg83+WefnTFdV2f5cNcMgmt6Vw@mail.gmail.com>
	<D7C30BE2-B385-4F41-AC6A-DF0075590941@continuum.io>
	<CAP=RS9kv1RT+E7dE3MghZZmokwf2xrNNCw4OWq_e=V_G4S+cJQ@mail.gmail.com>
	<CAPJVwB=rbp0ysUGuTC_uB5ZZYQK5693Hhad7T-aWr-6kb6RXzw@mail.gmail.com>
Message-ID: <CAP=RS9nYePLEPWWJQfp983mF95SrqzpFW-jE2xN-FAxj42VGbQ@mail.gmail.com>

>> 1) The use of string constants to identify NumPy processes. It would
>> seem better to use library defined constants (ufuncs?) for better
>> future-proofing, maintenance, etc.
>
> I don't see how this would help with future-proofing or maintenance --
> can you elaborate?
>
> If this were C, I'd agree; using an enum would have a number of benefits:
> ?-- easier to work with than strings (== and switch work, no memory
> management hassles)
> ?-- compiler will notice if you accidentally misspell the enum name
> ?-- since you always in effect 'import *', getting access to
> additional constants doesn't require any extra effort
> But in Python none of these advantages apply, so I find it more
> convenient to just use strings.

Using constants provides for tab-completion and associated help text.
The help text can be particularly useful if the choice of constant
affects which extra keyword arguments can be specified.

And on a minor note, and far more subjectively (time for another
bike-shedding reference!), there's the "cleanliness" of API. (e.g.
Strings don't "feel" a good match. There are an infinite number of
strings, but only a small number are valid. There's nothing
machine-readable you can interrogate to find valid values.) Under the
hood you'll have to use the string to do a lookup, but the constant
can *be* the result of the lookup. Why re-invent the wheel when the
language gives it to you for free?

> Note also that we couldn't use ufuncs here, because we're specifying a
> rather unusual sort of operation -- there is no ufunc for padding with
> a linear ramp etc. Using "mean" as the example is misleading in this
> respect -- it's not really the same as np.mean.
>
>> 2) Why does only "pad" use this style of interface? If it's a good
>> idea for "pad", perhaps it should be applied more generally?
>> numpy.aggregate(MEAN, ...), numpy.group(MEAN, ...), etc. anyone?
>
> The mode="foo" interface style is actually used in other places, e.g.,
> np.linalg.qr.

My mistake - I misinterpreted the API earlier, so we're talking at
cross-purposes. My comment/question isn't really about pad & mode, but
about numpy more generally. But it still stands - albeit somewhat
hypothetically, since it's hard to imagine such a change taking place.

Richard